Strange bug 2


csmcsm
03-22-06, 01:14
Hello,

Sometimes my hardware node crashes overnight and none of the VPS servers work any more. On the host node I see the following when I run ps axu:


root 10034 0.0 0.1 5384 1388 ? S 04:12 0:00 crond
root 10035 0.0 0.1 5384 1388 ? S 04:12 0:00 crond
root 10036 0.0 0.0 2100 848 ? Ss 04:12 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron-net-reconfig.pl >>/v
root 10037 0.0 0.0 2100 868 ? D 04:12 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron-net-reconfig.pl >>/v
root 10038 0.0 0.0 3776 852 ? Ss 04:12 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron-delete.pl >>/var/log
root 10039 0.0 0.0 3776 872 ? D 04:12 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron-delete.pl >>/var/log
root 10048 0.0 0.1 5384 1388 ? S 04:15 0:00 crond
root 10049 0.0 0.0 2792 852 ? Ss 04:15 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron.pl >>/var/log/hspher
root 10050 0.0 0.0 2792 872 ? D 04:15 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron.pl >>/var/log/hspher
root 10055 0.0 0.1 5384 1388 ? S 04:16 0:00 crond
root 10056 0.0 0.0 3532 852 ? Ss 04:16 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron-delete.pl >>/var/log
root 10057 0.0 0.0 3532 872 ? D 04:16 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron-delete.pl >>/var/log
root 10062 0.0 0.1 5384 1388 ? S 04:18 0:00 crond
root 10063 0.0 0.0 3656 852 ? Ss 04:18 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron-net-reconfig.pl >>/v
root 10064 0.0 0.0 3656 872 ? D 04:18 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron-net-reconfig.pl >>/v
root 10069 0.0 0.1 5384 1388 ? S 04:20 0:00 crond
root 10070 0.0 0.1 5384 1388 ? S 04:20 0:00 crond
root 10071 0.0 0.0 3992 848 ? Ss 04:20 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron-delete.pl >>/var/log
root 10072 0.0 0.0 3992 868 ? D 04:20 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron-delete.pl >>/var/log
root 10073 0.0 0.0 2916 848 ? Ss 04:20 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron.pl >>/var/log/hspher
root 10074 0.0 0.0 2916 868 ? D 04:20 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron.pl >>/var/log/hspher
root 10084 0.0 0.1 5384 1388 ? S 04:24 0:00 crond
root 10085 0.0 0.1 5384 1388 ? S 04:24 0:00 crond
root 10086 0.0 0.0 3320 852 ? Ss 04:24 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron-net-reconfig.pl >>/v
root 10087 0.0 0.0 3320 872 ? D 04:24 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron-net-reconfig.pl >>/v
root 10088 0.0 0.0 2868 852 ? Ss 04:24 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron-delete.pl >>/var/log
root 10089 0.0 0.0 2868 872 ? D 04:24 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron-delete.pl >>/var/log
root 10092 0.0 0.1 5384 1388 ? S 04:25 0:00 crond
root 10093 0.0 0.0 2568 852 ? Ss 04:25 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron.pl >>/var/log/hspher
root 10094 0.0 0.0 2568 872 ? D 04:25 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron.pl >>/var/log/hspher
root 10101 0.0 0.1 5384 1388 ? S 04:28 0:00 crond
root 10102 0.0 0.0 3104 848 ? Ss 04:28 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron-delete.pl >>/var/log
root 10103 0.0 0.0 3104 868 ? D 04:28 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron-delete.pl >>/var/log
root 10108 0.0 0.1 5384 1388 ? S 04:30 0:00 crond
root 10109 0.0 0.0 3896 852 ? Ss 04:30 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron-net-reconfig.pl >>/v
root 10110 0.0 0.1 5384 1388 ? S 04:30 0:00 crond
root 10111 0.0 0.0 3040 852 ? Ss 04:30 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron.pl >>/var/log/hspher
root 10112 0.0 0.0 3040 872 ? D 04:30 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron.pl >>/var/log/hspher
root 10113 0.0 0.0 3896 872 ? D 04:30 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron-net-reconfig.pl >>/v
root 10119 0.0 0.1 5384 1388 ? S 04:32 0:00 crond
root 10120 0.0 0.0 2632 848 ? Ss 04:32 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron-delete.pl >>/var/log
root 10121 0.0 0.0 2632 868 ? D 04:32 0:00 /bin/sh -c /hsphere/shared/scripts/cron/vps-cron-delete.pl >>/var/log
...


I can't do anything except push the power button to turn the machine off and on.
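
In the listing above, every stuck cron job has one child in state D (uninterruptible sleep), which is why the processes cannot be killed. As a minimal sketch for picking those entries out of a long listing, assuming the standard `ps axu` column layout where STAT is the 8th field, the hypothetical helper functions below filter and count them:

```shell
# Print only the lines whose STAT column (field 8 of `ps axu`) starts with "D",
# i.e. processes in uninterruptible sleep. Reads a ps listing on stdin.
d_state_procs() {
    awk '$8 ~ /^D/'
}

# Count the D-state processes in a ps listing read from stdin.
d_state_count() {
    awk '$8 ~ /^D/ { n++ } END { print n + 0 }'
}
```

Typical use would be `ps axu | d_state_procs` or `ps axu | d_state_count`. A count that keeps growing between runs suggests that new cron jobs are piling up behind the same blocked resource.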

yura
03-22-06, 02:57
Hello,

We have seen a problem like yours before.
Please send us the following information from your box:
# uname -a
# cat /proc/meminfo
# cat /proc/cpuinfo

If the problem appears again, we would appreciate a "sysrq" report from you.
To enable sysrq reports, run:
# sysctl -w kernel.sysrq=1

Then, when your box locks up (as you described), please run:
# echo "t" > /proc/sysrq-trigger
This logs the "showTasks" report to /var/log/messages.
# echo "m" > /proc/sysrq-trigger
This logs the "showMem" report to /var/log/messages.

Then we will be able to examine the /var/log/messages file you send.
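
The two trigger commands can be wrapped in one small function so they are quick to run on a box that is already locking up. This is only a sketch: the function name and the optional path argument (used for a dry run against an ordinary file) are assumptions, not part of any tool mentioned in this thread. It must be run as root, after sysrq has been enabled with the sysctl command shown earlier.

```shell
# Send the SysRq "t" (task list) and "m" (memory info) commands.
# $1 is an optional trigger path, hypothetical and intended for dry runs;
# it defaults to the real /proc/sysrq-trigger.
sysrq_report() {
    trigger="${1:-/proc/sysrq-trigger}"
    for key in t m; do
        # Stop at the first failed write (e.g. not root, or sysrq disabled).
        echo "$key" > "$trigger" || return 1
    done
}
```

Calling `sysrq_report` with no argument writes to the real trigger file. Calling it with the path of a scratch file only exercises the function without touching the kernel.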

Thank you for your cooperation.
Best Regards

csmcsm
04-10-06, 01:16
Please send me your email address and I will send you the reports.

csmcsm
05-03-06, 03:46
Hello,

I think I have found the problem:

If a CPU limit is set for the hsphere-vps servers running on this server (all VPS servers had a 10% CPU limit), the server crashes!
If I set the CPU limit on all VPS servers to 0 (no limit), everything works fine!


Any ideas?

regards
Chris

yura
05-05-06, 08:08
Hello,

Thank you for reporting the situation to us.
We will check the CPU limit issue and let you know the results.
However, all virtual servers on our test box are CPU hard-limited and have not crashed.

Best Regards

csmcsm
05-07-06, 16:29
Hello,

One of my servers has now crashed again :o(

It may be a problem when Java runs inside the machine.

My server crashed while I was working on a Java/Tomcat webmail VPS server, and then the whole hardware machine crashed with the same errors.


I NEED a fix! I have about 5 completely different hardware machines, and all of them crash sometimes!!!

csmcsm
05-11-06, 04:28
Hello,

One of my servers has now crashed again. This was the slowest server (PIII 1 GHz) with 6 VPS servers on it, but I have NOT set any CPU limit!!!!

regards
chris

yura
05-11-06, 05:28
Hello,

Please configure netdump on your VPS hosts to log all crashes.

Read more about netdump configuration at:
http://forum.psoft.net/showthread.php?p=89359#post89359

We need more log information to investigate the problem.

Best regards