* [parisc-linux] init pause() on some systems but not all?
@ 2005-12-29 21:33 Joel Soete
2005-12-31 14:24 ` [parisc-linux] " Max Grabert
0 siblings, 1 reply; 2+ messages in thread
From: Joel Soete @ 2005-12-29 21:33 UTC (permalink / raw)
To: parisc-linux
Hello all,
I am experiencing a very weird problem with some of my parisc-linux boxes:
just after a fresh boot, and even after some 2 hours of uptime, I can run the reboot command without any problem.
But after more uptime, to reboot them I need to force it with 'reboot -f', with all the inconvenience that implies ;-(
When the problem arises, a plain 'reboot' just shows the usual message that the system is going down for reboot, but nothing happens.
Other telinit requests (S or 6) get no response either.
And if I kill a running getty (one launched by init at startup), init doesn't respawn it.
I already tried:
o downgrading sysvinit, no help;
o stopping the nfs/portmap daemons (I read a similar report about this), no help.
This occurs on most systems but not all.
No problem on:
o my c110 running k-2.6.14.4-vs2.1.0-pa0 and debian unstable
o a b180 running k-2.6.14-pa0 and debian unstable too
But the problem shows up on:
o n4k 64bit smp debian unstable (iirc k-2.6.15-rc6-pa1)
o b2k 32bit up debian unstable & k-2.6.15-rc6-pa1
o d380 32bit up debian testing & k-2.6.14.4-vs2.1.0-pa0 (i.e. nearly the same as the c110)
o another b180 running exactly the same k-2.6.14-pa0 as the b180 mentioned above, but with debian testing
o the last b180 running k-2.6.15-rc6-pa0 (gcc-4.1) and debian unstable
Unfortunately there is no way to strace init:
# strace -p 1
attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted
(the same on my i386 ;-( )
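Since ptrace cannot attach to PID 1 here, one alternative that needs no tracing is to read init's signal masks from /proc/1/status (the SigBlk, SigIgn and SigCgt fields, each a hex bitmask). The sketch below is only an illustration: sig_in_mask is a hypothetical helper, its optional third argument exists just so it can be pointed at a saved copy of the status file, and it only looks at signals 1 to 32. Signal numbers differ between architectures, so take the number from 'kill -l' on the box being examined.

```shell
#!/bin/sh
# sig_in_mask FIELD SIGNUM [STATUS_FILE]   (hypothetical helper, not part of any tool)
# FIELD is SigBlk, SigIgn or SigCgt; STATUS_FILE defaults to /proc/1/status.
# Returns 0 if the bit for SIGNUM is set, 1 if it is clear, 2 if the field is missing.
sig_in_mask() {
    field=$1
    signum=$2
    status=${3:-/proc/1/status}

    # Pick the hex mask that follows "FIELD:" in the status file.
    mask=$(awk -v f="$field:" '$1 == f { print $2 }' "$status" 2>/dev/null) || return 2
    [ -n "$mask" ] || return 2

    # Keep only the low 8 hex digits (signals 1..32) to stay clear of
    # 64-bit overflow in shell arithmetic.
    low=$(printf '%s' "$mask" | tail -c 8)

    # Signal N is represented by bit N-1 of the mask.
    [ $(( (0x$low >> ($signum - 1)) & 1 )) -eq 1 ]
}
```

For example, 'sig_in_mask SigBlk 1 && echo "init is blocking signal 1"' run as root on an affected box and on a healthy one would show whether the two differ in what init blocks, ignores or catches.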
Anyway, Mike helped me figure out, using top with the additional field WCHAN (= Sleeping in Function), that on the affected systems
init is sleeping in 'pause', whereas on the others it is in 'select'.
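One thing that might be worth checking on an affected box is whether PID 1 still holds its control FIFO open. This rests on an assumption that is not confirmed anywhere in this thread: that sysvinit sits in select() only while it has /dev/initctl open and falls back to pause() otherwise. The sketch below just walks the fd symlinks under /proc; holds_open is a hypothetical helper and its third argument (an alternative proc root) is there only to make it testable.

```shell
#!/bin/sh
# holds_open PID PATH [PROC_ROOT]   (hypothetical helper)
# Returns 0 if one of PID's open file descriptors points at PATH, 1 otherwise.
holds_open() {
    pid=$1
    target=$2
    root=${3:-/proc}

    # Each entry in /proc/PID/fd is a symlink to the file behind that descriptor.
    for fd in "$root/$pid/fd"/*; do
        [ -L "$fd" ] || continue
        if [ "$(readlink "$fd")" = "$target" ]; then
            return 0
        fi
    done
    return 1
}
```

Run as root, 'holds_open 1 /dev/initctl || echo "init has no fd on /dev/initctl"' could then be compared between a box where init is in select and one where it is in pause.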
I tried the following 'Watch_Init' script on the d380 and the b2k:
#!/bin/sh
#set -x
AWK="/usr/bin/awk"
CAT="/bin/cat"
DATE="/bin/date"
GREP="/bin/grep"
TOP="/usr/bin/top"
TOPRC="/root/.toprc"
TEE="/usr/bin/tee"
# Refuse to overwrite an existing top configuration
if [ -f "$TOPRC" ]
then
    echo "$TOPRC exists: please save it before retrying."
    exit 1
fi
$CAT > $TOPRC <<EOF
RCfile for "top with windows" # shameless braggin'
Id:a, Mode_altscr=0, Mode_irixps=1, Delay_time=3.000, Curwin=0
Def fieldscur=AEHIOQTWKNMbcdfgjplrsuvYzX
winflags=62777, sortindx=10, maxtasks=0
summclr=1, msgsclr=1, headclr=3, taskclr=1
Job fieldscur=ABcefgjlrstuvyzMKNHIWOPQDX
winflags=62777, sortindx=0, maxtasks=0
summclr=6, msgsclr=6, headclr=7, taskclr=6
Mem fieldscur=ANOPQRSTUVbcdefgjlmyzWHIKX
winflags=62777, sortindx=13, maxtasks=0
summclr=5, msgsclr=5, headclr=4, taskclr=5
Usr fieldscur=ABDECGfhijlopqrstuvyzMKNWX
winflags=62777, sortindx=4, maxtasks=0
summclr=3, msgsclr=3, headclr=2, taskclr=3
EOF
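# Poll every 5 seconds until init (PID 1) is no longer sleeping in select()
while true
do
    # WCHAN ("Sleeping in Function") is the 12th column with the fields set above
    WCHAN=$($TOP -p1 -n1 -b | $GREP " 1 root" | $AWK '{print $12}')
    if [ "X$WCHAN" != "Xselect" ]
    then
        break
    else
        sleep 5
    fi
done
# Record a full process snapshot and the time of the switch
$TOP -n1 -b 2>&1 | $TEE /var/logs/Watch_Init.doc
$DATE 2>&1 | $TEE -a /var/logs/Watch_Init.doc
exit 0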
====<>====
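A lighter variant of the same watch loop, given here only as a sketch: it reads the wait channel straight from /proc/<pid>/wchan instead of parsing top output, so no .toprc is needed. It assumes the kernel exposes that file with symbol names (it can read "0" when the process is not sleeping). watch_wchan and its optional proc-root argument are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# watch_wchan PID EXPECTED [INTERVAL] [PROC_ROOT]   (hypothetical helper)
# Polls PID's wait channel until it is no longer EXPECTED, then prints a
# timestamped line and returns 0. Returns 2 if the wchan file cannot be read.
watch_wchan() {
    pid=$1
    expected=$2
    interval=${3:-5}
    root=${4:-/proc}

    while :; do
        current=$(cat "$root/$pid/wchan" 2>/dev/null) || return 2
        if [ "$current" != "$expected" ]; then
            printf '%s pid %s left %s, now in %s\n' \
                "$(date '+%Y-%m-%d %H:%M:%S')" "$pid" "$expected" "$current"
            return 0
        fi
        sleep "$interval"
    done
}
```

Something like 'watch_wchan 1 select 1 >> /root/Watch_Init.log' (the log path is arbitrary) would poll once per second, which narrows the window between the switch and the snapshot compared with the 5 second sleep plus a top run.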
It may not be accurate enough, because when it captured the 'switch', the two systems were doing different things:
the d380:
top - 07:40:58 up 12:29, 2 users, load average: 2.96, 1.80, 0.88
Tasks: 72 total, 3 running, 69 sleeping, 0 stopped, 0 zombie
Cpu(s): 4.4% us, 8.5% sy, 1.1% ni, 84.8% id, 1.1% wa, 0.0% hi, 0.1% si
Mem: 254716k total, 249192k used, 5524k free, 72340k buffers
Swap: 517480k total, 0k used, 517480k free, 66908k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ WCHAN     COMMAND
 5956 root      17   0  2900 1340 1024 R  7.3  0.5 13:49.14 syscall_d top
16067 root      16   0  2896 1224  924 R  7.3  0.5  0:00.25 read      top
16072 root      29  10  1368  132  108 R  3.7  0.1  0:00.03 syscall_d cracklib-
16068 root      20   0  1744  528  420 S  2.4  0.2  0:00.03 pipe_wait tee
 1756 jso       16   0  9676 1884 1184 S  1.2  0.7  0:27.06 select    sshd
    1 root      16   0  2292  808  664 S  0.0  0.3  0:48.35 pause     init
    2 root      34  19     0    0    0 S  0.0  0.0  0:00.00 ksoftirqd ksoftirqd
    3 root      RT   0     0    0    0 S  0.0  0.0  0:00.05 msleep_in watchdog/
    4 root      10  -5     0    0    0 S  0.0  0.0  0:00.00 worker_th events/0
...
Thu Dec 29 07:40:59 CET 2005
the b2k:
top - 07:36:21 up 12:21, 3 users, load average: 1.80, 0.61, 0.25
Tasks: 81 total, 1 running, 80 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.6% us, 2.9% sy, 0.9% ni, 93.9% id, 0.6% wa, 0.0% hi, 0.0% si
Mem: 251828k total, 211644k used, 40184k free, 81176k buffers
Swap: 255928k total, 0k used, 255928k free, 90496k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ WCHAN     COMMAND
 7749 root      17   2  2952 1220  920 R  3.7  0.5  0:00.06 alloc_pag top
 1736 root      16   0  2956 1344 1032 S  1.9  0.5 24:30.59 select    top
 7744 nobody    34  19  3584 1224  824 D  1.9  0.5  0:00.38 sync_buff find
    1 root      15   0  1764  684  564 S  0.0  0.3  0:11.50 pause     init
    2 root      34  19     0    0    0 S  0.0  0.0  0:00.00 ksoftirqd ksoftirqd
    3 root      RT   0     0    0    0 S  0.0  0.0  0:00.00 msleep_in watchdog/
    4 root      10  -5     0    0    0 S  0.0  0.0  0:04.52 worker_th events/0
...
Thu Dec 29 07:36:21 CET 2005
In the end, everything looks different between the two.
Am I the only one experiencing this problem?
Any idea how I could trace this problem better? (lttng? but that is for 2.6.14 only)
Thanks in advance,
Joel
_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux
* [parisc-linux] Re: init pause() on some systems but not all?
2005-12-29 21:33 [parisc-linux] init pause() on some systems but not all? Joel Soete
@ 2005-12-31 14:24 ` Max Grabert
0 siblings, 0 replies; 2+ messages in thread
From: Max Grabert @ 2005-12-31 14:24 UTC (permalink / raw)
To: Joel Soete; +Cc: parisc-linux
Hi Joel & PA,
I also have the same, or at least similar problem on my c3700
(2.6.15-rc1-pa1, 2.6.15-rc5-pa3, debian/testing):
A 'shutdown' or 'reboot' does nothing except send the wall message, and it seems that
(tel)init doesn't react to signals in general.
Also, hitting the power button just powers off the machine after a certain amount
of time (around 30-60s), but the 'init 6' it should trigger doesn't work, so
I have unchecked filesystems on the next boot.
This leads me to another, rather unrelated bug:
I only use xfs and it works almost flawlessly, except that it should cope with
a sudden reboot, being a journaled filesystem and all ...
However, due to the init/shutdown bug I often have to run xfs_check,
and even 'xfs_repair -L', in order to be able to mount the
filesystems again on the next boot.
Strangely, the root fs has not been affected so far, and luckily I haven't had
any file corruption/loss (I have had to use xfs_repair about 20 times by now).
(Un)fortunately I'm on vacation in Germany right now, so I cannot test/debug
the problem until mid-January.
Greetings,
Max
PS: I wish you all a happy New Year :)