* [BUG] wait_for_devfsd_finished deadlock
@ 2001-12-02 3:11 William Lee Irwin III
2001-12-02 19:01 ` Richard Gooch
2001-12-04 5:04 ` Richard Gooch
0 siblings, 2 replies; 5+ messages in thread
From: William Lee Irwin III @ 2001-12-02 3:11 UTC
To: linux-kernel
While testing 2.4.17-pre1 with some other patches, a situation
reminiscent of a deadlock arose. mutt(1) would block indefinitely
while opening a large mbox, and all further calls to sys_open()
would block indefinitely.
After some further testing to isolate the problem, I reproduced
this behavior in the vanilla 2.4.17-pre1 kernel. The sysrq output showed
a number of processes with the following call trace in the devfs core:
Dec 1 17:29:26 holomorphy kernel: sh D C02A2A00 0 821 806 (NOTLB)
Dec 1 17:29:26 holomorphy kernel: Call Trace: [wait_for_devfsd_finished+170/208] [devfs_d_revalidate_wait+135/160] [devfs_lookup+409/448] [d_alloc+27/400] [real_lookup+83/192]
Dec 1 17:29:26 holomorphy kernel: [link_path_walk+1310/1888] [path_walk+26/32] [open_namei+131/1488] [filp_open+59/96] [sys_open+56/192] [system_call+51/56]
And the following call traces elsewhere:
Dec 1 17:29:26 holomorphy kernel: cron S E3540000 0 852 633 853 855 827 (NOTLB)
Dec 1 17:29:26 holomorphy kernel: Call Trace: [pipe_wait+124/176] [pipe_read+179/512] [sys_read+150/208] [system_call+51/56]
Dec 1 17:29:26 holomorphy kernel: procmail S E2395ED8 0 881 880 882 (NOTLB)
Dec 1 17:29:26 holomorphy kernel: Call Trace: [interruptible_sleep_on_locked+116/192] [locks_block_on+28/48] [posix_lock_file+178/1232] [fcntl_setlk+335/512] [do_fcntl+318/512]
Dec 1 17:29:26 holomorphy kernel: exim S 00000000 0 905 899 907 (NOTLB)
Dec 1 17:29:26 holomorphy kernel: Call Trace: [sys_wait4+862/912] [system_call+51/56]
Further diagnostic information is available upon request.
Mr. Gooch, your attention to this matter is much appreciated.
Thanks,
Bill

* Re: [BUG] wait_for_devfsd_finished deadlock
From: Richard Gooch @ 2001-12-02 19:01 UTC
To: William Lee Irwin III; +Cc: linux-kernel

William Lee Irwin, III writes:
> While testing 2.4.17-pre1 with some other patches, a situation
> reminiscent of a deadlock arose. mutt(1) would block indefinitely
> while opening a large mbox, and all further calls to sys_open()
> would block indefinitely.
>
> After some further testing to isolate the problem, I reproduced this
> behavior in the vanilla 2.4.17-pre1 kernel. The sysrq output showed
> a number of processes with the following call trace in the devfs
> core:

Your sh process appears to be hung in wait_for_devfsd_finished(). It
would be helpful to know what devfsd was doing at this time. If it
were hung internally (in user-space), it would account for this
behaviour. However, if devfsd crashes, then devfsd_close() will be
called, which will wake any waiters.

> And the following call traces elsewhere:

Are these related?
cron -> pipe_wait()
procmail -> interruptible_sleep_on_locked()
exim -> sys_wait4()

Maybe these are just waiting on mutt(1)?

> Further diagnostic information is available upon request.

That's what I like to hear. Set CONFIG_DEVFS_DEBUG=y and boot with
"devfs=dall" and send me the (verbose) kernel logs. That should show
the sequence of events that lead to this.

> Mr. Gooch, your attention to this matter is much appreciated.

Just "Richard" is fine. I'm not a fan of formality.

Regards,

Richard....
Permanent: rgooch@atnf.csiro.au
Current: rgooch@ras.ucalgary.ca
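
Richard's reasoning about waiters can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not devfs code: `DaemonModel`, `close()`, `run()` and `waiter_is_woken()` are hypothetical names, and a `threading.Event` stands in for whatever the kernel uses to park callers of wait_for_devfsd_finished(). The model shows that waiters are released only if the daemon's close path actually runs.

```python
import threading


class DaemonModel:
    """Toy stand-in for a helper daemon that other tasks wait on.

    All names here are hypothetical; this is not the devfs implementation.
    """

    def __init__(self):
        # Waiters block on this event until the daemon reports completion.
        self.finished = threading.Event()

    def close(self):
        # Models the close/release path: wake every waiter.
        self.finished.set()

    def run(self, close_runs):
        # close_runs=True: the daemon goes away and its close path runs.
        # close_runs=False: the daemon stops making progress and the
        # close path never runs, so nobody signals the waiters.
        if close_runs:
            self.close()


def waiter_is_woken(close_runs, timeout=0.2):
    """Return True if a waiter is released, False if it would stay blocked."""
    daemon = DaemonModel()
    worker = threading.Thread(target=daemon.run, args=(close_runs,))
    worker.start()
    worker.join()
    # A real waiter would block with no timeout. The timeout exists only
    # so that this sketch terminates and the hang becomes observable.
    return daemon.finished.wait(timeout)
```

Calling `waiter_is_woken(True)` models the crash-then-close case described above; `waiter_is_woken(False)` models a daemon that is stuck and never reaches its close path, which leaves every waiter blocked.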

* Re: [BUG] wait_for_devfsd_finished deadlock
From: William Lee Irwin III @ 2001-12-03 1:34 UTC
To: Richard Gooch; +Cc: linux-kernel

On Sun, Dec 02, 2001 at 12:01:58PM -0700, Richard Gooch wrote:
> Your sh process appears to be hung in wait_for_devfsd_finished(). It
> would be helpful to know what devfsd was doing at this time. If it
> were hung internally (in user-space), it would account for this
> behaviour. However, if devfsd crashes, then devfsd_close() will be
> called, which will wake any waiters.

An oops trace from quite a bit further back in my logs makes this
appear to be an instance of an already reported problem. I apologize
for overlooking this detail in my prior post:

Dec 1 16:49:11 holomorphy kernel: printing eip:
Dec 1 16:49:11 holomorphy kernel: c016536d
Dec 1 16:49:11 holomorphy kernel: Oops: 0002
Dec 1 16:49:11 holomorphy kernel: CPU: 0
Dec 1 16:49:11 holomorphy kernel: EIP: 0010:[devfs_put+13/192] Not tainted
Dec 1 16:49:11 holomorphy kernel: EFLAGS: 00010206
Dec 1 16:49:11 holomorphy kernel: eax: 5a5a5a5a ebx: 5a5a5a5a ecx: 00000002 edx: 5a5a5a5a
Dec 1 16:49:11 holomorphy kernel: esi: 00000000 edi: 00000026 ebp: 00000000 esp: ef785f40
Dec 1 16:49:11 holomorphy kernel: ds: 0018 es: 0018 ss: 0018
Dec 1 16:49:11 holomorphy kernel: Process devfsd (pid: 20, stackpage=ef785000)
Dec 1 16:49:12 holomorphy kernel: Stack: 00000026 c01680dc 5a5a5a5a c1ca8d74 ffffffea 00000000 00000420 c1cc4800
Dec 1 16:49:12 holomorphy kernel: c02a2a00 ef653308 5a5a5a5a 000003fa 00000000 00000000 00000001 00000000
Dec 1 16:49:12 holomorphy kernel: ef784000 00000000 00000000 00000000 ef784000 c02a2a2c c02a2a2c c0130bd6
Dec 1 16:49:12 holomorphy kernel: Call Trace: [devfsd_read+860/992] [sys_read+150/208] [system_call+51/56]
Dec 1 16:49:12 holomorphy kernel:
Dec 1 16:49:12 holomorphy kernel: Code: ff 4b 04 0f 94 c0 84 c0 0f 84 9d 00 00 00 3b 1d d8 1b 30 c0

On Sun, Dec 02, 2001 at 12:01:58PM -0700, Richard Gooch wrote:
> Are these related?
> cron -> pipe_wait()
> procmail -> interruptible_sleep_on_locked()
> exim -> sys_wait4()
> Maybe these are just waiting on mutt(1)?

I don't believe so. In my configuration, exim delivers to a file
through a procmail pipe, and to the best of my knowledge, mutt does
little more than monitor the files with poll or select, which should
not interfere with the completion of their operations. cron should
be entirely unrelated, as no mail-related activities are scheduled
with it.

I suspect that, in the case I reported, the destruction of the devfsd
thread is responsible for the deadlocks.

Thanks,
Bill
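
For readers following the oops above, a minimal sketch of how its fields can be read. It rests on two assumptions that the thread itself does not state: that "Oops: 0002" is an i386 page-fault error code (bit 0 = protection fault rather than page not present, bit 1 = write rather than read, bit 2 = user rather than kernel mode), and that the repeated 0x5a bytes are the slab poison pattern written when slab debugging is enabled in 2.4 kernels. The function names are hypothetical helpers, not part of any kernel tool.

```python
def decode_page_fault_code(code_hex):
    """Decode the low three bits of an i386 page-fault error code.

    Assumes the oops was raised by the page-fault handler.
    """
    code = int(code_hex, 16)
    return {
        "cause": "protection fault" if code & 1 else "page not present",
        "access": "write" if code & 2 else "read",
        "mode": "user" if code & 4 else "kernel",
    }


def poisoned_registers(registers, poison_byte=0x5A):
    """Return the names of 32-bit registers filled entirely with poison_byte.

    The default 0x5a is an assumption about the slab poison value.
    """
    pattern = f"{poison_byte:02x}" * 4
    return sorted(
        name for name, value in registers.items() if value.lower() == pattern
    )


# Register values copied from the oops in this message.
regs = {
    "eax": "5a5a5a5a", "ebx": "5a5a5a5a", "ecx": "00000002", "edx": "5a5a5a5a",
    "esi": "00000000", "edi": "00000026", "ebp": "00000000", "esp": "ef785f40",
}

fault = decode_page_fault_code("0002")
suspects = poisoned_registers(regs)
```

Under those assumptions the oops reads as a kernel-mode write to an unmapped address, with eax, ebx and edx holding poison-like values. The first code bytes, `ff 4b 04`, decode as a decrement of the word at offset 4 from ebx, which would fit devfs_put() touching an entry that had already been freed, though that interpretation is a guess rather than something confirmed in the thread.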

* Re: [BUG] wait_for_devfsd_finished deadlock
From: Richard Gooch @ 2001-12-03 1:44 UTC
To: William Lee Irwin III; +Cc: linux-kernel

William Lee Irwin, III writes:
> On Sun, Dec 02, 2001 at 12:01:58PM -0700, Richard Gooch wrote:
> > Your sh process appears to be hung in wait_for_devfsd_finished(). It
> > would be helpful to know what devfsd was doing at this time. If it
> > were hung internally (in user-space), it would account for this
> > behaviour. However, if devfsd crashes, then devfsd_close() will be
> > called, which will wake any waiters.
>
> An oops trace from quite a bit further back in my logs makes this
> appear to be an instance of an already reported problem. I apologize
> for overlooking this detail in my prior post:

You're referring to the problem reported by Pierre Rousselet?

> Dec 1 16:49:11 holomorphy kernel: printing eip:
> Dec 1 16:49:11 holomorphy kernel: c016536d
> Dec 1 16:49:11 holomorphy kernel: Oops: 0002
> Dec 1 16:49:11 holomorphy kernel: CPU: 0
> Dec 1 16:49:11 holomorphy kernel: EIP: 0010:[devfs_put+13/192] Not tainted
> Dec 1 16:49:11 holomorphy kernel: EFLAGS: 00010206
> Dec 1 16:49:11 holomorphy kernel: eax: 5a5a5a5a ebx: 5a5a5a5a ecx: 00000002 edx: 5a5a5a5a
> Dec 1 16:49:11 holomorphy kernel: esi: 00000000 edi: 00000026 ebp: 00000000 esp: ef785f40
> Dec 1 16:49:11 holomorphy kernel: ds: 0018 es: 0018 ss: 0018
> Dec 1 16:49:11 holomorphy kernel: Process devfsd (pid: 20, stackpage=ef785000)
> Dec 1 16:49:12 holomorphy kernel: Stack: 00000026 c01680dc 5a5a5a5a c1ca8d74 ffffffea 00000000 00000420 c1cc4800
> Dec 1 16:49:12 holomorphy kernel: c02a2a00 ef653308 5a5a5a5a 000003fa 00000000 00000000 00000001 00000000
> Dec 1 16:49:12 holomorphy kernel: ef784000 00000000 00000000 00000000 ef784000 c02a2a2c c02a2a2c c0130bd6
> Dec 1 16:49:12 holomorphy kernel: Call Trace: [devfsd_read+860/992] [sys_read+150/208] [system_call+51/56]
> Dec 1 16:49:12 holomorphy kernel:
> Dec 1 16:49:12 holomorphy kernel: Code: ff 4b 04 0f 94 c0 84 c0 0f 84 9d 00 00 00 3b 1d d8 1b 30 c0
>
> On Sun, Dec 02, 2001 at 12:01:58PM -0700, Richard Gooch wrote:
> > Are these related?
> > cron -> pipe_wait()
> > procmail -> interruptible_sleep_on_locked()
> > exim -> sys_wait4()
> >
> > Maybe these are just waiting on mutt(1)?
>
> I don't believe so. In my configuration, exim delivers to a file
> through a procmail pipe, and to the best of my knowledge, mutt does
> little more than monitor the files with poll or select, which should
> not interfere with the completion of their operations. cron should
> be entirely unrelated, as no mail-related activities are scheduled
> with it.

Odd, especially cron. My version of crond doesn't seem to look in /dev
at all.

> I suspect in the case I reported, the destruction of the devfsd thread
> is responsible for the deadlocks.

That would make sense. If devfsd gets an Oops, the release() method
won't be called, thus no waiters will be woken up. So it looks like
this is the same problem that Pierre reported.

So now I'm going to ask you for debugging help as well. The
information I have from Pierre so far hasn't led me to inspiration.
So, I want to see your complete kernel logs, booted with "devfs=dall",
and make sure you've compiled with:

CONFIG_DEVFS_DEBUG=y
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_HIGHMEM=y
CONFIG_DEBUG_SLAB=y
CONFIG_DEBUG_IOVIRT=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_BUGVERBOSE=y

I don't remember what kernel this was, but I'd prefer if you can
test/reproduce this with 2.4.17-pre2. I'll also want lsmod output,
your devfsd configuration and your complete .config. Just send it to
me only: no need to spam the list with all this detail.

Regards,

Richard....
Permanent: rgooch@atnf.csiro.au
Current: rgooch@ras.ucalgary.ca
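
Before rebuilding, the option list requested above can be checked against a kernel `.config` mechanically. The following is a minimal sketch, assuming the usual `.config` text format in which enabled options appear as `NAME=y` and disabled ones as `# NAME is not set`. `REQUESTED` and `missing_options()` are hypothetical names introduced for illustration, and the check covers only the build options, not the "devfs=dall" boot parameter.

```python
# Options Richard asked for, in the order given in his message.
REQUESTED = (
    "CONFIG_DEVFS_DEBUG",
    "CONFIG_DEBUG_KERNEL",
    "CONFIG_DEBUG_HIGHMEM",
    "CONFIG_DEBUG_SLAB",
    "CONFIG_DEBUG_IOVIRT",
    "CONFIG_MAGIC_SYSRQ",
    "CONFIG_DEBUG_SPINLOCK",
    "CONFIG_DEBUG_BUGVERBOSE",
)


def missing_options(config_text, wanted=REQUESTED):
    """Return the wanted options that are not set to 'y' in config_text."""
    enabled = set()
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        # Skip comments (including '# CONFIG_X is not set') and blank lines.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    return [name for name in wanted if name not in enabled]
```

A typical use would be to read the file and print the result, for example `print(missing_options(open(".config").read()))`; an empty list means every requested option is built in.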

* Re: [BUG] wait_for_devfsd_finished deadlock
From: Richard Gooch @ 2001-12-04 5:04 UTC
To: William Lee Irwin III; +Cc: linux-kernel

William Lee Irwin, III writes:
> While testing 2.4.17-pre1 with some other patches, a situation
> reminiscent of a deadlock arose. mutt(1) would block indefinitely
> while opening a large mbox, and all further calls to sys_open()
> would block indefinitely.
>
> After some further testing to isolate the problem, I reproduced this
> behavior in the vanilla 2.4.17-pre1 kernel. The sysrq output showed
> a number of processes with the following call trace in the devfs
> core:
[...]

Just a followup: did you get around to trying devfs-patch-v199.2 ?
That should fix the problem.

Regards,

Richard....
Permanent: rgooch@atnf.csiro.au
Current: rgooch@ras.ucalgary.ca