* 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected
@ 2006-09-05 17:37 Miles Lane
2006-09-05 18:13 ` Andrew Morton
2006-09-06 6:39 ` Arjan van de Ven
0 siblings, 2 replies; 22+ messages in thread
From: Miles Lane @ 2006-09-05 17:37 UTC (permalink / raw)
To: LKML, Andrew Morton, Herbert Xu
ieee1394: Node changed: 0-01:1023 -> 0-00:1023
ieee1394: Node changed: 0-02:1023 -> 0-01:1023
ieee1394: Node suspended: ID:BUS[0-00:1023] GUID[0080880002103eae]
=============================================
[ INFO: possible recursive locking detected ]
2.6.18-rc5-mm1 #2
---------------------------------------------
knodemgrd_0/2321 is trying to acquire lock:
(&s->rwsem){----}, at: [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
but task is already holding lock:
(&s->rwsem){----}, at: [<f8959078>] nodemgr_host_thread+0x717/0x883 [ieee1394]
other info that might help us debug this:
2 locks held by knodemgrd_0/2321:
#0: (nodemgr_serialize){--..}, at: [<c11e76cd>]
mutex_lock_interruptible+0x1c/0x21
#1: (&s->rwsem){----}, at: [<f8959078>]
nodemgr_host_thread+0x717/0x883 [ieee1394]
stack backtrace:
[<c1003c97>] dump_trace+0x69/0x1b7
[<c1003dfa>] show_trace_log_lvl+0x15/0x28
[<c10040f5>] show_trace+0x16/0x19
[<c1004110>] dump_stack+0x18/0x1d
[<c102f1e1>] __lock_acquire+0x7a2/0x9f8
[<c102f70a>] lock_acquire+0x56/0x74
[<c102b805>] down_write+0x27/0x41
[<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
[<f8959098>] nodemgr_host_thread+0x737/0x883 [ieee1394]
[<c1028c19>] kthread+0xaf/0xde
[<c100397b>] kernel_thread_helper+0x7/0x10
DWARF2 unwinder stuck at kernel_thread_helper+0x7/0x10
Leftover inexact backtrace:
[<c1003dfa>] show_trace_log_lvl+0x15/0x28
[<c10040f5>] show_trace+0x16/0x19
[<c1004110>] dump_stack+0x18/0x1d
[<c102f1e1>] __lock_acquire+0x7a2/0x9f8
[<c102f70a>] lock_acquire+0x56/0x74
[<c102b805>] down_write+0x27/0x41
[<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
[<f8959098>] nodemgr_host_thread+0x737/0x883 [ieee1394]
[<c1028c19>] kthread+0xaf/0xde
[<c100397b>] kernel_thread_helper+0x7/0x10
=======================
ieee1394: Node resumed: ID:BUS[0-00:1023] GUID[0080880002103eae]
ieee1394: Node changed: 0-00:1023 -> 0-01:1023
ieee1394: Node changed: 0-01:1023 -> 0-02:1023
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected
2006-09-05 17:37 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected Miles Lane
@ 2006-09-05 18:13 ` Andrew Morton
2006-09-05 18:16 ` Miles Lane
` (2 more replies)
2006-09-06 6:39 ` Arjan van de Ven
1 sibling, 3 replies; 22+ messages in thread
From: Andrew Morton @ 2006-09-05 18:13 UTC (permalink / raw)
To: Miles Lane; +Cc: LKML, Herbert Xu, linux1394-devel

On Tue, 5 Sep 2006 10:37:51 -0700
"Miles Lane" <miles.lane@gmail.com> wrote:

> ieee1394: Node changed: 0-01:1023 -> 0-00:1023
> ieee1394: Node changed: 0-02:1023 -> 0-01:1023
> ieee1394: Node suspended: ID:BUS[0-00:1023] GUID[0080880002103eae]
>
> =============================================
> [ INFO: possible recursive locking detected ]
> 2.6.18-rc5-mm1 #2
> ---------------------------------------------
> knodemgrd_0/2321 is trying to acquire lock:
> (&s->rwsem){----}, at: [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
>
> but task is already holding lock:
> (&s->rwsem){----}, at: [<f8959078>] nodemgr_host_thread+0x717/0x883 [ieee1394]
>
> other info that might help us debug this:
> 2 locks held by knodemgrd_0/2321:
> #0: (nodemgr_serialize){--..}, at: [<c11e76cd>]
> mutex_lock_interruptible+0x1c/0x21
> #1: (&s->rwsem){----}, at: [<f8959078>]
> nodemgr_host_thread+0x717/0x883 [ieee1394]
>
> stack backtrace:
> [<c1003c97>] dump_trace+0x69/0x1b7
> [<c1003dfa>] show_trace_log_lvl+0x15/0x28
> [<c10040f5>] show_trace+0x16/0x19
> [<c1004110>] dump_stack+0x18/0x1d
> [<c102f1e1>] __lock_acquire+0x7a2/0x9f8
> [<c102f70a>] lock_acquire+0x56/0x74
> [<c102b805>] down_write+0x27/0x41
> [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
> [<f8959098>] nodemgr_host_thread+0x737/0x883 [ieee1394]
> [<c1028c19>] kthread+0xaf/0xde
> [<c100397b>] kernel_thread_helper+0x7/0x10
> DWARF2 unwinder stuck at kernel_thread_helper+0x7/0x10
>
> Leftover inexact backtrace:
>
> [<c1003dfa>] show_trace_log_lvl+0x15/0x28
> [<c10040f5>] show_trace+0x16/0x19
> [<c1004110>] dump_stack+0x18/0x1d
> [<c102f1e1>] __lock_acquire+0x7a2/0x9f8
> [<c102f70a>] lock_acquire+0x56/0x74
> [<c102b805>] down_write+0x27/0x41
> [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
> [<f8959098>] nodemgr_host_thread+0x737/0x883 [ieee1394]
> [<c1028c19>] kthread+0xaf/0xde
> [<c100397b>] kernel_thread_helper+0x7/0x10
> =======================
> ieee1394: Node resumed: ID:BUS[0-00:1023] GUID[0080880002103eae]
> ieee1394: Node changed: 0-00:1023 -> 0-01:1023
> ieee1394: Node changed: 0-01:1023 -> 0-02:1023

That's a 1394 glitch, possibly introduced by git-ieee1394.patch.

^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected
2006-09-05 18:13 ` Andrew Morton
@ 2006-09-05 18:16 ` Miles Lane
2006-09-05 19:03 ` Stefan Richter
2006-09-05 19:23 ` Stefan Richter
2006-09-05 19:49 ` Miles Lane
2 siblings, 1 reply; 22+ messages in thread
From: Miles Lane @ 2006-09-05 18:16 UTC (permalink / raw)
To: Andrew Morton; +Cc: LKML, Herbert Xu, linux1394-devel

On 9/5/06, Andrew Morton <akpm@osdl.org> wrote:
> On Tue, 5 Sep 2006 10:37:51 -0700
> "Miles Lane" <miles.lane@gmail.com> wrote:
>
> > ieee1394: Node changed: 0-01:1023 -> 0-00:1023
> > ieee1394: Node changed: 0-02:1023 -> 0-01:1023
> > ieee1394: Node suspended: ID:BUS[0-00:1023] GUID[0080880002103eae]
> >
> > =============================================
> > [ INFO: possible recursive locking detected ]
> > 2.6.18-rc5-mm1 #2
> > ---------------------------------------------
> > knodemgrd_0/2321 is trying to acquire lock:
> > (&s->rwsem){----}, at: [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
> >
> > but task is already holding lock:
> > (&s->rwsem){----}, at: [<f8959078>] nodemgr_host_thread+0x717/0x883 [ieee1394]
> >
> > other info that might help us debug this:
> > 2 locks held by knodemgrd_0/2321:
> > #0: (nodemgr_serialize){--..}, at: [<c11e76cd>]
> > mutex_lock_interruptible+0x1c/0x21
> > #1: (&s->rwsem){----}, at: [<f8959078>]
> > nodemgr_host_thread+0x717/0x883 [ieee1394]
> >
> > stack backtrace:
> > [<c1003c97>] dump_trace+0x69/0x1b7
> > [<c1003dfa>] show_trace_log_lvl+0x15/0x28
> > [<c10040f5>] show_trace+0x16/0x19
> > [<c1004110>] dump_stack+0x18/0x1d
> > [<c102f1e1>] __lock_acquire+0x7a2/0x9f8
> > [<c102f70a>] lock_acquire+0x56/0x74
> > [<c102b805>] down_write+0x27/0x41
> > [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
> > [<f8959098>] nodemgr_host_thread+0x737/0x883 [ieee1394]
> > [<c1028c19>] kthread+0xaf/0xde
> > [<c100397b>] kernel_thread_helper+0x7/0x10
> > DWARF2 unwinder stuck at kernel_thread_helper+0x7/0x10
> >
> > Leftover inexact backtrace:
> >
> > [<c1003dfa>] show_trace_log_lvl+0x15/0x28
> > [<c10040f5>] show_trace+0x16/0x19
> > [<c1004110>] dump_stack+0x18/0x1d
> > [<c102f1e1>] __lock_acquire+0x7a2/0x9f8
> > [<c102f70a>] lock_acquire+0x56/0x74
> > [<c102b805>] down_write+0x27/0x41
> > [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
> > [<f8959098>] nodemgr_host_thread+0x737/0x883 [ieee1394]
> > [<c1028c19>] kthread+0xaf/0xde
> > [<c100397b>] kernel_thread_helper+0x7/0x10
> > =======================
> > ieee1394: Node resumed: ID:BUS[0-00:1023] GUID[0080880002103eae]
> > ieee1394: Node changed: 0-00:1023 -> 0-01:1023
> > ieee1394: Node changed: 0-01:1023 -> 0-02:1023
>
> That's a 1394 glitch, possibly introduced by git-ieee1394.patch.

Would you like me to verify that removing the patch fixes it, or
should I wait for the 2.6.18-rc6-mm1 tree?

Thanks,
Miles

^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected
2006-09-05 18:16 ` Miles Lane
@ 2006-09-05 19:03 ` Stefan Richter
2006-09-05 19:19 ` Miles Lane
0 siblings, 1 reply; 22+ messages in thread
From: Stefan Richter @ 2006-09-05 19:03 UTC (permalink / raw)
To: Miles Lane; +Cc: Andrew Morton, linux1394-devel, LKML, Herbert Xu

Miles Lane wrote:
> On 9/5/06, Andrew Morton <akpm@osdl.org> wrote:
>> On Tue, 5 Sep 2006 10:37:51 -0700
>> "Miles Lane" <miles.lane@gmail.com> wrote:
>>
>>> ieee1394: Node changed: 0-01:1023 -> 0-00:1023
>>> ieee1394: Node changed: 0-02:1023 -> 0-01:1023
>>> ieee1394: Node suspended: ID:BUS[0-00:1023] GUID[0080880002103eae]
>>>
>>> =============================================
>>> [ INFO: possible recursive locking detected ]
>>> 2.6.18-rc5-mm1 #2
>>> ---------------------------------------------
>>> knodemgrd_0/2321 is trying to acquire lock:
>>> (&s->rwsem){----}, at: [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
>>>
>>> but task is already holding lock:
>>> (&s->rwsem){----}, at: [<f8959078>] nodemgr_host_thread+0x717/0x883 [ieee1394]

How often does this happen?

[...]
>> That's a 1394 glitch, possibly introduced by git-ieee1394.patch.
>
> Would you like me to verify that removing the patch fixes it, or
> should I wait for the 2.6.18-rc6-mm1 tree?

My patches
"ieee1394: nodemgr: switch to kthread api, replace reset semaphore" and
"ieee1394: nodemgr: convert nodemgr_serialize semaphore to mutex"
may be relevant. They are included in git-ieee1394.patch.

Could you revert them individually and test? It should be possible to
just "patch -p1 -R < ...." the following patchfiles:
http://me.in-berlin.de/~s5r6/linux1394/updates/2.6.18-rc5/patches/119-ieee1394-nodemgr-convert-nodemgr_serialize-semaphore-to-mutex.patch
If the problem persists, also revert
http://me.in-berlin.de/~s5r6/linux1394/updates/2.6.18-rc5/patches/118-ieee1394-nodemgr-switch-to-kthread-api--replace-reset-semaphore.patch

If that does not help, install them again and unapply all ieee1394
patches from -mm. If you have the time.

Thanks a lot,
-- 
Stefan Richter
-=====-=-==- =--= --=-=
http://arcgraph.de/sr/

^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected
2006-09-05 19:03 ` Stefan Richter
@ 2006-09-05 19:19 ` Miles Lane
2006-09-05 19:51 ` Stefan Richter
0 siblings, 1 reply; 22+ messages in thread
From: Miles Lane @ 2006-09-05 19:19 UTC (permalink / raw)
To: Stefan Richter; +Cc: Andrew Morton, linux1394-devel, LKML, Herbert Xu

On 9/5/06, Stefan Richter <stefanr@s5r6.in-berlin.de> wrote:
> Miles Lane wrote:
> > On 9/5/06, Andrew Morton <akpm@osdl.org> wrote:
> >> On Tue, 5 Sep 2006 10:37:51 -0700
> >> "Miles Lane" <miles.lane@gmail.com> wrote:
> >>
> >>> ieee1394: Node changed: 0-01:1023 -> 0-00:1023
> >>> ieee1394: Node changed: 0-02:1023 -> 0-01:1023
> >>> ieee1394: Node suspended: ID:BUS[0-00:1023] GUID[0080880002103eae]
> >>>
> >>> =============================================
> >>> [ INFO: possible recursive locking detected ]
> >>> 2.6.18-rc5-mm1 #2
> >>> ---------------------------------------------
> >>> knodemgrd_0/2321 is trying to acquire lock:
> >>> (&s->rwsem){----}, at: [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
> >>>
> >>> but task is already holding lock:
> >>> (&s->rwsem){----}, at: [<f8959078>] nodemgr_host_thread+0x717/0x883 [ieee1394]
>
> How often does this happen?

It seems to happen each time I plug in my JVC MiniDV camera (model GR-DVL9800U).

> >> That's a 1394 glitch, possibly introduced by git-ieee1394.patch.
> >
> > Would you like me to verify that removing the patch fixes it, or
> > should I wait for the 2.6.18-rc6-mm1 tree?
>
> My patches
> "ieee1394: nodemgr: switch to kthread api, replace reset semaphore" and
> "ieee1394: nodemgr: convert nodemgr_serialize semaphore to mutex"
> may be relevant. They are included in git-ieee1394.patch.
>
> Could you revert them individually and test? It should be possible to
> just "patch -p1 -R < ...." the following patchfiles:
> http://me.in-berlin.de/~s5r6/linux1394/updates/2.6.18-rc5/patches/119-ieee1394-nodemgr-convert-nodemgr_serialize-semaphore-to-mutex.patch
> If the problem persists, also revert
> http://me.in-berlin.de/~s5r6/linux1394/updates/2.6.18-rc5/patches/118-ieee1394-nodemgr-switch-to-kthread-api--replace-reset-semaphore.patch
>
> If that does not help, install them again and unapply all ieee1394
> patches from -mm. If you have the time.

I am setting up to test with the first patch removed.
The patch doesn't apply cleanly, but I suspect this is no big deal.

patch -p1 -R < /home/miles/119-ieee1394-nodemgr-convert-nodemgr_serialize-semaphore-to-mutex.patch
patching file drivers/ieee1394/nodemgr.c
Hunk #2 succeeded at 1630 (offset 9 lines).
Hunk #3 succeeded at 1659 (offset 9 lines).
Hunk #4 succeeded at 1677 (offset 9 lines).

^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected
2006-09-05 19:19 ` Miles Lane
@ 2006-09-05 19:51 ` Stefan Richter
0 siblings, 0 replies; 22+ messages in thread
From: Stefan Richter @ 2006-09-05 19:51 UTC (permalink / raw)
To: Miles Lane; +Cc: Andrew Morton, linux1394-devel, LKML, Herbert Xu

Miles Lane wrote:
> The patch doesn't apply cleanly, but I suspect this is no big deal.
>
> patch -p1 -R <
> /home/miles/119-ieee1394-nodemgr-convert-nodemgr_serialize-semaphore-to-mutex.patch
>
> patching file drivers/ieee1394/nodemgr.c
> Hunk #2 succeeded at 1630 (offset 9 lines).
> Hunk #3 succeeded at 1659 (offset 9 lines).
> Hunk #4 succeeded at 1677 (offset 9 lines).

Yes, these offsets are harmless. Thanks for the help to debug this.
-- 
Stefan Richter
-=====-=-==- =--= --=-=
http://arcgraph.de/sr/

^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected
2006-09-05 18:13 ` Andrew Morton
2006-09-05 18:16 ` Miles Lane
@ 2006-09-05 19:23 ` Stefan Richter
2006-09-05 21:07 ` Arjan van de Ven
2006-09-06 7:13 ` Stefan Richter
2006-09-05 19:49 ` Miles Lane
2 siblings, 2 replies; 22+ messages in thread
From: Stefan Richter @ 2006-09-05 19:23 UTC (permalink / raw)
To: Andrew Morton; +Cc: Miles Lane, linux1394-devel, LKML, Herbert Xu

Andrew Morton wrote:
> On Tue, 5 Sep 2006 10:37:51 -0700
> "Miles Lane" <miles.lane@gmail.com> wrote:
>
>> ieee1394: Node changed: 0-01:1023 -> 0-00:1023
>> ieee1394: Node changed: 0-02:1023 -> 0-01:1023
>> ieee1394: Node suspended: ID:BUS[0-00:1023] GUID[0080880002103eae]
>>
>> =============================================
>> [ INFO: possible recursive locking detected ]
>> 2.6.18-rc5-mm1 #2
>> ---------------------------------------------
>> knodemgrd_0/2321 is trying to acquire lock:
>> (&s->rwsem){----}, at: [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
>>
>> but task is already holding lock:
>> (&s->rwsem){----}, at: [<f8959078>] nodemgr_host_thread+0x717/0x883 [ieee1394]
[...]

This information confuses me. These places are not supposed to be the
ones where the locks were actually acquired, are they?

> That's a 1394 glitch, possibly introduced by git-ieee1394.patch.

Or maybe it's older. Nodemgr takes class->subsys.rwsem and
device.bus->subsys.rwsem. It always did. Could there be a change in
driver core which makes this recursive? Or has it always been recursive?
For example,

static void nodemgr_update_pdrv(struct node_entry *ne)
{
	struct unit_directory *ud;
	struct hpsb_protocol_driver *pdrv;
	struct class *class = &nodemgr_ud_class;
	struct class_device *cdev;

	down_read(&class->subsys.rwsem);
	list_for_each_entry(cdev, &class->children, node) {
		ud = container_of(cdev, struct unit_directory, class_dev);
		if (ud->ne != ne || !ud->device.driver)
			continue;

		pdrv = container_of(ud->device.driver, struct hpsb_protocol_driver, driver);

		if (pdrv->update && pdrv->update(ud)) {
			down_write(&ud->device.bus->subsys.rwsem);
			device_release_driver(&ud->device);
			up_write(&ud->device.bus->subsys.rwsem);
		}
	}
	up_read(&class->subsys.rwsem);
}

Miles, perhaps you should rather unapply all 1394 patches at once.
git-ieee1394.patch is alas the lowermost patch of a stack of dependent
patches. I somehow expect that the "possible recursive locking" persists
even if all the 1394 patches were removed.

Thanks in advance,
-- 
Stefan Richter
-=====-=-==- =--= --=-=
http://arcgraph.de/sr/

^ permalink raw reply [flat|nested] 22+ messages in thread
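The shape of the report (one rwsem held, a second one of the same name being acquired) can be modelled in userspace. The following is a minimal sketch, assuming that the checker groups locks into classes by the code that initialised them, so that two distinct rwsem instances can still count as "the same lock" for the warning. All names here (`model_lock`, `held_stack`, `model_acquire`, `model_release`) are hypothetical and are not kernel API.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_HELD 8

/* Hypothetical stand-in for a lock. Instances are distinct objects,
 * but each carries the key of the class it was initialised with. */
struct model_lock {
	const char *class_key;
};

/* Per-task record of the lock classes currently held, oldest first. */
struct held_stack {
	const char *held[MAX_HELD];
	size_t depth;
};

/* Record an acquisition. Returns false (and records nothing) if the
 * task already holds a lock of the same class, or if the stack is full.
 * Keys are compared by address, not by string contents. */
static bool model_acquire(struct held_stack *task, const struct model_lock *lock)
{
	for (size_t i = 0; i < task->depth; i++)
		if (task->held[i] == lock->class_key)
			return false;	/* "possible recursive locking" */
	if (task->depth == MAX_HELD)
		return false;
	task->held[task->depth++] = lock->class_key;
	return true;
}

/* Drop the most recently recorded acquisition, if any. */
static void model_release(struct held_stack *task)
{
	if (task->depth > 0)
		task->depth--;
}
```

In this model a "class" rwsem and a "bus" rwsem that share one key trigger the check even though they are separate objects. Whether that is what happened in the report above is left open by the thread at this point.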
* Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected
2006-09-05 19:23 ` Stefan Richter
@ 2006-09-05 21:07 ` Arjan van de Ven
2006-09-05 22:27 ` Stefan Richter
2006-09-06 7:13 ` Stefan Richter
1 sibling, 1 reply; 22+ messages in thread
From: Arjan van de Ven @ 2006-09-05 21:07 UTC (permalink / raw)
To: Stefan Richter
Cc: Andrew Morton, Miles Lane, linux1394-devel, LKML, Herbert Xu

On Tue, 2006-09-05 at 21:23 +0200, Stefan Richter wrote:
> Andrew Morton wrote:
> > On Tue, 5 Sep 2006 10:37:51 -0700
> > "Miles Lane" <miles.lane@gmail.com> wrote:
> >
> >> ieee1394: Node changed: 0-01:1023 -> 0-00:1023
> >> ieee1394: Node changed: 0-02:1023 -> 0-01:1023
> >> ieee1394: Node suspended: ID:BUS[0-00:1023] GUID[0080880002103eae]
> >>
> >> =============================================
> >> [ INFO: possible recursive locking detected ]
> >> 2.6.18-rc5-mm1 #2
> >> ---------------------------------------------
> >> knodemgrd_0/2321 is trying to acquire lock:
> >> (&s->rwsem){----}, at: [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
> >>
> >> but task is already holding lock:
> >> (&s->rwsem){----}, at: [<f8959078>] nodemgr_host_thread+0x717/0x883 [ieee1394]
> [...]
>
> This information confuses me. These places are not supposed to be the
> ones where the locks were actually acquired, are they?

they should be yes
(but inlined functions get the name of the function they are inlined
into)

^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected
2006-09-05 21:07 ` Arjan van de Ven
@ 2006-09-05 22:27 ` Stefan Richter
2006-09-06 0:10 ` Adrian Bunk
0 siblings, 1 reply; 22+ messages in thread
From: Stefan Richter @ 2006-09-05 22:27 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Andrew Morton, Miles Lane, linux1394-devel, LKML, Herbert Xu

Arjan van de Ven wrote:
> On Tue, 2006-09-05 at 21:23 +0200, Stefan Richter wrote:
>> This information confuses me. These places are not supposed to be the
>> ones where the locks were actually acquired, are they?
>
> they should be yes
> (but inlined functions get the name of the function they are inlined
> into)

Was there function inlining performed? E.g. on those functions that are
called from only one place?
-- 
Stefan Richter
-=====-=-==- =--= --==-
http://arcgraph.de/sr/

^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected
2006-09-05 22:27 ` Stefan Richter
@ 2006-09-06 0:10 ` Adrian Bunk
0 siblings, 0 replies; 22+ messages in thread
From: Adrian Bunk @ 2006-09-06 0:10 UTC (permalink / raw)
To: Stefan Richter
Cc: Arjan van de Ven, Andrew Morton, Miles Lane, linux1394-devel, LKML, Herbert Xu

On Wed, Sep 06, 2006 at 12:27:08AM +0200, Stefan Richter wrote:
> Arjan van de Ven wrote:
> > On Tue, 2006-09-05 at 21:23 +0200, Stefan Richter wrote:
> >> This information confuses me. These places are not supposed to be the
> >> ones where the locks were actually acquired, are they?
> >
> > they should be yes
> > (but inlined functions get the name of the function they are inlined
> > into)
>
> Was there function inlining performed? E.g. on those functions that are
> called from only one place?

If a static function has only one caller it gets inlined.

> Stefan Richter

cu
Adrian

-- 
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected
2006-09-05 19:23 ` Stefan Richter
2006-09-05 21:07 ` Arjan van de Ven
@ 2006-09-06 7:13 ` Stefan Richter
2006-09-06 16:50 ` Stefan Richter
2006-09-06 22:35 ` Greg KH
1 sibling, 2 replies; 22+ messages in thread
From: Stefan Richter @ 2006-09-06 7:13 UTC (permalink / raw)
To: Greg KH
Cc: Andrew Morton, Miles Lane, linux1394-devel, LKML, Herbert Xu, Ben Collins

I wrote:
> Andrew Morton wrote:
>> On Tue, 5 Sep 2006 10:37:51 -0700
>> "Miles Lane" <miles.lane@gmail.com> wrote:
>>
>>> ieee1394: Node changed: 0-01:1023 -> 0-00:1023
>>> ieee1394: Node changed: 0-02:1023 -> 0-01:1023
>>> ieee1394: Node suspended: ID:BUS[0-00:1023] GUID[0080880002103eae]
>>>
>>> =============================================
>>> [ INFO: possible recursive locking detected ]
>>> 2.6.18-rc5-mm1 #2
>>> ---------------------------------------------
>>> knodemgrd_0/2321 is trying to acquire lock:
>>> (&s->rwsem){----}, at: [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
>>>
>>> but task is already holding lock:
>>> (&s->rwsem){----}, at: [<f8959078>] nodemgr_host_thread+0x717/0x883 [ieee1394]
> [...]
>
> This information confuses me. These places are not supposed to be the
> ones where the locks were actually acquired, are they?
>
>> That's a 1394 glitch, possibly introduced by git-ieee1394.patch.
>
> Or maybe it's older. Nodemgr takes class->subsys.rwsem and
> device.bus->subsys.rwsem. It always did. Could there be a change in
> driver core which makes this recursive? Or has it always been recursive?
> For example,
>
> static void nodemgr_update_pdrv(struct node_entry *ne)
> {
> 	struct unit_directory *ud;
> 	struct hpsb_protocol_driver *pdrv;
> 	struct class *class = &nodemgr_ud_class;
> 	struct class_device *cdev;
>
> 	down_read(&class->subsys.rwsem);
> 	list_for_each_entry(cdev, &class->children, node) {
> 		ud = container_of(cdev, struct unit_directory, class_dev);
> 		if (ud->ne != ne || !ud->device.driver)
> 			continue;
>
> 		pdrv = container_of(ud->device.driver, struct hpsb_protocol_driver, driver);
>
> 		if (pdrv->update && pdrv->update(ud)) {
> 			down_write(&ud->device.bus->subsys.rwsem);
> 			device_release_driver(&ud->device);
> 			up_write(&ud->device.bus->subsys.rwsem);
> 		}
> 	}
> 	up_read(&class->subsys.rwsem);
> }

Hi Greg,

perhaps you could advise on this. It appears from grepping through the
sources that drivers/ieee1394/nodemgr.c is the only one with mixed
access to device.bus->subsys.rwsem and class->subsys.rwsem.

Other usages of subsys.rwsem that I found are:
1a.) dev->bus->subsys.rwsem
drivers/ide/ide-proc.c and drivers/net/phy/phy_device.c take
dev->bus->subsys.rwsem. drivers/pnp/card.c takes dev.bus->subsys.rwsem.

1b.) driver.bus->subsys.rwsem
drivers/s390/net/qeth_proc.c takes driver.bus->subsys.rwsem.

2.) class->subsys.rwsem
drivers/scsi/hosts.c takes class->subsys.rwsem.

3.) bustype.subsys.rwsem
drivers/input/serio/serio.c takes serio_bus.subsys.rwsem.
drivers/input/gameport/gameport.c takes gameport_bus.subsys.rwsem.
drivers/base/power/shutdown.c takes devices_subsys.rwsem.
drivers/usb/core/devices.c and devio.c take usb_bus_type.subsys.rwsem.

Do class->subsys.rwsem, bus->subsys.rwsem, and bus_type.subsys.rwsem
point to identical or different lock instances?

Either way, could it hurt to convert nodemgr to uniformly use
ieee1394_bus_type.subsys.rwsem all over the place?

Thanks,
-- 
Stefan Richter
-=====-=-==- =--= --==-
http://arcgraph.de/sr/

^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected
2006-09-06 7:13 ` Stefan Richter
@ 2006-09-06 16:50 ` Stefan Richter
2006-09-06 17:04 ` [RFT PATCH 1/2] ieee1394: nodemgr: fix rwsem recursion Stefan Richter
2006-09-07 22:45 ` 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected Miles Lane
2006-09-06 22:35 ` Greg KH
1 sibling, 2 replies; 22+ messages in thread
From: Stefan Richter @ 2006-09-06 16:50 UTC (permalink / raw)
To: Greg KH
Cc: Andrew Morton, Miles Lane, linux1394-devel, LKML, Herbert Xu, Ben Collins

I wrote:
> I wrote:
>> Or maybe it's older. Nodemgr takes class->subsys.rwsem and
>> device.bus->subsys.rwsem. It always did. Could there be a change in
>> driver core which makes this recursive? Or has it always been recursive?
>> For example,
>>
>> static void nodemgr_update_pdrv(struct node_entry *ne)
>> {
>> 	struct unit_directory *ud;
>> 	struct hpsb_protocol_driver *pdrv;
>> 	struct class *class = &nodemgr_ud_class;
>> 	struct class_device *cdev;
>>
>> 	down_read(&class->subsys.rwsem);
>> 	list_for_each_entry(cdev, &class->children, node) {

This may be wrong anyway. According to include/linux/device.h,
class->sem should be used to protect access to class->children. There
are more places in nodemgr of this sort.

>> 		ud = container_of(cdev, struct unit_directory, class_dev);
>> 		if (ud->ne != ne || !ud->device.driver)
>> 			continue;
>>
>> 		pdrv = container_of(ud->device.driver, struct hpsb_protocol_driver, driver);
>>
>> 		if (pdrv->update && pdrv->update(ud)) {
>> 			down_write(&ud->device.bus->subsys.rwsem);
>> 			device_release_driver(&ud->device);
>> 			up_write(&ud->device.bus->subsys.rwsem);
>> 		}
>> 	}
>> 	up_read(&class->subsys.rwsem);
>> }
>
> Hi Greg,
>
> perhaps you could advise on this. It appears from grepping through the
> sources that drivers/ieee1394/nodemgr.c is the only one with mixed
> access to device.bus->subsys.rwsem and class->subsys.rwsem.
>
> Other usages of subsys.rwsem that I found are:
> 1a.) dev->bus->subsys.rwsem
> drivers/ide/ide-proc.c and drivers/net/phy/phy_device.c take
> dev->bus->subsys.rwsem. drivers/pnp/card.c takes dev.bus->subsys.rwsem.
>
> 1b.) driver.bus->subsys.rwsem
> drivers/s390/net/qeth_proc.c takes driver.bus->subsys.rwsem.
>
> 2.) class->subsys.rwsem
> drivers/scsi/hosts.c takes class->subsys.rwsem.
>
> 3.) bustype.subsys.rwsem
> drivers/input/serio/serio.c takes serio_bus.subsys.rwsem.
> drivers/input/gameport/gameport.c takes gameport_bus.subsys.rwsem.
> drivers/base/power/shutdown.c takes devices_subsys.rwsem.
> drivers/usb/core/devices.c and devio.c take usb_bus_type.subsys.rwsem.
>
> Do class->subsys.rwsem, bus->subsys.rwsem, and bus_type.subsys.rwsem
> point to identical or different lock instances?
>
> Either way, could it hurt to convert nodemgr to uniformly use
> ieee1394_bus_type.subsys.rwsem all over the place?
>
> Thanks,
-- 
Stefan Richter
-=====-=-==- =--= --==-
http://arcgraph.de/sr/

^ permalink raw reply [flat|nested] 22+ messages in thread
* [RFT PATCH 1/2] ieee1394: nodemgr: fix rwsem recursion
2006-09-06 16:50 ` Stefan Richter
@ 2006-09-06 17:04 ` Stefan Richter
2006-09-06 17:06 ` [RFT PATCH 2/2] ieee1394: nodemgr: grab class.subsys.rwsem in nodemgr_resume_ne Stefan Richter
2006-09-07 22:45 ` 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected Miles Lane
1 sibling, 1 reply; 22+ messages in thread
From: Stefan Richter @ 2006-09-06 17:04 UTC (permalink / raw)
To: linux1394-devel
Cc: linux-kernel, Andrew Morton, Miles Lane, Herbert Xu, Ben Collins, Greg KH

nodemgr_update_pdrv grabbed an rw semaphore (as reader) which was
already taken by its caller's caller, nodemgr_probe_ne (as reader too).

Reported by Miles Lane, call path pointed out by Arjan van de Ven.

FIXME:
Shouldn't we rather use class->sem there, not class->subsys.rwsem?

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
Index: linux/drivers/ieee1394/nodemgr.c
===================================================================
--- linux.orig/drivers/ieee1394/nodemgr.c	2006-08-30 20:47:57.000000000 +0200
+++ linux/drivers/ieee1394/nodemgr.c	2006-09-06 19:03:24.000000000 +0200
@@ -1316,6 +1316,7 @@ static void nodemgr_node_scan(struct hos
 }
 
 
+/* Caller needs to hold nodemgr_ud_class.subsys.rwsem as reader. */
 static void nodemgr_suspend_ne(struct node_entry *ne)
 {
 	struct class_device *cdev;
@@ -1368,15 +1369,14 @@ static void nodemgr_resume_ne(struct nod
 }
 
 
+/* Caller needs to hold nodemgr_ud_class.subsys.rwsem as reader. */
 static void nodemgr_update_pdrv(struct node_entry *ne)
 {
 	struct unit_directory *ud;
 	struct hpsb_protocol_driver *pdrv;
-	struct class *class = &nodemgr_ud_class;
 	struct class_device *cdev;
 
-	down_read(&class->subsys.rwsem);
-	list_for_each_entry(cdev, &class->children, node) {
+	list_for_each_entry(cdev, &nodemgr_ud_class.children, node) {
 		ud = container_of(cdev, struct unit_directory, class_dev);
 		if (ud->ne != ne || !ud->device.driver)
 			continue;
@@ -1389,7 +1389,6 @@ static void nodemgr_update_pdrv(struct n
 			up_write(&ud->device.bus->subsys.rwsem);
 		}
 	}
-	up_read(&class->subsys.rwsem);
 }
 
 
@@ -1420,6 +1419,8 @@ static void nodemgr_irm_write_bc(struct
 }
 
 
+/* Caller needs to hold nodemgr_ud_class.subsys.rwsem as reader because the
+ * calls to nodemgr_update_pdrv() and nodemgr_suspend_ne() here require it. */
 static void nodemgr_probe_ne(struct host_info *hi, struct node_entry *ne, int generation)
 {
 	struct device *dev;

^ permalink raw reply [flat|nested] 22+ messages in thread
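Why a nested read acquisition is a problem, and what the patch changes, can be illustrated with a small userspace model. This is a minimal sketch, assuming that a reader arriving after a queued writer has to wait behind that writer; `model_rwsem` and every function below are hypothetical names, not the kernel's rwsem API, and "would sleep" is modelled as a `false` return.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a fair reader/writer semaphore. */
struct model_rwsem {
	int readers;		/* readers currently inside */
	bool writer_waiting;	/* a writer is queued behind them */
};

/* Returns false where a real down_read() would go to sleep. */
static bool model_try_down_read(struct model_rwsem *sem)
{
	if (sem->writer_waiting)
		return false;
	sem->readers++;
	return true;
}

static void model_up_read(struct model_rwsem *sem)
{
	assert(sem->readers > 0);
	sem->readers--;
}

/* A writer shows up while readers are active and queues behind them. */
static void model_writer_arrives(struct model_rwsem *sem)
{
	if (sem->readers > 0)
		sem->writer_waiting = true;
}

/* Shape before the patch: the helper takes the read lock again. */
static bool model_update_nested(struct model_rwsem *sem)
{
	if (!model_try_down_read(sem))
		return false;	/* stuck: the outer reader never gets to release */
	model_up_read(sem);
	return true;
}

/* Shape after the patch: the helper relies on the caller's read lock. */
static bool model_update_caller_locked(struct model_rwsem *sem)
{
	assert(sem->readers > 0);	/* caller must already hold it as reader */
	return true;
}
```

With one reader inside and a writer queued, `model_update_nested()` reports that it would block while `model_update_caller_locked()` proceeds, which mirrors the "Caller needs to hold ... as reader" comments the patch adds.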
* [RFT PATCH 2/2] ieee1394: nodemgr: grab class.subsys.rwsem in nodemgr_resume_ne
2006-09-06 17:04 ` [RFT PATCH 1/2] ieee1394: nodemgr: fix rwsem recursion Stefan Richter
@ 2006-09-06 17:06 ` Stefan Richter
0 siblings, 0 replies; 22+ messages in thread
From: Stefan Richter @ 2006-09-06 17:06 UTC (permalink / raw)
To: linux1394-devel
Cc: linux-kernel, Andrew Morton, Miles Lane, Herbert Xu, Ben Collins, Greg KH

nodemgr_resume_ne was iterating over nodemgr_ud_class.children without
protection by nodemgr_ud_class.subsys.rwsem.

FIXME:
Shouldn't we rather use class->sem there, not class->subsys.rwsem?

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
Index: linux/drivers/ieee1394/nodemgr.c
===================================================================
--- linux.orig/drivers/ieee1394/nodemgr.c	2006-09-06 18:34:35.000000000 +0200
+++ linux/drivers/ieee1394/nodemgr.c	2006-09-06 18:38:20.000000000 +0200
@@ -1352,6 +1352,7 @@ static void nodemgr_resume_ne(struct nod
 	ne->in_limbo = 0;
 	device_remove_file(&ne->device, &dev_attr_ne_in_limbo);
 
+	down_read(&nodemgr_ud_class.subsys.rwsem);
 	down_read(&ne->device.bus->subsys.rwsem);
 	list_for_each_entry(cdev, &nodemgr_ud_class.children, node) {
 		ud = container_of(cdev, struct unit_directory, class_dev);
@@ -1363,6 +1364,7 @@ static void nodemgr_resume_ne(struct nod
 			ud->device.driver->resume(&ud->device);
 	}
 	up_read(&ne->device.bus->subsys.rwsem);
+	up_read(&nodemgr_ud_class.subsys.rwsem);
 
 	HPSB_DEBUG("Node resumed: ID:BUS[" NODE_BUS_FMT "] GUID[%016Lx]",
 		   NODE_BUS_ARGS(ne->host, ne->nodeid), (unsigned long long)ne->guid);

^ permalink raw reply [flat|nested] 22+ messages in thread
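The patch above takes the class rwsem before the bus rwsem and drops them in the opposite order. A minimal sketch of that discipline follows, assuming a fixed rank per lock (class first, bus second); the enum, `lock_order`, `order_acquire` and `order_release` are hypothetical helpers for illustration only and do not exist in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ranks: a lower rank must be taken first. */
enum model_lock_id {
	MODEL_CLASS_RWSEM = 0,
	MODEL_BUS_RWSEM = 1,
};

#define MODEL_NR_LOCKS 2

/* Locks held by one task, in acquisition order. */
struct lock_order {
	enum model_lock_id held[MODEL_NR_LOCKS];
	size_t depth;
};

/* Accept an acquisition only if it keeps ranks strictly ascending. */
static bool order_acquire(struct lock_order *task, enum model_lock_id id)
{
	if (task->depth == MODEL_NR_LOCKS)
		return false;
	if (task->depth > 0 && task->held[task->depth - 1] >= id)
		return false;	/* wrong order, or the same lock twice */
	task->held[task->depth++] = id;
	return true;
}

/* Accept a release only for the most recently acquired lock. */
static bool order_release(struct lock_order *task, enum model_lock_id id)
{
	if (task->depth == 0 || task->held[task->depth - 1] != id)
		return false;
	task->depth--;
	return true;
}
```

Acquiring class then bus and releasing bus then class passes every check; taking the bus lock first and then the class lock is rejected at the second acquisition.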
* Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected
2006-09-06 16:50 ` Stefan Richter
2006-09-06 17:04 ` [RFT PATCH 1/2] ieee1394: nodemgr: fix rwsem recursion Stefan Richter
@ 2006-09-07 22:45 ` Miles Lane
2006-09-07 23:23 ` Stefan Richter
1 sibling, 1 reply; 22+ messages in thread
From: Miles Lane @ 2006-09-07 22:45 UTC (permalink / raw)
To: Stefan Richter
Cc: Greg KH, Andrew Morton, linux1394-devel, LKML, Herbert Xu, Ben Collins

On 9/6/06, Stefan Richter <stefanr@s5r6.in-berlin.de> wrote:
> I wrote:
> > I wrote:
> >> Or maybe it's older. Nodemgr takes class->subsys.rwsem and
> >> device.bus->subsys.rwsem. It always did. Could there be a change in
> >> driver core which makes this recursive? Or has it always been recursive?
> >> For example,
> >>
> >> static void nodemgr_update_pdrv(struct node_entry *ne)
> >> {
> >> 	struct unit_directory *ud;
> >> 	struct hpsb_protocol_driver *pdrv;
> >> 	struct class *class = &nodemgr_ud_class;
> >> 	struct class_device *cdev;
> >>
> >> 	down_read(&class->subsys.rwsem);
> >> 	list_for_each_entry(cdev, &class->children, node) {
>
> This may be wrong anyway. According to include/linux/device.h,
> class->sem should be used to protect access to class->children. There
> are more places in nodemgr of this sort.
>
> >> 		ud = container_of(cdev, struct unit_directory, class_dev);
> >> 		if (ud->ne != ne || !ud->device.driver)
> >> 			continue;
> >>
> >> 		pdrv = container_of(ud->device.driver, struct hpsb_protocol_driver, driver);
> >>
> >> 		if (pdrv->update && pdrv->update(ud)) {
> >> 			down_write(&ud->device.bus->subsys.rwsem);
> >> 			device_release_driver(&ud->device);
> >> 			up_write(&ud->device.bus->subsys.rwsem);
> >> 		}
> >> 	}
> >> 	up_read(&class->subsys.rwsem);
> >> }
> >
> > Hi Greg,
> >
> > perhaps you could advise on this. It appears from grepping through the
> > sources that drivers/ieee1394/nodemgr.c is the only one with mixed
> > access to device.bus->subsys.rwsem and class->subsys.rwsem.
> >
> > Other usages of subsys.rwsem that I found are:
> > 1a.) dev->bus->subsys.rwsem
> > drivers/ide/ide-proc.c and drivers/net/phy/phy_device.c take
> > dev->bus->subsys.rwsem. drivers/pnp/card.c takes dev.bus->subsys.rwsem.
> >
> > 1b.) driver.bus->subsys.rwsem
> > drivers/s390/net/qeth_proc.c takes driver.bus->subsys.rwsem.
> >
> > 2.) class->subsys.rwsem
> > drivers/scsi/hosts.c takes class->subsys.rwsem.
> >
> > 3.) bustype.subsys.rwsem
> > drivers/input/serio/serio.c takes serio_bus.subsys.rwsem.
> > drivers/input/gameport/gameport.c takes gameport_bus.subsys.rwsem.
> > drivers/base/power/shutdown.c takes devices_subsys.rwsem.
> > drivers/usb/core/devices.c and devio.c take usb_bus_type.subsys.rwsem.
> >
> > Do class->subsys.rwsem, bus->subsys.rwsem, and bus_type.subsys.rwsem
> > point to identical or different lock instances?
> >
> > Either way, could it hurt to convert nodemgr to uniformly use
> > ieee1394_bus_type.subsys.rwsem all over the place?

I don't have time to do the bisection testing. If there is a patch
you'd like me to test against 2.6.18-rc5-mm1+all hotfixes, please let
me know. I apologize for not being able to narrow this down further
for you.

Miles

^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected
2006-09-07 22:45 ` 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected Miles Lane
@ 2006-09-07 23:23 ` Stefan Richter
0 siblings, 0 replies; 22+ messages in thread
From: Stefan Richter @ 2006-09-07 23:23 UTC (permalink / raw)
To: Miles Lane
Cc: Greg KH, Andrew Morton, linux1394-devel, LKML, Herbert Xu, Ben Collins

Miles Lane wrote:
> I don't have time to do the bisection testing. If there is a patch
> you'd like me to test against 2.6.18-rc5-mm1+all hotfixes, please let
> me know. I apologize for not being able to narrow this down further
> for you.

Bisection is probably not necessary anymore. The issue seems to be much
older than -mm's changes to nodemgr. Please apply the patches

    ieee1394: nodemgr: fix rwsem recursion
    ieee1394: nodemgr: grab class.subsys.rwsem in nodemgr_resume_ne

on top of all of -mm. I posted them yesterday but will mail them again.
(linux1394-devel was kept in the dark by SpamCop.)

Thanks a lot for your help.
--
Stefan Richter
-=====-=-==- =--= -=---
http://arcgraph.de/sr/
* Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected
2006-09-06 7:13 ` Stefan Richter
2006-09-06 16:50 ` Stefan Richter
@ 2006-09-06 22:35 ` Greg KH
1 sibling, 0 replies; 22+ messages in thread
From: Greg KH @ 2006-09-06 22:35 UTC (permalink / raw)
To: Stefan Richter
Cc: Andrew Morton, Miles Lane, linux1394-devel, LKML, Herbert Xu, Ben Collins

On Wed, Sep 06, 2006 at 09:13:34AM +0200, Stefan Richter wrote:
> Do class->subsys.rwsem, bus->subsys.rwsem, and bus_type.subsys.rwsem
> point to identical or different lock instances?

class->subsys.rwsem is different from the others. bus->subsys.rwsem and
bus_type.subsys.rwsem are probably the same thing (depending on what
that bus-> pointer is to.)

> Either way, could it hurt to convert nodemgr to uniformly use
> ieee1394_bus_type.subsys.rwsem all over the place?

Probably a good idea.

thanks,

greg k-h
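Greg's answer about which expressions name the same lock instance can be illustrated with a small user-space sketch. The `mock_*` types below are hypothetical stand-ins, assuming only that a bus type and a class each embed their own subsystem with its own rwsem and that a device holds a pointer to its bus type; they are not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver-core types; layout is an assumption. */
struct mock_rwsem    { int count; };
struct mock_subsys   { struct mock_rwsem rwsem; };
struct mock_bus_type { struct mock_subsys subsys; };
struct mock_class    { struct mock_subsys subsys; };
struct mock_device   { struct mock_bus_type *bus; };

static struct mock_bus_type ieee1394_bus_type;  /* one bus type object */
static struct mock_class    nodemgr_ud_class;   /* one class object */

/* Two lock expressions refer to the same lock iff their addresses match. */
static bool same_lock(const struct mock_rwsem *a, const struct mock_rwsem *b)
{
	return a == b;
}

/* dev->bus->subsys.rwsem vs. ieee1394_bus_type.subsys.rwsem */
static bool bus_lock_is_bus_type_lock(void)
{
	struct mock_device dev = { .bus = &ieee1394_bus_type };

	assert(dev.bus != 0);
	return same_lock(&dev.bus->subsys.rwsem,
			 &ieee1394_bus_type.subsys.rwsem);
}

/* class->subsys.rwsem vs. dev->bus->subsys.rwsem */
static bool class_lock_is_bus_lock(void)
{
	struct mock_device dev = { .bus = &ieee1394_bus_type };

	return same_lock(&nodemgr_ud_class.subsys.rwsem,
			 &dev.bus->subsys.rwsem);
}
```

Under these assumptions the first helper reports one shared lock when the device's bus pointer refers to `ieee1394_bus_type`, while the class lock is always a separate instance.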
* Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected
2006-09-05 18:13 ` Andrew Morton
2006-09-05 18:16 ` Miles Lane
2006-09-05 19:23 ` Stefan Richter
@ 2006-09-05 19:49 ` Miles Lane
2006-09-05 20:19 ` Stefan Richter
2 siblings, 1 reply; 22+ messages in thread
From: Miles Lane @ 2006-09-05 19:49 UTC (permalink / raw)
To: Andrew Morton; +Cc: LKML, Herbert Xu, linux1394-devel, Stefan Richter

On 9/5/06, Andrew Morton <akpm@osdl.org> wrote:
> On Tue, 5 Sep 2006 10:37:51 -0700
> "Miles Lane" <miles.lane@gmail.com> wrote:
>
> > ieee1394: Node changed: 0-01:1023 -> 0-00:1023
> > ieee1394: Node changed: 0-02:1023 -> 0-01:1023
> > ieee1394: Node suspended: ID:BUS[0-00:1023] GUID[0080880002103eae]
> >
> > =============================================
> > [ INFO: possible recursive locking detected ]
> > 2.6.18-rc5-mm1 #2
> > ---------------------------------------------
> > knodemgrd_0/2321 is trying to acquire lock:
> > (&s->rwsem){----}, at: [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
> >
> > but task is already holding lock:
> > (&s->rwsem){----}, at: [<f8959078>] nodemgr_host_thread+0x717/0x883 [ieee1394]
> >
> > other info that might help us debug this:
> > 2 locks held by knodemgrd_0/2321:
> > #0: (nodemgr_serialize){--..}, at: [<c11e76cd>]
> > mutex_lock_interruptible+0x1c/0x21
> > #1: (&s->rwsem){----}, at: [<f8959078>]
> > nodemgr_host_thread+0x717/0x883 [ieee1394]
> >
> > stack backtrace:
> > [<c1003c97>] dump_trace+0x69/0x1b7
> > [<c1003dfa>] show_trace_log_lvl+0x15/0x28
> > [<c10040f5>] show_trace+0x16/0x19
> > [<c1004110>] dump_stack+0x18/0x1d
> > [<c102f1e1>] __lock_acquire+0x7a2/0x9f8
> > [<c102f70a>] lock_acquire+0x56/0x74
> > [<c102b805>] down_write+0x27/0x41
> > [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
> > [<f8959098>] nodemgr_host_thread+0x737/0x883 [ieee1394]
> > [<c1028c19>] kthread+0xaf/0xde
> > [<c100397b>] kernel_thread_helper+0x7/0x10
> > DWARF2 unwinder stuck at kernel_thread_helper+0x7/0x10
> >
> > Leftover inexact backtrace:
> >
> > [<c1003dfa>] show_trace_log_lvl+0x15/0x28
> > [<c10040f5>] show_trace+0x16/0x19
> > [<c1004110>] dump_stack+0x18/0x1d
> > [<c102f1e1>] __lock_acquire+0x7a2/0x9f8
> > [<c102f70a>] lock_acquire+0x56/0x74
> > [<c102b805>] down_write+0x27/0x41
> > [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
> > [<f8959098>] nodemgr_host_thread+0x737/0x883 [ieee1394]
> > [<c1028c19>] kthread+0xaf/0xde
> > [<c100397b>] kernel_thread_helper+0x7/0x10
> > =======================
> > ieee1394: Node resumed: ID:BUS[0-00:1023] GUID[0080880002103eae]
> > ieee1394: Node changed: 0-00:1023 -> 0-01:1023
> > ieee1394: Node changed: 0-01:1023 -> 0-02:1023
>
> That's a 1394 glitch, possibly introduced by git-ieee1394.patch.
>

Hi Andrew,

I am having trouble with backing out the git-ieee1394 patches.
Suggestions? I am not knowledgeable enough about the kernel code to fix
broken patches.

# patch -p1 -R --dry-run < /home/miles/git-ieee1394.patch
patching file drivers/ieee1394/csr.c
patching file drivers/ieee1394/csr.h
patching file drivers/ieee1394/dma.c
patching file drivers/ieee1394/dma.h
patching file drivers/ieee1394/dv1394-private.h
patching file drivers/ieee1394/dv1394.c
patching file drivers/ieee1394/eth1394.c
Hunk #1 succeeded at 66 with fuzz 1 (offset -1 lines).
patching file drivers/ieee1394/highlevel.h
patching file drivers/ieee1394/hosts.c
Hunk #1 succeeded at 98 with fuzz 2 (offset 8 lines).
Hunk #2 FAILED at 113.
Hunk #3 FAILED at 123.
2 out of 3 hunks FAILED -- saving rejects to file drivers/ieee1394/hosts.c.rej
patching file drivers/ieee1394/hosts.h
Hunk #2 succeeded at 109 (offset -3 lines).
Hunk #3 succeeded at 157 (offset -3 lines).
Hunk #4 succeeded at 167 (offset -3 lines).
Hunk #5 succeeded at 193 (offset -3 lines).
patching file drivers/ieee1394/ieee1394-ioctl.h
patching file drivers/ieee1394/ieee1394.h
patching file drivers/ieee1394/ieee1394_core.c
Hunk #1 succeeded at 354 (offset -1 lines).
patching file drivers/ieee1394/ieee1394_core.h
Hunk #1 FAILED at 1.
Hunk #2 succeeded at 57 (offset -1 lines).
Hunk #3 succeeded at 79 (offset -1 lines).
Hunk #4 succeeded at 91 (offset -1 lines).
Hunk #5 succeeded at 203 (offset -1 lines).
Hunk #6 succeeded at 222 (offset -1 lines).
1 out of 6 hunks FAILED -- saving rejects to file drivers/ieee1394/ieee1394_core.h.rej
patching file drivers/ieee1394/ieee1394_hotplug.h
patching file drivers/ieee1394/ieee1394_transactions.c
Hunk #1 succeeded at 13 with fuzz 2 (offset -1 lines).
Hunk #2 succeeded at 232 (offset 18 lines).
Hunk #3 succeeded at 279 (offset 18 lines).
patching file drivers/ieee1394/ieee1394_transactions.h
patching file drivers/ieee1394/ieee1394_types.h
Hunk #1 FAILED at 1.
Hunk #2 succeeded at 9 with fuzz 2 (offset -22 lines).
Hunk #3 FAILED at 32.
2 out of 3 hunks FAILED -- saving rejects to file drivers/ieee1394/ieee1394_types.h.rej
patching file drivers/ieee1394/iso.c
patching file drivers/ieee1394/iso.h
patching file drivers/ieee1394/nodemgr.c
Hunk #4 succeeded at 418 (offset 10 lines).
Hunk #5 succeeded at 1260 (offset 9 lines).
Hunk #6 succeeded at 1268 (offset 9 lines).
Hunk #7 succeeded at 1309 (offset 9 lines).
Hunk #8 succeeded at 1501 (offset 9 lines).
Hunk #9 succeeded at 1631 (offset 9 lines).
Hunk #10 succeeded at 1676 (offset 9 lines).
Hunk #11 succeeded at 1707 (offset 9 lines).
Hunk #12 succeeded at 1773 (offset 9 lines).
Hunk #13 succeeded at 1815 (offset 9 lines).
patching file drivers/ieee1394/nodemgr.h
Hunk #5 succeeded at 105 with fuzz 1 (offset -1 lines).
Hunk #6 succeeded at 152 (offset -1 lines).
Hunk #7 succeeded at 169 (offset -1 lines).
patching file drivers/ieee1394/ohci1394.c
patching file drivers/ieee1394/raw1394-private.h
patching file drivers/ieee1394/raw1394.c
patching file drivers/ieee1394/sbp2.c
Hunk #1 succeeded at 367 (offset 11 lines).
Hunk #2 succeeded at 380 (offset 11 lines).
patching file drivers/ieee1394/video1394.c

# patch -p1 -R --dry-run < /home/miles/git-ieee1394-fixup.patch
patching file drivers/ieee1394/hosts.c
Hunk #1 succeeded at 100 with fuzz 2 (offset 10 lines).
Hunk #2 FAILED at 117.
Hunk #3 FAILED at 128.
2 out of 3 hunks FAILED -- saving rejects to file drivers/ieee1394/hosts.c.rej
* Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected
2006-09-05 19:49 ` Miles Lane
@ 2006-09-05 20:19 ` Stefan Richter
2006-09-05 20:26 ` Stefan Richter
2006-09-05 20:33 ` Andrew Morton
0 siblings, 2 replies; 22+ messages in thread
From: Stefan Richter @ 2006-09-05 20:19 UTC (permalink / raw)
To: Miles Lane; +Cc: Andrew Morton, LKML, Herbert Xu, linux1394-devel

Miles Lane wrote:
> I am having trouble with backing out the git-ieee1394 patches.

Take a look at
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc5/2.6.18-rc5-mm1/broken-out/series

There are a number of 1394 subsystem patches; the last one is
ieee1394-sbp2-more-help-in-kconfig.patch. (This assumes that no
further external patches touch ieee1394.)

The order of patches in the series file is the order in which they were
applied. Not all of these patches depend on each other, but some do. So
the safest way to unapply them is to follow the exact reverse order.

One tool to make this a little bit easier is quilt. This should be
available as a package for most distributions. I haven't tried it myself
yet, but akpm's "broken-out" patch distribution can be manipulated by
quilt. I guess it works like the following method --- which has the
drawback that you cannot use it with your existing linux-2.6.18-rc5-mm1
build. (Except with a trick, see below.)

Install linux-2.6.18-rc5. Unpack
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc5/2.6.18-rc5-mm1/2.6.18-rc5-mm1-broken-out.tar.bz2
Rename the broken-out directory to "linux-2.6.18-rc5/patches". Copy your
linux-2.6.18-rc5-mm1/.config to linux-2.6.18-rc5.

Apply all the patches, in the order given by patches/series:
$ cd linux-2.6.18-rc5
$ quilt push -a

Fetch all of
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc5/2.6.18-rc5-mm1/hot-fixes/
and add it on top of all regular mm1 patches:
$ quilt import ~/hot-fixes/*.patch
$ quilt push -a

Now open patches/series in an editor. Find the ieee1394 patches. Move
all of them to the bottom of the series file. Save it. You can now
revert each 1394 patch by
$ quilt pop

Build the kernel as usual.

Now to the trick I mentioned before. To avoid starting from
linux-2.6.18-rc5 even though you already built and booted
2.6.18-rc5-mm1, perform the steps above on top of 2.6.18-rc5 up to and
including the step where you imported and pushed the hot-fixes. After
that, just copy the patches/ and .pc/ directories over to your existing
2.6.18-rc5-mm1. Check the effect with
$ cd ../2.6.18-rc5-mm1
$ quilt top
This should give a message that the last hot fix is topmost. It should
now be possible to run "quilt pop" etc.

Anyway; manually removing the ieee1394 patches by looking at the order
in the series file may be faster than setting up quilt and the second
kernel source tree.
--
Stefan Richter
-=====-=-==- =--= --=-=
http://arcgraph.de/sr/
* Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected
2006-09-05 20:19 ` Stefan Richter
@ 2006-09-05 20:26 ` Stefan Richter
2006-09-05 20:33 ` Andrew Morton
1 sibling, 0 replies; 22+ messages in thread
From: Stefan Richter @ 2006-09-05 20:26 UTC (permalink / raw)
To: Miles Lane; +Cc: Andrew Morton, LKML, Herbert Xu, linux1394-devel

I wrote:
[...]
> Now open patches/series in an editor. Find the ieee1394 patches. Move
> all of them to the bottom of the series file. Save it. You can now
> revert each 1394 patch by
> $ quilt pop

(Repeat until git-ieee1394.patch was removed.)

> Build the kernel as usual.

(Of course you just need to build, install, and reload the kernel
modules if you have ieee1394 configured as module.)
--
Stefan Richter
-=====-=-==- =--= --=-=
http://arcgraph.de/sr/
* Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected
2006-09-05 20:19 ` Stefan Richter
2006-09-05 20:26 ` Stefan Richter
@ 2006-09-05 20:33 ` Andrew Morton
1 sibling, 0 replies; 22+ messages in thread
From: Andrew Morton @ 2006-09-05 20:33 UTC (permalink / raw)
To: Stefan Richter; +Cc: Miles Lane, LKML, Herbert Xu, linux1394-devel

On Tue, 05 Sep 2006 22:19:51 +0200
Stefan Richter <stefanr@s5r6.in-berlin.de> wrote:

> One tool to make this a little bit easier is quilt. This should be
> available as a package for most distributions. I haven't tried it myself
> yet, but akpm's "broken-out" patch distribution can be manipulated by
> quilt.

Each -mm announcement contains the following text:

:- If you hit a bug in -mm and it is not obvious which patch caused it, it is
:  most valuable if you can perform a bisection search to identify which patch
:  introduced the bug. Instructions for this process are at
:
:  http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt
:
:  But beware that this process takes some time (around ten rebuilds and
:  reboots), so consider reporting the bug first and if we cannot immediately
:  identify the faulty patch, then perform the bisection search.
:

;)
* Re: 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected
2006-09-05 17:37 2.6.18-rc5-mm1 + all hotfixes -- INFO: possible recursive locking detected Miles Lane
2006-09-05 18:13 ` Andrew Morton
@ 2006-09-06 6:39 ` Arjan van de Ven
1 sibling, 0 replies; 22+ messages in thread
From: Arjan van de Ven @ 2006-09-06 6:39 UTC (permalink / raw)
To: Miles Lane; +Cc: LKML, Andrew Morton, Herbert Xu

On Tue, 2006-09-05 at 10:37 -0700, Miles Lane wrote:
> ieee1394: Node changed: 0-01:1023 -> 0-00:1023
> ieee1394: Node changed: 0-02:1023 -> 0-01:1023
> ieee1394: Node suspended: ID:BUS[0-00:1023] GUID[0080880002103eae]
>
> =============================================
> [ INFO: possible recursive locking detected ]
> 2.6.18-rc5-mm1 #2
> ---------------------------------------------
> knodemgrd_0/2321 is trying to acquire lock:
> (&s->rwsem){----}, at: [<f8958897>] nodemgr_probe_ne+0x311/0x38d [ieee1394]
>
> but task is already holding lock:
> (&s->rwsem){----}, at: [<f8959078>] nodemgr_host_thread+0x717/0x883 [ieee1394]

looks like a real bug to me:

nodemgr_node_probe() takes down_read(&class->subsys.rwsem) and then
calls nodemgr_probe_ne() which calls nodemgr_update_pdrv() which does
down_read(&class->subsys.rwsem).

Such recursive taking of rwsems is not allowed (rwsems are fair; if a
writer comes in between the two reads, there is a deadlock).
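The deadlock Arjan describes can be sketched with a tiny user-space model. This is only an illustration under stated assumptions: `fair_rwsem`, `model_down_read`, `model_down_write`, and `recursive_read_deadlocks` are hypothetical names, and the model captures just one property of a fair rwsem, namely that a new reader queues behind any writer that is already waiting. It is not the kernel's rwsem implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a fair rwsem; not the kernel implementation. */
struct fair_rwsem {
	int  active_readers;  /* readers currently holding the lock */
	bool writer_active;   /* a writer currently holds the lock */
	int  queued_writers;  /* writers waiting for the lock */
};

/* Returns true if the read lock is granted, false if the caller would block. */
static bool model_down_read(struct fair_rwsem *s)
{
	/* Fairness: a new reader must wait behind any queued writer. */
	if (s->writer_active || s->queued_writers > 0)
		return false;
	s->active_readers++;
	return true;
}

/* Returns true if the write lock is granted, false if the caller queues. */
static bool model_down_write(struct fair_rwsem *s)
{
	if (s->writer_active || s->active_readers > 0) {
		s->queued_writers++;
		return false;
	}
	s->writer_active = true;
	return true;
}

/*
 * One task takes the read lock twice (as in nodemgr_node_probe() followed
 * by nodemgr_update_pdrv()); optionally another task requests the write
 * lock in between. Returns true if the nested read would block forever.
 */
static bool recursive_read_deadlocks(bool writer_arrives_between)
{
	struct fair_rwsem s = {0};
	bool outer, inner;

	outer = model_down_read(&s);      /* outer read: always granted */
	assert(outer);
	(void)outer;

	if (writer_arrives_between)
		(void)model_down_write(&s);   /* blocks behind the outer read */

	inner = model_down_read(&s);      /* nested read by the same task */

	/*
	 * If the nested read blocks, the task waits for the writer, and the
	 * writer waits for the task to drop its outer read: a deadlock.
	 */
	return !inner;
}
```

In this model the nested read succeeds when no writer intervenes, which is why the bug can stay hidden for a long time, and blocks permanently once a writer is queued between the two reads.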