* JFS - real deadlock and lockdep warning (2.6.18-rc5-mm1)
@ 2006-09-05 20:33 Mattia Dongili
2006-09-06 22:00 ` Dave Kleikamp
0 siblings, 1 reply; 8+ messages in thread
From: Mattia Dongili @ 2006-09-05 20:33 UTC (permalink / raw)
To: Linux Kernel Mailing List; +Cc: shaggy, akpm
Hello,
as the subject says it's some time[0] I'm experiencing deadlocks[1] (I'm
only tracking -mm, and sporadically using the stable series). I have a
couple of use cases that seem to reliably trigger the deadlock, namely
using Eclipse and Firefox.
[0]: I'd say at least since 2.6.17-rc-something
[1]: no sysrq, no logs, no nothing, netconsole doesn't log anything
more, just screwed FS which grub can't read (Inconsistent
filesystem - press any key to continue)
I decided to try out the LOCKDEP stuff to see if I can provide more
useful info, and here we go, as soon as my /home (jfs) partition is
written (at login):
[ 72.420000]
[ 72.420000] =============================================
[ 72.420000] [ INFO: possible recursive locking detected ]
[ 72.420000] 2.6.18-rc5-mm1-2-lockdep #9
[ 72.420000] ---------------------------------------------
[ 72.420000] zsh/3844 is trying to acquire lock:
[ 72.420000] (&jfs_ip->commit_mutex){--..}, at: [<c0320948>] mutex_lock+0x8/0x10
[ 72.420000]
[ 72.420000] but task is already holding lock:
[ 72.420000] (&jfs_ip->commit_mutex){--..}, at: [<c0320948>] mutex_lock+0x8/0x10
[ 72.420000]
[ 72.420000] other info that might help us debug this:
[ 72.420000] 2 locks held by zsh/3844:
[ 72.420000] #0: (&inode->i_mutex){--..}, at: [<c0320948>] mutex_lock+0x8/0x10
[ 72.420000] #1: (&jfs_ip->commit_mutex){--..}, at: [<c0320948>] mutex_lock+0x8/0x10
[ 72.420000]
[ 72.420000] stack backtrace:
[ 72.420000] [<c0103b8f>] dump_trace+0x1ef/0x230
[ 72.420000] [<c0103bf6>] show_trace_log_lvl+0x26/0x40
[ 72.420000] [<c01043cb>] show_trace+0x1b/0x20
[ 72.420000] [<c01044b4>] dump_stack+0x24/0x30
[ 72.420000] [<c0136050>] __lock_acquire+0x8e0/0xd80
[ 72.420000] [<c0136869>] lock_acquire+0x69/0x90
[ 72.420000] [<c0320705>] __mutex_lock_slowpath+0x75/0x2b0
[ 72.420000] [<c0320948>] mutex_lock+0x8/0x10
[ 72.420000] [<d137950b>] jfs_create+0xbb/0x420 [jfs]
[ 72.420000] [<c016cb7b>] vfs_create+0xcb/0x120
[ 72.420000] [<c0170158>] open_namei+0x618/0x6f0
[ 72.420000] [<c0162368>] do_filp_open+0x38/0x60
[ 72.420000] [<c01623db>] do_sys_open+0x4b/0xf0
[ 72.420000] [<c01624d7>] sys_open+0x27/0x30
[ 72.420000] [<c01032af>] syscall_call+0x7/0xb
[ 72.420000] [<b7ec8b8d>] 0xb7ec8b8d
[ 72.420000] =======================
I suspect JFS is guilty, anyway my HD has these partitions:
/dev/hda1 on / type reiserfs (rw)
/dev/hda3 on /usr type reiserfs (rw)
/dev/hda5 on /home type jfs (rw)
bootlog: http://oioio.altervista.org/linux/dmesg-2.6.18-rc5-mm1-lockdep
config: http://oioio.altervista.org/linux/config-2.6.18-rc5-mm1-lockdep
--
mattia
:wq!
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: JFS - real deadlock and lockdep warning (2.6.18-rc5-mm1)
2006-09-05 20:33 JFS - real deadlock and lockdep warning (2.6.18-rc5-mm1) Mattia Dongili
@ 2006-09-06 22:00 ` Dave Kleikamp
2006-09-07 15:39 ` Mattia Dongili
2006-09-07 18:49 ` argh! it's reiserfs deadlocking! [was: Re: JFS - real deadlock and lockdep warning (2.6.18-rc5-mm1)] Mattia Dongili
0 siblings, 2 replies; 8+ messages in thread
From: Dave Kleikamp @ 2006-09-06 22:00 UTC (permalink / raw)
To: Mattia Dongili; +Cc: Linux Kernel Mailing List, akpm
I meant to reply to this earlier. I've had a lot of distractions.
On Tue, 2006-09-05 at 22:33 +0200, Mattia Dongili wrote:
> Hello,
>
> as the subject says it's some time[0] I'm experiencing deadlocks[1] (I'm
> only tracking -mm, and sporadically using the stable series). I have a
> couple of use cases that seem to reliably trigger the deadlock, namely
> using Eclipse and Firefox.
>
> [0]: I'd say at least since 2.6.17-rc-something
> [1]: no sysrq, no logs, no nothing, netconsole doesn't log anything
> more, just screwed FS which grub can't read (Inconsistent
> filesystem - press any key to continue)
>
> I decided to try out the LOCKDEP stuff to see if I can provide more
> useful info, and here we go, as soon as my /home (jfs) partition is
> written (at login):
> [ 72.420000]
> [ 72.420000] =============================================
> [ 72.420000] [ INFO: possible recursive locking detected ]
> [ 72.420000] 2.6.18-rc5-mm1-2-lockdep #9
> [ 72.420000] ---------------------------------------------
> [ 72.420000] zsh/3844 is trying to acquire lock:
> [ 72.420000] (&jfs_ip->commit_mutex){--..}, at: [<c0320948>] mutex_lock+0x8/0x10
> [ 72.420000]
> [ 72.420000] but task is already holding lock:
> [ 72.420000] (&jfs_ip->commit_mutex){--..}, at: [<c0320948>] mutex_lock+0x8/0x10
> [ 72.420000]
> [ 72.420000] other info that might help us debug this:
> [ 72.420000] 2 locks held by zsh/3844:
> [ 72.420000] #0: (&inode->i_mutex){--..}, at: [<c0320948>] mutex_lock+0x8/0x10
> [ 72.420000] #1: (&jfs_ip->commit_mutex){--..}, at: [<c0320948>] mutex_lock+0x8/0x10
> [ 72.420000]
> [ 72.420000] stack backtrace:
> [ 72.420000] [<c0103b8f>] dump_trace+0x1ef/0x230
> [ 72.420000] [<c0103bf6>] show_trace_log_lvl+0x26/0x40
> [ 72.420000] [<c01043cb>] show_trace+0x1b/0x20
> [ 72.420000] [<c01044b4>] dump_stack+0x24/0x30
> [ 72.420000] [<c0136050>] __lock_acquire+0x8e0/0xd80
> [ 72.420000] [<c0136869>] lock_acquire+0x69/0x90
> [ 72.420000] [<c0320705>] __mutex_lock_slowpath+0x75/0x2b0
> [ 72.420000] [<c0320948>] mutex_lock+0x8/0x10
> [ 72.420000] [<d137950b>] jfs_create+0xbb/0x420 [jfs]
> [ 72.420000] [<c016cb7b>] vfs_create+0xcb/0x120
> [ 72.420000] [<c0170158>] open_namei+0x618/0x6f0
> [ 72.420000] [<c0162368>] do_filp_open+0x38/0x60
> [ 72.420000] [<c01623db>] do_sys_open+0x4b/0xf0
> [ 72.420000] [<c01624d7>] sys_open+0x27/0x30
> [ 72.420000] [<c01032af>] syscall_call+0x7/0xb
> [ 72.420000] [<b7ec8b8d>] 0xb7ec8b8d
> [ 72.420000] =======================
>
> I suspect JFS is guilty, anyway my HD has these partitions:
I haven't got around to instrumenting jfs properly with
mutex_lock_nested(), so I know jfs doesn't run clean with lockdep
enabled. What that means is that these warnings don't necessarily point
to a real problem, and on the other hand, lockdep hasn't been run
correctly against jfs to prove that the mutex usage is safe.
That said, I'm not aware of any known problems in jfs resulting in a
deadlock. Unfortunately, without being able to use sysrq, I don't have
any real good ideas for you off the top of my head to further track down
the problem.
I'm pretty busy this week, but I'll try to get the lockdep stuff right
in jfs as soon as possible. Who knows? Maybe it will find a real
locking problem.
>
> /dev/hda1 on / type reiserfs (rw)
> /dev/hda3 on /usr type reiserfs (rw)
> /dev/hda5 on /home type jfs (rw)
>
> bootlog: http://oioio.altervista.org/linux/dmesg-2.6.18-rc5-mm1-lockdep
> config: http://oioio.altervista.org/linux/config-2.6.18-rc5-mm1-lockdep
>
--
David Kleikamp
IBM Linux Technology Center
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: JFS - real deadlock and lockdep warning (2.6.18-rc5-mm1)
2006-09-06 22:00 ` Dave Kleikamp
@ 2006-09-07 15:39 ` Mattia Dongili
2006-09-07 15:46 ` Dave Kleikamp
2006-09-07 18:49 ` argh! it's reiserfs deadlocking! [was: Re: JFS - real deadlock and lockdep warning (2.6.18-rc5-mm1)] Mattia Dongili
1 sibling, 1 reply; 8+ messages in thread
From: Mattia Dongili @ 2006-09-07 15:39 UTC (permalink / raw)
To: Dave Kleikamp; +Cc: Linux Kernel Mailing List, akpm
On Wed, Sep 06, 2006 at 05:00:28PM -0500, Dave Kleikamp wrote:
[...]
> > I suspect JFS is guilty, anyway my HD has these partitions:
>
> I haven't got around to instrumenting jfs properly with
> mutex_lock_nested(), so I know jfs doesn't run clean with lockdep
> enabled. What that means is that these warnings don't necessarily point
> to a real problem, and on the other hand, lockdep hasn't been run
> correctly against jfs to prove that the mutex usage is safe.
>
> That said, I'm not aware of any known problems in jfs resulting in a
> deadlock. Unfortunately, without being able to use sysrq, I don't have
> any real good ideas for you off the top of my head to further track down
> the problem.
yup... don't know if it could put some light but simply doing
$ mv ~/.firefox ~/.firefox.old
$ cp ~/.firefox.old ~/.firefox
in order to allocate different inodes (right?) I can run Firefox
with my favourite skin (the one that was causing the hang before).
> I'm pretty busy this week, but I'll try to get the lockdep stuff right
> in jfs as soon as possible. Who knows? Maybe it will find a real
> locking problem.
I'll try to keep the filesystem as is to be able to test any fix/test
you'll propose (Eclipse still hangs the computer). Fortunately I have
one more spare partition where I can move /home.
Thanks
--
mattia
:wq!
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: JFS - real deadlock and lockdep warning (2.6.18-rc5-mm1)
2006-09-07 15:39 ` Mattia Dongili
@ 2006-09-07 15:46 ` Dave Kleikamp
2006-09-07 16:10 ` Mattia Dongili
0 siblings, 1 reply; 8+ messages in thread
From: Dave Kleikamp @ 2006-09-07 15:46 UTC (permalink / raw)
To: Mattia Dongili; +Cc: Linux Kernel Mailing List, akpm
On Thu, 2006-09-07 at 17:39 +0200, Mattia Dongili wrote:
> On Wed, Sep 06, 2006 at 05:00:28PM -0500, Dave Kleikamp wrote:
> [...]
> > > I suspect JFS is guilty, anyway my HD has these partitions:
> >
> > I haven't got around to instrumenting jfs properly with
> > mutex_lock_nested(), so I know jfs doesn't run clean with lockdep
> > enabled. What that means is that these warnings don't necessarily point
> > to a real problem, and on the other hand, lockdep hasn't been run
> > correctly against jfs to prove that the mutex usage is safe.
> >
> > That said, I'm not aware of any known problems in jfs resulting in a
> > deadlock. Unfortunately, without being able to use sysrq, I don't have
> > any real good ideas for you off the top of my head to further track down
> > the problem.
>
> yup... don't know if it could put some light but simply doing
>
> $ mv ~/.firefox ~/.firefox.old
> $ cp ~/.firefox.old ~/.firefox
>
> in order to allocate different inodes (right?) I can run Firefox
> with my favourite skin (the one that was causing the hang before).
>
> > I'm pretty busy this week, but I'll try to get the lockdep stuff right
> > in jfs as soon as possible. Who knows? Maybe it will find a real
> > locking problem.
>
> I'll try to keep the filesystem as is to be able to test any fix/test
> you'll propose (Eclipse still hangs the computer). Fortunately I have
> one more spare partition where I can move /home.
As long as you're going to try a different /home partition, why don't
you format it as something other than jfs. This way if you see the
problem again, you can rule out jfs as the culprit. Also, if jfs is the
problem, you can avoid it.
--
David Kleikamp
IBM Linux Technology Center
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: JFS - real deadlock and lockdep warning (2.6.18-rc5-mm1)
2006-09-07 15:46 ` Dave Kleikamp
@ 2006-09-07 16:10 ` Mattia Dongili
0 siblings, 0 replies; 8+ messages in thread
From: Mattia Dongili @ 2006-09-07 16:10 UTC (permalink / raw)
To: Dave Kleikamp; +Cc: Linux Kernel Mailing List, akpm
On Thu, Sep 07, 2006 at 10:46:36AM -0500, Dave Kleikamp wrote:
> On Thu, 2006-09-07 at 17:39 +0200, Mattia Dongili wrote:
[...]
> > one more spare partition where I can move /home.
>
> As long as you're going to try a different /home partition, why don't
> you format it as something other than jfs. This way if you see the
yeah, of course ;)
Thanks
--
mattia
:wq!
^ permalink raw reply [flat|nested] 8+ messages in thread
* argh! it's reiserfs deadlocking! [was: Re: JFS - real deadlock and lockdep warning (2.6.18-rc5-mm1)]
2006-09-06 22:00 ` Dave Kleikamp
2006-09-07 15:39 ` Mattia Dongili
@ 2006-09-07 18:49 ` Mattia Dongili
2006-09-14 13:42 ` Jeff Mahoney
1 sibling, 1 reply; 8+ messages in thread
From: Mattia Dongili @ 2006-09-07 18:49 UTC (permalink / raw)
To: Dave Kleikamp
Cc: Linux Kernel Mailing List, akpm, reiserfs-dev, reiserfs-list
On Wed, Sep 06, 2006 at 05:00:28PM -0500, Dave Kleikamp wrote:
> I meant to reply to this earlier. I've had a lot of distractions.
>
> On Tue, 2006-09-05 at 22:33 +0200, Mattia Dongili wrote:
> > Hello,
> >
> > as the subject says it's some time[0] I'm experiencing deadlocks[1] (I'm
> > only tracking -mm, and sporadically using the stable series). I have a
> > couple of use cases that seem to reliably trigger the deadlock, namely
> > using Eclipse and Firefox.
[...]
> > /dev/hda1 on / type reiserfs (rw)
> > /dev/hda3 on /usr type reiserfs (rw)
> > /dev/hda5 on /home type jfs (rw)
> >
> > bootlog: http://oioio.altervista.org/linux/dmesg-2.6.18-rc5-mm1-lockdep
> > config: http://oioio.altervista.org/linux/config-2.6.18-rc5-mm1-lockdep
Dave,
I have to apologize. Reiser3 seem to be the one deadlocking here
actually. Changing /home to reiser4 still deadlocks.
Now, reiserfs-developers:
would you want me to keep the filesystem around to try to test patches
or potential fixes or can I wipe it out?
The good thing is that the deadlock is 100% repeatable, the bad thing is
that this laptop has a broken cdrom and I have to take the drive out and
fsck it via usb1.1 each time. :)
Thanks
--
mattia
:wq!
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: argh! it's reiserfs deadlocking! [was: Re: JFS - real deadlock and lockdep warning (2.6.18-rc5-mm1)]
2006-09-07 18:49 ` argh! it's reiserfs deadlocking! [was: Re: JFS - real deadlock and lockdep warning (2.6.18-rc5-mm1)] Mattia Dongili
@ 2006-09-14 13:42 ` Jeff Mahoney
2006-09-14 20:30 ` Mattia Dongili
0 siblings, 1 reply; 8+ messages in thread
From: Jeff Mahoney @ 2006-09-14 13:42 UTC (permalink / raw)
To: Dave Kleikamp, Linux Kernel Mailing List, akpm, reiserfs-dev,
reiserfs-list
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Mattia Dongili wrote:
> On Wed, Sep 06, 2006 at 05:00:28PM -0500, Dave Kleikamp wrote:
>> I meant to reply to this earlier. I've had a lot of distractions.
>>
>> On Tue, 2006-09-05 at 22:33 +0200, Mattia Dongili wrote:
>>> Hello,
>>>
>>> as the subject says it's some time[0] I'm experiencing deadlocks[1] (I'm
>>> only tracking -mm, and sporadically using the stable series). I have a
>>> couple of use cases that seem to reliably trigger the deadlock, namely
>>> using Eclipse and Firefox.
> [...]
>>> /dev/hda1 on / type reiserfs (rw)
>>> /dev/hda3 on /usr type reiserfs (rw)
>>> /dev/hda5 on /home type jfs (rw)
>>>
>>> bootlog: http://oioio.altervista.org/linux/dmesg-2.6.18-rc5-mm1-lockdep
>>> config: http://oioio.altervista.org/linux/config-2.6.18-rc5-mm1-lockdep
>
> Dave,
>
> I have to apologize. Reiser3 seem to be the one deadlocking here
> actually. Changing /home to reiser4 still deadlocks.
>
> Now, reiserfs-developers:
> would you want me to keep the filesystem around to try to test patches
> or potential fixes or can I wipe it out?
> The good thing is that the deadlock is 100% repeatable, the bad thing is
> that this laptop has a broken cdrom and I have to take the drive out and
> fsck it via usb1.1 each time. :)
>
> Thanks
How is it that you arrived on reiser3 and reiser4 deadlocking here?
- -Jeff
- --
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org
iD8DBQFFCVxYLPWxlyuTD7IRArZ1AJ9pJPRw0ERLLS36xhn6dyHiXZJtVgCeN+Xe
OVBhxH0xk8UL/YaUKlHJuEE=
=9R9c
-----END PGP SIGNATURE-----
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: argh! it's reiserfs deadlocking! [was: Re: JFS - real deadlock and lockdep warning (2.6.18-rc5-mm1)]
2006-09-14 13:42 ` Jeff Mahoney
@ 2006-09-14 20:30 ` Mattia Dongili
0 siblings, 0 replies; 8+ messages in thread
From: Mattia Dongili @ 2006-09-14 20:30 UTC (permalink / raw)
To: Jeff Mahoney
Cc: Dave Kleikamp, Linux Kernel Mailing List, akpm, reiserfs-dev,
reiserfs-list
n Thu, Sep 14, 2006 at 09:42:48AM -0400, Jeff Mahoney wrote:
> Mattia Dongili wrote:
> > On Wed, Sep 06, 2006 at 05:00:28PM -0500, Dave Kleikamp wrote:
> >> I meant to reply to this earlier. I've had a lot of distractions.
> >>
> >> On Tue, 2006-09-05 at 22:33 +0200, Mattia Dongili wrote:
> >>> Hello,
> >>>
> >>> as the subject says it's some time[0] I'm experiencing deadlocks[1] (I'm
> >>> only tracking -mm, and sporadically using the stable series). I have a
> >>> couple of use cases that seem to reliably trigger the deadlock, namely
> >>> using Eclipse and Firefox.
> > [...]
> >>> /dev/hda1 on / type reiserfs (rw)
> >>> /dev/hda3 on /usr type reiserfs (rw)
> >>> /dev/hda5 on /home type jfs (rw)
> >>>
> >>> bootlog: http://oioio.altervista.org/linux/dmesg-2.6.18-rc5-mm1-lockdep
> >>> config: http://oioio.altervista.org/linux/config-2.6.18-rc5-mm1-lockdep
> >
> > Dave,
> >
> > I have to apologize. Reiser3 seem to be the one deadlocking here
> > actually. Changing /home to reiser4 still deadlocks.
> >
> > Now, reiserfs-developers:
> > would you want me to keep the filesystem around to try to test patches
> > or potential fixes or can I wipe it out?
> > The good thing is that the deadlock is 100% repeatable, the bad thing is
> > that this laptop has a broken cdrom and I have to take the drive out and
> > fsck it via usb1.1 each time. :)
> >
> > Thanks
>
>
> How is it that you arrived on reiser3 and reiser4 deadlocking here?
oh, no that's not what I said.
I had:
/dev/hda1 on / type reiserfs (rw)
/dev/hda3 on /usr type reiserfs (rw)
/dev/hda5 on /home type jfs (rw)
the deadlock was there. I then changed to
/dev/hda1 on / type reiserfs (rw)
/dev/hda3 on /usr type reiserfs (rw)
/dev/hda5 on /home type reiser4 (rw)
still deadlocking, so it wasn't jfs but reiserfs.
Anyway I wiped out the reiserfs partitions, it was starting to become
really annoying sorry.
--
mattia
:wq!
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2006-09-14 20:31 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-09-05 20:33 JFS - real deadlock and lockdep warning (2.6.18-rc5-mm1) Mattia Dongili
2006-09-06 22:00 ` Dave Kleikamp
2006-09-07 15:39 ` Mattia Dongili
2006-09-07 15:46 ` Dave Kleikamp
2006-09-07 16:10 ` Mattia Dongili
2006-09-07 18:49 ` argh! it's reiserfs deadlocking! [was: Re: JFS - real deadlock and lockdep warning (2.6.18-rc5-mm1)] Mattia Dongili
2006-09-14 13:42 ` Jeff Mahoney
2006-09-14 20:30 ` Mattia Dongili
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox