From: Greg KH <gregkh@suse.de>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
viro@zeniv.linux.org.uk, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org, mgorman@suse.de, hch@infradead.org,
akpm@linux-foundation.org
Subject: Re: [RFC PATCH] shrink_dcache_parent() deadlock
Date: Mon, 9 Jan 2012 09:16:52 -0800
Message-ID: <20120109171652.GA3904@suse.de>
In-Reply-To: <877h10957n.fsf@tucsk.pomaz.szeredi.hu>

On Mon, Jan 09, 2012 at 06:05:32PM +0100, Miklos Szeredi wrote:
> Linus Torvalds <torvalds@linux-foundation.org> writes:
>
> > On Mon, Jan 9, 2012 at 2:58 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> >>
> >> This patch adds a new dentry flag that is set when the dentry is removed
> >> from the lru and put on the list being processed by
> >> shrink_dentry_list(). The flag is cleared in dentry_lru_del() which is
> >> called if the dentry gets a new reference just before it's pruned.
> >>
> >> Thoughts?
> >
> > Looks reasonable. Were you actually able to reproduce the hang, or how
> > did you notice?
>
> We got a bug report from a partner about a hang during reboot. It's
> actually quite easily reproducible with:
>
> while true; do
> echo -bond0 > /sys/class/net/bonding_masters
> echo +bond0 > /sys/class/net/bonding_masters
> echo -bond1 > /sys/class/net/bonding_masters
> echo +bond1 > /sys/class/net/bonding_masters
> done
>
> That reliably triggers the soft lockup detector for several machines
> that we tested. It's rather timing sensitive though, because turning on
> DEBUG_SPINLOCK makes it go away.
>
> I had to add printks to see what's going on, because it wasn't obvious
> from the stack traces and crash dumps.

You usually also need a udev rule that matches on the PCI ID of the
device without first checking what type of device it is. A "correct"
udev rule does check the type, which is why we only started seeing this
on some systems, the ones with "incorrect" rules.

Having a file in /lib/udev/rules.d/ with only this one rule:

ATTR{vendor}=="0x8086", ATTR{device}=="0x10ca", ENV{PCI_SLOT_NAME}="%k", ENV{MATCHADDR}="$attr{address}", RUN+="/bin/true"

seems to do the trick. udev calls stat() on the vendor sysfs file
while it is going away, and then this dentry race shows up.

thanks,

greg k-h
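
For readers without the udev rule handy, the stat() side of the race can be approximated by hand. The following is only a rough sketch, not part of the original report: the helper name stat_attr is hypothetical, and it merely imitates the "stat() a sysfs attribute that may be going away" behaviour described above. It does not reproduce what udev does internally.

```shell
#!/bin/sh
# Hypothetical helper (not from the original message): repeatedly stat()
# a file that may disappear at any moment, loosely imitating udev
# looking at a sysfs attribute while the device is being removed.
#
# Usage: stat_attr FILE COUNT
# Prints how many of the COUNT stat calls succeeded.
stat_attr() {
    file=$1
    count=$2
    ok=0
    i=0
    while [ "$i" -lt "$count" ]; do
        # The file can vanish between iterations; failures are expected
        # and simply not counted.
        if stat "$file" >/dev/null 2>&1; then
            ok=$((ok + 1))
        fi
        i=$((i + 1))
    done
    echo "$ok"
}
```

One might run this in the background against an attribute path under the bonding device while the add/remove loop quoted above is running. Which path to use is an assumption that depends on the system; the message only says that udev stats the "vendor" sysfs file.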