From: Al Viro <viro@ZenIV.linux.org.uk>
To: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: linux-kernel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Miklos Szeredi <mszeredi@suse.cz>,
	linux-fsdevel@vger.kernel.org
Subject: Re: fs/dcache.c - BUG: soft lockup - CPU#5 stuck for 22s! [systemd-udevd:1667]
Date: Mon, 26 May 2014 14:57:47 +0100
Message-ID: <20140526135746.GM18016@ZenIV.linux.org.uk>
In-Reply-To: <20140526093741.GA1765@lahna.fi.intel.com>

[fsdevel and folks who'd been on d_lru corruption thread Cc'd - that's
a continuation of the same mess]

On Mon, May 26, 2014 at 12:37:41PM +0300, Mika Westerberg wrote:
> Hi,
> 
> After v3.15-rc4 my Fedora 20 system with mainline kernel has been suffering
> from the above lockup.
> 
> This is easy to reproduce:
> 
>  1) Plug in USB memory stick (to xHCI port)
>  2) Unplug it
> 
> Typically only one iteration is needed and suddenly I can see
> systemd-udev taking 100% CPU and eventually the whole system becomes
> unusable.
> 
> I've tried to investigate and it looks like we just spin in
> shrink_dentry_list() forever. Reverting following fs/dcache.c commits
> the issue goes away:
> 
> 60942f2f235ce7b817166cdf355eed729094834d dcache: don't need rcu in shrink_dentry_list()
> 9c8c10e262e0f62cb2530f1b076de979123183dd more graceful recovery in umount_collect()
> fe91522a7ba82ca1a51b07e19954b3825e4aaa22 don't remove from shrink list in select_collect()

Which means that we very likely have a reproducer for d_lru-corrupting
races in earlier kernels here.  I wonder if it can be simulated under KVM...

> (The first two commits themselves don't seem to be related but reverting
> them is needed so that the last one can be cleanly reverted).

What I really wonder is what else is going on there; it keeps finding a bunch
of dentries _already_ on shrink list(s) of somebody else.  And it spins (with
eviction of everything worthy not already on shrink lists and cond_resched()
thrown in) to let whoever's trying to evict those suckers do their job.

This means that we either have somebody stuck trying to evict a dentry, or
that more and more dentries keep being added and evicted there.  Is somebody
sitting in a subdirectory of an invalid one and trying to do lookups there,
perhaps?  But in that case we would have the same livelock in the older
kernels, possibly harder to hit, but still there...

FWIW, older kernels just went ahead, picked those already-on-shrink-list
dentries and did dentry_kill(), hopefully not at the time when the owner of
the shrink list got around to removing the neighbor from that list; if it
happened at just the wrong moment, the result was list corruption.

I don't have Fedora anywhere outside of KVM test images, and it'll take
a while to inflict it on actual hardware; in the meantime, could you
hit alt-sysrq-t after it gets stuck and post the results?  At least that
would give some idea whether it's somebody stuck on trying to evict a dentry
or a stream of new dentries being added and killed there.

AFAICS, kernfs ->d_release() isn't blocking and final iput() there also
doesn't look like it's likely to get stuck, but I'd rather have that
possibility excluded...
