From mboxrd@z Thu Jan 1 00:00:00 1970
From: Al Viro
Subject: Re: fs/dcache.c - BUG: soft lockup - CPU#5 stuck for 22s! [systemd-udevd:1667]
Date: Mon, 26 May 2014 19:26:44 +0100
Message-ID: <20140526182644.GP18016@ZenIV.linux.org.uk>
References: <20140526093741.GA1765@lahna.fi.intel.com> <20140526135746.GM18016@ZenIV.linux.org.uk> <20140526142948.GA1685@lahna.fi.intel.com> <20140526152703.GN18016@ZenIV.linux.org.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Mika Westerberg, Linux Kernel Mailing List, Miklos Szeredi, linux-fsdevel
To: Linus Torvalds
Return-path:
Content-Disposition: inline
In-Reply-To:
Sender: linux-kernel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

On Mon, May 26, 2014 at 11:17:42AM -0700, Linus Torvalds wrote:
> On Mon, May 26, 2014 at 8:27 AM, Al Viro wrote:
> >
> > That's the livelock. OK.
>
> Hmm. Is there any reason we don't have some exclusion around
> "check_submounts_and_drop()"?
>
> That would seem to be the simplest way to avoid any livelock: just
> don't allow concurrent calls (we could make the lock per-filesystem or
> whatever). This whole case should all be for just exceptional cases
> anyway.
>
> We already sleep in that thing (well, "cond_resched()"), so taking a
> mutex should be fine.

What makes you think that it's another check_submounts_and_drop()? And
not, e.g., shrink_dcache_parent(). Or memory shrinkers. Or some twit
sitting in a subdirectory and doing stat(2) in a loop, for that matter...

I really, really wonder WTF is causing that - we have spent 20-odd
seconds spinning while dentries in there were being evicted by something.
That - on sysfs, where dentry_kill() should be non-blocking and very fast.
Something very fishy is going on and I'd really like to understand the
use pattern we are seeing there.