From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dave Chinner Subject: Re: xfstests 073 regression Date: Sun, 31 Jul 2011 21:28:01 +1000 Message-ID: <20110731112801.GP5404@dastard> References: <20110728164105.GA18258@infradead.org> <20110729142121.GA21149@localhost> <20110730134422.GA1884@infradead.org> <20110731090916.GA9497@localhost> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Christoph Hellwig , "linux-fsdevel@vger.kernel.org" , Jan Kara To: Wu Fengguang Return-path: Received: from ipmail07.adl2.internode.on.net ([150.101.137.131]:22006 "EHLO ipmail07.adl2.internode.on.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751494Ab1GaL2F (ORCPT ); Sun, 31 Jul 2011 07:28:05 -0400 Content-Disposition: inline In-Reply-To: <20110731090916.GA9497@localhost> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On Sun, Jul 31, 2011 at 05:09:16PM +0800, Wu Fengguang wrote: > On Sat, Jul 30, 2011 at 09:44:22PM +0800, Christoph Hellwig wrote: > > On Fri, Jul 29, 2011 at 10:21:21PM +0800, Wu Fengguang wrote: > > > I cannot reproduce the bug. However looking through the code, I find > > > the only possible place that may keep wb_writeback() looping with > > > wb->list_lock grabbed is the below requeue_io() call. > > > > > > Would you try the patch? Note that even if it fixed the soft lockup, > > > it may not be suitable as the final fix. > > > > This patch fixes the hang for me. > > Great. It means grab_super_passive() always returns false for up to 22s, > due to > > a) list_empty(&sb->s_instances), which is very unlikely > > b) failed to grab &sb->s_umount > > So the chances are s_umount is mostly taken by others during the 22s. > Maybe some task other than the flusher is actively doing writeback. Writeback only holds a read lock on s_umount. > These callers are not likely since they only do _small_ writes that > hardly takes one second. > > bdi_forker_thread: > writeback_inodes_wb(&bdi->wb, 1024); > > balance_dirty_pages: > writeback_inodes_wb(&bdi->wb, write_chunk); The "something else doing writeback" reason doesn't make sense to me. grab_super_passive() is doing a down_read_trylock(), so if the lock is failing it must be held exclusively by something. That only happens if the filesystem is being mounted, unmounted, remounted, frozen or thawed, right? 073 doesn't freeze/thaw filesystems, but it does mount/remount/unmount them. So is this a writeback vs remount,ro race? > However the writeback_inodes_sb*() and sync_inodes_sb() functions will > very likely take dozens of seconds to complete. They have the same > pattern of > > down_read(&sb->s_umount); > bdi_queue_work(sb->s_bdi, &work); > wait_for_completion(&done); > up_read(&sb->s_umount); As per above, those read locks will not hold off grab_super_passive() which is also taking a read lock. There has to be some other actor in this deadlock... > Note that s_umount is grabbed as early as bdi_queue_work() time, when > the flusher is actively working on some other works. And since the > for_background/for_kupdate works will quit on seeing other pending > works, the soft lockup should only happen when the flusher is > executing some nr_pages=LARGE work when there comes a sync() which > calls writeback_inodes_sb() for the wait=0 sync stage. > > If we simply apply the change > > if (!grab_super_passive(sb)) { > - requeue_io(inode, wb); > + redirty_tail(inode, wb); > continue; > } I think the root cause of the deadlock needs to be explained before we can determine the validity of the fix.... Cheers, Dave. -- Dave Chinner david@fromorbit.com