From: Dave Chinner <david@fromorbit.com>
To: Wu Fengguang <fengguang.wu@intel.com>
Cc: Christoph Hellwig <hch@infradead.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	Jan Kara <jack@suse.cz>
Subject: Re: xfstests 073 regression
Date: Sun, 31 Jul 2011 21:28:01 +1000
Message-ID: <20110731112801.GP5404@dastard>
In-Reply-To: <20110731090916.GA9497@localhost>

On Sun, Jul 31, 2011 at 05:09:16PM +0800, Wu Fengguang wrote:
> On Sat, Jul 30, 2011 at 09:44:22PM +0800, Christoph Hellwig wrote:
> > On Fri, Jul 29, 2011 at 10:21:21PM +0800, Wu Fengguang wrote:
> > > I cannot reproduce the bug. However looking through the code, I find
> > > the only possible place that may keep wb_writeback() looping with
> > > wb->list_lock grabbed is the below requeue_io() call.
> > > 
> > > Would you try the patch?  Note that even if it fixed the soft lockup,
> > > it may not be suitable as the final fix.
> > 
> > This patch fixes the hang for me.
> 
> Great. It means grab_super_passive() always returns false for up to 22s,
> due to
> 
> a) list_empty(&sb->s_instances), which is very unlikely
> 
> b) failed to grab &sb->s_umount
> 
> So the chances are s_umount is mostly taken by others during the 22s.
> Maybe some task other than the flusher is actively doing writeback.

Writeback only holds a read lock on s_umount.

> These callers are not likely since they only do _small_ writes that
> hardly takes one second.
> 
>         bdi_forker_thread:
>                 writeback_inodes_wb(&bdi->wb, 1024);
> 
>         balance_dirty_pages:
>                 writeback_inodes_wb(&bdi->wb, write_chunk);

The "something else doing writeback" reason doesn't make sense to
me.

grab_super_passive() is doing a down_read_trylock(), so if the lock
is failing it must be held exclusively by something.  That only
happens if the filesystem is being mounted, unmounted, remounted,
frozen or thawed, right? 073 doesn't freeze/thaw filesystems, but it
does mount/remount/unmount them.

So is this a writeback vs remount,ro race?

> However the writeback_inodes_sb*() and sync_inodes_sb() functions will
> very likely take dozens of seconds to complete. They have the same
> pattern of
> 
>         down_read(&sb->s_umount);
>         bdi_queue_work(sb->s_bdi, &work);
>         wait_for_completion(&done);
>         up_read(&sb->s_umount);

As above, those read locks will not hold off grab_super_passive(),
which also takes only a read lock. There has to be some other actor
in this deadlock...

> Note that s_umount is grabbed as early as bdi_queue_work() time, when
> the flusher is actively working on some other works. And since the
> for_background/for_kupdate works will quit on seeing other pending
> works, the soft lockup should only happen when the flusher is
> executing some nr_pages=LARGE work when there comes a sync() which
> calls writeback_inodes_sb() for the wait=0 sync stage.
> 
> If we simply apply the change
> 
>                 if (!grab_super_passive(sb)) {
> -                       requeue_io(inode, wb);
> +                       redirty_tail(inode, wb);
>                         continue;
>                 }

I think the root cause of the deadlock needs to be explained before
we can determine the validity of the fix....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
