From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org
Subject: Re: [PATCH] [RFC] writeback: fix range_cyclic writeback vs writepages deadlock
Date: Tue, 26 Jun 2018 08:27:23 +1000 [thread overview]
Message-ID: <20180625222723.GZ19934@dastard> (raw)
In-Reply-To: <20180622144109.GA9935@bfoster>
On Fri, Jun 22, 2018 at 10:41:10AM -0400, Brian Foster wrote:
> On Fri, Jun 22, 2018 at 01:09:41PM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> >
> > We've recently seen a workload on XFS filesystems with a repeatable
> > deadlock between background writeback and a multi-process
> > application doing concurrent writes and fsyncs to a small range of a
> > file.
[...]
> > #2 is simple, and I don't think it will have any impact on
> > performance as going back to the start of the file implies an
> > immediate seek. We'll have exactly the same number of seeks if we
> > switch writeback to another inode, and then come back to this one
> > later and restart from index 0.
> >
> > #2a is pretty much "status quo without the deadlock". Moving the
> > retry loop up into the wcp caller means we can issue IO on the
> > pending pages before calling wcp again, and so avoid locking or
> > waiting on pages in the wrong order. I'm not convinced we need to do
> > this given that we get the same thing from #2 on the next writeback
> > call from the writeback infrastructure.
> >
> > #3 is really just a band-aid - it doesn't fix the access/wait inversion
> > problem, just prevents it from becoming a deadlock situation. I'd
> > prefer we fix the inversion, not sweep it under the carpet like
> > this.
> >
> > #3a is really an optimisation that just so happens to include the
> > band-aid fix of #3.
> >
> > So I'm really tending towards #2 as the simplest way to fix this,
> > and that's what's the patch below implements.
> >
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > ---
>
> FWIW, this seems like a reasonable approach to me. One thing I'm not
> sure about is whether the higher level plug in wb_writeback() could
> cause the same problem even with the lower level cycle restart out of
> the picture.
Plugging can't cause this because it captures bios that have
been released from the caller context vi submit_bio(). The plug list
has hooks in the scheduler that flush it on context switch precisely
so that we don't get deadlock problems with bios being stuck on the
plug list when we block for some reason.
> It looks to me that if the inode still has dirty pages after the
> write_cache_pages(), it ends up on wb->b_more_io via
> writeback_sb_inodes() -> requeue_inode(). The caller (wb_writeback())
> repopulates ->b_io from ->b_more_io (via queue_io()) if the former is
> empty (and we haven't otherwise stopped) before finishing the plug.
> There is a blk_flush_plug() in writeback_sb_inodes(), but that appears
> to be non-deterministic. That call aside, could that plug effectively
> hold the page in writeback long enough for background writeback to spin
> around and sit on the page 1 lock?
Right, that could happen, but the plug list will be flushed before
we context switch away and sleep after failng to get the page lock,
so there's no problem here.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2018-06-25 22:27 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-06-22 3:09 [PATCH] [RFC] writeback: fix range_cyclic writeback vs writepages deadlock Dave Chinner
2018-06-22 14:41 ` Brian Foster
2018-06-25 22:27 ` Dave Chinner [this message]
2018-06-26 11:28 ` Brian Foster
2018-06-22 15:14 ` Chris Mason
2018-06-25 22:37 ` Dave Chinner
2018-09-10 11:51 ` Jan Kara
2018-09-10 23:21 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180625222723.GZ19934@dastard \
--to=david@fromorbit.com \
--cc=bfoster@redhat.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).