From: Brian Foster <bfoster@redhat.com>
To: Avi Kivity <avi@scylladb.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: xfs_extent_busy_flush vs. aio
Date: Tue, 23 Jan 2018 11:11:20 -0500
Message-ID: <20180123161120.GC32478@bfoster.bfoster>
In-Reply-To: <509e33df-4f76-2937-0425-98c26b3a1207@scylladb.com>
On Tue, Jan 23, 2018 at 05:45:39PM +0200, Avi Kivity wrote:
>
>
> On 01/23/2018 05:28 PM, Brian Foster wrote:
> > On Tue, Jan 23, 2018 at 04:57:03PM +0200, Avi Kivity wrote:
> > > I'm seeing the equivalent[*] of xfs_extent_busy_flush() sleeping in my
> > > beautiful io_submit() calls.
> > >
> > >
> > > Questions:
> > >
> > > - Is it correct that RWF_NOWAIT will not detect the condition that led to
> > > the log being forced?
> > >
> > > - If so, can it be fixed?
> > >
> > > - Can I do something to reduce the odds of this occurring? larger logs,
> > > more logs, flush more often, resurrect extinct species and sacrifice them to
> > > the xfs gods?
> > >
> > > - Can an xfs developer do something? For example, make it RWF_NOWAIT
> > > friendly (if the answer to the first question was "correct")
> > >
> > So RWF_NOWAIT eventually works its way to IOMAP_NOWAIT, which looks like
> > it skips any write call that would require allocation in
> > xfs_file_iomap_begin(). The busy flush should only happen in the block
> > allocation path, so something is missing here. Do you have a backtrace
> > for the log force you're seeing?
> >
> >
>
> Here's a trace. It's from a kernel that lacks RWF_NOWAIT.
>
Oh, so the case below is roughly how I would have expected to hit the
flush/wait without RWF_NOWAIT. The latter flag should prevent this, to
answer your first question.
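
For illustration only (not from the thread): a minimal userspace sketch of how a submitter might use RWF_NOWAIT so that a write which would block, for example because it needs allocation, is rejected instead of sleeping. The helper name write_nowait_or_defer and the set of errno values treated as "retry elsewhere" are assumptions of this sketch. On kernels of this era the flag is, as far as I know, only honoured for direct I/O, so a buffered descriptor may get EINVAL or EOPNOTSUPP rather than EAGAIN; the sketch treats all of these the same way.

```python
import errno
import os

# Errnos taken to mean "could not complete without blocking" or "RWF_NOWAIT
# is not supported on this file/kernel". This set is an assumption.
_RETRY_ERRNOS = {errno.EAGAIN, errno.EOPNOTSUPP, errno.EINVAL, errno.ENOSYS}


def write_nowait_or_defer(fd, data, offset):
    """Try a non-blocking positional write, else fall back to a blocking one.

    Returns (bytes_written, used_fallback).
    """
    nowait = getattr(os, "RWF_NOWAIT", 0)  # 0 where the flag is unavailable
    if nowait and hasattr(os, "pwritev"):
        try:
            return os.pwritev(fd, [data], offset, nowait), False
        except OSError as exc:
            if exc.errno not in _RETRY_ERRNOS:
                raise
    # A real application would hand this write to a worker thread here, so
    # that the submitting thread never sleeps; the sketch just does it inline.
    return os.pwrite(fd, data, offset), True
```

The point of the design is that the latency-sensitive thread only ever issues the NOWAIT attempt, and anything the kernel refuses is retried from a context that is allowed to block.
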

For the follow-up question, I think this should only occur when the fs
is fairly low on free space. Is that the case here? I'm not sure there's
a specific metric, fwiw, but it's just a matter of attempting a (user
data) allocation that only finds busy extents in the free space btrees
and thus has to force the log to satisfy the allocation. I suppose
running with more free space available would avoid this. I think running
with less in-core log space could indirectly reduce extent busy time,
but that may also have other performance ramifications and so is
probably not a great idea.

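
For illustration only (not from the thread): a rough way to check from userspace whether a filesystem is "fairly low on free space" is to compare available blocks to total blocks via statvfs. The 10% threshold below is a hypothetical value chosen for the sketch, since no specific metric is given above. The overall ratio also says nothing about how free space is spread across allocation groups or how much of it is currently busy.

```python
import os

# Hypothetical threshold; no specific metric for "low" is given in the thread.
LOW_FREE_THRESHOLD = 0.10


def free_space_fraction(path):
    """Fraction of the filesystem's blocks available to unprivileged users."""
    st = os.statvfs(path)
    if st.f_blocks == 0:
        return 0.0  # pseudo-filesystems may report no blocks at all
    return st.f_bavail / st.f_blocks


def is_low_on_space(path, threshold=LOW_FREE_THRESHOLD):
    """True if the available fraction is below the given threshold."""
    return free_space_fraction(path) < threshold
```

A monitoring job could call is_low_on_space() on the data directory and raise an alert before allocations start running into busy extents.
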
Brian
> 0xffffffff816ab231 : __schedule+0x531/0x9b0 [kernel]
> 0xffffffff816ab6d9 : schedule+0x29/0x70 [kernel]
> 0xffffffff816a90e9 : schedule_timeout+0x239/0x2c0 [kernel]
> 0xffffffff816aba8d : wait_for_completion+0xfd/0x140 [kernel]
> 0xffffffff810ab41d : flush_work+0xfd/0x190 [kernel]
> 0xffffffffc00ddb3a : xlog_cil_force_lsn+0x8a/0x210 [xfs]
> 0xffffffffc00dbbf5 : _xfs_log_force+0x85/0x2c0 [xfs]
> 0xffffffffc00dbe5c : xfs_log_force+0x2c/0x70 [xfs]
> 0xffffffffc0078f60 : xfs_alloc_ag_vextent_size+0x250/0x630 [xfs]
> 0xffffffffc0079ed5 : xfs_alloc_ag_vextent+0xe5/0x150 [xfs]
> 0xffffffffc007abc6 : xfs_alloc_vextent+0x446/0x5f0 [xfs]
> 0xffffffffc008b123 : xfs_bmap_btalloc+0x3f3/0x780 [xfs]
> 0xffffffffc008b4be : xfs_bmap_alloc+0xe/0x10 [xfs]
> 0xffffffffc008bef9 : xfs_bmapi_write+0x499/0xab0 [xfs]
> 0xffffffffc00c6ec8 : xfs_iomap_write_direct+0x1b8/0x390 [xfs]
>