From: Dave Chinner <david@fromorbit.com>
To: Avi Kivity <avi@scylladb.com>
Cc: Brian Foster <bfoster@redhat.com>,
Glauber Costa <glauber@scylladb.com>,
xfs@oss.sgi.com
Subject: Re: sleeps and waits during io_submit
Date: Wed, 2 Dec 2015 08:04:17 +1100
Message-ID: <20151201210417.GY19199@dastard>
In-Reply-To: <565DBB3E.2010308@scylladb.com>
On Tue, Dec 01, 2015 at 05:22:38PM +0200, Avi Kivity wrote:
> On 12/01/2015 04:56 PM, Brian Foster wrote:
> >On Tue, Dec 01, 2015 at 03:58:28PM +0200, Avi Kivity wrote:
> >>> io_submit() can probably block in a variety of
> >>>places afaict... it might have to read in the inode extent map, allocate
> >>>blocks, take inode/ag locks, reserve log space for transactions, etc.
> >>Any chance of changing all that to be asynchronous? Doesn't sound too hard,
> >>if somebody else has to do it.
> >>
> >I'm not following... if the fs needs to read in the inode extent map to
> >prepare for an allocation, what else can the thread do but wait? Are you
> >suggesting the request kick off whatever the blocking action happens to
> >be asynchronously and return with an error such that the request can be
> >retried later?
>
> Not quite, it should be invisible to the caller.
I have a pony I can sell you.
> That is, the code called by io_submit()
> (file_operations::write_iter, it seems to be called today) can kick
> off this operation and have it continue from where it left off.
This is a problem that people have tried to solve in the past (e.g.
syslets, etc) where the thread executes until it has to block, and
then it's handed off to a worker thread/syslet to block and the
main process returns with EIOCBQUEUED.
Basically, you're asking for a real AIO infrastructure to
be introduced into the kernel, and I think that's beyond what us XFS
guys can do...
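
As a rough user-space illustration of the "run until it has to block,
then hand off to a worker" pattern described above, here is a minimal
Python sketch. It is not kernel code and not how syslets were
implemented: the Submitter class, the fast_path/slow_path callables and
the QUEUED label are all hypothetical names made up for this example.
Unlike the syslet idea, it does not resume the same execution where it
left off; it simply runs a separate slow path on a worker thread.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical status label standing in for the kernel's EIOCBQUEUED return.
QUEUED = "EIOCBQUEUED"
DONE = "done"


class Submitter:
    """Try a non-blocking fast path; if it would block, queue the work."""

    def __init__(self, workers=2):
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def submit(self, fast_path, slow_path):
        # fast_path() must raise BlockingIOError instead of sleeping.
        try:
            return DONE, fast_path()
        except BlockingIOError:
            # Hand the blocking work to a worker thread and return at once.
            return QUEUED, self._pool.submit(slow_path)

    def close(self):
        self._pool.shutdown(wait=True)
```

The caller gets either a finished result or a future it can wait on
later, so the submitting thread itself never sleeps in submit().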
> >>> Reducing the frequency of block allocation/frees might also be
> >>>another help (e.g., preallocate and reuse files,
> >>Isn't that discouraged for SSDs?
> >>
> >Perhaps, if you're referring to the fact that the blocks are never freed
> >and thus never discarded..? Are you running fstrim?
>
> mount -o discard. And yes, overwrites are supposedly more expensive
> than trim old data + allocate new data, but maybe if you compare it
> with the work XFS has to do, perhaps the tradeoff is bad.
Oh, you do realise that using "-o discard" causes significant delays
in journal commit processing? i.e. the journal commit completion
blocks until all the discards have been submitted and waited on
*synchronously*. This is a problem with the Linux block layer in
that blkdev_issue_discard() is a synchronous operation.
Hence if you are seeing delays in transactions (e.g. timestamp updates)
it's entirely possible that things will get much better if you
remove the discard mount option. It's much better from a performance
perspective to use the fstrim command every so often - fstrim issues
discard operations in the context of the fstrim process - it does
not interact with the transaction subsystem at all.
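
To check whether a machine is affected, something like the following
Python sketch could list the XFS mounts that still carry the discard
option and build the matching fstrim invocation. It assumes the usual
/proc/mounts field layout (device, mount point, type, options); the
function names and the sample data are invented for illustration.

```python
def mounts_with_discard(mounts_text, fstype="xfs"):
    """Return mount points of the given type whose options include 'discard'."""
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mountpoint, kind, options = fields[:4]
        # Exact option match, so 'nodiscard' is not reported.
        if kind == fstype and "discard" in options.split(","):
            hits.append(mountpoint)
    return hits


def fstrim_command(mountpoint):
    """Build (but do not run) the periodic trim command for a mount point."""
    return ["fstrim", "-v", mountpoint]


# Made-up sample in /proc/mounts format, used here instead of the real file.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/nvme0n1 /var/lib/data xfs rw,noatime,discard 0 0
/dev/nvme1n1 /srv xfs rw,noatime 0 0
"""

for mp in mounts_with_discard(SAMPLE):
    print("remount without discard, then run periodically:",
          " ".join(fstrim_command(mp)))
```

On a real system the text would come from reading /proc/mounts, and the
fstrim command would need to run as root, for example from a cron job.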
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com