From: Christoph Hellwig <hch@lst.de>
To: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>,
linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH] [RFC] iomap: Use FUA for pure data O_DSYNC DIO writes
Date: Sat, 3 Mar 2018 00:21:46 +0100
Message-ID: <20180302232146.GA31754@lst.de>
In-Reply-To: <20180302231517.GY30854@dastard>
On Sat, Mar 03, 2018 at 10:15:17AM +1100, Dave Chinner wrote:
> On Sat, Mar 03, 2018 at 12:00:42AM +0100, Christoph Hellwig wrote:
> > Oh, and another thing: I think you want to make this new code dependent
> > on the block device actually supporting REQ_FUA natively. Otherwise
> > you'll cause a flush for every emulated FUA write, which is only going
> > to make things worse, especially for ATA where FLUSH is not queued. And
> > last time I checked libata still disabled FUA by default.
>
> Yup, but the issue we have right now is that for pure RWF_DSYNC data
> overwrites we are already doing a post-flush on every IO. It's being
> issued as a separate zero-length IO, which is why REQ_FUA is faster
> and results in lower overall IOPS. The flush comes from this path:
That is only the case if your device actually supports FUA. If the
device does not, it is emulated by the block/blk-flush.c code by issuing a
FLUSH once the write has returned.
So for a direct I/O write() call with O_DSYNC that turns into, say,
four write commands on the wire, you currently have:
WRITE
WRITE
WRITE
WRITE
FLUSH
with your patch and a device that supports FUA you get
WRITE (FUA)
WRITE (FUA)
WRITE (FUA)
WRITE (FUA)
but with a device that does not support FUA you get
WRITE
FLUSH
WRITE
FLUSH
WRITE
FLUSH
WRITE
FLUSH
with the additional pain point that on ATA FLUSH is not a queueable
command, so it will have to wait for the completion of every other
non-related command first, and no other command can be started.
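To make the comparison concrete, here is a small model of the three command sequences above. It is only an illustrative sketch: the function name `wire_commands` and the mode labels are made up for this example and do not correspond to anything in the kernel.

```python
def wire_commands(n_writes, mode):
    """Return the commands sent to the device for one O_DSYNC direct I/O write.

    mode (hypothetical labels for this sketch):
      "post_flush"   - current behaviour: plain writes, then a single FLUSH
      "native_fua"   - every write carries FUA, device supports it
      "emulated_fua" - every write asks for FUA, device lacks it, so the
                       block layer issues a FLUSH after each write
    """
    if mode == "post_flush":
        return ["WRITE"] * n_writes + ["FLUSH"]
    if mode == "native_fua":
        return ["WRITE (FUA)"] * n_writes
    if mode == "emulated_fua":
        cmds = []
        for _ in range(n_writes):
            cmds += ["WRITE", "FLUSH"]
        return cmds
    raise ValueError(f"unknown mode: {mode}")


# Print the flush count and command stream for the four-write example.
for mode in ("post_flush", "native_fua", "emulated_fua"):
    cmds = wire_commands(4, mode)
    print(f"{mode:13} flushes={cmds.count('FLUSH')}  {', '.join(cmds)}")
```

The point of the model is the flush count: one per call today, zero with native FUA, and one per write when FUA has to be emulated.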
So we should absolutely use your new approach IFF the device actually
supports FUA (aka QUEUE_FLAG_FUA is set), but it will not help much
or even be harmful if the device does not actually support the FUA bit.
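As a rough userspace illustration of that decision, the sketch below reads the read-only `queue/fua` sysfs attribute and picks a strategy from it. This is an assumption-laden stand-in, not the in-kernel check: the kernel side would test QUEUE_FLAG_FUA on the request queue, the sysfs attribute is only present on newer kernels, and the helper names `device_has_native_fua` and `choose_dsync_strategy` are hypothetical.

```python
from pathlib import Path


def device_has_native_fua(disk, sysfs_root="/sys/block"):
    """Return True if <sysfs_root>/<disk>/queue/fua reads "1".

    Assumption: the kernel exposes the queue/fua attribute. If the file
    is missing or unreadable, conservatively report no FUA support.
    """
    attr = Path(sysfs_root) / disk / "queue" / "fua"
    try:
        return attr.read_text().strip() == "1"
    except OSError:
        return False


def choose_dsync_strategy(has_fua):
    """Pick the write pattern for a pure data O_DSYNC overwrite."""
    if has_fua:
        # Native FUA: each write is durable on completion, no flush needed.
        return "FUA on every write"
    # Emulated FUA would cost one FLUSH per write, so keep one post-flush.
    return "plain writes, then one FLUSH"
```

For example, `choose_dsync_strategy(device_has_native_fua("sda"))` would fall back to the single post-flush on a disk whose driver does not advertise FUA (the disk name here is just a placeholder).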