linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Robert Dorr <rdorr@microsoft.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
	"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
	Christoph Hellwig <hch@lst.de>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Jan Kara <jack@suse.cz>, Theodore Ts'o <tytso@mit.edu>,
	Matthew Wilcox <mawilcox@microsoft.com>,
	Scott Konersmann <scottkon@microsoft.com>,
	Slava Oks <slavao@microsoft.com>,
	Jasraj Dange <jasrajd@microsoft.com>,
	Michael Nelson <micn@microsoft.com>
Subject: Re: [PATCH] [RFC] iomap: Use FUA for pure data O_DSYNC DIO writes
Date: Tue, 13 Mar 2018 16:10:37 +1100	[thread overview]
Message-ID: <20180313051037.GC18129@dastard> (raw)
In-Reply-To: <CY4PR21MB074130B753A5BF81332A5171B8D20@CY4PR21MB0741.namprd21.prod.outlook.com>

On Tue, Mar 13, 2018 at 12:15:28AM +0000, Robert Dorr wrote:
> Hello all.  I have a couple of follow-up questions around this
> effort, thanks you all for all your kind inputs, patience and
> knowledge transfer.
> 
> 1.    How does xfs or ext4 make sure a pattern of WS followed by
> FWS does not allow the write (WS) completion to be visible before
> the flush completes?

I'm not sure what you are asking here. You need to be more precise
about what these IOs are, who dispatched them and what their
dependencies are. Where exactly did that FWS (REQ_FLUSH) come from?

I think, though, you're asking questions about IO ordering at the
wrong level - filesystems serialise and order IO, not the block
layer. Hence what you see at the block layer is not necessary a
reflection of the ordering the filesystem is doing.

(I've already explained this earlier today in a different thread:
https://marc.info/?l=linux-xfs&m=152091489100831&w=2)

That's why I ask about the operation causing a REQ_FLUSH to be
issued to the storage device, as that cannot be directly issued from
userspace. It will occur as a side effect of a data integrity
operation the filesystem is asked to perform, but without knowing
the relationship between the integrity operation and the write in
question an answer cannot be given.

It may would be better to describe your IO ordering and integrity
requirements at a higher level (e.g. the syscall layer), because
then we know what you are trying to acheive rather that trying to
understand your problem from context-less questions about "IO
barriers" that don't actually exist...

> I suspected the write was held in
> iomap_dio_complete_work but with the generic_write_sync change in
> the patch would a O_DSYNC write request to a DpoFua=0 block queue
> allow T2 to see the completion via io_getevents before T1
> completed the actual flush?

Yes, that can happen as concurrent data direct IO are not serialised
against each other and will always race to completion without
providing any ordering guarantees.  IOWs, if you have an IO ordering
dependency in your application, then that ordering dependency needs
to be handled in the application.

> 2.    How will my application be able to dynamically determine if
> xfs and ext4 have the performance enhancement for FUA or I need
> engage alternate methods to use fsync/fdatasync at strategic
> locations?

You don't. The filesystem will provide the same integrity guarantees
in either case - FUA is just a performance optimisation that will
get used if your hardware supports it. Applications should not care
what capabilities the storage hardawre has - the kernel should do
what is fastest and most reliable for the underlying storage....

> 3.    Are there any plans yet to optimize ext4 as well?

Not from me.

> 4.    Before the patched code the xfs_file_write_iter would call
> generic_write_sync and that calls submit_io_wait.  Does this hold
> the thread issuing the io_submit so it is unable to drive more
> async I/O?

No, -EIOCBQUEUED is returned to avoid blocking.
AIO calls generic_write_sync() from the IO completion
path via a worker thread so it's all done asynchronously.


Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2018-03-13  5:10 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-01  1:41 [PATCH] [RFC] iomap: Use FUA for pure data O_DSYNC DIO writes Dave Chinner
2018-03-02 17:05 ` Darrick J. Wong
2018-03-02 22:20 ` Christoph Hellwig
2018-03-02 22:26   ` Christoph Hellwig
2018-03-04 23:00     ` Dave Chinner
2018-03-05 15:11       ` Christoph Hellwig
2018-03-02 22:53   ` Dave Chinner
2018-03-02 22:59     ` Christoph Hellwig
2018-03-02 23:00     ` Christoph Hellwig
2018-03-02 23:15       ` Dave Chinner
2018-03-02 23:21         ` Christoph Hellwig
2018-03-12 23:53 ` Dan Williams
2018-03-13  0:15   ` Robert Dorr
2018-03-13  5:10     ` Dave Chinner [this message]
2018-03-13 16:00       ` Robert Dorr
2018-03-13 16:12         ` Christoph Hellwig
2018-03-13 18:52           ` Robert Dorr
2018-03-19 16:06             ` Jan Kara
2018-03-19 16:14               ` Robert Dorr
2018-03-21 23:52                 ` Robert Dorr
2018-03-22 14:35                 ` Jan Kara
2018-03-22 14:38                   ` Robert Dorr
2018-04-24 14:09                     ` Robert Dorr
2018-04-24 15:32                       ` Nikolay Borisov
2018-04-25 22:28                       ` Jan Kara
2023-12-07  6:50 ` Theodore Ts'o
2023-12-07  7:32   ` Christoph Hellwig
2023-12-07 23:03   ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180313051037.GC18129@dastard \
    --to=david@fromorbit.com \
    --cc=dan.j.williams@intel.com \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=jasrajd@microsoft.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=mawilcox@microsoft.com \
    --cc=micn@microsoft.com \
    --cc=rdorr@microsoft.com \
    --cc=scottkon@microsoft.com \
    --cc=slavao@microsoft.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).