From: Dave Chinner <david@fromorbit.com>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>,
linux-kernel@vger.kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
"J. Bruce Fields" <bfields@fieldses.org>,
Theodore Ts'o <tytso@mit.edu>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Andreas Dilger <adilger.kernel@dilger.ca>,
Dan Williams <dan.j.williams@intel.com>,
Ingo Molnar <mingo@redhat.com>, Jan Kara <jack@suse.com>,
Jeff Layton <jlayton@poochiereds.net>,
Matthew Wilcox <willy@linux.intel.com>,
Thomas Gleixner <tglx@linutronix.de>,
linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-mm@kvack.org, linux-nvdimm@ml01.01.org, x86@kernel.org,
xfs@oss.sgi.com, Andrew Morton <akpm@linux-foundation.org>,
Matthew Wilcox <matthew.r.wilcox@intel.com>
Subject: Re: [RFC 00/11] DAX fsynx/msync support
Date: Tue, 3 Nov 2015 07:10:29 +1100 [thread overview]
Message-ID: <20151102201029.GI10656@dastard> (raw)
In-Reply-To: <x49vb9kqy5k.fsf@segfault.boston.devel.redhat.com>
On Mon, Nov 02, 2015 at 09:22:15AM -0500, Jeff Moyer wrote:
> Dave Chinner <david@fromorbit.com> writes:
>
> > Further, REQ_FLUSH/REQ_FUA are more than just "put the data on stable
> > storage" commands. They are also IO barriers that affect scheduling
> > of IOs in progress and in the request queues. A REQ_FLUSH/REQ_FUA
> > IO cannot be dispatched before all prior IO has been dispatched and
> > drained from the request queue, and IO submitted after a queued
> > REQ_FLUSH/REQ_FUA cannot be scheduled ahead of the queued
> > REQ_FLUSH/REQ_FUA operation.
> >
> > IOWs, REQ_FUA/REQ_FLUSH not only guarantee data is on stable
> > storage, they also guarantee the order of IO dispatch and
> > completion when concurrent IO is in progress.
>
> This hasn't been the case for several years, now. It used to work that
> way, and that was deemed a big performance problem. Since file systems
> already issued and waited for all I/O before sending down a barrier, we
> decided to get rid of the I/O ordering pieces of barriers (and stop
> calling them barriers).
>
> See commit 28e7d184521 (block: drop barrier ordering by queue draining).
Yes, I realise that, even if I wasn't very clear about how I wrote
it. ;)
Correct me if I'm wrong: AFAIA, dispatch ordering (i.e. the "IO
barrier") is still enforced by the scheduler via REQ_FUA|REQ_FLUSH
-> ELEVATOR_INSERT_FLUSH -> REQ_SOFTBARRIER and subsequent IO
scheduler calls to elv_dispatch_sort() that don't pass
REQ_SOFTBARRIER in the queue.
IOWs, if we queue a bunch of REQ_WRITE IOs followed by a
REQ_WRITE|REQ_FLUSH IO, all of the prior REQ_WRITE IOs will be
dispatched before the REQ_WRITE|REQ_FLUSH IO and hence be captured
by the cache flush.
Hence once the filesystem has waited on the REQ_WRITE|REQ_FLUSH IO
to complete, we know that all the earlier REQ_WRITE IOs are on
stable storage, too. Hence there's no need for the elevator to drain
the queue to guarantee completion ordering - the dispatch ordering
and flush/fua write semantics guarantee that when the flush/fua
completes, all the IOs dispatch prior to that flush/fua write are
also on stable storage...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2015-11-02 20:10 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-10-29 20:12 [RFC 00/11] DAX fsynx/msync support Ross Zwisler
2015-10-29 20:12 ` [RFC 01/11] pmem: add wb_cache_pmem() to the PMEM API Ross Zwisler
2015-10-29 20:12 ` [RFC 02/11] mm: add pmd_mkclean() Ross Zwisler
2015-10-29 20:12 ` [RFC 03/11] pmem: enable REQ_FLUSH handling Ross Zwisler
2015-10-29 20:12 ` [RFC 04/11] dax: support dirty DAX entries in radix tree Ross Zwisler
2015-10-29 20:12 ` [RFC 05/11] mm: add follow_pte_pmd() Ross Zwisler
2015-10-29 20:12 ` [RFC 06/11] mm: add pgoff_mkclean() Ross Zwisler
2015-10-29 20:12 ` [RFC 07/11] mm: add find_get_entries_tag() Ross Zwisler
2015-10-29 20:12 ` [RFC 08/11] fs: add get_block() to struct inode_operations Ross Zwisler
2015-10-29 20:12 ` [RFC 09/11] dax: add support for fsync/sync Ross Zwisler
2015-10-29 20:12 ` [RFC 10/11] xfs, ext2: call dax_pfn_mkwrite() on write fault Ross Zwisler
2015-10-29 20:12 ` [RFC 11/11] ext4: add ext4_dax_pfn_mkwrite() Ross Zwisler
2015-10-29 22:49 ` [RFC 00/11] DAX fsynx/msync support Ross Zwisler
2015-10-30 3:55 ` Dave Chinner
2015-10-30 18:39 ` Ross Zwisler
2015-11-01 23:29 ` Dave Chinner
2015-11-02 14:22 ` Jeff Moyer
2015-11-02 20:10 ` Dave Chinner [this message]
2015-11-02 21:02 ` Jeff Moyer
2015-11-04 18:34 ` Jeff Moyer
2015-11-05 8:33 ` Dave Chinner
2015-11-05 19:49 ` Jeff Moyer
2015-11-05 20:54 ` Jens Axboe
2015-10-30 18:34 ` Dan Williams
2015-10-30 19:43 ` Ross Zwisler
2015-10-30 19:51 ` Dan Williams
2015-11-01 23:36 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20151102201029.GI10656@dastard \
--to=david@fromorbit.com \
--cc=adilger.kernel@dilger.ca \
--cc=akpm@linux-foundation.org \
--cc=bfields@fieldses.org \
--cc=dan.j.williams@intel.com \
--cc=hpa@zytor.com \
--cc=jack@suse.com \
--cc=jlayton@poochiereds.net \
--cc=jmoyer@redhat.com \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-nvdimm@ml01.01.org \
--cc=matthew.r.wilcox@intel.com \
--cc=mingo@redhat.com \
--cc=ross.zwisler@linux.intel.com \
--cc=tglx@linutronix.de \
--cc=tytso@mit.edu \
--cc=viro@zeniv.linux.org.uk \
--cc=willy@linux.intel.com \
--cc=x86@kernel.org \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).