From: Kurt Miller <kurt@intricatesoftware.com>
To: Christoph Hellwig <hch@infradead.org>,
Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org,
linux-block@vger.kernel.org
Subject: Re: Block device flush ordering
Date: Tue, 15 Jan 2019 09:35:41 -0500
Message-ID: <1547562941.20294.196.camel@intricatesoftware.com>
In-Reply-To: <20190114164549.GA26523@infradead.org>
On Mon, 2019-01-14 at 08:45 -0800, Christoph Hellwig wrote:
> On Mon, Jan 14, 2019 at 09:42:44AM +1100, Dave Chinner wrote:
> >
> > On Thu, Jan 10, 2019 at 09:30:01AM -0500, Kurt Miller wrote:
> > >
> > > For a well behaved block device that has a writeback cache,
> > > what is the proper behavior of flush when there is more
> > > than one outstanding flush operation? Is it:
> > >
> > > Flush all writes seen since the last flush.
> > > or
> > > Flush all writes received prior to the flush including
> > > those before any prior flush.
> The requirement is that all write operations that have been completed
> before the flush was seen are on stable storage. How that is
> implemented in detail is up to the device. The typical implementation
> is simply to write back the whole cache every time a flush operation
> is received.
>
> >
> > >
> > >
> > > For example take the following order of requests presented
> > > to the block device:
> > >
> > > writes 1-5
> > > flush 1
> > > write 6
> > > flush 2
> > >
> > > Can flush 2 finish with success as soon as write 6 is flushed
> > > (which may be before flush 1 success)? Or must it wait for
> > > all prior write operations to flush (writes 1-6)?
> No. For all the usual protocols as well as the Linux kernel semantics
> there is no overall command ordering, especially as there is no way
> to even enforce that in a multi-queue environment.
>
> >
> >
> > * C1. At any given time, only one flush shall be in progress. This makes
> > * double buffering sufficient.
> Very specific implementation detail inside the request layer.
>
> >
> > Then flush 1 does not guarantee any of the writes are on stable
> > storage. They *may* be on stable storage if the timing is right, but
> > it is not guaranteed by the OS code. Likewise, flush 2 only
> > guarantees writes 1, 3 and 5 are on stable storage because they are
> > the only writes that have been signalled as complete when flush 2
> > was submitted.
> Exactly.
Thank you both for the detailed answers. They have been very helpful.
Also, after spending an afternoon reading kernel code (xlog_sync through
blk_flush_complete_seq), I understand it better. The comment I made in
another reply about multiple concurrent flush requests turned out to be
a logging issue in our nbd implementation: we were logging completions
after replying to the kernel. As a result, our log messages were out of
order and misleading. With that corrected in our code, we see only one
flush at a time.
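
For anyone following along, here is a minimal sketch of the flush
semantics as I now understand them from the answers quoted above. It is
a toy model, not kernel or nbd code: the class and method names are
hypothetical, and it assumes the "write back the whole cache on every
flush" implementation Christoph describes. A real device may also hold
in-flight writes in its cache; the model keeps them separate only to
make the guarantee visible.

```python
class WritebackCacheDevice:
    """Toy block device with a volatile write-back cache (hypothetical)."""

    def __init__(self):
        self.inflight = {}  # tag -> (block, data): submitted, not yet completed
        self.cache = {}     # block -> data: completed, not yet on stable storage
        self.stable = {}    # block -> data: on stable storage

    def submit_write(self, tag, block, data):
        # The write has been handed to the device but not acknowledged.
        self.inflight[tag] = (block, data)

    def complete_write(self, tag):
        # Completion is signalled to the submitter; data sits in the cache.
        block, data = self.inflight.pop(tag)
        self.cache[block] = data

    def flush(self):
        # Whole-cache writeback: every write completed before the flush
        # was seen ends up on stable storage. Writes that are still in
        # flight are not covered by this flush.
        self.stable.update(self.cache)
        self.cache.clear()


if __name__ == "__main__":
    dev = WritebackCacheDevice()
    for n in range(1, 7):
        dev.submit_write(tag=n, block=n, data=b"x")
    for n in (1, 3, 5):
        dev.complete_write(n)
    dev.flush()
    print(sorted(dev.stable))  # [1, 3, 5]
```

This mirrors Dave's example: with six writes submitted and only writes
1, 3 and 5 signalled as complete, the flush guarantees just those three.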
Best,
-Kurt