linux-block.vger.kernel.org archive mirror
* Re: Block device flush ordering
       [not found] <1547130601.20294.152.camel@intricatesoftware.com>
@ 2019-01-13 22:42 ` Dave Chinner
  2019-01-14 16:45   ` Christoph Hellwig
  0 siblings, 1 reply; 3+ messages in thread
From: Dave Chinner @ 2019-01-13 22:42 UTC
  To: Kurt Miller; +Cc: linux-xfs, linux-ext4, linux-block

[ cc'd linux-block@vger.kernel.org, where questions about block
device behaviour are better directed. ]

On Thu, Jan 10, 2019 at 09:30:01AM -0500, Kurt Miller wrote:
> For a well behaved block device that has a writeback cache,
> what is the proper behavior of flush when there is more
> than one outstanding flush operation? Is it:
> 
> Flush all writes seen since the last flush.
> or
> Flush all writes received prior to the flush including
> those before any prior flush.
> 
> For example take the following order of requests presented
> to the block device:
> 
> 	writes 1-5
> 	flush 1
> 	write 6
> 	flush 2
> 
> Can flush 2 finish with success as soon as write 6 is flushed
> (which may be before flush 1 success)? Or must it wait for
> all prior write operations to flush (writes 1-6)?

Don't take what I say as gospel, but according to block/blk-flush.c:

.....
 * Currently, the following conditions are used to determine when to issue
 * flush.
 *
 * C1. At any given time, only one flush shall be in progress.  This makes
 *     double buffering sufficient.
.....

However, flushes can be deferred and re-ordered vs other non-flush
write IO dispatch. As such, the rule we work to with filesystems is
that a flush only guarantees that IO which has already completed will
be written to stable storage.  i.e. the filesystem has to wait for IO
completion of any write it needs to be stable before it can issue
(and wait for) a flush that will guarantee that it is on stable
storage.

IOWs, if your above scenario is:

	submit writes 1-5
	flush 1
	submit write 6
	writes 1,3,5 complete
	flush 2
	writes 2,4,6 complete

Then flush 1 does not guarantee any of the writes are on stable
storage. They *may* be on stable storage if the timing is right, but
it is not guaranteed by the OS code. Likewise, flush 2 only
guarantees writes 1, 3 and 5 are on stable storage because they are
the only writes that have been signalled as complete when flush 2
was submitted.
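
To make the rule concrete, here is a minimal sketch that replays the
scenario above. It is only a toy model, not kernel code: the class
FlushModel and its methods are hypothetical names, and the model assumes
a flush covers exactly the writes whose completion had been signalled
when the flush was submitted.

```python
class FlushModel:
    """Toy model: a flush covers only writes completed before it was submitted."""

    def __init__(self):
        self.completed = set()   # writes whose completion has been signalled
        self.guarantees = {}     # flush id -> frozenset of writes it guarantees

    def complete_writes(self, *write_ids):
        # Record that completion has been signalled for these writes.
        self.completed.update(write_ids)

    def submit_flush(self, flush_id):
        # Snapshot the completed set; later completions are not covered.
        self.guarantees[flush_id] = frozenset(self.completed)
        return self.guarantees[flush_id]


def replay_scenario():
    model = FlushModel()
    # "submit writes 1-5": submission alone guarantees nothing in this model.
    model.submit_flush(1)
    # "submit write 6": again, no completion yet.
    model.complete_writes(1, 3, 5)
    model.submit_flush(2)
    model.complete_writes(2, 4, 6)
    return model.guarantees


if __name__ == "__main__":
    for flush_id, writes in replay_scenario().items():
        print(f"flush {flush_id} guarantees writes {sorted(writes)}")
```

In this model flush 1 guarantees nothing and flush 2 guarantees writes
1, 3 and 5. Writes 2, 4 and 6 would need a further flush issued after
their completions.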

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

* Re: Block device flush ordering
  2019-01-13 22:42 ` Block device flush ordering Dave Chinner
@ 2019-01-14 16:45   ` Christoph Hellwig
  2019-01-15 14:35     ` Kurt Miller
  0 siblings, 1 reply; 3+ messages in thread
From: Christoph Hellwig @ 2019-01-14 16:45 UTC
  To: Dave Chinner; +Cc: Kurt Miller, linux-xfs, linux-ext4, linux-block

On Mon, Jan 14, 2019 at 09:42:44AM +1100, Dave Chinner wrote:
> On Thu, Jan 10, 2019 at 09:30:01AM -0500, Kurt Miller wrote:
> > For a well behaved block device that has a writeback cache,
> > what is the proper behavior of flush when there is more
> > than one outstanding flush operation? Is it:
> > 
> > Flush all writes seen since the last flush.
> > or
> > Flush all writes received prior to the flush including
> > those before any prior flush.

The requirement is that all write operations that have been completed
before the flush was seen are on stable storage.  How that is
implemented in detail is up to the device.  The typical implementation
is simply to write back the whole cache every time a flush operation
is received.
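
As an illustration of that typical implementation, here is a minimal
sketch of a device with a volatile writeback cache. It is a toy model
under stated assumptions, not any real driver or protocol: the class
WritebackCacheDevice, its methods, and the use of a dict keyed by LBA
are all hypothetical, and a write is assumed to complete as soon as the
data is in the cache.

```python
class WritebackCacheDevice:
    """Toy device: writes land in a volatile cache, flush writes back all of it."""

    def __init__(self):
        self.cache = {}    # lba -> data, lost on power failure
        self.stable = {}   # lba -> data, survives power failure

    def write(self, lba, data):
        # Completion is signalled once the data is in the volatile cache.
        self.cache[lba] = data

    def flush(self):
        # Simplest conforming strategy: write back the whole cache, which
        # covers every write completed before the flush was seen.
        self.stable.update(self.cache)
        self.cache.clear()

    def power_loss(self):
        # Anything still in the cache is gone; return what survived.
        self.cache.clear()
        return dict(self.stable)


def demo():
    dev = WritebackCacheDevice()
    for lba in range(1, 6):
        dev.write(lba, b"data")
    dev.flush()
    dev.write(6, b"data")     # completed, but never flushed
    return dev.power_loss()


if __name__ == "__main__":
    print(sorted(demo()))
```

Writing back everything is more than the requirement demands, but it
never needs to track which writes completed before which flush.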

> > 
> > For example take the following order of requests presented
> > to the block device:
> > 
> > 	writes 1-5
> > 	flush 1
> > 	write 6
> > 	flush 2
> > 
> > Can flush 2 finish with success as soon as write 6 is flushed
> > (which may be before flush 1 success)? Or must it wait for
> > all prior write operations to flush (writes 1-6)?

No.  For all the usual protocols as well as the Linux kernel semantics
there is no overall command ordering, especially as there is no way
to even enforce that in a multi-queue environment.

>
>  * C1. At any given time, only one flush shall be in progress.  This makes
>  *     double buffering sufficient.

Very specific implementation detail inside the request layer.

> Then flush 1 does not guarantee any of the writes are on stable
> storage. They *may* be on stable storage if the timing is right, but
> it is not guaranteed by the OS code. Likewise, flush 2 only
> guarantees writes 1, 3 and 5 are on stable storage because they are
> the only writes that have been signalled as complete when flush 2
> was submitted.

Exactly.

* Re: Block device flush ordering
  2019-01-14 16:45   ` Christoph Hellwig
@ 2019-01-15 14:35     ` Kurt Miller
  0 siblings, 0 replies; 3+ messages in thread
From: Kurt Miller @ 2019-01-15 14:35 UTC
  To: Christoph Hellwig, Dave Chinner; +Cc: linux-xfs, linux-ext4, linux-block

On Mon, 2019-01-14 at 08:45 -0800, Christoph Hellwig wrote:
> On Mon, Jan 14, 2019 at 09:42:44AM +1100, Dave Chinner wrote:
> > 
> > On Thu, Jan 10, 2019 at 09:30:01AM -0500, Kurt Miller wrote:
> > > 
> > > For a well behaved block device that has a writeback cache,
> > > > what is the proper behavior of flush when there is more
> > > > than one outstanding flush operation? Is it:
> > > 
> > > Flush all writes seen since the last flush.
> > > or
> > > Flush all writes received prior to the flush including
> > > those before any prior flush.
> The requirement is that all write operations that have been completed
> before the flush was seen are on stable storage.  How that is
> implemented in detail is up to the device.  The typical implementation
> is simply to write back the whole cache every time a flush operation
> is received.
> 
> > 
> > > 
> > > 
> > > For example take the following order of requests presented
> > > to the block device:
> > > 
> > > 	writes 1-5
> > > 	flush 1
> > > 	write 6
> > > 	flush 2
> > > 
> > > Can flush 2 finish with success as soon as write 6 is flushed
> > > (which may be before flush 1 success)? Or must it wait for
> > > all prior write operations to flush (writes 1-6)?
> No.  For all the usual protocols as well as the Linux kernel semantics
> there is no overall command ordering, especially as there is no way
> to even enforce that in a multi-queue environment.
> 
> > 
> > 
> >  * C1. At any given time, only one flush shall be in progress.  This makes
> >  *     double buffering sufficient.
> Very specific implementation detail inside the request layer.
> 
> > 
> > Then flush 1 does not guarantee any of the writes are on stable
> > storage. They *may* be on stable storage if the timing is right, but
> > it is not guaranteed by the OS code. Likewise, flush 2 only
> > guarantees writes 1, 3 and 5 are on stable storage because they are
> > the only writes that have been signalled as complete when flush 2
> > was submitted.
> Exactly.

Thank you both for the detailed answers. They have been very helpful.
Also, after spending an afternoon reading kernel code (xlog_sync through
blk_flush_complete_seq) I understand it better. The comment I made in
another reply about multiple concurrent flush requests turned out to be
a logging issue in our nbd implementation: we were logging completions
after replying to the kernel. As a result our log messages were out of
order and misleading. With that corrected in our code we see only one
flush at a time.
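
The logging effect can be reproduced with a minimal sketch. This is not
our nbd code: serve_flushes and its helpers are hypothetical names, and
the kernel is modelled as a synchronous callback that sends the next
flush as soon as it sees a reply, whereas a real server would write the
reply to a socket.

```python
def serve_flushes(log_before_reply):
    """Handle two flushes and return the resulting log lines."""
    log = []
    pending = [1, 2]  # flush ids the modelled kernel sends, one at a time

    def kernel_on_reply():
        # The kernel may send the next flush as soon as it sees the reply.
        if pending:
            handle_flush(pending.pop(0))

    def handle_flush(flush_id):
        log.append(f"recv flush {flush_id}")
        # The actual flush work would happen here.
        if log_before_reply:
            log.append(f"done flush {flush_id}")
            kernel_on_reply()
        else:
            # Replying first lets the next request be logged before
            # this completion is.
            kernel_on_reply()
            log.append(f"done flush {flush_id}")

    handle_flush(pending.pop(0))
    return log


if __name__ == "__main__":
    print(serve_flushes(log_before_reply=False))
    print(serve_flushes(log_before_reply=True))
```

Logging after the reply makes flush 2 appear to arrive while flush 1 is
still outstanding, even though only one flush is ever in progress.
Logging before the reply keeps the log in the order the kernel sees.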

Best,
-Kurt
