From: Vivek Goyal <vgoyal@redhat.com>
To: Alex Bligh <alex@alex.org.uk>
Cc: linux-fsdevel@vger.kernel.org, Tejun Heo <tj@kernel.org>
Subject: Re: Questions on block drivers, REQ_FLUSH and REQ_FUA
Date: Tue, 24 May 2011 18:32:20 -0400
Message-ID: <20110524223220.GA379@redhat.com>
In-Reply-To: <8E109506CAF2B94185C68859@nimrod.local>
On Tue, May 24, 2011 at 10:29:09PM +0100, Alex Bligh wrote:
[..]
> Q3: Apparently there are no longer concepts of barriers, just REQ_FLUSH
> and REQ_FUA. REQ_FLUSH guarantees all "completed" I/O requests are written
> to disk prior to that BIO starting. However, what about non-completed I/O
> requests? For instance, is the following legitimate:
>
> Receive Send to disk Reply
> ======= ============ =====
> WRITE1
> WRITE2
> WRITE2 (cached)
> FLUSH+WRITE3
> WRITE2
> WRITE3
> WRITE3
> WRITE4
> WRITE4
> WRITE4
> WRITE1
> WRITE1
>
> Here WRITE1 was not 'completed', and thus by the text of
> Documentation/writeback_cache_control.txt, need not be written to disk
> before starting WRITE3 (which had REQ_FLUSH attached).
>
> >The REQ_FLUSH flag can be ORed into the r/w flags of a bio submitted from
> >the filesystem and will make sure the volatile cache of the storage device
> >has been flushed before the actual I/O operation is started. This
> >explicitly guarantees that previously completed write requests are on
> >non-volatile storage before the flagged bio starts.
>
> I presume this is illegal and is a documentation issue.

I know very little about flush semantics, but I will still try to answer
two of your questions.

CCing Tejun.

Tejun, please correct me if I got this wrong.

I think the documentation is fine. It specifically talks about completed
requests, that is, requests which have been sent to the drive (and may
still be in the controller's cache).

So in the above example, if the driver holds back WRITE1 and never signals
completion of that request, then I think it is fine to complete
WRITE3+FLUSH ahead of WRITE1.

I think an issue will arise only if you signaled that WRITE1 had completed
while it was merely cached in the driver (as you seem to be indicating) and
not yet sent to the drive, and then you received the WRITE3+FLUSH request.
In that case you will have to make sure that by the time WRITE3+FLUSH
completion is signaled, WRITE1 is on the disk.
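
To make the distinction between a held-back write and an early-completed write concrete, here is a minimal sketch in Python. It is a toy model and not kernel code: the class `CachingDriver`, its methods, and the `ack_early` parameter are all hypothetical names chosen for illustration, and the flagged write is simply written through to disk to keep the model small.

```python
class CachingDriver:
    """Toy model of a block driver with its own volatile write cache."""

    def __init__(self):
        self.cache = []   # writes acknowledged upward but not yet on disk
        self.held = []    # writes received but neither acknowledged nor written
        self.disk = []    # writes that reached stable storage, in order
        self.acked = []   # completions signaled to the upper layer, in order

    def write(self, name, ack_early=True):
        if ack_early:
            # Completion is signaled while the data only sits in the cache.
            self.cache.append(name)
            self.acked.append(name)
        else:
            # Held back: no completion signaled, so a later flush owes it nothing.
            self.held.append(name)

    def flush_write(self, name):
        """Handle a write that carries a flush flag (hypothetical helper)."""
        # Every write already acknowledged must reach the disk first.
        self.disk.extend(self.cache)
        self.cache.clear()
        # Simplification: the flagged write itself is written through.
        self.disk.append(name)
        self.acked.append(name)

    def finish_held(self):
        # Held-back writes may be written and acknowledged at any later point.
        for name in self.held:
            self.disk.append(name)
            self.acked.append(name)
        self.held.clear()


drv = CachingDriver()
drv.write("WRITE1", ack_early=False)  # held back, never acknowledged
drv.write("WRITE2", ack_early=True)   # acknowledged from the cache
drv.flush_write("FLUSH+WRITE3")
state_after_flush = list(drv.disk)    # snapshot taken when the flush completes
drv.finish_held()
```

In this model WRITE2 was acknowledged before the flush arrived, so the flush has to push it to disk, while the unacknowledged WRITE1 can legitimately land after FLUSH+WRITE3.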
>
> Q4. Can I reorder forwards write requests across flushes? IE, can I do
> this:
>
> Receive Send to disk Reply
> ======= ============ =====
> WRITE1
> WRITE2 (cached)
> WRITE2
> WRITE2 (cached)
> FLUSH+WRITE3
> WRITE4
> WRITE4
> WRITE4
> WRITE2
> WRITE3
> WRITE3
>
> Again this does not appear to be illegal, as the FLUSH operation is
> not defined as a barrier, meaning it should in theory be possible
> to handle (and write to disk) requests received after the
> FLUSH request before the FLUSH request finishes, provided that the
> commands received before the FLUSH request itself complete before
> the FLUSH request is replied to. I really don't know what the answer
> is to this one. It makes a big difference to me as I can write multiple
> blocks in parallel, and would really rather not slow up future write
> requests until everything is flushed unless I need to.

IIUC, you are right. You can finish WRITE4 before completing FLUSH+WRITE3
here.

We just need to make sure that any request the driver has already completed
is on disk by the time FLUSH+WRITE3 completes.
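
As a rough illustration of that rule, the following Python sketch checks an event trace against it. The function name `flush_rule_ok`, the event names `recv`, `disk` and `ack`, and the convention that flush requests have names starting with "FLUSH" are all assumptions made for this example. It also assumes that a flush only covers writes whose completion was signaled before the flush was received.

```python
def flush_rule_ok(trace):
    """Return True if every flush in the trace honours the completion rule.

    trace is a list of (event, request) pairs. event is "recv" (request
    received), "disk" (data on stable storage) or "ack" (completion signaled).
    """
    on_disk = set()
    acked = set()
    owed = {}  # flush request -> writes acked before the flush was received

    for event, req in trace:
        if event == "recv" and req.startswith("FLUSH"):
            owed[req] = set(acked)
        elif event == "disk":
            on_disk.add(req)
        elif event == "ack":
            if req in owed and not owed[req] <= on_disk:
                return False  # flush acked while an acked write is still volatile
            acked.add(req)
    return True


# WRITE4 overtakes the flush; WRITE2 (acked from cache) hits disk before the
# flush is acknowledged.
legal = [
    ("recv", "WRITE2"), ("ack", "WRITE2"),
    ("recv", "FLUSH+WRITE3"),
    ("recv", "WRITE4"), ("disk", "WRITE4"), ("ack", "WRITE4"),
    ("disk", "WRITE2"), ("disk", "FLUSH+WRITE3"), ("ack", "FLUSH+WRITE3"),
]

# Same trace, but WRITE2 never reaches the disk before the flush is acked.
illegal = [step for step in legal if step != ("disk", "WRITE2")]

legal_result = flush_rule_ok(legal)
illegal_result = flush_rule_ok(illegal)
```

The checker places no ordering constraint on WRITE4, which arrived after the flush; only writes acknowledged before the flush was received are owed to it.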
Thanks
Vivek