linux-ext4.vger.kernel.org archive mirror
From: Nikhilesh Reddy <reddyn@codeaurora.org>
To: Theodore Ts'o <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org
Subject: Re: Using Cache barriers in lieu of REQ_FLUSH | REQ_FUA for emmc 5.1 (jdec spec JESD84-B51)
Date: Mon, 28 Sep 2015 15:28:16 -0700
Message-ID: <5609BF00.5000502@codeaurora.org>
In-Reply-To: <20150920034248.GB2909@thunk.org>

On Sat 19 Sep 2015 08:42:48 PM PDT, Theodore Ts'o wrote:
> On Tue, Sep 15, 2015 at 04:17:46PM -0700, Nikhilesh Reddy wrote:
>>
>> The eMMC 5.1 spec defines cache "barrier" capability of the eMMC device as
>> defined in JESD84-B51
>>
>> I was wondering if there were any downsides to replacing the
>> WRITE_FLUSH_FUA with the cache barrier?
>>
>> I understand that REQ_FLUSH is used to ensure that the current cache is
>> flushed to prevent any reordering, but I am not clear on why
>> REQ_FUA is used.
>> Can someone please help me understand this part?
>>
>> I know there was a big decision in 2010
>> https://lwn.net/Articles/400541/
>> and http://lwn.net/Articles/399148/
>> to remove the software based barrier support... but with the hardware
>> supporting "barriers" is there a downside to using them to replace the
>> flushes?
>
> OK, so a couple of things here.
>
> There is queuing happening at two different layers in the system:
> once at the block device layer, and once at the storage device layer.
> (Possibly more if you have a hardware RAID card, etc., but for this
> discussion, what's important is the queuing which is happening inside
> the kernel, and that which is happening below the kernel.)
>
> The transition in 2010 is referring to how we handle barriers at the
> block device layer, and was inspired by the fact that at that time,
> the vast majority of the storage devices only supported "cache flush"
> at the storage layer, and a few devices would support FUA (Force Unit
> Access) requests.  But the block layer can also support devices which
> have a true cache barrier function.
>
> So when we say REQ_FLUSH, what we mean is that the writes are flushed
> from the block layer command queues to the storage device, and that
> subsequent writes will not be reordered before the flush.  Since most
> devices don't support a cache barrier command, this is implemented in
> practice as a FLUSH CACHE, but if the device supports cache barrier
> command, that would be sufficient.
>
> The FUA write command is the command that actually has temporal
> meaning; the device is not supposed to signal completion until that
> particular write has been committed to stable store.  And if you
> combine that with a flush command, as in WRITE_FLUSH_FUA, then that
> implies a cache barrier, followed by a write that should not return
> until that write (FUA), and all preceding writes, have been committed
> to stable store (implied by the cache barrier).
>
> For devices that support a cache barrier, a REQ_FLUSH can be
> implemented using a cache barrier.  If the storage device does not
> support a cache barrier, the much stronger FLUSH CACHE command will
> also work, and in practice, that's what gets used for most storage
> devices today.
>
> For devices that don't support a FUA write, this can be simulated
> using the (overly strong) combination of a write followed by a FLUSH
> CACHE command.  (Note, due to regressions caused by buggy hardware,
> the libata driver does not enable FUA by default.  Interestingly,
> apparently Windows 2012 and newer no longer tries to use FUA either;
> maybe Microsoft has run into consumer-grade storage devices with
> crappy firmware?  That being said, if you are using SATA drives
> in a JBOD which has a SAS expander, you *are* using FUA --- but
> presumably people who are doing this are at bigger shops who can do
> proper HDD validation and can lean on their storage vendors to make
> sure any firmware bugs they find get fixed.)
>
> So for ext4, when we do a journal commit, first we write the journal
> blocks, then a REQ_FLUSH, and then we FUA write the commit block ---
> which for commodity SATA drives, gets translated to write the journal
> blocks, FLUSH CACHE, write the commit block, FLUSH CACHE.
>
> If your storage device has support for a barrier command and FUA, then
> this could also be translated to write the journal blocks, CACHE
> BARRIER, FUA WRITE the commit block.
>
> And of course if you don't have FUA support, but you do have the
> barrier command, then this could also get translated to write the
> journal blocks, CACHE BARRIER, write the commit block, FLUSH CACHE.
>
> All of these scenarios should work just fine.
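
As a rough sketch of the three translations above, the mapping from device
capabilities to the command sequence for one journal commit might be modeled
like this.  It is an illustrative Python model only, not kernel code: the
function name translate_commit and the parameters has_barrier / has_fua are
hypothetical, and the fourth combination (FUA but no barrier command) is an
extrapolation from the rules above rather than a case spelled out in the mail.

```python
from typing import List


def translate_commit(has_barrier: bool, has_fua: bool) -> List[str]:
    """Model the device commands for one journal commit.

    Logical sequence from the mail: write the journal blocks, issue
    REQ_FLUSH, then FUA-write the commit block.
    """
    cmds = ["WRITE journal blocks"]

    # REQ_FLUSH: a cache barrier is sufficient if the device has one;
    # otherwise fall back to the stronger FLUSH CACHE.
    cmds.append("CACHE BARRIER" if has_barrier else "FLUSH CACHE")

    if has_fua:
        # Native FUA: completion implies the commit block is on stable store.
        cmds.append("FUA WRITE commit block")
    else:
        # No FUA: simulate it with a plain write followed by FLUSH CACHE.
        cmds.append("WRITE commit block")
        cmds.append("FLUSH CACHE")
    return cmds


if __name__ == "__main__":
    for barrier in (False, True):
        for fua in (False, True):
            print(f"barrier={barrier} fua={fua}:",
                  ", ".join(translate_commit(barrier, fua)))
```

Under this model, a device with neither feature gets two FLUSH CACHE commands
per commit, while a device with both gets none.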
>
> Hope this helps,
>
> 				- Ted

Thanks so much !!
This was really helpful!

--
Thanks
Nikhilesh Reddy

Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

