From: Sahitya Tummala <stummala@codeaurora.org>
To: Theodore Ts'o <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>,
linux-ext4@vger.kernel.org, stummala@codeaurora.org
Subject: Re: fsync_mode mount option for ext4
Date: Wed, 29 May 2019 16:18:09 +0530
Message-ID: <20190529104809.GJ10043@codeaurora.org>
In-Reply-To: <20190529052332.GB6210@mit.edu>

Hi Ted,

On Wed, May 29, 2019 at 01:23:32AM -0400, Theodore Ts'o wrote:
> On Wed, May 29, 2019 at 09:37:58AM +0530, Sahitya Tummala wrote:
> >
> > Here is what I think on these mount options. Please correct me if my
> > understanding is wrong.
> >
> > The nobarrier mount option poses risk even if there is a battery
> > protection against sudden power down, as it doesn't guarantee the ordering
> > of important data such as journal writes on the disk. On the storage
> > devices with internal cache, if the cache flush policy is out-of-order,
> > then the places where FS is trying to enforce barriers will be at risk,
> > causing FS to be inconsistent.
>
> If you have protection against sudden shutdown, then nobarrier is
> perfectly safe --- which is to say, if it is guaranteed that any
> writes sent to device will be persisted after a crash, then nobarrier
> is perfectly safe. So for example, if you are using ext4 connected to
> a million dollar EMC Storage Array, which has battery backup, using
> nobarrier is perfectly safe.
>
> That's because we still send writes to the device in an appropriate
> order in nobarrier mode --- in particular, we send the journal updates
> to the device in order. The cache flush policy on the HDD is
> out-of-order, but so long as they all make it out to persistent store
> in the end, it'll be fine.
>
Got it.
> > But whereas with fsync_mode=nobarrier, FS is not trying to enforce
> > any ordering of data on the disk except to ensure the data is flushed
> > from the internal cache to non-volatile memory. Thus, I see this
> > fsync_mode=nobarrier is much better than a general nobarrier. And it
> > provides better performance too as with nobarrier but without
> > compromising much on FS consistency.
>
> "without compromising much on FS consistency" doesn't have any meaning.
> If you care about FS consistency, and you don't have power fail
> protection, then at least for ext4, you *must* send a CACHE FLUSH
> after any time that you modify any file system metadata block --- and
> that's true for 99% of all fsync(2)'s.
>
> I suppose you could do something where if there are times when no
> metadata updates are necessary, but just data block writes, the CACHE
> FLUSH could be suppressed. But (a) this won't actually provide much
> performance improvement for the vast majority of workloads,
> especially on an Android system, and (b) you're making a value
> judgement that FS consistency is more important than application data
> consistency.
>
>
> You didn't answer my question directly --- exactly what is your goal
> that you are trying to achieve, and what assumptions you are willing
> to make? If you have power fail protection (this might require making
> some adjustments to the EC), then you can use nobarrier and just not
> worry about it.
>
> If you don't have power fail protection, and you care about FS
> consistency, then you pretty much have to leave the CACHE FLUSH
> commands in.
>
> If the problem is that some applications are fsync-happy, then I'd
> suggest fixing the applications. Or if you really don't care about
> the applications working correctly or users suffering application data
> loss after a crash, you could hack in a mode, so that for non-root
> users, or maybe certain specific users, fsync is turned into a no-op,
> or a background, asynchronous (non-integrity) writeback.
>
> Are you trying to hit some benchmark target? I'm really confused why
> you would want to be so cavalier with application data safety.
>
Yes, benchmarks for random write/fsync show a huge improvement.
For example, without issuing the cache flush in ext4's fsync(), the
Androbench random write score on eMMC improves from 13 MB/s to 62 MB/s.

Also, fsync_mode=nobarrier is enabled by default on Pixel phones,
where f2fs is used:
https://android.googlesource.com/device/google/crosshatch/+/e02e4813256e51bacdecb93ffd8340f6efbe68e0

We have been getting requests to evaluate the same for ext4 and
hence, I was checking with the community on its feasibility.
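
For reference, the random write/fsync pattern mentioned above can be
sketched roughly as below. This is only an illustration under stated
assumptions: the 4 KiB block size, the file size, and the helper name
random_write_fsync are hypothetical and are not Androbench's actual
parameters. Note also that skipping fsync() from userspace is not the
same as suppressing only the CACHE FLUSH inside ext4's fsync path, so
the "no fsync" number is at best a loose upper bound.

```python
import os
import random
import tempfile
import time

BLOCK_SIZE = 4096  # assumed write size; the real benchmark may differ


def random_write_fsync(path, file_size, num_writes, do_fsync=True, seed=0):
    """Write num_writes blocks at random offsets; return throughput in MB/s."""
    rng = random.Random(seed)
    blocks = file_size // BLOCK_SIZE
    payload = os.urandom(BLOCK_SIZE)
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Size the file up front so every write lands inside it.
        os.ftruncate(fd, file_size)
        start = time.perf_counter()
        for _ in range(num_writes):
            offset = rng.randrange(blocks) * BLOCK_SIZE
            os.pwrite(fd, payload, offset)
            if do_fsync:
                # Each fsync() normally ends with a device cache flush.
                os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return (num_writes * BLOCK_SIZE) / (1024 * 1024) / max(elapsed, 1e-9)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "bench.dat")
        size = 4 * 1024 * 1024
        with_sync = random_write_fsync(target, size, 256, do_fsync=True)
        no_sync = random_write_fsync(target, size, 256, do_fsync=False)
        print(f"fsync per write: {with_sync:.1f} MB/s")
        print(f"no fsync:        {no_sync:.1f} MB/s")
```

The absolute numbers depend heavily on the storage device and on
whether the temporary directory sits on the file system under test, so
they should not be compared directly with the eMMC figures above.
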
Thanks,
Sahitya.
> - Ted