From: Dave Chinner <david@fromorbit.com>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Jayashree <jaya@cs.utexas.edu>, fstests <fstests@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-doc@vger.kernel.org,
	Vijaychidambaram Velayudhan Pillai <vijay@cs.utexas.edu>,
	Theodore Tso <tytso@mit.edu>,
	chao@kernel.org, Filipe Manana <fdmanana@gmail.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Josef Bacik <josef@toxicpanda.com>,
	Anna Schumaker <Anna.Schumaker@netapp.com>
Subject: Re: [PATCH v2] Documenting the crash-recovery guarantees of Linux file systems
Date: Wed, 20 Mar 2019 07:43:30 +1100
Message-ID: <20190319204330.GY26298@dastard>
In-Reply-To: <CAOQ4uxjyrYO8bmY04-AE9t3kQ=0B-m1A8yLRXtsyevrxCfyLxg@mail.gmail.com>

On Tue, Mar 19, 2019 at 09:35:19AM +0200, Amir Goldstein wrote:
> On Tue, Mar 19, 2019 at 5:13 AM Dave Chinner <david@fromorbit.com> wrote:
> > That is, sync_file_range() is only safe to use for this specific
> > sort of ordered data integrity algorithm when flushing the entire
> > file.(*)
> >
> > create
> > setxattr
> > write                                   metadata volatile
> >   delayed allocation                    data volatile
> > ....
> > sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WAIT_BEFORE |
> >                 SYNC_FILE_RANGE_WRITE | SYNC_FILE_RANGE_WAIT_AFTER);
> >   Extent Allocation                     metadata volatile
> >                   ----> device -+
> >                                         data volatile
> >                   <-- complete -+
> > ....
> > rename                                  metadata volatile
> >
> > And so at this point, we only need a device cache flush to
> > make the data persistent and a journal flush to make the rename
> > persistent. And so it ends up the same case as non-AIO O_DIRECT.
> >
> 
> Funny, I once told that story and one Dave Chinner told me
> "Nice story, but wrong.":
> https://patchwork.kernel.org/patch/10576303/#22190719
> 
> You pointed to the minor detail that sync_file_range() uses
> WB_SYNC_NONE.

Ah, I forgot about that. That's what I get for not looking at the
code. Did I mention that SFR is a complete crock of shit when it
comes to data integrity operations? :/

> So yes, I agree, it is a nice story and we need to make it right,
> by having an API (perhaps SYNC_FILE_RANGE_ALL).
> When you pointed out my mistake, I switched the application to
> use the FIEMAP_FLAG_SYNC API as a hack.

Yeah, that's a nasty hack :/

> Besides tests and documentation what could be useful is a portable
> user space library that just does the right thing for every filesystem.

*nod*

But before that, we need the model to be defined and documented.
And once we have a library, the fun part of convincing the world
that it should be the glibc default behaviour can begin....
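
As a rough idea of what one entry point in such a library might look
like, here is a sketch of the conservative write-temp, fsync, rename,
fsync-the-directory pattern. It is an assumption, not the agreed
model: replace_file_durably() and the ".<name>.tmp" naming are made up
for illustration, and it deliberately uses only plain fsync() rather
than any filesystem-specific shortcut.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: atomically replace dir/name with len bytes
 * from buf and make the result durable.
 * Returns 0 on success, -1 on failure.
 */
int replace_file_durably(const char *dir, const char *name,
                         const void *buf, size_t len)
{
    char tmp[4096], dst[4096];
    const char *p = buf;
    int fd, dfd, n;

    n = snprintf(tmp, sizeof(tmp), "%s/.%s.tmp", dir, name);
    if (n < 0 || (size_t)n >= sizeof(tmp))
        return -1;
    n = snprintf(dst, sizeof(dst), "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= sizeof(dst))
        return -1;

    fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    while (len > 0) {
        ssize_t w = write(fd, p, len);

        if (w < 0)
            goto fail;
        p += w;
        len -= (size_t)w;
    }

    /* Data and inode must be stable before the rename exposes them. */
    if (fsync(fd) < 0)
        goto fail;
    if (close(fd) < 0) {
        unlink(tmp);
        return -1;
    }

    if (rename(tmp, dst) < 0) {
        unlink(tmp);
        return -1;
    }

    /* fsync the parent directory so the rename itself is persistent. */
    dfd = open(dir, O_RDONLY);
    if (dfd < 0)
        return -1;
    if (fsync(dfd) < 0) {
        close(dfd);
        return -1;
    }
    return close(dfd);

fail:
    close(fd);
    unlink(tmp);
    return -1;
}
```

A caller would use it as replace_file_durably("/etc/myapp", "config",
data, data_len). Whether any of those fsync() calls can be relaxed on
a given filesystem is exactly what the documented model has to say.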

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
