linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff Layton <jlayton@kernel.org>
To: NeilBrown <neilb@suse.de>, Jan Kara <jack@suse.cz>,
	"Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Cc: milan.opensource@gmail.com, lkml <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH] fsync.2: ERRORS: add EIO and ENOSPC
Date: Thu, 10 Sep 2020 13:42:28 -0400	[thread overview]
Message-ID: <8842543f4c929f7004cf356224230516a7fe2fb7.camel@kernel.org> (raw)
In-Reply-To: <87k0x2k0wn.fsf@notabene.neil.brown.name>

On Thu, 2020-09-10 at 09:04 +1000, NeilBrown wrote:
> On Tue, Sep 08 2020, Jeff Layton wrote:
> 
> > On Tue, 2020-09-08 at 13:27 +0200, Jan Kara wrote:
> > > Added Jeff to CC since he has written the code...
> > > 
> > > On Mon 07-09-20 09:11:06, Michael Kerrisk (man-pages) wrote:
> > > > [Widening the CC to include Andrew and linux-fsdevel@]
> > > > [Milan: thanks for the patch, but it's unclear to me from your commit
> > > > message how/if you verified the details.]
> > > > 
> > > > Andrew, maybe you (or someone else) can comment, since long ago your
> > > > 
> > > >     commit f79e2abb9bd452d97295f34376dedbec9686b986
> > > >     Author: Andrew Morton <akpm@osdl.org>
> > > >     Date:   Fri Mar 31 02:30:42 2006 -0800
> > > > 
> > > > included a comment that is referred to in  stackoverflow discussion
> > > > about this topic (that SO discussion is in turn referred to by
> > > > https://bugzilla.kernel.org/show_bug.cgi?id=194757).
> > > > 
> > > > The essence as I understand it, is this:
> > > > (1) fsync() (and similar) may fail EIO or ENOSPC, at which point data
> > > > has not been synced.
> > > > (2) In this case, the EIO/ENOSPC setting is cleared so that...
> > > > (3) A subsequent fsync() might return success, but...
> > > > (4) That doesn't mean that the data in (1) landed on the disk.
> > > 
> > > Correct.
> > > 
> > > > The proposed manual page patch below wants to document this, but I'd
> > > > be happy to have an FS-knowledgeable person comment before I apply.
> > > 
> > > Just a small comment below:
> > > 
> > > > On Sat, 29 Aug 2020 at 09:13, <milan.opensource@gmail.com> wrote:
> > > > > From: Milan Shah <milan.opensource@gmail.com>
> > > > > 
> > > > > This Fix addresses Bug 194757.
> > > > > Ref: https://bugzilla.kernel.org/show_bug.cgi?id=194757
> > > > > ---
> > > > >  man2/fsync.2 | 13 +++++++++++++
> > > > >  1 file changed, 13 insertions(+)
> > > > > 
> > > > > diff --git a/man2/fsync.2 b/man2/fsync.2
> > > > > index 96401cd..f38b3e4 100644
> > > > > --- a/man2/fsync.2
> > > > > +++ b/man2/fsync.2
> > > > > @@ -186,6 +186,19 @@ In these cases disk caches need to be disabled using
> > > > >  or
> > > > >  .BR sdparm (8)
> > > > >  to guarantee safe operation.
> > > > > +
> > > > > +When
> > > > > +.BR fsync ()
> > > > > +or
> > > > > +.BR fdatasync ()
> > > > > +returns
> > > > > +.B EIO
> > > > > +or
> > > > > +.B ENOSPC
> > > > > +any error flags on pages in the file mapping are cleared, so subsequent synchronisation attempts
> > > > > +will return without error. It is
> > > > > +.I not
> > > > > +safe to retry synchronisation and assume that a non-error return means prior writes are now on disk.
> > > > >  .SH SEE ALSO
> > > > >  .BR sync (1),
> > > > >  .BR bdflush (2),
> > > 
> > > So the error state isn't really stored "on pages in the file mapping".
> > > Current implementation (since 4.14) is that error state is stored in struct
> > > file (I think this tends to be called "file description" in manpages) and
> > > so EIO / ENOSPC is reported once for each file description of the file that
> > > was open before the error happened. Not sure if we want to be so precise in
> > > the manpages or if it just confuses people. Anyway your takeway that no
> > > error on subsequent fsync() does not mean data was written is correct.
> > > 
> > > 
> > 
> > Thinking about it more, I think we ought to spell this out explicitly as
> > we can in the manpage. This is a point of confusion for a lot of people
> > and not understanding this can lead to data integrity bugs. Maybe
> > something like this in the NOTES section?
> > 
> > '''
> > When fsync returns an error, the file is considered to be "clean". A
> > subsequent call to fsync will not result in a reattempt to write out the
> > data, unless that data has been rewritten. Applications that want to
> > reattempt writing to the file after a transient error must re-write
> > their data.
> > '''
> > 
> > To be clear:
> > 
> > In practice, you'd only have to write enough to redirty each page in
> > most cases.
> 
> Nonononono.  In practice you have to repeat the entire write because you
> cannot know if the cached page is from before the write failure, or has
> since been flushed and reloaded.
> 

Oh, good point! There's no way for userland to know that, so you really
do have to rewrite the whole thing.

> > Also, it is hard to claim that the above behavior is universally true. A
> > filesystem could opt to keep the pages dirty for some errors, but the
> > vast majority just toss out the data whenever there is a writeback
> > problem.
> 
> ...and any filesystem that doesn't behave that way is wasting effort,
> because nothing else can be assumed.
> 

Yeah. I only made the point to be pedantic. There's no benefit to
documenting that, I think...

> Regarding your "NOTES" addition, I don't feel comfortable with the
> "clean" language.  I would prefer something like:
> 
>  When fsync() reports a failure (EIO, ENOSPC, EDQUOT) it must be assumed
>  that any write requests initiated since the previous successful fsync
>  was initiated may have failed, and that any cached data may have been
>  lost.  A future fsync() will not attempt to write out the same data
>  again.  If recovery is possible and desired, the application must
>  repeat all the writes that may have failed.
> 
>  If the regions of a file that were written to prior to a failed fsync()
>  are read, the content reported may not reflect the stored content, and
>  subsequent reads may revert to the stored content at any time.
> 

Much nicer.

Should we make a distinction between usage and functional classes of
errors in this? The "usage" errors will probably not result in the pages
being tossed out, but the functional ones almost certainly will...

-- 
Jeff Layton <jlayton@kernel.org>


  reply	other threads:[~2020-09-10 17:46 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <1598685186-27499-1-git-send-email-milan.opensource@gmail.com>
2020-09-07  7:11 ` [PATCH] fsync.2: ERRORS: add EIO and ENOSPC Michael Kerrisk (man-pages)
2020-09-08 11:27   ` Jan Kara
2020-09-08 16:10     ` Jeff Layton
2020-09-09 22:50       ` NeilBrown
2020-09-08 19:44     ` Jeff Layton
2020-09-09 10:53       ` Michael Kerrisk (man-pages)
2020-09-09 23:04       ` NeilBrown
2020-09-10 17:42         ` Jeff Layton [this message]
2020-09-16 23:25           ` NeilBrown
2020-09-17  7:01             ` Michael Kerrisk (man-pages)
2020-09-09 10:52     ` Michael Kerrisk (man-pages)
2020-09-09 11:21       ` Jan Kara
2020-09-09 11:58         ` Michael Kerrisk (man-pages)
2020-09-09 14:14           ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8842543f4c929f7004cf356224230516a7fe2fb7.camel@kernel.org \
    --to=jlayton@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=milan.opensource@gmail.com \
    --cc=mtk.manpages@gmail.com \
    --cc=neilb@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).