linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff Layton <jlayton@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>,
	Al Viro <viro@ZenIV.linux.org.uk>,  Jan Kara <jack@suse.cz>,
	tytso@mit.edu, axboe@kernel.dk, mawilcox@microsoft.com,
	 ross.zwisler@linux.intel.com, corbet@lwn.net,
	Chris Mason <clm@fb.com>, Josef Bacik <jbacik@fb.com>,
	David Sterba <dsterba@suse.com>,
	"Darrick J . Wong" <darrick.wong@oracle.com>,
	Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Carlos Maiolino <cmaiolino@redhat.com>,
	Eryu Guan <eguan@redhat.com>, David Howells <dhowells@redhat.com>,
	Christoph Hellwig <hch@infradead.org>,
	linux-fsdevel@vger.kernel.org,  linux-mm@kvack.org,
	linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org,
	linux-btrfs@vger.kernel.org, linux-block@vger.kernel.org
Subject: Re: [PATCH v7 00/22] fs: enhanced writeback error reporting with errseq_t (pile #1)
Date: Mon, 19 Jun 2017 12:23:46 -0400	[thread overview]
Message-ID: <1497889426.4654.7.camel@redhat.com> (raw)
In-Reply-To: <20170616193427.13955-1-jlayton@redhat.com>

On Fri, 2017-06-16 at 15:34 -0400, Jeff Layton wrote:
> v7:
> ===
> This is the seventh posting of the patchset to revamp the way writeback
> errors are tracked and reported.
> 
> The main difference from the v6 posting is the removal of the
> FS_WB_ERRSEQ flag. That requires a few other incremental patches in the
> writeback code to ensure that both error tracking models are handled
> in a suitable way.
> 
> Also, a bit more cleanup of the metadata writeback codepaths, and some
> documentation updates.
> 
> Some of these patches have been posted separately, but I'm re-posting
> them here to make it clear that they're prerequisites of the later
> patches in the series.
> 
> This series is based on top of linux-next from a day or so ago. I'd like
> to have this picked up by linux-next in the near future so we can get
> some more extensive testing with it. Should I just plan to maintain a
> topic branch on top of -next and ask Stephen to pick it up?
> 
> Background:
> ===========
> The basic problem is that we have (for a very long time) tracked and
> reported writeback errors based on two flags in the address_space:
> AS_EIO and AS_ENOSPC. Those flags are cleared when they are checked,
> so only the first caller to check them is able to consume them.
> 
> That model is quite unreliable, for several related reasons:
> 
> * only the first fsync caller on the inode will see the error. In a
>   world of containerized setups, that's no longer viable. Applications
>   need to know that their writes are safely stored, and they can
>   currently miss seeing errors that they should be aware of when
>   they're not.
> 
> * there are a lot of internal callers to filemap_fdatawait* and
>   filemap_write_and_wait* that clear these errors but then never report
>   them to userland in any fashion.
> 
> * Some internal callers report writeback errors, but can do so at
>   non-sensical times. For instance, we might want to truncate a file,
>   which triggers a pagecache flush. If that writeback fails, we might
>   report that error to the truncate caller, but a subsequent fsync
>   will likely not see it.
> 
> * Some internal callers try to reset the error flags after clearing
>   them, but that's racy. Another task could check the flags between
>   those two events.
> 
> Solution:
> =========
> This patchset adds a new datatype called an errseq_t that represents a
> sequence of errors. It's a u32, with a field for a POSIX-flavor error
> and a counter, managed with atomics. We can sample that value at a
> particular point in time, and can later tell whether there have been any
> errors since that point.
> 
> That allows us to provide traditional check-and-clear fsync semantics
> on every open file description in a lightweight fashion. fsync callers
> no longer need to coordinate between one another in order to ensure
> that errors at fsync time are handled correctly.
> 
> Strategy:
> =========
> The aim with this pile is to do the minimum possible to support for
> reliable reporting of errors on fsync, without substantially changing
> the internals of the filesystems themselves.
> 
> Most of the internal calls to filemap_fdatawait are left alone, so all
> of the internal error checkers are using the same error handling they
> always have. The only real difference here is that we're better
> reporting errors at fsync.
> 
> I think that we probably will want to eventually convert all of those
> internal callers to use errseq_t based reporting, but that can be done
> in an incremental fashion in follow-on patchsets.
> 
> Testing:
> ========
> I've primarily been testing this with some new xfstests that I will post
> in a separate series. These tests use dm-error fault injection to make
> the underlying block device start throwing I/O errors, and then test the
> how the filesystem layer reports errors after that.
> 
> Jeff Layton (22):
>   fs: remove call_fsync helper function
>   buffer: use mapping_set_error instead of setting the flag
>   fs: check for writeback errors after syncing out buffers in
>     generic_file_fsync
>   buffer: set errors in mapping at the time that the error occurs
>   jbd2: don't clear and reset errors after waiting on writeback
>   mm: clear AS_EIO/AS_ENOSPC when writeback initiation fails
>   mm: don't TestClearPageError in __filemap_fdatawait_range
>   mm: clean up error handling in write_one_page
>   fs: always sync metadata in __generic_file_fsync
>   lib: add errseq_t type and infrastructure for handling it
>   fs: new infrastructure for writeback error handling and reporting
>   mm: tracepoints for writeback error events
>   mm: set both AS_EIO/AS_ENOSPC and errseq_t in mapping_set_error
>   Documentation: flesh out the section in vfs.txt on storing and
>     reporting writeback errors
>   dax: set errors in mapping when writeback fails
>   block: convert to errseq_t based writeback error tracking
>   ext4: use errseq_t based error handling for reporting data writeback
>     errors
>   fs: add f_md_wb_err field to struct file for tracking metadata errors
>   ext4: add more robust reporting of metadata writeback errors
>   ext2: convert to errseq_t based writeback error tracking
>   xfs: minimal conversion to errseq_t writeback error reporting
>   btrfs: minimal conversion to errseq_t writeback error reporting on
>     fsync
> 
>  Documentation/filesystems/vfs.txt |  43 +++++++-
>  drivers/dax/device.c              |   1 +
>  fs/block_dev.c                    |   9 +-
>  fs/btrfs/file.c                   |   7 +-
>  fs/buffer.c                       |  20 ++--
>  fs/dax.c                          |   4 +-
>  fs/ext2/dir.c                     |   8 ++
>  fs/ext2/file.c                    |  26 ++++-
>  fs/ext4/dir.c                     |   8 +-
>  fs/ext4/file.c                    |   5 +-
>  fs/ext4/fsync.c                   |  28 ++++-
>  fs/file_table.c                   |   1 +
>  fs/gfs2/lops.c                    |   2 +-
>  fs/jbd2/commit.c                  |  15 +--
>  fs/libfs.c                        |  12 +--
>  fs/open.c                         |   3 +
>  fs/sync.c                         |   2 +-
>  fs/xfs/xfs_file.c                 |  15 ++-
>  include/linux/buffer_head.h       |   1 +
>  include/linux/errseq.h            |  19 ++++
>  include/linux/fs.h                |  67 ++++++++++--
>  include/linux/pagemap.h           |  31 ++++--
>  include/trace/events/filemap.h    |  52 ++++++++++
>  ipc/shm.c                         |   2 +-
>  lib/Makefile                      |   2 +-
>  lib/errseq.c                      | 208 ++++++++++++++++++++++++++++++++++++++
>  mm/filemap.c                      | 113 +++++++++++++++++----
>  mm/page-writeback.c               |  15 ++-
>  28 files changed, 628 insertions(+), 91 deletions(-)
>  create mode 100644 include/linux/errseq.h
>  create mode 100644 lib/errseq.c
> 

If there are no major objections to this set, I'd like to have
linux-next start picking it up to get some wider testing. What's the
right vehicle for this, given that it touches stuff all over the tree?

I can see 3 potential options:

1) I could just pull these into the branch that Stephen is already
picking up for file-locks in my tree

2) I could put them into a new branch, and have Stephen pull that one in
addition to the file-locks branch

3) It could go in via someone else's tree entirely (Andrew or Al's
maybe?)

I'm fine with any of these. Anyone have thoughts?

Thanks,
-- 
Jeff Layton <jlayton@redhat.com>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2017-06-19 16:23 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-16 19:34 [PATCH v7 00/22] fs: enhanced writeback error reporting with errseq_t (pile #1) Jeff Layton
2017-06-16 19:34 ` [PATCH v7 01/22] fs: remove call_fsync helper function Jeff Layton
2017-06-20 12:31   ` Christoph Hellwig
2017-06-20 15:33   ` Jan Kara
2017-06-26  8:05   ` Carlos Maiolino
2017-06-16 19:34 ` [PATCH v7 02/22] buffer: use mapping_set_error instead of setting the flag Jeff Layton
2017-06-16 19:34 ` [PATCH v7 03/22] fs: check for writeback errors after syncing out buffers in generic_file_fsync Jeff Layton
2017-06-16 19:34 ` [PATCH v7 04/22] buffer: set errors in mapping at the time that the error occurs Jeff Layton
2017-06-26  8:19   ` Carlos Maiolino
2017-06-16 19:34 ` [PATCH v7 05/22] jbd2: don't clear and reset errors after waiting on writeback Jeff Layton
2017-06-20 15:32   ` Jan Kara
2017-06-26  8:23   ` Carlos Maiolino
2017-06-16 19:34 ` [PATCH v7 06/22] mm: clear AS_EIO/AS_ENOSPC when writeback initiation fails Jeff Layton
2017-06-16 19:34 ` [PATCH v7 07/22] mm: don't TestClearPageError in __filemap_fdatawait_range Jeff Layton
2017-06-16 19:34 ` [PATCH v7 08/22] mm: clean up error handling in write_one_page Jeff Layton
2017-06-16 19:34 ` [PATCH v7 09/22] fs: always sync metadata in __generic_file_fsync Jeff Layton
2017-06-16 19:34 ` [PATCH v7 10/22] lib: add errseq_t type and infrastructure for handling it Jeff Layton
2017-06-16 19:34 ` [PATCH v7 11/22] fs: new infrastructure for writeback error handling and reporting Jeff Layton
2017-06-20 12:34   ` Christoph Hellwig
2017-06-20 12:56     ` Jeff Layton
2017-06-16 19:34 ` [PATCH v7 12/22] mm: tracepoints for writeback error events Jeff Layton
2017-06-16 19:34 ` [PATCH v7 13/22] mm: set both AS_EIO/AS_ENOSPC and errseq_t in mapping_set_error Jeff Layton
2017-06-16 19:34 ` [PATCH v7 14/22] Documentation: flesh out the section in vfs.txt on storing and reporting writeback errors Jeff Layton
2017-06-16 19:34 ` [PATCH v7 15/22] dax: set errors in mapping when writeback fails Jeff Layton
2017-06-17 12:39   ` Jeff Layton
2017-06-19 17:48     ` Ross Zwisler
2017-06-16 19:34 ` [PATCH v7 16/22] block: convert to errseq_t based writeback error tracking Jeff Layton
2017-06-20 12:35   ` Christoph Hellwig
2017-06-20 17:44     ` Jeff Layton
2017-06-24 11:59       ` Christoph Hellwig
2017-06-24 13:16         ` Jeff Layton
2017-06-26 14:34           ` Jeff Layton
2017-06-27 15:20             ` Christoph Hellwig
2017-06-16 19:34 ` [PATCH v7 17/22] ext4: use errseq_t based error handling for reporting data writeback errors Jeff Layton
2017-06-16 19:34 ` [PATCH v7 18/22] fs: add f_md_wb_err field to struct file for tracking metadata errors Jeff Layton
2017-06-16 19:34 ` [PATCH v7 19/22] ext4: add more robust reporting of metadata writeback errors Jeff Layton
2017-06-16 19:34 ` [PATCH v7 20/22] ext2: convert to errseq_t based writeback error tracking Jeff Layton
2017-06-16 19:34 ` [PATCH v7 21/22] xfs: minimal conversion to errseq_t writeback error reporting Jeff Layton
2017-06-26 13:40   ` Carlos Maiolino
2017-06-26 15:22   ` Darrick J. Wong
2017-06-26 17:58     ` jlayton
2017-06-26 18:10       ` Darrick J. Wong
2017-06-16 19:34 ` [PATCH v7 22/22] btrfs: minimal conversion to errseq_t writeback error reporting on fsync Jeff Layton
2017-06-19 16:23 ` Jeff Layton [this message]
2017-06-19 23:25   ` [PATCH v7 00/22] fs: enhanced writeback error reporting with errseq_t (pile #1) Stephen Rothwell
2017-06-20 10:16     ` Jeff Layton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1497889426.4654.7.camel@redhat.com \
    --to=jlayton@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=axboe@kernel.dk \
    --cc=clm@fb.com \
    --cc=cmaiolino@redhat.com \
    --cc=corbet@lwn.net \
    --cc=darrick.wong@oracle.com \
    --cc=dhowells@redhat.com \
    --cc=dsterba@suse.com \
    --cc=eguan@redhat.com \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=jbacik@fb.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=mawilcox@microsoft.com \
    --cc=ross.zwisler@linux.intel.com \
    --cc=sfr@canb.auug.org.au \
    --cc=tytso@mit.edu \
    --cc=viro@ZenIV.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).