From: Jeff Layton <jlayton@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>,
Al Viro <viro@ZenIV.linux.org.uk>, Jan Kara <jack@suse.cz>,
tytso@mit.edu, axboe@kernel.dk, mawilcox@microsoft.com,
ross.zwisler@linux.intel.com, corbet@lwn.net,
Chris Mason <clm@fb.com>, Josef Bacik <jbacik@fb.com>,
David Sterba <dsterba@suse.com>,
"Darrick J . Wong" <darrick.wong@oracle.com>,
Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Carlos Maiolino <cmaiolino@redhat.com>,
Eryu Guan <eguan@redhat.com>, David Howells <dhowells@redhat.com>,
Christoph Hellwig <hch@infradead.org>,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org,
linux-btrfs@vger.kernel.org, linux-block@vger.kernel.org
Subject: Re: [PATCH v7 00/22] fs: enhanced writeback error reporting with errseq_t (pile #1)
Date: Mon, 19 Jun 2017 12:23:46 -0400
Message-ID: <1497889426.4654.7.camel@redhat.com>
In-Reply-To: <20170616193427.13955-1-jlayton@redhat.com>
On Fri, 2017-06-16 at 15:34 -0400, Jeff Layton wrote:
> v7:
> ===
> This is the seventh posting of the patchset to revamp the way writeback
> errors are tracked and reported.
>
> The main difference from the v6 posting is the removal of the
> FS_WB_ERRSEQ flag. That requires a few other incremental patches in the
> writeback code to ensure that both error tracking models are handled
> in a suitable way.
>
> Also, a bit more cleanup of the metadata writeback codepaths, and some
> documentation updates.
>
> Some of these patches have been posted separately, but I'm re-posting
> them here to make it clear that they're prerequisites of the later
> patches in the series.
>
> This series is based on top of linux-next from a day or so ago. I'd like
> to have this picked up by linux-next in the near future so we can get
> some more extensive testing with it. Should I just plan to maintain a
> topic branch on top of -next and ask Stephen to pick it up?
>
> Background:
> ===========
> The basic problem is that we have (for a very long time) tracked and
> reported writeback errors based on two flags in the address_space:
> AS_EIO and AS_ENOSPC. Those flags are cleared when they are checked,
> so only the first caller to check them is able to consume them.
>
> That model is quite unreliable, for several related reasons:
>
> * only the first fsync caller on the inode will see the error. In a
> world of containerized setups, that's no longer viable. Applications
> need to know that their writes are safely stored, and currently they
> can miss errors that they should have been told about.
>
> * there are a lot of internal callers to filemap_fdatawait* and
> filemap_write_and_wait* that clear these errors but then never report
> them to userland in any fashion.
>
> * Some internal callers report writeback errors, but can do so at
> nonsensical times. For instance, we might want to truncate a file,
> which triggers a pagecache flush. If that writeback fails, we might
> report that error to the truncate caller, but a subsequent fsync
> will likely not see it.
>
> * Some internal callers try to reset the error flags after clearing
> them, but that's racy. Another task could check the flags between
> those two events.
>
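To make the failure mode in the list above concrete, here is a rough user-space sketch of a check-and-clear flag model. It is not the kernel code: `demo_mapping`, `DEMO_AS_EIO`, `DEMO_AS_ENOSPC`, `demo_mapping_set_error()` and `demo_check_and_clear()` are hypothetical names made up for this illustration, and the precedence between the two flags is an arbitrary choice of the sketch.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for the two per-mapping error flags. */
#define DEMO_AS_EIO    (1u << 0)
#define DEMO_AS_ENOSPC (1u << 1)

/* Hypothetical stand-in for an address_space: just the flag word. */
typedef struct {
    atomic_uint flags;
} demo_mapping;

/* Record a writeback failure by setting one of the two flags. */
static void demo_mapping_set_error(demo_mapping *m, int err)
{
    atomic_fetch_or(&m->flags, err == ENOSPC ? DEMO_AS_ENOSPC : DEMO_AS_EIO);
}

/*
 * Report and clear any pending error. Whoever calls this first consumes
 * the error; every later caller sees 0, which is the problem described
 * in the quoted background section.
 */
static int demo_check_and_clear(demo_mapping *m)
{
    unsigned int old = atomic_fetch_and(&m->flags,
                                        ~(DEMO_AS_EIO | DEMO_AS_ENOSPC));
    int ret = 0;

    if (old & DEMO_AS_ENOSPC)
        ret = -ENOSPC;
    if (old & DEMO_AS_EIO)
        ret = -EIO;   /* assumption of this sketch: EIO wins if both are set */
    return ret;
}
```

With one recorded EIO, the first call to `demo_check_and_clear()` returns `-EIO` and a second call returns 0, regardless of which task or file description made each call.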
> Solution:
> =========
> This patchset adds a new datatype called an errseq_t that represents a
> sequence of errors. It's a u32, with a field for a POSIX-flavor error
> and a counter, managed with atomics. We can sample that value at a
> particular point in time, and can later tell whether there have been any
> errors since that point.
>
> That allows us to provide traditional check-and-clear fsync semantics
> on every open file description in a lightweight fashion. fsync callers
> no longer need to coordinate between one another in order to ensure
> that errors at fsync time are handled correctly.
>
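The sample-then-check idea described above can be sketched in user-space C as follows. This is a simplified illustration under stated assumptions, not the code in lib/errseq.c: the `demo_errseq_*` names and the 12-bit/20-bit split are hypothetical, and the sketch bumps the counter on every recorded error, which the real implementation may well avoid.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumed layout for this sketch: low 12 bits = errno, upper 20 = counter. */
#define DEMO_ERR_BITS 12u
#define DEMO_ERR_MASK ((1u << DEMO_ERR_BITS) - 1u)

typedef _Atomic uint32_t demo_errseq_t;

/* Record a new error (positive errno value) and bump the counter. */
static void demo_errseq_set(demo_errseq_t *eseq, int err)
{
    uint32_t old = atomic_load(eseq);
    uint32_t new;

    do {
        uint32_t counter = (old >> DEMO_ERR_BITS) + 1u;

        new = (counter << DEMO_ERR_BITS) | ((uint32_t)err & DEMO_ERR_MASK);
    } while (!atomic_compare_exchange_weak(eseq, &old, new));
}

/* Take a snapshot of the current value, e.g. at open time. */
static uint32_t demo_errseq_sample(demo_errseq_t *eseq)
{
    return atomic_load(eseq);
}

/*
 * Return 0 if nothing was recorded since *since was taken, otherwise the
 * negative errno. The caller's private cursor is advanced, so each cursor
 * reports a given error once without hiding it from other cursors.
 */
static int demo_errseq_check_and_advance(demo_errseq_t *eseq, uint32_t *since)
{
    uint32_t cur = atomic_load(eseq);

    if (cur == *since)
        return 0;
    *since = cur;
    return -(int)(cur & DEMO_ERR_MASK);
}
```

If two cursors are sampled before an error is recorded, each one sees that error exactly once on its next check, which is the per-open-file-description behavior the quoted text is after. No shared state is cleared, so the callers do not need to coordinate.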
> Strategy:
> =========
> The aim with this pile is to do the minimum possible to support
> reliable reporting of errors on fsync, without substantially changing
> the internals of the filesystems themselves.
>
> Most of the internal calls to filemap_fdatawait are left alone, so all
> of the internal error checkers are using the same error handling they
> always have. The only real difference here is that errors are now
> reported more reliably at fsync.
>
> I think that we probably will want to eventually convert all of those
> internal callers to use errseq_t based reporting, but that can be done
> in an incremental fashion in follow-on patchsets.
>
> Testing:
> ========
> I've primarily been testing this with some new xfstests that I will post
> in a separate series. These tests use dm-error fault injection to make
> the underlying block device start throwing I/O errors, and then test
> how the filesystem layer reports errors after that.
>
> Jeff Layton (22):
> fs: remove call_fsync helper function
> buffer: use mapping_set_error instead of setting the flag
> fs: check for writeback errors after syncing out buffers in
> generic_file_fsync
> buffer: set errors in mapping at the time that the error occurs
> jbd2: don't clear and reset errors after waiting on writeback
> mm: clear AS_EIO/AS_ENOSPC when writeback initiation fails
> mm: don't TestClearPageError in __filemap_fdatawait_range
> mm: clean up error handling in write_one_page
> fs: always sync metadata in __generic_file_fsync
> lib: add errseq_t type and infrastructure for handling it
> fs: new infrastructure for writeback error handling and reporting
> mm: tracepoints for writeback error events
> mm: set both AS_EIO/AS_ENOSPC and errseq_t in mapping_set_error
> Documentation: flesh out the section in vfs.txt on storing and
> reporting writeback errors
> dax: set errors in mapping when writeback fails
> block: convert to errseq_t based writeback error tracking
> ext4: use errseq_t based error handling for reporting data writeback
> errors
> fs: add f_md_wb_err field to struct file for tracking metadata errors
> ext4: add more robust reporting of metadata writeback errors
> ext2: convert to errseq_t based writeback error tracking
> xfs: minimal conversion to errseq_t writeback error reporting
> btrfs: minimal conversion to errseq_t writeback error reporting on
> fsync
>
> Documentation/filesystems/vfs.txt | 43 +++++++-
> drivers/dax/device.c | 1 +
> fs/block_dev.c | 9 +-
> fs/btrfs/file.c | 7 +-
> fs/buffer.c | 20 ++--
> fs/dax.c | 4 +-
> fs/ext2/dir.c | 8 ++
> fs/ext2/file.c | 26 ++++-
> fs/ext4/dir.c | 8 +-
> fs/ext4/file.c | 5 +-
> fs/ext4/fsync.c | 28 ++++-
> fs/file_table.c | 1 +
> fs/gfs2/lops.c | 2 +-
> fs/jbd2/commit.c | 15 +--
> fs/libfs.c | 12 +--
> fs/open.c | 3 +
> fs/sync.c | 2 +-
> fs/xfs/xfs_file.c | 15 ++-
> include/linux/buffer_head.h | 1 +
> include/linux/errseq.h | 19 ++++
> include/linux/fs.h | 67 ++++++++++--
> include/linux/pagemap.h | 31 ++++--
> include/trace/events/filemap.h | 52 ++++++++++
> ipc/shm.c | 2 +-
> lib/Makefile | 2 +-
> lib/errseq.c | 208 ++++++++++++++++++++++++++++++++++++++
> mm/filemap.c | 113 +++++++++++++++++----
> mm/page-writeback.c | 15 ++-
> 28 files changed, 628 insertions(+), 91 deletions(-)
> create mode 100644 include/linux/errseq.h
> create mode 100644 lib/errseq.c
>
If there are no major objections to this set, I'd like to have
linux-next start picking it up to get some wider testing. What's the
right vehicle for this, given that it touches stuff all over the tree?

I can see 3 potential options:

1) I could just pull these into the branch that Stephen is already
   picking up for file-locks in my tree

2) I could put them into a new branch, and have Stephen pull that one in
   addition to the file-locks branch

3) It could go in via someone else's tree entirely (Andrew's or Al's,
   maybe?)

I'm fine with any of these. Anyone have thoughts?

Thanks,
--
Jeff Layton <jlayton@redhat.com>