From: Jan Kara <jack@suse.cz>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org
Subject: Re: [PATCH v2] mm: don't break integrity writeback on ->writepage() error
Date: Thu, 22 Nov 2018 13:06:38 +0100 [thread overview]
Message-ID: <20181122120638.GM9840@quack2.suse.cz> (raw)
In-Reply-To: <20181116134304.32440-1-bfoster@redhat.com>
On Fri 16-11-18 08:43:04, Brian Foster wrote:
> The write_cache_pages() function is used in both background and
> integrity writeback scenarios by various filesystems. Background
> writeback is mostly concerned with cleaning a certain number of
> dirty pages based on various mm heuristics. It may not write the
> full set of dirty pages or wait for I/O to complete. Integrity
> writeback is responsible for persisting a set of dirty pages before
> the writeback job completes. For example, an fsync() call must
> perform integrity writeback to ensure data is on disk before the
> call returns.
>
> write_cache_pages() unconditionally breaks out of its processing
> loop in the event of a ->writepage() error. This is fine for
> background writeback, which had no strict requirements and will
> eventually come around again. This can cause problems for integrity
> writeback on filesystems that might need to clean up state
> associated with failed page writeouts. For example, XFS performs
> internal delayed allocation accounting before returning a
> ->writepage() error, where applicable. If the current writeback
> happens to be associated with an unmount and write_cache_pages()
> completes the writeback prematurely due to error, the filesystem is
> unmounted in an inconsistent state if dirty+delalloc pages still
> exist.
>
> To handle this problem, update write_cache_pages() to always process
> the full set of pages for integrity writeback regardless of
> ->writepage() errors. Save the first encountered error and return it
> to the caller once complete. This facilitates XFS (or any other fs
> that expects integrity writeback to process the entire set of dirty
> pages) to clean up its internal state completely in the event of
> persistent mapping errors. Background writeback continues to exit on
> the first error encountered.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
OK, makes sense and the patch looks good to me. You can add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
>
> Here's a v2 with minor enhancenents based on Andrew Morton's feedback. I
> combined the additional comments with the existing one to avoid having
> too many multi-indent/multi-line comments in this area.
>
> Brian
>
> v2:
> - Dropped unnecessary ->for_sync check.
> - Added comment and updated commit log description.
> v1: https://marc.info/?l=linux-fsdevel&m=154143578027082&w=2
>
> mm/page-writeback.c | 35 +++++++++++++++++++++--------------
> 1 file changed, 21 insertions(+), 14 deletions(-)
>
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 3f690bae6b78..59b4b56d3762 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -2154,6 +2154,7 @@ int write_cache_pages(struct address_space *mapping,
> {
> int ret = 0;
> int done = 0;
> + int error;
> struct pagevec pvec;
> int nr_pages;
> pgoff_t uninitialized_var(writeback_index);
> @@ -2227,25 +2228,31 @@ int write_cache_pages(struct address_space *mapping,
> goto continue_unlock;
>
> trace_wbc_writepage(wbc, inode_to_bdi(mapping->host));
> - ret = (*writepage)(page, wbc, data);
> - if (unlikely(ret)) {
> - if (ret == AOP_WRITEPAGE_ACTIVATE) {
> + error = (*writepage)(page, wbc, data);
> + if (unlikely(error)) {
> + /*
> + * Handle errors according to the type of
> + * writeback. There's no need to continue to for
> + * background writeback. Just push done_index
> + * past this page so media errors won't choke
> + * writeout for the entire file. For integrity
> + * writeback, we must process the entire dirty
> + * set regardless of errors because the fs may
> + * still have state to clear for each page. In
> + * that case we continue processing and return
> + * the first error.
> + */
> + if (error == AOP_WRITEPAGE_ACTIVATE) {
> unlock_page(page);
> - ret = 0;
> - } else {
> - /*
> - * done_index is set past this page,
> - * so media errors will not choke
> - * background writeout for the entire
> - * file. This has consequences for
> - * range_cyclic semantics (ie. it may
> - * not be suitable for data integrity
> - * writeout).
> - */
> + error = 0;
> + } else if (wbc->sync_mode != WB_SYNC_ALL) {
> + ret = error;
> done_index = page->index + 1;
> done = 1;
> break;
> }
> + if (!ret)
> + ret = error;
> }
>
> /*
> --
> 2.17.2
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
prev parent reply other threads:[~2018-11-22 22:45 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-11-16 13:43 [PATCH v2] mm: don't break integrity writeback on ->writepage() error Brian Foster
2018-11-22 12:06 ` Jan Kara [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181122120638.GM9840@quack2.suse.cz \
--to=jack@suse.cz \
--cc=bfoster@redhat.com \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).