From: "zhangyi (F)" <yi.zhang@huawei.com>
To: <linux-ext4@vger.kernel.org>, <tytso@mit.edu>, <jack@suse.com>
Cc: <adilger.kernel@dilger.ca>, <zhangxiaoxu5@huawei.com>,
<linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH v3 0/5] ext4: fix inconsistency since async write metadata buffer error
Date: Mon, 13 Jul 2020 09:40:47 +0800
Message-ID: <4b8a3738-cf3a-a1fb-06d6-c14436cf2cf4@huawei.com>
In-Reply-To: <20200620025427.1756360-1-yi.zhang@huawei.com>

Hi Ted and Jan,

What do you think about this solution?

Thanks,
Yi.

On 2020/6/20 10:54, zhangyi (F) wrote:
> Changes since v2:
> - Christoph objected to the solution of adding a callback in the block
> layer that would let ext4 handle write errors. So for simplicity,
> switch to checking the bdev mapping->wb_err when ext4 gets journal
> write access, as Jan suggested. Maybe we could implement the callback
> by introducing a special inode (e.g. a meta inode) for ext4 in the
> future.
> - Patch 1: Add mapping->wb_err check and invoke ext4_error_err() in
> ext4_journal_get_write_access() if wb_err is different from the
> original one saved at mount time.
> - Patch 2-3: Remove the partial fixes <7963e5ac90125> and <9c83a923c67d>.
> - Patch 4: Fix another inconsistency problem: we may bypass the
> journal's checkpoint procedure if we free metadata buffers that
> failed to be written out asynchronously.
> - Patch 5: Just a cleanup patch.
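
A note on the Patch 1 idea: comparing the block device mapping's wb_err against a value saved at mount time can be modelled in userspace roughly as below. This is only a sketch under stated assumptions. The `model_*` types and functions are hypothetical stand-ins, a plain counter replaces the kernel's error-sequence value, and setting an `aborted` flag stands in for the ext4_error_err() call. It is not the code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct model_mapping {
	unsigned int wb_err;	/* bumped on every failed async writeback */
};

struct model_sb {
	struct model_mapping *bdev_mapping;
	unsigned int wb_err_at_mount;	/* snapshot taken at mount time */
	bool aborted;			/* stands in for ext4_error_err() */
};

/* Remember the error state the block device had when we mounted. */
static void model_mount(struct model_sb *sb, struct model_mapping *mapping)
{
	assert(sb != NULL && mapping != NULL);
	sb->bdev_mapping = mapping;
	sb->wb_err_at_mount = mapping->wb_err;
	sb->aborted = false;
}

/* Called by the modelled writeback completion path on failure. */
static void model_async_write_failed(struct model_mapping *mapping)
{
	mapping->wb_err++;
}

/*
 * Modelled journal write access: refuse, and mark the filesystem as
 * aborted, once the mapping's error state differs from the snapshot.
 * Returns 0 on success, -1 if a writeback error has been seen.
 */
static int model_get_write_access(struct model_sb *sb)
{
	if (sb->aborted)
		return -1;
	if (sb->bdev_mapping->wb_err != sb->wb_err_at_mount) {
		sb->aborted = true;
		return -1;
	}
	return 0;
}
```

In this model a caller takes the snapshot once with model_mount() and consults model_get_write_access() before each metadata modification, so the first modification after a background write failure is the point where the error surfaces.
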
>
> The above 5 patches are based on linux-5.8-rc1 and have been tested
> with xfstests, with no new failures.
>
> Thanks,
> Yi.
>
> -----------------------
>
> Original background
> ===================
>
> This patch set aims to fix the inconsistency problem which has been
> discussed and partially fixed in [1].
>
> The problem occurs on unstable storage with a flaky transport (e.g. an
> iSCSI transport may disconnect for a few seconds and reconnect due to a
> bad network environment). If we fail to async write metadata in the
> background, the end write routine in the block layer will clear the
> buffer's uptodate flag, even though the data in that buffer is actually
> uptodate. As a result, we may read "old && inconsistent" metadata from
> the disk when we get the buffer later, because the uptodate flag was
> cleared and we do not check the write io error flag, or even worse,
> because the buffer has been freed due to memory pressure.
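
The failure sequence described in the paragraph above can be sketched in userspace as follows. The assumptions are that `model_buffer`, `model_disk` and the `model_*` functions are hypothetical and only mimic the uptodate and write io error flags. The kernel's real buffer handling is more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical single-block "disk" and in-memory buffer. */
struct model_disk {
	int block;
};

struct model_buffer {
	int data;
	bool uptodate;
	bool write_io_error;
};

/* Completion of an async write: on failure the disk keeps its old contents. */
static void model_end_async_write(struct model_buffer *bh,
				  struct model_disk *disk, bool io_ok)
{
	assert(bh != NULL && disk != NULL);
	if (io_ok) {
		disk->block = bh->data;
	} else {
		bh->write_io_error = true;
		bh->uptodate = false;	/* in-memory data is still the newest */
	}
}

/* Naive getter: looks only at uptodate, so it re-reads stale data from disk. */
static int model_get_naive(struct model_buffer *bh, const struct model_disk *disk)
{
	if (!bh->uptodate) {
		bh->data = disk->block;
		bh->uptodate = true;
	}
	return bh->data;
}

/* Getter that also checks write_io_error and treats such a buffer as valid. */
static int model_get_checked(struct model_buffer *bh, const struct model_disk *disk)
{
	if (!bh->uptodate && bh->write_io_error)
		bh->uptodate = true;
	return model_get_naive(bh, disk);
}
```

After a failed write, model_get_naive() returns the old on-disk value while model_get_checked() keeps the newer in-memory value. The checked getter only illustrates the idea of treating buffers with write errors as still holding valid data; the series takes a different route and aborts the filesystem instead.
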
>
> Fortunately, if jbd2 does a checkpoint after the async IO error
> happens, the checkpoint routine will check the write_io_error flag and
> abort the journal if it detects an IO error. And in the journal
> recovery case, the recovery code will invoke sync_blockdev() after
> recovery completes, so it will also detect the IO error and refuse to
> mount the filesystem.
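
To make the checkpoint behaviour and the gap described under Patch 4 concrete, here is a small userspace model. Everything named `model_*` is hypothetical, as is the `abort_on_error` switch that toggles between ignoring and honouring the write error when a buffer is freed. The real jbd2 checkpoint and buffer-freeing paths are considerably more complex.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_jbuf {
	bool write_io_error;		/* an async write of this buffer failed */
	bool on_checkpoint_list;	/* still waiting to be checkpointed */
};

struct model_journal {
	bool aborted;
};

/* Checkpoint: inspect every listed buffer, abort on a recorded write error. */
static void model_checkpoint(struct model_journal *j,
			     struct model_jbuf *bufs, size_t n)
{
	assert(j != NULL && (bufs != NULL || n == 0));
	for (size_t i = 0; i < n; i++) {
		if (!bufs[i].on_checkpoint_list)
			continue;
		if (bufs[i].write_io_error)
			j->aborted = true;
		bufs[i].on_checkpoint_list = false;
	}
}

/*
 * Memory reclaim frees a buffer before the checkpoint runs. Without
 * abort_on_error the write error vanishes together with the buffer,
 * so a later checkpoint has nothing left to detect.
 */
static void model_try_to_free(struct model_journal *j,
			      struct model_jbuf *b, bool abort_on_error)
{
	if (abort_on_error && b->write_io_error)
		j->aborted = true;
	b->on_checkpoint_list = false;
}
```

With abort_on_error set to false, freeing a failed buffer followed by a checkpoint leaves the journal running, which is the bypass described for Patch 4. With it set to true, the error is caught at free time.
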
>
> Current ext4 has already dealt with this problem in
> __ext4_get_inode_loc() and in commit 7963e5ac90125 ("ext4: treat
> buffers with write errors as containing valid data"), but that is not
> enough.
>
> [1] https://lore.kernel.org/linux-ext4/20190823030207.GC8130@mit.edu/
>
>
> zhangyi (F) (5):
> ext4: abort the filesystem if failed to async write metadata buffer
> ext4: remove ext4_buffer_uptodate()
> ext4: remove write io error check before read inode block
> jbd2: abort journal if free a async write error metadata buffer
> jbd2: remove unused parameter in jbd2_journal_try_to_free_buffers()
>
> fs/ext4/ext4.h | 16 +++-------------
> fs/ext4/ext4_jbd2.c | 25 +++++++++++++++++++++++++
> fs/ext4/inode.c | 15 +++------------
> fs/ext4/super.c | 23 ++++++++++++++++++++---
> fs/jbd2/transaction.c | 20 ++++++++++++++------
> include/linux/jbd2.h | 2 +-
> 6 files changed, 66 insertions(+), 35 deletions(-)
>

Thread overview (10+ messages):
2020-06-20 2:54 [PATCH v3 0/5] ext4: fix inconsistency since async write metadata buffer error zhangyi (F)
2020-06-20 2:54 ` [PATCH v3 1/5] ext4: abort the filesystem if failed to async write metadata buffer zhangyi (F)
2020-08-07 17:49 ` tytso
2020-06-20 2:54 ` [PATCH v3 2/5] ext4: remove ext4_buffer_uptodate() zhangyi (F)
2020-08-07 17:53 ` tytso
2020-06-20 2:54 ` [PATCH v3 3/5] ext4: remove write io error check before read inode block zhangyi (F)
2020-06-20 2:54 ` [PATCH v3 4/5] jbd2: abort journal if free a async write error metadata buffer zhangyi (F)
2020-08-07 17:59 ` tytso
2020-06-20 2:54 ` [PATCH v3 5/5] jbd2: remove unused parameter in jbd2_journal_try_to_free_buffers() zhangyi (F)
2020-07-13 1:40 ` zhangyi (F) [this message]