From: Baokun Li <libaokun1@huawei.com>
To: Ye Bin <yebin@huaweicloud.com>, <tytso@mit.edu>,
<adilger.kernel@dilger.ca>, <linux-ext4@vger.kernel.org>
Cc: <linux-kernel@vger.kernel.org>, <jack@suse.cz>,
Ye Bin <yebin10@huawei.com>
Subject: Re: [PATCH v3 0/2] fix error flag covered by journal recovery
Date: Thu, 16 Feb 2023 15:18:09 +0800
Message-ID: <fa2c6946-ab52-47f4-e5d2-49043eb2bd50@huawei.com>
In-Reply-To: <20230214022905.765088-1-yebin@huaweicloud.com>
On 2023/2/14 10:29, Ye Bin wrote:
> From: Ye Bin <yebin10@huawei.com>
>
> Diff v3 vs v2:
> Only fix the case where the fs error flag is lost because the previous
> journal errno was not recorded on disk. This can cause the orphan list to
> be dropped while the fs has no error flag recorded, so fsck will not do a
> deep repair.
>
> Diff v2 vs v1:
> Move call 'j_replay_prepare_callback' and 'j_replay_end_callback' from
> ext4_load_journal() to jbd2_journal_recover().
>
> During fault injection testing, the following issue was seen:
> EXT4-fs (dm-5): warning: mounting fs with errors, running e2fsck is recommended
> EXT4-fs (dm-5): Errors on filesystem, clearing orphan list.
> EXT4-fs (dm-5): recovery complete
> EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro
>
> EXT4-fs (dm-5): recovery complete
> EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro
>
> Without any file system check in between, the file system appears clean on
> the second mount. In theory, the kernel should never clear the fs error
> flag. In errors=remount-ro mode the last super block is committed directly
> to disk, so the super block copy in the journal is not up to date. During
> journal recovery, the up-to-date super block is overwritten by the journal
> data. If every super block write fails after journal recovery, the file
> system error flag is lost, and "fsck -a" will not repair the file system
> deeply.
> To solve this, super block recovery from the journal needs extra handling.
>
>
> Ye Bin (2):
> ext4: commit super block if fs record error when journal record
> without error
> ext4: make sure fs error flag setted before clear journal error
>
> fs/ext4/super.c | 18 ++++++++++++++++--
> 1 file changed, 16 insertions(+), 2 deletions(-)

When we follow the flow (umount after the injected fault triggers an
error -> mount, kernel replays the journal -> umount and check the fsck
output), there are three cases:
1. When an injected fault prevents the ERROR_FS flag from being saved to
   disk but j_errno is saved successfully, PATCH 2/2 effectively ensures
   that ERROR_FS is saved to disk, so that fsck performs a forced check
   and discovers the error correctly.
2. When j_errno is lost but the ERROR_FS flag is saved, then after the
   journal replay:
   a. The ext4_super_block on disk has neither the error info nor the
      ERROR_FS flag;
   b. The ext4_super_block in memory contains the error info but not the
      ERROR_FS flag, because the error info is copied separately during
      journal replay;
   c. The ext4_sb_info in memory contains both the error info and the
      ERROR_FS flag.
   This means that the ext4_super_block in memory will be written to disk
   the next time ext4_commit_super() is executed, while the ERROR_FS flag
   in ext4_sb_info will not be written to disk until ext4_put_super() is
   called. So if the disk is removed, loses power, or goes offline before
   then, we will lose the ERROR_FS flag or even the error info.
   (In this case, repairing directly with e2fsck will not do a forced
   check either, because it relies on j_errno to restore the ERROR_FS flag
   after the journal replay. It also reloads the information from disk
   into memory after the journal replay, so the ERROR_FS flag and error
   info are completely lost.)
3. If neither the ERROR_FS flag nor j_errno is saved to disk, we seem to
   be unable to determine whether a deep check is currently needed. But I
   think that when a journal replay is needed, it means the file system
   exited abnormally.
   *Is it possible to consider having e2fsck do a forced check after the
   journal replay?*
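
To make the three cases easier to compare, here is a rough user-space
sketch of the decision they describe. It is only a toy model under stated
assumptions, not kernel or e2fsprogs code: struct saved_state and every
function name are made up for illustration, and case 2 is collapsed to its
worst outcome (the disk goes away before ext4_put_super() writes the flag).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: all names here are hypothetical, not kernel or e2fsprogs code. */
struct saved_state {
	bool error_fs_saved;	/* ERROR_FS reached the on-disk super block */
	bool j_errno_saved;	/* j_errno reached the on-disk journal super block */
};

/*
 * Assumption: replay overwrites the on-disk super block with a stale journal
 * copy that lacks ERROR_FS, so the flag is only present afterwards if j_errno
 * survived and was used to restore it. error_fs_saved is deliberately ignored.
 */
bool error_fs_after_replay(struct saved_state s)
{
	return s.j_errno_saved;
}

/* force_after_replay models the e2fsck behaviour change asked about above. */
bool fsck_does_full_check(struct saved_state s, bool force_after_replay)
{
	return force_after_replay || error_fs_after_replay(s);
}

/* Walk the three cases from the list above. */
void check_cases(void)
{
	struct saved_state case1 = { .error_fs_saved = false, .j_errno_saved = true };
	struct saved_state case2 = { .error_fs_saved = true,  .j_errno_saved = false };
	struct saved_state case3 = { .error_fs_saved = false, .j_errno_saved = false };

	assert(fsck_does_full_check(case1, false));	/* flag restored via j_errno */
	assert(!fsck_does_full_check(case2, false));	/* flag overwritten by replay */
	assert(!fsck_does_full_check(case3, false));	/* nothing was recorded */

	/* With a forced check after every replay, cases 2 and 3 are caught too. */
	assert(fsck_does_full_check(case2, true));
	assert(fsck_does_full_check(case3, true));
}
```

In this model only j_errno decides the outcome, which is why cases 2 and 3
look identical to fsck unless a replay itself triggers the forced check.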
--
With Best Regards,
Baokun Li