From: Jan Kara <jack@suse.cz>
To: "zhangyi (F)" <yi.zhang@huawei.com>
Cc: linux-ext4@vger.kernel.org, tytso@mit.edu, jack@suse.cz,
adilger.kernel@dilger.ca, zhangxiaoxu5@huawei.com
Subject: Re: [PATCH] ext4, jbd2: switch to use completion variable instead of JBD2_REC_ERR
Date: Mon, 8 Jun 2020 09:57:29 +0200 [thread overview]
Message-ID: <20200608075729.GI13248@quack2.suse.cz> (raw)
In-Reply-To: <20200526142039.32643-1-yi.zhang@huawei.com>
On Tue 26-05-20 22:20:39, zhangyi (F) wrote:
> In the ext4 filesystem with errors=panic, if one process is recording
> errno in the superblock when invoking jbd2_journal_abort() due to some
> error cases, it could be raced by another __ext4_abort() which is
> setting the SB_RDONLY flag but missing panic because errno has not been
> recorded.
>
> jbd2_journal_abort()
> journal->j_flags |= JBD2_ABORT;
> jbd2_journal_update_sb_errno()
> | __ext4_abort()
> | sb->s_flags |= SB_RDONLY;
> | if (!JBD2_REC_ERR)
> | return;
> journal->j_flags |= JBD2_REC_ERR;
>
> Finally, it will no longer trigger panic because the filesystem has
> already been set read-only. Fix this by remove JBD2_REC_ERR and switch
> to use completion variable instead.
Thanks for the patch! I don't quite understand how this last part can
happen: "Finally, it will no longer trigger panic because the filesystem has
already been set read-only."
AFAIU jbd2_journal_abort() gets called somewhere from jbd2 so ext4 doesn't
know about it. At the same time ext4_abort() gets called somewhere from
ext4 and races as you describe above. OK. But then the next ext4_abort()
call should panic() just fine. What am I missing? I understand that we
might want that the first ext4_abort() already triggers the panic but I'd
like to understand whether that's the bug you're trying to fix or something
else...
WRT the solution I think that the completion you add unnecessarily
complicates matters. I'd rather introduce j_abort_mutex to the journal and
all jbd2_journal_abort() calls will take it and release it once everything
is done. That way we can remove JBD2_REC_ERR, races are avoided, and the
filesystem (ext4 or ocfs2) knows that after its call to
jbd2_journal_abort() completes, journal abort is completed (either by us or
someone else) and so we are free to panic. No need for strange
wait_for_completion() calls in ext4_handle_error() or __ext4_abort() and
the error handling is again fully self-contained within the jbd2 layer.
Honza
> Fixes: 4327ba52afd03 ("ext4, jbd2: ensure entering into panic after recording an error in superblock")
> Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
> ---
> fs/ext4/super.c | 25 +++++++++++++------------
> fs/jbd2/journal.c | 6 ++----
> include/linux/jbd2.h | 6 +++++-
> 3 files changed, 20 insertions(+), 17 deletions(-)
>
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index bf5fcb477f66..987a0bd5b78a 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -495,6 +495,8 @@ static bool system_going_down(void)
>
> static void ext4_handle_error(struct super_block *sb)
> {
> + struct ext4_sb_info *sbi = EXT4_SB(sb);
> +
> if (test_opt(sb, WARN_ON_ERROR))
> WARN_ON_ONCE(1);
>
> @@ -502,9 +504,9 @@ static void ext4_handle_error(struct super_block *sb)
> return;
>
> if (!test_opt(sb, ERRORS_CONT)) {
> - journal_t *journal = EXT4_SB(sb)->s_journal;
> + journal_t *journal = sbi->s_journal;
>
> - EXT4_SB(sb)->s_mount_flags |= EXT4_MF_FS_ABORTED;
> + sbi->s_mount_flags |= EXT4_MF_FS_ABORTED;
> if (journal)
> jbd2_journal_abort(journal, -EIO);
> }
> @@ -522,9 +524,8 @@ static void ext4_handle_error(struct super_block *sb)
> smp_wmb();
> sb->s_flags |= SB_RDONLY;
> } else if (test_opt(sb, ERRORS_PANIC)) {
> - if (EXT4_SB(sb)->s_journal &&
> - !(EXT4_SB(sb)->s_journal->j_flags & JBD2_REC_ERR))
> - return;
> + if (sbi->s_journal && is_journal_aborted(sbi->s_journal))
> + wait_for_completion(&sbi->s_journal->j_record_errno);
> panic("EXT4-fs (device %s): panic forced after error\n",
> sb->s_id);
> }
> @@ -710,10 +711,11 @@ void __ext4_std_error(struct super_block *sb, const char *function,
> void __ext4_abort(struct super_block *sb, const char *function,
> unsigned int line, int error, const char *fmt, ...)
> {
> + struct ext4_sb_info *sbi = EXT4_SB(sb);
> struct va_format vaf;
> va_list args;
>
> - if (unlikely(ext4_forced_shutdown(EXT4_SB(sb))))
> + if (unlikely(ext4_forced_shutdown(sbi)))
> return;
>
> save_error_info(sb, error, 0, 0, function, line);
> @@ -726,20 +728,19 @@ void __ext4_abort(struct super_block *sb, const char *function,
>
> if (sb_rdonly(sb) == 0) {
> ext4_msg(sb, KERN_CRIT, "Remounting filesystem read-only");
> - EXT4_SB(sb)->s_mount_flags |= EXT4_MF_FS_ABORTED;
> + sbi->s_mount_flags |= EXT4_MF_FS_ABORTED;
> /*
> * Make sure updated value of ->s_mount_flags will be visible
> * before ->s_flags update
> */
> smp_wmb();
> sb->s_flags |= SB_RDONLY;
> - if (EXT4_SB(sb)->s_journal)
> - jbd2_journal_abort(EXT4_SB(sb)->s_journal, -EIO);
> + if (sbi->s_journal)
> + jbd2_journal_abort(sbi->s_journal, -EIO);
> }
> if (test_opt(sb, ERRORS_PANIC) && !system_going_down()) {
> - if (EXT4_SB(sb)->s_journal &&
> - !(EXT4_SB(sb)->s_journal->j_flags & JBD2_REC_ERR))
> - return;
> + if (sbi->s_journal && is_journal_aborted(sbi->s_journal))
> + wait_for_completion(&sbi->s_journal->j_record_errno);
> panic("EXT4-fs panic from previous error\n");
> }
> }
> diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> index a49d0e670ddf..b8acdb2f7ac7 100644
> --- a/fs/jbd2/journal.c
> +++ b/fs/jbd2/journal.c
> @@ -1140,6 +1140,7 @@ static journal_t *journal_init_common(struct block_device *bdev,
> init_waitqueue_head(&journal->j_wait_commit);
> init_waitqueue_head(&journal->j_wait_updates);
> init_waitqueue_head(&journal->j_wait_reserved);
> + init_completion(&journal->j_record_errno);
> mutex_init(&journal->j_barrier);
> mutex_init(&journal->j_checkpoint_mutex);
> spin_lock_init(&journal->j_revoke_lock);
> @@ -2188,10 +2189,7 @@ void jbd2_journal_abort(journal_t *journal, int errno)
> * layer could realise that a filesystem check is needed.
> */
> jbd2_journal_update_sb_errno(journal);
> -
> - write_lock(&journal->j_state_lock);
> - journal->j_flags |= JBD2_REC_ERR;
> - write_unlock(&journal->j_state_lock);
> + complete_all(&journal->j_record_errno);
> }
>
> /**
> diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
> index f613d8529863..0f623b0c347f 100644
> --- a/include/linux/jbd2.h
> +++ b/include/linux/jbd2.h
> @@ -765,6 +765,11 @@ struct journal_s
> */
> int j_errno;
>
> + /**
> + * @j_record_errno: complete to record errno in the journal superblock
> + */
> + struct completion j_record_errno;
> +
> /**
> * @j_sb_buffer: The first part of the superblock buffer.
> */
> @@ -1247,7 +1252,6 @@ JBD2_FEATURE_INCOMPAT_FUNCS(csum3, CSUM_V3)
> #define JBD2_ABORT_ON_SYNCDATA_ERR 0x040 /* Abort the journal on file
> * data write error in ordered
> * mode */
> -#define JBD2_REC_ERR 0x080 /* The errno in the sb has been recorded */
>
> /*
> * Function declarations for the journaling transaction and buffer
> --
> 2.21.3
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
next prev parent reply other threads:[~2020-06-08 7:57 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-05-26 14:20 [PATCH] ext4, jbd2: switch to use completion variable instead of JBD2_REC_ERR zhangyi (F)
2020-06-08 3:25 ` zhangyi (F)
2020-06-08 7:57 ` Jan Kara [this message]
2020-06-08 15:08 ` zhangyi (F)
2020-06-08 15:36 ` Jan Kara
2020-06-09 7:03 ` zhangyi (F)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200608075729.GI13248@quack2.suse.cz \
--to=jack@suse.cz \
--cc=adilger.kernel@dilger.ca \
--cc=linux-ext4@vger.kernel.org \
--cc=tytso@mit.edu \
--cc=yi.zhang@huawei.com \
--cc=zhangxiaoxu5@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox