From: Long Li <leo.lilong@huawei.com>
To: Jan Kara <jack@suse.cz>, <leo.lilong@huaweicloud.com>
Cc: <tytso@mit.edu>, <adilger.kernel@dilger.ca>,
<linux-ext4@vger.kernel.org>, <yi.zhang@huawei.com>,
<yangerkun@huawei.com>
Subject: Re: [RESEND PATCH] ext4: Fix race in buffer_head read fault injection
Date: Fri, 8 Nov 2024 11:09:35 +0800 [thread overview]
Message-ID: <20241108030935.GA3232538@ceph-admin> (raw)
In-Reply-To: <20241107155342.sonicmzg7leo63nq@quack3>
On Thu, Nov 07, 2024 at 04:53:42PM +0100, Jan Kara wrote:
> On Thu 24-10-24 10:19:09, leo.lilong@huaweicloud.com wrote:
> > From: Long Li <leo.lilong@huawei.com>
> >
> > When I enabled ext4 debug for fault injection testing, I encountered the
> > following warning:
> >
> > EXT4-fs error (device sda): ext4_read_inode_bitmap:201: comm fsstress:
> > Cannot read inode bitmap - block_group = 8, inode_bitmap = 1051
> > WARNING: CPU: 0 PID: 511 at fs/buffer.c:1181 mark_buffer_dirty+0x1b3/0x1d0
> >
> > The root cause of the issue lies in the improper implementation of ext4's
> > buffer_head read fault injection. The actual completion of buffer_head
> > read and the buffer_head fault injection are not atomic, which can lead
> > to the uptodate flag being cleared on normally used buffer_heads in race
> > conditions.
> >
> > [CPU0] [CPU1] [CPU2]
> > ext4_read_inode_bitmap
> > ext4_read_bh()
> > <bh read complete>
> > ext4_read_inode_bitmap
> > if (buffer_uptodate(bh))
> > return bh
> > jbd2_journal_commit_transaction
> > __jbd2_journal_refile_buffer
> > __jbd2_journal_unfile_buffer
> > __jbd2_journal_temp_unlink_buffer
> > ext4_simulate_fail_bh()
> > clear_buffer_uptodate
> > mark_buffer_dirty
> > <report warning>
> > WARN_ON_ONCE(!buffer_uptodate(bh))
> >
> > The best approach would be to perform fault injection in the IO completion
> > callback function, rather than after IO completion. However, the IO
> > completion callback function cannot get the fault injection code in sb.
> >
> > Fix it by passing the result of fault injection into the bh read function,
> > we simulate faults within the bh read function itself. This requires adding
> > an extra parameter to the bh read functions that need fault injection.
> >
> > Fixes: 46f870d690fe ("ext4: simulate various I/O and checksum errors when reading metadata")
> > Signed-off-by: Long Li <leo.lilong@huawei.com>
>
> Thanks for the fix! One suggestion below:
>
> > @@ -3100,9 +3092,9 @@ extern struct buffer_head *ext4_sb_bread(struct super_block *sb,
> > extern struct buffer_head *ext4_sb_bread_unmovable(struct super_block *sb,
> > sector_t block);
> > extern void ext4_read_bh_nowait(struct buffer_head *bh, blk_opf_t op_flags,
> > - bh_end_io_t *end_io);
> > + bh_end_io_t *end_io, bool simu_fail);
> > extern int ext4_read_bh(struct buffer_head *bh, blk_opf_t op_flags,
> > - bh_end_io_t *end_io);
> > + bh_end_io_t *end_io, bool simu_fail);
>
> Instead of adding a bool argument whether we should simulate a failure, I'd
> pass the 'code' into ext4_read_bh_nowait() and handle the check in there.
> That reduces the boilerplate code a bit and looks somewhat cleaner.
>
> Honza
Hi Honza,
Thanks for your reply, your solution does appear more cleaner, but it seems
we cannot directly get sbi->s_simulate_fail in ext4_read_bh_nowait(), nor
can we get it in the IO completion callback function.
Thanks,
Long Li
prev parent reply other threads:[~2024-11-08 2:55 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-24 2:19 [RESEND PATCH] ext4: Fix race in buffer_head read fault injection leo.lilong
2024-11-07 15:53 ` Jan Kara
2024-11-08 3:09 ` Long Li [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241108030935.GA3232538@ceph-admin \
--to=leo.lilong@huawei.com \
--cc=adilger.kernel@dilger.ca \
--cc=jack@suse.cz \
--cc=leo.lilong@huaweicloud.com \
--cc=linux-ext4@vger.kernel.org \
--cc=tytso@mit.edu \
--cc=yangerkun@huawei.com \
--cc=yi.zhang@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).