All of lore.kernel.org
 help / color / mirror / Atom feed
From: Long Li <leo.lilong@huawei.com>
To: Jan Kara <jack@suse.cz>, <leo.lilong@huaweicloud.com>
Cc: <tytso@mit.edu>, <adilger.kernel@dilger.ca>,
	<linux-ext4@vger.kernel.org>, <yi.zhang@huawei.com>,
	<yangerkun@huawei.com>
Subject: Re: [RESEND PATCH] ext4: Fix race in buffer_head read fault injection
Date: Fri, 8 Nov 2024 11:09:35 +0800	[thread overview]
Message-ID: <20241108030935.GA3232538@ceph-admin> (raw)
In-Reply-To: <20241107155342.sonicmzg7leo63nq@quack3>

On Thu, Nov 07, 2024 at 04:53:42PM +0100, Jan Kara wrote:
> On Thu 24-10-24 10:19:09, leo.lilong@huaweicloud.com wrote:
> > From: Long Li <leo.lilong@huawei.com>
> > 
> > When I enabled ext4 debug for fault injection testing, I encountered the
> > following warning:
> > 
> >   EXT4-fs error (device sda): ext4_read_inode_bitmap:201: comm fsstress:
> >          Cannot read inode bitmap - block_group = 8, inode_bitmap = 1051
> >   WARNING: CPU: 0 PID: 511 at fs/buffer.c:1181 mark_buffer_dirty+0x1b3/0x1d0
> > 
> > The root cause of the issue lies in the improper implementation of ext4's
> > buffer_head read fault injection. The actual completion of buffer_head
> > read and the buffer_head fault injection are not atomic, which can lead
> > to the uptodate flag being cleared on normally used buffer_heads in race
> > conditions.
> > 
> > [CPU0]           [CPU1]         [CPU2]
> > ext4_read_inode_bitmap
> >   ext4_read_bh()
> >   <bh read complete>
> >                  ext4_read_inode_bitmap
> >                    if (buffer_uptodate(bh))
> >                      return bh
> >                                jbd2_journal_commit_transaction
> >                                  __jbd2_journal_refile_buffer
> >                                    __jbd2_journal_unfile_buffer
> >                                      __jbd2_journal_temp_unlink_buffer
> >   ext4_simulate_fail_bh()
> >     clear_buffer_uptodate
> >                                       mark_buffer_dirty
> >                                         <report warning>
> >                                         WARN_ON_ONCE(!buffer_uptodate(bh))
> > 
> > The best approach would be to perform fault injection in the IO completion
> > callback function, rather than after IO completion. However, the IO
> > completion callback function cannot get the fault injection code in sb.
> > 
> > Fix it by passing the result of fault injection into the bh read function,
> > we simulate faults within the bh read function itself. This requires adding
> > an extra parameter to the bh read functions that need fault injection.
> > 
> > Fixes: 46f870d690fe ("ext4: simulate various I/O and checksum errors when reading metadata")
> > Signed-off-by: Long Li <leo.lilong@huawei.com>
> 
> Thanks for the fix! One suggestion below:
> 
> > @@ -3100,9 +3092,9 @@ extern struct buffer_head *ext4_sb_bread(struct super_block *sb,
> >  extern struct buffer_head *ext4_sb_bread_unmovable(struct super_block *sb,
> >  						   sector_t block);
> >  extern void ext4_read_bh_nowait(struct buffer_head *bh, blk_opf_t op_flags,
> > -				bh_end_io_t *end_io);
> > +				bh_end_io_t *end_io, bool simu_fail);
> >  extern int ext4_read_bh(struct buffer_head *bh, blk_opf_t op_flags,
> > -			bh_end_io_t *end_io);
> > +			bh_end_io_t *end_io, bool simu_fail);
> 
> Instead of adding a bool argument whether we should simulate a failure, I'd
> pass the 'code' into ext4_read_bh_nowait() and handle the check in there.
> That reduces the boilerplate code a bit and looks somewhat cleaner.
> 
> 								Honza

Hi Honza,

Thanks for your reply, your solution does appear more cleaner, but it seems
we cannot directly get sbi->s_simulate_fail in ext4_read_bh_nowait(), nor
can we get it in the IO completion callback function.

Thanks,
Long Li

      reply	other threads:[~2024-11-08  2:55 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-24  2:19 [RESEND PATCH] ext4: Fix race in buffer_head read fault injection leo.lilong
2024-11-07 15:53 ` Jan Kara
2024-11-08  3:09   ` Long Li [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20241108030935.GA3232538@ceph-admin \
    --to=leo.lilong@huawei.com \
    --cc=adilger.kernel@dilger.ca \
    --cc=jack@suse.cz \
    --cc=leo.lilong@huaweicloud.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=tytso@mit.edu \
    --cc=yangerkun@huawei.com \
    --cc=yi.zhang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.