public inbox for linux-raid@vger.kernel.org
 help / color / mirror / Atom feed
From: Xiao Ni <xni@redhat.com>
To: Benjamin Marzinski <bmarzins@redhat.com>,
	Yu Kuai
	<yukuai@alb-78bjiv52429oh8qptp.cn-shenzhen.alb.aliyuncs.com>,
	Song Liu <song@kernel.org>, Li Nan <linan122@huawei.com>
Cc: linux-raid@vger.kernel.org, dm-devel@lists.linux.dev,
	Nigel Croxon <ncroxon@redhat.com>
Subject: Re: [PATCH v2] md/raid5: Fix UAF on IO across the reshape position
Date: Mon, 13 Apr 2026 10:07:01 +0800	[thread overview]
Message-ID: <3fed2f45-37a5-44cf-b862-f778ca685af9@redhat.com> (raw)
In-Reply-To: <20260408043548.1695157-1-bmarzins@redhat.com>


在 2026/4/8 12:35, Benjamin Marzinski 写道:
> If make_stripe_request() returns STRIPE_WAIT_RESHAPE,
> raid5_make_request() will free the cloned bio. But raid5_make_request()
> can call make_stripe_request() multiple times, writing to the various
> stripes. If that bio got added to the toread or towrite lists of a
> stripe disk in an earlier call to make_stripe_request(), then it's not
> safe to just free the bio if a later part of it is found to cross the
> reshape position. Doing so can lead to a UAF error, when bio_endio()
> is called on the bio for the earlier stripes.
>
> Instead, raid5_make_request() needs to wait until all parts of the bio
> have called bio_endio(). To do this, bios that cross the reshape
> position while the reshape can't make progress are flagged as needing to
> wait for all parts to complete. When raid5_make_request() has a bio that
> failed make_stripe_request() with STRIPE_WAIT_RESHAPE, it sets
> bi->bi_private to a completion struct and waits for completion after
> ending the bio.  When the bio_endio() is called for the last time on a
> clone bio with bi->bi_private set, it wakes up the waiter. This
> guarantees that raid5_make_request() doesn't return until the cloned bio
> needing a retry for io across the reshape boundary is safely cleaned up.
>
> There is a simple reproducer available at [1]. Compile the kernel with
> KASAN for more useful reporting when the error is triggered (this is not
> necessary to see the bug).
>
> [1] https://gist.github.com/bmarzins/e48598824305cf2171289e47d7241fa5
>
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>
> Changes from v1:
> - Removed mddev->pending_retry_bios, mddev->retry_bios_wait, and
>    md_io_clone->must_retry. Instead, use a completion struct
>    pointed to by bi->bi_private, as suggested by Xiao Ni and Yu Kuai.
>
>   drivers/md/md.c    | 31 ++++++++-----------------------
>   drivers/md/md.h    |  1 -
>   drivers/md/raid5.c |  7 ++++++-
>   3 files changed, 14 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 3ce6f9e9d38e..4318d875a5f6 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -9215,9 +9215,11 @@ static void md_bitmap_end(struct mddev *mddev, struct md_io_clone *md_io_clone)
>   
>   static void md_end_clone_io(struct bio *bio)
>   {
> -	struct md_io_clone *md_io_clone = bio->bi_private;
> +	struct md_io_clone *md_io_clone = container_of(bio, struct md_io_clone,
> +						       bio_clone);
>   	struct bio *orig_bio = md_io_clone->orig_bio;
>   	struct mddev *mddev = md_io_clone->mddev;
> +	struct completion *reshape_completion = bio->bi_private;
>   
>   	if (bio_data_dir(orig_bio) == WRITE && md_bitmap_enabled(mddev, false))
>   		md_bitmap_end(mddev, md_io_clone);
> @@ -9229,7 +9231,10 @@ static void md_end_clone_io(struct bio *bio)
>   		bio_end_io_acct(orig_bio, md_io_clone->start_time);
>   
>   	bio_put(bio);
> -	bio_endio(orig_bio);
> +	if (unlikely(reshape_completion))
> +		complete(reshape_completion);
> +	else
> +		bio_endio(orig_bio);
>   	percpu_ref_put(&mddev->active_io);
>   }
>   
> @@ -9254,7 +9259,7 @@ static void md_clone_bio(struct mddev *mddev, struct bio **bio)
>   	}
>   
>   	clone->bi_end_io = md_end_clone_io;
> -	clone->bi_private = md_io_clone;
> +	clone->bi_private = NULL;
>   	*bio = clone;
>   }
>   
> @@ -9265,26 +9270,6 @@ void md_account_bio(struct mddev *mddev, struct bio **bio)
>   }
>   EXPORT_SYMBOL_GPL(md_account_bio);
>   
> -void md_free_cloned_bio(struct bio *bio)
> -{
> -	struct md_io_clone *md_io_clone = bio->bi_private;
> -	struct bio *orig_bio = md_io_clone->orig_bio;
> -	struct mddev *mddev = md_io_clone->mddev;
> -
> -	if (bio_data_dir(orig_bio) == WRITE && md_bitmap_enabled(mddev, false))
> -		md_bitmap_end(mddev, md_io_clone);
> -
> -	if (bio->bi_status && !orig_bio->bi_status)
> -		orig_bio->bi_status = bio->bi_status;
> -
> -	if (md_io_clone->start_time)
> -		bio_end_io_acct(orig_bio, md_io_clone->start_time);
> -
> -	bio_put(bio);
> -	percpu_ref_put(&mddev->active_io);
> -}
> -EXPORT_SYMBOL_GPL(md_free_cloned_bio);
> -
>   /* md_allow_write(mddev)
>    * Calling this ensures that the array is marked 'active' so that writes
>    * may proceed without blocking.  It is important to call this before
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index ac84289664cd..5d57fee22901 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -917,7 +917,6 @@ extern void md_finish_reshape(struct mddev *mddev);
>   void md_submit_discard_bio(struct mddev *mddev, struct md_rdev *rdev,
>   			struct bio *bio, sector_t start, sector_t size);
>   void md_account_bio(struct mddev *mddev, struct bio **bio);
> -void md_free_cloned_bio(struct bio *bio);
>   
>   extern bool __must_check md_flush_request(struct mddev *mddev, struct bio *bio);
>   void md_write_metadata(struct mddev *mddev, struct md_rdev *rdev,
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index a8e8d431071b..dc0c680ca199 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -6217,7 +6217,12 @@ static bool raid5_make_request(struct mddev *mddev, struct bio * bi)
>   
>   	mempool_free(ctx, conf->ctx_pool);
>   	if (res == STRIPE_WAIT_RESHAPE) {
> -		md_free_cloned_bio(bi);
> +		DECLARE_COMPLETION_ONSTACK(done);
> +		WRITE_ONCE(bi->bi_private, &done);
> +
> +		bio_endio(bi);
> +
> +		wait_for_completion(&done);
>   		return false;
>   	}
>   


The patch looks good to me.
Reviewed-by: Xiao Ni <xni@redhat.com>



  parent reply	other threads:[~2026-04-13  2:07 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-08  4:35 [PATCH v2] md/raid5: Fix UAF on IO across the reshape position Benjamin Marzinski
2026-04-08 11:22 ` Xiao Ni
2026-04-08 19:57   ` Benjamin Marzinski
2026-04-09  2:31     ` Xiao Ni
2026-04-13  2:08       ` Xiao Ni
2026-04-13  2:07 ` Xiao Ni [this message]
2026-04-19  3:51 ` Yu Kuai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3fed2f45-37a5-44cf-b862-f778ca685af9@redhat.com \
    --to=xni@redhat.com \
    --cc=bmarzins@redhat.com \
    --cc=dm-devel@lists.linux.dev \
    --cc=linan122@huawei.com \
    --cc=linux-raid@vger.kernel.org \
    --cc=ncroxon@redhat.com \
    --cc=song@kernel.org \
    --cc=yukuai@alb-78bjiv52429oh8qptp.cn-shenzhen.alb.aliyuncs.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox