Linux RAID subsystem development
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Yu Kuai" <yukuai@kernel.org>
Cc: linux-raid@vger.kernel.org, yukuai@fygo.io
Subject: Re: [PATCH v2 19/20] md/raid5: wire llbitmap reshape lifecycle
Date: Wed, 24 Jun 2026 07:20:25 +0000	[thread overview]
Message-ID: <20260624072026.52B5F1F000E9@smtp.kernel.org> (raw)
In-Reply-To: <55686793056df8053d1797d22c18fce8df7eebe8.1782282042.git.yukuai@kernel.org>

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [Critical] Checkpointing the translated llbitmap to disk before the MD superblock is updated creates a crash window that leads to silent data corruption on restart.
- [Critical] The llbitmap page control array (`llbitmap->pctl`) can be accessed after being freed during RAID5 reshape initialization because the bitmap resize happens before the array is quiesced.
--

commit 18675e43f792016c6cde2f177af7a007c667120d
Author: Yu Kuai <yukuai@fygo.io>

md/raid5: wire llbitmap reshape lifecycle

Prepare llbitmap before RAID5 reshape starts, checkpoint the bitmap
before advancing reshape_position, and finish the llbitmap geometry
update when reshape completes.

> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 9de648f75dd04..ac7ea483502fd 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c

[ ... ]

> @@ -6430,6 +6430,13 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk
>  			   || test_bit(MD_RECOVERY_INTR, &mddev->recovery));
>  		if (atomic_read(&conf->reshape_stripes) != 0)
>  			return 0;
> +		if (md_bitmap_enabled(mddev, false) &&
> +		    mddev->bitmap_ops->reshape_mark &&
> +		    conf->reshape_safe != conf->reshape_progress) {
> +			mddev->bitmap_ops->reshape_mark(mddev, conf->reshape_safe,
> +						       conf->reshape_progress);
> +			mddev->bitmap_ops->unplug(mddev, true);
> +		}

[Severity: Critical]
Does checkpointing the translated llbitmap to disk before the MD superblock
is updated create a crash window that leads to silent data corruption on
restart?

During RAID5 reshape, the code translates the bitmap to the new layout via
reshape_mark() and flushes it to disk before asynchronously waking the MD
thread to update the superblock.

If the system crashes after the bitmap flush but before the superblock write
completes, the on-disk metadata will still point to the old reshape position
(reshape_safe), while the on-disk bitmap for the region 
[reshape_safe, reshape_progress] will be stored in the new layout.

On reboot, the array will resume the reshape from reshape_safe and
misinterpret the newly mapped bitmap bits as if they were in the old layout.
If the array is degraded during this restart, it might incorrectly assume the
parity for these sectors is in-sync and reconstruct missing data using
corrupted parity.

>  		mddev->reshape_position = conf->reshape_progress;
>  		mddev->curr_resync_completed = sector_nr;
>  		if (!mddev->reshape_backwards)

[ ... ]

> @@ -8571,6 +8585,12 @@ static int raid5_start_reshape(struct mddev *mddev)
>  			mdname(mddev));
>  		return -EINVAL;
>  	}
> +	if (md_bitmap_enabled(mddev, false) &&
> +	    mddev->bitmap_id == ID_LLBITMAP) {
> +		i = mddev->bitmap_ops->resize(mddev, mddev->dev_sectors, 0);
> +		if (i)
> +			return i;
> +	}

[Severity: Critical]
Can this resize operation trigger a use-after-free regression since it
occurs before raid5_quiesce() stops concurrent I/O?

In raid5_start_reshape(), mddev->bitmap_ops->resize() is called to prepare
the bitmap for the new geometry before raid5_quiesce() is invoked. This means
the array is not suspended, and concurrent I/O is actively being handled.

If the resize operation calls llbitmap_expand_pages() and frees the old
pctl array via kfree() without RCU synchronization, concurrent fast-path I/O
threads executing llbitmap_start_write() could locklessly dereference the
freed pointer array in llbitmap_raise_barrier().

>  
>  	atomic_set(&conf->reshape_stripes, 0);
>  	spin_lock_irq(&conf->device_lock);

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/cover.1782282042.git.yukuai@kernel.org?part=19

  reply	other threads:[~2026-06-24  7:20 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-05  9:15 [PATCH 00/20] md/md-llbitmap: support reshape for RAID10 and RAID5 Yu Kuai
2026-06-05  9:15 ` [PATCH] md: add exact bitmap mapping and reshape hooks Yu Kuai
2026-06-05  9:15 ` [PATCH] md: skip bitmap accounting for empty write ranges Yu Kuai
2026-06-05  9:15 ` [PATCH] md: add helper to split bios at reshape offset Yu Kuai
2026-06-05  9:15 ` [PATCH] md/md-llbitmap: track bitmap sync_size explicitly Yu Kuai
2026-06-15 10:48   ` Su Yue
2026-06-05  9:15 ` [PATCH] md/md-llbitmap: allocate page controls independently Yu Kuai
2026-06-15 11:06   ` Su Yue
2026-06-05  9:15 ` [PATCH] md/md-llbitmap: grow the page cache in place for reshape Yu Kuai
2026-06-15 11:16   ` Su Yue
2026-06-15 16:19     ` yu kuai
2026-06-05  9:15 ` [PATCH] md/md-llbitmap: track target reshape geometry fields Yu Kuai
2026-06-05  9:15 ` [PATCH] md/md-llbitmap: finish reshape geometry Yu Kuai
2026-06-05  9:15 ` [PATCH] md/md-llbitmap: refuse reshape while llbitmap still needs sync Yu Kuai
2026-06-05  9:15 ` [PATCH] md/md-llbitmap: add reshape range mapping helpers Yu Kuai
2026-06-05  9:15 ` [PATCH] md/md-llbitmap: don't skip reshape ranges from bitmap state Yu Kuai
2026-06-05  9:15 ` [PATCH] md/md-llbitmap: remap checkpointed bits as reshape progresses Yu Kuai
2026-06-05  9:15 ` [PATCH] md/md-llbitmap: clamp state-machine walks to tracked bits Yu Kuai
2026-06-05  9:15 ` [PATCH] md/raid10: reject llbitmap reshape when md chunk shrinks Yu Kuai
2026-06-05  9:15 ` [PATCH] md/raid10: wire llbitmap reshape lifecycle Yu Kuai
2026-06-05  9:15 ` [PATCH] md/raid10: split reshape bios before bitmap accounting Yu Kuai
2026-06-05  9:15 ` [PATCH] md/raid5: add exact old and new llbitmap mapping helpers Yu Kuai
2026-06-05  9:15 ` [PATCH] md/raid5: reject llbitmap reshape when md chunk shrinks Yu Kuai
2026-06-05  9:15 ` [PATCH] md/raid5: wire llbitmap reshape lifecycle Yu Kuai
2026-06-05  9:15 ` [PATCH] md/raid5: split reshape bios before bitmap accounting Yu Kuai
2026-06-05 17:27   ` kernel test robot
2026-06-06  2:15   ` kernel test robot
2026-06-24  6:41 ` [PATCH v2 00/20] md/md-llbitmap: support reshape for RAID10 and RAID5 Yu Kuai
2026-06-24  6:41   ` [PATCH v2 01/20] md: add exact bitmap mapping and reshape hooks Yu Kuai
2026-06-24  6:41   ` [PATCH v2 02/20] md: skip bitmap accounting for empty write ranges Yu Kuai
2026-06-24  7:04     ` sashiko-bot
2026-06-24  6:42   ` [PATCH v2 03/20] md: add helper to split bios at reshape offset Yu Kuai
2026-06-24  7:01     ` sashiko-bot
2026-06-24  6:42   ` [PATCH v2 04/20] md/md-llbitmap: track bitmap sync_size explicitly Yu Kuai
2026-06-24  7:02     ` sashiko-bot
2026-06-24  6:42   ` [PATCH v2 05/20] md/md-llbitmap: allocate page controls independently Yu Kuai
2026-06-24  7:02     ` sashiko-bot
2026-06-24  6:42   ` [PATCH v2 06/20] md/md-llbitmap: grow the page cache in place for reshape Yu Kuai
2026-06-24  7:03     ` sashiko-bot
2026-06-24  6:42   ` [PATCH v2 07/20] md/md-llbitmap: track target reshape geometry fields Yu Kuai
2026-06-24  7:07     ` sashiko-bot
2026-06-24  6:42   ` [PATCH v2 08/20] md/md-llbitmap: finish reshape geometry Yu Kuai
2026-06-24  9:06     ` sashiko-bot
2026-06-24  6:42   ` [PATCH v2 09/20] md/md-llbitmap: refuse reshape while llbitmap still needs sync Yu Kuai
2026-06-24  7:04     ` sashiko-bot
2026-06-24  6:42   ` [PATCH v2 10/20] md/md-llbitmap: add reshape range mapping helpers Yu Kuai
2026-06-24  7:08     ` sashiko-bot
2026-06-24  6:42   ` [PATCH v2 11/20] md/md-llbitmap: don't skip reshape ranges from bitmap state Yu Kuai
2026-06-24  6:58     ` sashiko-bot
2026-06-24  6:42   ` [PATCH v2 12/20] md/md-llbitmap: remap checkpointed bits as reshape progresses Yu Kuai
2026-06-24  7:04     ` sashiko-bot
2026-06-24  6:42   ` [PATCH v2 13/20] md/md-llbitmap: clamp state-machine walks to tracked bits Yu Kuai
2026-06-24  7:06     ` sashiko-bot
2026-06-24  6:42   ` [PATCH v2 14/20] md/raid10: reject llbitmap reshape when md chunk shrinks Yu Kuai
2026-06-24  6:42   ` [PATCH v2 15/20] md/raid10: wire llbitmap reshape lifecycle Yu Kuai
2026-06-24  7:22     ` sashiko-bot
2026-06-24  6:42   ` [PATCH v2 16/20] md/raid10: split reshape bios before bitmap accounting Yu Kuai
2026-06-24  7:20     ` sashiko-bot
2026-06-24  6:42   ` [PATCH v2 17/20] md/raid5: add exact old and new llbitmap mapping helpers Yu Kuai
2026-06-24  7:16     ` sashiko-bot
2026-06-24  6:42   ` [PATCH v2 18/20] md/raid5: reject llbitmap reshape when md chunk shrinks Yu Kuai
2026-06-24  7:24     ` sashiko-bot
2026-06-24  6:42   ` [PATCH v2 19/20] md/raid5: wire llbitmap reshape lifecycle Yu Kuai
2026-06-24  7:20     ` sashiko-bot [this message]
2026-06-24  6:42   ` [PATCH v2 20/20] md/raid5: split reshape bios before bitmap accounting Yu Kuai
2026-06-24  7:29     ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260624072026.52B5F1F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=linux-raid@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    --cc=yukuai@fygo.io \
    --cc=yukuai@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox