Linux RAID subsystem development
From: sashiko-bot@kernel.org
To: "Yu Kuai" <yukuai@kernel.org>
Cc: linux-raid@vger.kernel.org, yukuai@fygo.io
Subject: Re: [PATCH v2 18/20] md/raid5: reject llbitmap reshape when md chunk shrinks
Date: Wed, 24 Jun 2026 07:24:54 +0000
Message-ID: <20260624072454.CDE1F1F000E9@smtp.kernel.org>
In-Reply-To: <9b06d3a655f709b35ab10995a872b30d2026f7c3.1782282042.git.yukuai@kernel.org>

Thank you for your contribution! Sashiko AI review found 3 potential issues to consider:

New issues:
- [High] The validation in `check_reshape` does not reject llbitmap reshapes that decrease the number of data disks (`mddev->delta_disks < 0`), leaving the same bitmap coverage truncation that the patch is meant to prevent.

Pre-existing issues:
- [High] Writing 0, or a value that is very small or overflows, to the `chunk_size` sysfs attribute bypasses chunk size validation, leading to a divide-by-zero kernel panic during reshape.
- [High] Memory leak of `stripe_head` pages in `resize_stripes()` if array reshape initialization fails.
--

commit cd96767f06f730181f5e43d1d52815ff0535ad82
Author: Yu Kuai <yukuai@fygo.io>

md/raid5: reject llbitmap reshape when md chunk shrinks
    
llbitmap reshape keeps one live bitmap and cannot safely make an
existing bitmap bit cover a smaller data range.

The llbitmap chunksize itself will not shrink when mddev->chunk_sectors
stays the same or grows. However, shrinking mddev->chunk_sectors shrinks
sectors_per_chunk used by raid5_bitmap_sector_map(). That can shrink the
effective data range covered by each bit across the old and new RAID5
geometry. Reject that reshape while llbitmap is active.

> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 1613a42cc25dc..9de648f75dd04 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -8506,6 +8506,9 @@ static int check_reshape(struct mddev *mddev)
>  	if (!check_stripe_cache(mddev))
>  		return -ENOSPC;
>  
> +	if (mddev->bitmap_id == ID_LLBITMAP &&
> +	    mddev->new_chunk_sectors < mddev->chunk_sectors)
> +		return -EOPNOTSUPP;

[Severity: High]
Does this validation fully address the logical coverage truncation?

The commit message correctly notes that shrinking mddev->chunk_sectors shrinks
sectors_per_chunk, which is unsafe for llbitmap because it shrinks the
effective data range covered by each bit.

However, sectors_per_chunk is also reduced if mddev->delta_disks < 0:

sectors_per_chunk = chunk_sectors * (raid_disks - max_degraded);

Since the validation here only checks if the new chunk sectors are smaller,
a reshape that reduces the number of data disks could bypass this check,
resulting in the exact same logical coverage truncation vulnerability.

Could this be updated to also check mddev->delta_disks < 0, or compute the
product to verify sectors_per_chunk does not shrink?
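
As a rough illustration of the second option, here is a standalone userspace
sketch, not kernel code. The struct geom and the helpers
data_sectors_per_stripe() and llbitmap_reshape_shrinks() are hypothetical
names made up for this example; only the formula
chunk_sectors * (raid_disks - max_degraded) comes from the analysis above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of the geometry fields discussed above. */
struct geom {
	uint32_t chunk_sectors;     /* current chunk size, in sectors */
	uint32_t new_chunk_sectors; /* requested chunk size, in sectors */
	int raid_disks;             /* current number of member disks */
	int delta_disks;            /* requested change in member disks */
	int max_degraded;           /* parity disks: 1 for RAID5, 2 for RAID6 */
};

/* sectors_per_chunk as written in the review: chunk * data disks. */
static uint64_t data_sectors_per_stripe(uint32_t chunk_sectors,
					int raid_disks, int max_degraded)
{
	assert(raid_disks > max_degraded);
	return (uint64_t)chunk_sectors * (uint64_t)(raid_disks - max_degraded);
}

/* True if the requested geometry makes the product smaller. */
static bool llbitmap_reshape_shrinks(const struct geom *g)
{
	uint64_t old_spc = data_sectors_per_stripe(g->chunk_sectors,
						   g->raid_disks,
						   g->max_degraded);
	uint64_t new_spc = data_sectors_per_stripe(g->new_chunk_sectors,
						   g->raid_disks + g->delta_disks,
						   g->max_degraded);

	return new_spc < old_spc;
}
```

Note that the product check and a plain delta_disks < 0 check are not
equivalent: removing a disk while doubling the chunk size leaves the product
larger, so the sketch would accept it. Whether that combination is actually
safe for llbitmap is a question for the patch author.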

>  	if (mddev->new_chunk_sectors > mddev->chunk_sectors ||
>  	    mddev->delta_disks > 0)

[Severity: High]
This is a pre-existing issue, but writing 0 or a very small value to the
chunk_size sysfs attribute appears to bypass chunk size validation entirely,
leading to a divide-by-zero panic during reshape.

In drivers/md/md.c:chunk_size_store(), user input is right-shifted by 9
to compute mddev->new_chunk_sectors. If the user writes 0, or any value
below 512, the result is 0.

When raid5_check_reshape() is called:

drivers/md/raid5.c:raid5_check_reshape() {
    ...
	if (new_chunk > 0) {
		if (!is_power_of_2(new_chunk))
			return -EINVAL;
    ...
}

Because the checks are inside if (new_chunk > 0), a value of 0 skips the
sanity checks. check_reshape() then accepts it.

Later, raid5_compute_sector() divides by sectors_per_chunk, which is now 0,
and the kernel panics:

drivers/md/raid5.c:raid5_compute_sector() {
    ...
	chunk_offset = sector_div(r_sector, sectors_per_chunk);
    ...
}

Should chunk_size_store() or raid5_check_reshape() explicitly prevent a
chunk size of 0?
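
A minimal userspace sketch of what an explicit rejection might look like
follows. It is not kernel code: bytes_to_chunk_sectors() and
validate_new_chunk_sectors() are hypothetical helpers, the narrowing to 32
bits is an assumption of this model, and the sketch assumes a zero chunk size
is never a legitimate request, which the maintainers would need to confirm.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Models the shift described above: bytes >> 9 gives sectors.
 * Narrowing to 32 bits is an assumption of this model.
 */
static uint32_t bytes_to_chunk_sectors(uint64_t bytes)
{
	return (uint32_t)(bytes >> 9);
}

static bool is_power_of_2_u32(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical validation: reject zero first, so it can never reach a
 * later division, then apply the power-of-two check from the excerpt.
 */
static int validate_new_chunk_sectors(uint32_t new_chunk)
{
	if (new_chunk == 0)
		return -EINVAL;
	if (!is_power_of_2_u32(new_chunk))
		return -EINVAL;
	return 0;
}
```

In this model, inputs of 0 and 511 bytes both shift down to 0 sectors and are
rejected, while 65536 bytes becomes 128 sectors and passes.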

>  		if (resize_chunks(conf,
>  				  conf->previous_raid_disks
>  				  + max(0, mddev->delta_disks),
>  				  max(mddev->new_chunk_sectors,
>  				      mddev->chunk_sectors)
>  			    ) < 0)
>  			return -ENOMEM;
> 
>  	if (conf->previous_raid_disks + mddev->delta_disks <= conf->pool_size)
>  		return 0; /* never bother to shrink */
>  	return resize_stripes(conf, (conf->previous_raid_disks
>  				     + mddev->delta_disks));
>  }

[Severity: High]
This isn't a bug introduced by this patch, but is it possible for
resize_stripes() to leak the stripe_head pages if the array reshape
initialization fails?

If the array reshape fails under memory pressure during step 3 of
resize_stripes() (for example, the ndisks allocation fails), the function
sets err but still proceeds to step 4 instead of aborting:

drivers/md/raid5.c:resize_stripes() {
    ...
		for (i=conf->raid_disks; i < newsize; i++)
			if (nsh->dev[i].page == NULL) {
				struct page *p = alloc_page(GFP_NOIO);
				nsh->dev[i].page = p;
    ...
}

It loops up to newsize and allocates physical pages for the expanded stripe
slots. At the end of the function, because err is set, it skips updating
conf->pool_size:

drivers/md/raid5.c:resize_stripes() {
    ...
	if (!err)
		conf->pool_size = newsize;
    ...
}

Since conf->pool_size is unchanged, any future teardown via shrink_buffers()
will skip the newly allocated pages at indices >= pool_size. When
free_stripe() subsequently destroys the slab object without freeing these
pages, they appear to be permanently leaked.

Is there a mechanism that cleans up these orphaned pages if conf->pool_size
is not updated?
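
The suspected sequence can be modelled in userspace as below. This is only a
sketch of the scenario as described here, not the kernel's code: struct
stripe, live_pages, resize_model() and shrink_buffers_model() are hypothetical
stand-ins, and the cleanup_on_error flag shows one possible way to release the
extra pages when the recorded pool size stays unchanged.

```c
#include <stdlib.h>

#define MAX_DEVS 8

/* Number of pages currently allocated in this model. */
static int live_pages;

/* Hypothetical stand-in for a stripe with one page per device slot. */
struct stripe {
	void *page[MAX_DEVS];
};

static void *model_alloc_page(void)
{
	void *p = malloc(64);

	if (p)
		live_pages++;
	return p;
}

static void model_free_page(void *p)
{
	if (p) {
		free(p);
		live_pages--;
	}
}

/* Teardown as described above: only slots below pool_size are freed. */
static void shrink_buffers_model(struct stripe *sh, int pool_size)
{
	for (int i = 0; i < pool_size; i++) {
		model_free_page(sh->page[i]);
		sh->page[i] = NULL;
	}
}

/*
 * Grow from old_size to new_size slots. 'err' stands in for an earlier
 * failure. Returns the pool size the caller should record.
 */
static int resize_model(struct stripe *sh, int old_size, int new_size,
			int err, int cleanup_on_error)
{
	for (int i = old_size; i < new_size; i++)
		if (!sh->page[i])
			sh->page[i] = model_alloc_page();

	if (!err)
		return new_size;

	if (cleanup_on_error) {
		/* Release the pages that no recorded pool size will cover. */
		for (int i = old_size; i < new_size; i++) {
			model_free_page(sh->page[i]);
			sh->page[i] = NULL;
		}
	}
	return old_size;
}
```

Without the cleanup, a failed grow from 4 to 6 slots followed by teardown at
pool size 4 leaves two pages allocated in this model; with it, none remain.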

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/cover.1782282042.git.yukuai@kernel.org?part=18
