From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 17B683905E7 for ; Wed, 24 Jun 2026 07:16:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1782285389; cv=none; b=cC3UG5u4Zp38NnVq+foqyTRlWYu3J+jbID628t72bhhLuwFGV+jrYW+LT1STVBrEMIVxIalhgOToZivVQI+khq71wy60CwQUr+v81xXuxAzty7+GRWEddBHmy70RP9LvyslKdJXMwhJiqKEIrsLrHIHANeiRRXfZvYH6BMmkg3I= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1782285389; c=relaxed/simple; bh=PcylBQQHxj6JglNFV4Ad77Al9bsZeBa9khlXpwWuZOs=; h=From:Subject:To:Cc:In-Reply-To:References:Content-Type:Date: Message-Id; b=hSCuHDlTTObJcke1K0yS0PpxxYWyvn3dzTXEYABREOoKi44m0WFImnep9ECJfANRvHjjScWUXYSH5zedZd9YrMr0HldoSgVmqKEvtOH5V2365P7p1aV3UXsWK3/Bz8vCGVqTA/Kw3lwgX1bUDkDeGPzOHrSaBZwtMpMawgWwLNg= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=bkdA0ijj; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="bkdA0ijj" Received: by smtp.kernel.org (Postfix) with ESMTPSA id BF1F01F00A3A; Wed, 24 Jun 2026 07:16:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1782285387; bh=gRUavpIBmeBcKPhEDROiX0UR6LfFSrBORfi1W0eHnEk=; h=From:Subject:Reply-To:To:Cc:In-Reply-To:References:Date; b=bkdA0ijj9+muQg6xe4FlBBWY84kzrX2UBRH/ppEajQ2SKLzF/rh2vFLKzMShaPDnc 9j/dHKQ4tcDNppm5Ksev6abJBrkdtv+zmvenuEg+i0cp303wmXP0c0ul5FmhQLdhgr e+UHbvPBv+VkJN6Wj1Qlt8o1EAg2O88w9+9KT3k2xs8dg9/GvTO2S15UfoQJeDa214 7e8O8QlCzL3C/FBnXNrR/jBm5qF8w+ZhM2NYVCLloOhkK36UV2bUNk31AmH31cFWSr Tuurwt51YV3YETNBxqmdMmGJsenDeJRvZOi5XfSXIf0eCyvfmbMMN8qJe5UiFm5Xu6 DN+Oh4jvOyFkA== From: sashiko-bot@kernel.org Subject: Re: [PATCH v2 17/20] md/raid5: add exact old and new llbitmap mapping helpers Reply-To: sashiko-reviews@lists.linux.dev To: "Yu Kuai" Cc: linux-raid@vger.kernel.org, yukuai@fygo.io In-Reply-To: <7f4f61a0c06d01b38b0a175ce5df9afe6450d09e.1782282042.git.yukuai@kernel.org> References: <20260605091527.2463539-1-yukuai@kernel.org> <7f4f61a0c06d01b38b0a175ce5df9afe6450d09e.1782282042.git.yukuai@kernel.org> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Date: Wed, 24 Jun 2026 07:16:27 +0000 Message-Id: <20260624071627.BF1F01F00A3A@smtp.kernel.org> Precedence: bulk X-Mailing-List: linux-raid@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Thank you for your contribution! Sashiko AI review found 4 potential issue(= s) to consider: New issues: - [High] raid5_bitmap_array_sectors() incorrectly returns the new, smaller = array size for the previous layout during a shrink reshape, leading to trun= cated bitmap tracking. Pre-existing issues: - [High] Invalid use of round_down() and round_up() with a non-power-of-2 s= ectors_per_chunk multiplier, causing corrupt bitmap mappings. - [High] Bounding box calculation for mapping ranges between old and new la= youts can overflow `unsigned long` on 32-bit systems, truncating dirty regi= ons. - [High] raid5_bitmap_sector_map() accesses reshape geometry parameters wit= hout holding the required sequence lock. -- commit 7f0559d3994c9371245bbc6a23cf3ce1e8d48872 Author: Yu Kuai md/raid5: add exact old and new llbitmap mapping helpers Teach RAID5 to export exact old and new llbitmap mappings and the corresponding sync and array sizes for reshape-aware bitmap users. > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c > index 0c5c9fb0606ee..1613a42cc25dc 100644 > --- a/drivers/md/raid5.c > +++ b/drivers/md/raid5.c [ ... ] > @@ -5942,25 +5942,43 @@ static enum reshape_loc get_reshape_loc(struct md= dev *mddev, [ ... ] > -static void raid5_bitmap_sector(struct mddev *mddev, sector_t *offset, > - unsigned long *sectors) > +static void raid5_bitmap_sector_map(struct mddev *mddev, sector_t *offse= t, > + unsigned long *sectors, > + bool previous) > { > struct r5conf *conf =3D mddev->private; > sector_t start =3D *offset; > sector_t end =3D start + *sectors; > - sector_t prev_start =3D start; > - sector_t prev_end =3D end; > int sectors_per_chunk; > - enum reshape_loc loc; > int dd_idx; > =20 > - sectors_per_chunk =3D conf->chunk_sectors * > - (conf->raid_disks - conf->max_degraded); > + if (previous) > + sectors_per_chunk =3D conf->prev_chunk_sectors * > + (conf->previous_raid_disks - conf->max_degraded); > + else > + sectors_per_chunk =3D conf->chunk_sectors * > + (conf->raid_disks - conf->max_degraded); [Severity: High] This is a pre-existing issue, but does raid5_bitmap_sector_map() access these reshape geometry parameters safely? During a concurrent reshape operation (e.g., raid5_start_reshape()), these configuration fields can be updated in-place. Without being protected by read_seqcount_begin(&conf->gen_lock), could a concurrent I/O observe torn state or a mismatch between chunk size and raid disks, leading to an incorrect bitmap mapping? > start =3D round_down(start, sectors_per_chunk); > end =3D round_up(end, sectors_per_chunk); [Severity: High] This is a pre-existing issue, but do round_down() and round_up() work corre= ctly here when sectors_per_chunk is not a power of 2? The round_down() and round_up() macros perform a bitwise masking operation which strictly requires the multiplier to be a power of 2. Because the number of data disks (raid_disks - max_degraded) is often not a power of 2 (for example, 3 data disks in a 4-disk RAID5), sectors_per_chunk is frequen= tly not a power of 2. Will this bitwise mask silently corrupt the logical offsets instead of roun= ding to stripe boundaries, leading to incorrect physical regions being tracked in the bitmap? > - start =3D raid5_compute_sector(conf, start, 0, &dd_idx, NULL); > - end =3D raid5_compute_sector(conf, end, 0, &dd_idx, NULL); > + start =3D raid5_compute_sector(conf, start, previous, &dd_idx, NULL); > + end =3D raid5_compute_sector(conf, end, previous, &dd_idx, NULL); > + *offset =3D start; > + *sectors =3D end - start; > +} > + > +static void raid5_bitmap_sector(struct mddev *mddev, sector_t *offset, > + unsigned long *sectors) > +{ > + struct r5conf *conf =3D mddev->private; > + sector_t start =3D *offset; > + sector_t end =3D start + *sectors; > + sector_t prev_start =3D start; > + unsigned long prev_sectors =3D end - start; > + enum reshape_loc loc; > + > + raid5_bitmap_sector_map(mddev, &start, sectors, false); > + end =3D start + *sectors; [ ... ] > @@ -5987,7 +5998,7 @@ static void raid5_bitmap_sector(struct mddev *mddev= , sector_t *offset, > * we set bits for both. > */ > *offset =3D min(start, prev_start); > - *sectors =3D max(end, prev_end) - *offset; > + *sectors =3D max(end, prev_start + prev_sectors) - *offset; [Severity: High] This isn't a bug introduced by this patch, but can this bounding box calculation overflow on 32-bit systems? When calculating the bounding box spanning the old and new physical layout mappings for an I/O ahead of a reshape, the physical sector offsets diverge linearly due to differing data disk counts. For large arrays (e.g., >40TB), the gap between the mapped sectors can exceed 2TB (4,294,967,295 sectors). Because *sectors is a pointer to unsigned long (which is 32-bit on 32-bit architectures), could this subtraction silently overflow, truncating the dirty region and potentially missing necessary bits? > } > =20 > static enum stripe_result make_stripe_request(struct mddev *mddev, [ ... ] > @@ -9049,6 +9060,20 @@ static void raid5_prepare_suspend(struct mddev *md= dev) > wake_up(&conf->wait_for_reshape); > } > =20 > +static sector_t raid5_bitmap_sync_size(struct mddev *mddev, bool previou= s) > +{ > + return mddev->dev_sectors; > +} > + > +static sector_t raid5_bitmap_array_sectors(struct mddev *mddev, bool pre= vious) > +{ > + struct r5conf *conf =3D mddev->private; > + > + if (previous) > + return raid5_size(mddev, 0, 0); [Severity: High] Does passing 0 for raid_disks in raid5_size() incorrectly truncate the array size for the previous layout? During an array shrink reshape (e.g., from 5 disks to 4), passing 0 forces raid5_size() to use the minimum disk count between the old and new layouts. This would return the new, smaller array size instead of the exact previous layout's size. Since reshape-aware bitmap users rely on this limit to clip logical I/O offsets, could I/Os mapping to the old unreshaped region beyond the new limit be erroneously discarded, leading to dropped dirty bits? > + return raid5_size(mddev, mddev->dev_sectors, conf->raid_disks); > +} --=20 Sashiko AI review =C2=B7 https://sashiko.dev/#/patchset/cover.1782282042.gi= t.yukuai@kernel.org?part=3D17