From: sashiko-bot@kernel.org
To: "Abd-Alrhman Masalkhi" <abd.masalkhi@gmail.com>
Cc: linux-raid@vger.kernel.org, yukuai@fygo.io
Subject: Re: [PATCH v2 4/4] md/raid1: simplify raid1_write_request() error handling
Date: Sat, 13 Jun 2026 18:47:56 +0000 [thread overview]
Message-ID: <20260613184757.57BC51F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260613182810.1317258-5-abd.masalkhi@gmail.com>
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
New issues:
- [Critical] Moving `atomic_inc(&rdev->nr_pending)` down opens a race window with `hot_remove_disk`, leading to a NULL pointer dereference during error handling or a Use-After-Free in clustered configurations.
Pre-existing issues:
- [High] A `REQ_ATOMIC` write can be silently split if a bad block overlaps the start of the write, violating atomicity guarantees.
--
commit c334f57d91677f32131f21e50bb32b83136e5a60
Author: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
md/raid1: simplify raid1_write_request() error handling
This commit delays the rdev->nr_pending increment until after badblocks
are checked to avoid incrementing and immediately decrementing on skipped
devices. It also consolidates error handling paths and modifies REQ_ATOMIC
failures to complete with BLK_STS_NOTSUPP instead of I/O errors.
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index 632d72607e11e..86d4f224ffb11 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
[ ... ]
> @@ -1576,23 +1573,21 @@ static bool raid1_write_request(struct mddev *mddev, struct bio *bio,
> if (!rdev || test_bit(Faulty, &rdev->flags))
> continue;
>
> - atomic_inc(&rdev->nr_pending);
> if (test_bit(WriteErrorSeen, &rdev->flags)) {
> sector_t first_bad;
> sector_t bad_sectors;
> int is_bad;
>
> - is_bad = is_badblock(rdev, r1_bio->sector, max_sectors,
> + is_bad = is_badblock(rdev, sector, max_sectors,
> &first_bad, &bad_sectors);
> - if (is_bad && first_bad <= r1_bio->sector) {
> + if (is_bad && first_bad <= sector) {
> /* Cannot write here at all */
> - bad_sectors -= (r1_bio->sector - first_bad);
> + bad_sectors -= (sector - first_bad);
> if (bad_sectors < max_sectors)
> /* mustn't write more than bad_sectors
> * to other devices yet
> */
> max_sectors = bad_sectors;
> - rdev_dec_pending(rdev, mddev);
> continue;
[Severity: High]
This is a pre-existing issue, but does this logic skip the rejection of
REQ_ATOMIC bios if a bad block overlaps the start of the write?
When is_badblock() finds a bad block at or before the write sector, the
code shrinks max_sectors to force a split at the boundary and then calls
continue. Because of the continue, the REQ_ATOMIC bio skips the rejection
logic below.
Could this silently split an atomic write into smaller pieces and submit
them to the healthy mirrors via bio_submit_split_bioset(), violating
block-level atomicity guarantees?
> }
> if (is_bad) {
[ ... ]
> if (bio->bi_opf & REQ_ATOMIC) {
> - rdev_dec_pending(rdev, mddev);
> - goto err_handle;
> + bio->bi_status = BLK_STS_NOTSUPP;
> + bio_endio(bio);
> + goto err_dec_pending;
> }
>
> - good_sectors = first_bad - r1_bio->sector;
> + good_sectors = first_bad - sector;
> if (good_sectors < max_sectors)
> max_sectors = good_sectors;
> }
> }
> +
> + atomic_inc(&rdev->nr_pending);
[Severity: Critical]
Does moving atomic_inc(&rdev->nr_pending) down open a race window with
hot_remove_disk()?
Because nr_pending acts as an existence lock that prevents the device from
being removed from the array configuration by raid1_remove_conf(),
incrementing it after the is_badblock() and test_bit(WriteErrorSeen, ...)
checks means md_check_recovery() could evaluate rdev_removeable() while
nr_pending is 0 and call hot_remove_disk().
If raid1_remove_conf() unlinks the device by setting
conf->mirrors[i].rdev = NULL, when this thread resumes and the I/O
eventually fails, would the error handling path (handle_write_finished()
calling narrow_write_error()) read conf->mirrors[i].rdev and dereference a
NULL pointer?
Additionally, in clustered configurations, if the unlinked rdev is
immediately freed by md_kick_rdev_from_array(), could this lead to a
use-after-free directly in raid1_write_request() when evaluating
test_bit(WriteErrorSeen, &rdev->flags) or is_badblock()?
> r1_bio->bios[i] = bio;
> }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260613182810.1317258-1-abd.masalkhi@gmail.com?part=4
prev parent reply other threads:[~2026-06-13 18:47 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-13 18:28 [PATCH v2 0/4] md/raid1,raid10: fix write-path reference leaks and clean up error handling Abd-Alrhman Masalkhi
2026-06-13 18:28 ` [PATCH v2 1/4] md/raid1: fix writes_pending and barrier reference leaks on write failures Abd-Alrhman Masalkhi
2026-06-13 18:28 ` [PATCH v2 2/4] md/raid10: fix writes_pending leak on write request failures Abd-Alrhman Masalkhi
2026-06-13 18:40 ` sashiko-bot
2026-06-13 18:28 ` [PATCH v2 3/4] md/raid10: fix writes_pending and barrier reference leaks on discard failures Abd-Alrhman Masalkhi
2026-06-13 18:28 ` [PATCH v2 4/4] md/raid1: simplify raid1_write_request() error handling Abd-Alrhman Masalkhi
2026-06-13 18:47 ` sashiko-bot [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260613184757.57BC51F000E9@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=abd.masalkhi@gmail.com \
--cc=linux-raid@vger.kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
--cc=yukuai@fygo.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox