From: sashiko-bot@kernel.org
To: "Abd-Alrhman Masalkhi" <abd.masalkhi@gmail.com>
Cc: linux-raid@vger.kernel.org, yukuai@fygo.io
Subject: Re: [PATCH v2 4/4] md/raid1: simplify raid1_write_request() error handling
Date: Sat, 13 Jun 2026 18:47:56 +0000 [thread overview]
Message-ID: <20260613184757.57BC51F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260613182810.1317258-5-abd.masalkhi@gmail.com>
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
New issues:
- [Critical] Moving `atomic_inc(&rdev->nr_pending)` down opens a race window with `hot_remove_disk`, leading to a NULL pointer dereference during error handling or a Use-After-Free in clustered configurations.
Pre-existing issues:
- [High] A `REQ_ATOMIC` write can be silently split if a bad block overlaps the start of the write, violating atomicity guarantees.
--
commit c334f57d91677f32131f21e50bb32b83136e5a60
Author: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
md/raid1: simplify raid1_write_request() error handling
This commit delays the rdev->nr_pending increment until after badblocks
are checked to avoid incrementing and immediately decrementing on skipped
devices. It also consolidates error handling paths and modifies REQ_ATOMIC
failures to complete with BLK_STS_NOTSUPP instead of I/O errors.
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index 632d72607e11e..86d4f224ffb11 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
[ ... ]
> @@ -1576,23 +1573,21 @@ static bool raid1_write_request(struct mddev *mddev, struct bio *bio,
> if (!rdev || test_bit(Faulty, &rdev->flags))
> continue;
>
> - atomic_inc(&rdev->nr_pending);
> if (test_bit(WriteErrorSeen, &rdev->flags)) {
> sector_t first_bad;
> sector_t bad_sectors;
> int is_bad;
>
> - is_bad = is_badblock(rdev, r1_bio->sector, max_sectors,
> + is_bad = is_badblock(rdev, sector, max_sectors,
> &first_bad, &bad_sectors);
> - if (is_bad && first_bad <= r1_bio->sector) {
> + if (is_bad && first_bad <= sector) {
> /* Cannot write here at all */
> - bad_sectors -= (r1_bio->sector - first_bad);
> + bad_sectors -= (sector - first_bad);
> if (bad_sectors < max_sectors)
> /* mustn't write more than bad_sectors
> * to other devices yet
> */
> max_sectors = bad_sectors;
> - rdev_dec_pending(rdev, mddev);
> continue;
[Severity: High]
This is a pre-existing issue, but does this logic skip the rejection of
REQ_ATOMIC bios if a bad block overlaps the start of the write?
When is_badblock() finds a bad block at or before the write sector, the
code shrinks max_sectors to force a split at the boundary and then calls
continue. Because of the continue, the REQ_ATOMIC bio skips the rejection
logic below.
Could this silently split an atomic write into smaller pieces and submit
them to the healthy mirrors via bio_submit_split_bioset(), violating
block-level atomicity guarantees?
> }
> if (is_bad) {
[ ... ]
> if (bio->bi_opf & REQ_ATOMIC) {
> - rdev_dec_pending(rdev, mddev);
> - goto err_handle;
> + bio->bi_status = BLK_STS_NOTSUPP;
> + bio_endio(bio);
> + goto err_dec_pending;
> }
>
> - good_sectors = first_bad - r1_bio->sector;
> + good_sectors = first_bad - sector;
> if (good_sectors < max_sectors)
> max_sectors = good_sectors;
> }
> }
> +
> + atomic_inc(&rdev->nr_pending);
[Severity: Critical]
Does moving atomic_inc(&rdev->nr_pending) down open a race window with
hot_remove_disk()?
Because nr_pending acts as an existence lock that prevents the device from
being removed from the array configuration by raid1_remove_conf(),
incrementing it after the is_badblock() and test_bit(WriteErrorSeen, ...)
checks means md_check_recovery() could evaluate rdev_removeable() while
nr_pending is 0 and call hot_remove_disk().
If raid1_remove_conf() unlinks the device by setting
conf->mirrors[i].rdev = NULL, when this thread resumes and the I/O
eventually fails, would the error handling path (handle_write_finished()
calling narrow_write_error()) read conf->mirrors[i].rdev and dereference a
NULL pointer?
Additionally, in clustered configurations, if the unlinked rdev is
immediately freed by md_kick_rdev_from_array(), could this lead to a
use-after-free directly in raid1_write_request() when evaluating
test_bit(WriteErrorSeen, &rdev->flags) or is_badblock()?
> r1_bio->bios[i] = bio;
> }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260613182810.1317258-1-abd.masalkhi@gmail.com?part=4
prev parent reply other threads:[~2026-06-13 18:47 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-13 18:28 [PATCH v2 0/4] md/raid1,raid10: fix write-path reference leaks and clean up error handling Abd-Alrhman Masalkhi
2026-06-13 18:28 ` [PATCH v2 1/4] md/raid1: fix writes_pending and barrier reference leaks on write failures Abd-Alrhman Masalkhi
2026-06-13 18:28 ` [PATCH v2 2/4] md/raid10: fix writes_pending leak on write request failures Abd-Alrhman Masalkhi
2026-06-13 18:40 ` sashiko-bot
2026-06-13 18:28 ` [PATCH v2 3/4] md/raid10: fix writes_pending and barrier reference leaks on discard failures Abd-Alrhman Masalkhi
2026-06-13 18:28 ` [PATCH v2 4/4] md/raid1: simplify raid1_write_request() error handling Abd-Alrhman Masalkhi
2026-06-13 18:47 ` sashiko-bot [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260613184757.57BC51F000E9@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=abd.masalkhi@gmail.com \
--cc=linux-raid@vger.kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
--cc=yukuai@fygo.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.