From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 463EA3AC0C2 for ; Sun, 28 Jun 2026 14:36:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1782657383; cv=none; b=XyChbRvSVJofdyzEraxhHqjjvVYJHkKdB3SlbPTCf2kW0vDCP3u1yaDUsQZdv4ohDpEx6cH7++uTsYJQSGS8CySKesSH3+tWt5aPvvJOhAUWg+QMGlgDhVfe1WkLbhNFaz8+4cnXIbB5l4+KjNgltCKqoCIE/gzoL0hLv4iFoEs= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1782657383; c=relaxed/simple; bh=wYkuZYf9XFKY7oyR8K9tJZoWW3mDqVdNxM4z5zWlMHk=; h=From:Subject:To:Cc:In-Reply-To:References:Content-Type:Date: Message-Id; b=fokjJKLzHNjrY5M9SYwvv6dj11hpYtqE4+4ZypsNiEVhK4H+wf0IpQmUxMlAYGWduX+rhHat/WqxLEvxRVYyQDB/997bRnLctdUc+5DKTKxbAH++vSRedBy47zk25Sh5byQa6zbpNRG+ksjqEkkcM7YmCuJjp7NlH1HwK5Rxwog= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=ErAswZJJ; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="ErAswZJJ" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 64A6C1F000E9; Sun, 28 Jun 2026 14:36:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1782657380; bh=9YuMh3Q7IjGFeNOKihOZ9PtidEs3HDqgAgUIqBaWLeA=; h=From:Subject:Reply-To:To:Cc:In-Reply-To:References:Date; b=ErAswZJJs81ehaaID4vr73ewzYLMDkruXzpZ2kH6PjQsQ2yZ24QEofcvRQKrKPw4j pw3a9KLwcNU7ongA7gCdEyLZle0atM6p2u25WHb/ySsg8f7nGOjcNMyIKx/tWCc2Pc NRAe5V9qnGvTfi3oQm4TcR5dMY1jCaOh1K5OPVlvCg/B8aRanzo4NCyLPnEWhQSRja ZAY7+1didFvQ+1tTCub0SCy9nKVvPKmSTDFikIqJmJUtpZwLNIe2EALJVGMxpQCY2O clI7q/WGnCaRlXe86zzvZA4A7dae1dmHjzHqTBIWVnvQcSecwKOuWGvmoT6Z5qsvjr XlLanzB3y0QkQ== From: sashiko-bot@kernel.org Subject: Re: [PATCH v2 3/7] md/raid10: consistently fail atomic writes that require splitting Reply-To: sashiko-reviews@lists.linux.dev To: "Abd-Alrhman Masalkhi" Cc: linux-raid@vger.kernel.org, yukuai@fygo.io In-Reply-To: <20260628142420.1051027-4-abd.masalkhi@gmail.com> References: <20260628142420.1051027-1-abd.masalkhi@gmail.com> <20260628142420.1051027-4-abd.masalkhi@gmail.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Date: Sun, 28 Jun 2026 14:36:19 +0000 Message-Id: <20260628143620.64A6C1F000E9@smtp.kernel.org> Precedence: bulk X-Mailing-List: linux-raid@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Thank you for your contribution! Sashiko AI review found 1 potential issue(= s) to consider: Pre-existing issues: - [High] The `err_dec_pending` (formerly `err_handle`) cleanup path in `rai= d10_write_request` can trigger a NULL pointer dereference when a replacemen= t device is promoted concurrently. -- commit bc9d81fdb2c33f8c55baad27ab1eedf427ae25ed Author: Abd-Alrhman Masalkhi md/raid10: consistently fail atomic writes that require splitting RAID10 handles one badblock path explicitly by failing atomic writes with EIO, but another path can also reduce the writable range and force the bio through bio_submit_split_bioset(), completing it with EINVAL. This commit centralizes the check to fail all atomic writes requiring splitting with EIO. > diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c > index bd322eccdc3ff..3480fc7907f01 100644 > --- a/drivers/md/raid10.c > +++ b/drivers/md/raid10.c [ ... ] > @@ -1493,6 +1484,9 @@ static bool raid10_write_request(struct mddev *mdde= v, struct bio *bio, > r10_bio->sectors =3D max_sectors; > =20 > if (r10_bio->sectors < bio_sectors(bio)) { > + if (atomic) > + goto err_handle; > + [Severity: High] This is a pre-existing issue, but can this jump to the error handler cause a null pointer dereference if a replacement device is concurrently promoted? In the error handler path, the code fetches the replacement device without checking for null or falling back to the primary slot: err_handle: for (k =3D 0; k < i; k++) { int d =3D r10_bio->devs[k].devnum; struct md_rdev *rrdev =3D conf->mirrors[d].replacement; ... if (r10_bio->devs[k].repl_bio) { rdev_dec_pending(rrdev, mddev); r10_bio->devs[k].repl_bio =3D NULL; } } If raid10_remove_disk() runs concurrently, it can promote the replacement to the primary slot and clear conf->mirrors[d].replacement: raid10_remove_disk() { ... WRITE_ONCE(*rdevp, NULL); if (p->replacement) { WRITE_ONCE(p->rdev, p->replacement); clear_bit(Replacement, &p->replacement->flags); WRITE_ONCE(p->replacement, NULL); } ... } If this concurrent promotion happens, will rrdev be null here? It looks like other paths such as raid10_end_write_request() safely handle this exact race by falling back to rdev when replacement is null, but this cleanup path does not. > allow_barrier(conf); > bio =3D bio_submit_split_bioset(bio, r10_bio->sectors, > &conf->bio_split); --=20 Sashiko AI review =C2=B7 https://sashiko.dev/#/patchset/20260628142420.1051= 027-1-abd.masalkhi@gmail.com?part=3D3