From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8715A39656B; Tue, 16 Dec 2025 12:31:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1765888318; cv=none; b=ec1PQytcUBQSvmqhRJGQ4gXPsCDWG40q0gb7bBgy7a/NsCGuOsBIseEXBzpA8PEsfkKuT+beBpggv8Gpevs29JfBDerYnH5jt0Mwgzou6O+0PcYeZOUfGGQQTQFvgFYfAsL6+6lboDiqAgnZ8lhXlUs7vunw9Rxo+WVTLSo2xoA= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1765888318; c=relaxed/simple; bh=lyoglxECy02AR0O2KOrjHKTxrU7klFHmrEqPU5lRy7c=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=jyx4ch3be8pTD/aTMQuVsG75Pn5467bb9T7BWv4KM754YTsRAtsll/YYv3IQUemXXsdo6UhKsn21EYv7aYpTG7vJtNAl0njdfGUTvWVAbO8Dazr79LmPCYAIf+//YvYFmEHkY1xMva8UmaxU58Ut6yQCl07lv9otUgjlScZBNTE= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b=jVkMLS6T; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b="jVkMLS6T" Received: by smtp.kernel.org (Postfix) with ESMTPSA id EDAD2C4CEF1; Tue, 16 Dec 2025 12:31:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1765888318; bh=lyoglxECy02AR0O2KOrjHKTxrU7klFHmrEqPU5lRy7c=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=jVkMLS6TGvQw6Y2TaRzFzcq+90QgcQFF3LkGSXzVfD1bYajRo6e2HFKCyufauRGre m/bwIwVLXwAJFhNxieMUdLDVP1y5bI7M91oWPNcZMpDsM7cgri70ECS5vsN9x5dEfz FEoNHHou7rsC5wUY+8RWzw7spSWCSyO+L0fhvFyA= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, Yu Kuai , Li Nan , Sasha Levin Subject: [PATCH 6.18 499/614] md/raid5: fix IO hang when array is broken with IO inflight Date: Tue, 16 Dec 2025 12:14:26 +0100 Message-ID: <20251216111419.449371626@linuxfoundation.org> X-Mailer: git-send-email 2.52.0 In-Reply-To: <20251216111401.280873349@linuxfoundation.org> References: <20251216111401.280873349@linuxfoundation.org> User-Agent: quilt/0.69 X-stable: review X-Patchwork-Hint: ignore Precedence: bulk X-Mailing-List: stable@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit 6.18-stable review patch. If anyone has any objections, please let me know. ------------------ From: Yu Kuai [ Upstream commit a913d1f6a7f607c110aeef8b58c8988f47a4b24e ] Following test can cause IO hang: mdadm -CvR /dev/md0 -l10 -n4 /dev/sd[abcd] --assume-clean --chunk=64K --bitmap=none sleep 5 echo 1 > /sys/block/sda/device/delete echo 1 > /sys/block/sdb/device/delete echo 1 > /sys/block/sdc/device/delete echo 1 > /sys/block/sdd/device/delete dd if=/dev/md0 of=/dev/null bs=8k count=1 iflag=direct Root cause: 1) all disks removed, however all rdevs in the array is still in sync, IO will be issued normally. 2) IO failure from sda, and set badblocks failed, sda will be faulty and MD_SB_CHANGING_PENDING will be set. 3) error recovery try to recover this IO from other disks, IO will be issued to sdb, sdc, and sdd. 4) IO failure from sdb, and set badblocks failed again, now array is broken and will become read-only. 5) IO failure from sdc and sdd, however, stripe can't be handled anymore because MD_SB_CHANGING_PENDING is set: handle_stripe handle_stripe if (test_bit MD_SB_CHANGING_PENDING) set_bit STRIPE_HANDLE goto finish // skip handling failed stripe release_stripe if (test_bit STRIPE_HANDLE) list_add_tail conf->hand_list 6) later raid5d can't handle failed stripe as well: raid5d md_check_recovery md_update_sb if (!md_is_rdwr()) // can't clear pending bit return if (test_bit MD_SB_CHANGING_PENDING) break; // can't handle failed stripe Since MD_SB_CHANGING_PENDING can never be cleared for read-only array, fix this problem by skip this checking for read-only array. Link: https://lore.kernel.org/linux-raid/20251117085557.770572-3-yukuai@fnnas.com Fixes: d87f064f5874 ("md: never update metadata when array is read-only.") Signed-off-by: Yu Kuai Reviewed-by: Li Nan Signed-off-by: Sasha Levin --- drivers/md/raid5.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 24b32a0c95b40..8b5f8a12d4179 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -4956,7 +4956,8 @@ static void handle_stripe(struct stripe_head *sh) goto finish; if (s.handle_bad_blocks || - test_bit(MD_SB_CHANGE_PENDING, &conf->mddev->sb_flags)) { + (md_is_rdwr(conf->mddev) && + test_bit(MD_SB_CHANGE_PENDING, &conf->mddev->sb_flags))) { set_bit(STRIPE_HANDLE, &sh->state); goto finish; } @@ -6768,7 +6769,8 @@ static void raid5d(struct md_thread *thread) int batch_size, released; unsigned int offset; - if (test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags)) + if (md_is_rdwr(mddev) && + test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags)) break; released = release_stripe_list(conf, conf->temp_inactive_list); -- 2.51.0