From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Chen Cheng" <chencheng@fnnas.com>
Date: Wed, 22 Apr 2026 10:33:16 +0800
Subject: [PATCH 3/4] md/raid10: fix r10bio devs overflow across reshape
Message-Id: <20260422023317.796326-3-chencheng@fnnas.com>
In-Reply-To: <20260422023317.796326-1-chencheng@fnnas.com>
References: <20260422023317.796326-1-chencheng@fnnas.com>
X-Mailing-List: linux-raid@vger.kernel.org
X-Mailer: git-send-email 2.53.0
Precedence: bulk
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit

From: Chen Cheng <chencheng@fnnas.com>

A 4-disk to 5-disk raid10 reshape can complete or free an r10bio that
was allocated before the geometry switch.
The failure was reproduced with a simple write workload while reshaping
a raid10 array from 4 disks to 5 disks, e.g.:

  mdadm -C /dev/md777 -l10 -n4 /dev/sda /dev/sdb /dev/sdc /dev/sdd
  mkfs.ext4 /dev/md777
  mount /dev/md777 /mnt/test
  fsstress -d /mnt/test -n 24000 -p 8 -l 24 &
  mdadm /dev/md777 --add /dev/sde
  mdadm --grow /dev/md777 --raid-devices=5 \
        --backup-file=/tmp/md-reshape-backup

Without this patch, the sequence above can trigger:

  BUG: KASAN: slab-out-of-bounds in free_r10bio+0x1c4/0x260 [raid10]
  Read of size 8 at addr ffff00008c2dfac8 by task ksoftirqd/0/15
   free_r10bio
   raid_end_bio_io
   one_write_done
   raid10_end_write_request

The buggy object was 200 bytes long, which matches an r10bio with space
for only four devs[] entries. However, put_all_bios() and
find_bio_disk() walk r10_bio->devs[] using the current
conf->geo.raid_disks value. Once reshape switches conf->geo.raid_disks
from 4 to 5, an old 4-slot r10bio can be completed or freed as if it
had 5 slots, and the walk overruns devs[4]. The same stale-width
mismatch can also surface during a 5-disk to 4-disk reshape.

The same transition also leaves stale-width objects in the active
r10bio pool, so new requests can reuse a 4-slot object after reshape
starts unless the pool is replaced for the new geometry.

Fix this by recording the actual devs[] slot count in each r10bio and
using that count when scanning or freeing the object. Also replace the
active r10bio pool with one sized for the new geometry before reshape
switches layouts. Old-width r10bio objects are freed directly instead
of being returned to a pool that now expects a different width.

A/B validation:
- Without this patch, the 4-disk to 5-disk reshape test triggered the
  KASAN report.
- With this patch, neither the 4-disk to 5-disk nor the 5-disk to
  4-disk reshape test triggers KASAN.
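The first half of the fix can be modeled in a few lines of userspace C.
This is only an illustrative sketch with hypothetical names (struct
rbio, rbio_alloc, rbio_count_slots), not the kernel structures: the
object records how many devs[] slots it was allocated with, so the
completion path never trusts a geometry value that may have changed
under it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one devs[] entry. */
struct slot { void *bio; };

/* Model of an r10bio: the slot count is captured at allocation time. */
struct rbio {
	int used_nr_devs;	/* width recorded when this object was made */
	struct slot devs[];	/* flexible array, sized at allocation */
};

static struct rbio *rbio_alloc(int nr_devs)
{
	struct rbio *rb = calloc(1, sizeof(*rb) + nr_devs * sizeof(struct slot));

	if (rb)
		rb->used_nr_devs = nr_devs;
	return rb;
}

/*
 * Bounded by the object's own width, so the walk stays inside devs[]
 * even after a reshape has bumped the global raid_disks value.
 */
static int rbio_count_slots(const struct rbio *rb)
{
	int n = 0;

	for (int i = 0; i < rb->used_nr_devs; i++)
		if (rb->devs[i].bio == NULL)	/* devs[i] is always in bounds */
			n++;
	return n;
}
```

A walk keyed on the current geometry (5) over an object allocated with
4 slots is exactly the out-of-bounds read KASAN reported above.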
---
 drivers/md/raid10.c | 43 +++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 41 insertions(+), 2 deletions(-)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index b447903fbdc6..3edde440623a 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -234,6 +234,30 @@ static void * r10buf_pool_alloc(gfp_t gfp_flags, void *data)
 	return NULL;
 }
 
+static int reinit_r10bio_pool(struct r10conf *conf, unsigned int nr_devs)
+{
+	struct r10bio_pool *new_pool, *old_pool = conf->r10bio_pool;
+	int ret;
+
+	if (old_pool->nr_devs == nr_devs)
+		return 0;
+
+	new_pool = kzalloc_obj(struct r10bio_pool);
+	if (!new_pool)
+		return -ENOMEM;
+
+	ret = init_r10bio_pool(new_pool, nr_devs);
+	if (ret) {
+		kfree(new_pool);
+		return ret;
+	}
+
+	conf->r10bio_pool = new_pool;
+	mempool_exit(&old_pool->pool);
+	kfree(old_pool);
+	return 0;
+}
+
 static void r10buf_pool_free(void *__r10_bio, void *data)
 {
 	struct r10conf *conf = data;
@@ -268,7 +292,7 @@ static void put_all_bios(struct r10conf *conf, struct r10bio *r10_bio)
 {
 	int i;
 
-	for (i = 0; i < conf->geo.raid_disks; i++) {
+	for (i = 0; i < r10_bio->used_nr_devs; i++) {
 		struct bio **bio = & r10_bio->devs[i].bio;
 		if (!BIO_SPECIAL(*bio))
 			bio_put(*bio);
@@ -285,6 +309,10 @@ static void free_r10bio(struct r10bio *r10_bio)
 	struct r10conf *conf = r10_bio->mddev->private;
 
 	put_all_bios(conf, r10_bio);
+	if (r10_bio->alloc_nr_devs != conf->r10bio_pool->nr_devs) {
+		rbio_pool_free(r10_bio, conf);
+		return;
+	}
 	mempool_free(r10_bio, &conf->r10bio_pool->pool);
 }
 
@@ -365,7 +393,7 @@ static int find_bio_disk(struct r10conf *conf, struct r10bio *r10_bio,
 	int slot;
 	int repl = 0;
 
-	for (slot = 0; slot < conf->geo.raid_disks; slot++) {
+	for (slot = 0; slot < r10_bio->used_nr_devs; slot++) {
 		if (r10_bio->devs[slot].bio == bio)
 			break;
 		if (r10_bio->devs[slot].repl_bio == bio) {
@@ -4416,6 +4444,11 @@ static int raid10_start_reshape(struct mddev *mddev)
 	if (spares < mddev->delta_disks)
 		return -EINVAL;
 
+	raise_barrier(conf, 0);
+	ret = reinit_r10bio_pool(conf, new.raid_disks);
+	if (ret)
+		goto out_lower_barrier;
+
 	conf->offset_diff = min_offset_diff;
 	spin_lock_irq(&conf->device_lock);
 	if (conf->mirrors_new) {
@@ -4433,6 +4466,7 @@ static int raid10_start_reshape(struct mddev *mddev)
 		sector_t size = raid10_size(mddev, 0, 0);
 		if (size < mddev->array_sectors) {
 			spin_unlock_irq(&conf->device_lock);
+			lower_barrier(conf);
 			pr_warn("md/raid10:%s: array size must be reduce before number of disks\n",
 				mdname(mddev));
 			return -EINVAL;
@@ -4443,6 +4477,7 @@ static int raid10_start_reshape(struct mddev *mddev)
 	conf->reshape_progress = 0;
 	conf->reshape_safe = conf->reshape_progress;
 	spin_unlock_irq(&conf->device_lock);
+	lower_barrier(conf);
 
 	if (mddev->delta_disks && mddev->bitmap) {
 		struct mdp_superblock_1 *sb = NULL;
@@ -4527,6 +4562,10 @@ static int raid10_start_reshape(struct mddev *mddev)
 	md_new_event();
 	return 0;
 
+out_lower_barrier:
+	lower_barrier(conf);
+	return ret;
+
 abort:
 	mddev->recovery = 0;
 	spin_lock_irq(&conf->device_lock);
-- 
2.53.0
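
P.S. for reviewers: the free-path rule in this patch (objects whose
allocated width no longer matches the pool's width are freed directly
rather than returned to the resized pool) can be modeled in userspace C.
All names here (struct pool, struct obj, obj_release) are illustrative,
not the kernel or mempool API.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the r10bio pool: it hands out objects of one width. */
struct pool {
	int nr_devs;	/* width this pool currently allocates */
	void *cached;	/* trivial one-element cache modeling a mempool */
};

/* Stand-in for an r10bio: remembers the width it was allocated with. */
struct obj {
	int alloc_nr_devs;
};

static int returned_to_pool, freed_directly;

static void obj_release(struct obj *o, struct pool *p)
{
	if (o->alloc_nr_devs != p->nr_devs) {
		/* Stale width: never re-enters the pool, free it outright. */
		free(o);
		freed_directly++;
		return;
	}
	/* Matching width: safe to hand back to the pool for reuse. */
	p->cached = o;
	returned_to_pool++;
}
```

After the pool is resized for 5 devices, a pre-reshape 4-slot object
takes the direct-free path while a new 5-slot object is recycled,
mirroring the free_r10bio() change above.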