All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hiroshi Nishida <nishidafmly@gmail.com>
To: Song Liu <song@kernel.org>, Yu Kuai <yukuai@fygo.io>
Cc: Li Nan <magiclinan@didiglobal.com>, Xiao Ni <xiao@kernel.org>,
	linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org,
	Hiroshi Nishida <nishidafmly@gmail.com>
Subject: [PATCH 6/8] md/raid5: allocate worker groups per NUMA node
Date: Wed, 24 Jun 2026 08:54:50 -0700	[thread overview]
Message-ID: <20260624155452.211646-7-nishidafmly@gmail.com> (raw)
In-Reply-To: <20260624155452.211646-1-nishidafmly@gmail.com>

alloc_thread_groups() previously allocated all r5worker arrays in a
single kcalloc() block, assigning workers for NUMA node N from node 0
memory.  On multi-socket systems this causes remote memory traffic on
every worker->work and worker->temp_inactive_list access.

Replace the single allocation with kzalloc_node(size, GFP_NOIO, i) per
group so each node's workers live in local memory.  Because the workers
are now separate per-node allocations, both free sites --
free_thread_groups() and the reallocation path in
raid5_store_group_thread_cnt() -- are updated to free each group's
allocation individually instead of only group 0's.

Also fix a latent bug: the original kcalloc() had its nmemb and size
arguments swapped (harmless due to commutativity but semantically wrong).

Assisted-by: Claude:claude-opus-4-8 [Claude Code]
Signed-off-by: Hiroshi Nishida <nishidafmly@gmail.com>
---
 drivers/md/raid5.c | 39 ++++++++++++++++++++++++++-------------
 1 file changed, 26 insertions(+), 13 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 8e9edaaca667..c8787ab7b309 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7297,8 +7297,12 @@ raid5_store_group_thread_cnt(struct mddev *mddev, const char *page, size_t len)
 			conf->worker_groups = new_groups;
 			spin_unlock_irq(&conf->device_lock);
 
-			if (old_groups)
-				kfree(old_groups[0].workers);
+			if (old_groups) {
+				int node;
+
+				for (node = 0; node < num_possible_nodes(); node++)
+					kfree(old_groups[node].workers);
+			}
 			kfree(old_groups);
 		}
 	}
@@ -7336,7 +7340,6 @@ static int alloc_thread_groups(struct r5conf *conf, int cnt, int *group_cnt,
 {
 	int i, j, k;
 	ssize_t size;
-	struct r5worker *workers;
 
 	if (cnt == 0) {
 		*group_cnt = 0;
@@ -7344,24 +7347,24 @@ static int alloc_thread_groups(struct r5conf *conf, int cnt, int *group_cnt,
 		return 0;
 	}
 	*group_cnt = num_possible_nodes();
-	size = sizeof(struct r5worker) * cnt;
-	workers = kcalloc(size, *group_cnt, GFP_NOIO);
 	*worker_groups = kzalloc_objs(struct r5worker_group, *group_cnt,
 				      GFP_NOIO);
-	if (!*worker_groups || !workers) {
-		kfree(workers);
-		kfree(*worker_groups);
+	if (!*worker_groups)
 		return -ENOMEM;
-	}
 
+	size = sizeof(struct r5worker) * cnt;
 	for (i = 0; i < *group_cnt; i++) {
-		struct r5worker_group *group;
+		struct r5worker_group *group = &(*worker_groups)[i];
+		struct r5worker *workers;
+
+		workers = kzalloc_node(size, GFP_NOIO, i);
+		if (!workers)
+			goto out_free;
 
-		group = &(*worker_groups)[i];
 		INIT_LIST_HEAD(&group->handle_list);
 		INIT_LIST_HEAD(&group->loprio_list);
 		group->conf = conf;
-		group->workers = workers + i * cnt;
+		group->workers = workers;
 
 		for (j = 0; j < cnt; j++) {
 			struct r5worker *worker = group->workers + j;
@@ -7374,12 +7377,22 @@ static int alloc_thread_groups(struct r5conf *conf, int cnt, int *group_cnt,
 	}
 
 	return 0;
+
+out_free:
+	while (--i >= 0)
+		kfree((*worker_groups)[i].workers);
+	kfree(*worker_groups);
+	*worker_groups = NULL;
+	return -ENOMEM;
 }
 
 static void free_thread_groups(struct r5conf *conf)
 {
+	int i;
+
 	if (conf->worker_groups)
-		kfree(conf->worker_groups[0].workers);
+		for (i = 0; i < conf->group_cnt; i++)
+			kfree(conf->worker_groups[i].workers);
 	kfree(conf->worker_groups);
 	conf->worker_groups = NULL;
 }
-- 
2.43.0


  parent reply	other threads:[~2026-06-24 15:55 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-24 15:54 [PATCH 0/8] md/raid5: scalability and rebuild-path improvements Hiroshi Nishida
2026-06-24 15:54 ` [PATCH 1/8] md: change chunk_sectors and stripe cache counts to unsigned int Hiroshi Nishida
2026-06-24 16:16   ` sashiko-bot
2026-06-24 17:25     ` Hiroshi Nishida
2026-06-24 15:54 ` [PATCH 2/8] md/raid5: raise stripe cache limit from 32768 to 262144 Hiroshi Nishida
2026-06-24 15:54 ` [PATCH 3/8] md: widen badblock sectors param from int to sector_t Hiroshi Nishida
2026-06-24 15:54 ` [PATCH 4/8] md/raid5: raise NR_STRIPE_HASH_LOCKS from 8 to 32 Hiroshi Nishida
2026-06-24 15:54 ` [PATCH 5/8] md/raid5: submit a window of stripes during resync/recovery Hiroshi Nishida
2026-06-24 16:12   ` sashiko-bot
2026-06-24 17:13     ` Hiroshi Nishida
2026-06-24 15:54 ` Hiroshi Nishida [this message]
2026-06-24 16:07   ` [PATCH 6/8] md/raid5: allocate worker groups per NUMA node sashiko-bot
2026-06-24 16:53     ` Hiroshi Nishida
2026-06-24 15:54 ` [PATCH 7/8] md/raid5: raise MAX_STRIPE_BATCH from 8 to 32 Hiroshi Nishida
2026-06-24 16:09   ` sashiko-bot
2026-06-24 17:01     ` Hiroshi Nishida
2026-06-24 15:54 ` [PATCH 8/8] md/raid5: reserve stripe cache for user I/O during rebuild Hiroshi Nishida
2026-06-24 16:12   ` sashiko-bot
2026-06-24 17:25     ` Hiroshi Nishida

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260624155452.211646-7-nishidafmly@gmail.com \
    --to=nishidafmly@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-raid@vger.kernel.org \
    --cc=magiclinan@didiglobal.com \
    --cc=song@kernel.org \
    --cc=xiao@kernel.org \
    --cc=yukuai@fygo.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.