From: Gang Li <gang.li@linux.dev>
To: David Hildenbrand <david@redhat.com>,
	David Rientjes <rientjes@google.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Muchun Song <muchun.song@linux.dev>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	ligang.bdlg@bytedance.com, Gang Li <gang.li@linux.dev>
Subject: [RFC PATCH v2 3/5] padata: dispatch works on different nodes
Date: Fri,  8 Dec 2023 10:52:38 +0800
Message-ID: <20231208025240.4744-4-gang.li@linux.dev>
In-Reply-To: <20231208025240.4744-1-gang.li@linux.dev>

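When a group of tasks that access memory on different nodes are all
scheduled on the same node, they may be limited by that node's memory
bandwidth and suffer higher remote access latency.

Introduce a numa_aware flag in struct padata_mt_job. When it is set,
padata_do_multithreaded() queues its works on different nodes with
queue_work_node() instead of queue_work(), so that multi-node systems
are used more fully. deferred_init_memmap() sets the flag to false, so
its behavior is unchanged.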

Signed-off-by: Gang Li <gang.li@linux.dev>
---
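The dispatch rule in the kernel/padata.c hunk can be tried out in
userspace. The sketch below is only a model, not kernel code: the name
model_dispatch() is hypothetical, nr_nodes stands in for
num_node_state(N_MEMORY), and nid is assumed to start at 0.

```c
#include <assert.h>

/*
 * Userspace model of the dispatch loop; not kernel code.
 * Fills out[i] with the node chosen for the i-th work item, using the
 * same pre-increment-and-modulo rule as the patch. nid starts at 0
 * here (an assumption), so the first work goes to node 1 % nr_nodes.
 */
static void model_dispatch(int nworks, int nr_nodes, int *out)
{
	int nid = 0;
	int i;

	assert(nr_nodes > 0);
	for (i = 0; i < nworks; i++)
		out[i] = ++nid % nr_nodes;
}
```

With two nodes, five works land on nodes 1, 0, 1, 0, 1. With a single
node, every work lands on node 0, so the rule degenerates to one node
on non-NUMA systems.
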
 include/linux/padata.h | 2 ++
 kernel/padata.c        | 8 ++++++--
 mm/mm_init.c           | 1 +
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/include/linux/padata.h b/include/linux/padata.h
index 495b16b6b4d72..f6c58c30ed96a 100644
--- a/include/linux/padata.h
+++ b/include/linux/padata.h
@@ -137,6 +137,7 @@ struct padata_shell {
  *             appropriate for one worker thread to do at once.
  * @max_threads: Max threads to use for the job, actual number may be less
  *               depending on task size and minimum chunk size.
+ * @numa_aware: Dispatch the job's works across different NUMA nodes.
  */
 struct padata_mt_job {
 	void (*thread_fn)(unsigned long start, unsigned long end, void *arg);
@@ -146,6 +147,7 @@ struct padata_mt_job {
 	unsigned long		align;
 	unsigned long		min_chunk;
 	int			max_threads;
+	bool			numa_aware;
 };
 
 /**
diff --git a/kernel/padata.c b/kernel/padata.c
index 179fb1518070c..80f82c563e46a 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -485,7 +485,7 @@ void __init padata_do_multithreaded(struct padata_mt_job *job)
 	struct padata_work my_work, *pw;
 	struct padata_mt_job_state ps;
 	LIST_HEAD(works);
-	int nworks;
+	int nworks, nid = 0;
 
 	if (job->size == 0)
 		return;
@@ -517,7 +517,11 @@ void __init padata_do_multithreaded(struct padata_mt_job *job)
 	ps.chunk_size = roundup(ps.chunk_size, job->align);
 
 	list_for_each_entry(pw, &works, pw_list)
-		queue_work(system_unbound_wq, &pw->pw_work);
+		if (job->numa_aware)
+			queue_work_node((++nid % num_node_state(N_MEMORY)),
+					system_unbound_wq, &pw->pw_work);
+		else
+			queue_work(system_unbound_wq, &pw->pw_work);
 
 	/* Use the current thread, which saves starting a workqueue worker. */
 	padata_work_init(&my_work, padata_mt_helper, &ps, PADATA_WORK_ONSTACK);
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 077bfe393b5e2..1226f0c81fcb3 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -2234,6 +2234,7 @@ static int __init deferred_init_memmap(void *data)
 			.align       = PAGES_PER_SECTION,
 			.min_chunk   = PAGES_PER_SECTION,
 			.max_threads = max_threads,
+			.numa_aware  = false,
 		};
 
 		padata_do_multithreaded(&job);
-- 
2.30.2
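
One point worth checking in review: taking nid modulo
num_node_state(N_MEMORY) yields IDs 0..n-1, which matches the nodes
that have memory only when those nodes are numbered contiguously from
0. A possible alternative is to walk the set of eligible nodes
directly; the kernel's nodemask helpers such as next_node_in() exist
for that kind of walk. The sketch below models the idea in userspace
with a plain 64-bit mask. model_next_node() and MODEL_MAX_NODES are
hypothetical names, and this is not what the patch does.

```c
#include <assert.h>

#define MODEL_MAX_NODES 64

/*
 * Return the next node after 'prev' whose bit is set in 'mask',
 * wrapping around at MODEL_MAX_NODES. Pass prev = -1 to start the
 * search at node 0. Hypothetical userspace helper; 'mask' must be
 * nonzero.
 */
static int model_next_node(int prev, unsigned long long mask)
{
	int step;

	assert(mask != 0);
	for (step = 1; step <= MODEL_MAX_NODES; step++) {
		int nid = (prev + step) % MODEL_MAX_NODES;

		if (mask & (1ULL << nid))
			return nid;
	}
	return -1; /* not reached for a nonzero mask */
}
```

For a mask with only nodes 0 and 2 set, successive calls starting from
-1 return 0, 2, 0, and so on, whereas the modulo rule with two nodes
would also pick node 1, which is not in the set.
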


