From: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
To: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>
Cc: LKML <linux-kernel@vger.kernel.org>, Baoquan He <bhe@redhat.com>,
Lorenzo Stoakes <lstoakes@gmail.com>,
Christoph Hellwig <hch@infradead.org>,
Matthew Wilcox <willy@infradead.org>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Dave Chinner <david@fromorbit.com>,
"Paul E . McKenney" <paulmck@kernel.org>,
Joel Fernandes <joel@joelfernandes.org>,
Uladzislau Rezki <urezki@gmail.com>,
Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>
Subject: [PATCH v3 10/11] mm: vmalloc: Set nr_nodes based on CPUs in a system
Date: Tue, 2 Jan 2024 19:46:32 +0100
Message-ID: <20240102184633.748113-11-urezki@gmail.com>
In-Reply-To: <20240102184633.748113-1-urezki@gmail.com>
The number of nodes used in the alloc/free paths is set based
on num_possible_cpus(). Note that the upper limit is fixed at
128 nodes.

On 32-bit or single-core systems, access to the global vmap
heap is not balanced across nodes. Such small systems do not
suffer from lock contention because of their low number of
CPUs. In that case nr_nodes is equal to 1.
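
The node-count selection described above can be sketched in plain
userspace C. This is only an illustration under stated assumptions:
pick_nr_nodes() and MAX_VMAP_NODES are hypothetical names that do not
exist in the kernel, and the word size is passed as a parameter rather
than tested with the preprocessor as the patch does.

```c
/* Upper bound on the number of nodes, as stated in the commit message. */
#define MAX_VMAP_NODES 128u

/*
 * Hypothetical userspace helper (not a kernel function): returns the
 * number of nodes that would be selected for a given count of possible
 * CPUs and a given word size in bits.
 */
static unsigned int pick_nr_nodes(unsigned int possible_cpus, int bits_per_long)
{
	unsigned int n = possible_cpus;

	/* 32-bit systems keep the single default node. */
	if (bits_per_long != 64)
		return 1;

	/* Same effect as clamp_t(unsigned int, possible_cpus, 1, 128). */
	if (n < 1)
		n = 1;
	if (n > MAX_VMAP_NODES)
		n = MAX_VMAP_NODES;

	return n;
}
```

With this sketch, a 64-bit system with 64 possible CPUs gets 64 nodes,
one with 256 possible CPUs is capped at 128, and a single-CPU or 32-bit
system stays at 1. The patch additionally falls back to a single node
when the node array cannot be allocated, which the sketch does not model.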
Test on AMD Ryzen Threadripper 3970X 32-Core Processor:
sudo ./test_vmalloc.sh run_test_mask=7 nr_threads=64
<default perf>
94.41% 0.89% [kernel] [k] _raw_spin_lock
93.35% 93.07% [kernel] [k] native_queued_spin_lock_slowpath
76.13% 0.28% [kernel] [k] __vmalloc_node_range
72.96% 0.81% [kernel] [k] alloc_vmap_area
56.94% 0.00% [kernel] [k] __get_vm_area_node
41.95% 0.00% [kernel] [k] vmalloc
37.15% 0.01% [test_vmalloc] [k] full_fit_alloc_test
35.17% 0.00% [kernel] [k] ret_from_fork_asm
35.17% 0.00% [kernel] [k] ret_from_fork
35.17% 0.00% [kernel] [k] kthread
35.08% 0.00% [test_vmalloc] [k] test_func
34.45% 0.00% [test_vmalloc] [k] fix_size_alloc_test
28.09% 0.01% [test_vmalloc] [k] long_busy_list_alloc_test
23.53% 0.25% [kernel] [k] vfree.part.0
21.72% 0.00% [kernel] [k] remove_vm_area
20.08% 0.21% [kernel] [k] find_unlink_vmap_area
2.34% 0.61% [kernel] [k] free_vmap_area_noflush
<default perf>
vs
<patch-series perf>
82.32% 0.22% [test_vmalloc] [k] long_busy_list_alloc_test
63.36% 0.02% [kernel] [k] vmalloc
63.34% 2.64% [kernel] [k] __vmalloc_node_range
30.42% 4.46% [kernel] [k] vfree.part.0
28.98% 2.51% [kernel] [k] __alloc_pages_bulk
27.28% 0.19% [kernel] [k] __get_vm_area_node
26.13% 1.50% [kernel] [k] alloc_vmap_area
21.72% 21.67% [kernel] [k] clear_page_rep
19.51% 2.43% [kernel] [k] _raw_spin_lock
16.61% 16.51% [kernel] [k] native_queued_spin_lock_slowpath
13.40% 2.07% [kernel] [k] free_unref_page
10.62% 0.01% [kernel] [k] remove_vm_area
9.02% 8.73% [kernel] [k] insert_vmap_area
8.94% 0.00% [kernel] [k] ret_from_fork_asm
8.94% 0.00% [kernel] [k] ret_from_fork
8.94% 0.00% [kernel] [k] kthread
8.29% 0.00% [test_vmalloc] [k] test_func
7.81% 0.05% [test_vmalloc] [k] full_fit_alloc_test
5.30% 4.73% [kernel] [k] purge_vmap_node
4.47% 2.65% [kernel] [k] free_vmap_area_noflush
<patch-series perf>
The profiles confirm that the self time of
native_queued_spin_lock_slowpath drops from 93.07% to 16.51%.
The throughput is ~12x higher:
urezki@pc638:~$ time sudo ./test_vmalloc.sh run_test_mask=7 nr_threads=64
Run the test with following parameters: run_test_mask=7 nr_threads=64
Done.
Check the kernel ring buffer to see the summary.
real 10m51.271s
user 0m0.013s
sys 0m0.187s
urezki@pc638:~$
urezki@pc638:~$ time sudo ./test_vmalloc.sh run_test_mask=7 nr_threads=64
Run the test with following parameters: run_test_mask=7 nr_threads=64
Done.
Check the kernel ring buffer to see the summary.
real 0m51.301s
user 0m0.015s
sys 0m0.040s
urezki@pc638:~$
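
The ~12x figure can be checked from the two "real" times above. The
sketch below assumes that the first run (10m51.271s) is the default
kernel and the second run (0m51.301s) is the patched one; the output
itself does not label them. The helper names to_seconds() and speedup()
are hypothetical.

```c
/* Convert a wall-clock time given as minutes plus seconds into seconds. */
static double to_seconds(unsigned int minutes, double seconds)
{
	return minutes * 60.0 + seconds;
}

/* Ratio of the baseline run time to the patched run time. */
static double speedup(double baseline_s, double patched_s)
{
	return baseline_s / patched_s;
}
```

Calling speedup(to_seconds(10, 51.271), to_seconds(0, 51.301)) divides
651.271 s by 51.301 s, which gives a ratio of roughly 12.7.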
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
mm/vmalloc.c | 29 +++++++++++++++++++++++------
1 file changed, 23 insertions(+), 6 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 0c671cb96151..ef534c76daef 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -4879,10 +4879,27 @@ static void vmap_init_free_space(void)
 static void vmap_init_nodes(void)
 {
 	struct vmap_node *vn;
-	int i, j;
+	int i, n;
+
+#if BITS_PER_LONG == 64
+	/* A high threshold of max nodes is fixed and bound to 128. */
+	n = clamp_t(unsigned int, num_possible_cpus(), 1, 128);
+
+	if (n > 1) {
+		vn = kmalloc_array(n, sizeof(*vn), GFP_NOWAIT | __GFP_NOWARN);
+		if (vn) {
+			/* Node partition is 16 pages. */
+			vmap_zone_size = (1 << 4) * PAGE_SIZE;
+			nr_vmap_nodes = n;
+			vmap_nodes = vn;
+		} else {
+			pr_err("Failed to allocate an array. Disable a node layer\n");
+		}
+	}
+#endif
 
-	for (i = 0; i < nr_vmap_nodes; i++) {
-		vn = &vmap_nodes[i];
+	for (n = 0; n < nr_vmap_nodes; n++) {
+		vn = &vmap_nodes[n];
 		vn->busy.root = RB_ROOT;
 		INIT_LIST_HEAD(&vn->busy.head);
 		spin_lock_init(&vn->busy.lock);
@@ -4891,9 +4908,9 @@ static void vmap_init_nodes(void)
 		INIT_LIST_HEAD(&vn->lazy.head);
 		spin_lock_init(&vn->lazy.lock);
 
-		for (j = 0; j < MAX_VA_SIZE_PAGES; j++) {
-			INIT_LIST_HEAD(&vn->pool[j].head);
-			WRITE_ONCE(vn->pool[j].len, 0);
+		for (i = 0; i < MAX_VA_SIZE_PAGES; i++) {
+			INIT_LIST_HEAD(&vn->pool[i].head);
+			WRITE_ONCE(vn->pool[i].len, 0);
 		}
 
 		spin_lock_init(&vn->pool_lock);
--
2.39.2
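
The diff sets the node partition (vmap_zone_size) to 16 pages. As a
rough illustration of what that partition size means, the sketch below
assumes that an address is mapped to a node by dividing it by the zone
size and taking the result modulo the number of nodes; that mapping is
not part of this patch and is stated here as an assumption. The 4 KiB
page size and the names PAGE_SIZE_BYTES, zone_size() and addr_to_node()
are hypothetical and chosen for this userspace example only.

```c
/* Assumption for this example: 4 KiB pages. */
#define PAGE_SIZE_BYTES 4096UL

/* Node partition of 16 pages, mirroring (1 << 4) * PAGE_SIZE in the diff. */
static unsigned long zone_size(void)
{
	return (1UL << 4) * PAGE_SIZE_BYTES;
}

/*
 * Hypothetical helper: pick a node for an address by zone index modulo
 * the number of nodes, so consecutive 16-page zones rotate over nodes.
 */
static unsigned int addr_to_node(unsigned long addr, unsigned int nr_nodes)
{
	return (unsigned int)((addr / zone_size()) % nr_nodes);
}
```

Under these assumptions each zone covers 64 KiB, so with 32 nodes the
addresses 0 through 65535 fall on node 0, address 65536 falls on node 1,
and the pattern wraps back to node 0 after 32 zones.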