From: Uladzislau Rezki <urezki@gmail.com>
To: Guenter Roeck <linux@roeck-us.net>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>,
	linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>, Baoquan He <bhe@redhat.com>,
	Lorenzo Stoakes <lstoakes@gmail.com>,
	Christoph Hellwig <hch@infradead.org>,
	Matthew Wilcox <willy@infradead.org>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	Dave Chinner <david@fromorbit.com>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	Joel Fernandes <joel@joelfernandes.org>,
	Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>
Subject: Re: [PATCH v3 07/11] mm: vmalloc: Offload free_vmap_area_lock lock
Date: Fri, 22 Mar 2024 20:03:04 +0100
Message-ID: <Zf3V6B9f5o0H1LnE@pc636>
In-Reply-To: <bbc242d5-3ab0-410f-a3b1-54a68e3e375f@roeck-us.net>

On Fri, Mar 22, 2024 at 11:21:02AM -0700, Guenter Roeck wrote:
> Hi,
> 
> On Tue, Jan 02, 2024 at 07:46:29PM +0100, Uladzislau Rezki (Sony) wrote:
> > Concurrent access to a global vmap space is a bottle-neck.
> > We can simulate a high contention by running a vmalloc test
> > suite.
> > 
> > To address it, introduce an effective vmap node logic. Each
> > node behaves as independent entity. When a node is accessed
> > it serves a request directly(if possible) from its pool.
> > 
> > This model has a size based pool for requests, i.e. pools are
> > serialized and populated based on object size and real demand.
> > A maximum object size that pool can handle is set to 256 pages.
> > 
> > This technique reduces a pressure on the global vmap lock.
> > 
> > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> 
> This patch results in a persistent "spinlock bad magic" message
> when booting s390 images with spinlock debugging enabled.
> 
> [    0.465445] BUG: spinlock bad magic on CPU#0, swapper/0
> [    0.465490]  lock: single+0x1860/0x1958, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
> [    0.466067] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-12955-g8e938e398669 #1
> [    0.466188] Hardware name: QEMU 8561 QEMU (KVM/Linux)
> [    0.466270] Call Trace:
> [    0.466470]  [<00000000011f26c8>] dump_stack_lvl+0x98/0xd8
> [    0.466516]  [<00000000001dcc6a>] do_raw_spin_lock+0x8a/0x108
> [    0.466545]  [<000000000042146c>] find_vmap_area+0x6c/0x108
> [    0.466572]  [<000000000042175a>] find_vm_area+0x22/0x40
> [    0.466597]  [<000000000012f152>] __set_memory+0x132/0x150
> [    0.466624]  [<0000000001cc0398>] vmem_map_init+0x40/0x118
> [    0.466651]  [<0000000001cc0092>] paging_init+0x22/0x68
> [    0.466677]  [<0000000001cbbed2>] setup_arch+0x52a/0x708
> [    0.466702]  [<0000000001cb6140>] start_kernel+0x80/0x5c8
> [    0.466727]  [<0000000000100036>] startup_continue+0x36/0x40
> 
> Bisect results and decoded stacktrace below.
> 
> The uninitialized spinlock is &vn->busy.lock.
> Debugging shows that this lock is actually never initialized.
> 
It is, once the vmalloc_init() "main entry" function is called from:

<snip>
start_kernel()
  mm_core_init()
    vmalloc_init()
<snip>

> [    0.464684] ####### locking 0000000002280fb8
> [    0.464862] BUG: spinlock bad magic on CPU#0, swapper/0
> ...
> [    0.464684] ####### locking 0000000002280fb8
> [    0.477479] ####### locking 0000000002280fb8
> [    0.478166] ####### locking 0000000002280fb8
> [    0.478218] ####### locking 0000000002280fb8
> ...
> [    0.718250] #### busy lock init 0000000002871860
> [    0.718328] #### busy lock init 00000000028731b8
> 
> Only the initialized locks are used after the call to vmap_init_nodes().
> 
Right, once the vmap space and vmalloc are initialized.

> Guenter
> 
> ---
> # bad: [8e938e39866920ddc266898e6ae1fffc5c8f51aa] Merge tag '6.9-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6
> # good: [e8f897f4afef0031fe618a8e94127a0934896aba] Linux 6.8
> git bisect start 'HEAD' 'v6.8'
> # good: [e56bc745fa1de77abc2ad8debc4b1b83e0426c49] smb311: additional compression flag defined in updated protocol spec
> git bisect good e56bc745fa1de77abc2ad8debc4b1b83e0426c49
> # bad: [902861e34c401696ed9ad17a54c8790e7e8e3069] Merge tag 'mm-stable-2024-03-13-20-04' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> git bisect bad 902861e34c401696ed9ad17a54c8790e7e8e3069
> # good: [480e035fc4c714fb5536e64ab9db04fedc89e910] Merge tag 'drm-next-2024-03-13' of https://gitlab.freedesktop.org/drm/kernel
> git bisect good 480e035fc4c714fb5536e64ab9db04fedc89e910
> # good: [fe46a7dd189e25604716c03576d05ac8a5209743] Merge tag 'sound-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
> git bisect good fe46a7dd189e25604716c03576d05ac8a5209743
> # bad: [435a75548109f19e5b5b14ae35b9acb063c084e9] mm: use folio more widely in __split_huge_page
> git bisect bad 435a75548109f19e5b5b14ae35b9acb063c084e9
> # good: [4d5bf0b6183f79ea361dd506365d2a471270735c] mm/mmu_gather: add tlb_remove_tlb_entries()
> git bisect good 4d5bf0b6183f79ea361dd506365d2a471270735c
> # bad: [4daacfe8f99f4b4cef562649d56c48642981f46e] mm/damon/sysfs-schemes: support PSI-based quota auto-tune
> git bisect bad 4daacfe8f99f4b4cef562649d56c48642981f46e
> # good: [217b2119b9e260609958db413876f211038f00ee] mm,page_owner: implement the tracking of the stacks count
> git bisect good 217b2119b9e260609958db413876f211038f00ee
> # bad: [40254101d87870b2e5ac3ddc28af40aa04c48486] arm64, crash: wrap crash dumping code into crash related ifdefs
> git bisect bad 40254101d87870b2e5ac3ddc28af40aa04c48486
> # bad: [53becf32aec1c8049b854f0c31a11df5ed75df6f] mm: vmalloc: support multiple nodes in vread_iter
> git bisect bad 53becf32aec1c8049b854f0c31a11df5ed75df6f
> # good: [7fa8cee003166ef6db0bba70d610dbf173543811] mm: vmalloc: move vmap_init_free_space() down in vmalloc.c
> git bisect good 7fa8cee003166ef6db0bba70d610dbf173543811
> # good: [282631cb2447318e2a55b41a665dbe8571c46d70] mm: vmalloc: remove global purge_vmap_area_root rb-tree
> git bisect good 282631cb2447318e2a55b41a665dbe8571c46d70
> # bad: [96aa8437d169b8e030a98e2b74fd9a8ee9d3be7e] mm: vmalloc: add a scan area of VA only once
> git bisect bad 96aa8437d169b8e030a98e2b74fd9a8ee9d3be7e
> # bad: [72210662c5a2b6005f6daea7fe293a0dc573e1a5] mm: vmalloc: offload free_vmap_area_lock lock
> git bisect bad 72210662c5a2b6005f6daea7fe293a0dc573e1a5
> # first bad commit: [72210662c5a2b6005f6daea7fe293a0dc573e1a5] mm: vmalloc: offload free_vmap_area_lock lock
> 
> ---
> [    0.465490] lock: single+0x1860/0x1958, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
> [    0.466067] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-12955-g8e938e398669 #1
> [    0.466188] Hardware name: QEMU 8561 QEMU (KVM/Linux)
> [    0.466270] Call Trace:
> [    0.466470] dump_stack_lvl (lib/dump_stack.c:117)
> [    0.466516] do_raw_spin_lock (kernel/locking/spinlock_debug.c:87 kernel/locking/spinlock_debug.c:115)
> [    0.466545] find_vmap_area (mm/vmalloc.c:1059 mm/vmalloc.c:2364)
> [    0.466572] find_vm_area (mm/vmalloc.c:3150)
> [    0.466597] __set_memory (arch/s390/mm/pageattr.c:360 arch/s390/mm/pageattr.c:393)
> [    0.466624] vmem_map_init (./arch/s390/include/asm/set_memory.h:55 arch/s390/mm/vmem.c:660)
> [    0.466651] paging_init (arch/s390/mm/init.c:97)
> [    0.466677] setup_arch (arch/s390/kernel/setup.c:972)
> [    0.466702] start_kernel (init/main.c:899)
> [    0.466727] startup_continue (arch/s390/kernel/head64.S:35)
> [    0.466811] INFO: lockdep is turned off.
> 
<snip>
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 22aa63f4ef63..0d77d171b5d9 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2343,6 +2343,9 @@ struct vmap_area *find_vmap_area(unsigned long addr)
        struct vmap_area *va;
        int i, j;

+       if (unlikely(!vmap_initialized))
+               return NULL;
+
        /*
         * An addr_to_node_id(addr) converts an address to a node index
         * where a VA is located. If VA spans several zones and passed
<snip>

Could you please test it?
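
For illustration only, the ordering problem and the early-return guard can be
modelled in user space. This is a rough sketch, not kernel code: every name
in it (model_lock, model_node, model_find, model_init, MODEL_LOCK_MAGIC) is
hypothetical, the magic value is arbitrary, and the "lock" is just a struct
with a magic field standing in for what the spinlock debug check looks at.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Arbitrary marker for an initialized lock (hypothetical value). */
#define MODEL_LOCK_MAGIC 0x600dc0deU

struct model_lock {
	unsigned int magic;	/* stays 0 until model_init() runs */
	bool held;
};

struct model_node {
	struct model_lock busy_lock;
	unsigned long va_start, va_end;	/* a single tracked area, for brevity */
};

static struct model_node node;		/* zero-filled, like static data at boot */
static bool model_initialized;		/* plays the role of vmap_initialized */
static unsigned int bad_magic_hits;	/* how often a zeroed lock was taken */

static void model_lock_acquire(struct model_lock *l)
{
	if (l->magic != MODEL_LOCK_MAGIC)
		bad_magic_hits++;	/* a debug check would complain here */
	l->held = true;
}

static void model_lock_release(struct model_lock *l)
{
	l->held = false;
}

/* Stands in for the late init step that sets up the per-node locks. */
static void model_init(void)
{
	node.busy_lock.magic = MODEL_LOCK_MAGIC;
	node.va_start = 0x1000;
	node.va_end = 0x2000;
	model_initialized = true;
}

/* Lookup; with guarded == true it bails out before touching the lock. */
static const struct model_node *model_find(unsigned long addr, bool guarded)
{
	const struct model_node *found = NULL;

	if (guarded && !model_initialized)
		return NULL;

	model_lock_acquire(&node.busy_lock);
	if (addr >= node.va_start && addr < node.va_end)
		found = &node;
	model_lock_release(&node.busy_lock);
	return found;
}

/* Walks through the three cases and returns the bad-magic count. */
static unsigned int model_demo(void)
{
	/* Early caller, guarded: returns before the lock is touched. */
	assert(model_find(0x1800, true) == NULL);
	assert(bad_magic_hits == 0);

	/* Early caller, unguarded: the zeroed lock is taken and flagged. */
	assert(model_find(0x1800, false) == NULL);
	assert(bad_magic_hits == 1);

	/* After init the guarded lookup behaves normally. */
	model_init();
	assert(model_find(0x1800, true) == &node);
	return bad_magic_hits;
}
```

In this model both early lookups return NULL, since nothing is registered
yet; the guard only avoids taking a lock that has not been set up.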

--
Uladzislau Rezki

