From: Vlastimil Babka <vbabka@suse.cz>
To: Sudarsan Mahendran <sudarsanm@google.com>,
Harry Yoo <harry.yoo@oracle.com>
Cc: Liam.Howlett@oracle.com, cl@gentwo.org, howlett@gmail.com,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
maple-tree@lists.infradead.org, rcu@vger.kernel.org,
rientjes@google.com, roman.gushchin@linux.dev, surenb@google.com,
urezki@gmail.com, Greg Thelen <gthelen@google.com>
Subject: Re: [PATCH v5 00/14] SLUB percpu sheaves
Date: Sat, 16 Aug 2025 20:31:20 +0200
Message-ID: <498fc518-d78a-43a4-9196-507891e9b844@suse.cz>
In-Reply-To: <CAA9mObAiQbAYvzhW---VoqDA6Zsb152p5ePMvbco0xgwyvaB2Q@mail.gmail.com>
On 8/16/25 7:35 PM, Sudarsan Mahendran wrote:
>
>
> On Sat, Aug 16, 2025 at 1:06 AM Harry Yoo <harry.yoo@oracle.com> wrote:
>>
>> On Fri, Aug 15, 2025 at 03:53:00PM -0700, Sudarsan Mahendran wrote:
>> > Hi Vlastimil,
>> >
>> > I ported this patch series on top of v6.17.
>> > I had to resolve some merge conflicts because of
>> > fba46a5d83ca8decb338722fb4899026d8d9ead2
>> >
>> > The conflict resolution looks like:
>> >
>> > @@ -5524,20 +5335,19 @@ EXPORT_SYMBOL_GPL(mas_store_prealloc);
>> > int mas_preallocate(struct ma_state *mas, void *entry, gfp_t gfp)
>> > {
>> > MA_WR_STATE(wr_mas, mas, entry);
>> > - int ret = 0;
>> > - int request;
>> >
>> > mas_wr_prealloc_setup(&wr_mas);
>> > mas->store_type = mas_wr_store_type(&wr_mas);
>> > - request = mas_prealloc_calc(&wr_mas, entry);
>> > - if (!request)
>> > + mas_prealloc_calc(&wr_mas, entry);
>> > + if (!mas->node_request)
>> > goto set_flag;
>> >
>> > mas->mas_flags &= ~MA_STATE_PREALLOC;
>> > - mas_node_count_gfp(mas, request, gfp);
>> > + mas_alloc_nodes(mas, gfp);
>> > if (mas_is_err(mas)) {
>> > - mas_set_alloc_req(mas, 0);
>> > - ret = xa_err(mas->node);
>> > + int ret = xa_err(mas->node);
>> > +
>> > + mas->node_request = 0;
>> > mas_destroy(mas);
>> > mas_reset(mas);
>> > return ret;
>> > @@ -5545,7 +5355,7 @@ int mas_preallocate(struct ma_state *mas, void
> *entry, gfp_t gfp)
>> >
>> > set_flag:
>> > mas->mas_flags |= MA_STATE_PREALLOC;
>> > - return ret;
>> > + return 0;
>> > }
>> > EXPORT_SYMBOL_GPL(mas_preallocate);
>> >
>> >
>> >
>> > When I try to boot this kernel, I see kernel panic
>> > with rcu_free_sheaf() doing recursion into __kmem_cache_free_bulk()
>> >
>> > Stack trace:
>> >
>> > [ 1.583673] Oops: stack guard page: 0000 [#1] SMP NOPTI
>> > [ 1.583676] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted
> 6.17.0-smp-sheaves2 #1 NONE
>> > [ 1.583679] RIP: 0010:__kmem_cache_free_bulk+0x57/0x540
>> > [ 1.583684] Code: 48 85 f6 0f 84 b8 04 00 00 49 89 d6 49 89 ff 48
> 85 ff 0f 84 fe 03 00 00 49 83 7f 08 00 0f 84 f3 03 00 00 0f 1f 44 00 00
> 31 c0 <48> 89 44 24 18 65 8b 05 6d 26 dc 02 89 44 24 2c 31 ff 89 f8 c7 44
>> > [ 1.583685] RSP: 0018:ff40dbc49b048fc0 EFLAGS: 00010246
>> > [ 1.583687] RAX: 0000000000000000 RBX: 0000000000000012 RCX:
> ffffffff939e8640
>> > [ 1.583687] RDX: ff2afe75213e6c90 RSI: 0000000000000012 RDI:
> ff2afe750004ad00
>> > [ 1.583688] RBP: ff40dbc49b049130 R08: ff2afe75368c2500 R09:
> ff2afe75368c3b00
>> > [ 1.583689] R10: ff2afe75368c2500 R11: ff2afe75368c3b00 R12:
> ff2aff31ba00b000
>> > [ 1.583690] R13: ffffffff939e8640 R14: ff2afe75213e6c90 R15:
> ff2afe750004ad00
>> > [ 1.583690] FS: 0000000000000000(0000) GS:ff2aff31ba00b000(0000)
> knlGS:0000000000000000
>> > [ 1.583691] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > [ 1.583692] CR2: ff40dbc49b048fb8 CR3: 0000000017c3e001 CR4:
> 0000000000771ef0
>> > [ 1.583692] PKRU: 55555554
>> > [ 1.583693] Call Trace:
>> > [ 1.583694] <IRQ>
>> > [ 1.583696] __kmem_cache_free_bulk+0x2c7/0x540
>>
>> [..]
>>
>> > [ 1.583759] __kmem_cache_free_bulk+0x2c7/0x540
>>
>> Hi Sudarsan, thanks for the report.
>>
>> I'm not really sure how __kmem_cache_free_bulk() can call itself.
>> There's no recursion of __kmem_cache_free_bulk() in the code.
> Hi Harry,
>
> I assume somehow the free_to_pcs_bulk() fallback case is taken, thus
> calling __kmem_cache_free_bulk(), which calls free_to_pcs_bulk() ad nauseam.
Could it be a rebase gone wrong? My rebase onto 6.17-rc1 is here (but untested):
https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/
> free_to_pcs_bulk()
> {
> ...
> fallback:
> __kmem_cache_free_bulk(s, size, p);
> ...
> }
>
> static void __kmem_cache_free_bulk(struct kmem_cache *s, size_t size,
> void **p)
I don't have this; this code seems to correspond to my
kmem_cache_free_bulk(), while my __kmem_cache_free_bulk() is just
build_detached_freelist() and do_slab_free() with no sheaves involved.
> {
> if (!size)
> return;
>
> /*
> * freeing to sheaves is so incompatible with the detached
> freelist so
> * once we go that way, we have to do everything differently
> */
> if (s && s->cpu_sheaves) {
> free_to_pcs_bulk(s, size, p);
> return;
> }
> ...
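
To make the difference concrete, here is a minimal user-space sketch of the
layering described above. It is an illustration under stated assumptions, not
the code from mm/slub.c: struct cache_model and all model_* names are
hypothetical stand-ins (model_free_bulk() for kmem_cache_free_bulk(),
model_free_to_pcs_bulk() for free_to_pcs_bulk(), model_slow_free_bulk() for
__kmem_cache_free_bulk()), and the sheaf is reduced to a counter of free
slots. The point is only that the sheaf check lives in the public entry point,
so the fallback from the sheaf path lands in a slow path that never looks at
sheaves again and cannot recurse.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kmem_cache; only what the sketch needs. */
struct cache_model {
	bool has_sheaves;             /* models s->cpu_sheaves being set */
	size_t sheaf_room;            /* free slots left in the percpu sheaf */
	size_t freed_to_sheaf;        /* objects absorbed by the sheaf */
	size_t freed_to_slab;         /* objects handed to the slow path */
	unsigned int slow_path_calls; /* how often the slow path was entered */
};

/*
 * Slow path, modelling __kmem_cache_free_bulk(): in the real code this is
 * build_detached_freelist() plus do_slab_free(). It never checks for sheaves.
 */
static void model_slow_free_bulk(struct cache_model *s, size_t size, void **p)
{
	(void)p; /* the model only counts objects */
	s->slow_path_calls++;
	s->freed_to_slab += size;
}

/*
 * Sheaf path, modelling free_to_pcs_bulk(): take what fits into the sheaf,
 * then fall back to the slow path for the remainder.
 */
static void model_free_to_pcs_bulk(struct cache_model *s, size_t size, void **p)
{
	size_t batch = size < s->sheaf_room ? size : s->sheaf_room;

	assert(s->has_sheaves); /* only reached for sheaf-enabled caches */

	s->sheaf_room -= batch;
	s->freed_to_sheaf += batch;

	if (batch < size)
		model_slow_free_bulk(s, size - batch, p + batch);
}

/*
 * Public entry point, modelling kmem_cache_free_bulk(): the only place that
 * decides between the sheaf path and the slow path.
 */
static void model_free_bulk(struct cache_model *s, size_t size, void **p)
{
	if (!size)
		return;

	if (s->has_sheaves) {
		model_free_to_pcs_bulk(s, size, p);
		return;
	}

	model_slow_free_bulk(s, size, p);
}
```

If the sheaf check were moved into the slow path instead, as in the quoted
code, a full sheaf would make the two functions call each other until the
stack runs out, which would be consistent with the guard page hit in the
trace.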
>
> Thanks Greg for pointing this out.
>
>
>> As v6.17-rc1 is known to cause a few surprising bugs, could you please
>> rebase on top of mm-hotfixes-unstable and check if it still reproduces?
>>
>> > [ 1.583761] ? update_group_capacity+0xad/0x1f0
>> > [ 1.583763] ? sched_balance_rq+0x4f6/0x1e80
>> > [ 1.583765] __kmem_cache_free_bulk+0x2c7/0x540
>> > [ 1.583767] ? update_irq_load_avg+0x35/0x480
>> > [ 1.583768] ? __pfx_rcu_free_sheaf+0x10/0x10
>> > [ 1.583769] rcu_free_sheaf+0x86/0x110
>> > [ 1.583771] rcu_do_batch+0x245/0x750
>> > [ 1.583772] rcu_core+0x13a/0x260
>> > [ 1.583773] handle_softirqs+0xcb/0x270
>> > [ 1.583775] __irq_exit_rcu+0x48/0xf0
>> > [ 1.583776] sysvec_apic_timer_interrupt+0x74/0x80
>> > [ 1.583778] </IRQ>
>> > [ 1.583778] <TASK>
>> > [ 1.583779] asm_sysvec_apic_timer_interrupt+0x1a/0x20
>> > [ 1.583780] RIP: 0010:cpuidle_enter_state+0x101/0x290
>> > [ 1.583781] Code: 85 f4 ff ff 49 89 c4 8b 73 04 bf ff ff ff ff e8
> d5 44 d4 ff 31 ff e8 9e c7 37 ff 80 7c 24 04 00 74 05 e8 12 45 d4 ff fb
> 85 ed <0f> 88 ba 00 00 00 89 e9 48 6b f9 68 4c 8b 44 24 08 49 8b 54 38 30
>> > [ 1.583782] RSP: 0018:ff40dbc4809afe80 EFLAGS: 00000202
>> > [ 1.583782] RAX: ff2aff31ba00b000 RBX: ff2afe75614b0800 RCX:
> 000000005e64b52b
>> > [ 1.583783] RDX: 000000005e73f761 RSI: 0000000000000067 RDI:
> 0000000000000000
>> > [ 1.583783] RBP: 0000000000000002 R08: fffffffffffffff6 R09:
> 0000000000000000
>> > [ 1.583784] R10: 0000000000000380 R11: ffffffff908c38d0 R12:
> 000000005e64b535
>> > [ 1.583784] R13: 000000005e5580da R14: ffffffff92890b10 R15:
> 0000000000000002
>> > [ 1.583784] ? __pfx_read_tsc+0x10/0x10
>> > [ 1.583787] cpuidle_enter+0x2c/0x40
>> > [ 1.583788] do_idle+0x1a7/0x240
>> > [ 1.583790] cpu_startup_entry+0x2a/0x30
>> > [ 1.583791] start_secondary+0x95/0xa0
>> > [ 1.583794] common_startup_64+0x13e/0x140
>> > [ 1.583796] </TASK>
>> > [ 1.583796] Modules linked in:
>> > [ 1.583798] ---[ end trace 0000000000000000 ]---
>> > [ 1.583798] RIP: 0010:__kmem_cache_free_bulk+0x57/0x540
>> > [ 1.583800] Code: 48 85 f6 0f 84 b8 04 00 00 49 89 d6 49 89 ff 48
> 85 ff 0f 84 fe 03 00 00 49 83 7f 08 00 0f 84 f3 03 00 00 0f 1f 44 00 00
> 31 c0 <48> 89 44 24 18 65 8b 05 6d 26 dc 02 89 44 24 2c 31 ff 89 f8 c7 44
>> > [ 1.583800] RSP: 0018:ff40dbc49b048fc0 EFLAGS: 00010246
>> > [ 1.583801] RAX: 0000000000000000 RBX: 0000000000000012 RCX:
> ffffffff939e8640
>> > [ 1.583801] RDX: ff2afe75213e6c90 RSI: 0000000000000012 RDI:
> ff2afe750004ad00
>> > [ 1.583801] RBP: ff40dbc49b049130 R08: ff2afe75368c2500 R09:
> ff2afe75368c3b00
>> > [ 1.583802] R10: ff2afe75368c2500 R11: ff2afe75368c3b00 R12:
> ff2aff31ba00b000
>> > [ 1.583802] R13: ffffffff939e8640 R14: ff2afe75213e6c90 R15:
> ff2afe750004ad00
>> > [ 1.583802] FS: 0000000000000000(0000) GS:ff2aff31ba00b000(0000)
> knlGS:0000000000000000
>> > [ 1.583803] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > [ 1.583803] CR2: ff40dbc49b048fb8 CR3: 0000000017c3e001 CR4:
> 0000000000771ef0
>> > [ 1.583803] PKRU: 55555554
>> > [ 1.583804] Kernel panic - not syncing: Fatal exception in interrupt
>> > [ 1.584659] Kernel Offset: 0xf600000 from 0xffffffff81000000
> (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
>> >
>> >
Thread overview: 46+ messages
2025-07-23 13:34 [PATCH v5 00/14] SLUB percpu sheaves Vlastimil Babka
2025-07-23 13:34 ` [PATCH v5 01/14] slab: add opt-in caching layer of " Vlastimil Babka
2025-08-18 10:09 ` Harry Yoo
2025-08-26 8:03 ` Vlastimil Babka
2025-08-19 4:19 ` Suren Baghdasaryan
2025-08-26 8:51 ` Vlastimil Babka
2025-07-23 13:34 ` [PATCH v5 02/14] slab: add sheaf support for batching kfree_rcu() operations Vlastimil Babka
2025-07-23 16:39 ` Uladzislau Rezki
2025-07-24 14:30 ` Vlastimil Babka
2025-07-24 17:36 ` Uladzislau Rezki
2025-07-23 13:34 ` [PATCH v5 03/14] slab: sheaf prefilling for guaranteed allocations Vlastimil Babka
2025-07-23 13:34 ` [PATCH v5 04/14] slab: determine barn status racily outside of lock Vlastimil Babka
2025-07-23 13:34 ` [PATCH v5 05/14] tools: Add testing support for changes to rcu and slab for sheaves Vlastimil Babka
2025-08-22 16:28 ` Suren Baghdasaryan
2025-08-26 9:32 ` Vlastimil Babka
2025-08-27 0:19 ` Suren Baghdasaryan
2025-07-23 13:34 ` [PATCH v5 06/14] tools: Add sheaves support to testing infrastructure Vlastimil Babka
2025-08-22 16:56 ` Suren Baghdasaryan
2025-08-26 9:59 ` Vlastimil Babka
2025-07-23 13:34 ` [PATCH v5 07/14] maple_tree: use percpu sheaves for maple_node_cache Vlastimil Babka
2025-07-23 13:34 ` [PATCH v5 08/14] mm, vma: use percpu sheaves for vm_area_struct cache Vlastimil Babka
2025-07-23 13:34 ` [PATCH v5 09/14] mm, slub: skip percpu sheaves for remote object freeing Vlastimil Babka
2025-08-25 5:22 ` Harry Yoo
2025-08-26 10:11 ` Vlastimil Babka
2025-07-23 13:34 ` [PATCH v5 10/14] mm, slab: allow NUMA restricted allocations to use percpu sheaves Vlastimil Babka
2025-08-22 19:58 ` Suren Baghdasaryan
2025-08-25 6:52 ` Harry Yoo
2025-08-26 10:49 ` Vlastimil Babka
2025-07-23 13:34 ` [PATCH v5 11/14] testing/radix-tree/maple: Increase readers and reduce delay for faster machines Vlastimil Babka
2025-07-23 13:34 ` [PATCH v5 12/14] maple_tree: Sheaf conversion Vlastimil Babka
2025-08-22 20:18 ` Suren Baghdasaryan
2025-08-26 14:22 ` Liam R. Howlett
2025-08-27 2:07 ` Suren Baghdasaryan
2025-08-28 14:27 ` Liam R. Howlett
2025-07-23 13:34 ` [PATCH v5 13/14] maple_tree: Add single node allocation support to maple state Vlastimil Babka
2025-08-22 20:25 ` Suren Baghdasaryan
2025-08-26 15:10 ` Liam R. Howlett
2025-08-27 2:03 ` Suren Baghdasaryan
2025-07-23 13:34 ` [PATCH v5 14/14] maple_tree: Convert forking to use the sheaf interface Vlastimil Babka
2025-08-22 20:29 ` Suren Baghdasaryan
2025-08-15 22:53 ` [PATCH v5 00/14] SLUB percpu sheaves Sudarsan Mahendran
2025-08-16 8:05 ` Harry Yoo
2025-08-16 17:35 ` Sudarsan Mahendran
2025-08-16 18:31 ` Vlastimil Babka [this message]
2025-08-16 18:33 ` Vlastimil Babka
2025-08-17 4:28 ` Sudarsan Mahendran