From: Harry Yoo <harry.yoo@oracle.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Petr Tesarik <ptesarik@suse.com>,
Christoph Lameter <cl@gentwo.org>,
David Rientjes <rientjes@google.com>,
Roman Gushchin <roman.gushchin@linux.dev>,
Hao Li <hao.li@linux.dev>,
Andrew Morton <akpm@linux-foundation.org>,
Uladzislau Rezki <urezki@gmail.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Suren Baghdasaryan <surenb@google.com>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Alexei Starovoitov <ast@kernel.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
linux-rt-devel@lists.linux.dev, bpf@vger.kernel.org,
kasan-dev@googlegroups.com,
kernel test robot <oliver.sang@intel.com>,
stable@vger.kernel.org
Subject: Re: [PATCH RFC v2 01/20] mm/slab: add rcu_barrier() to kvfree_rcu_barrier_on_cache()
Date: Wed, 14 Jan 2026 20:14:07 +0900
Message-ID: <aWd6f3jERlrB5yeF@hyeyoo>
In-Reply-To: <342a2a8f-43ee-4eff-a062-6d325faa8899@suse.cz>
On Tue, Jan 13, 2026 at 02:09:33PM +0100, Vlastimil Babka wrote:
> On 1/13/26 1:31 PM, Harry Yoo wrote:
> > On Tue, Jan 13, 2026 at 10:32:33AM +0100, Vlastimil Babka wrote:
> >> On 1/13/26 3:08 AM, Harry Yoo wrote:
> >>> On Mon, Jan 12, 2026 at 04:16:55PM +0100, Vlastimil Babka wrote:
> >>>> After we submit the rcu_free sheaves to call_rcu() we need to make sure
> >>>> the rcu callbacks complete. kvfree_rcu_barrier() does that via
> >>>> flush_all_rcu_sheaves() but kvfree_rcu_barrier_on_cache() doesn't. Fix
> >>>> that.
> >>>
> >>> Oops, my bad.
> >>>
> >>>> Reported-by: kernel test robot <oliver.sang@intel.com>
> >>>> Closes: https://lore.kernel.org/oe-lkp/202601121442.c530bed3-lkp@intel.com
> >>>> Fixes: 0f35040de593 ("mm/slab: introduce kvfree_rcu_barrier_on_cache() for cache destruction")
> >>>> Cc: stable@vger.kernel.org
> >>>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> >>>> ---
> >>>
> >>> The fix looks good to me, but I wonder why
> >>> `if (s->sheaf_capacity) rcu_barrier();` in __kmem_cache_shutdown()
> >>> didn't prevent the bug from happening?
> >>
> >> Hmm good point, didn't notice it's there.
> >>
> >> I think it doesn't help because it happens only after
> >> flush_all_cpus_locked(). And the callback from rcu_free_sheaf_nobarn()
> >> will do sheaf_flush_unused() and end up installing the cpu slab again.
> >
> > I thought about it a little bit more...
> >
> > It's not because a cpu slab was installed again (for list_slab_objects()
> > to be called on a slab, it must be on n->partial list), but because
>
> Hmm that's true.
>
> > flush_slab() cannot handle concurrent frees to the cpu slab.
> >
> >   CPU X                              CPU Y
> >
> >   - flush_slab() reads
> >     c->freelist
> >                                      rcu_free_sheaf_nobarn()
> >                                      ->sheaf_flush_unused()
> >                                      ->__kmem_cache_free_bulk()
> >                                      ->do_slab_free()
> >                                        -> sees slab == c->slab
> >                                        -> frees to c->freelist
> >   - c->slab = NULL,
> >     c->freelist = NULL
> >   - call deactivate_slab()
> >     ^ the object freed by sheaf_flush_unused() is leaked,
> >       thus slab->inuse != 0
>
> But for this to be the same "c" it has to be the same cpu, not different
> X and Y, no?

You're absolutely right! It just slipped my mind.

> And that case is protected I think, the action by X with
> local_lock_irqsave() prevents an irq handler to execute Y. Action Y is
> using __update_cpu_freelist_fast to find out it was interrupted by X
> messing with c-> fields.

Right.

Also, the test module frees just one object (with slab merging
disabled), so there is no concurrent freeing in the test.

For the record, an accurate analysis of the problem (as discussed
off-list):

It turns out the object freed by sheaf_flush_unused() was in the KASAN
per-cpu quarantine list (confirmed by dumping the list) by the time
__kmem_cache_shutdown() returned an error.

Quarantined objects are supposed to be flushed by kasan_cache_shutdown(),
but things go wrong if the RCU callback (rcu_free_sheaf_nobarn()) is
processed after kasan_cache_shutdown() finishes.

That's why the rcu_barrier() in __kmem_cache_shutdown() didn't help:
it's called after kasan_cache_shutdown().

Calling rcu_barrier() in kvfree_rcu_barrier_on_cache() guarantees
that the object is added to the quarantine list before
kasan_cache_shutdown() is called. So it's a valid fix!
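
To make the ordering argument concrete, here is a minimal userspace sketch
in C. It is a toy model under stated assumptions, not kernel code: every
toy_* name and simulate_destroy() are hypothetical, the cache holds a single
object, and the whole destruction path is reduced to three steps (optional
early barrier, quarantine flush, shutdown check). It only illustrates why a
barrier that runs after the quarantine flush comes too late.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (hypothetical): one cache, one object, one deferred free. */
struct toy_cache {
	int inuse;           /* objects the allocator still counts as allocated */
	int quarantined;     /* objects parked in the modelled KASAN quarantine */
	bool rcu_cb_pending; /* modelled rcu_free_sheaf_nobarn() not yet run */
};

/* Modelled RCU callback: the freed object lands in the quarantine. */
static void toy_rcu_callback(struct toy_cache *c)
{
	c->quarantined++;
}

/* Modelled rcu_barrier(): run the pending callback, if there is one. */
static void toy_rcu_barrier(struct toy_cache *c)
{
	if (c->rcu_cb_pending) {
		c->rcu_cb_pending = false;
		toy_rcu_callback(c);
	}
}

/* Modelled kasan_cache_shutdown(): really free what is quarantined now. */
static void toy_quarantine_flush(struct toy_cache *c)
{
	assert(c->quarantined <= c->inuse);
	c->inuse -= c->quarantined;
	c->quarantined = 0;
}

/*
 * Modelled __kmem_cache_shutdown(): it has its own barrier, but by now the
 * quarantine flush is already over. Returns the number of objects still in
 * use; nonzero stands for the shutdown error.
 */
static int toy_cache_shutdown(struct toy_cache *c)
{
	toy_rcu_barrier(c);
	return c->inuse;
}

/* Returns the number of objects still in use when shutdown checks. */
int simulate_destroy(bool barrier_before_quarantine_flush)
{
	struct toy_cache c = {
		.inuse = 1, .quarantined = 0, .rcu_cb_pending = true,
	};

	if (barrier_before_quarantine_flush)
		toy_rcu_barrier(&c); /* stands in for the fix in this patch */
	toy_quarantine_flush(&c);
	return toy_cache_shutdown(&c);
}
```

In this model simulate_destroy(false) returns 1 (the object reaches the
quarantine only after the flush, so shutdown still sees it in use), while
simulate_destroy(true) returns 0.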
--
Cheers,
Harry / Hyeonggon