From: Harry Yoo <harry.yoo@oracle.com>
To: Hao Li <hao.li@linux.dev>
Cc: Alan Stern <stern@rowland.harvard.edu>,
	linux-mm@kvack.org, Dmitry Vyukov <dvyukov@google.com>,
	lkmm@lists.linux.dev, linux-arch@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Joel Fernandes <joelagnelf@nvidia.com>,
	Daniel Lustig <dlustig@nvidia.com>,
	Akira Yokosawa <akiyks@gmail.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Luc Maranget <luc.maranget@inria.fr>,
	Jade Alglave <j.alglave@ucl.ac.uk>,
	David Howells <dhowells@redhat.com>,
	Nicholas Piggin <npiggin@gmail.com>,
	Boqun Feng <boqun@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Will Deacon <will@kernel.org>,
	Andrea Parri <parri.andrea@gmail.com>,
	Pedro Falcato <pfalcato@suse.de>,
	Vlastimil Babka <vbabka@suse.cz>,
	Christoph Lameter <cl@gentwo.org>,
	David Rientjes <rientjes@google.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	Venkat Rao Bagalkote <venkat88@linux.ibm.com>,
	Mateusz Guzik <mjguzik@gmail.com>,
	Suren Baghdasaryan <surenb@google.com>,
	Marco Elver <elver@google.com>
Subject: Re: [BUG] Memory ordering between kmalloc() and kfree()? it's confusing!
Date: Fri, 27 Feb 2026 18:03:23 +0900
Message-ID: <aaFd21CMhQCB2rkU@hyeyoo>
In-Reply-To: <yf7gon3s3efwiwpsdytujnhohnxjc3fgu7oabmia4tbhwqgcs7@rzb7sfy2wrbu>

On Fri, Feb 27, 2026 at 04:06:37PM +0800, Hao Li wrote:
> On Fri, Feb 27, 2026 at 01:17:52AM +0900, Harry Yoo wrote:
> > On Thu, Feb 26, 2026 at 10:45:55AM -0500, Alan Stern wrote:
> > > On Thu, Feb 26, 2026 at 03:35:08PM +0900, Harry Yoo wrote:
> > > > Hello, SLAB, LKMM, and KCSAN folks!
> > > > 
> > > > I'd like to discuss slab's assumption on users regarding memory ordering.
> > > > 
> > > > Recently, I've been investigating an interesting slab memory ordering
> > > > issue [3] [4] in v7.0-rc1, which made me think about memory ordering
> > > > for slab objects.
> > > > 
> > > > But without answering "What does slab expect users to do for correct
> > > > operation?", I kept getting puzzled, and my brain hurt too much :/
> > > > I'm writing things down to stop getting confused :)
> > > > 
> > > > Since I have never thought about this before, my reasoning could be
> > > > partially or entirely incorrect. If so, please kindly let me know.
> > > > 
> > > > # Slab's assumption: Stores to object, its metadata, or struct slab
> > > > # must be visible to the CPU that frees the object, when it is
> > > > # passed to kfree(). It's users' responsibility to guarantee that.
> > > > 
> > > > > When the slab allocator allocates an object, it updates its metadata and
> > > > > struct slab fields. After allocation, the slab user updates the object's
> > > > > content. As long as the object is freed on the same CPU that allocated
> > > > > it, kfree() can see those stores (a CPU can always see what's in its
> > > > > own store buffer), so no problem!
> > > > 
> > > > > However, when, e.g., the pointer to the object is stored in a shared
> > > > > variable and then freed on a different CPU, things become trickier.
> > > > 
> > > > In this case, I think it's fair for the slab allocator to assume that:
> > > > 
> > > >   1) Such stores must involve _at least_ a release barrier
> > > >      (for example, via {cmp,}xchg{,_release}, or smp_store_release())
> > > >      to ensure preceding stores are visible to other CPUs before
> > > >      the pointer store becomes visible, and
> > > > 
> > > >   2) The CPU that frees an object must invoke at least an acquire
> > > >      barrier to ensure that stores to object content / metadata, etc.,
> > > >      are visible to the freeing CPU when it calls kfree().
> > > > 
> > > > Because the slab allocator itself doesn't guarantee that such
> > > > barriers are invoked within the allocator, it relies on users to
> > > > do this when needed.
> > > 
> > > It doesn't?  Then how does the slab allocator guarantee that two 
> > > different CPUs won't try to perform allocations or deallocations from 
> > > the same slab at the same time, messing everything up?
> > 
> > Ah, alloc/free slowpaths do use cmpxchg128 or spinlock and
> > don't mess things up.
> > 
> > But fastpath allocs/frees are served from percpu array that is protected
> > by a local_lock. local_lock has a compiler barrier in it, but that's
> > not enough.
> 
> Hmm, this memory-ordering issue is indeed pretty mind-bending. I'd like to
> share a few thoughts as well. Happy to be corrected!

Yeah, it's indeed confusing :)
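
For reference, the user-side handoff that rules 1) and 2) of the original
mail describe can be sketched in userspace C11. This is only an
illustration under stated assumptions: struct obj, shared_slot, consumer()
and handoff_demo() are hypothetical names, malloc()/free() stand in for
kmalloc()/kfree(), and the C11 release store / acquire load stand in for
smp_store_release() / smp_load_acquire(). It does not model the slab
allocator's own metadata.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical object type; stands in for anything obtained from kmalloc(). */
struct obj {
	int payload;
};

/* Shared slot used to hand the object over to another thread ("CPU"). */
static _Atomic(struct obj *) shared_slot;

/* Consumer: waits for the pointer, reads the payload, then frees the object. */
static void *consumer(void *arg)
{
	int *seen = arg;
	struct obj *p;

	/* Acquire load: pairs with the release store in handoff_demo(). */
	while (!(p = atomic_load_explicit(&shared_slot, memory_order_acquire)))
		;

	*seen = p->payload;	/* ordered after the producer's payload store */
	free(p);		/* stands in for kfree() on the "other CPU" */
	return NULL;
}

/* Producer side; returns the payload the consumer observed, or -1 on error. */
int handoff_demo(void)
{
	pthread_t tid;
	int seen = -1;
	struct obj *p = malloc(sizeof(*p));	/* stands in for kmalloc() */

	if (!p)
		return -1;

	atomic_store_explicit(&shared_slot, NULL, memory_order_relaxed);
	if (pthread_create(&tid, NULL, consumer, &seen)) {
		free(p);
		return -1;
	}

	p->payload = 42;	/* plain store to the object content */
	/* Release store: the payload store is ordered before the pointer store. */
	atomic_store_explicit(&shared_slot, p, memory_order_release);

	pthread_join(tid, NULL);
	return seen;
}
```

If either side is weakened to a relaxed access, the consumer is no longer
guaranteed to see the payload store before it frees the object, which is
the situation the original mail worries about.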

> For our current problem, I think the key lies in the relative ordering between
> the two variables, stride and obj_exts. To address it, we need to ensure that
> on the writer side, stride is assigned before obj_exts. And on the reader
> side, we need to guarantee that if it can observe the latest value of
> obj_exts, then it must also be able to observe the latest value of stride.

Yes, enforcing ordering between these two variables would avoid the
problem, but it's a somewhat expensive way to do it.
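
The writer/reader ordering described in the quoted paragraph can be
sketched as below. This is a userspace approximation, not the slab code:
struct fake_slab and ordered_pair_demo() are hypothetical, the values 16
and 0x1000 are made up, and the release store / acquire load on obj_exts
stand in for whatever the kernel would use (the real obj_exts is
installed with cmpxchg(), which this sketch does not model).

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the two struct slab fields under discussion. */
struct fake_slab {
	unsigned int stride;		/* plain field, written before obj_exts */
	_Atomic uintptr_t obj_exts;	/* 0 means "not set yet" */
};

static struct fake_slab slab;

/* Reader: once it observes obj_exts, it must also observe stride. */
static void *reader(void *arg)
{
	unsigned int *seen = arg;

	/* Acquire load: pairs with the release store in the writer. */
	while (!atomic_load_explicit(&slab.obj_exts, memory_order_acquire))
		;

	*seen = slab.stride;
	return NULL;
}

/* Writer side; returns the stride the reader observed (0 on error). */
unsigned int ordered_pair_demo(void)
{
	pthread_t tid;
	unsigned int seen = 0;

	slab.stride = 0;
	atomic_store_explicit(&slab.obj_exts, 0, memory_order_relaxed);
	if (pthread_create(&tid, NULL, reader, &seen))
		return 0;

	slab.stride = 16;	/* written first */
	/* Release store: stride is ordered before obj_exts becomes visible. */
	atomic_store_explicit(&slab.obj_exts, (uintptr_t)0x1000,
			      memory_order_release);

	pthread_join(tid, NULL);
	return seen;
}
```

The cost mentioned above is visible here: every reader of obj_exts pays
for an acquire load, and every writer for a release store.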

While obj_exts can still be set concurrently (via cmpxchg()), if we set
stride very early during slab initialization, then by the time an object
is allocated or freed on another CPU, that CPU must observe a valid
stride, no? (In a similar way, we always expect slab->slab_cache to be
visible after slab initialization.)

Then, the ordering between those variables doesn't really matter?
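
A rough userspace sketch of that idea, under a loudly stated assumption:
whatever makes the slab reachable from another CPU is modeled here as a
release store / acquire load on a hypothetical pointer named published.
Whether the real paths provide that ordering is exactly the open question
in this thread. struct fake_slab, install_obj_exts() and
early_stride_demo() are likewise made up for illustration.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

struct fake_slab {
	unsigned int stride;		/* set once, during initialization */
	_Atomic uintptr_t obj_exts;	/* installed later, possibly concurrently */
};

static struct fake_slab the_slab;
static _Atomic(struct fake_slab *) published;	/* hypothetical publication point */

/* Install obj_exts only if nobody has done so yet; returns 1 if we won. */
static int install_obj_exts(struct fake_slab *s, uintptr_t vec)
{
	uintptr_t expected = 0;

	return atomic_compare_exchange_strong(&s->obj_exts, &expected, vec);
}

/* The "other CPU": reaches the slab, races on obj_exts, then reads stride. */
static void *other_cpu(void *arg)
{
	unsigned int *seen = arg;
	struct fake_slab *s;

	while (!(s = atomic_load_explicit(&published, memory_order_acquire)))
		;

	install_obj_exts(s, 0x2000);	/* may win or lose the race */
	*seen = s->stride;		/* valid either way: set before publication */
	return NULL;
}

/* Returns the stride the other thread observed (0 on error). */
unsigned int early_stride_demo(void)
{
	pthread_t tid;
	unsigned int seen = 0;

	atomic_store_explicit(&published, NULL, memory_order_relaxed);
	if (pthread_create(&tid, NULL, other_cpu, &seen))
		return 0;

	/* "Slab initialization": stride is set before the slab is reachable. */
	the_slab.stride = 16;
	atomic_store_explicit(&the_slab.obj_exts, 0, memory_order_relaxed);
	atomic_store_explicit(&published, &the_slab, memory_order_release);

	install_obj_exts(&the_slab, 0x1000);	/* concurrent with other_cpu() */

	pthread_join(tid, NULL);
	return seen;
}
```

In this sketch the reader never orders its stride load against obj_exts;
it relies only on stride having been written before the slab became
reachable.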

> If this understanding is correct, then even if the slab API caller inserts a
> memory barrier between alloc and free, or uses a spinlock (or any statement
> that provides an equivalent memory-barrier effect), it would only ensure that
> the writes to the pair {stride, obj_exts} as a whole happen-before the reads
> of {stride, obj_exts} as a whole. However, it still wouldn't be able to
> guarantee the ordering between the two variables: stride and obj_exts.

-- 
Cheers,
Harry / Hyeonggon
