All of lore.kernel.org
 help / color / mirror / Atom feed
From: Harry Yoo <harry.yoo@oracle.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Suren Baghdasaryan <surenb@google.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Christoph Lameter <cl@linux.com>,
	David Rientjes <rientjes@google.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Uladzislau Rezki <urezki@gmail.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	rcu@vger.kernel.org, maple-tree@lists.infradead.org
Subject: Re: [PATCH v4 2/9] slab: add sheaf support for batching kfree_rcu() operations
Date: Tue, 29 Apr 2025 16:36:14 +0900	[thread overview]
Message-ID: <aBCBblup5P1F8SPE@harry> (raw)
In-Reply-To: <20250425-slub-percpu-caches-v4-2-8a636982b4a4@suse.cz>

On Fri, Apr 25, 2025 at 10:27:22AM +0200, Vlastimil Babka wrote:
> Extend the sheaf infrastructure for more efficient kfree_rcu() handling.
> For caches with sheaves, on each cpu maintain a rcu_free sheaf in
> addition to main and spare sheaves.
> 
> kfree_rcu() operations will try to put objects on this sheaf. Once full,
> the sheaf is detached and submitted to call_rcu() with a handler that
> will try to put it in the barn, or flush to slab pages using bulk free,
> when the barn is full. Then a new empty sheaf must be obtained to put
> more objects there.
> 
> It's possible that no free sheaves are available to use for a new
> rcu_free sheaf, and the allocation in kfree_rcu() context can only use
> GFP_NOWAIT and thus may fail. In that case, fall back to the existing
> kfree_rcu() implementation.
> 
> Expected advantages:
> - batching the kfree_rcu() operations, that could eventually replace the
>   existing batching
> - sheaves can be reused for allocations via barn instead of being
>   flushed to slabs, which is more efficient
>   - this includes cases where only some cpus are allowed to process rcu
>     callbacks (Android)
> 
> Possible disadvantage:
> - objects might be waiting for more than their grace period (it is
>   determined by the last object freed into the sheaf), increasing memory
>   usage - but the existing batching does that too.
> 
> Only implement this for CONFIG_KVFREE_RCU_BATCHED as the tiny
> implementation favors smaller memory footprint over performance.
> 
> Add CONFIG_SLUB_STATS counters free_rcu_sheaf and free_rcu_sheaf_fail to
> count how many kfree_rcu() used the rcu_free sheaf successfully and how
> many had to fall back to the existing implementation.
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---

Looks good to me,
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>

with a few nits:

>  mm/slab.h        |   3 +
>  mm/slab_common.c |  24 ++++++++
>  mm/slub.c        | 183 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  3 files changed, 208 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/slab.h b/mm/slab.h
> index 1980330c2fcb4a4613a7e4f7efc78b349993fd89..ddf1e4bcba734dccbf67e83bdbab3ca7272f540e 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -459,6 +459,9 @@ static inline bool is_kmalloc_normal(struct kmem_cache *s)
>  	return !(s->flags & (SLAB_CACHE_DMA|SLAB_ACCOUNT|SLAB_RECLAIM_ACCOUNT));
>  }
>  
> +bool __kfree_rcu_sheaf(struct kmem_cache *s, void *obj);

> +/* Legal flag mask for kmem_cache_create(), for various configurations */

nit: I think now this line should be removed?

>  #define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | \
>  			 SLAB_CACHE_DMA32 | SLAB_PANIC | \
>  			 SLAB_TYPESAFE_BY_RCU | SLAB_DEBUG_OBJECTS | \
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 4f295bdd2d42355af6311a799955301005f8a532..6c3b90f03cb79b57f426824450f576a977d85c53 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> diff --git a/mm/slub.c b/mm/slub.c
> index ae3e80ad9926ca15601eef2f2aa016ca059498f8..6f31a27b5d47fa6621fa8af6d6842564077d4b60 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -5304,6 +5340,140 @@ bool free_to_pcs(struct kmem_cache *s, void *object)
>  	return true;
>  }
>  
> +bool __kfree_rcu_sheaf(struct kmem_cache *s, void *obj)
> +{
> +	struct slub_percpu_sheaves *pcs;
> +	struct slab_sheaf *rcu_sheaf;
> +
> +	if (!local_trylock(&s->cpu_sheaves->lock))
> +		goto fail;
> +
> +	pcs = this_cpu_ptr(s->cpu_sheaves);
> +
> +	if (unlikely(!pcs->rcu_free)) {
> +
> +		struct slab_sheaf *empty;

nit: should we grab the spare sheaf here if it's empty?

> +
> +		empty = barn_get_empty_sheaf(pcs->barn);
> +
> +		if (empty) {
> +			pcs->rcu_free = empty;
> +			goto do_free;
> +		}
> +
> +		local_unlock(&s->cpu_sheaves->lock);
> +
> +		empty = alloc_empty_sheaf(s, GFP_NOWAIT);
> +
> +		if (!empty)
> +			goto fail;
> +
>  /*
>   * Bulk free objects to the percpu sheaves.
>   * Unlike free_to_pcs() this includes the calls to all necessary hooks

-- 
Cheers,
Harry / Hyeonggon

  reply	other threads:[~2025-04-29  7:36 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-25  8:27 [PATCH v4 0/9] SLUB percpu sheaves Vlastimil Babka
2025-04-25  8:27 ` [PATCH v4 1/9] slab: add opt-in caching layer of " Vlastimil Babka
2025-04-25 17:31   ` Christoph Lameter (Ampere)
2025-04-28  7:01     ` Vlastimil Babka
2025-05-06 17:32       ` Suren Baghdasaryan
2025-05-06 23:11         ` Suren Baghdasaryan
2025-04-29  1:08   ` Harry Yoo
2025-05-13 16:08     ` Vlastimil Babka
2025-05-06 23:14   ` Suren Baghdasaryan
2025-05-14 13:06     ` Vlastimil Babka
2025-04-25  8:27 ` [PATCH v4 2/9] slab: add sheaf support for batching kfree_rcu() operations Vlastimil Babka
2025-04-29  7:36   ` Harry Yoo [this message]
2025-05-14 13:07     ` Vlastimil Babka
2025-05-06 21:34   ` Suren Baghdasaryan
2025-05-14 14:01     ` Vlastimil Babka
2025-05-15  8:45       ` Vlastimil Babka
2025-05-15 15:03         ` Suren Baghdasaryan
2025-04-25  8:27 ` [PATCH v4 3/9] slab: sheaf prefilling for guaranteed allocations Vlastimil Babka
2025-05-06 22:54   ` Suren Baghdasaryan
2025-05-07  9:15   ` Harry Yoo
2025-05-07  9:20     ` Harry Yoo
2025-05-15  8:41     ` Vlastimil Babka
2025-04-25  8:27 ` [PATCH v4 4/9] slab: determine barn status racily outside of lock Vlastimil Babka
2025-04-25  8:27 ` [PATCH v4 5/9] tools: Add testing support for changes to rcu and slab for sheaves Vlastimil Babka
2025-04-25  8:27 ` [PATCH v4 6/9] tools: Add sheaves support to testing infrastructure Vlastimil Babka
2025-04-25  8:27 ` [PATCH v4 7/9] maple_tree: use percpu sheaves for maple_node_cache Vlastimil Babka
2025-04-25  8:27 ` [PATCH v4 8/9] mm, vma: use percpu sheaves for vm_area_struct cache Vlastimil Babka
2025-05-06 23:08   ` Suren Baghdasaryan
2025-04-25  8:27 ` [PATCH v4 9/9] mm, slub: skip percpu sheaves for remote object freeing Vlastimil Babka
2025-04-25 17:35   ` Christoph Lameter (Ampere)
2025-04-28  7:08     ` Vlastimil Babka
2025-05-07 10:39   ` Harry Yoo
2025-05-15  8:59     ` Vlastimil Babka
2025-05-15 12:46 ` [PATCH v4 0/9] SLUB percpu sheaves Vlastimil Babka
2025-05-15 15:01   ` Suren Baghdasaryan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aBCBblup5P1F8SPE@harry \
    --to=harry.yoo@oracle.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=cl@linux.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=maple-tree@lists.infradead.org \
    --cc=rcu@vger.kernel.org \
    --cc=rientjes@google.com \
    --cc=roman.gushchin@linux.dev \
    --cc=surenb@google.com \
    --cc=urezki@gmail.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.