From: Uladzislau Rezki <urezki@gmail.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Suren Baghdasaryan <surenb@google.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Christoph Lameter <cl@linux.com>,
David Rientjes <rientjes@google.com>,
Pekka Enberg <penberg@kernel.org>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Roman Gushchin <roman.gushchin@linux.dev>,
Hyeonggon Yoo <42.hyeyoo@gmail.com>,
"Paul E. McKenney" <paulmck@kernel.org>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Matthew Wilcox <willy@infradead.org>,
Boqun Feng <boqun.feng@gmail.com>,
Uladzislau Rezki <urezki@gmail.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
rcu@vger.kernel.org, maple-tree@lists.infradead.org
Subject: Re: [PATCH RFC 2/6] mm/slub: add sheaf support for batching kfree_rcu() operations
Date: Thu, 14 Nov 2024 17:57:42 +0100
Message-ID: <ZzYsBu_rJWSAcAYf@pc636>
In-Reply-To: <20241112-slub-percpu-caches-v1-2-ddc0bdc27e05@suse.cz>
On Tue, Nov 12, 2024 at 05:38:46PM +0100, Vlastimil Babka wrote:
> Extend the sheaf infrastructure for more efficient kfree_rcu() handling.
> For caches where sheafs are initialized, on each cpu maintain a rcu_free
> sheaf in addition to main and spare sheaves.
>
> kfree_rcu() operations will try to put objects on this sheaf. Once full,
> the sheaf is detached and submitted to call_rcu() with a handler that
> will try to put it on the barn, or flush to slab pages using bulk free,
> when the barn is full. Then a new empty sheaf must be obtained to put
> more objects there.
>
> It's possible that no free sheafs are available to use for a new
> rcu_free sheaf, and the allocation in kfree_rcu() context can only use
> GFP_NOWAIT and thus may fail. In that case, fall back to the existing
> kfree_rcu() machinery.
>
> Because some intended users will need to perform additional cleanups
> after the grace period and thus have custom call_rcu() callbacks today,
> add the possibility to specify a kfree_rcu() specific destructor.
> Because of the fall back possibility, the destructor now needs to be
> invoked also from within RCU, so add __kvfree_rcu() that RCU can use
> instead of kvfree().
>
> Expected advantages:
> - batching the kfree_rcu() operations, that could eventually replace the
> batching done in RCU itself
> - sheafs can be reused via barn instead of being flushed to slabs, which
> is more effective
> - this includes cases where only some cpus are allowed to process rcu
> callbacks (Android)
>
> Possible disadvantage:
> - objects might be waiting for more than their grace period (it is
> determined by the last object freed into the sheaf), increasing memory
> usage - but that might be true for the batching done by RCU as well?
>
> RFC LIMITATIONS:
> - only tree rcu is converted, not tiny
> - the rcu fallback might resort to kfree_bulk(), not kvfree(). Instead
> of adding a variant of kfree_bulk() with destructors, is there an easy
> way to disable the kfree_bulk() path in the fallback case?
>
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
> include/linux/slab.h | 15 +++++
> kernel/rcu/tree.c | 8 ++-
> mm/slab.h | 25 +++++++
> mm/slab_common.c | 3 +
> mm/slub.c | 182 +++++++++++++++++++++++++++++++++++++++++++++++++--
> 5 files changed, 227 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index b13fb1c1f03c14a5b45bc6a64a2096883aef9f83..23904321992ad2eeb9389d0883cf4d5d5d71d896 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -343,6 +343,21 @@ struct kmem_cache_args {
> * %0 means no sheaves will be created
> */
> unsigned int sheaf_capacity;
> + /**
> + * @sheaf_rcu_dtor: A destructor for objects freed by kfree_rcu()
> + *
> + * Only valid when non-zero @sheaf_capacity is specified. When freeing
> + * objects by kfree_rcu() in a cache with sheaves, the objects are put
> + * to a special percpu sheaf. When that sheaf is full, it's passed to
> + * call_rcu() and after a grace period the sheaf can be reused for new
> + * allocations. In case a cleanup is necessary after the grace period
> + * and before reusal, a pointer to such function can be given as
> + * @sheaf_rcu_dtor and will be called on each object in the rcu sheaf
> + * after the grace period passes and before the sheaf's reuse.
> + *
> + * %NULL means no destructor is called.
> + */
> + void (*sheaf_rcu_dtor)(void *obj);
> };
>
> struct kmem_cache *__kmem_cache_create_args(const char *name,
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index b1f883fcd9185a5e22c10102d1024c40688f57fb..42c994fdf9960bfed8d8bd697de90af72c1f4f58 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -65,6 +65,7 @@
> #include <linux/kasan.h>
> #include <linux/context_tracking.h>
> #include "../time/tick-internal.h"
> +#include "../../mm/slab.h"
>
> #include "tree.h"
> #include "rcu.h"
> @@ -3420,7 +3421,7 @@ kvfree_rcu_list(struct rcu_head *head)
> trace_rcu_invoke_kvfree_callback(rcu_state.name, head, offset);
>
> if (!WARN_ON_ONCE(!__is_kvfree_rcu_offset(offset)))
> - kvfree(ptr);
> + __kvfree_rcu(ptr);
>
> rcu_lock_release(&rcu_callback_map);
> cond_resched_tasks_rcu_qs();
> @@ -3797,6 +3798,9 @@ void kvfree_call_rcu(struct rcu_head *head, void *ptr)
> if (!head)
> might_sleep();
>
> + if (kfree_rcu_sheaf(ptr))
> + return;
> +
>
This change sidesteps all the effort that has gone into improving
kvfree_rcu() :) For example:

- performance and app launch improvements for Android devices;
- memory consumption optimizations to minimize LMK triggering;
- batching to speed up offloading;
- etc.

So we have done a lot of work there. We were thinking about moving all of
that functionality from "kernel/rcu" to "mm/". As a first step I can do
that, i.e. move kvfree_rcu() as is. After that we can move on to the
second step. Does that sound good to you?
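
To make the flow under discussion concrete, here is a rough userspace
model of what the quoted patch description outlines: objects go onto a
per-cpu rcu_free sheaf, a full sheaf is detached and handed off, and a
failed sheaf allocation falls back to the existing path. This is only a
sketch under assumptions, not the patch's code. All model_* names,
SHEAF_CAPACITY and the alloc_fails knob are made up for illustration,
the grace period is pretended to elapse immediately, and the detached
sheaf is simply freed instead of being returned to the barn.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical capacity; in the patch it comes from sheaf_capacity. */
#define SHEAF_CAPACITY 4

struct sheaf {
	size_t size;
	void *objects[SHEAF_CAPACITY];
};

/* Models one cache as seen from a single cpu. */
struct model_cache {
	struct sheaf *rcu_free;       /* the rcu_free sheaf, NULL if none */
	void (*rcu_dtor)(void *obj);  /* models sheaf_rcu_dtor, may be NULL */
	bool alloc_fails;             /* simulate a failing GFP_NOWAIT alloc */
	unsigned int submitted;       /* full sheaves handed off */
	unsigned int fallbacks;       /* objects sent down the old path */
};

/*
 * Stand-in for the call_rcu() handler. The model pretends the grace
 * period has already passed, runs the destructor on each object and
 * frees everything (the real handler would try the barn first).
 */
void model_rcu_free_sheaf(struct model_cache *c, struct sheaf *s)
{
	for (size_t i = 0; i < s->size; i++) {
		if (c->rcu_dtor)
			c->rcu_dtor(s->objects[i]);
		free(s->objects[i]);
	}
	free(s);
	c->submitted++;
}

/* Returns true if the object was queued on the rcu_free sheaf. */
bool model_kfree_rcu_sheaf(struct model_cache *c, void *obj)
{
	if (!c->rcu_free) {
		if (!c->alloc_fails)
			c->rcu_free = calloc(1, sizeof(*c->rcu_free));
		if (!c->rcu_free) {
			c->fallbacks++;
			return false;
		}
	}

	assert(c->rcu_free->size < SHEAF_CAPACITY);
	c->rcu_free->objects[c->rcu_free->size++] = obj;

	/* Once full, detach the sheaf and submit it as one batch. */
	if (c->rcu_free->size == SHEAF_CAPACITY) {
		struct sheaf *full = c->rcu_free;

		c->rcu_free = NULL;
		model_rcu_free_sheaf(c, full);
	}
	return true;
}

/* Mirrors the shape of the hunk: try the sheaf, else the old path. */
void model_kvfree_call_rcu(struct model_cache *c, void *obj)
{
	if (model_kfree_rcu_sheaf(c, obj))
		return;

	/* Stand-in for the existing machinery; the dtor still runs. */
	if (c->rcu_dtor)
		c->rcu_dtor(obj);
	free(obj);
}
```

In this model the early return in model_kvfree_call_rcu() is exactly
where the existing batching is skipped for sheaf-enabled caches, which
is the concern raised above.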
--
Uladzislau Rezki