All of lore.kernel.org
 help / color / mirror / Atom feed
From: Uladzislau Rezki <urezki@gmail.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: paulmck@kernel.org, Uladzislau Rezki <urezki@gmail.com>,
	linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	RCU <rcu@vger.kernel.org>, LKML <linux-kernel@vger.kernel.org>,
	Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Hyeonggon Yoo <42.hyeyoo@gmail.com>,
	Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>
Subject: Re: [PATCH v2 0/5] Move kvfree_rcu() into SLAB (v2)
Date: Tue, 21 Jan 2025 14:33:00 +0100	[thread overview]
Message-ID: <Z4-iDAONPyf7ljBU@pc636> (raw)
In-Reply-To: <55931fdd-1d5f-4ffd-8496-fe436171dee2@suse.cz>

On Mon, Jan 20, 2025 at 11:06:13PM +0100, Vlastimil Babka wrote:
> On 12/16/24 17:46, Paul E. McKenney wrote:
> > On Mon, Dec 16, 2024 at 04:55:06PM +0100, Uladzislau Rezki wrote:
> >> On Mon, Dec 16, 2024 at 04:44:41PM +0100, Vlastimil Babka wrote:
> >> > On 12/16/24 16:41, Uladzislau Rezki wrote:
> >> > > On Mon, Dec 16, 2024 at 03:20:44PM +0100, Vlastimil Babka wrote:
> >> > >> On 12/16/24 12:03, Uladzislau Rezki wrote:
> >> > >> > On Sun, Dec 15, 2024 at 06:30:02PM +0100, Vlastimil Babka wrote:
> >> > >> > 
> >> > >> >> Also how about a followup patch moving the rcu-tiny implementation of
> >> > >> >> kvfree_call_rcu()?
> >> > >> >> 
> >> > >> > As, Paul already noted, it would make sense. Or just remove a tiny
> >> > >> > implementation.
> >> > >> 
> >> > >> AFAICS tiny rcu is for !SMP systems. Do they benefit from the "full"
> >> > >> implementation with all the batching etc or would that be unnecessary overhead?
> >> > >> 
> >> > > Yes, it is for a really small systems with low amount of memory. I see
> >> > > only one overhead it is about driving objects in pages. For a small
> >> > > system it can be critical because we allocate.
> >> > > 
> >> > > From the other hand, for a tiny variant we can modify the normal variant
> >> > > by bypassing batching logic, thus do not consume memory(for Tiny case)
> >> > > i.e. merge it to a normal kvfree_rcu() path.
> >> > 
> >> > Maybe we could change it to use CONFIG_SLUB_TINY as that has similar use
> >> > case (less memory usage on low memory system, tradeoff for worse performance).
> >> > 
> >> Yep, i also was thinking about that without saying it :)
> > 
> > Works for me as well!
> 
> Hi, so I tried looking at this. First I just moved the code to slab as seen
> in the top-most commit here [1]. Hope the non-inlined __kvfree_call_rcu() is
> not a show-stopper here.
> 
> Then I wanted to switch the #ifdefs from CONFIG_TINY_RCU to CONFIG_SLUB_TINY
> to control whether we use the full blown batching implementation or the
> simple call_rcu() implmentation, and realized it's not straightforward and
> reveals there are still some subtle dependencies of kvfree_rcu() on RCU
> internals :)
> 
> Problem 1: !CONFIG_SLUB_TINY with CONFIG_TINY_RCU
> 
> AFAICS the batching implementation includes kfree_rcu_scheduler_running()
> which is called from rcu_set_runtime_mode() but only on TREE_RCU. Perhaps
> there are other facilities the batching implementation needs that only
> exists in the TREE_RCU implementation
> 
> Possible solution: batching implementation depends on both !CONFIG_SLUB_TINY
> and !CONFIG_TINY_RCU. I think it makes sense as both !SMP systems and small
> memory systems are fine with the simple implementation.
> 
> Problem 2: CONFIG_TREE_RCU with !CONFIG_SLUB_TINY
> 
> AFAICS I can't just make the simple implementation do call_rcu() on
> CONFIG_TREE_RCU, because call_rcu() no longer knows how to handle the fake
> callback (__is_kvfree_rcu_offset()) - I see how rcu_reclaim_tiny() does that
> but no such equivalent exists in TREE_RCU. Am I right?
> 
> Possible solution: teach TREE_RCU callback invocation to handle
> __is_kvfree_rcu_offset() again, perhaps hide that branch behind #ifndef
> CONFIG_SLUB_TINY to avoid overhead if the batching implementation is used.
> Downside: we visibly demonstrate how kvfree_rcu() is not purely a slab thing
> but RCU has to special case it still.
> 
> Possible solution 2: instead of the special offset handling, SLUB provides a
> callback function, which will determine pointer to the object from the
> pointer to a middle of it without knowing the rcu_head offset.
> Downside: this will have some overhead, but SLUB_TINY is not meant to be
> performant anyway so we might not care.
> Upside: we can remove __is_kvfree_rcu_offset() from TINY_RCU as well
> 
> Thoughts?
> 
For the call_rcu() and to be able to reclaim over it we need to patch the
tree.c(please note TINY already works):

<snip>
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index b1f883fcd918..ab24229dfa73 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2559,13 +2559,19 @@ static void rcu_do_batch(struct rcu_data *rdp)
                debug_rcu_head_unqueue(rhp);

                rcu_lock_acquire(&rcu_callback_map);
-               trace_rcu_invoke_callback(rcu_state.name, rhp);

                f = rhp->func;
-               debug_rcu_head_callback(rhp);
-               WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
-               f(rhp);

+               if (__is_kvfree_rcu_offset((unsigned long) f)) {
+                       trace_rcu_invoke_kvfree_callback("", rhp, (unsigned long) f);
+                       kvfree((void *) rhp - (unsigned long) f);
+               } else {
+                       trace_rcu_invoke_callback(rcu_state.name, rhp);
+                       debug_rcu_head_callback(rhp);
+                       WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
+                       f(rhp);
+               }
                rcu_lock_release(&rcu_callback_map);

                /*
<snip>

Mixing up CONFIG_SLUB_TINY with CONFIG_TINY_RCU in the slab_common.c
should be avoided, i.e. if we can, we should eliminate a dependency on
TREE_RCU or TINY_RCU in a slab. As much as possible.

So, it requires a more closer look for sure :)

--
Uladzislau Rezki

  reply	other threads:[~2025-01-21 13:33 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-12 18:02 [PATCH v2 0/5] Move kvfree_rcu() into SLAB (v2) Uladzislau Rezki (Sony)
2024-12-12 18:02 ` [PATCH v2 1/5] rcu/kvfree: Initialize kvfree_rcu() separately Uladzislau Rezki (Sony)
2024-12-12 18:02 ` [PATCH v2 2/5] rcu/kvfree: Move some functions under CONFIG_TINY_RCU Uladzislau Rezki (Sony)
2024-12-12 18:02 ` [PATCH v2 3/5] rcu/kvfree: Adjust names passed into trace functions Uladzislau Rezki (Sony)
2024-12-12 18:02 ` [PATCH v2 4/5] rcu/kvfree: Adjust a shrinker name Uladzislau Rezki (Sony)
2024-12-12 18:02 ` [PATCH v2 5/5] mm/slab: Move kvfree_rcu() into SLAB Uladzislau Rezki (Sony)
2024-12-12 18:30 ` [PATCH v2 0/5] Move kvfree_rcu() into SLAB (v2) Christoph Lameter (Ampere)
2024-12-12 19:08   ` Uladzislau Rezki
2024-12-12 19:10   ` Paul E. McKenney
2024-12-12 19:13     ` Uladzislau Rezki
2024-12-15 17:30 ` Vlastimil Babka
2024-12-15 18:21   ` Paul E. McKenney
2024-12-16 11:03   ` Uladzislau Rezki
2024-12-16 14:20     ` Vlastimil Babka
2024-12-16 15:41       ` Uladzislau Rezki
2024-12-16 15:44         ` Vlastimil Babka
2024-12-16 15:55           ` Uladzislau Rezki
2024-12-16 16:46             ` Paul E. McKenney
2025-01-20 22:06               ` Vlastimil Babka
2025-01-21 13:33                 ` Uladzislau Rezki [this message]
2025-01-21 13:49                   ` Vlastimil Babka
2025-01-21 14:14                     ` Uladzislau Rezki
2025-01-21 20:32                       ` Paul E. McKenney
2025-01-22 15:04                         ` Joel Fernandes
2025-01-22 16:43                           ` Vlastimil Babka
2025-01-22 16:47                             ` Joel Fernandes
2025-01-22 17:42                             ` Uladzislau Rezki
2024-12-16 13:07   ` Uladzislau Rezki
2025-01-11 19:40     ` Vlastimil Babka
2025-01-06  7:21 ` [External Mail] " Hyeonggon Yoo
2025-01-11 19:43   ` Vlastimil Babka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z4-iDAONPyf7ljBU@pc636 \
    --to=urezki@gmail.com \
    --cc=42.hyeyoo@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=oleksiy.avramchenko@sony.com \
    --cc=paulmck@kernel.org \
    --cc=penberg@kernel.org \
    --cc=rcu@vger.kernel.org \
    --cc=rientjes@google.com \
    --cc=roman.gushchin@linux.dev \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.