All of lore.kernel.org
 help / color / mirror / Atom feed
From: Uladzislau Rezki <urezki@gmail.com>
To: Vlastimil Babka <vbabka@suse.cz>, paulmck@kernel.org
Cc: Uladzislau Rezki <urezki@gmail.com>,
	paulmck@kernel.org, linux-mm@kvack.org,
	Andrew Morton <akpm@linux-foundation.org>,
	RCU <rcu@vger.kernel.org>, LKML <linux-kernel@vger.kernel.org>,
	Christoph Lameter <cl@linux.com>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Hyeonggon Yoo <42.hyeyoo@gmail.com>,
	Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>
Subject: Re: [PATCH v2 0/5] Move kvfree_rcu() into SLAB (v2)
Date: Tue, 21 Jan 2025 15:14:16 +0100	[thread overview]
Message-ID: <Z4-ruJ3KVD3PieZi@pc636> (raw)
In-Reply-To: <970317a9-0283-4eec-94ae-63056659d7de@suse.cz>

On Tue, Jan 21, 2025 at 02:49:13PM +0100, Vlastimil Babka wrote:
> On 1/21/25 2:33 PM, Uladzislau Rezki wrote:
> > On Mon, Jan 20, 2025 at 11:06:13PM +0100, Vlastimil Babka wrote:
> >> On 12/16/24 17:46, Paul E. McKenney wrote:
> >>> On Mon, Dec 16, 2024 at 04:55:06PM +0100, Uladzislau Rezki wrote:
> >>>> On Mon, Dec 16, 2024 at 04:44:41PM +0100, Vlastimil Babka wrote:
> >>>>> On 12/16/24 16:41, Uladzislau Rezki wrote:
> >>>>>> On Mon, Dec 16, 2024 at 03:20:44PM +0100, Vlastimil Babka wrote:
> >>>>>>> On 12/16/24 12:03, Uladzislau Rezki wrote:
> >>>>>>>> On Sun, Dec 15, 2024 at 06:30:02PM +0100, Vlastimil Babka wrote:
> >>>>>>>>
> >>>>>>>>> Also how about a followup patch moving the rcu-tiny implementation of
> >>>>>>>>> kvfree_call_rcu()?
> >>>>>>>>>
> >>>>>>>> As, Paul already noted, it would make sense. Or just remove a tiny
> >>>>>>>> implementation.
> >>>>>>>
> >>>>>>> AFAICS tiny rcu is for !SMP systems. Do they benefit from the "full"
> >>>>>>> implementation with all the batching etc or would that be unnecessary overhead?
> >>>>>>>
> >>>>>> Yes, it is for a really small systems with low amount of memory. I see
> >>>>>> only one overhead it is about driving objects in pages. For a small
> >>>>>> system it can be critical because we allocate.
> >>>>>>
> >>>>>> From the other hand, for a tiny variant we can modify the normal variant
> >>>>>> by bypassing batching logic, thus do not consume memory(for Tiny case)
> >>>>>> i.e. merge it to a normal kvfree_rcu() path.
> >>>>>
> >>>>> Maybe we could change it to use CONFIG_SLUB_TINY as that has similar use
> >>>>> case (less memory usage on low memory system, tradeoff for worse performance).
> >>>>>
> >>>> Yep, i also was thinking about that without saying it :)
> >>>
> >>> Works for me as well!
> >>
> >> Hi, so I tried looking at this. First I just moved the code to slab as seen
> >> in the top-most commit here [1]. Hope the non-inlined __kvfree_call_rcu() is
> >> not a show-stopper here.
> >>
> >> Then I wanted to switch the #ifdefs from CONFIG_TINY_RCU to CONFIG_SLUB_TINY
> >> to control whether we use the full blown batching implementation or the
> >> simple call_rcu() implmentation, and realized it's not straightforward and
> >> reveals there are still some subtle dependencies of kvfree_rcu() on RCU
> >> internals :)
> >>
> >> Problem 1: !CONFIG_SLUB_TINY with CONFIG_TINY_RCU
> >>
> >> AFAICS the batching implementation includes kfree_rcu_scheduler_running()
> >> which is called from rcu_set_runtime_mode() but only on TREE_RCU. Perhaps
> >> there are other facilities the batching implementation needs that only
> >> exists in the TREE_RCU implementation
> >>
> >> Possible solution: batching implementation depends on both !CONFIG_SLUB_TINY
> >> and !CONFIG_TINY_RCU. I think it makes sense as both !SMP systems and small
> >> memory systems are fine with the simple implementation.
> >>
> >> Problem 2: CONFIG_TREE_RCU with !CONFIG_SLUB_TINY
> >>
> >> AFAICS I can't just make the simple implementation do call_rcu() on
> >> CONFIG_TREE_RCU, because call_rcu() no longer knows how to handle the fake
> >> callback (__is_kvfree_rcu_offset()) - I see how rcu_reclaim_tiny() does that
> >> but no such equivalent exists in TREE_RCU. Am I right?
> >>
> >> Possible solution: teach TREE_RCU callback invocation to handle
> >> __is_kvfree_rcu_offset() again, perhaps hide that branch behind #ifndef
> >> CONFIG_SLUB_TINY to avoid overhead if the batching implementation is used.
> >> Downside: we visibly demonstrate how kvfree_rcu() is not purely a slab thing
> >> but RCU has to special case it still.
> >>
> >> Possible solution 2: instead of the special offset handling, SLUB provides a
> >> callback function, which will determine pointer to the object from the
> >> pointer to a middle of it without knowing the rcu_head offset.
> >> Downside: this will have some overhead, but SLUB_TINY is not meant to be
> >> performant anyway so we might not care.
> >> Upside: we can remove __is_kvfree_rcu_offset() from TINY_RCU as well
> >>
> >> Thoughts?
> >>
> > For the call_rcu() and to be able to reclaim over it we need to patch the
> > tree.c(please note TINY already works):
> > 
> > <snip>
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index b1f883fcd918..ab24229dfa73 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -2559,13 +2559,19 @@ static void rcu_do_batch(struct rcu_data *rdp)
> >                 debug_rcu_head_unqueue(rhp);
> > 
> >                 rcu_lock_acquire(&rcu_callback_map);
> > -               trace_rcu_invoke_callback(rcu_state.name, rhp);
> > 
> >                 f = rhp->func;
> > -               debug_rcu_head_callback(rhp);
> > -               WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
> > -               f(rhp);
> > 
> > +               if (__is_kvfree_rcu_offset((unsigned long) f)) {
> > +                       trace_rcu_invoke_kvfree_callback("", rhp, (unsigned long) f);
> > +                       kvfree((void *) rhp - (unsigned long) f);
> > +               } else {
> > +                       trace_rcu_invoke_callback(rcu_state.name, rhp);
> > +                       debug_rcu_head_callback(rhp);
> > +                       WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
> > +                       f(rhp);
> > +               }
> >                 rcu_lock_release(&rcu_callback_map);
> 
> Right so that's the first Possible solution, but without the #ifdef. So
> there's an overhead of checking __is_kvfree_rcu_offset() even if the
> batching is done in slab and this function is never called with an offset.
>
Or fulfilling a missing functionality? TREE is broken in that sense
whereas a TINY handles it without any issues. 

It can be called for SLUB_TINY option, just call_rcu() instead of
batching layer. And yes, kvfree_rcu_barrier() switches to rcu_barrier().

>
> After coming up with Possible solution 2, I've started liking the idea
> more as RCU could then forget about the __is_kvfree_rcu_offset()
> "callbacks" completely, and the performant case of TREE_RCU + batching
> would be unaffected.
> 
I doubt it is a performance issue :)

>
> I'm speculating perhaps if there was not CONFIG_SLOB in the past, the
> __is_kvfree_rcu_offset() would never exist in the first place? SLAB and
> SLUB both can determine start of the object from a pointer to the middle
> of it, while SLOB couldn't.
> 
We needed just to reclaim over RCU. So, i do not know. Paul probably
knows more then me :)

> >                 /*
> > <snip>
> > 
> > Mixing up CONFIG_SLUB_TINY with CONFIG_TINY_RCU in the slab_common.c
> > should be avoided, i.e. if we can, we should eliminate a dependency on
> > TREE_RCU or TINY_RCU in a slab. As much as possible.
> > 
> > So, it requires a more closer look for sure :)
> 
> That requires solving Problem 1 above, but question is if it's worth the
> trouble. Systems running TINY_RCU are unlikely to benefit from the batching?
> 
> But sure there's also possibility to hide these dependencies in KConfig,
> so the slab code would only consider a single (for example) #ifdef
> CONFIG_KVFREE_RCU_BATCHING that would be set automatically depending on
> TREE_RCU and !SLUB_TINY.
> 
It is for small systems. We can use TINY or !SMP. We covered this AFAIR
that a single CPU system should not go with batching:

#ifdef SLUB_TINY || !SMP || TINY_RCU

or:

config TINY_RCU
	bool
	default y if !PREEMPT_RCU && !SMP
+	select SLUB_TINY


Paul, more input?

--
Uladzislau Rezki

  reply	other threads:[~2025-01-21 14:14 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-12 18:02 [PATCH v2 0/5] Move kvfree_rcu() into SLAB (v2) Uladzislau Rezki (Sony)
2024-12-12 18:02 ` [PATCH v2 1/5] rcu/kvfree: Initialize kvfree_rcu() separately Uladzislau Rezki (Sony)
2024-12-12 18:02 ` [PATCH v2 2/5] rcu/kvfree: Move some functions under CONFIG_TINY_RCU Uladzislau Rezki (Sony)
2024-12-12 18:02 ` [PATCH v2 3/5] rcu/kvfree: Adjust names passed into trace functions Uladzislau Rezki (Sony)
2024-12-12 18:02 ` [PATCH v2 4/5] rcu/kvfree: Adjust a shrinker name Uladzislau Rezki (Sony)
2024-12-12 18:02 ` [PATCH v2 5/5] mm/slab: Move kvfree_rcu() into SLAB Uladzislau Rezki (Sony)
2024-12-12 18:30 ` [PATCH v2 0/5] Move kvfree_rcu() into SLAB (v2) Christoph Lameter (Ampere)
2024-12-12 19:08   ` Uladzislau Rezki
2024-12-12 19:10   ` Paul E. McKenney
2024-12-12 19:13     ` Uladzislau Rezki
2024-12-15 17:30 ` Vlastimil Babka
2024-12-15 18:21   ` Paul E. McKenney
2024-12-16 11:03   ` Uladzislau Rezki
2024-12-16 14:20     ` Vlastimil Babka
2024-12-16 15:41       ` Uladzislau Rezki
2024-12-16 15:44         ` Vlastimil Babka
2024-12-16 15:55           ` Uladzislau Rezki
2024-12-16 16:46             ` Paul E. McKenney
2025-01-20 22:06               ` Vlastimil Babka
2025-01-21 13:33                 ` Uladzislau Rezki
2025-01-21 13:49                   ` Vlastimil Babka
2025-01-21 14:14                     ` Uladzislau Rezki [this message]
2025-01-21 20:32                       ` Paul E. McKenney
2025-01-22 15:04                         ` Joel Fernandes
2025-01-22 16:43                           ` Vlastimil Babka
2025-01-22 16:47                             ` Joel Fernandes
2025-01-22 17:42                             ` Uladzislau Rezki
2024-12-16 13:07   ` Uladzislau Rezki
2025-01-11 19:40     ` Vlastimil Babka
2025-01-06  7:21 ` [External Mail] " Hyeonggon Yoo
2025-01-11 19:43   ` Vlastimil Babka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z4-ruJ3KVD3PieZi@pc636 \
    --to=urezki@gmail.com \
    --cc=42.hyeyoo@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=oleksiy.avramchenko@sony.com \
    --cc=paulmck@kernel.org \
    --cc=penberg@kernel.org \
    --cc=rcu@vger.kernel.org \
    --cc=rientjes@google.com \
    --cc=roman.gushchin@linux.dev \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.