From: Uladzislau Rezki <urezki@gmail.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Uladzislau Rezki <urezki@gmail.com>,
Joel Fernandes <joel@joelfernandes.org>,
linux-kernel@vger.kernel.org, rcu@vger.kernel.org
Subject: Re: [PATCH v2] rcu/kfree: Do not request RCU when not needed
Date: Tue, 15 Nov 2022 13:05:07 +0100 [thread overview]
Message-ID: <Y3OAc+cBK5Dcky8l@pc636> (raw)
In-Reply-To: <20221114161733.GD4001@paulmck-ThinkPad-P17-Gen-1>
On Mon, Nov 14, 2022 at 08:17:33AM -0800, Paul E. McKenney wrote:
> On Mon, Nov 14, 2022 at 01:20:33PM +0100, Uladzislau Rezki wrote:
> > > On Thu, Nov 10, 2022 at 03:01:30PM +0100, Uladzislau Rezki wrote:
> > > > > Hi,
> > > > >
> > > > > On Thu, Nov 10, 2022 at 8:05 AM Uladzislau Rezki <urezki@gmail.com> wrote:
> > > > >
> > > > > > > On ChromeOS, using this with the increased timeout, we see that we
> > > > > > almost always
> > > > > > > never need to initiate a new grace period. Testing also shows this frees
> > > > > > large
> > > > > > > amounts of unreclaimed memory, under intense kfree_rcu() pressure.
> > > > > > >
> > > > > > > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> > > > > > > ---
> > > > > > > v1->v2: Same logic but use polled grace periods instead of sampling
> > > > > > gp_seq.
> > > > > > >
> > > > > > > kernel/rcu/tree.c | 8 +++++++-
> > > > > > > 1 file changed, 7 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > > > > index 591187b6352e..ed41243f7a49 100644
> > > > > > > --- a/kernel/rcu/tree.c
> > > > > > > +++ b/kernel/rcu/tree.c
> > > > > > > @@ -2935,6 +2935,7 @@ struct kfree_rcu_cpu_work {
> > > > > > >
> > > > > > > /**
> > > > > > > * struct kfree_rcu_cpu - batch up kfree_rcu() requests for RCU grace
> > > > > > period
> > > > > > > + * @gp_snap: The GP snapshot recorded at the last scheduling of monitor
> > > > > > work.
> > > > > > > * @head: List of kfree_rcu() objects not yet waiting for a grace period
> > > > > > > * @bkvhead: Bulk-List of kvfree_rcu() objects not yet waiting for a
> > > > > > grace period
> > > > > > > * @krw_arr: Array of batches of kfree_rcu() objects waiting for a
> > > > > > grace period
> > > > > > > @@ -2964,6 +2965,7 @@ struct kfree_rcu_cpu {
> > > > > > > struct kfree_rcu_cpu_work krw_arr[KFREE_N_BATCHES];
> > > > > > > raw_spinlock_t lock;
> > > > > > > struct delayed_work monitor_work;
> > > > > > > + unsigned long gp_snap;
> > > > > > > bool initialized;
> > > > > > > int count;
> > > > > > >
> > > > > > > @@ -3167,6 +3169,7 @@ schedule_delayed_monitor_work(struct kfree_rcu_cpu
> > > > > > *krcp)
> > > > > > > mod_delayed_work(system_wq, &krcp->monitor_work,
> > > > > > delay);
> > > > > > > return;
> > > > > > > }
> > > > > > > + krcp->gp_snap = get_state_synchronize_rcu();
> > > > > > > queue_delayed_work(system_wq, &krcp->monitor_work, delay);
> > > > > > > }
> > > > > > >
> > > > > > How do you guarantee a full grace period for objects which proceed
> > > > > > to be placed into an input stream that is not yet detached?
> > > > >
> > > > >
> > > > > Just replying from phone as I’m OOO today.
> > > > >
> > > > > Hmm, so you’re saying that objects can be queued after the delayed work has
> > > > > been queued, but processed when the delayed work is run? I’m looking at
> > > > > this code after few years so I may have missed something.
> > > > >
> > > > > That’s a good point and I think I missed that. I think your version did too
> > > > > but I’ll have to double check.
> > > > >
> > > > > The fix then is to sample the clock for the latest object queued, not for
> > > > > when the delayed work is queued.
> > > > >
> > > > The patch i sent gurantee it. Just in case see v2:
> > >
> > > You are right and thank you! CBs can be queued while the monitor timer is in
> > > progress. So we need to sample unconditionally. I think my approach is still
> > > better since I take advantage of multiple seconds (I update snapshot on every
> > > CB queue monitor and sample in the monitor handler).
> > >
> > > Whereas your patch is snapshotting before queuing the regular work and when
> > > the work is executed (This is a much shorter duration and I bet you would be
> > > blocking in cond_synchronize..() more often).
> > >
> > There is a performance test that measures a taken time and memory
> > footprint, so you can do a quick comparison. A "rcutorture" can be
> > run with various parameters to figure out if a patch that is in question
> > makes any difference.
> >
> > Usually i run it as:
> >
> > <snip>
> > #! /usr/bin/env bash
> >
> > LOOPS=10
> >
> > for (( i=0; i<$LOOPS; i++ )); do
> > tools/testing/selftests/rcutorture/bin/kvm.sh --memory 10G --torture rcuscale --allcpus --duration 1 \
> > --kconfig CONFIG_NR_CPUS=64 \
> > --kconfig CONFIG_RCU_NOCB_CPU=y \
> > --kconfig CONFIG_RCU_NOCB_CPU_DEFAULT_ALL=y \
> > --kconfig CONFIG_RCU_LAZY=n \
> > --bootargs "rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 rcuscale.holdoff=20 rcuscale.kfree_loops=10000 torture.disable_onoff_at_boot" --trust-make
> > echo "Done $i"
> > done
> > <snip>
> >
> > just run it from your linux sandbox.
>
> Would it make sense to modify the "if test "$do_kvfree" = "yes" code
> in tools/testing/selftests/rcutorture/bin/torture.sh to do something
> like this instead of what it currently does?
>
<snip>
if test "$do_kvfree" = "yes"
then
torture_bootargs="rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 rcuscale.holdoff=20 rcuscale.kfree_loops=10000 torture.disable_onoff_at_boot"
torture_set "rcuscale-kvfree" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 10 --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --memory 2G --trust-make
fi
<snip>
it does almost the same but i add some repeat loops usually to get some
average results.
> Though if so, it would be much faster to use kvm.sh's --buildonly flag
> to build the kernel, then kvm-again.sh to rerun that kernel.
>
I think can be easily run from the bash in while true; do ... done; script
with a desired number of iteration.
But maybe torture.sh already has it. I mean number of iterations.
--
Uladzislau Rezki
next prev parent reply other threads:[~2022-11-15 12:06 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-09 2:47 [PATCH v2] rcu/kfree: Do not request RCU when not needed Joel Fernandes (Google)
2022-11-10 13:05 ` Uladzislau Rezki
[not found] ` <CAEXW_YSq89xzgyQ9Tdt1tCqz8VAfzb7kSXVZmnxDuJ65U0UZ3w@mail.gmail.com>
2022-11-10 14:01 ` Uladzislau Rezki
2022-11-11 1:56 ` Joel Fernandes
2022-11-14 12:20 ` Uladzislau Rezki
2022-11-14 16:17 ` Paul E. McKenney
2022-11-14 20:54 ` Joel Fernandes
2022-11-14 21:26 ` Paul E. McKenney
2022-11-15 12:05 ` Uladzislau Rezki [this message]
2022-11-14 20:49 ` Joel Fernandes
2022-11-15 13:07 ` Uladzislau Rezki
2022-11-16 19:19 ` Uladzislau Rezki
2022-11-16 22:05 ` Joel Fernandes
2022-11-17 12:58 ` Uladzislau Rezki
2022-11-17 13:06 ` Joel Fernandes
2022-11-17 13:11 ` Uladzislau Rezki
2022-11-17 13:23 ` Joel Fernandes
2022-11-17 13:43 ` Uladzislau Rezki
-- strict thread matches above, loose matches on Subject: below --
2022-11-04 14:21 Joel Fernandes (Google)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y3OAc+cBK5Dcky8l@pc636 \
--to=urezki@gmail.com \
--cc=joel@joelfernandes.org \
--cc=linux-kernel@vger.kernel.org \
--cc=paulmck@kernel.org \
--cc=rcu@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.