From: Uladzislau Rezki <urezki@gmail.com>
To: Kalesh Singh <kaleshsingh@google.com>
Cc: Uladzislau Rezki <urezki@gmail.com>,
"Paul E . McKenney" <paulmck@kernel.org>,
RCU <rcu@vger.kernel.org>,
Neeraj upadhyay <Neeraj.Upadhyay@amd.com>,
Boqun Feng <boqun.feng@gmail.com>,
Hillf Danton <hdanton@sina.com>,
Joel Fernandes <joel@joelfernandes.org>,
LKML <linux-kernel@vger.kernel.org>,
Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>,
Frederic Weisbecker <frederic@kernel.org>
Subject: Re: [PATCH v4 1/4] rcu: Reduce synchronize_rcu() latency
Date: Thu, 11 Jan 2024 18:35:30 +0100 [thread overview]
Message-ID: <ZaAm4mVEf8bXnLEp@pc636> (raw)
In-Reply-To: <CAC_TJvfe52GRrjHMfvAorC+PtTBRK0hxL9OPoi0SMqDV3XkFZw@mail.gmail.com>
Hello, Kalesh!
> >
> > >
> > > Hi Uladzislau,
> > >
> > > I've tried your patches (v3) on Android with 6.1.43 kernel.
> > >
> > > The test cycles 10 apps (including camera) sequentially for 100
> > > iterations.
> > >
> > > I've set rcu_normal to override the rcu_expedited in the boot
> > > parameters:
> > >
> > > adb shell cat /proc/cmdline | tr ' ' '\n' | grep rcu
> > >
> > > rcupdate.rcu_normal=1
> > > rcupdate.rcu_expedited=1
> > > rcu_nocbs=0-7
> > >
> > >
> > > The configurations are:
> > >
> > > A - echo 0 >/sys/module/rcutree/parameters/rcu_normal_wake_from_gp
> > > B - echo 1 >/sys/module/rcutree/parameters/rcu_normal_wake_from_gp
> > >
> > > Results:
> > >
> > > = APP LAUNCH TIME =
> > > delta (B-A) ratio(%)
> > > overall_app_launch_time(ms) -11399.00 -6.65
> > >
> > >
> > > == camera_launch_time
> > > type delta(B-A %) A_count B_count
> > > HOT -7.05 99 99
> > > COLD -6.33 1 1
> > >
> > >
>
> Hi Uladzislau,
>
> > If i interpret it correctly you also see that this series reduces
> > a launch time by 6/7% on your app set. Is that correct?
>
> Yes your understanding is correct.
>
> >
> > > === Function Latencies ===
> > >
> > > Tracing synchronize_rcu_expedited. Hit Ctrl-C to exit Tracing synchronize_rcu_expedited. Hit Ctrl-C to exit
> > >
> > > nsec : count distribution nsec : count distribution
> > > 0 -> 1 : 0 | | 0 -> 1 : 0 | |
> > > 2 -> 3 : 0 | | 2 -> 3 : 0 | |
> > > 4 -> 7 : 0 | | 4 -> 7 : 0 | |
> > > 8 -> 15 : 0 | | 8 -> 15 : 0 | |
> > > 16 -> 31 : 0 | | 16 -> 31 : 0 | |
> > > 32 -> 63 : 0 | | 32 -> 63 : 0 | |
> > > 64 -> 127 : 0 | | 64 -> 127 : 0 | |
> > > 128 -> 255 : 0 | | 128 -> 255 : 0 | |
> > > 256 -> 511 : 0 | | 256 -> 511 : 0 | |
> > > 512 -> 1023 : 0 | | 512 -> 1023 : 0 | |
> > > 1024 -> 2047 : 0 | | 1024 -> 2047 : 0 | |
> > > 2048 -> 4095 : 0 | | 2048 -> 4095 : 0 | |
> > > 4096 -> 8191 : 0 | | 4096 -> 8191 : 0 | |
> > > 8192 -> 16383 : 0 | | 8192 -> 16383 : 0 | |
> > > 16384 -> 32767 : 0 | | 16384 -> 32767 : 0 | |
> > > 32768 -> 65535 : 0 | | 32768 -> 65535 : 0 | |
> > > 65536 -> 131071 : 0 | | 65536 -> 131071 : 0 | |
> > > 131072 -> 262143 : 0 | | 131072 -> 262143 : 0 | |
> > > 262144 -> 524287 : 0 | | 262144 -> 524287 : 0 | |
> > > 524288 -> 1048575 : 0 | | 524288 -> 1048575 : 0 | |
> > > 1048576 -> 2097151 : 0 | | 1048576 -> 2097151 : 0 | |
> > > 2097152 -> 4194303 : 0 | | 2097152 -> 4194303 : 0 | |
> > > 4194304 -> 8388607 : 871 |** | 4194304 -> 8388607 : 1180 |**** |
> > > 8388608 -> 16777215 : 3204 |******** | 8388608 -> 16777215 : 7020 |************************* |
> > > 16777216 -> 33554431 : 15013 |****************************************| 16777216 -> 33554431 : 10952 |****************************************|
> > > Exiting trace of synchronize_rcu_expedited Exiting trace of synchronize_rcu_expedited
> > >
> > >
> > > Tracing synchronize_rcu. Hit Ctrl-C to exit Tracing synchronize_rcu. Hit Ctrl-C to exit
> > >
> > > nsec : count distribution nsec : count distribution
> > > 0 -> 1 : 0 | | 0 -> 1 : 0 | |
> > > 2 -> 3 : 0 | | 2 -> 3 : 0 | |
> > > 4 -> 7 : 0 | | 4 -> 7 : 0 | |
> > > 8 -> 15 : 0 | | 8 -> 15 : 0 | |
> > > 16 -> 31 : 0 | | 16 -> 31 : 0 | |
> > > 32 -> 63 : 0 | | 32 -> 63 : 0 | |
> > > 64 -> 127 : 0 | | 64 -> 127 : 0 | |
> > > 128 -> 255 : 0 | | 128 -> 255 : 0 | |
> > > 256 -> 511 : 0 | | 256 -> 511 : 0 | |
> > > 512 -> 1023 : 0 | | 512 -> 1023 : 0 | |
> > > 1024 -> 2047 : 0 | | 1024 -> 2047 : 0 | |
> > > 2048 -> 4095 : 0 | | 2048 -> 4095 : 0 | |
> > > 4096 -> 8191 : 0 | | 4096 -> 8191 : 0 | |
> > > 8192 -> 16383 : 0 | | 8192 -> 16383 : 0 | |
> > > 16384 -> 32767 : 0 | | 16384 -> 32767 : 0 | |
> > > 32768 -> 65535 : 0 | | 32768 -> 65535 : 0 | |
> > > 65536 -> 131071 : 0 | | 65536 -> 131071 : 0 | |
> > > 131072 -> 262143 : 0 | | 131072 -> 262143 : 0 | |
> > > 262144 -> 524287 : 0 | | 262144 -> 524287 : 0 | |
> > > 524288 -> 1048575 : 0 | | 524288 -> 1048575 : 0 | |
> > > 1048576 -> 2097151 : 0 | | 1048576 -> 2097151 : 0 | |
> > > 2097152 -> 4194303 : 0 | | 2097152 -> 4194303 : 0 | |
> > > 4194304 -> 8388607 : 861 |** | 4194304 -> 8388607 : 1136 |**** |
> > > 8388608 -> 16777215 : 3111 |******** | 8388608 -> 16777215 : 6320 |************************ |
> > > 16777216 -> 33554431 : 13901 |****************************************| 16777216 -> 33554431 : 10484 |****************************************|
> > > Exiting trace of synchronize_rcu Exiting trace of synchronize_rcu
> > >
> > Who is B and who is A?
>
> Left is A (rcu_normal_wake_from_gp=0) and right is B
> (rcu_normal_wake_from_gp=1)
> >
> > >
> > > Interestingly I tried the same experiment without rcu_normal=1 (leaving rcu_expedited=1):
> > >
> > > adb shell cat /proc/cmdline | tr ' ' '\n' | grep rcu
> > > rcupdate.rcu_expedited=1
> > > rcu_nocbs=0-7
> > >
> > > In this case I also saw the -6 to -7% decrease in the app launch times
> > > but I don't have a good explanation why that would be? (The fucntion
> > > latency histograms in this case didn't show any significant difference).
> > > Do you have any insight why this may happen?
> > >
> > When rcu_expedited=1 is set and rcu_normal=0 is disabled. The
> > synchronize_rcu() call is converted into synchronize_rcu_expidited():
> >
> > <snip>
> > void synchronize_rcu(void)
> > {
> > unsigned long flags;
> > struct rcu_node *rnp;
> >
> > RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map) ||
> > lock_is_held(&rcu_lock_map) ||
> > lock_is_held(&rcu_sched_lock_map),
> > "Illegal synchronize_rcu() in RCU read-side critical section");
> > if (!rcu_blocking_is_gp()) {
> > if (rcu_gp_is_expedited())
> > synchronize_rcu_expedited();
> > else
> > synchronize_rcu_normal();
> > return;
> > }
> > ...
> > <snip>
> >
> > rcu_gp_is_expidited() is true, so invoke "expedited" version.
> >
> > I see some concerns in preferring an expedited version as a global
> > replacement. First of all it is related to latency sensitive workloads
> > because in order to expedite a grace period it sends out IPIs on all
> > online CPUs to force them to report a quiescent-state asap. I have not
> > investigated yet how it affects such workloads.
> >
> > Therefore, in your case, you also see a performance boost of your app sets.
>
> IIUC the patch shouldn't affect the case? The only difference in A vs
> B is rcu_normal_wake_from_gp (both have rcu_expedited=1).
>
Right. This patch does not touch "expedited" version at all.
Appreciate for test results and looking at!
--
Uladzislau Rezki
next prev parent reply other threads:[~2024-01-11 17:35 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-04 16:25 [PATCH v4 0/4] Reduce synchronize_rcu() latency(v4) Uladzislau Rezki (Sony)
2024-01-04 16:25 ` [PATCH v4 1/4] rcu: Reduce synchronize_rcu() latency Uladzislau Rezki (Sony)
2024-01-09 19:16 ` Kalesh Singh
2024-01-10 9:21 ` Uladzislau Rezki
2024-01-11 16:37 ` Kalesh Singh
2024-01-11 17:35 ` Uladzislau Rezki [this message]
2024-01-12 23:09 ` Frederic Weisbecker
2024-01-18 10:37 ` Uladzislau Rezki
2024-01-16 16:18 ` Paul E. McKenney
2024-01-17 12:26 ` Uladzislau Rezki
2024-01-19 15:24 ` Paul E. McKenney
2024-01-22 17:35 ` Uladzislau Rezki
2024-01-23 11:21 ` Paul E. McKenney
2024-01-04 16:25 ` [PATCH v4 2/4] rcu: Add a trace event for synchronize_rcu_normal() Uladzislau Rezki (Sony)
2024-01-12 23:20 ` Frederic Weisbecker
2024-01-15 12:14 ` Uladzislau Rezki
2024-01-04 16:25 ` [PATCH v4 3/4] rcu: Improve handling of synchronize_rcu() users Uladzislau Rezki (Sony)
2024-01-16 16:32 ` Paul E. McKenney
2024-01-04 16:25 ` [PATCH v4 4/4] rcu: Support direct wake-up " Uladzislau Rezki (Sony)
2024-01-13 9:19 ` Z qiang
2024-01-15 10:46 ` Uladzislau Rezki
2024-01-15 10:57 ` Uladzislau Rezki
2024-01-16 6:19 ` Z qiang
2024-01-27 7:07 ` [PATCH v4 0/4] Reduce synchronize_rcu() latency(v4) Paul E. McKenney
2024-01-29 16:23 ` Uladzislau Rezki
2024-01-29 19:43 ` Paul E. McKenney
2024-01-29 20:36 ` Uladzislau Rezki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZaAm4mVEf8bXnLEp@pc636 \
--to=urezki@gmail.com \
--cc=Neeraj.Upadhyay@amd.com \
--cc=boqun.feng@gmail.com \
--cc=frederic@kernel.org \
--cc=hdanton@sina.com \
--cc=joel@joelfernandes.org \
--cc=kaleshsingh@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=oleksiy.avramchenko@sony.com \
--cc=paulmck@kernel.org \
--cc=rcu@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).