From: Kalesh Singh <kaleshsingh@google.com>
To: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Cc: "Paul E . McKenney" <paulmck@kernel.org>,
RCU <rcu@vger.kernel.org>,
Neeraj upadhyay <Neeraj.Upadhyay@amd.com>,
Boqun Feng <boqun.feng@gmail.com>,
Hillf Danton <hdanton@sina.com>,
Joel Fernandes <joel@joelfernandes.org>,
LKML <linux-kernel@vger.kernel.org>,
Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>,
Frederic Weisbecker <frederic@kernel.org>
Subject: Re: [PATCH v4 1/4] rcu: Reduce synchronize_rcu() latency
Date: Tue, 9 Jan 2024 11:16:27 -0800
Message-ID: <ZZ2bi5iPwXLgjB-f@google.com>
In-Reply-To: <20240104162510.72773-2-urezki@gmail.com>
On Thu, Jan 04, 2024 at 05:25:07PM +0100, Uladzislau Rezki (Sony) wrote:
> A call to synchronize_rcu() can be optimized from a latency
> point of view. Workloads which depend on this can benefit from it.
>
> The delay of wakeme_after_rcu() callback, which unblocks a waiter,
> depends on several factors:
>
> - how fast a process of offloading is started. Combination of:
>     - !CONFIG_RCU_NOCB_CPU/CONFIG_RCU_NOCB_CPU;
>     - !CONFIG_RCU_LAZY/CONFIG_RCU_LAZY;
>     - other.
> - when started, invoking path is interrupted due to:
>     - time limit;
>     - need_resched();
>     - if limit is reached.
> - where in a nocb list it is located;
> - how fast previous callbacks completed;
>
> Example:
>
> 1. On our embedded devices I can easily trigger the scenario when
> it is the last in the list out of ~3600 callbacks:
>
> <snip>
> <...>-29 [001] d..1. 21950.145313: rcu_batch_start: rcu_preempt CBs=3613 bl=28
> ...
> <...>-29 [001] ..... 21950.152578: rcu_invoke_callback: rcu_preempt rhp=00000000b2d6dee8 func=__free_vm_area_struct.cfi_jt
> <...>-29 [001] ..... 21950.152579: rcu_invoke_callback: rcu_preempt rhp=00000000a446f607 func=__free_vm_area_struct.cfi_jt
> <...>-29 [001] ..... 21950.152580: rcu_invoke_callback: rcu_preempt rhp=00000000a5cab03b func=__free_vm_area_struct.cfi_jt
> <...>-29 [001] ..... 21950.152581: rcu_invoke_callback: rcu_preempt rhp=0000000013b7e5ee func=__free_vm_area_struct.cfi_jt
> <...>-29 [001] ..... 21950.152582: rcu_invoke_callback: rcu_preempt rhp=000000000a8ca6f9 func=__free_vm_area_struct.cfi_jt
> <...>-29 [001] ..... 21950.152583: rcu_invoke_callback: rcu_preempt rhp=000000008f162ca8 func=wakeme_after_rcu.cfi_jt
> <...>-29 [001] d..1. 21950.152625: rcu_batch_end: rcu_preempt CBs-invoked=3612 idle=....
> <snip>
>
> 2. We use cpuset/cgroup to classify tasks and assign them into
> different cgroups. For example "background" group which binds tasks
> only to little CPUs or "foreground" which makes use of all CPUs.
> Tasks can be migrated between groups by a request if an acceleration
> is needed.
>
> See below an example of how the "surfaceflinger" task gets migrated.
> Initially it is located in the "system-background" cgroup which
> allows it to run only on little cores. In order to speed it up it
> can be temporarily moved into the "foreground" cgroup which allows
> it to use big/all CPUs:
>
> cgroup_attach_task():
>  -> cgroup_migrate_execute()
>    -> cpuset_can_attach()
>      -> percpu_down_write()
>        -> rcu_sync_enter()
>          -> synchronize_rcu()
>    -> now move tasks to the new cgroup.
>  -> cgroup_migrate_finish()
>
> <snip>
> rcuop/1-29 [000] ..... 7030.528570: rcu_invoke_callback: rcu_preempt rhp=00000000461605e0 func=wakeme_after_rcu.cfi_jt
> PERFD-SERVER-1855 [000] d..1. 7030.530293: cgroup_attach_task: dst_root=3 dst_id=22 dst_level=1 dst_path=/foreground pid=1900 comm=surfaceflinger
> TimerDispatch-2768 [002] d..5. 7030.537542: sched_migrate_task: comm=surfaceflinger pid=1900 prio=98 orig_cpu=0 dest_cpu=4
> <snip>
>
> "Boosting a task" depends on synchronize_rcu() latency:
>
> - first trace shows a completion of synchronize_rcu();
> - second shows attaching a task to a new group;
> - last shows a final step when migration occurs.
>
> 3. To address this drawback, maintain a separate track that consists
> of synchronize_rcu() callers only. After completion of a grace period
> users are deferred to a dedicated worker to process requests.
>
> 4. This patch reduces the latency of synchronize_rcu() approximately
> by ~30-40% on synthetic tests. The real test case, camera launch time,
> shows(time is in milliseconds):
>
> 1-run 542 vs 489 improvement 9%
> 2-run 540 vs 466 improvement 13%
> 3-run 518 vs 468 improvement 9%
> 4-run 531 vs 457 improvement 13%
> 5-run 548 vs 475 improvement 13%
> 6-run 509 vs 484 improvement 4%
>
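For reference, the percentages in the quoted camera-launch table are consistent with a plain relative-improvement calculation. A minimal sketch, assuming the two columns are "before" and "after" launch times in milliseconds (the helper name improvement_pct is hypothetical, not something from the patch):

```c
#include <assert.h>

/*
 * Relative improvement, in percent, of after_ms over before_ms.
 * Hypothetical helper for illustration only.
 */
static double improvement_pct(double before_ms, double after_ms)
{
	assert(before_ms > 0.0);
	return (before_ms - after_ms) / before_ms * 100.0;
}
```

With the quoted numbers, improvement_pct(542, 489) is roughly 9.8 and improvement_pct(509, 484) is roughly 4.9, so the table appears to list the values truncated to whole percent.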
> Synthetic test(no "noise" from other callbacks):
> Hardware: x86_64 64 CPUs, 64GB of memory
> Linux-6.6
>
> - 10K tasks(simultaneous);
> - each task does(1000 loops)
> synchronize_rcu();
> kfree(p);
>
> default: CONFIG_RCU_NOCB_CPU: takes 54 seconds to complete all users;
> patch: CONFIG_RCU_NOCB_CPU: takes 35 seconds to complete all users.
>
> Running 60K gives approximately same results on my setup. Please note
> it is without any interaction with another type of callbacks, otherwise
> it will impact a lot a default case.
>
> 5. An extra CONFIG_RCU_SR_NORMAL_DEBUG_GP kernel option is added
> which enables additional debugging for detecting a grace period
> incompletion for synchronize_rcu() users. If a GP is not fully
> passed for any user, the warning message is emitted.
>
> 6. By default it is disabled. To enable this perform one of the
> below sequence:
>
> echo 1 > /sys/module/rcutree/parameters/rcu_normal_wake_from_gp
> or pass a boot parameter "rcutree.rcu_normal_wake_from_gp=1"
>
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Hi Uladzislau,
I've tried your patches (v3) on Android with a 6.1.43 kernel.
The test cycles 10 apps (including camera) sequentially for 100
iterations.
I've set rcupdate.rcu_normal=1 in the boot parameters so that it
overrides rcupdate.rcu_expedited=1:
adb shell cat /proc/cmdline | tr ' ' '\n' | grep rcu
rcupdate.rcu_normal=1
rcupdate.rcu_expedited=1
rcu_nocbs=0-7
The configurations are:
A - echo 0 >/sys/module/rcutree/parameters/rcu_normal_wake_from_gp
B - echo 1 >/sys/module/rcutree/parameters/rcu_normal_wake_from_gp
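The effect of these settings on which path a synchronize_rcu() call takes can be read off the quoted tree.c and tree_exp.h hunks. Below is a minimal userspace model of that dispatch. It assumes rcu_gp_is_expedited() and rcu_gp_is_normal() simply mirror the two boot parameters (the kernel helpers look at more state than that), it ignores the early rcu_blocking_is_gp() return, and the enum and function names are hypothetical:

```c
#include <stdbool.h>

/* Hypothetical labels for the three possible wait paths. */
enum sr_path {
	SR_EXPEDITED,		/* expedited grace period */
	SR_NORMAL_LEGACY,	/* wait_rcu_gp(call_rcu_hurry) */
	SR_NORMAL_WAKE_FROM_GP	/* new path added by the quoted patch */
};

/* Models synchronize_rcu_normal(): choice depends on the module parameter. */
static enum sr_path model_normal(bool wake_from_gp)
{
	return wake_from_gp ? SR_NORMAL_WAKE_FROM_GP : SR_NORMAL_LEGACY;
}

/* Models synchronize_rcu() as patched, under the assumptions above. */
static enum sr_path model_synchronize_rcu(bool rcu_expedited, bool rcu_normal,
					  bool wake_from_gp)
{
	if (rcu_expedited) {
		/* synchronize_rcu_expedited() falls back if normal GPs are forced. */
		if (rcu_normal)
			return model_normal(wake_from_gp);
		return SR_EXPEDITED;
	}
	return model_normal(wake_from_gp);
}
```

Under this model, configuration A maps to SR_NORMAL_LEGACY and configuration B to SR_NORMAL_WAKE_FROM_GP when rcu_normal=1 is set, while rcu_expedited=1 without rcu_normal=1 maps to SR_EXPEDITED regardless of rcu_normal_wake_from_gp.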
Results:
= APP LAUNCH TIME =

                               delta (B-A)   ratio (%)
overall_app_launch_time (ms)     -11399.00       -6.65

== camera_launch_time

type    delta (B-A, %)    A_count    B_count
HOT              -7.05         99         99
COLD             -6.33          1          1
=== Function Latencies ===
Tracing synchronize_rcu_expedited. Hit Ctrl-C to exit Tracing synchronize_rcu_expedited. Hit Ctrl-C to exit
nsec : count distribution nsec : count distribution
0 -> 1 : 0 | | 0 -> 1 : 0 | |
2 -> 3 : 0 | | 2 -> 3 : 0 | |
4 -> 7 : 0 | | 4 -> 7 : 0 | |
8 -> 15 : 0 | | 8 -> 15 : 0 | |
16 -> 31 : 0 | | 16 -> 31 : 0 | |
32 -> 63 : 0 | | 32 -> 63 : 0 | |
64 -> 127 : 0 | | 64 -> 127 : 0 | |
128 -> 255 : 0 | | 128 -> 255 : 0 | |
256 -> 511 : 0 | | 256 -> 511 : 0 | |
512 -> 1023 : 0 | | 512 -> 1023 : 0 | |
1024 -> 2047 : 0 | | 1024 -> 2047 : 0 | |
2048 -> 4095 : 0 | | 2048 -> 4095 : 0 | |
4096 -> 8191 : 0 | | 4096 -> 8191 : 0 | |
8192 -> 16383 : 0 | | 8192 -> 16383 : 0 | |
16384 -> 32767 : 0 | | 16384 -> 32767 : 0 | |
32768 -> 65535 : 0 | | 32768 -> 65535 : 0 | |
65536 -> 131071 : 0 | | 65536 -> 131071 : 0 | |
131072 -> 262143 : 0 | | 131072 -> 262143 : 0 | |
262144 -> 524287 : 0 | | 262144 -> 524287 : 0 | |
524288 -> 1048575 : 0 | | 524288 -> 1048575 : 0 | |
1048576 -> 2097151 : 0 | | 1048576 -> 2097151 : 0 | |
2097152 -> 4194303 : 0 | | 2097152 -> 4194303 : 0 | |
4194304 -> 8388607 : 871 |** | 4194304 -> 8388607 : 1180 |**** |
8388608 -> 16777215 : 3204 |******** | 8388608 -> 16777215 : 7020 |************************* |
16777216 -> 33554431 : 15013 |****************************************| 16777216 -> 33554431 : 10952 |****************************************|
Exiting trace of synchronize_rcu_expedited Exiting trace of synchronize_rcu_expedited
Tracing synchronize_rcu. Hit Ctrl-C to exit Tracing synchronize_rcu. Hit Ctrl-C to exit
nsec : count distribution nsec : count distribution
0 -> 1 : 0 | | 0 -> 1 : 0 | |
2 -> 3 : 0 | | 2 -> 3 : 0 | |
4 -> 7 : 0 | | 4 -> 7 : 0 | |
8 -> 15 : 0 | | 8 -> 15 : 0 | |
16 -> 31 : 0 | | 16 -> 31 : 0 | |
32 -> 63 : 0 | | 32 -> 63 : 0 | |
64 -> 127 : 0 | | 64 -> 127 : 0 | |
128 -> 255 : 0 | | 128 -> 255 : 0 | |
256 -> 511 : 0 | | 256 -> 511 : 0 | |
512 -> 1023 : 0 | | 512 -> 1023 : 0 | |
1024 -> 2047 : 0 | | 1024 -> 2047 : 0 | |
2048 -> 4095 : 0 | | 2048 -> 4095 : 0 | |
4096 -> 8191 : 0 | | 4096 -> 8191 : 0 | |
8192 -> 16383 : 0 | | 8192 -> 16383 : 0 | |
16384 -> 32767 : 0 | | 16384 -> 32767 : 0 | |
32768 -> 65535 : 0 | | 32768 -> 65535 : 0 | |
65536 -> 131071 : 0 | | 65536 -> 131071 : 0 | |
131072 -> 262143 : 0 | | 131072 -> 262143 : 0 | |
262144 -> 524287 : 0 | | 262144 -> 524287 : 0 | |
524288 -> 1048575 : 0 | | 524288 -> 1048575 : 0 | |
1048576 -> 2097151 : 0 | | 1048576 -> 2097151 : 0 | |
2097152 -> 4194303 : 0 | | 2097152 -> 4194303 : 0 | |
4194304 -> 8388607 : 861 |** | 4194304 -> 8388607 : 1136 |**** |
8388608 -> 16777215 : 3111 |******** | 8388608 -> 16777215 : 6320 |************************ |
16777216 -> 33554431 : 13901 |****************************************| 16777216 -> 33554431 : 10484 |****************************************|
Exiting trace of synchronize_rcu Exiting trace of synchronize_rcu
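To make the histograms easier to compare: the rows are power-of-two latency buckets, so a sample lands in the row whose lower bound is the largest power of two not exceeding it. A minimal sketch of that bucketing and of the per-bucket share, assuming the usual log2 layout (the helper names are hypothetical, and the first row of the output above merges 0 and 1, which this sketch does not model):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Index k of the bucket [2^k, 2^(k+1) - 1] that holds v, for v >= 1. */
static unsigned int log2_bucket(uint64_t v)
{
	unsigned int k = 0;

	assert(v >= 1);
	while (v >>= 1)
		k++;
	return k;
}

/* Inclusive lower and upper bounds of bucket k. */
static uint64_t bucket_low(unsigned int k)
{
	assert(k < 63);
	return (uint64_t)1 << k;
}

static uint64_t bucket_high(unsigned int k)
{
	assert(k < 63);
	return ((uint64_t)1 << (k + 1)) - 1;
}

/* Share, in percent, of all samples that fall into bucket idx. */
static double bucket_share_pct(const uint64_t *counts, size_t n, size_t idx)
{
	uint64_t total = 0;
	size_t i;

	assert(idx < n);
	for (i = 0; i < n; i++)
		total += counts[i];
	assert(total > 0);
	return 100.0 * (double)counts[idx] / (double)total;
}
```

For example, a 20 ms wait (20000000 ns) falls into the 16777216 -> 33554431 row. Applied to the three non-empty rows of the synchronize_rcu_expedited histogram, the top bucket holds about 78.7% of the samples in the left-hand column (871, 3204, 15013) and about 57.2% in the right-hand column (1180, 7020, 10952).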
Interestingly, I tried the same experiment without rcu_normal=1 (leaving rcu_expedited=1):
adb shell cat /proc/cmdline | tr ' ' '\n' | grep rcu
rcupdate.rcu_expedited=1
rcu_nocbs=0-7
In this case I also saw a 6-7% decrease in the app launch times,
but I don't have a good explanation for it. (The function latency
histograms in this case didn't show any significant difference.)
Do you have any insight into why this may happen?
Thanks,
Kalesh
> ---
> .../admin-guide/kernel-parameters.txt | 14 ++
> kernel/rcu/Kconfig.debug | 12 ++
> kernel/rcu/tree.c | 138 +++++++++++++++++-
> kernel/rcu/tree_exp.h | 2 +-
> 4 files changed, 164 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 17a454909ab4..2cca75e4f0c6 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -5047,6 +5047,20 @@
> delay, memory pressure or callback list growing too
> big.
>
> + rcutree.rcu_normal_wake_from_gp= [KNL]
> + Reduces a latency of synchronize_rcu() call. This approach
> + maintains its own track of synchronize_rcu() callers, so it
> + does not interact with regular callbacks because it does not
> + use a call_rcu[_hurry]() path. Please note, this is for a
> + normal grace period.
> +
> + How to enable it:
> +
> + echo 1 > /sys/module/rcutree/parameters/rcu_normal_wake_from_gp
> + or pass a boot parameter "rcutree.rcu_normal_wake_from_gp=1"
> +
> + Default is 0.
> +
> rcuscale.gp_async= [KNL]
> Measure performance of asynchronous
> grace-period primitives such as call_rcu().
> diff --git a/kernel/rcu/Kconfig.debug b/kernel/rcu/Kconfig.debug
> index 9b0b52e1836f..4812c6249185 100644
> --- a/kernel/rcu/Kconfig.debug
> +++ b/kernel/rcu/Kconfig.debug
> @@ -168,4 +168,16 @@ config RCU_STRICT_GRACE_PERIOD
> when looking for certain types of RCU usage bugs, for example,
> too-short RCU read-side critical sections.
>
> +config RCU_SR_NORMAL_DEBUG_GP
> + bool "Debug synchronize_rcu() callers for a grace period completion"
> + depends on DEBUG_KERNEL && RCU_EXPERT
> + default n
> + help
> + This option enables additional debugging for detecting a grace
> + period incompletion for synchronize_rcu() users. If a GP is not
> + fully passed for any user, the warning message is emitted.
> +
> + Say Y here if you want to enable such debugging
> + Say N if you are unsure.
> +
> endmenu # "RCU Debugging"
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 499803234176..b756c40e4960 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1422,6 +1422,106 @@ static void rcu_poll_gp_seq_end_unlocked(unsigned long *snap)
> raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> }
>
> +/*
> + * There are three lists for handling synchronize_rcu() users.
> + * A first list corresponds to new coming users, second for users
> + * which wait for a grace period and third is for which a grace
> + * period is passed.
> + */
> +static struct sr_normal_state {
> +	struct llist_head srs_next;	/* request a GP users. */
> +	struct llist_head srs_wait;	/* wait for GP users. */
> +	struct llist_head srs_done;	/* ready for GP users. */
> +
> +	/*
> +	 * In order to add a batch of nodes to already
> +	 * existing srs-done-list, a tail of srs-wait-list
> +	 * is maintained.
> +	 */
> +	struct llist_node *srs_wait_tail;
> +} sr;
> +
> +/* Disabled by default. */
> +static int rcu_normal_wake_from_gp;
> +module_param(rcu_normal_wake_from_gp, int, 0644);
> +
> +static void rcu_sr_normal_complete(struct llist_node *node)
> +{
> +	struct rcu_synchronize *rs = container_of(
> +		(struct rcu_head *) node, struct rcu_synchronize, head);
> +	unsigned long oldstate = (unsigned long) rs->head.func;
> +
> +	WARN_ONCE(IS_ENABLED(CONFIG_RCU_SR_NORMAL_DEBUG_GP) &&
> +		!poll_state_synchronize_rcu(oldstate),
> +		"A full grace period is not passed yet: %lu",
> +		rcu_seq_diff(get_state_synchronize_rcu(), oldstate));
> +
> +	/* Finally. */
> +	complete(&rs->completion);
> +}
> +
> +static void rcu_sr_normal_gp_cleanup_work(struct work_struct *work)
> +{
> +	struct llist_node *done, *rcu, *next;
> +
> +	done = llist_del_all(&sr.srs_done);
> +	if (!done)
> +		return;
> +
> +	llist_for_each_safe(rcu, next, done)
> +		rcu_sr_normal_complete(rcu);
> +}
> +static DECLARE_WORK(sr_normal_gp_cleanup, rcu_sr_normal_gp_cleanup_work);
> +
> +/*
> + * Helper function for rcu_gp_cleanup().
> + */
> +static void rcu_sr_normal_gp_cleanup(void)
> +{
> +	struct llist_node *head, *tail;
> +
> +	if (llist_empty(&sr.srs_wait))
> +		return;
> +
> +	tail = READ_ONCE(sr.srs_wait_tail);
> +	head = __llist_del_all(&sr.srs_wait);
> +
> +	if (head) {
> +		/* Can be not empty. */
> +		llist_add_batch(head, tail, &sr.srs_done);
> +		queue_work(system_highpri_wq, &sr_normal_gp_cleanup);
> +	}
> +}
> +
> +/*
> + * Helper function for rcu_gp_init().
> + */
> +static void rcu_sr_normal_gp_init(void)
> +{
> +	struct llist_node *head, *tail;
> +
> +	if (llist_empty(&sr.srs_next))
> +		return;
> +
> +	tail = llist_del_all(&sr.srs_next);
> +	head = llist_reverse_order(tail);
> +
> +	/*
> +	 * A waiting list of GP should be empty on this step,
> +	 * since a GP-kthread, rcu_gp_init() -> gp_cleanup(),
> +	 * rolls it over. If not, it is a BUG, warn a user.
> +	 */
> +	WARN_ON_ONCE(!llist_empty(&sr.srs_wait));
> +
> +	WRITE_ONCE(sr.srs_wait_tail, tail);
> +	__llist_add_batch(head, tail, &sr.srs_wait);
> +}
> +
> +static void rcu_sr_normal_add_req(struct rcu_synchronize *rs)
> +{
> +	llist_add((struct llist_node *) &rs->head, &sr.srs_next);
> +}
> +
> /*
> * Initialize a new grace period. Return false if no grace period required.
> */
> @@ -1456,6 +1556,7 @@ static noinline_for_stack bool rcu_gp_init(void)
> /* Record GP times before starting GP, hence rcu_seq_start(). */
> rcu_seq_start(&rcu_state.gp_seq);
> ASSERT_EXCLUSIVE_WRITER(rcu_state.gp_seq);
> + rcu_sr_normal_gp_init();
> trace_rcu_grace_period(rcu_state.name, rcu_state.gp_seq, TPS("start"));
> rcu_poll_gp_seq_start(&rcu_state.gp_seq_polled_snap);
> raw_spin_unlock_irq_rcu_node(rnp);
> @@ -1825,6 +1926,9 @@ static noinline void rcu_gp_cleanup(void)
> }
> raw_spin_unlock_irq_rcu_node(rnp);
>
> + // Make synchronize_rcu() users aware of the end of old grace period.
> + rcu_sr_normal_gp_cleanup();
> +
> // If strict, make all CPUs aware of the end of the old grace period.
> if (IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD))
> on_each_cpu(rcu_strict_gp_boundary, NULL, 0);
> @@ -3561,6 +3665,38 @@ static int rcu_blocking_is_gp(void)
> return true;
> }
>
> +/*
> + * Helper function for the synchronize_rcu() API.
> + */
> +static void synchronize_rcu_normal(void)
> +{
> +	struct rcu_synchronize rs;
> +
> +	if (!READ_ONCE(rcu_normal_wake_from_gp)) {
> +		wait_rcu_gp(call_rcu_hurry);
> +		return;
> +	}
> +
> +	init_rcu_head_on_stack(&rs.head);
> +	init_completion(&rs.completion);
> +
> +	/*
> +	 * This code might be preempted, therefore take a GP
> +	 * snapshot before adding a request.
> +	 */
> +	if (IS_ENABLED(CONFIG_RCU_SR_NORMAL_DEBUG_GP))
> +		rs.head.func = (void *) get_state_synchronize_rcu();
> +
> +	rcu_sr_normal_add_req(&rs);
> +
> +	/* Kick a GP and start waiting. */
> +	(void) start_poll_synchronize_rcu();
> +
> +	/* Now we can wait. */
> +	wait_for_completion(&rs.completion);
> +	destroy_rcu_head_on_stack(&rs.head);
> +}
> +
> /**
> * synchronize_rcu - wait until a grace period has elapsed.
> *
> @@ -3612,7 +3748,7 @@ void synchronize_rcu(void)
> if (rcu_gp_is_expedited())
> synchronize_rcu_expedited();
> else
> - wait_rcu_gp(call_rcu_hurry);
> + synchronize_rcu_normal();
> return;
> }
>
> diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
> index 014ddf672165..bdc30d972d32 100644
> --- a/kernel/rcu/tree_exp.h
> +++ b/kernel/rcu/tree_exp.h
> @@ -985,7 +985,7 @@ void synchronize_rcu_expedited(void)
>
> /* If expedited grace periods are prohibited, fall back to normal. */
> if (rcu_gp_is_normal()) {
> - wait_rcu_gp(call_rcu_hurry);
> + synchronize_rcu_normal();
> return;
> }
>
> --
> 2.39.2
>
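As an aid to reading the quoted tree.c hunk, the three-list rollover (srs_next -> srs_wait -> srs_done) can be modelled in userspace. This is only a single-threaded sketch with hypothetical names: it uses plain pointers instead of the lock-free llist API, it walks the wait list to find its tail instead of caching srs_wait_tail, and it does not reproduce the list reversal done in rcu_sr_normal_gp_init().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one synchronize_rcu() caller. */
struct req {
	struct req *next;
	bool completed;
};

struct sr_model {
	struct req *next_list;	/* new requests, not yet tied to a GP */
	struct req *wait_list;	/* requests waiting for the current GP */
	struct req *done_list;	/* requests whose GP has ended */
};

/* Models rcu_sr_normal_add_req(): push onto the "next" list. */
static void model_add_req(struct sr_model *m, struct req *r)
{
	r->completed = false;
	r->next = m->next_list;
	m->next_list = r;
}

/* Models GP start: everything queued so far now waits for this GP. */
static void model_gp_init(struct sr_model *m)
{
	assert(m->wait_list == NULL);	/* mirrors the WARN_ON_ONCE() above */
	m->wait_list = m->next_list;
	m->next_list = NULL;
}

/* Models GP end: splice waiters in front of whatever is still on "done". */
static void model_gp_cleanup(struct sr_model *m)
{
	struct req *tail = m->wait_list;

	if (!tail)
		return;
	while (tail->next)
		tail = tail->next;
	tail->next = m->done_list;
	m->done_list = m->wait_list;
	m->wait_list = NULL;
}

/* Models the worker: complete every request on "done", return the count. */
static int model_worker(struct sr_model *m)
{
	struct req *r = m->done_list;
	int n = 0;

	m->done_list = NULL;
	while (r) {
		struct req *next = r->next;

		r->completed = true;
		r = next;
		n++;
	}
	return n;
}
```

The point of the model is the ordering guarantee: a request added after model_gp_init() stays on the "next" list and is not completed by that grace period's cleanup; it has to wait for the following init/cleanup cycle.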