* [PATCH rcu 0/2] RCU tasks updates, possibly for v5.18
From: Paul E. McKenney @ 2022-02-05 0:21 UTC
To: rcu; +Cc: linux-kernel, kernel-team, rostedt
Hello!

This series provides a couple of updates to the RCU Tasks flavors.

1.	rcu-tasks: Use order_base_2() instead of ilog2().
	This spreads CPUs over queues more evenly.

2.	rcu-tasks: Set ->percpu_enqueue_shift to zero upon contention.
	The current mainline series reduces contention by at most a
	factor of two. This patch gains at least another factor of
	eight during times of high update-side traffic.

						Thanx, Paul
------------------------------------------------------------------------
b/kernel/rcu/tasks.h | 6 +++---
kernel/rcu/tasks.h | 2 +-
2 files changed, 4 insertions(+), 4 deletions(-)
* [PATCH rcu 1/2] rcu-tasks: Use order_base_2() instead of ilog2()
From: Paul E. McKenney @ 2022-02-05 0:21 UTC
To: rcu
Cc: linux-kernel, kernel-team, rostedt, Paul E. McKenney,
Mark Rutland, Martin KaFai Lau, KP Singh
The ilog2() function can be used to generate a shift count, but it will
generate the same count for a power of two as for one greater than a power
of two. This results in shift counts that are larger than necessary for
systems with a power-of-two number of CPUs because the CPUs are numbered
from zero, so that the maximum CPU number is one less than that power
of two.

This commit therefore substitutes order_base_2(), which appears to have
been designed for exactly this use case.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: KP Singh <kpsingh@kernel.org>
---
kernel/rcu/tasks.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
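
As a rough userspace illustration of the difference described in the
commit log: the sketch below is not kernel code. The sketch_* names are
made up for this note, and the helpers only mimic what ilog2() and
order_base_2() compute for small positive values. It compares the old
"ilog2(n) + 1" shift with order_base_2(n) for a few CPU counts and checks
that the highest CPU number still shifts down to zero with the new value.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace stand-in for ilog2(): floor(log2(n)) for n >= 1. */
unsigned int sketch_ilog2(unsigned int n)
{
	unsigned int r = 0;

	while (n >>= 1)
		r++;
	return r;
}

/* Userspace stand-in for order_base_2(): ceil(log2(n)) for n >= 1. */
unsigned int sketch_order_base_2(unsigned int n)
{
	return n > 1 ? sketch_ilog2(n - 1) + 1 : 0;
}

/* Print old and new shift counts for a few hypothetical CPU counts. */
void sketch_print_shifts(void)
{
	static const unsigned int ncpus[] = { 1, 2, 4, 5, 8, 9, 64 };
	unsigned int i;

	for (i = 0; i < sizeof(ncpus) / sizeof(ncpus[0]); i++) {
		unsigned int n = ncpus[i];
		unsigned int new_shift = sketch_order_base_2(n);

		/* The highest CPU number, n - 1, must still shift down to zero. */
		assert(((n - 1) >> new_shift) == 0);
		printf("ncpus=%2u  ilog2()+1=%u  order_base_2()=%u\n",
		       n, sketch_ilog2(n) + 1, new_shift);
	}
}
```

Calling sketch_print_shifts() from a small main() shows the two
expressions disagreeing only at exact powers of two: for 8 CPUs the old
expression gives 4 and the new one gives 3, while for 9 CPUs both give 4.
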
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index fb8c57fd70b8f..c0fc3641ef13a 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -123,7 +123,7 @@ static struct rcu_tasks rt_name = \
.call_func = call, \
.rtpcpu = &rt_name ## __percpu, \
.name = n, \
- .percpu_enqueue_shift = ilog2(CONFIG_NR_CPUS) + 1, \
+ .percpu_enqueue_shift = order_base_2(CONFIG_NR_CPUS), \
.percpu_enqueue_lim = 1, \
.percpu_dequeue_lim = 1, \
.barrier_q_mutex = __MUTEX_INITIALIZER(rt_name.barrier_q_mutex), \
@@ -302,7 +302,7 @@ static void call_rcu_tasks_generic(struct rcu_head *rhp, rcu_callback_t func,
if (unlikely(needadjust)) {
raw_spin_lock_irqsave(&rtp->cbs_gbl_lock, flags);
if (rtp->percpu_enqueue_lim != nr_cpu_ids) {
- WRITE_ONCE(rtp->percpu_enqueue_shift, ilog2(nr_cpu_ids) + 1);
+ WRITE_ONCE(rtp->percpu_enqueue_shift, order_base_2(nr_cpu_ids));
WRITE_ONCE(rtp->percpu_dequeue_lim, nr_cpu_ids);
smp_store_release(&rtp->percpu_enqueue_lim, nr_cpu_ids);
pr_info("Switching %s to per-CPU callback queuing.\n", rtp->name);
@@ -417,7 +417,7 @@ static int rcu_tasks_need_gpcb(struct rcu_tasks *rtp)
if (rcu_task_cb_adjust && ncbs <= rcu_task_collapse_lim) {
raw_spin_lock_irqsave(&rtp->cbs_gbl_lock, flags);
if (rtp->percpu_enqueue_lim > 1) {
- WRITE_ONCE(rtp->percpu_enqueue_shift, ilog2(nr_cpu_ids) + 1);
+ WRITE_ONCE(rtp->percpu_enqueue_shift, order_base_2(nr_cpu_ids));
smp_store_release(&rtp->percpu_enqueue_lim, 1);
rtp->percpu_dequeue_gpseq = get_state_synchronize_rcu();
pr_info("Starting switch %s to CPU-0 callback queuing.\n", rtp->name);
--
2.31.1.189.g2e36527f23
* [PATCH rcu 2/2] rcu-tasks: Set ->percpu_enqueue_shift to zero upon contention
From: Paul E. McKenney @ 2022-02-05 0:21 UTC
To: rcu
Cc: linux-kernel, kernel-team, rostedt, Paul E. McKenney,
Neeraj Upadhyay, Martin KaFai Lau, KP Singh
Currently, call_rcu_tasks_generic() sets ->percpu_enqueue_shift to
order_base_2(nr_cpu_ids) upon encountering sufficient contention.
This does not shift to use of non-CPU-0 callback queues as intended, but
rather continues using only CPU 0's queue. Although this does provide
some decrease in contention due to spreading work over multiple locks,
it is not the dramatic decrease that was intended.

This commit therefore makes call_rcu_tasks_generic() set
->percpu_enqueue_shift to 0.

Reported-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: KP Singh <kpsingh@kernel.org>
---
kernel/rcu/tasks.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
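
A minimal userspace sketch of why the shift value matters, assuming that
the enqueue path picks a queue by shifting the current CPU number right by
->percpu_enqueue_shift. That selection rule is an assumption inferred from
the commit log rather than something shown in this diff, and the sketch_*
names are hypothetical. With a shift of order_base_2(nr_cpu_ids) every CPU
number lands on index 0, while a shift of 0 gives each CPU its own index.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed selection rule: queue index is the CPU number shifted right. */
unsigned int sketch_queue_index(unsigned int cpu, unsigned int shift)
{
	return cpu >> shift;
}

/* Count the distinct queue indexes reached by CPUs 0..ncpus-1. */
unsigned int sketch_queues_used(unsigned int ncpus, unsigned int shift)
{
	unsigned int cpu, last = 0, used = 0;

	for (cpu = 0; cpu < ncpus; cpu++) {
		unsigned int q = sketch_queue_index(cpu, shift);

		/* Indexes never decrease as cpu grows, so count each change. */
		if (cpu == 0 || q != last)
			used++;
		last = q;
	}
	return used;
}

/* Compare the pre-patch shift with the post-patch shift of zero. */
void sketch_report(unsigned int ncpus, unsigned int old_shift)
{
	/* A zero shift maps each CPU to its own queue index. */
	assert(sketch_queues_used(ncpus, 0) == ncpus);
	printf("ncpus=%u  queues with shift %u: %u  queues with shift 0: %u\n",
	       ncpus, old_shift, sketch_queues_used(ncpus, old_shift),
	       sketch_queues_used(ncpus, 0));
}
```

For example, sketch_report(8, 3) models an eight-CPU system, where
order_base_2(8) is 3: the old shift reaches one queue index and the new
shift reaches eight.
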
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index c0fc3641ef13a..d73e32d803438 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -302,7 +302,7 @@ static void call_rcu_tasks_generic(struct rcu_head *rhp, rcu_callback_t func,
if (unlikely(needadjust)) {
raw_spin_lock_irqsave(&rtp->cbs_gbl_lock, flags);
if (rtp->percpu_enqueue_lim != nr_cpu_ids) {
- WRITE_ONCE(rtp->percpu_enqueue_shift, order_base_2(nr_cpu_ids));
+ WRITE_ONCE(rtp->percpu_enqueue_shift, 0);
WRITE_ONCE(rtp->percpu_dequeue_lim, nr_cpu_ids);
smp_store_release(&rtp->percpu_enqueue_lim, nr_cpu_ids);
pr_info("Switching %s to per-CPU callback queuing.\n", rtp->name);
--
2.31.1.189.g2e36527f23