* [PATCH v2 01/10] rcu: Change rdp arg to cpu number for rcu_watching_snap_stopped_since()
2024-10-09 12:51 [PATCH v2 00/10] Make RCU Tasks scan idle tasks neeraj.upadhyay
From: neeraj.upadhyay @ 2024-10-09 12:51 UTC (permalink / raw)
To: rcu
Cc: linux-kernel, paulmck, joel, frederic, boqun.feng, urezki,
rostedt, mathieu.desnoyers, jiangshanlai, qiang.zhang1211, peterz,
neeraj.upadhyay, Neeraj Upadhyay
From: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
In preparation for making rcu_watching_snap_stopped_since() available
to RCU code outside of tree.c, change the rdp argument to a cpu number.
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
kernel/rcu/tree.c | 14 +++++++-------
kernel/rcu/tree_exp.h | 2 +-
2 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index a60616e69b66..ea17dd2d0344 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -312,16 +312,16 @@ static bool rcu_watching_snap_in_eqs(int snap)
* rcu_watching_snap_stopped_since() - Has RCU stopped watching a given CPU
* since the specified @snap?
*
- * @rdp: The rcu_data corresponding to the CPU for which to check EQS.
+ * @cpu: The CPU for which to check EQS.
* @snap: rcu_watching snapshot taken when the CPU wasn't in an EQS.
*
- * Returns true if the CPU corresponding to @rdp has spent some time in an
- * extended quiescent state since @snap. Note that this doesn't check if it
- * /still/ is in an EQS, just that it went through one since @snap.
+ * Returns true if the CPU has spent some time in an extended quiescent state
+ * since @snap. Note that this doesn't check if it /still/ is in an EQS, just
+ * that it went through one since @snap.
*
* This is meant to be used in a loop waiting for a CPU to go through an EQS.
*/
-static bool rcu_watching_snap_stopped_since(struct rcu_data *rdp, int snap)
+static bool rcu_watching_snap_stopped_since(int cpu, int snap)
{
/*
* The first failing snapshot is already ordered against the accesses
@@ -334,7 +334,7 @@ static bool rcu_watching_snap_stopped_since(struct rcu_data *rdp, int snap)
if (WARN_ON_ONCE(rcu_watching_snap_in_eqs(snap)))
return true;
- return snap != ct_rcu_watching_cpu_acquire(rdp->cpu);
+ return snap != ct_rcu_watching_cpu_acquire(cpu);
}
/*
@@ -826,7 +826,7 @@ static int rcu_watching_snap_recheck(struct rcu_data *rdp)
* read-side critical section that started before the beginning
* of the current RCU grace period.
*/
- if (rcu_watching_snap_stopped_since(rdp, rdp->watching_snap)) {
+ if (rcu_watching_snap_stopped_since(rdp->cpu, rdp->watching_snap)) {
trace_rcu_fqs(rcu_state.name, rdp->gp_seq, rdp->cpu, TPS("dti"));
rcu_gpnum_ovf(rnp, rdp);
return 1;
diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index fb664d3a01c9..e3bd4c18a852 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -401,7 +401,7 @@ static void __sync_rcu_exp_select_node_cpus(struct rcu_exp_work *rewp)
unsigned long mask = rdp->grpmask;
retry_ipi:
- if (rcu_watching_snap_stopped_since(rdp, rdp->exp_watching_snap)) {
+ if (rcu_watching_snap_stopped_since(rdp->cpu, rdp->exp_watching_snap)) {
mask_ofl_test |= mask;
continue;
}
--
2.40.1
* [PATCH v2 02/10] rcu: Make some rcu_watching_* functions global
From: neeraj.upadhyay @ 2024-10-09 12:51 UTC (permalink / raw)
To: rcu
Cc: linux-kernel, paulmck, joel, frederic, boqun.feng, urezki,
rostedt, mathieu.desnoyers, jiangshanlai, qiang.zhang1211, peterz,
neeraj.upadhyay, Neeraj Upadhyay
From: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
rcu_watching_snap_in_eqs() and rcu_watching_snap_stopped_since()
will be used in subsequent commits by RCU-tasks to check the
rcu_watching state of idle tasks. So, make these functions global.
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
kernel/rcu/rcu.h | 4 ++++
kernel/rcu/tree.c | 4 ++--
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index feb3ac1dc5d5..5fec1b52039c 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -609,6 +609,8 @@ void srcutorture_get_gp_data(struct srcu_struct *sp, int *flags,
#ifdef CONFIG_TINY_RCU
static inline bool rcu_watching_zero_in_eqs(int cpu, int *vp) { return false; }
+static inline bool rcu_watching_snap_in_eqs(int snap) { return false; }
+static inline bool rcu_watching_snap_stopped_since(int cpu, int snap) { return false; }
static inline unsigned long rcu_get_gp_seq(void) { return 0; }
static inline unsigned long rcu_exp_batches_completed(void) { return 0; }
static inline unsigned long
@@ -622,6 +624,8 @@ static inline void rcu_gp_slow_register(atomic_t *rgssp) { }
static inline void rcu_gp_slow_unregister(atomic_t *rgssp) { }
#else /* #ifdef CONFIG_TINY_RCU */
bool rcu_watching_zero_in_eqs(int cpu, int *vp);
+bool rcu_watching_snap_in_eqs(int snap);
+bool rcu_watching_snap_stopped_since(int cpu, int snap);
unsigned long rcu_get_gp_seq(void);
unsigned long rcu_exp_batches_completed(void);
unsigned long srcu_batches_completed(struct srcu_struct *sp);
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index ea17dd2d0344..5ecbf85f157d 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -303,7 +303,7 @@ static void rcu_watching_online(void)
* Return true if the snapshot returned from ct_rcu_watching()
* indicates that RCU is in an extended quiescent state.
*/
-static bool rcu_watching_snap_in_eqs(int snap)
+bool rcu_watching_snap_in_eqs(int snap)
{
return !(snap & CT_RCU_WATCHING);
}
@@ -321,7 +321,7 @@ static bool rcu_watching_snap_in_eqs(int snap)
*
* This is meant to be used in a loop waiting for a CPU to go through an EQS.
*/
-static bool rcu_watching_snap_stopped_since(int cpu, int snap)
+bool rcu_watching_snap_stopped_since(int cpu, int snap)
{
/*
* The first failing snapshot is already ordered against the accesses
--
2.40.1
* [PATCH v2 03/10] rcu/tasks: Move holdout checks for idle task to a separate function
From: neeraj.upadhyay @ 2024-10-09 12:51 UTC (permalink / raw)
To: rcu
Cc: linux-kernel, paulmck, joel, frederic, boqun.feng, urezki,
rostedt, mathieu.desnoyers, jiangshanlai, qiang.zhang1211, peterz,
neeraj.upadhyay, Neeraj Upadhyay
From: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
Move the checks for an idle task being a holdout task for RCU-tasks
into a separate function, rcu_idle_task_is_holdout(). This function
will be used in subsequent commits to add additional checks for
idle tasks. No functional change intended.
Suggested-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
kernel/rcu/tasks.h | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 6333f4ccf024..56015ced3f37 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -976,6 +976,15 @@ static void rcu_tasks_pregp_step(struct list_head *hop)
synchronize_rcu();
}
+static bool rcu_idle_task_is_holdout(struct task_struct *t, int cpu)
+{
+ /* Idle tasks on offline CPUs are RCU-tasks quiescent states. */
+ if (!rcu_cpu_online(cpu))
+ return false;
+
+ return true;
+}
+
/* Check for quiescent states since the pregp's synchronize_rcu() */
static bool rcu_tasks_is_holdout(struct task_struct *t)
{
@@ -995,9 +1004,8 @@ static bool rcu_tasks_is_holdout(struct task_struct *t)
cpu = task_cpu(t);
- /* Idle tasks on offline CPUs are RCU-tasks quiescent states. */
- if (t == idle_task(cpu) && !rcu_cpu_online(cpu))
- return false;
+ if (t == idle_task(cpu))
+ return rcu_idle_task_is_holdout(t, cpu);
return true;
}
--
2.40.1
* [PATCH v2 04/10] rcu/tasks: Create rcu_idle_task_is_holdout() definition for !SMP
From: neeraj.upadhyay @ 2024-10-09 12:51 UTC (permalink / raw)
To: rcu
Cc: linux-kernel, paulmck, joel, frederic, boqun.feng, urezki,
rostedt, mathieu.desnoyers, jiangshanlai, qiang.zhang1211, peterz,
neeraj.upadhyay, Neeraj Upadhyay
From: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
rcu_idle_task_is_holdout() is called from rcu_tasks_kthread() context.
As idle tasks cannot be involuntarily preempted, non-running idle tasks
are not in an RCU-tasks read-side critical section. So, the idle task is
not an RCU-tasks holdout task on !SMP (which also covers TINY_RCU).
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
kernel/rcu/tasks.h | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 56015ced3f37..b794deeaf6d8 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -976,6 +976,7 @@ static void rcu_tasks_pregp_step(struct list_head *hop)
synchronize_rcu();
}
+#ifdef CONFIG_SMP
static bool rcu_idle_task_is_holdout(struct task_struct *t, int cpu)
{
/* Idle tasks on offline CPUs are RCU-tasks quiescent states. */
@@ -984,6 +985,17 @@ static bool rcu_idle_task_is_holdout(struct task_struct *t, int cpu)
return true;
}
+#else /* #ifdef CONFIG_SMP */
+static inline bool rcu_idle_task_is_holdout(struct task_struct *t, int cpu)
+{
+ /*
+ * rcu_idle_task_is_holdout() is called in rcu_tasks_kthread()
+ * context. Idle thread would have done a voluntary context
+ * switch.
+ */
+ return false;
+}
+#endif
/* Check for quiescent states since the pregp's synchronize_rcu() */
static bool rcu_tasks_is_holdout(struct task_struct *t)
--
2.40.1
* [PATCH v2 05/10] rcu/tasks: Consider idle tasks not running on CPU as non-holdouts
From: neeraj.upadhyay @ 2024-10-09 12:51 UTC (permalink / raw)
To: rcu
Cc: linux-kernel, paulmck, joel, frederic, boqun.feng, urezki,
rostedt, mathieu.desnoyers, jiangshanlai, qiang.zhang1211, peterz,
neeraj.upadhyay, Neeraj Upadhyay
From: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
As idle tasks cannot be involuntarily preempted, idle tasks which
are not running on a CPU are not in an RCU-tasks read-side critical
section. So, remove them from the holdout tasks for RCU-tasks.
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
kernel/rcu/tasks.h | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index b794deeaf6d8..9523aff6cdae 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -983,6 +983,15 @@ static bool rcu_idle_task_is_holdout(struct task_struct *t, int cpu)
if (!rcu_cpu_online(cpu))
return false;
+ /*
+ * As idle tasks cannot be involuntary preempted, non-running idle tasks
+ * are not in RCU-tasks critical section.
+ * synchronize_rcu() calls in rcu_tasks_pregp_step() and rcu_tasks_postgp()
+ * ensure that all ->on_cpu transitions are complete.
+ */
+ if (!t->on_cpu)
+ return false;
+
return true;
}
#else /* #ifdef CONFIG_SMP */
--
2.40.1
* [PATCH v2 06/10] rcu/tasks: Check RCU watching state for holdout idle tasks
From: neeraj.upadhyay @ 2024-10-09 12:51 UTC (permalink / raw)
To: rcu
Cc: linux-kernel, paulmck, joel, frederic, boqun.feng, urezki,
rostedt, mathieu.desnoyers, jiangshanlai, qiang.zhang1211, peterz,
neeraj.upadhyay, Neeraj Upadhyay
From: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
Use the RCU watching state of a CPU to check whether the RCU-tasks
GP needs to wait for the idle task on that CPU. Idle tasks which are
in deep-idle states where RCU is not watching, or which have
transitioned to/from a deep-idle state, do not block the RCU-tasks
grace period.
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
kernel/rcu/tasks.h | 36 +++++++++++++++++++++++++++++++++---
1 file changed, 33 insertions(+), 3 deletions(-)
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 9523aff6cdae..d8506d2e6f54 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -36,6 +36,8 @@ typedef void (*postgp_func_t)(struct rcu_tasks *rtp);
* @cpu: CPU number corresponding to this entry.
* @index: Index of this CPU in rtpcp_array of the rcu_tasks structure.
* @rtpp: Pointer to the rcu_tasks structure.
+ * @rcu_watching_snap: Per-GP RCU-watching snapshot for idle tasks.
+ * @rcu_watching_snap_rec: RCU-watching snapshot recorded for idle task.
*/
struct rcu_tasks_percpu {
struct rcu_segcblist cblist;
@@ -52,6 +54,8 @@ struct rcu_tasks_percpu {
int cpu;
int index;
struct rcu_tasks *rtpp;
+ int rcu_watching_snap;
+ bool rcu_watching_snap_rec;
};
/**
@@ -957,9 +961,14 @@ static void rcu_tasks_wait_gp(struct rcu_tasks *rtp)
// rcu_tasks_pregp_step() and by the scheduler's locks and interrupt
// disabling.
+void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func);
+DEFINE_RCU_TASKS(rcu_tasks, rcu_tasks_wait_gp, call_rcu_tasks, "RCU Tasks");
+
/* Pre-grace-period preparation. */
static void rcu_tasks_pregp_step(struct list_head *hop)
{
+ int cpu;
+
/*
* Wait for all pre-existing t->on_rq and t->nvcsw transitions
* to complete. Invoking synchronize_rcu() suffices because all
@@ -974,11 +983,20 @@ static void rcu_tasks_pregp_step(struct list_head *hop)
* grace period.
*/
synchronize_rcu();
+
+ /* Initialize watching snapshots for this GP */
+ for_each_possible_cpu(cpu) {
+ struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rcu_tasks.rtpcpu, cpu);
+
+ rtpcp->rcu_watching_snap_rec = false;
+ }
}
#ifdef CONFIG_SMP
static bool rcu_idle_task_is_holdout(struct task_struct *t, int cpu)
{
+ struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rcu_tasks.rtpcpu, cpu);
+
/* Idle tasks on offline CPUs are RCU-tasks quiescent states. */
if (!rcu_cpu_online(cpu))
return false;
@@ -992,6 +1010,21 @@ static bool rcu_idle_task_is_holdout(struct task_struct *t, int cpu)
if (!t->on_cpu)
return false;
+ if (!rtpcp->rcu_watching_snap_rec) {
+ /*
+ * Do plain access. Ordering between remote CPU's pre idle accesses
+ * and post rcu-tasks grace period is provided by synchronize_rcu()
+ * in rcu_tasks_postgp().
+ */
+ rtpcp->rcu_watching_snap = ct_rcu_watching_cpu(cpu);
+ rtpcp->rcu_watching_snap_rec = true;
+ /* RCU-idle contexts are RCU-tasks quiescent state for idle tasks. */
+ if (rcu_watching_snap_in_eqs(rtpcp->rcu_watching_snap))
+ return false;
+ } else if (rcu_watching_snap_stopped_since(cpu, rtpcp->rcu_watching_snap)) {
+ return false;
+ }
+
return true;
}
#else /* #ifdef CONFIG_SMP */
@@ -1042,9 +1075,6 @@ static void rcu_tasks_pertask(struct task_struct *t, struct list_head *hop)
}
}
-void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func);
-DEFINE_RCU_TASKS(rcu_tasks, rcu_tasks_wait_gp, call_rcu_tasks, "RCU Tasks");
-
/* Processing between scanning taskslist and draining the holdout list. */
static void rcu_tasks_postscan(struct list_head *hop)
{
--
2.40.1
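[Editor's note: the snapshot-then-recheck scheme in the patch above can be illustrated outside the kernel. The following is a minimal userspace C sketch, not kernel code: WATCHING_BIT stands in for CT_RCU_WATCHING, and idle_is_holdout() mirrors only the control flow of rcu_idle_task_is_holdout() in the diff; all names are hypothetical.]

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's CT_RCU_WATCHING counter bit. */
#define WATCHING_BIT 0x1

/* Mirrors rcu_watching_snap_in_eqs(): EQS when the watching bit is clear. */
static bool snap_in_eqs(int snap)
{
	return !(snap & WATCHING_BIT);
}

/*
 * Mirrors the per-GP flow of rcu_idle_task_is_holdout(): record the
 * CPU's context-tracking counter once per grace period; on later scans,
 * report a quiescent state (not a holdout) if the snapshot was already
 * in an EQS or the counter has moved since the snapshot.
 */
static bool idle_is_holdout(int cur, int *snap, bool *recorded)
{
	if (!*recorded) {
		*snap = cur;
		*recorded = true;
		if (snap_in_eqs(cur))
			return false;	/* already RCU-idle: not a holdout */
	} else if (cur != *snap) {
		return false;		/* went through an EQS since snap */
	}
	return true;			/* still a holdout */
}
```

A counter that stays unchanged across scans keeps the task on the holdout list; any change, or an initial EQS snapshot, clears it.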
* [PATCH v2 07/10] rcu/tasks: Check RCU watching state for holdout idle injection tasks
From: neeraj.upadhyay @ 2024-10-09 12:51 UTC (permalink / raw)
To: rcu
Cc: linux-kernel, paulmck, joel, frederic, boqun.feng, urezki,
rostedt, mathieu.desnoyers, jiangshanlai, qiang.zhang1211, peterz,
neeraj.upadhyay, Neeraj Upadhyay
From: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
Use the RCU watching state of a CPU to check whether the RCU-tasks
GP needs to wait for the idle injection task on that CPU. Idle
injection tasks which are in deep-idle states where RCU is not
watching, or which have transitioned to/from a deep-idle state, do
not block the RCU-tasks grace period.
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
kernel/rcu/tasks.h | 63 +++++++++++++++++++++++++++++++++++-----------
1 file changed, 48 insertions(+), 15 deletions(-)
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index d8506d2e6f54..1947f9b6346d 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -38,6 +38,8 @@ typedef void (*postgp_func_t)(struct rcu_tasks *rtp);
* @rtpp: Pointer to the rcu_tasks structure.
* @rcu_watching_snap: Per-GP RCU-watching snapshot for idle tasks.
* @rcu_watching_snap_rec: RCU-watching snapshot recorded for idle task.
+ * @rcu_watching_idle_inj_snap: Per-GP RCU-watching snapshot for idle inject task.
+ * @rcu_watching_idle_inj_rec: RCU-watching snapshot recorded for idle inject task.
*/
struct rcu_tasks_percpu {
struct rcu_segcblist cblist;
@@ -56,6 +58,8 @@ struct rcu_tasks_percpu {
struct rcu_tasks *rtpp;
int rcu_watching_snap;
bool rcu_watching_snap_rec;
+ int rcu_watching_idle_inj_snap;
+ bool rcu_watching_idle_inj_rec;
};
/**
@@ -989,10 +993,34 @@ static void rcu_tasks_pregp_step(struct list_head *hop)
struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rcu_tasks.rtpcpu, cpu);
rtpcp->rcu_watching_snap_rec = false;
+ rtpcp->rcu_watching_idle_inj_rec = false;
}
}
#ifdef CONFIG_SMP
+static bool rcu_idle_check_rcu_watching(int *rcu_watching_snap, bool *rcu_watching_rec, int cpu)
+{
+ if (!*rcu_watching_rec) {
+ /*
+ * Do plain access. Ordering between remote CPU's pre idle accesses
+ * and post rcu-tasks grace period is provided by synchronize_rcu()
+ * in rcu_tasks_postgp().
+ */
+ *rcu_watching_snap = ct_rcu_watching_cpu(cpu);
+ *rcu_watching_rec = true;
+ if (rcu_watching_snap_in_eqs(*rcu_watching_snap))
+ /*
+ * RCU-idle contexts are RCU-tasks quiescent state for idle
+ * (and idle injection) tasks.
+ */
+ return false;
+ } else if (rcu_watching_snap_stopped_since(cpu, *rcu_watching_snap)) {
+ return false;
+ }
+
+ return true;
+}
+
static bool rcu_idle_task_is_holdout(struct task_struct *t, int cpu)
{
struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rcu_tasks.rtpcpu, cpu);
@@ -1010,22 +1038,16 @@ static bool rcu_idle_task_is_holdout(struct task_struct *t, int cpu)
if (!t->on_cpu)
return false;
- if (!rtpcp->rcu_watching_snap_rec) {
- /*
- * Do plain access. Ordering between remote CPU's pre idle accesses
- * and post rcu-tasks grace period is provided by synchronize_rcu()
- * in rcu_tasks_postgp().
- */
- rtpcp->rcu_watching_snap = ct_rcu_watching_cpu(cpu);
- rtpcp->rcu_watching_snap_rec = true;
- /* RCU-idle contexts are RCU-tasks quiescent state for idle tasks. */
- if (rcu_watching_snap_in_eqs(rtpcp->rcu_watching_snap))
- return false;
- } else if (rcu_watching_snap_stopped_since(cpu, rtpcp->rcu_watching_snap)) {
- return false;
- }
+ return rcu_idle_check_rcu_watching(&rtpcp->rcu_watching_snap,
+ &rtpcp->rcu_watching_snap_rec, cpu);
+}
- return true;
+static bool rcu_idle_inj_is_holdout(struct task_struct *t, int cpu)
+{
+ struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rcu_tasks.rtpcpu, cpu);
+
+ return rcu_idle_check_rcu_watching(&rtpcp->rcu_watching_idle_inj_snap,
+ &rtpcp->rcu_watching_idle_inj_rec, cpu);
}
#else /* #ifdef CONFIG_SMP */
static inline bool rcu_idle_task_is_holdout(struct task_struct *t, int cpu)
@@ -1037,6 +1059,15 @@ static inline bool rcu_idle_task_is_holdout(struct task_struct *t, int cpu)
*/
return false;
}
+
+static inline bool rcu_idle_inj_is_holdout(struct task_struct *t, int cpu)
+{
+ /*
+ * Idle injection tasks are PF_IDLE within preempt disabled
+ * region. So, we should not enter this call for !SMP.
+ */
+ return false;
+}
#endif
/* Check for quiescent states since the pregp's synchronize_rcu() */
@@ -1060,6 +1091,8 @@ static bool rcu_tasks_is_holdout(struct task_struct *t)
if (t == idle_task(cpu))
return rcu_idle_task_is_holdout(t, cpu);
+ else if (is_idle_task(t))
+ return rcu_idle_inj_is_holdout(t, cpu);
return true;
}
--
2.40.1
* Re: [PATCH v2 07/10] rcu/tasks: Check RCU watching state for holdout idle injection tasks
From: Frederic Weisbecker @ 2024-10-09 14:37 UTC (permalink / raw)
To: neeraj.upadhyay
Cc: rcu, linux-kernel, paulmck, joel, boqun.feng, urezki, rostedt,
mathieu.desnoyers, jiangshanlai, qiang.zhang1211, peterz,
neeraj.upadhyay
On Wed, Oct 09, 2024 at 06:21:24PM +0530, neeraj.upadhyay@kernel.org wrote:
> From: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
>
> Use RCU watching state of a CPU to check whether RCU-tasks GP
> need to wait for idle injection task on that CPU. Idle injection
> tasks which are in deep-idle states where RCU is not watching or
> which have transitioned to/from deep-idle state do not block
> RCU-tasks grace period.
>
> Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
For now this should work because there is a single user, which is
a per-cpu kthread, therefore no RCU-watching writer can race
against another (real idle vs idle injection, or idle injection vs
idle injection) without first going through a voluntary context
switch. But who knows about the future? If an idle injection kthread
is preempted by another idle injection right after clearing PF_IDLE,
there could be some spurious QS accounted for the preempted
kthread.
So perhaps we can consider idle injection as any normal task and
wait for it to voluntarily schedule?
Well I see DEFAULT_DURATION_JIFFIES = 6, which is 60 ms on HZ=100.
Yeah that's a lot...so perhaps this patch is needed after all...
> ---
> kernel/rcu/tasks.h | 63 +++++++++++++++++++++++++++++++++++-----------
> 1 file changed, 48 insertions(+), 15 deletions(-)
>
> diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
> index d8506d2e6f54..1947f9b6346d 100644
> --- a/kernel/rcu/tasks.h
> +++ b/kernel/rcu/tasks.h
> @@ -38,6 +38,8 @@ typedef void (*postgp_func_t)(struct rcu_tasks *rtp);
> * @rtpp: Pointer to the rcu_tasks structure.
> * @rcu_watching_snap: Per-GP RCU-watching snapshot for idle tasks.
> * @rcu_watching_snap_rec: RCU-watching snapshot recorded for idle task.
> + * @rcu_watching_idle_inj_snap: Per-GP RCU-watching snapshot for idle inject task.
> + * @rcu_watching_idle_inj_rec: RCU-watching snapshot recorded for idle inject task.
> */
> struct rcu_tasks_percpu {
> struct rcu_segcblist cblist;
> @@ -56,6 +58,8 @@ struct rcu_tasks_percpu {
> struct rcu_tasks *rtpp;
> int rcu_watching_snap;
> bool rcu_watching_snap_rec;
> + int rcu_watching_idle_inj_snap;
> + bool rcu_watching_idle_inj_rec;
So how about:
struct rcu_watching_task {
	int snap;
	bool rec;
};
...
struct rcu_tasks_percpu {
	...
	struct rcu_watching_task idle_task;
	struct rcu_watching_task idle_inject;
};
Thanks.
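[Editor's note: the grouping suggested in this reply, sketched as self-contained userspace C. Field names follow the mail; the real rcu_tasks_percpu contents are elided and the wrapper struct name is hypothetical.]

```c
#include <assert.h>
#include <stdbool.h>

/* One snapshot record per kind of idle task, as proposed above. */
struct rcu_watching_task {
	int snap;	/* per-GP RCU-watching snapshot */
	bool rec;	/* snapshot recorded this GP? */
};

/* Hypothetical stand-in for rcu_tasks_percpu with the existing
 * fields elided; the two parallel field pairs collapse into two
 * struct instances. */
struct rcu_tasks_percpu_sketch {
	/* ...existing rcu_tasks_percpu fields elided... */
	struct rcu_watching_task idle_task;
	struct rcu_watching_task idle_inject;
};
```

With this shape, rcu_idle_check_rcu_watching() could take a single `struct rcu_watching_task *` instead of separate snap/rec pointers.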
* [PATCH v2 08/10] rcu/tasks: Make RCU-tasks pay attention to idle tasks
From: neeraj.upadhyay @ 2024-10-09 12:51 UTC (permalink / raw)
To: rcu
Cc: linux-kernel, paulmck, joel, frederic, boqun.feng, urezki,
rostedt, mathieu.desnoyers, jiangshanlai, qiang.zhang1211, peterz,
neeraj.upadhyay, Neeraj Upadhyay
From: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
Currently, idle tasks are ignored by RCU-tasks. Change this to
start paying attention to idle tasks, except in deep-idle functions
where RCU is not watching. With this, for architectures where
kernel entry/exit and deep-idle functions have been properly tagged
noinstr, Tasks Rude RCU can be disabled.
[ neeraj.upadhyay: Frederic Weisbecker and Paul E. McKenney feedback. ]
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
.../RCU/Design/Requirements/Requirements.rst | 12 +++---
kernel/rcu/tasks.h | 41 ++++++++-----------
2 files changed, 24 insertions(+), 29 deletions(-)
diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst
index 6125e7068d2c..5016b85d53d7 100644
--- a/Documentation/RCU/Design/Requirements/Requirements.rst
+++ b/Documentation/RCU/Design/Requirements/Requirements.rst
@@ -2611,8 +2611,8 @@ critical sections that are delimited by voluntary context switches, that
is, calls to schedule(), cond_resched(), and
synchronize_rcu_tasks(). In addition, transitions to and from
userspace execution also delimit tasks-RCU read-side critical sections.
-Idle tasks are ignored by Tasks RCU, and Tasks Rude RCU may be used to
-interact with them.
+Idle tasks which are idle from RCU's perspective are ignored by Tasks RCU,
+and Tasks Rude RCU may be used to interact with them.
Note well that involuntary context switches are *not* Tasks-RCU quiescent
states. After all, in preemptible kernels, a task executing code in a
@@ -2643,10 +2643,10 @@ moniker. And this operation is considered to be quite rude by real-time
workloads that don't want their ``nohz_full`` CPUs receiving IPIs and
by battery-powered systems that don't want their idle CPUs to be awakened.
-Once kernel entry/exit and deep-idle functions have been properly tagged
-``noinstr``, Tasks RCU can start paying attention to idle tasks (except
-those that are idle from RCU's perspective) and then Tasks Rude RCU can
-be removed from the kernel.
+As Tasks RCU now pays attention to idle tasks (except those that are idle
+from RCU's perspective), once kernel entry/exit and deep-idle functions have
+been properly tagged ``noinstr``, Tasks Rude RCU can be removed from the
+kernel.
The tasks-rude-RCU API is also reader-marking-free and thus quite compact,
consisting solely of synchronize_rcu_tasks_rude().
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 1947f9b6346d..72dc0d0a4a8f 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -912,14 +912,15 @@ static void rcu_tasks_wait_gp(struct rcu_tasks *rtp)
////////////////////////////////////////////////////////////////////////
//
// Simple variant of RCU whose quiescent states are voluntary context
-// switch, cond_resched_tasks_rcu_qs(), user-space execution, and idle.
-// As such, grace periods can take one good long time. There are no
-// read-side primitives similar to rcu_read_lock() and rcu_read_unlock()
-// because this implementation is intended to get the system into a safe
-// state for some of the manipulations involved in tracing and the like.
-// Finally, this implementation does not support high call_rcu_tasks()
-// rates from multiple CPUs. If this is required, per-CPU callback lists
-// will be needed.
+// switch, cond_resched_tasks_rcu_qs(), user-space execution, and idle
+// tasks which are in RCU-idle context. As such, grace periods can take
+// one good long time. There are no read-side primitives similar to
+// rcu_read_lock() and rcu_read_unlock() because this implementation is
+// intended to get the system into a safe state for some of the
+// manipulations involved in tracing and the like. Finally, this
+// implementation does not support high call_rcu_tasks() rates from
+// multiple CPUs. If this is required, per-CPU callback lists will be
+// needed.
//
// The implementation uses rcu_tasks_wait_gp(), which relies on function
// pointers in the rcu_tasks structure. The rcu_spawn_tasks_kthread()
@@ -1079,14 +1080,6 @@ static bool rcu_tasks_is_holdout(struct task_struct *t)
if (!READ_ONCE(t->on_rq))
return false;
- /*
- * Idle tasks (or idle injection) within the idle loop are RCU-tasks
- * quiescent states. But CPU boot code performed by the idle task
- * isn't a quiescent state.
- */
- if (is_idle_task(t))
- return false;
-
cpu = task_cpu(t);
if (t == idle_task(cpu))
@@ -1265,11 +1258,12 @@ static void tasks_rcu_exit_srcu_stall(struct timer_list *unused)
* period elapses, in other words after all currently executing RCU
* read-side critical sections have completed. call_rcu_tasks() assumes
* that the read-side critical sections end at a voluntary context
- * switch (not a preemption!), cond_resched_tasks_rcu_qs(), entry into idle,
- * or transition to usermode execution. As such, there are no read-side
- * primitives analogous to rcu_read_lock() and rcu_read_unlock() because
- * this primitive is intended to determine that all tasks have passed
- * through a safe state, not so much for data-structure synchronization.
+ * switch (not a preemption!), cond_resched_tasks_rcu_qs(), entry into
+ * RCU-idle context or transition to usermode execution. As such, there
+ * are no read-side primitives analogous to rcu_read_lock() and
+ * rcu_read_unlock() because this primitive is intended to determine
+ * that all tasks have passed through a safe state, not so much for
+ * data-structure synchronization.
*
* See the description of call_rcu() for more detailed information on
* memory ordering guarantees.
@@ -1287,8 +1281,9 @@ EXPORT_SYMBOL_GPL(call_rcu_tasks);
* grace period has elapsed, in other words after all currently
* executing rcu-tasks read-side critical sections have elapsed. These
* read-side critical sections are delimited by calls to schedule(),
- * cond_resched_tasks_rcu_qs(), idle execution, userspace execution, calls
- * to synchronize_rcu_tasks(), and (in theory, anyway) cond_resched().
+ * cond_resched_tasks_rcu_qs(), idle execution within RCU-idle context,
+ * userspace execution, calls to synchronize_rcu_tasks(), and (in theory,
+ * anyway) cond_resched().
*
* This is a very specialized primitive, intended only for a few uses in
* tracing and other situations requiring manipulation of function
--
2.40.1
* [PATCH v2 09/10] context_tracking: Invoke RCU-tasks enter/exit for NMI context
2024-10-09 12:51 [PATCH v2 00/10] Make RCU Tasks scan idle tasks neeraj.upadhyay
` (7 preceding siblings ...)
2024-10-09 12:51 ` [PATCH v2 08/10] rcu/tasks: Make RCU-tasks pay attention to idle tasks neeraj.upadhyay
@ 2024-10-09 12:51 ` neeraj.upadhyay
2024-10-09 12:51 ` [PATCH v2 10/10] rcu: Allow short-circuiting of synchronize_rcu_tasks_rude() neeraj.upadhyay
9 siblings, 0 replies; 12+ messages in thread
From: neeraj.upadhyay @ 2024-10-09 12:51 UTC (permalink / raw)
To: rcu
Cc: linux-kernel, paulmck, joel, frederic, boqun.feng, urezki,
rostedt, mathieu.desnoyers, jiangshanlai, qiang.zhang1211, peterz,
neeraj.upadhyay, Neeraj Upadhyay
From: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
rcu_task_enter() and rcu_task_exit() are not called on NMI entry
and exit, so a Tasks-RCU-Rude grace-period wait is currently required
to ensure that NMI handlers have entered/exited a Tasks-RCU EQS.
Architectures which mark all code sections where RCU is not watching
as noinstr do not require Tasks-RCU-Rude; once those architectures
switch to not using Tasks-RCU-Rude, NMI handlers' transitions to EQS
will need to be handled correctly for Tasks-RCU holdout tasks running
on nohz_full CPUs. As it is safe to call these two functions from NMI
context, remove the in_nmi() check. This ensures that Tasks-RCU
entry/exit is marked correctly for NMI handlers.
With this check removed, all callers of ct_kernel_exit_state() and
ct_kernel_enter_state() now also call rcu_task_exit() and
rcu_task_enter() respectively. So, fold the rcu_task_exit() and
rcu_task_enter() calls into ct_kernel_exit_state() and
ct_kernel_enter_state().
Reported-by: Frederic Weisbecker <frederic@kernel.org>
Suggested-by: Frederic Weisbecker <frederic@kernel.org>
Suggested-by: "Paul E. McKenney" <paulmck@kernel.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
kernel/context_tracking.c | 11 +++--------
1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 938c48952d26..85ced563af23 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -91,6 +91,7 @@ static noinstr void ct_kernel_exit_state(int offset)
seq = ct_state_inc(offset);
// RCU is no longer watching. Better be in extended quiescent state!
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & CT_RCU_WATCHING));
+ rcu_task_exit();
}
/*
@@ -102,6 +103,8 @@ static noinstr void ct_kernel_enter_state(int offset)
{
int seq;
+ rcu_task_enter();
+
/*
* CPUs seeing atomic_add_return() must see prior idle sojourns,
* and we also must force ordering with the next RCU read-side
@@ -149,7 +152,6 @@ static void noinstr ct_kernel_exit(bool user, int offset)
// RCU is watching here ...
ct_kernel_exit_state(offset);
// ... but is no longer watching here.
- rcu_task_exit();
}
/*
@@ -173,7 +175,6 @@ static void noinstr ct_kernel_enter(bool user, int offset)
ct->nesting++;
return;
}
- rcu_task_enter();
// RCU is not watching here ...
ct_kernel_enter_state(offset);
// ... but is watching here.
@@ -238,9 +239,6 @@ void noinstr ct_nmi_exit(void)
// RCU is watching here ...
ct_kernel_exit_state(CT_RCU_WATCHING);
// ... but is no longer watching here.
-
- if (!in_nmi())
- rcu_task_exit();
}
/**
@@ -273,9 +271,6 @@ void noinstr ct_nmi_enter(void)
*/
if (!rcu_is_watching_curr_cpu()) {
- if (!in_nmi())
- rcu_task_enter();
-
// RCU is not watching here ...
ct_kernel_enter_state(CT_RCU_WATCHING);
// ... but is watching here.
--
2.40.1
* [PATCH v2 10/10] rcu: Allow short-circuiting of synchronize_rcu_tasks_rude()
2024-10-09 12:51 [PATCH v2 00/10] Make RCU Tasks scan idle tasks neeraj.upadhyay
` (8 preceding siblings ...)
2024-10-09 12:51 ` [PATCH v2 09/10] context_tracking: Invoke RCU-tasks enter/exit for NMI context neeraj.upadhyay
@ 2024-10-09 12:51 ` neeraj.upadhyay
9 siblings, 0 replies; 12+ messages in thread
From: neeraj.upadhyay @ 2024-10-09 12:51 UTC (permalink / raw)
To: rcu
Cc: linux-kernel, paulmck, joel, frederic, boqun.feng, urezki,
rostedt, mathieu.desnoyers, jiangshanlai, qiang.zhang1211, peterz,
neeraj.upadhyay, Neeraj Upadhyay
From: "Paul E. McKenney" <paulmck@kernel.org>
There are now architectures for which all deep-idle and entry-exit
functions are properly inlined or marked noinstr. Such architectures do
not need synchronize_rcu_tasks_rude(), or will not once RCU Tasks has
been modified to pay attention to idle tasks. This commit therefore
allows the CONFIG_ARCH_WANTS_NO_INSTR Kconfig option to turn
synchronize_rcu_tasks_rude() into a no-op.
To facilitate testing, kernels built by rcutorture scripting will enable
RCU Tasks Rude even on systems that do not need it.
[ paulmck: Apply Peter Zijlstra feedback. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
---
kernel/rcu/tasks.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 72dc0d0a4a8f..ef9de6b91a3d 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -1485,7 +1485,8 @@ static void call_rcu_tasks_rude(struct rcu_head *rhp, rcu_callback_t func)
*/
void synchronize_rcu_tasks_rude(void)
{
- synchronize_rcu_tasks_generic(&rcu_tasks_rude);
+ if (!IS_ENABLED(CONFIG_ARCH_WANTS_NO_INSTR) || IS_ENABLED(CONFIG_FORCE_TASKS_RUDE_RCU))
+ synchronize_rcu_tasks_generic(&rcu_tasks_rude);
}
EXPORT_SYMBOL_GPL(synchronize_rcu_tasks_rude);
--
2.40.1