From: Frederic Weisbecker <frederic@kernel.org>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>,
Boqun Feng <boqun.feng@gmail.com>,
Joel Fernandes <joel@joelfernandes.org>,
Neeraj Upadhyay <neeraj.upadhyay@amd.com>,
Uladzislau Rezki <urezki@gmail.com>,
Zqiang <qiang.zhang1211@gmail.com>, rcu <rcu@vger.kernel.org>,
"Paul E . McKenney" <paulmck@kernel.org>,
Peter Zijlstra <peterz@infradead.org>
Subject: [PATCH 1/2] rcu/tasks: Fix stale task snaphot from TASK-TRACE
Date: Fri, 17 May 2024 17:23:02 +0200 [thread overview]
Message-ID: <20240517152303.19689-2-frederic@kernel.org> (raw)
In-Reply-To: <20240517152303.19689-1-frederic@kernel.org>
When RCU-TASKS-TRACE pre-gp takes a snapshot of the current task running
on all online CPUs, no explicit ordering enforces a missed context
switched task to see the pre-GP update side accesses. The following
diagram, courtesy of Paul, shows the possible bad scenario:
CPU 0 CPU 1
----- -----
// Pre-GP update side access
WRITE_ONCE(*X, 1);
smp_mb();
r0 = rq->curr;
RCU_INIT_POINTER(rq->curr, TASK_B)
spin_unlock(rq)
rcu_read_lock_trace()
r1 = X;
/* ignore TASK_B */
Either r0==TASK_B or r1==1 is needed but neither is guaranteed.
One possible solution to solve this is to wait for an RCU grace period
at the beginning of the TASKS-TRACE grace period before taking the
current tasks snaphot. However this would introduce more latency to
TASKS-TRACE update sides.
Choose another solution: lock the target runqueue lock while taking the
current task snapshot. This makes sure that the update side sees
the latest context switch and subsequent context switches will see the
pre-GP update side accesses.
Fixes: e386b6725798 ("rcu-tasks: Eliminate RCU Tasks Trace IPIs to online CPUs")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
kernel/rcu/tasks.h | 5 +++++
kernel/sched/core.c | 14 +++++++-------
2 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 8adbd886ad2e..6a9ee35a282e 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -1737,6 +1737,11 @@ static void rcu_tasks_trace_pregp_step(struct list_head *hop)
// allow safe access to the hop list.
for_each_online_cpu(cpu) {
rcu_read_lock();
+ /*
+ * RQ must be locked because no ordering exists/can be relied upon
+ * between rq->curr write and subsequent read sides. This ensures that
+ * further context switching tasks will see update side pre-GP accesses.
+ */
t = cpu_curr_snapshot(cpu);
if (rcu_tasks_trace_pertask_prep(t, true))
trc_add_holdout(t, hop);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7019a40457a6..fa6e60d5e3be 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4467,12 +4467,7 @@ int task_call_func(struct task_struct *p, task_call_f func, void *arg)
* @cpu: The CPU on which to snapshot the task.
*
* Returns the task_struct pointer of the task "currently" running on
- * the specified CPU. If the same task is running on that CPU throughout,
- * the return value will be a pointer to that task's task_struct structure.
- * If the CPU did any context switches even vaguely concurrently with the
- * execution of this function, the return value will be a pointer to the
- * task_struct structure of a randomly chosen task that was running on
- * that CPU somewhere around the time that this function was executing.
+ * the specified CPU.
*
* If the specified CPU was offline, the return value is whatever it
* is, perhaps a pointer to the task_struct structure of that CPU's idle
@@ -4486,11 +4481,16 @@ int task_call_func(struct task_struct *p, task_call_f func, void *arg)
*/
struct task_struct *cpu_curr_snapshot(int cpu)
{
+ struct rq *rq = cpu_rq(cpu);
struct task_struct *t;
+ struct rq_flags rf;
- smp_mb(); /* Pairing determined by caller's synchronization design. */
+ rq_lock_irqsave(rq, &rf);
+ smp_mb__after_spinlock(); /* Pairing determined by caller's synchronization design. */
t = rcu_dereference(cpu_curr(cpu));
+ rq_unlock_irqrestore(rq, &rf);
smp_mb(); /* Pairing determined by caller's synchronization design. */
+
return t;
}
--
2.44.0
next prev parent reply other threads:[~2024-05-17 15:23 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-17 15:23 [PATCH 0/2] rcu/tasks: Fix stale task snapshot Frederic Weisbecker
2024-05-17 15:23 ` Frederic Weisbecker [this message]
2024-05-17 23:30 ` [PATCH 1/2] rcu/tasks: Fix stale task snaphot from TASK-TRACE Paul E. McKenney
2024-05-17 15:23 ` [PATCH 2/2] rcu/tasks: Further comment ordering around current task snapshot on TASK-TRACE Frederic Weisbecker
2024-05-20 18:48 ` Paul E. McKenney
2024-05-20 20:41 ` Frederic Weisbecker
2024-05-20 23:25 ` Paul E. McKenney
2024-05-21 9:59 ` Frederic Weisbecker
2024-05-21 13:38 ` Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240517152303.19689-2-frederic@kernel.org \
--to=frederic@kernel.org \
--cc=boqun.feng@gmail.com \
--cc=joel@joelfernandes.org \
--cc=linux-kernel@vger.kernel.org \
--cc=neeraj.upadhyay@amd.com \
--cc=paulmck@kernel.org \
--cc=peterz@infradead.org \
--cc=qiang.zhang1211@gmail.com \
--cc=rcu@vger.kernel.org \
--cc=urezki@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox