public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/2] rcu/tasks: Fix stale task snapshot
@ 2024-05-17 15:23 Frederic Weisbecker
  2024-05-17 15:23 ` [PATCH 1/2] rcu/tasks: Fix stale task snaphot from TASK-TRACE Frederic Weisbecker
  2024-05-17 15:23 ` [PATCH 2/2] rcu/tasks: Further comment ordering around current task snapshot on TASK-TRACE Frederic Weisbecker
  0 siblings, 2 replies; 9+ messages in thread
From: Frederic Weisbecker @ 2024-05-17 15:23 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Boqun Feng, Joel Fernandes, Neeraj Upadhyay,
	Uladzislau Rezki, Zqiang, rcu, Paul E . McKenney, Peter Zijlstra

Fix an issue in the RCU Tasks Trace pre-GP step that might collect stale
current tasks on online CPUs, missing the read sides of the latest
context-switching tasks.

Frederic Weisbecker (2):
  rcu/tasks: Fix stale task snaphot from TASK-TRACE
  rcu/tasks: Further comment ordering around current task snapshot on
    TASK-TRACE

 kernel/rcu/tasks.h  | 17 +++++++++++++++++
 kernel/sched/core.c | 14 +++++++-------
 2 files changed, 24 insertions(+), 7 deletions(-)

-- 
2.44.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/2] rcu/tasks: Fix stale task snaphot from TASK-TRACE
  2024-05-17 15:23 [PATCH 0/2] rcu/tasks: Fix stale task snapshot Frederic Weisbecker
@ 2024-05-17 15:23 ` Frederic Weisbecker
  2024-05-17 23:30   ` Paul E. McKenney
  2024-05-17 15:23 ` [PATCH 2/2] rcu/tasks: Further comment ordering around current task snapshot on TASK-TRACE Frederic Weisbecker
  1 sibling, 1 reply; 9+ messages in thread
From: Frederic Weisbecker @ 2024-05-17 15:23 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Boqun Feng, Joel Fernandes, Neeraj Upadhyay,
	Uladzislau Rezki, Zqiang, rcu, Paul E . McKenney, Peter Zijlstra

When RCU-TASKS-TRACE pre-gp takes a snapshot of the current task running
on all online CPUs, no explicit ordering enforces that a just
context-switched task sees the pre-GP update-side accesses. The following
diagram, courtesy of Paul, shows the possible bad scenario:

        CPU 0                                           CPU 1
        -----                                           -----

        // Pre-GP update side access
        WRITE_ONCE(*X, 1);
        smp_mb();
        r0 = rq->curr;
                                                        RCU_INIT_POINTER(rq->curr, TASK_B)
                                                        spin_unlock(rq)
                                                        rcu_read_lock_trace()
                                                        r1 = X;
        /* ignore TASK_B */

Either r0==TASK_B or r1==1 is needed but neither is guaranteed.
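
For reference, the scenario above can be sketched as a Linux-kernel memory
model (herd7) litmus test. This is an illustrative reconstruction, not taken
from the thread: the test name is made up and rq->curr is modeled as a plain
int (0 for the old task, 1 for TASK_B):

```
C rcu-tasks-trace-stale-snapshot

{}

P0(int *x, int *curr)
{
	int r0;

	WRITE_ONCE(*x, 1);	/* Pre-GP update-side access. */
	smp_mb();
	r0 = READ_ONCE(*curr);	/* Snapshot of rq->curr. */
}

P1(int *x, int *curr)
{
	int r1;

	WRITE_ONCE(*curr, 1);	/* Context switch: rq->curr = TASK_B. */
	r1 = READ_ONCE(*x);	/* TASK_B's read side, no ordering in between. */
}

exists (0:r0=0 /\ 1:r1=0)
```

Because P1 has no barrier between its two accesses, herd7 should report the
exists clause as reachable: the snapshot can miss TASK_B while TASK_B also
misses the pre-GP write.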

One possible solution is to wait for an RCU grace period
at the beginning of the TASKS-TRACE grace period before taking the
current tasks snapshot. However this would introduce more latency to
TASKS-TRACE update sides.

Choose another solution: hold the target CPU's runqueue lock while taking
the current task snapshot. This makes sure that the update side sees
the latest context switch and that subsequent context switches will see
the pre-GP update side accesses.

Fixes: e386b6725798 ("rcu-tasks: Eliminate RCU Tasks Trace IPIs to online CPUs")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/rcu/tasks.h  |  5 +++++
 kernel/sched/core.c | 14 +++++++-------
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 8adbd886ad2e..6a9ee35a282e 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -1737,6 +1737,11 @@ static void rcu_tasks_trace_pregp_step(struct list_head *hop)
 	// allow safe access to the hop list.
 	for_each_online_cpu(cpu) {
 		rcu_read_lock();
+		/*
+		 * RQ must be locked because no ordering exists/can be relied upon
+		 * between rq->curr write and subsequent read sides. This ensures that
+		 * further context switching tasks will see update side pre-GP accesses.
+		 */
 		t = cpu_curr_snapshot(cpu);
 		if (rcu_tasks_trace_pertask_prep(t, true))
 			trc_add_holdout(t, hop);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7019a40457a6..fa6e60d5e3be 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4467,12 +4467,7 @@ int task_call_func(struct task_struct *p, task_call_f func, void *arg)
  * @cpu: The CPU on which to snapshot the task.
  *
  * Returns the task_struct pointer of the task "currently" running on
- * the specified CPU.  If the same task is running on that CPU throughout,
- * the return value will be a pointer to that task's task_struct structure.
- * If the CPU did any context switches even vaguely concurrently with the
- * execution of this function, the return value will be a pointer to the
- * task_struct structure of a randomly chosen task that was running on
- * that CPU somewhere around the time that this function was executing.
+ * the specified CPU.
  *
  * If the specified CPU was offline, the return value is whatever it
  * is, perhaps a pointer to the task_struct structure of that CPU's idle
@@ -4486,11 +4481,16 @@ int task_call_func(struct task_struct *p, task_call_f func, void *arg)
  */
 struct task_struct *cpu_curr_snapshot(int cpu)
 {
+	struct rq *rq = cpu_rq(cpu);
 	struct task_struct *t;
+	struct rq_flags rf;
 
-	smp_mb(); /* Pairing determined by caller's synchronization design. */
+	rq_lock_irqsave(rq, &rf);
+	smp_mb__after_spinlock(); /* Pairing determined by caller's synchronization design. */
 	t = rcu_dereference(cpu_curr(cpu));
+	rq_unlock_irqrestore(rq, &rf);
 	smp_mb(); /* Pairing determined by caller's synchronization design. */
+
 	return t;
 }
 
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/2] rcu/tasks: Further comment ordering around current task snapshot on TASK-TRACE
  2024-05-17 15:23 [PATCH 0/2] rcu/tasks: Fix stale task snapshot Frederic Weisbecker
  2024-05-17 15:23 ` [PATCH 1/2] rcu/tasks: Fix stale task snaphot from TASK-TRACE Frederic Weisbecker
@ 2024-05-17 15:23 ` Frederic Weisbecker
  2024-05-20 18:48   ` Paul E. McKenney
  1 sibling, 1 reply; 9+ messages in thread
From: Frederic Weisbecker @ 2024-05-17 15:23 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Boqun Feng, Joel Fernandes, Neeraj Upadhyay,
	Uladzislau Rezki, Zqiang, rcu, Paul E . McKenney, Peter Zijlstra

Comment the current understanding of barriers and locking role around
task snapshot.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/rcu/tasks.h | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 6a9ee35a282e..05413b37dd6e 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -1738,9 +1738,21 @@ static void rcu_tasks_trace_pregp_step(struct list_head *hop)
 	for_each_online_cpu(cpu) {
 		rcu_read_lock();
 		/*
-		 * RQ must be locked because no ordering exists/can be relied upon
-		 * between rq->curr write and subsequent read sides. This ensures that
-		 * further context switching tasks will see update side pre-GP accesses.
+		 * RQ lock + smp_mb__after_spinlock() before reading rq->curr serve
+		 * three purposes:
+		 *
+		 * 1) Ordering against previous tasks' accesses (though already enforced
+		 *    by upcoming IPIs and the post-GP synchronize_rcu()).
+		 *
+		 * 2) Make sure not to miss latest context switch, because no ordering
+		 *    exists/can be relied upon between rq->curr write and subsequent read
+		 *    sides.
+		 *
+		 * 3) Make sure subsequent context switching tasks will see update side
+		 *    pre-GP accesses.
+		 *
+		 * smp_mb() after reading rq->curr doesn't play a significant role and might
+		 * be considered for removal in the future.
 		 */
 		t = cpu_curr_snapshot(cpu);
 		if (rcu_tasks_trace_pertask_prep(t, true))
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] rcu/tasks: Fix stale task snaphot from TASK-TRACE
  2024-05-17 15:23 ` [PATCH 1/2] rcu/tasks: Fix stale task snaphot from TASK-TRACE Frederic Weisbecker
@ 2024-05-17 23:30   ` Paul E. McKenney
  0 siblings, 0 replies; 9+ messages in thread
From: Paul E. McKenney @ 2024-05-17 23:30 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Boqun Feng, Joel Fernandes, Neeraj Upadhyay,
	Uladzislau Rezki, Zqiang, rcu, Peter Zijlstra

On Fri, May 17, 2024 at 05:23:02PM +0200, Frederic Weisbecker wrote:
> When RCU-TASKS-TRACE pre-gp takes a snapshot of the current task running
> on all online CPUs, no explicit ordering enforces that a just
> context-switched task sees the pre-GP update-side accesses. The following
> diagram, courtesy of Paul, shows the possible bad scenario:
> 
>         CPU 0                                           CPU 1
>         -----                                           -----
> 
>         // Pre-GP update side access
>         WRITE_ONCE(*X, 1);
>         smp_mb();
>         r0 = rq->curr;
>                                                         RCU_INIT_POINTER(rq->curr, TASK_B)
>                                                         spin_unlock(rq)
>                                                         rcu_read_lock_trace()
>                                                         r1 = X;
>         /* ignore TASK_B */
> 
> Either r0==TASK_B or r1==1 is needed but neither is guaranteed.
> 
> One possible solution is to wait for an RCU grace period
> at the beginning of the TASKS-TRACE grace period before taking the
> current tasks snapshot. However this would introduce more latency to
> TASKS-TRACE update sides.
> 
> Choose another solution: hold the target CPU's runqueue lock while taking
> the current task snapshot. This makes sure that the update side sees
> the latest context switch and that subsequent context switches will see
> the pre-GP update side accesses.
> 
> Fixes: e386b6725798 ("rcu-tasks: Eliminate RCU Tasks Trace IPIs to online CPUs")
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>

Excellent catch!!!

Queued for review and testing with the usual wordsmithing shown below.

I am happy to push this via -rcu, but if you were instead looking to
send it via some other path:

Acked-by: Paul E. McKenney <paulmck@kernel.org>

							Thanx, Paul

------------------------------------------------------------------------

commit f04b876e13b8218867f4e4538488c20fbcafc4f0
Author: Frederic Weisbecker <frederic@kernel.org>
Date:   Fri May 17 17:23:02 2024 +0200

    rcu/tasks: Fix stale task snaphot for Tasks Trace
    
    When RCU-TASKS-TRACE pre-gp takes a snapshot of the current task running
    on all online CPUs, no explicit ordering synchronizes properly with a
    context switch.  This lack of ordering can permit the new task to miss
    pre-grace-period update-side accesses.  The following diagram, courtesy
    of Paul, shows the possible bad scenario:
    
            CPU 0                                           CPU 1
            -----                                           -----
    
            // Pre-GP update side access
            WRITE_ONCE(*X, 1);
            smp_mb();
            r0 = rq->curr;
                                                            RCU_INIT_POINTER(rq->curr, TASK_B)
                                                            spin_unlock(rq)
                                                            rcu_read_lock_trace()
                                                            r1 = X;
            /* ignore TASK_B */
    
    Either r0==TASK_B or r1==1 is needed but neither is guaranteed.
    
    One possible solution to solve this is to wait for an RCU grace period
    at the beginning of the RCU-tasks-trace grace period before taking the
    current tasks snapshot. However this would introduce large additional
    latencies to RCU-tasks-trace grace periods.
    
    Another solution is to lock the target runqueue while taking the current
    task snapshot. This ensures that the update side sees the latest context
    switch and subsequent context switches will see the pre-grace-period
    update side accesses.
    
    This commit therefore adds runqueue locking to cpu_curr_snapshot().
    
    Fixes: e386b6725798 ("rcu-tasks: Eliminate RCU Tasks Trace IPIs to online CPUs")
    Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 8adbd886ad2ee..58d8263c12392 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -1737,6 +1737,9 @@ static void rcu_tasks_trace_pregp_step(struct list_head *hop)
 	// allow safe access to the hop list.
 	for_each_online_cpu(cpu) {
 		rcu_read_lock();
+		// Note that cpu_curr_snapshot() locks the target CPU's
+		// runqueue.  This ensures that subsequent tasks running
+		// on that CPU will see the updater's pre-GP accesses.
 		t = cpu_curr_snapshot(cpu);
 		if (rcu_tasks_trace_pertask_prep(t, true))
 			trc_add_holdout(t, hop);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7019a40457a6d..fa6e60d5e3be3 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4467,12 +4467,7 @@ int task_call_func(struct task_struct *p, task_call_f func, void *arg)
  * @cpu: The CPU on which to snapshot the task.
  *
  * Returns the task_struct pointer of the task "currently" running on
- * the specified CPU.  If the same task is running on that CPU throughout,
- * the return value will be a pointer to that task's task_struct structure.
- * If the CPU did any context switches even vaguely concurrently with the
- * execution of this function, the return value will be a pointer to the
- * task_struct structure of a randomly chosen task that was running on
- * that CPU somewhere around the time that this function was executing.
+ * the specified CPU.
  *
  * If the specified CPU was offline, the return value is whatever it
  * is, perhaps a pointer to the task_struct structure of that CPU's idle
@@ -4486,11 +4481,16 @@ int task_call_func(struct task_struct *p, task_call_f func, void *arg)
  */
 struct task_struct *cpu_curr_snapshot(int cpu)
 {
+	struct rq *rq = cpu_rq(cpu);
 	struct task_struct *t;
+	struct rq_flags rf;
 
-	smp_mb(); /* Pairing determined by caller's synchronization design. */
+	rq_lock_irqsave(rq, &rf);
+	smp_mb__after_spinlock(); /* Pairing determined by caller's synchronization design. */
 	t = rcu_dereference(cpu_curr(cpu));
+	rq_unlock_irqrestore(rq, &rf);
 	smp_mb(); /* Pairing determined by caller's synchronization design. */
+
 	return t;
 }
 

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] rcu/tasks: Further comment ordering around current task snapshot on TASK-TRACE
  2024-05-17 15:23 ` [PATCH 2/2] rcu/tasks: Further comment ordering around current task snapshot on TASK-TRACE Frederic Weisbecker
@ 2024-05-20 18:48   ` Paul E. McKenney
  2024-05-20 20:41     ` Frederic Weisbecker
  0 siblings, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2024-05-20 18:48 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Boqun Feng, Joel Fernandes, Neeraj Upadhyay,
	Uladzislau Rezki, Zqiang, rcu, Peter Zijlstra

On Fri, May 17, 2024 at 05:23:03PM +0200, Frederic Weisbecker wrote:
> Comment the current understanding of barriers and locking role around
> task snapshot.
> 
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> ---
>  kernel/rcu/tasks.h | 18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
> index 6a9ee35a282e..05413b37dd6e 100644
> --- a/kernel/rcu/tasks.h
> +++ b/kernel/rcu/tasks.h
> @@ -1738,9 +1738,21 @@ static void rcu_tasks_trace_pregp_step(struct list_head *hop)
>  	for_each_online_cpu(cpu) {
>  		rcu_read_lock();
>  		/*
> -		 * RQ must be locked because no ordering exists/can be relied upon
> -		 * between rq->curr write and subsequent read sides. This ensures that
> -		 * further context switching tasks will see update side pre-GP accesses.
> +		 * RQ lock + smp_mb__after_spinlock() before reading rq->curr serve
> +		 * three purposes:
> +		 *
> +		 * 1) Ordering against previous tasks' accesses (though already enforced
> +		 *    by upcoming IPIs and the post-GP synchronize_rcu()).
> +		 *
> +		 * 2) Make sure not to miss latest context switch, because no ordering
> +		 *    exists/can be relied upon between rq->curr write and subsequent read
> +		 *    sides.
> +		 *
> +		 * 3) Make sure subsequent context switching tasks will see update side
> +		 *    pre-GP accesses.
> +		 *
> +		 * smp_mb() after reading rq->curr doesn't play a significant role and might
> +		 * be considered for removal in the future.
>  		 */
>  		t = cpu_curr_snapshot(cpu);
>  		if (rcu_tasks_trace_pertask_prep(t, true))

How about this for that comment?

		// Note that cpu_curr_snapshot() picks up the target
		// CPU's current task while its runqueue is locked with an
		// smp_mb__after_spinlock().  This ensures that subsequent
		// tasks running on that CPU will see the updater's pre-GP
		// accesses.  The trailng smp_mb() in cpu_curr_snapshot()
		// does not currently play a role other than simplify
		// that function's ordering semantics.  If these simplified
		// ordering semantics continue to be redundant, that smp_mb()
		// might be removed.

I left out the "ordering against previous tasks' accesses" because,
as you say, this ordering is provided elsewhere.

Thoughts?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] rcu/tasks: Further comment ordering around current task snapshot on TASK-TRACE
  2024-05-20 18:48   ` Paul E. McKenney
@ 2024-05-20 20:41     ` Frederic Weisbecker
  2024-05-20 23:25       ` Paul E. McKenney
  0 siblings, 1 reply; 9+ messages in thread
From: Frederic Weisbecker @ 2024-05-20 20:41 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: LKML, Boqun Feng, Joel Fernandes, Neeraj Upadhyay,
	Uladzislau Rezki, Zqiang, rcu, Peter Zijlstra

Le Mon, May 20, 2024 at 11:48:54AM -0700, Paul E. McKenney a écrit :
> On Fri, May 17, 2024 at 05:23:03PM +0200, Frederic Weisbecker wrote:
> > Comment the current understanding of barriers and locking role around
> > task snapshot.
> > 
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > ---
> >  kernel/rcu/tasks.h | 18 +++++++++++++++---
> >  1 file changed, 15 insertions(+), 3 deletions(-)
> > 
> > diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
> > index 6a9ee35a282e..05413b37dd6e 100644
> > --- a/kernel/rcu/tasks.h
> > +++ b/kernel/rcu/tasks.h
> > @@ -1738,9 +1738,21 @@ static void rcu_tasks_trace_pregp_step(struct list_head *hop)
> >  	for_each_online_cpu(cpu) {
> >  		rcu_read_lock();
> >  		/*
> > -		 * RQ must be locked because no ordering exists/can be relied upon
> > -		 * between rq->curr write and subsequent read sides. This ensures that
> > -		 * further context switching tasks will see update side pre-GP accesses.
> > +		 * RQ lock + smp_mb__after_spinlock() before reading rq->curr serve
> > +		 * three purposes:
> > +		 *
> > +		 * 1) Ordering against previous tasks' accesses (though already enforced
> > +		 *    by upcoming IPIs and the post-GP synchronize_rcu()).
> > +		 *
> > +		 * 2) Make sure not to miss latest context switch, because no ordering
> > +		 *    exists/can be relied upon between rq->curr write and subsequent read
> > +		 *    sides.
> > +		 *
> > +		 * 3) Make sure subsequent context switching tasks will see update side
> > +		 *    pre-GP accesses.
> > +		 *
> > +		 * smp_mb() after reading rq->curr doesn't play a significant role and might
> > +		 * be considered for removal in the future.
> >  		 */
> >  		t = cpu_curr_snapshot(cpu);
> >  		if (rcu_tasks_trace_pertask_prep(t, true))
> 
> How about this for that comment?
> 
> 		// Note that cpu_curr_snapshot() picks up the target
> 		// CPU's current task while its runqueue is locked with an
> 		// smp_mb__after_spinlock().  This ensures that subsequent
> 		// tasks running on that CPU will see the updater's pre-GP
> 		// accesses.

Right, but to achieve that, the smp_mb() was already enough, courtesy of
the official full barrier in __schedule() that (this one at least) we could rely on:

Updater             Reader
------             -------
X = 1              rq->curr = A
                   // another context switch later
smp_mb()           smp_mb__after_spinlock() // right after rq_lock in __schedule()
READ rq->curr      rq->curr = B
                   READ X

If the updater misses A, B will see the update on X.

So I think we still need to justify the rq locking in the comments.

>                          The trailng smp_mb() in cpu_curr_snapshot()
> 		// does not currently play a role other than simplify
> 		// that function's ordering semantics.  If these simplified
> 		// ordering semantics continue to be redundant, that smp_mb()
> 		// might be removed.

That looks good.

> 
> I left out the "ordering agains previous tasks accesses" because,
> as you say, this ordering is provided elsewhere.

Right!

Thanks.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] rcu/tasks: Further comment ordering around current task snapshot on TASK-TRACE
  2024-05-20 20:41     ` Frederic Weisbecker
@ 2024-05-20 23:25       ` Paul E. McKenney
  2024-05-21  9:59         ` Frederic Weisbecker
  0 siblings, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2024-05-20 23:25 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Boqun Feng, Joel Fernandes, Neeraj Upadhyay,
	Uladzislau Rezki, Zqiang, rcu, Peter Zijlstra

On Mon, May 20, 2024 at 10:41:52PM +0200, Frederic Weisbecker wrote:
> Le Mon, May 20, 2024 at 11:48:54AM -0700, Paul E. McKenney a écrit :
> > On Fri, May 17, 2024 at 05:23:03PM +0200, Frederic Weisbecker wrote:
> > > Comment the current understanding of barriers and locking role around
> > > task snapshot.
> > > 
> > > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > > ---
> > >  kernel/rcu/tasks.h | 18 +++++++++++++++---
> > >  1 file changed, 15 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
> > > index 6a9ee35a282e..05413b37dd6e 100644
> > > --- a/kernel/rcu/tasks.h
> > > +++ b/kernel/rcu/tasks.h
> > > @@ -1738,9 +1738,21 @@ static void rcu_tasks_trace_pregp_step(struct list_head *hop)
> > >  	for_each_online_cpu(cpu) {
> > >  		rcu_read_lock();
> > >  		/*
> > > -		 * RQ must be locked because no ordering exists/can be relied upon
> > > -		 * between rq->curr write and subsequent read sides. This ensures that
> > > -		 * further context switching tasks will see update side pre-GP accesses.
> > > +		 * RQ lock + smp_mb__after_spinlock() before reading rq->curr serve
> > > +		 * three purposes:
> > > +		 *
> > > +		 * 1) Ordering against previous tasks' accesses (though already enforced
> > > +		 *    by upcoming IPIs and the post-GP synchronize_rcu()).
> > > +		 *
> > > +		 * 2) Make sure not to miss latest context switch, because no ordering
> > > +		 *    exists/can be relied upon between rq->curr write and subsequent read
> > > +		 *    sides.
> > > +		 *
> > > +		 * 3) Make sure subsequent context switching tasks will see update side
> > > +		 *    pre-GP accesses.
> > > +		 *
> > > +		 * smp_mb() after reading rq->curr doesn't play a significant role and might
> > > +		 * be considered for removal in the future.
> > >  		 */
> > >  		t = cpu_curr_snapshot(cpu);
> > >  		if (rcu_tasks_trace_pertask_prep(t, true))
> > 
> > How about this for that comment?
> > 
> > 		// Note that cpu_curr_snapshot() picks up the target
> > 		// CPU's current task while its runqueue is locked with an
> > 		// smp_mb__after_spinlock().  This ensures that subsequent
> > 		// tasks running on that CPU will see the updater's pre-GP
> > 		// accesses.
> 
> Right, but to achieve that, the smp_mb() was already enough, courtesy of
> the official full barrier in __schedule() that (this one at least) we could rely on:
> 
> Updater             Reader
> ------             -------
> X = 1              rq->curr = A
>                    // another context switch later
> smp_mb()           smp_mb__after_spinlock() // right after rq_lock in __schedule()
> READ rq->curr      rq->curr = B
>                    READ X
> 
> If the updater misses A, B will see the update on X.
> 
> So I think we still need to justify the rq locking in the comments.
> 
> >                          The trailng smp_mb() in cpu_curr_snapshot()
> > 		// does not currently play a role other than simplify
> > 		// that function's ordering semantics.  If these simplified
> > 		// ordering semantics continue to be redundant, that smp_mb()
> > 		// might be removed.
> 
> That looks good.
> 
> > 
> > I left out the "ordering against previous tasks' accesses" because,
> > as you say, this ordering is provided elsewhere.
> 
> Right!

Good points!  How about the following?

		// Note that cpu_curr_snapshot() picks up the target
		// CPU's current task while its runqueue is locked with
		// an smp_mb__after_spinlock().  This ensures that either
		// the grace-period kthread will see that task's read-side
		// critical section or the task will see the updater's pre-GP
		// accesses.  The trailng smp_mb() in cpu_curr_snapshot()
		// does not currently play a role other than simplify
		// that function's ordering semantics.  If these simplified
		// ordering semantics continue to be redundant, that smp_mb()
		// might be removed.

Keeping in mind that the commit's log fully lays out the troublesome
scenario.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] rcu/tasks: Further comment ordering around current task snapshot on TASK-TRACE
  2024-05-20 23:25       ` Paul E. McKenney
@ 2024-05-21  9:59         ` Frederic Weisbecker
  2024-05-21 13:38           ` Paul E. McKenney
  0 siblings, 1 reply; 9+ messages in thread
From: Frederic Weisbecker @ 2024-05-21  9:59 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: LKML, Boqun Feng, Joel Fernandes, Neeraj Upadhyay,
	Uladzislau Rezki, Zqiang, rcu, Peter Zijlstra

Le Mon, May 20, 2024 at 04:25:33PM -0700, Paul E. McKenney a écrit :
> Good points!  How about the following?
> 
> 		// Note that cpu_curr_snapshot() picks up the target
> 		// CPU's current task while its runqueue is locked with
> 		// an smp_mb__after_spinlock().  This ensures that either
> 		// the grace-period kthread will see that task's read-side
> 		// critical section or the task will see the updater's pre-GP
> 		// accesses.  The trailng smp_mb() in cpu_curr_snapshot()

*trailing

> 		// does not currently play a role other than simplify
> 		// that function's ordering semantics.  If these simplified
> 		// ordering semantics continue to be redundant, that smp_mb()
> 		// might be removed.
> 
> Keeping in mind that the commit's log fully lays out the troublesome
> scenario.

Yep, looks very good!

Thanks!

> 
> 							Thanx, Paul
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] rcu/tasks: Further comment ordering around current task snapshot on TASK-TRACE
  2024-05-21  9:59         ` Frederic Weisbecker
@ 2024-05-21 13:38           ` Paul E. McKenney
  0 siblings, 0 replies; 9+ messages in thread
From: Paul E. McKenney @ 2024-05-21 13:38 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Boqun Feng, Joel Fernandes, Neeraj Upadhyay,
	Uladzislau Rezki, Zqiang, rcu, Peter Zijlstra

On Tue, May 21, 2024 at 11:59:57AM +0200, Frederic Weisbecker wrote:
> Le Mon, May 20, 2024 at 04:25:33PM -0700, Paul E. McKenney a écrit :
> > Good points!  How about the following?
> > 
> > 		// Note that cpu_curr_snapshot() picks up the target
> > 		// CPU's current task while its runqueue is locked with
> > 		// an smp_mb__after_spinlock().  This ensures that either
> > 		// the grace-period kthread will see that task's read-side
> > 		// critical section or the task will see the updater's pre-GP
> > 		// accesses.  The trailng smp_mb() in cpu_curr_snapshot()
> 
> *trailing

Good catch!

> > 		// does not currently play a role other than simplify
> > 		// that function's ordering semantics.  If these simplified
> > 		// ordering semantics continue to be redundant, that smp_mb()
> > 		// might be removed.
> > 
> > Keeping in mind that the commit's log fully lays out the troublesome
> > scenario.
> 
> Yep, looks very good!
> 
> Thanks!

Very good, I will fold this in on my next rebase.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2024-05-21 13:38 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-17 15:23 [PATCH 0/2] rcu/tasks: Fix stale task snapshot Frederic Weisbecker
2024-05-17 15:23 ` [PATCH 1/2] rcu/tasks: Fix stale task snaphot from TASK-TRACE Frederic Weisbecker
2024-05-17 23:30   ` Paul E. McKenney
2024-05-17 15:23 ` [PATCH 2/2] rcu/tasks: Further comment ordering around current task snapshot on TASK-TRACE Frederic Weisbecker
2024-05-20 18:48   ` Paul E. McKenney
2024-05-20 20:41     ` Frederic Weisbecker
2024-05-20 23:25       ` Paul E. McKenney
2024-05-21  9:59         ` Frederic Weisbecker
2024-05-21 13:38           ` Paul E. McKenney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox