* [PATCH] rcu: Use READ_ONCE() for rdp->gpwrap access in __note_gp_changes()
From: Zilin Guan @ 2024-11-04 15:12 UTC (permalink / raw)
To: paulmck
Cc: frederic, neeraj.upadhyay, joel, josh, boqun.feng, urezki,
rostedt, mathieu.desnoyers, jiangshanlai, qiang.zhang1211, rcu,
linux-kernel, Zilin Guan

In function __note_gp_changes(), rdp->gpwrap is read using READ_ONCE()
in line 1307:

1307	if (IS_ENABLED(CONFIG_PROVE_RCU) && READ_ONCE(rdp->gpwrap))
1308		WRITE_ONCE(rdp->last_sched_clock, jiffies);

while read directly in line 1305:

1305	if (ULONG_CMP_LT(rdp->gp_seq_needed, rnp->gp_seq_needed) || rdp->gpwrap)
1306		WRITE_ONCE(rdp->gp_seq_needed, rnp->gp_seq_needed);

In the same environment, reads in two places should have the same
protection.

Signed-off-by: Zilin Guan <zilinguan811@gmail.com>
---
kernel/rcu/tree.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index b1f883fcd918..d3e2b420dce5 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1302,7 +1302,7 @@ static bool __note_gp_changes(struct rcu_node *rnp, struct rcu_data *rdp)
 		zero_cpu_stall_ticks(rdp);
 	}
 	rdp->gp_seq = rnp->gp_seq; /* Remember new grace-period state. */
-	if (ULONG_CMP_LT(rdp->gp_seq_needed, rnp->gp_seq_needed) || rdp->gpwrap)
+	if (ULONG_CMP_LT(rdp->gp_seq_needed, rnp->gp_seq_needed) || READ_ONCE(rdp->gpwrap))
 		WRITE_ONCE(rdp->gp_seq_needed, rnp->gp_seq_needed);
 	if (IS_ENABLED(CONFIG_PROVE_RCU) && READ_ONCE(rdp->gpwrap))
 		WRITE_ONCE(rdp->last_sched_clock, jiffies);
--
2.34.1
* Re: [PATCH] rcu: Use READ_ONCE() for rdp->gpwrap access in __note_gp_changes()
From: Paul E. McKenney @ 2024-11-06 20:18 UTC (permalink / raw)
To: Zilin Guan
Cc: frederic, neeraj.upadhyay, joel, josh, boqun.feng, urezki,
rostedt, mathieu.desnoyers, jiangshanlai, qiang.zhang1211, rcu,
linux-kernel
On Mon, Nov 04, 2024 at 03:12:30PM +0000, Zilin Guan wrote:
> In function __note_gp_changes(), rdp->gpwrap is read using READ_ONCE()
> in line 1307:
>
> 1307 if (IS_ENABLED(CONFIG_PROVE_RCU) && READ_ONCE(rdp->gpwrap))
> 1308 WRITE_ONCE(rdp->last_sched_clock, jiffies);
>
> while read directly in line 1305:
>
> 1305 if (ULONG_CMP_LT(rdp->gp_seq_needed, rnp->gp_seq_needed) ||
> rdp->gpwrap)
> 1306 WRITE_ONCE(rdp->gp_seq_needed, rnp->gp_seq_needed);
>
> In the same environment, reads in two places should have the same
> protection.
>
> Signed-off-by: Zilin Guan <zilinguan811@gmail.com>

Good eyes!!!

But did you find this with KCSAN, or by visual inspection?

The reason that I ask is that the __note_gp_changes() should be
invoked with the leaf rnp->lock held, which should exclude writes to
the rdp->gpwrap fields for all CPUs corresponding to that leaf rcu_node
structure.

Note the raw_lockdep_assert_held_rcu_node(rnp) call at the beginning of
this function.

So I believe that the proper fix is to *remove* READ_ONCE() from accesses
to rdp->gpwrap in this function.

Or am I missing something here?

Thanx, Paul

> ---
> kernel/rcu/tree.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index b1f883fcd918..d3e2b420dce5 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1302,7 +1302,7 @@ static bool __note_gp_changes(struct rcu_node *rnp, struct rcu_data *rdp)
> zero_cpu_stall_ticks(rdp);
> }
> rdp->gp_seq = rnp->gp_seq; /* Remember new grace-period state. */
> - if (ULONG_CMP_LT(rdp->gp_seq_needed, rnp->gp_seq_needed) || rdp->gpwrap)
> + if (ULONG_CMP_LT(rdp->gp_seq_needed, rnp->gp_seq_needed) || READ_ONCE(rdp->gpwrap))
> WRITE_ONCE(rdp->gp_seq_needed, rnp->gp_seq_needed);
> if (IS_ENABLED(CONFIG_PROVE_RCU) && READ_ONCE(rdp->gpwrap))
> WRITE_ONCE(rdp->last_sched_clock, jiffies);
> --
> 2.34.1
>
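
The locking argument in this reply can be illustrated with a minimal user-space sketch. This is not the kernel code: `toy_node`, `toy_data`, and the `toy_*` helpers are hypothetical names, the `READ_ONCE()`/`WRITE_ONCE()` macros are simplified stand-ins for the kernel's, a pthread mutex stands in for rnp->lock, a boolean stands in for lockdep state, and a plain `<` replaces ULONG_CMP_LT(). It assumes that every writer of the flag holds the lock, which is exactly what the thread still needs to confirm for ->gpwrap.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Simplified user-space stand-ins for the kernel macros (assumption). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

struct toy_node {
	pthread_mutex_t lock;
	bool lock_held;           /* crude stand-in for lockdep state */
	unsigned long seq_needed;
};

struct toy_data {
	unsigned long seq_needed;
	bool wrap;                /* assumed: written only with the node lock held */
};

static void toy_node_lock(struct toy_node *np)
{
	pthread_mutex_lock(&np->lock);
	np->lock_held = true;
}

static void toy_node_unlock(struct toy_node *np)
{
	np->lock_held = false;
	pthread_mutex_unlock(&np->lock);
}

/* Writer: holds the lock; WRITE_ONCE() is only for the benefit of lockless readers. */
static void toy_set_wrap(struct toy_node *np, struct toy_data *dp, bool v)
{
	toy_node_lock(np);
	WRITE_ONCE(dp->wrap, v);
	toy_node_unlock(np);
}

/* Caller must hold np->lock, so a plain read of dp->wrap cannot race with a writer. */
static bool toy_note_changes(struct toy_node *np, struct toy_data *dp)
{
	assert(np->lock_held);    /* mimics the lockdep assertion at function entry */
	if (dp->seq_needed < np->seq_needed || dp->wrap) {
		dp->seq_needed = np->seq_needed;
		return true;
	}
	return false;
}

/* Lockless reader: no lock is held, so the load is marked with READ_ONCE(). */
static bool toy_peek_wrap(struct toy_data *dp)
{
	return READ_ONCE(dp->wrap);
}
```

In this sketch, a caller would take the lock with toy_node_lock(), call toy_note_changes(), and release the lock with toy_node_unlock(). Only toy_peek_wrap() may be called without the lock, which is why it is the only reader that marks its access.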
* Re: [PATCH] rcu: Use READ_ONCE() for rdp->gpwrap access in __note_gp_changes()
From: Zilin Guan @ 2024-11-07 14:01 UTC (permalink / raw)
To: paulmck
Cc: boqun.feng, frederic, jiangshanlai, joel, josh, linux-kernel,
mathieu.desnoyers, neeraj.upadhyay, qiang.zhang1211, rcu, rostedt,
urezki, zilinguan811, xujianhao01
On Wed, Nov 06, 2024 at 12:18:25PM -0800, Paul E. McKenney wrote:
> Good eyes!!!
>
> But did you find this with KCSAN, or by visual inspection?
>
> The reason that I ask is that the __note_gp_changes() should be
> invoked with the leaf rnp->lock held, which should exclude writes to
> the rdp->gpwrap fields for all CPUs corresponding to that leaf rcu_node
> structure.
>
> Note the raw_lockdep_assert_held_rcu_node(rnp) call at the beginning of
> this function.
>
> So I believe that the proper fix is to *remove* READ_ONCE() from accesses
> to rdp->gpwrap in this function.
>
> Or am I missing something here?
>
> Thanx, Paul

I found this by visual inspection.

When reviewing the function __note_gp_changes(), I noticed that other
accesses to rdp->gpwrap are protected with either READ_ONCE() or
WRITE_ONCE(), which led me to suspect a potential data race at line 1305.

However, I am not certain whether holding rnp->lock protects access to
rdp->gpwrap in this case. If it indeed ensures that no concurrent writes
can occur, then I agree that the correct approach would be to remove
READ_ONCE() from those accesses.

Thanks,
Zilin

* Re: [PATCH] rcu: Use READ_ONCE() for rdp->gpwrap access in __note_gp_changes()
From: Paul E. McKenney @ 2024-11-07 14:15 UTC (permalink / raw)
To: Zilin Guan
Cc: boqun.feng, frederic, jiangshanlai, joel, josh, linux-kernel,
mathieu.desnoyers, neeraj.upadhyay, qiang.zhang1211, rcu, rostedt,
urezki, xujianhao01
On Thu, Nov 07, 2024 at 02:01:17PM +0000, Zilin Guan wrote:
> On Wed, Nov 06, 2024 at 12:18:25PM -0800, Paul E. McKenney wrote:
> > Good eyes!!!
> >
> > But did you find this with KCSAN, or by visual inspection?
> >
> > The reason that I ask is that the __note_gp_changes() should be
> > invoked with the leaf rnp->lock held, which should exclude writes to
> > the rdp->gpwrap fields for all CPUs corresponding to that leaf rcu_node
> > structure.
> >
> > Note the raw_lockdep_assert_held_rcu_node(rnp) call at the beginning of
> > this function.
> >
> > So I believe that the proper fix is to *remove* READ_ONCE() from accesses
> > to rdp->gpwrap in this function.
> >
> > Or am I missing something here?
> >
> > Thanx, Paul
>
> I found this by visual inspection.
Good eyes! ;-)
> When reviewing the function __note_gp_changes(), I noticed that other
> accesses to rdp->gpwrap are protected with either READ_ONCE() or
> WRITE_ONCE(), which led me to suspect a potential data race at line 1305.
>
> However, I am not certain whether holding rnp->lock protects access to
> rdp->gpwrap in this case. If it indeed ensures that no concurrent writes
> can occur, then I agree that the correct approach would be to remove
> READ_ONCE() from those accesses.

One way to check this is via inspection of all the updates to the
->gpwrap field.

Another approach is to run KCSAN, for example, from the top-level
directory of the Linux-kernel source tree on a system with qemu/KVM
enabled:

	tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 30m --configs "4*TREE03" --kconfigs "CONFIG_NR_CPUS=4" --kcsan --trust-make

This particular command is set up for my 16-CPU laptop. You can of
course adjust the "4*" and the "=4" to match your hardware. For example,
on a 64-CPU system you might instead do this:

	tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 30m --configs "8*TREE03" --kconfigs "CONFIG_NR_CPUS=8" --kcsan --trust-make

Please see Documentation/dev-tools/kcsan.rst for information on how
to interpret KCSAN reports.

This will find false positives in the non-RCU portions of the kernel,
so you should look for reports involving __note_gp_changes() and/or
its callers (inlining and all that).

So why not try it? ;-)

Thanx, Paul
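
The sizing arithmetic behind the two commands above can be sketched in C. Both examples pick values whose product equals the host CPU count (4 × 4 = 16 and 8 × 8 = 64), which suggests that the multiplier in --configs is the number of concurrent guests and CONFIG_NR_CPUS is the CPU count of each guest. That reading is an assumption drawn from the two examples, and `kvm_sizing`, `size_kvm_run()`, and `format_kvm_args()` are hypothetical helpers, not part of the rcutorture scripts.

```c
#include <assert.h>
#include <stdio.h>

struct kvm_sizing {
	unsigned int instances;      /* the N in "N*TREE03" */
	unsigned int cpus_per_guest; /* the value given to CONFIG_NR_CPUS */
};

/*
 * Split total_cpus into equally sized guests of cpus_per_guest CPUs each.
 * Any remainder is left unused, so the guests never oversubscribe the host.
 */
static struct kvm_sizing size_kvm_run(unsigned int total_cpus,
				      unsigned int cpus_per_guest)
{
	struct kvm_sizing s = { 0, cpus_per_guest };

	if (cpus_per_guest > 0)
		s.instances = total_cpus / cpus_per_guest;
	return s;
}

/* Format only the two arguments that vary with the hardware. */
static int format_kvm_args(char *buf, size_t len, struct kvm_sizing s)
{
	return snprintf(buf, len,
			"--configs \"%u*TREE03\" --kconfigs \"CONFIG_NR_CPUS=%u\"",
			s.instances, s.cpus_per_guest);
}
```

Calling size_kvm_run(16, 4) gives four guests of four CPUs each, and size_kvm_run(64, 8) gives eight guests of eight CPUs each, matching the two commands in the message. Other splits that keep the product at or below the host CPU count should serve equally well under the same assumption.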