* Testing of shared RCU branching
@ 2025-02-25 15:58 Paul E. McKenney
2025-02-25 16:08 ` Boqun Feng
0 siblings, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2025-02-25 15:58 UTC (permalink / raw)
To: boqun.feng; +Cc: rcu
Hello, Boqun,
I have run overnight tests on your earlier branches here:
ccb986e8b69f ("MAINTAINERS: Update Joel's email address")
These passed other than a KCSAN complaint involving
rcu_preempt_deferred_qs_handler() and rcu_read_unlock_special().
This looks like the plain C-language writes to ->defer_qs_iw_pending.
My guess is that this is low probability, despite having happened twice,
and that it happens when rcu_read_unlock_special() is interrupted,
resulting in rcu_preempt_deferred_qs_handler() being invoked as an
IRQ-work handler. Keeping in mind that RCU runs KCSAN so as to locate
data races between task and handler on the same CPU.
Thoughts?
Thanx, Paul
^ permalink raw reply [flat|nested] 9+ messages in thread

* Re: Testing of shared RCU branching
2025-02-25 15:58 Testing of shared RCU branching Paul E. McKenney
@ 2025-02-25 16:08 ` Boqun Feng
2025-02-25 16:11 ` Boqun Feng
0 siblings, 1 reply; 9+ messages in thread
From: Boqun Feng @ 2025-02-25 16:08 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: rcu

On Tue, Feb 25, 2025 at 07:58:01AM -0800, Paul E. McKenney wrote:
> Hello, Boqun,
>
> I have run overnight tests on your earlier branches here:
>
> ccb986e8b69f ("MAINTAINERS: Update Joel's email address")
>
> These passed other than a KCSAN complaint involving
> rcu_preempt_deferred_qs_handler() and rcu_read_unlock_special().
> This looks like the plain C-language writes to ->defer_qs_iw_pending.
>
> My guess is that this is low probability, despite having happened twice,
> and that it happens when rcu_read_unlock_special() is interrupted,
> resulting in rcu_preempt_deferred_qs_handler() being invoked as an
> IRQ-work handler. Keeping in mind that RCU runs KCSAN so as to locate
> data races between task and handler on the same CPU.
>
> Thoughts?
>

Do you have a KCSAN of this? Also this is not a regression, right?
Meaning you probably have seen this before? Anyway, it should be an easy
fix (just using READ_ONCE() and WRITE_ONCE()). I can send the fix out
and put it in.

Regards,
Boqun

> Thanx, Paul
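The READ_ONCE()/WRITE_ONCE() approach suggested above could look roughly like the following userspace sketch. Only the flag name ->defer_qs_iw_pending and the macro names come from the thread. The macro definitions are simplified stand-ins for the kernel's (they rely on the GCC/Clang __typeof__ extension), and struct demo_rcu_data and both demo_*() functions are hypothetical names invented for illustration. This is not the patch that was sent, only an illustration of what marking both sides of the access means.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simplified userspace stand-ins for the kernel's READ_ONCE() and
 * WRITE_ONCE().  The real kernel macros do additional type checking;
 * these only force one volatile access per use.
 */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical structure: only the field name is taken from the thread. */
struct demo_rcu_data {
	bool defer_qs_iw_pending;
};

/* Stand-in for the IRQ-work handler side: a marked write clears the flag. */
static void demo_deferred_qs_handler(struct demo_rcu_data *rdp)
{
	WRITE_ONCE(rdp->defer_qs_iw_pending, false);
}

/*
 * Stand-in for the task-level side: a marked read tests the flag and,
 * if nothing is pending, a marked write sets it.  Returns true when
 * this call is the one that marked the work as pending.
 */
static bool demo_unlock_special(struct demo_rcu_data *rdp)
{
	if (!READ_ONCE(rdp->defer_qs_iw_pending)) {
		WRITE_ONCE(rdp->defer_qs_iw_pending, true);
		return true;
	}
	return false;
}
```

Calling demo_unlock_special() twice in a row on a zeroed structure returns true and then false, and demo_deferred_qs_handler() resets the flag. The marked accesses do not add ordering or atomicity beyond single-access semantics; they tell the compiler (and KCSAN, in the kernel) that the concurrent access is intentional.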
* Re: Testing of shared RCU branching
2025-02-25 16:08 ` Boqun Feng
@ 2025-02-25 16:11 ` Boqun Feng
2025-02-25 17:54 ` Paul E. McKenney
0 siblings, 1 reply; 9+ messages in thread
From: Boqun Feng @ 2025-02-25 16:11 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: rcu

On Tue, Feb 25, 2025 at 08:08:53AM -0800, Boqun Feng wrote:
> On Tue, Feb 25, 2025 at 07:58:01AM -0800, Paul E. McKenney wrote:
> > Hello, Boqun,
> >
> > I have run overnight tests on your earlier branches here:
> >
> > ccb986e8b69f ("MAINTAINERS: Update Joel's email address")
> >

Oh and I should have let you know, I updated next and dev branch, the
latest ones are:

	next.2025.02.24a and dev.2025.02.24a in rcu repo.

Regards,
Boqun

[...]
* Re: Testing of shared RCU branching
2025-02-25 16:11 ` Boqun Feng
@ 2025-02-25 17:54 ` Paul E. McKenney
2025-02-25 21:20 ` Joel Fernandes
0 siblings, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2025-02-25 17:54 UTC (permalink / raw)
To: Boqun Feng; +Cc: rcu

On Tue, Feb 25, 2025 at 08:11:11AM -0800, Boqun Feng wrote:
> On Tue, Feb 25, 2025 at 08:08:53AM -0800, Boqun Feng wrote:
> > On Tue, Feb 25, 2025 at 07:58:01AM -0800, Paul E. McKenney wrote:
> > > Hello, Boqun,
> > >
> > > I have run overnight tests on your earlier branches here:
> > >
> > > ccb986e8b69f ("MAINTAINERS: Update Joel's email address")
> > >
>
> Oh and I should have let you know, I updated next and dev branch, the
> latest ones are:
>
> 	next.2025.02.24a and dev.2025.02.24a in rcu repo.

Very well, I will try them out later today.

> Regards,
> Boqun
>
[...]
> > Do you have a KCSAN of this? Also this is not a regression, right?
> > Meaning you probably have seen this before? Anyway, it should be an easy
> > fix (just using READ_ONCE() and WRITE_ONCE()). I can send the fix out
> > and put it in.

Here you go! And you are right, if it is a regression, it is from a
long time ago, though something more recent might have made it more
probable. In any case, not at all urgent.

Thanx, Paul

------------------------------------------------------------------------

[ 624.037869] ==================================================================
[ 624.037883] BUG: KCSAN: data-race in rcu_preempt_deferred_qs_handler / rcu_read_unlock_special
[ 624.037906]
[ 624.037909] read to 0xffffa034df2eff90 of 1 bytes by task 45 on cpu 3:
[ 624.037916]  rcu_read_unlock_special+0x177/0x260
[ 624.037925]  __rcu_read_unlock+0x92/0xa0
[ 624.037935]  rt_spin_unlock+0x9b/0xc0
[ 624.037946]  __local_bh_enable+0x10e/0x170
[ 624.037957]  __local_bh_enable_ip+0xe9/0x140
[ 624.037967]  rcu_cpu_kthread+0x95f/0x1190
[ 624.037976]  smpboot_thread_fn+0x230/0x320
[ 624.037985]  kthread+0x3b8/0x400
[ 624.037995]  ret_from_fork+0x35/0x40
[ 624.038025]  ret_from_fork_asm+0x1a/0x30
[ 624.038036]
[ 624.038039] write to 0xffffa034df2eff90 of 1 bytes by task 43 on cpu 3:
[ 624.038046]  rcu_preempt_deferred_qs_handler+0x1e/0x30
[ 624.038057]  irq_work_single+0xaf/0x160
[ 624.038066]  run_irq_workd+0x92/0xd0
[ 624.038075]  smpboot_thread_fn+0x230/0x320
[ 624.038085]  kthread+0x3b8/0x400
[ 624.038095]  ret_from_fork+0x35/0x40
[ 624.038105]  ret_from_fork_asm+0x1a/0x30
[ 624.038116]
[ 624.038118] no locks held by irq_work/3/43.
[ 624.038123] irq event stamp: 202724
[ 624.038126] hardirqs last enabled at (202724): [<ffffffffa8950831>] finish_task_switch+0x131/0x320
[ 624.038138] hardirqs last disabled at (202723): [<ffffffffa9f8ce02>] __schedule+0xe2/0xbb0
[ 624.038146] softirqs last enabled at (0): [<ffffffffa88dbfd1>] copy_process+0x4e1/0x1cc0
[ 624.038159] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 624.038167]
[ 624.038169] Reported by Kernel Concurrency Sanitizer on:
[ 624.038173] CPU: 3 UID: 0 PID: 43 Comm: irq_work/3 Not tainted 6.14.0-rc1-00080-gd6558730a4de #6410
[ 624.038185] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 624.038191] ==================================================================
* Re: Testing of shared RCU branching
2025-02-25 17:54 ` Paul E. McKenney
@ 2025-02-25 21:20 ` Joel Fernandes
2025-02-25 21:53 ` Paul E. McKenney
0 siblings, 1 reply; 9+ messages in thread
From: Joel Fernandes @ 2025-02-25 21:20 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: Boqun Feng, rcu

On Tue, Feb 25, 2025 at 09:54:29AM -0800, Paul E. McKenney wrote:
> On Tue, Feb 25, 2025 at 08:11:11AM -0800, Boqun Feng wrote:
[...]
> > > Do you have a KCSAN of this? Also this is not a regression, right?
> > > Meaning you probably have seen this before? Anyway, it should be an easy
> > > fix (just using READ_ONCE() and WRITE_ONCE()). I can send the fix out
> > > and put it in.
>
> Here you go! And you are right, if it is a regression, it is from a
> long time ago, though something more recent might have made it more
> probable.

In my opinion I probably wouldn't even call it a regression because the
data-race is happening on a boolean element. If I am not mistaken, this is
thus a false-positive and KCSAN has no way of silencing it?

thanks,

 - Joel
* Re: Testing of shared RCU branching
2025-02-25 21:20 ` Joel Fernandes
@ 2025-02-25 21:53 ` Paul E. McKenney
2025-02-25 22:04 ` Joel Fernandes
0 siblings, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2025-02-25 21:53 UTC (permalink / raw)
To: Joel Fernandes; +Cc: Boqun Feng, rcu

On Tue, Feb 25, 2025 at 04:20:07PM -0500, Joel Fernandes wrote:
> On Tue, Feb 25, 2025 at 09:54:29AM -0800, Paul E. McKenney wrote:
[...]
> > Here you go! And you are right, if it is a regression, it is from a
> > long time ago, though something more recent might have made it more
> > probable.
>
> In my opinion I probably wouldn't even call it a regression because the
> data-race is happening on a boolean element. If I am not mistaken, this is
> thus a false-positive and KCSAN has no way of silencing it?

You can still get in trouble with booleans. The usual example
is as follows:

	bool x;

	...

	while (!x)
		do_something();

In many cases, the compiler is free to transform that "while" loop
into this:

	if (!x)
		for (;;)
			do_something();

Putting a READ_ONCE() in the original "while" condition prevents this
transformation.

Thanx, Paul
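The loop example above can be made concrete with a small userspace sketch. The names x and do_something() are taken from the message. The macro definitions are simplified stand-ins for the kernel's (using the GCC/Clang __typeof__ extension), and the calls counter and wait_for_flag() are hypothetical additions for illustration. One caveat: in this single-threaded sketch do_something() sets the flag itself so that the loop terminates, which means the compiler could not legally hoist the load here anyway. The hazard described in the message arises when the store comes from another thread or handler that the compiler cannot see.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified userspace stand-ins for the kernel macros. */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

static bool x;     /* in the real scenario, set by another context */
static int calls;  /* hypothetical counter, for demonstration only */

static void do_something(void)
{
	/* Stand-in for "another context": set the flag on the third call. */
	if (++calls == 3)
		WRITE_ONCE(x, true);
}

/*
 * The marked read forces a fresh load of x on every iteration, so the
 * loop cannot be rewritten as "if (!x) for (;;) do_something();".
 * Returns how many times do_something() ran before x became true.
 */
static int wait_for_flag(void)
{
	while (!READ_ONCE(x))
		do_something();
	return calls;
}
```

With the stand-in above, wait_for_flag() runs do_something() three times and then exits once the flag is observed.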
* Re: Testing of shared RCU branching
2025-02-25 21:53 ` Paul E. McKenney
@ 2025-02-25 22:04 ` Joel Fernandes
2025-02-25 23:45 ` Paul E. McKenney
0 siblings, 1 reply; 9+ messages in thread
From: Joel Fernandes @ 2025-02-25 22:04 UTC (permalink / raw)
To: paulmck; +Cc: Boqun Feng, rcu

On 2/25/2025 4:53 PM, Paul E. McKenney wrote:
> On Tue, Feb 25, 2025 at 04:20:07PM -0500, Joel Fernandes wrote:
[...]
>> In my opinion I probably wouldn't even call it a regression because the
>> data-race is happening on a boolean element. If I am not mistaken, this is
>> thus a false-positive and KCSAN has no way of silencing it?
>
> You can still get in trouble with booleans. The usual example
> is as follows:
>
> 	bool x;
>
> 	...
>
> 	while (!x)
> 		do_something();
>
> In many cases, the compiler is free to transform that "while" loop
> into this:
>
> 	if (!x)
> 		for (;;)
> 			do_something();
>
> Putting a READ_ONCE() in the original "while" condition prevents this
> transformation.

True, thanks for clarifying. I will be a bit more annoying and say that in
rcu_read_unlock_special(), there is no such looping transformation possible
though AFAICS. The test is an if() block. But this is beyond KCSAN's ability to
analyze I guess.

Thanks!

 - Joel
* Re: Testing of shared RCU branching
2025-02-25 22:04 ` Joel Fernandes
@ 2025-02-25 23:45 ` Paul E. McKenney
2025-02-26 0:12 ` Joel Fernandes
0 siblings, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2025-02-25 23:45 UTC (permalink / raw)
To: Joel Fernandes; +Cc: Boqun Feng, rcu

On Tue, Feb 25, 2025 at 05:04:37PM -0500, Joel Fernandes wrote:
> On 2/25/2025 4:53 PM, Paul E. McKenney wrote:
[...]
> > Putting a READ_ONCE() in the original "while" condition prevents this
> > transformation.
>
> True, thanks for clarifying. I will be a bit more annoying and say that in
> rcu_read_unlock_special(), there is no such looping transformation possible
> though AFAICS. The test is an if() block. But this is beyond KCSAN's ability to
> analyze I guess.

Besides, better safe than sorry. Especially given the decades-long
trend of increasingly clever compilers. ;-)

Thanx, Paul
* Re: Testing of shared RCU branching
2025-02-25 23:45 ` Paul E. McKenney
@ 2025-02-26 0:12 ` Joel Fernandes
0 siblings, 0 replies; 9+ messages in thread
From: Joel Fernandes @ 2025-02-26 0:12 UTC (permalink / raw)
To: paulmck; +Cc: Boqun Feng, rcu

On 2/25/2025 6:45 PM, Paul E. McKenney wrote:
> On Tue, Feb 25, 2025 at 05:04:37PM -0500, Joel Fernandes wrote:
[...]
>> True, thanks for clarifying. I will be a bit more annoying and say that in
>> rcu_read_unlock_special(), there is no such looping transformation possible
>> though AFAICS. The test is an if() block. But this is beyond KCSAN's ability to
>> analyze I guess.
>
> Besides, better safe than sorry. Especially given the decades-long
> trend of increasingly clever compilers. ;-)

Yeah true, thanks!

 - Joel