* Testing of shared RCU branching
From: Paul E. McKenney @ 2025-02-25 15:58 UTC
To: boqun.feng; +Cc: rcu
Hello, Boqun,
I have run overnight tests on your earlier branches here:
ccb986e8b69f ("MAINTAINERS: Update Joel's email address")
These passed other than a KCSAN complaint involving
rcu_preempt_deferred_qs_handler() and rcu_read_unlock_special().
This looks like the plain C-language writes to ->defer_qs_iw_pending.
My guess is that this is low probability, despite having happened twice,
and that it happens when rcu_read_unlock_special() is interrupted,
resulting in rcu_preempt_deferred_qs_handler() being invoked as an
IRQ-work handler. Keep in mind that RCU runs KCSAN so as to locate
data races between task and handler on the same CPU.
Thoughts?
Thanx, Paul
* Re: Testing of shared RCU branching
From: Boqun Feng @ 2025-02-25 16:08 UTC
To: Paul E. McKenney; +Cc: rcu
On Tue, Feb 25, 2025 at 07:58:01AM -0800, Paul E. McKenney wrote:
> Hello, Boqun,
>
> I have run overnight tests on your earlier branches here:
>
> ccb986e8b69f ("MAINTAINERS: Update Joel's email address")
>
> These passed other than a KCSAN complaint involving
> rcu_preempt_deferred_qs_handler() and rcu_read_unlock_special().
> This looks like the plain C-language writes to ->defer_qs_iw_pending.
>
> My guess is that this is low probability, despite having happened twice,
> and that it happens when rcu_read_unlock_special() is interrupted,
> resulting in rcu_preempt_deferred_qs_handler() being invoked as an
> IRQ-work handler. Keeping in mind that RCU runs KCSAN so as to locate
> data races between task and handler on the same CPU.
>
> Thoughts?
>
Do you have a KCSAN report of this? Also this is not a regression, right?
Meaning you probably have seen this before? Anyway, it should be an easy
fix (just using READ_ONCE() and WRITE_ONCE()). I can send the fix out
and put it in.
Regards,
Boqun
> Thanx, Paul
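For readers following along, the fix proposed above can be sketched roughly as follows. This is a minimal userspace sketch, not the actual patch: the READ_ONCE() and WRITE_ONCE() definitions are simplified volatile-cast stand-ins for the kernel macros, and struct demo_rcu_data, demo_unlock_special() and demo_deferred_qs_handler() are hypothetical names that only mimic the task side and the handler side touching ->defer_qs_iw_pending.

```c
#include <stdbool.h>

/* Userspace stand-ins for the kernel macros (assumption: simplified
 * volatile-cast versions; the real ones live in the kernel headers). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical stand-in for the per-CPU structure holding the flag. */
struct demo_rcu_data {
	bool defer_qs_iw_pending;
};

/* Task side: request deferred work only if none is already pending. */
static bool demo_unlock_special(struct demo_rcu_data *rdp)
{
	if (READ_ONCE(rdp->defer_qs_iw_pending))
		return false;	/* Handler already queued. */
	WRITE_ONCE(rdp->defer_qs_iw_pending, true);
	return true;		/* Caller would queue the IRQ work here. */
}

/* Handler side: mark the deferred work as no longer pending. */
static void demo_deferred_qs_handler(struct demo_rcu_data *rdp)
{
	WRITE_ONCE(rdp->defer_qs_iw_pending, false);
}
```

The marked accesses tell both the compiler and KCSAN that the flag is accessed concurrently on purpose. They add no ordering beyond that.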
* Re: Testing of shared RCU branching
From: Boqun Feng @ 2025-02-25 16:11 UTC
To: Paul E. McKenney; +Cc: rcu
On Tue, Feb 25, 2025 at 08:08:53AM -0800, Boqun Feng wrote:
> On Tue, Feb 25, 2025 at 07:58:01AM -0800, Paul E. McKenney wrote:
> > Hello, Boqun,
> >
> > I have run overnight tests on your earlier branches here:
> >
> > ccb986e8b69f ("MAINTAINERS: Update Joel's email address")
> >
Oh, and I should have let you know: I updated the next and dev branches.
The latest ones are:
next.2025.02.24a and dev.2025.02.24a in the rcu repo.
Regards,
Boqun
> > These passed other than a KCSAN complaint involving
> > rcu_preempt_deferred_qs_handler() and rcu_read_unlock_special().
> > This looks like the plain C-language writes to ->defer_qs_iw_pending.
> >
> > My guess is that this is low probability, despite having happened twice,
> > and that it happens when rcu_read_unlock_special() is interrupted,
> > resulting in rcu_preempt_deferred_qs_handler() being invoked as an
> > IRQ-work handler. Keeping in mind that RCU runs KCSAN so as to locate
> > data races between task and handler on the same CPU.
> >
> > Thoughts?
> >
>
> Do you have a KCSAN of this? Also this is not a regression, right?
> Meaning you probably have seen this before? Anyway, it should be an easy
> fix (just using READ_ONCE() and WRITE_ONCE()). I can send the fix out
> and put it in.
>
> Regards,
> Boqun
>
> > Thanx, Paul
* Re: Testing of shared RCU branching
From: Paul E. McKenney @ 2025-02-25 17:54 UTC
To: Boqun Feng; +Cc: rcu
On Tue, Feb 25, 2025 at 08:11:11AM -0800, Boqun Feng wrote:
> On Tue, Feb 25, 2025 at 08:08:53AM -0800, Boqun Feng wrote:
> > On Tue, Feb 25, 2025 at 07:58:01AM -0800, Paul E. McKenney wrote:
> > > Hello, Boqun,
> > >
> > > I have run overnight tests on your earlier branches here:
> > >
> > > ccb986e8b69f ("MAINTAINERS: Update Joel's email address")
> > >
>
> Oh and I should have let you know, I updated next and dev branch, the
> latest ones are:
>
> next.2025.02.24a and dev.2025.02.24a in rcu repo.
Very well, I will try them out later today.
> Regards,
> Boqun
>
> > > These passed other than a KCSAN complaint involving
> > > rcu_preempt_deferred_qs_handler() and rcu_read_unlock_special().
> > > This looks like the plain C-language writes to ->defer_qs_iw_pending.
> > >
> > > My guess is that this is low probability, despite having happened twice,
> > > and that it happens when rcu_read_unlock_special() is interrupted,
> > > resulting in rcu_preempt_deferred_qs_handler() being invoked as an
> > > IRQ-work handler. Keeping in mind that RCU runs KCSAN so as to locate
> > > data races between task and handler on the same CPU.
> > >
> > > Thoughts?
> > >
> >
> > Do you have a KCSAN of this? Also this is not a regression, right?
> > Meaning you probably have seen this before? Anyway, it should be an easy
> > fix (just using READ_ONCE() and WRITE_ONCE()). I can send the fix out
> > and put it in.
Here you go! And you are right, if it is a regression, it is from a
long time ago, though something more recent might have made it more
probable.
In any case, not at all urgent.
Thanx, Paul
------------------------------------------------------------------------
[ 624.037869] ==================================================================
[ 624.037883] BUG: KCSAN: data-race in rcu_preempt_deferred_qs_handler / rcu_read_unlock_special
[ 624.037906]
[ 624.037909] read to 0xffffa034df2eff90 of 1 bytes by task 45 on cpu 3:
[ 624.037916] rcu_read_unlock_special+0x177/0x260
[ 624.037925] __rcu_read_unlock+0x92/0xa0
[ 624.037935] rt_spin_unlock+0x9b/0xc0
[ 624.037946] __local_bh_enable+0x10e/0x170
[ 624.037957] __local_bh_enable_ip+0xe9/0x140
[ 624.037967] rcu_cpu_kthread+0x95f/0x1190
[ 624.037976] smpboot_thread_fn+0x230/0x320
[ 624.037985] kthread+0x3b8/0x400
[ 624.037995] ret_from_fork+0x35/0x40
[ 624.038025] ret_from_fork_asm+0x1a/0x30
[ 624.038036]
[ 624.038039] write to 0xffffa034df2eff90 of 1 bytes by task 43 on cpu 3:
[ 624.038046] rcu_preempt_deferred_qs_handler+0x1e/0x30
[ 624.038057] irq_work_single+0xaf/0x160
[ 624.038066] run_irq_workd+0x92/0xd0
[ 624.038075] smpboot_thread_fn+0x230/0x320
[ 624.038085] kthread+0x3b8/0x400
[ 624.038095] ret_from_fork+0x35/0x40
[ 624.038105] ret_from_fork_asm+0x1a/0x30
[ 624.038116]
[ 624.038118] no locks held by irq_work/3/43.
[ 624.038123] irq event stamp: 202724
[ 624.038126] hardirqs last enabled at (202724): [<ffffffffa8950831>] finish_task_switch+0x131/0x320
[ 624.038138] hardirqs last disabled at (202723): [<ffffffffa9f8ce02>] __schedule+0xe2/0xbb0
[ 624.038146] softirqs last enabled at (0): [<ffffffffa88dbfd1>] copy_process+0x4e1/0x1cc0
[ 624.038159] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 624.038167]
[ 624.038169] Reported by Kernel Concurrency Sanitizer on:
[ 624.038173] CPU: 3 UID: 0 PID: 43 Comm: irq_work/3 Not tainted 6.14.0-rc1-00080-gd6558730a4de #6410
[ 624.038185] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 624.038191] ==================================================================
* Re: Testing of shared RCU branching
From: Joel Fernandes @ 2025-02-25 21:20 UTC
To: Paul E. McKenney; +Cc: Boqun Feng, rcu
On Tue, Feb 25, 2025 at 09:54:29AM -0800, Paul E. McKenney wrote:
> On Tue, Feb 25, 2025 at 08:11:11AM -0800, Boqun Feng wrote:
[...]
> > > > These passed other than a KCSAN complaint involving
> > > > rcu_preempt_deferred_qs_handler() and rcu_read_unlock_special().
> > > > This looks like the plain C-language writes to ->defer_qs_iw_pending.
> > > >
> > > > My guess is that this is low probability, despite having happened twice,
> > > > and that it happens when rcu_read_unlock_special() is interrupted,
> > > > resulting in rcu_preempt_deferred_qs_handler() being invoked as an
> > > > IRQ-work handler. Keeping in mind that RCU runs KCSAN so as to locate
> > > > data races between task and handler on the same CPU.
> > > >
> > > > Thoughts?
> > > >
> > >
> > > Do you have a KCSAN of this? Also this is not a regression, right?
> > > Meaning you probably have seen this before? Anyway, it should be an easy
> > > fix (just using READ_ONCE() and WRITE_ONCE()). I can send the fix out
> > > and put it in.
>
> Here you go! And you are right, if it is a regression, it is from a
> long time ago, though something more recent might have made it more
> probable.
In my opinion I probably wouldn't even call it a regression because the
data-race is happening on a boolean element. If I am not mistaken, this is
thus a false-positive and KCSAN has no way of silencing it?
thanks,
- Joel
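On the question of silencing: the kernel does provide a data_race() macro for marking an access whose race is intentional, so that KCSAN does not report it. The sketch below is a userspace illustration only. The data_race() definition here is a pass-through stub (an assumption made so the example compiles outside the kernel), and demo_pending and demo_peek_pending() are hypothetical names.

```c
#include <stdbool.h>

/* Userspace stub (assumption): the kernel's data_race() additionally
 * tells KCSAN to ignore races on the enclosed expression. */
#define data_race(expr) (expr)

static bool demo_pending;	/* Hypothetical racy flag. */

/* A diagnostic-style read that can tolerate a stale value. */
static bool demo_peek_pending(void)
{
	return data_race(demo_pending);
}
```

Unlike READ_ONCE(), data_race() places no constraint on the compiler, so it fits only accesses whose result does not affect correctness.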
* Re: Testing of shared RCU branching
From: Paul E. McKenney @ 2025-02-25 21:53 UTC
To: Joel Fernandes; +Cc: Boqun Feng, rcu
On Tue, Feb 25, 2025 at 04:20:07PM -0500, Joel Fernandes wrote:
> On Tue, Feb 25, 2025 at 09:54:29AM -0800, Paul E. McKenney wrote:
> > On Tue, Feb 25, 2025 at 08:11:11AM -0800, Boqun Feng wrote:
> [...]
> > > > > These passed other than a KCSAN complaint involving
> > > > > rcu_preempt_deferred_qs_handler() and rcu_read_unlock_special().
> > > > > This looks like the plain C-language writes to ->defer_qs_iw_pending.
> > > > >
> > > > > My guess is that this is low probability, despite having happened twice,
> > > > > and that it happens when rcu_read_unlock_special() is interrupted,
> > > > > resulting in rcu_preempt_deferred_qs_handler() being invoked as an
> > > > > IRQ-work handler. Keeping in mind that RCU runs KCSAN so as to locate
> > > > > data races between task and handler on the same CPU.
> > > > >
> > > > > Thoughts?
> > > > >
> > > >
> > > > Do you have a KCSAN of this? Also this is not a regression, right?
> > > > Meaning you probably have seen this before? Anyway, it should be an easy
> > > > fix (just using READ_ONCE() and WRITE_ONCE()). I can send the fix out
> > > > and put it in.
> >
> > Here you go! And you are right, if it is a regression, it is from a
> > long time ago, though something more recent might have made it more
> > probable.
>
> In my opinion I probably wouldn't even call it a regression because the
> data-race is happening on a boolean element. If I am not mistaken, this is
> thus a false-positive and KCSAN has no way of silencing it?
You can still get in trouble with booleans. The usual example
is as follows:
	bool x;
	...
	while (!x)
		do_something();
In many cases, the compiler is free to transform that "while" loop
into this:
	if (!x)
		for (;;)
			do_something();
Putting a READ_ONCE() in the original "while" condition prevents this
transformation.
Thanx, Paul
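The READ_ONCE() form of the loop above can be sketched as a small compilable example. The READ_ONCE() definition is a simplified volatile-cast stand-in for the kernel macro (an assumption), and do_something() is a hypothetical work function that sets the flag itself on its third call so that the example terminates without a second thread or an interrupt handler.

```c
#include <stdbool.h>

/* Userspace stand-in for the kernel macro (assumption: simplified). */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

static bool x;		/* Set asynchronously in the real scenario. */
static int calls;	/* Counts do_something() invocations. */

/* Hypothetical work function; sets the flag on the third call. */
static void do_something(void)
{
	if (++calls == 3)
		x = true;
}

/* READ_ONCE() forces a fresh load of x on every iteration, so the
 * compiler cannot hoist the test out of the loop. */
static int wait_for_flag(void)
{
	while (!READ_ONCE(x))
		do_something();
	return calls;
}
```

In this toy version the compiler can see that do_something() writes x, so it would not hoist the load anyway. The marking matters when the write comes from somewhere the compiler cannot see, such as an interrupt handler or another CPU.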
* Re: Testing of shared RCU branching
From: Joel Fernandes @ 2025-02-25 22:04 UTC
To: paulmck; +Cc: Boqun Feng, rcu
On 2/25/2025 4:53 PM, Paul E. McKenney wrote:
> On Tue, Feb 25, 2025 at 04:20:07PM -0500, Joel Fernandes wrote:
>> On Tue, Feb 25, 2025 at 09:54:29AM -0800, Paul E. McKenney wrote:
>>> On Tue, Feb 25, 2025 at 08:11:11AM -0800, Boqun Feng wrote:
>> [...]
>>>>>> These passed other than a KCSAN complaint involving
>>>>>> rcu_preempt_deferred_qs_handler() and rcu_read_unlock_special().
>>>>>> This looks like the plain C-language writes to ->defer_qs_iw_pending.
>>>>>>
>>>>>> My guess is that this is low probability, despite having happened twice,
>>>>>> and that it happens when rcu_read_unlock_special() is interrupted,
>>>>>> resulting in rcu_preempt_deferred_qs_handler() being invoked as an
>>>>>> IRQ-work handler. Keeping in mind that RCU runs KCSAN so as to locate
>>>>>> data races between task and handler on the same CPU.
>>>>>>
>>>>>> Thoughts?
>>>>>>
>>>>>
>>>>> Do you have a KCSAN of this? Also this is not a regression, right?
>>>>> Meaning you probably have seen this before? Anyway, it should be an easy
>>>>> fix (just using READ_ONCE() and WRITE_ONCE()). I can send the fix out
>>>>> and put it in.
>>>
>>> Here you go! And you are right, if it is a regression, it is from a
>>> long time ago, though something more recent might have made it more
>>> probable.
>>
>> In my opinion I probably wouldn't even call it a regression because the
>> data-race is happening on a boolean element. If I am not mistaken, this is
>> thus a false-positive and KCSAN has no way of silencing it?
>
> You can still get in trouble with booleans. The usual example
> is as follows:
>
>	bool x;
>
>	...
>
>
>	while (!x)
>		do_something();
>
> In many cases, the compiler is free to transform that "while" loop
> into this:
>
>	if (!x)
>		for (;;)
>			do_something();
>
> Putting a READ_ONCE() in the original "while" condition prevents this
> transformation.
True, thanks for clarifying. I will be a bit more annoying and say that in
rcu_read_unlock_special(), there is no such looping transformation possible
though AFAICS. The test is an if() block. But this is beyond KCSAN's ability to
analyze I guess.
Thanks!
- Joel
* Re: Testing of shared RCU branching
From: Paul E. McKenney @ 2025-02-25 23:45 UTC
To: Joel Fernandes; +Cc: Boqun Feng, rcu
On Tue, Feb 25, 2025 at 05:04:37PM -0500, Joel Fernandes wrote:
>
>
> On 2/25/2025 4:53 PM, Paul E. McKenney wrote:
> > On Tue, Feb 25, 2025 at 04:20:07PM -0500, Joel Fernandes wrote:
> >> On Tue, Feb 25, 2025 at 09:54:29AM -0800, Paul E. McKenney wrote:
> >>> On Tue, Feb 25, 2025 at 08:11:11AM -0800, Boqun Feng wrote:
> >> [...]
> >>>>>> These passed other than a KCSAN complaint involving
> >>>>>> rcu_preempt_deferred_qs_handler() and rcu_read_unlock_special().
> >>>>>> This looks like the plain C-language writes to ->defer_qs_iw_pending.
> >>>>>>
> >>>>>> My guess is that this is low probability, despite having happened twice,
> >>>>>> and that it happens when rcu_read_unlock_special() is interrupted,
> >>>>>> resulting in rcu_preempt_deferred_qs_handler() being invoked as an
> >>>>>> IRQ-work handler. Keeping in mind that RCU runs KCSAN so as to locate
> >>>>>> data races between task and handler on the same CPU.
> >>>>>>
> >>>>>> Thoughts?
> >>>>>>
> >>>>>
> >>>>> Do you have a KCSAN of this? Also this is not a regression, right?
> >>>>> Meaning you probably have seen this before? Anyway, it should be an easy
> >>>>> fix (just using READ_ONCE() and WRITE_ONCE()). I can send the fix out
> >>>>> and put it in.
> >>>
> >>> Here you go! And you are right, if it is a regression, it is from a
> >>> long time ago, though something more recent might have made it more
> >>> probable.
> >>
> >> In my opinion I probably wouldn't even call it a regression because the
> >> data-race is happening on a boolean element. If I am not mistaken, this is
> >> thus a false-positive and KCSAN has no way of silencing it?
> >
> > You can still get in trouble with booleans. The usual example
> > is as follows:
> >
> >	bool x;
> >
> >	...
> >
> >
> >	while (!x)
> >		do_something();
> >
> > In many cases, the compiler is free to transform that "while" loop
> > into this:
> >
> >	if (!x)
> >		for (;;)
> >			do_something();
> >
> > Putting a READ_ONCE() in the original "while" condition prevents this
> > transformation.
>
> True, thanks for clarifying. I will be a bit more annoying and say that in
> rcu_read_unlock_special(), there is no such looping transformation possible
> though AFAICS. The test is an if() block. But this is beyond KCSAN's ability to
> analyze I guess.
Besides, better safe than sorry. Especially given the decades-long
trend of increasingly clever compilers. ;-)
Thanx, Paul
* Re: Testing of shared RCU branching
From: Joel Fernandes @ 2025-02-26 0:12 UTC
To: paulmck; +Cc: Boqun Feng, rcu
On 2/25/2025 6:45 PM, Paul E. McKenney wrote:
> On Tue, Feb 25, 2025 at 05:04:37PM -0500, Joel Fernandes wrote:
>>
>>
>> On 2/25/2025 4:53 PM, Paul E. McKenney wrote:
>>> On Tue, Feb 25, 2025 at 04:20:07PM -0500, Joel Fernandes wrote:
>>>> On Tue, Feb 25, 2025 at 09:54:29AM -0800, Paul E. McKenney wrote:
>>>>> On Tue, Feb 25, 2025 at 08:11:11AM -0800, Boqun Feng wrote:
>>>> [...]
>>>>>>>> These passed other than a KCSAN complaint involving
>>>>>>>> rcu_preempt_deferred_qs_handler() and rcu_read_unlock_special().
>>>>>>>> This looks like the plain C-language writes to ->defer_qs_iw_pending.
>>>>>>>>
>>>>>>>> My guess is that this is low probability, despite having happened twice,
>>>>>>>> and that it happens when rcu_read_unlock_special() is interrupted,
>>>>>>>> resulting in rcu_preempt_deferred_qs_handler() being invoked as an
>>>>>>>> IRQ-work handler. Keeping in mind that RCU runs KCSAN so as to locate
>>>>>>>> data races between task and handler on the same CPU.
>>>>>>>>
>>>>>>>> Thoughts?
>>>>>>>>
>>>>>>>
>>>>>>> Do you have a KCSAN of this? Also this is not a regression, right?
>>>>>>> Meaning you probably have seen this before? Anyway, it should be an easy
>>>>>>> fix (just using READ_ONCE() and WRITE_ONCE()). I can send the fix out
>>>>>>> and put it in.
>>>>>
>>>>> Here you go! And you are right, if it is a regression, it is from a
>>>>> long time ago, though something more recent might have made it more
>>>>> probable.
>>>>
>>>> In my opinion I probably wouldn't even call it a regression because the
>>>> data-race is happening on a boolean element. If I am not mistaken, this is
>>>> thus a false-positive and KCSAN has no way of silencing it?
>>>
>>> You can still get in trouble with booleans. The usual example
>>> is as follows:
>>>
>>>	bool x;
>>>
>>>	...
>>>
>>>
>>>	while (!x)
>>>		do_something();
>>>
>>> In many cases, the compiler is free to transform that "while" loop
>>> into this:
>>>
>>>	if (!x)
>>>		for (;;)
>>>			do_something();
>>>
>>> Putting a READ_ONCE() in the original "while" condition prevents this
>>> transformation.
>>
>> True, thanks for clarifying. I will be a bit more annoying and say that in
>> rcu_read_unlock_special(), there is no such looping transformation possible
>> though AFAICS. The test is an if() block. But this is beyond KCSAN's ability to
>> analyze I guess.
>
> Besides, better safe than sorry. Especially given the decades-long
> trend of increasingly clever compilers. ;-)
Yeah true, thanks!
- Joel