[syzbot] [can?] KCSAN: data-race in can_send / can_send (5)

public inbox for linux-kernel@vger.kernel.org

* [syzbot] [can?] KCSAN: data-race in can_send / can_send (5)
@ 2025-03-09 10:46 syzbot
  2025-03-09 18:47 ` Oliver Hartkopp
  0 siblings, 1 reply; 7+ messages in thread
From: syzbot @ 2025-03-09 10:46 UTC
  To: linux-can, linux-kernel, mkl, socketcan, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    0f52fd4f67c6 Merge tag 'bcachefs-2025-03-06' of git://evil..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=12d12a54580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=523b0e2f15224775
dashboard link: https://syzkaller.appspot.com/bug?extid=78ce4489b812515d5e4d
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/eb0d7b540c67/disk-0f52fd4f.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/51c261332ad9/vmlinux-0f52fd4f.xz
kernel image: https://storage.googleapis.com/syzbot-assets/38914a4790c8/bzImage-0f52fd4f.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+78ce4489b812515d5e4d@syzkaller.appspotmail.com

==================================================================
BUG: KCSAN: data-race in can_send / can_send

read-write to 0xffff888117566290 of 8 bytes by interrupt on cpu 0:
 can_send+0x5a2/0x6d0 net/can/af_can.c:290
 bcm_can_tx+0x314/0x420 net/can/bcm.c:314
 bcm_tx_timeout_handler+0xea/0x280
 __run_hrtimer kernel/time/hrtimer.c:1801 [inline]
 __hrtimer_run_queues+0x20d/0x5e0 kernel/time/hrtimer.c:1865
 hrtimer_run_softirq+0xe4/0x2c0 kernel/time/hrtimer.c:1882
 handle_softirqs+0xbf/0x280 kernel/softirq.c:561
 run_ksoftirqd+0x1c/0x30 kernel/softirq.c:950
 smpboot_thread_fn+0x31c/0x4c0 kernel/smpboot.c:164
 kthread+0x4ae/0x520 kernel/kthread.c:464
 ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:148
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

read-write to 0xffff888117566290 of 8 bytes by interrupt on cpu 1:
 can_send+0x5a2/0x6d0 net/can/af_can.c:290
 bcm_can_tx+0x314/0x420 net/can/bcm.c:314
 bcm_tx_timeout_handler+0xea/0x280
 __run_hrtimer kernel/time/hrtimer.c:1801 [inline]
 __hrtimer_run_queues+0x20d/0x5e0 kernel/time/hrtimer.c:1865
 hrtimer_run_softirq+0xe4/0x2c0 kernel/time/hrtimer.c:1882
 handle_softirqs+0xbf/0x280 kernel/softirq.c:561
 run_ksoftirqd+0x1c/0x30 kernel/softirq.c:950
 smpboot_thread_fn+0x31c/0x4c0 kernel/smpboot.c:164
 kthread+0x4ae/0x520 kernel/kthread.c:464
 ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:148
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

value changed: 0x0000000000002b9d -> 0x0000000000002b9e

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 UID: 0 PID: 23 Comm: ksoftirqd/1 Tainted: G        W          6.14.0-rc5-syzkaller-00109-g0f52fd4f67c6 #0
Tainted: [W]=WARN
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
==================================================================


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


* Re: [syzbot] [can?] KCSAN: data-race in can_send / can_send (5)
  2025-03-09 10:46 [syzbot] [can?] KCSAN: data-race in can_send / can_send (5) syzbot
@ 2025-03-09 18:47 ` Oliver Hartkopp
  2025-03-10  9:29   ` Vincent Mailhol
  0 siblings, 1 reply; 7+ messages in thread
From: Oliver Hartkopp @ 2025-03-09 18:47 UTC
  To: mkl; +Cc: syzbot, linux-kernel, syzkaller-bugs, linux-can

Hello Marc,

On 09.03.25 11:46, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    0f52fd4f67c6 Merge tag 'bcachefs-2025-03-06' of git://evil..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=12d12a54580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=523b0e2f15224775
> dashboard link: https://syzkaller.appspot.com/bug?extid=78ce4489b812515d5e4d
> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> 
> Unfortunately, I don't have any reproducer for this issue yet.
> 
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/eb0d7b540c67/disk-0f52fd4f.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/51c261332ad9/vmlinux-0f52fd4f.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/38914a4790c8/bzImage-0f52fd4f.xz
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+78ce4489b812515d5e4d@syzkaller.appspotmail.com
> 
> ==================================================================
> BUG: KCSAN: data-race in can_send / can_send
> 
> read-write to 0xffff888117566290 of 8 bytes by interrupt on cpu 0:
>   can_send+0x5a2/0x6d0 net/can/af_can.c:290
>   bcm_can_tx+0x314/0x420 net/can/bcm.c:314
>   bcm_tx_timeout_handler+0xea/0x280
>   __run_hrtimer kernel/time/hrtimer.c:1801 [inline]
>   __hrtimer_run_queues+0x20d/0x5e0 kernel/time/hrtimer.c:1865
>   hrtimer_run_softirq+0xe4/0x2c0 kernel/time/hrtimer.c:1882
>   handle_softirqs+0xbf/0x280 kernel/softirq.c:561
>   run_ksoftirqd+0x1c/0x30 kernel/softirq.c:950
>   smpboot_thread_fn+0x31c/0x4c0 kernel/smpboot.c:164
>   kthread+0x4ae/0x520 kernel/kthread.c:464
>   ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:148
>   ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
> 
> read-write to 0xffff888117566290 of 8 bytes by interrupt on cpu 1:
>   can_send+0x5a2/0x6d0 net/can/af_can.c:290
>   bcm_can_tx+0x314/0x420 net/can/bcm.c:314
>   bcm_tx_timeout_handler+0xea/0x280
>   __run_hrtimer kernel/time/hrtimer.c:1801 [inline]
>   __hrtimer_run_queues+0x20d/0x5e0 kernel/time/hrtimer.c:1865
>   hrtimer_run_softirq+0xe4/0x2c0 kernel/time/hrtimer.c:1882
>   handle_softirqs+0xbf/0x280 kernel/softirq.c:561
>   run_ksoftirqd+0x1c/0x30 kernel/softirq.c:950
>   smpboot_thread_fn+0x31c/0x4c0 kernel/smpboot.c:164
>   kthread+0x4ae/0x520 kernel/kthread.c:464
>   ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:148
>   ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
> 
> value changed: 0x0000000000002b9d -> 0x0000000000002b9e
> 

Increased by '1' ...

I assume this problem is caused by increasing the per-netdevice statistic in

https://elixir.bootlin.com/linux/v6.13.6/source/net/can/af_can.c#L289

pkg_stats->tx_frames++;
pkg_stats->tx_frames_delta++;

We update the statistics for the device, and in this specific case the
hrtimer fired on two CPUs at the same time, resulting in concurrent
can_send() calls to the same netdevice.
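
For illustration only, the lost-update pattern that such a race can cause may be replayed deterministically in plain userspace C. This is a sketch, not code from af_can.c: `struct pkg_stats_sketch` and `replay_lost_update()` are hypothetical names, and the plain `tx_frames++` is split by hand into its load and store halves so that the interleaving of two CPUs can be written out in program order.

```c
/* Hypothetical stand-in for the tx_frames member of struct can_pkg_stats. */
struct pkg_stats_sketch {
	unsigned long tx_frames;
};

/*
 * Replay two "CPUs" executing a non-atomic tx_frames++ at the same time.
 * Both load the old value before either one stores its result, so one of
 * the two increments is lost.
 */
unsigned long replay_lost_update(unsigned long start)
{
	struct pkg_stats_sketch stats = { .tx_frames = start };
	unsigned long cpu0_tmp, cpu1_tmp;

	cpu0_tmp = stats.tx_frames;	/* CPU 0: load */
	cpu1_tmp = stats.tx_frames;	/* CPU 1: load, sees the same value */
	stats.tx_frames = cpu0_tmp + 1;	/* CPU 0: store */
	stats.tx_frames = cpu1_tmp + 1;	/* CPU 1: store, overwrites CPU 0 */

	return stats.tx_frames;
}
```

Calling replay_lost_update(0) returns 1 although two frames were counted. Whether the reported race actually lost an increment cannot be told from the KCSAN output; the sketch only shows what can go wrong.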

Do you agree with this quick analysis?

Isn't there some lock-less, per-CPU-safe statistics handling within netdev
that we could pick for our use case?
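
A minimal userspace sketch of what per-CPU statistics would mean, under assumptions: every CPU increments only its own slot, so the hot path needs neither a lock nor an atomic, and a reader sums all slots. The names NR_CPUS_SKETCH, struct percpu_stats_sketch, percpu_tx_inc() and percpu_tx_sum() are made up for this example; in the kernel the caller's CPU is implicit (this_cpu_inc() and friends) rather than passed as a parameter.

```c
#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for this example */

struct percpu_stats_sketch {
	unsigned long tx_frames[NR_CPUS_SKETCH];	/* one slot per CPU */
};

/*
 * Hot path: each CPU touches only its own slot.
 * The caller must pass cpu < NR_CPUS_SKETCH.
 */
void percpu_tx_inc(struct percpu_stats_sketch *s, unsigned int cpu)
{
	s->tx_frames[cpu]++;
}

/* Slow path (e.g. a procfs read): sum the slots of all CPUs. */
unsigned long percpu_tx_sum(const struct percpu_stats_sketch *s)
{
	unsigned long sum = 0;
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		sum += s->tx_frames[cpu];
	return sum;
}
```

The sketch is single-threaded. In a threaded program the summing read would still need marked accesses to stay race-free; that part is omitted here.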

Best regards,
Oliver

> Reported by Kernel Concurrency Sanitizer on:
> CPU: 1 UID: 0 PID: 23 Comm: ksoftirqd/1 Tainted: G        W          6.14.0-rc5-syzkaller-00109-g0f52fd4f67c6 #0
> Tainted: [W]=WARN
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
> ==================================================================
> 
> 
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
> 
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> 
> If the report is already addressed, let syzbot know by replying with:
> #syz fix: exact-commit-title
> 
> If you want to overwrite report's subsystems, reply with:
> #syz set subsystems: new-subsystem
> (See the list of subsystem names on the web dashboard)
> 
> If the report is a duplicate of another one, reply with:
> #syz dup: exact-subject-of-another-report
> 
> If you want to undo deduplication, reply with:
> #syz undup



* Re: [syzbot] [can?] KCSAN: data-race in can_send / can_send (5)
  2025-03-09 18:47 ` Oliver Hartkopp
@ 2025-03-10  9:29   ` Vincent Mailhol
  2025-03-10  9:45     ` Oliver Hartkopp
  0 siblings, 1 reply; 7+ messages in thread
From: Vincent Mailhol @ 2025-03-10  9:29 UTC
  To: Oliver Hartkopp; +Cc: mkl, syzbot, linux-kernel, syzkaller-bugs, linux-can

On Mon. 10 Mar 2025 at 03:59, Oliver Hartkopp <socketcan@hartkopp.net> wrote:
> Hello Marc,
>
> On 09.03.25 11:46, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit:    0f52fd4f67c6 Merge tag 'bcachefs-2025-03-06' of git://evil..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=12d12a54580000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=523b0e2f15224775
> > dashboard link: https://syzkaller.appspot.com/bug?extid=78ce4489b812515d5e4d
> > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > Downloadable assets:
> > disk image: https://storage.googleapis.com/syzbot-assets/eb0d7b540c67/disk-0f52fd4f.raw.xz
> > vmlinux: https://storage.googleapis.com/syzbot-assets/51c261332ad9/vmlinux-0f52fd4f.xz
> > kernel image: https://storage.googleapis.com/syzbot-assets/38914a4790c8/bzImage-0f52fd4f.xz
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+78ce4489b812515d5e4d@syzkaller.appspotmail.com
> >
> > ==================================================================
> > BUG: KCSAN: data-race in can_send / can_send
> >
> > read-write to 0xffff888117566290 of 8 bytes by interrupt on cpu 0:
> >   can_send+0x5a2/0x6d0 net/can/af_can.c:290
> >   bcm_can_tx+0x314/0x420 net/can/bcm.c:314
> >   bcm_tx_timeout_handler+0xea/0x280
> >   __run_hrtimer kernel/time/hrtimer.c:1801 [inline]
> >   __hrtimer_run_queues+0x20d/0x5e0 kernel/time/hrtimer.c:1865
> >   hrtimer_run_softirq+0xe4/0x2c0 kernel/time/hrtimer.c:1882
> >   handle_softirqs+0xbf/0x280 kernel/softirq.c:561
> >   run_ksoftirqd+0x1c/0x30 kernel/softirq.c:950
> >   smpboot_thread_fn+0x31c/0x4c0 kernel/smpboot.c:164
> >   kthread+0x4ae/0x520 kernel/kthread.c:464
> >   ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:148
> >   ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
> >
> > read-write to 0xffff888117566290 of 8 bytes by interrupt on cpu 1:
> >   can_send+0x5a2/0x6d0 net/can/af_can.c:290
> >   bcm_can_tx+0x314/0x420 net/can/bcm.c:314
> >   bcm_tx_timeout_handler+0xea/0x280
> >   __run_hrtimer kernel/time/hrtimer.c:1801 [inline]
> >   __hrtimer_run_queues+0x20d/0x5e0 kernel/time/hrtimer.c:1865
> >   hrtimer_run_softirq+0xe4/0x2c0 kernel/time/hrtimer.c:1882
> >   handle_softirqs+0xbf/0x280 kernel/softirq.c:561
> >   run_ksoftirqd+0x1c/0x30 kernel/softirq.c:950
> >   smpboot_thread_fn+0x31c/0x4c0 kernel/smpboot.c:164
> >   kthread+0x4ae/0x520 kernel/kthread.c:464
> >   ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:148
> >   ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
> >
> > value changed: 0x0000000000002b9d -> 0x0000000000002b9e
> >
>
> Increased by '1' ...
>
> I assume this problem is caused by increasing the per-netdevice statistic in
>
> https://elixir.bootlin.com/linux/v6.13.6/source/net/can/af_can.c#L289
>
> pkg_stats->tx_frames++;
> pkg_stats->tx_frames_delta++;
>
> We update the statistics for the device and in this specific case the
> hrtimer fired on two CPUs resulting in a can_send() to the same netdevice.
>
> Do you agree with this quick analysis?

Ack. Same conclusion here.

> Isn't there some lock-less per-cpu safe statistic handling within netdev
> we might pick for our use-case?

I see two solutions. Either we use lock_sock(skb->sk) and
release_sock(skb->sk) or we can change the types of
can_pkg_stats->tx_frames and can_pkg_stats->tx_frames_delta from long
to atomic_long_t.

The atomic_long_t is the closest thing to a lock-less solution. But my
preference goes to lock_sock(), which looks more natural in this
context. And lock_sock() is just a spinlock, which under the hood is
also an atomic, so no big penalty either.
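
As a rough sketch of the second option, here is a userspace analogue using C11 <stdatomic.h> instead of the kernel's atomic_long_t API. The struct and function names are hypothetical; atomic_fetch_add_explicit() with memory_order_relaxed plays the role that atomic_long_inc() would play in the kernel, since a statistics counter needs atomicity but no ordering.

```c
#include <stdatomic.h>

/* Hypothetical userspace stand-in for struct can_pkg_stats. */
struct atomic_stats_sketch {
	atomic_long tx_frames;
	atomic_long tx_frames_delta;
};

void atomic_stats_init(struct atomic_stats_sketch *s)
{
	atomic_init(&s->tx_frames, 0);
	atomic_init(&s->tx_frames_delta, 0);
}

/* Count one transmitted frame; safe to call from several threads at once. */
void atomic_stats_tx(struct atomic_stats_sketch *s)
{
	atomic_fetch_add_explicit(&s->tx_frames, 1, memory_order_relaxed);
	atomic_fetch_add_explicit(&s->tx_frames_delta, 1, memory_order_relaxed);
}

/* Read the current frame count. */
long atomic_stats_tx_frames(struct atomic_stats_sketch *s)
{
	return atomic_load_explicit(&s->tx_frames, memory_order_relaxed);
}
```

Each counter is exact on its own, but the two are not updated as one unit, so a reader may briefly see tx_frames ahead of tx_frames_delta. For statistics that is presumably acceptable.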


Yours sincerely,
Vincent Mailhol


* Re: [syzbot] [can?] KCSAN: data-race in can_send / can_send (5)
  2025-03-10  9:29   ` Vincent Mailhol
@ 2025-03-10  9:45     ` Oliver Hartkopp
  2025-03-10  9:55       ` Vincent Mailhol
  0 siblings, 1 reply; 7+ messages in thread
From: Oliver Hartkopp @ 2025-03-10  9:45 UTC
  To: Vincent Mailhol; +Cc: mkl, syzbot, linux-kernel, syzkaller-bugs, linux-can



On 10.03.25 10:29, Vincent Mailhol wrote:
> On Mon. 10 Mar 2025 at 03:59, Oliver Hartkopp <socketcan@hartkopp.net> wrote:


>>> value changed: 0x0000000000002b9d -> 0x0000000000002b9e
>>>
>>
>> Increased by '1' ...
>>
>> I assume this problem is caused by increasing the per-netdevice statistic in
>>
>> https://elixir.bootlin.com/linux/v6.13.6/source/net/can/af_can.c#L289
>>
>> pkg_stats->tx_frames++;
>> pkg_stats->tx_frames_delta++;
>>
>> We update the statistics for the device and in this specific case the
>> hrtimer fired on two CPUs resulting in a can_send() to the same netdevice.
>>
>> Do you agree with this quick analysis?
> 
> Ack. Same conclusion here.
> 
>> Isn't there some lock-less per-cpu safe statistic handling within netdev
>> we might pick for our use-case?
> 
> I see two solutions. Either we use lock_sock(skb->sk) and
> release_sock(skb->sk) or we can change the types of
> can_pkg_stats->tx_frames and can_pkg_stats->tx_frames_delta from long
> to atomic_long_t.
> 
> The atomic_long_t is the closest solution to a lock-less. But my
> preference goes to the lock_sock() which looks more natural in this
> context. And lock_sock() is just a spinlock which under the hood is
> also an atomic, so no big penalty either.

When we get skbs from the netdevice (and not from user space), we do not 
have a valid sk value. It is set to zero.

See:
https://elixir.bootlin.com/linux/v6.13.6/source/net/can/raw.c#L203

And those skbs can also be forwarded by can-gw using can_send().

Therefore there is no lock_sock() without a valid sk ;-)

If 'atomic_long_t' also fixes this simple statistics handling, we
should use that.

Best regards,
Oliver



* Re: [syzbot] [can?] KCSAN: data-race in can_send / can_send (5)
  2025-03-10  9:45     ` Oliver Hartkopp
@ 2025-03-10  9:55       ` Vincent Mailhol
  2025-03-10 14:36         ` Oliver Hartkopp
  0 siblings, 1 reply; 7+ messages in thread
From: Vincent Mailhol @ 2025-03-10  9:55 UTC
  To: Oliver Hartkopp; +Cc: mkl, syzbot, linux-kernel, syzkaller-bugs, linux-can

On Mon. 10 Mar 2025 at 18:46, Oliver Hartkopp <socketcan@hartkopp.net> wrote:
> On 10.03.25 10:29, Vincent Mailhol wrote:
> > On Mon. 10 Mar 2025 at 03:59, Oliver Hartkopp <socketcan@hartkopp.net> wrote:

(...)

> >> Isn't there some lock-less per-cpu safe statistic handling within netdev
> >> we might pick for our use-case?
> >
> > I see two solutions. Either we use lock_sock(skb->sk) and
> > release_sock(skb->sk) or we can change the types of
> > can_pkg_stats->tx_frames and can_pkg_stats->tx_frames_delta from long
> > to atomic_long_t.
> >
> > The atomic_long_t is the closest solution to a lock-less. But my
> > preference goes to the lock_sock() which looks more natural in this
> > context. And lock_sock() is just a spinlock which under the hood is
> > also an atomic, so no big penalty either.
>
> When we get skbs from the netdevice (and not from user space), we do not
> have a valid sk value. It is set to zero.
>
> See:
> https://elixir.bootlin.com/linux/v6.13.6/source/net/can/raw.c#L203
>
> And those skbs can also be forwarded by can-gw using can_send().
>
> Therefore there is no lock_sock() without a valid sk ;-)
>
> When 'atomic_long_t' would also fix this simple statistics handling, we
> should use that.

I see, thanks for the explanation. Then atomic_long_t seems the best
(and easiest) option.


* Re: [syzbot] [can?] KCSAN: data-race in can_send / can_send (5)
  2025-03-10  9:55       ` Vincent Mailhol
@ 2025-03-10 14:36         ` Oliver Hartkopp
  2025-03-20 21:19           ` Marco Elver
  0 siblings, 1 reply; 7+ messages in thread
From: Oliver Hartkopp @ 2025-03-10 14:36 UTC
  To: Vincent Mailhol, mkl; +Cc: syzbot, linux-kernel, syzkaller-bugs, linux-can

Hi Vincent, Marc,

I sent a patch to be reviewed:
https://lore.kernel.org/linux-can/20250310143353.3242-1-socketcan@hartkopp.net/T/#u

I've also tested this patch and did not see any new issues.

Best regards,
Oliver

On 10.03.25 10:55, Vincent Mailhol wrote:
> On Mon. 10 Mar 2025 at 18:46, Oliver Hartkopp <socketcan@hartkopp.net> wrote:
>> On 10.03.25 10:29, Vincent Mailhol wrote:
>>> On Mon. 10 Mar 2025 at 03:59, Oliver Hartkopp <socketcan@hartkopp.net> wrote:
> 
> (...)
> 
>>>> Isn't there some lock-less per-cpu safe statistic handling within netdev
>>>> we might pick for our use-case?
>>>
>>> I see two solutions. Either we use lock_sock(skb->sk) and
>>> release_sock(skb->sk) or we can change the types of
>>> can_pkg_stats->tx_frames and can_pkg_stats->tx_frames_delta from long
>>> to atomic_long_t.
>>>
>>> The atomic_long_t is the closest solution to a lock-less. But my
>>> preference goes to the lock_sock() which looks more natural in this
>>> context. And lock_sock() is just a spinlock which under the hood is
>>> also an atomic, so no big penalty either.
>>
>> When we get skbs from the netdevice (and not from user space), we do not
>> have a valid sk value. It is set to zero.
>>
>> See:
>> https://elixir.bootlin.com/linux/v6.13.6/source/net/can/raw.c#L203
>>
>> And those skbs can also be forwarded by can-gw using can_send().
>>
>> Therefore there is no lock_sock() without a valid sk ;-)
>>
>> When 'atomic_long_t' would also fix this simple statistics handling, we
>> should use that.
> 
> I see, Thanks for the explanation. Then atomic_long_t seems the best
> (and easiest).



* Re: [syzbot] [can?] KCSAN: data-race in can_send / can_send (5)
  2025-03-10 14:36         ` Oliver Hartkopp
@ 2025-03-20 21:19           ` Marco Elver
  0 siblings, 0 replies; 7+ messages in thread
From: Marco Elver @ 2025-03-20 21:19 UTC
  To: Oliver Hartkopp
  Cc: Vincent Mailhol, mkl, syzbot, linux-kernel, syzkaller-bugs,
	linux-can

On Mon, 10 Mar 2025 at 15:36, 'Oliver Hartkopp' via syzkaller-bugs
<syzkaller-bugs@googlegroups.com> wrote:
>
> Hi Vincent, Marc,
>
> I sent a patch to be reviewed:
> https://lore.kernel.org/linux-can/20250310143353.3242-1-socketcan@hartkopp.net/T/#u
>
> I've also tested this patch without any new issues.
>
> Best regards,
> Oliver
>
> On 10.03.25 10:55, Vincent Mailhol wrote:
> > On Mon. 10 Mar 2025 at 18:46, Oliver Hartkopp <socketcan@hartkopp.net> wrote:
> >> On 10.03.25 10:29, Vincent Mailhol wrote:
> >>> On Mon. 10 Mar 2025 at 03:59, Oliver Hartkopp <socketcan@hartkopp.net> wrote:
> >
> > (...)
> >
> >>>> Isn't there some lock-less per-cpu safe statistic handling within netdev
> >>>> we might pick for our use-case?
> >>>
> >>> I see two solutions. Either we use lock_sock(skb->sk) and
> >>> release_sock(skb->sk) or we can change the types of
> >>> can_pkg_stats->tx_frames and can_pkg_stats->tx_frames_delta from long
> >>> to atomic_long_t.
> >>>
> >>> The atomic_long_t is the closest solution to a lock-less. But my
> >>> preference goes to the lock_sock() which looks more natural in this
> >>> context. And lock_sock() is just a spinlock which under the hood is
> >>> also an atomic, so no big penalty either.
> >>
> >> When we get skbs from the netdevice (and not from user space), we do not
> >> have a valid sk value. It is set to zero.
> >>
> >> See:
> >> https://elixir.bootlin.com/linux/v6.13.6/source/net/can/raw.c#L203
> >>
> >> And those skbs can also be forwarded by can-gw using can_send().
> >>
> >> Therefore there is no lock_sock() without a valid sk ;-)
> >>
> >> When 'atomic_long_t' would also fix this simple statistics handling, we
> >> should use that.
> >
> > I see, Thanks for the explanation. Then atomic_long_t seems the best
> > (and easiest).

While I would prefer atomic_long_t myself, just to point out an
alternative for "lossy" stats counters: you could use __data_racy or
data_race(...) and simply accept the data race, provided that
"approximate" statistics are acceptable and the counting happens on a
very performance-sensitive hot path. See section "Data-Racy Reads for
Approximate Diagnostics" in
https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/access-marking.txt
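
To make the trade-off concrete, a userspace sketch of such a lossy counter, under stated assumptions: it is not the kernel's data_race() macro, and all names are hypothetical. The increment is deliberately a separate relaxed load and relaxed store, so concurrent callers may overwrite each other and lose counts, but there is no atomic read-modify-write on the hot path and the behaviour stays defined in C11.

```c
#include <stdatomic.h>

/* Hypothetical lossy statistics counter. */
struct lossy_stats_sketch {
	atomic_ulong tx_frames;
};

void lossy_stats_init(struct lossy_stats_sketch *s)
{
	atomic_init(&s->tx_frames, 0);
}

/*
 * Approximate increment: load and store are two separate relaxed
 * accesses. If two threads interleave here, one increment can be lost.
 */
void lossy_stats_tx(struct lossy_stats_sketch *s)
{
	unsigned long v;

	v = atomic_load_explicit(&s->tx_frames, memory_order_relaxed);
	atomic_store_explicit(&s->tx_frames, v + 1, memory_order_relaxed);
}

/* Read the (possibly undercounted) number of frames. */
unsigned long lossy_stats_tx_frames(struct lossy_stats_sketch *s)
{
	return atomic_load_explicit(&s->tx_frames, memory_order_relaxed);
}
```

With a single caller the count is exact; only concurrent callers can make it undercount. Whether that is tolerable depends on what the statistics are used for.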

