* [syzbot] [kernel?] KCSAN: assert: race in dequeue_entities
@ 2024-09-27 14:57 syzbot
2024-10-22 8:06 ` Alexander Potapenko
0 siblings, 1 reply; 13+ messages in thread
From: syzbot @ 2024-09-27 14:57 UTC (permalink / raw)
To: audit, eparis, linux-kernel, paul, syzkaller-bugs
Hello,
syzbot found the following issue on:
HEAD commit: 075dbe9f6e3c Merge tag 'soc-ep93xx-dt-6.12' of git://git.k..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15f07a80580000
kernel config: https://syzkaller.appspot.com/x/.config?x=86254f9e0a8f2c98
dashboard link: https://syzkaller.appspot.com/bug?extid=0ec1e96c2cdf5c0e512a
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
Unfortunately, I don't have any reproducer for this issue yet.
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/1be80941df60/disk-075dbe9f.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/494a9ac89c09/vmlinux-075dbe9f.xz
kernel image: https://storage.googleapis.com/syzbot-assets/919788d8c731/bzImage-075dbe9f.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+0ec1e96c2cdf5c0e512a@syzkaller.appspotmail.com
audit: type=1400 audit(6022412112.141:243086): avc: denied { read } for pid=2950 comm="syslogd" name="log" dev="sda1" ino=1915 scontext=system_u:system_r:syslogd_t tcontext=system_u:object_r:var_t tclass=lnk_file permissive=0
audit: type=1400 audit(6022412112.181:243087): avc: denied { read } for pid=2950 comm="syslogd" name="log" dev="sda1" ino=1915 scontext=system_u:system_r:syslogd_t tcontext=system_u:object_r:var_t tclass=lnk_file permissive=0
==================================================================
BUG: KCSAN: assert: race in __block_task kernel/sched/sched.h:2770 [inline]
BUG: KCSAN: assert: race in dequeue_entities+0x6df/0x760 kernel/sched/fair.c:7177
race at unknown origin, with assert no writes to 0xffff888101764268 of 4 bytes by task 29 on cpu 1:
__block_task kernel/sched/sched.h:2770 [inline]
dequeue_entities+0x6df/0x760 kernel/sched/fair.c:7177
pick_next_entity kernel/sched/fair.c:5627 [inline]
pick_task_fair kernel/sched/fair.c:8856 [inline]
pick_next_task_fair+0x7d/0x410 kernel/sched/fair.c:8876
__pick_next_task kernel/sched/core.c:5955 [inline]
pick_next_task kernel/sched/core.c:6477 [inline]
__schedule+0x284/0x940 kernel/sched/core.c:6629
__schedule_loop kernel/sched/core.c:6752 [inline]
schedule+0x55/0xc0 kernel/sched/core.c:6767
kauditd_thread+0x56b/0x650 kernel/audit.c:911
kthread+0x1d1/0x210 kernel/kthread.c:389
ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
value changed: 0x00000000 -> 0x00000001
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 UID: 0 PID: 29 Comm: kauditd Not tainted 6.11.0-syzkaller-11558-g075dbe9f6e3c #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
==================================================================
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [syzbot] [kernel?] KCSAN: assert: race in dequeue_entities 2024-09-27 14:57 [syzbot] [kernel?] KCSAN: assert: race in dequeue_entities syzbot @ 2024-10-22 8:06 ` Alexander Potapenko 2024-10-22 11:31 ` Peter Zijlstra 0 siblings, 1 reply; 13+ messages in thread From: Alexander Potapenko @ 2024-10-22 8:06 UTC (permalink / raw) To: syzbot, Peter Zijlstra; +Cc: audit, eparis, linux-kernel, paul, syzkaller-bugs On Fri, Sep 27, 2024 at 4:57 PM syzbot <syzbot+0ec1e96c2cdf5c0e512a@syzkaller.appspotmail.com> wrote: > > Hello, > > syzbot found the following issue on: > > HEAD commit: 075dbe9f6e3c Merge tag 'soc-ep93xx-dt-6.12' of git://git.k.. > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=15f07a80580000 > kernel config: https://syzkaller.appspot.com/x/.config?x=86254f9e0a8f2c98 > dashboard link: https://syzkaller.appspot.com/bug?extid=0ec1e96c2cdf5c0e512a > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 > > Unfortunately, I don't have any reproducer for this issue yet. > > Downloadable assets: > disk image: https://storage.googleapis.com/syzbot-assets/1be80941df60/disk-075dbe9f.raw.xz > vmlinux: https://storage.googleapis.com/syzbot-assets/494a9ac89c09/vmlinux-075dbe9f.xz > kernel image: https://storage.googleapis.com/syzbot-assets/919788d8c731/bzImage-075dbe9f.xz > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > Reported-by: syzbot+0ec1e96c2cdf5c0e512a@syzkaller.appspotmail.com > > audit: type=1400 audit(6022412112.141:243086): avc: denied { read } for pid=2950 comm="syslogd" name="log" dev="sda1" ino=1915 scontext=system_u:system_r:syslogd_t tcontext=system_u:object_r:var_t tclass=lnk_file permissive=0 > audit: type=1400 audit(6022412112.181:243087): avc: denied { read } for pid=2950 comm="syslogd" name="log" dev="sda1" ino=1915 scontext=system_u:system_r:syslogd_t tcontext=system_u:object_r:var_t tclass=lnk_file permissive=0 > ================================================================== > BUG: KCSAN: assert: race in __block_task kernel/sched/sched.h:2770 [inline] > BUG: KCSAN: assert: race in dequeue_entities+0x6df/0x760 kernel/sched/fair.c:7177 > > race at unknown origin, with assert no writes to 0xffff888101764268 of 4 bytes by task 29 on cpu 1: > __block_task kernel/sched/sched.h:2770 [inline] > dequeue_entities+0x6df/0x760 kernel/sched/fair.c:7177 > pick_next_entity kernel/sched/fair.c:5627 [inline] > pick_task_fair kernel/sched/fair.c:8856 [inline] > pick_next_task_fair+0x7d/0x410 kernel/sched/fair.c:8876 > __pick_next_task kernel/sched/core.c:5955 [inline] > pick_next_task kernel/sched/core.c:6477 [inline] > __schedule+0x284/0x940 kernel/sched/core.c:6629 > __schedule_loop kernel/sched/core.c:6752 [inline] > schedule+0x55/0xc0 kernel/sched/core.c:6767 > kauditd_thread+0x56b/0x650 kernel/audit.c:911 > kthread+0x1d1/0x210 kernel/kthread.c:389 > ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:147 > ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 > > value changed: 0x00000000 -> 0x00000001 > > Reported by Kernel Concurrency Sanitizer on: > CPU: 1 UID: 0 PID: 29 Comm: kauditd Not tainted 6.11.0-syzkaller-11558-g075dbe9f6e3c #0 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 > ================================================================== +PeterZ, who added the KCSAN assertion. Peter, I am also hitting the same assert on ARM while fuzzing KVM. It is tricky to reproduce though (we probably need to stress the scheduler hard enough): ================================================================== BUG: KCSAN: assert: race in dequeue_entities+0x6d8/0x794 kernel/sched/sched.h:2773 race at unknown origin, with assert no writes to 0xffffff80039f91e8 of 4 bytes by task 27858 on cpu 2: dequeue_entities+0x6d8/0x794 kernel/sched/sched.h:2773 pick_next_task_fair+0x84/0x394 kernel/sched/fair.c:5627 __schedule+0x218/0x888 kernel/sched/core.c:5968 preempt_schedule_irq+0x3c/0x68 kernel/sched/core.c:7012 arm64_preempt_schedule_irq+0x2c/0x40 arch/arm64/kernel/entry-common.c:301 __el1_irq arch/arm64/kernel/entry-common.c:539 [inline] el1_interrupt+0x3c/0x54 arch/arm64/kernel/entry-common.c:551 el1h_64_irq_handler+0x18/0x24 arch/arm64/kernel/entry-common.c:556 el1h_64_irq+0x64/0x68 arch/arm64/kernel/entry.S:594 flush_ptrace_hw_breakpoint+0x4f8/0x6b0 arch/arm64/kernel/ptrace.c:221 flush_thread+0x70/0xf4 arch/arm64/kernel/process.c:287 begin_new_exec+0x9ac/0xcdc fs/exec.c:1307 load_elf_binary+0x4d0/0x1330 fs/binfmt_elf.c:996 bprm_execve+0x3d8/0x958 fs/exec.c:1752 kernel_execve+0x550/0x57c fs/exec.c:2012 call_usermodehelper_exec_async+0x18c/0x270 kernel/umh.c:110 ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:860 value changed: 0x00000000 -> 0x00000001 Reported by Kernel Concurrency Sanitizer on: CPU: 2 UID: 0 PID: 27858 Comm: kworker/u16:3 Not tainted 6.12.0-rc4-00001-g517096192323-dirty #84 Hardware name: linux,dummy-virt (DT) ================================================================== ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [syzbot] [kernel?] KCSAN: assert: race in dequeue_entities 2024-10-22 8:06 ` Alexander Potapenko @ 2024-10-22 11:31 ` Peter Zijlstra 2024-10-22 13:40 ` Marco Elver 0 siblings, 1 reply; 13+ messages in thread From: Peter Zijlstra @ 2024-10-22 11:31 UTC (permalink / raw) To: Alexander Potapenko Cc: syzbot, audit, eparis, linux-kernel, paul, syzkaller-bugs On Tue, Oct 22, 2024 at 10:06:23AM +0200, Alexander Potapenko wrote: > On Fri, Sep 27, 2024 at 4:57 PM syzbot > <syzbot+0ec1e96c2cdf5c0e512a@syzkaller.appspotmail.com> wrote: > > > > Hello, > > > > syzbot found the following issue on: > > > > HEAD commit: 075dbe9f6e3c Merge tag 'soc-ep93xx-dt-6.12' of git://git.k.. > > git tree: upstream > > console output: https://syzkaller.appspot.com/x/log.txt?x=15f07a80580000 > > kernel config: https://syzkaller.appspot.com/x/.config?x=86254f9e0a8f2c98 > > dashboard link: https://syzkaller.appspot.com/bug?extid=0ec1e96c2cdf5c0e512a > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 > > > > Unfortunately, I don't have any reproducer for this issue yet. > > > > Downloadable assets: > > disk image: https://storage.googleapis.com/syzbot-assets/1be80941df60/disk-075dbe9f.raw.xz > > vmlinux: https://storage.googleapis.com/syzbot-assets/494a9ac89c09/vmlinux-075dbe9f.xz > > kernel image: https://storage.googleapis.com/syzbot-assets/919788d8c731/bzImage-075dbe9f.xz > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > Reported-by: syzbot+0ec1e96c2cdf5c0e512a@syzkaller.appspotmail.com > > > > audit: type=1400 audit(6022412112.141:243086): avc: denied { read } for pid=2950 comm="syslogd" name="log" dev="sda1" ino=1915 scontext=system_u:system_r:syslogd_t tcontext=system_u:object_r:var_t tclass=lnk_file permissive=0 > > audit: type=1400 audit(6022412112.181:243087): avc: denied { read } for pid=2950 comm="syslogd" name="log" dev="sda1" ino=1915 scontext=system_u:system_r:syslogd_t tcontext=system_u:object_r:var_t tclass=lnk_file permissive=0 > > ================================================================== > > BUG: KCSAN: assert: race in __block_task kernel/sched/sched.h:2770 [inline] > > BUG: KCSAN: assert: race in dequeue_entities+0x6df/0x760 kernel/sched/fair.c:7177 > > > > race at unknown origin, with assert no writes to 0xffff888101764268 of 4 bytes by task 29 on cpu 1: > > __block_task kernel/sched/sched.h:2770 [inline] > > dequeue_entities+0x6df/0x760 kernel/sched/fair.c:7177 > > pick_next_entity kernel/sched/fair.c:5627 [inline] > > pick_task_fair kernel/sched/fair.c:8856 [inline] > > pick_next_task_fair+0x7d/0x410 kernel/sched/fair.c:8876 > > __pick_next_task kernel/sched/core.c:5955 [inline] > > pick_next_task kernel/sched/core.c:6477 [inline] > > __schedule+0x284/0x940 kernel/sched/core.c:6629 > > __schedule_loop kernel/sched/core.c:6752 [inline] > > schedule+0x55/0xc0 kernel/sched/core.c:6767 > > kauditd_thread+0x56b/0x650 kernel/audit.c:911 > > kthread+0x1d1/0x210 kernel/kthread.c:389 > > ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:147 > > ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 > > > > value changed: 0x00000000 -> 0x00000001 > > > > Reported by Kernel Concurrency Sanitizer on: > > CPU: 1 UID: 0 PID: 29 Comm: kauditd Not tainted 6.11.0-syzkaller-11558-g075dbe9f6e3c #0 > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 > > ================================================================== > > +PeterZ, who added the KCSAN assertion. Well, PaulMck did in d6111cf45c57 ("sched: Use WRITE_ONCE() for p->on_rq"), I just moved it around in e8901061ca0c ("sched: Split DEQUEUE_SLEEP from deactivate_task()"). I'm not at all sure I have any inkling as to what the annotation does nor what KCSAN is trying to tell us above. Can someone please translate? ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [syzbot] [kernel?] KCSAN: assert: race in dequeue_entities 2024-10-22 11:31 ` Peter Zijlstra @ 2024-10-22 13:40 ` Marco Elver 2024-10-22 14:31 ` Paul E. McKenney 2024-10-22 19:12 ` Peter Zijlstra 0 siblings, 2 replies; 13+ messages in thread From: Marco Elver @ 2024-10-22 13:40 UTC (permalink / raw) To: Peter Zijlstra, paulmck Cc: Alexander Potapenko, syzbot, audit, eparis, linux-kernel, paul, syzkaller-bugs On Tue, Oct 22, 2024 at 01:31PM +0200, Peter Zijlstra wrote: > On Tue, Oct 22, 2024 at 10:06:23AM +0200, Alexander Potapenko wrote: > > On Fri, Sep 27, 2024 at 4:57 PM syzbot > > <syzbot+0ec1e96c2cdf5c0e512a@syzkaller.appspotmail.com> wrote: > > > > > > Hello, > > > > > > syzbot found the following issue on: > > > > > > HEAD commit: 075dbe9f6e3c Merge tag 'soc-ep93xx-dt-6.12' of git://git.k.. > > > git tree: upstream > > > console output: https://syzkaller.appspot.com/x/log.txt?x=15f07a80580000 > > > kernel config: https://syzkaller.appspot.com/x/.config?x=86254f9e0a8f2c98 > > > dashboard link: https://syzkaller.appspot.com/bug?extid=0ec1e96c2cdf5c0e512a > > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 > > > > > > Unfortunately, I don't have any reproducer for this issue yet. > > > > > > Downloadable assets: > > > disk image: https://storage.googleapis.com/syzbot-assets/1be80941df60/disk-075dbe9f.raw.xz > > > vmlinux: https://storage.googleapis.com/syzbot-assets/494a9ac89c09/vmlinux-075dbe9f.xz > > > kernel image: https://storage.googleapis.com/syzbot-assets/919788d8c731/bzImage-075dbe9f.xz > > > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > > Reported-by: syzbot+0ec1e96c2cdf5c0e512a@syzkaller.appspotmail.com [...] > > +PeterZ, who added the KCSAN assertion. > > Well, PaulMck did in d6111cf45c57 ("sched: Use WRITE_ONCE() for > p->on_rq"), I just moved it around in e8901061ca0c ("sched: Split > DEQUEUE_SLEEP from deactivate_task()"). > > I'm not at all sure I have any inkling as to what the annotation does > nor what KCSAN is trying to tell us above. ASSERT_EXCLUSIVE_WRITER(var) is to say that that there should be no concurrent writes to var; other readers are allowed. If KCSAN is enabled, it then goes and reports any violations of that assertion. Main usecase is for already marked accesses where concurrent accesses are _not_ data races, but the algorithm does not assume concurrent writers regardless. In this case it seems that Paul was trying to say that there should be no concurrent writers to this variable. But KCSAN disproved that. > Can someone please translate? We can get the 2nd stack trace with: --- a/kernel/sched/Makefile +++ b/kernel/sched/Makefile @@ -10,8 +10,8 @@ KCOV_INSTRUMENT := n # Disable KCSAN to avoid excessive noise and performance degradation. To avoid # false positives ensure barriers implied by sched functions are instrumented. -KCSAN_SANITIZE := n -KCSAN_INSTRUMENT_BARRIERS := y +#KCSAN_SANITIZE := n +#KCSAN_INSTRUMENT_BARRIERS := y Which gives us: | ================================================================== | BUG: KCSAN: assert: race in dequeue_entities / ttwu_do_activate | | write (marked) to 0xffff9e100329c628 of 4 bytes by interrupt on cpu 0: | activate_task kernel/sched/core.c:2064 [inline] This is this one: void activate_task(struct rq *rq, struct task_struct *p, int flags) { if (task_on_rq_migrating(p)) flags |= ENQUEUE_MIGRATED; if (flags & ENQUEUE_MIGRATED) sched_mm_cid_migrate_to(rq, p); enqueue_task(rq, p, flags); WRITE_ONCE(p->on_rq, TASK_ON_RQ_QUEUED); ASSERT_EXCLUSIVE_WRITER(p->on_rq); } | ttwu_do_activate+0x153/0x3e0 kernel/sched/core.c:3671 | ttwu_queue kernel/sched/core.c:3944 [inline] | try_to_wake_up+0x60f/0xaf0 kernel/sched/core.c:4270 | default_wake_function+0x25/0x30 kernel/sched/core.c:7009 | __pollwake fs/select.c:205 [inline] | pollwake+0xc0/0x100 fs/select.c:215 | __wake_up_common kernel/sched/wait.c:89 [inline] | __wake_up_common_lock kernel/sched/wait.c:106 [inline] | __wake_up_sync_key+0x85/0xc0 kernel/sched/wait.c:173 | sock_def_readable+0x6f/0x180 net/core/sock.c:3442 | tcp_data_ready+0x194/0x230 net/ipv4/tcp_input.c:5193 | tcp_data_queue+0x1052/0x2710 net/ipv4/tcp_input.c:5283 | tcp_rcv_established+0x7e3/0xd60 net/ipv4/tcp_input.c:6237 | tcp_v4_do_rcv+0x545/0x600 net/ipv4/tcp_ipv4.c:1915 | tcp_v4_rcv+0x159c/0x1890 net/ipv4/tcp_ipv4.c:2350 | ip_protocol_deliver_rcu+0x2d8/0x620 net/ipv4/ip_input.c:205 | ip_local_deliver_finish+0x11a/0x150 net/ipv4/ip_input.c:233 | NF_HOOK include/linux/netfilter.h:314 [inline] | ip_local_deliver+0xce/0x1a0 net/ipv4/ip_input.c:254 | dst_input include/net/dst.h:460 [inline] | ip_sublist_rcv_finish net/ipv4/ip_input.c:580 [inline] | ip_list_rcv_finish net/ipv4/ip_input.c:630 [inline] | ip_sublist_rcv+0x43d/0x520 net/ipv4/ip_input.c:638 | ip_list_rcv+0x262/0x2a0 net/ipv4/ip_input.c:672 | __netif_receive_skb_list_ptype net/core/dev.c:5709 [inline] | __netif_receive_skb_list_core+0x4fc/0x520 net/core/dev.c:5756 | __netif_receive_skb_list net/core/dev.c:5808 [inline] | netif_receive_skb_list_internal+0x46d/0x5e0 net/core/dev.c:5899 | gro_normal_list include/net/gro.h:515 [inline] | napi_complete_done+0x161/0x3a0 net/core/dev.c:6250 | e1000_clean+0x7c7/0x1a70 drivers/net/ethernet/intel/e1000/e1000_main.c:3808 | __napi_poll+0x66/0x360 net/core/dev.c:6775 | napi_poll net/core/dev.c:6844 [inline] | net_rx_action+0x3d9/0x820 net/core/dev.c:6966 | handle_softirqs+0xe6/0x2d0 kernel/softirq.c:554 | __do_softirq kernel/softirq.c:588 [inline] | invoke_softirq kernel/softirq.c:428 [inline] | __irq_exit_rcu+0x45/0xc0 kernel/softirq.c:637 | common_interrupt+0x4f/0xc0 arch/x86/kernel/irq.c:278 | asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:693 | | assert no writes to 0xffff9e100329c628 of 4 bytes by task 10571 on cpu 3: | __block_task kernel/sched/sched.h:2770 [inline] And that's: static inline void __block_task(struct rq *rq, struct task_struct *p) { WRITE_ONCE(p->on_rq, 0); ASSERT_EXCLUSIVE_WRITER(p->on_rq); if (p->sched_contributes_to_load) rq->nr_uninterruptible++; | dequeue_entities+0xd83/0xe70 kernel/sched/fair.c:7177 | pick_next_entity kernel/sched/fair.c:5627 [inline] | pick_task_fair kernel/sched/fair.c:8856 [inline] | pick_next_task_fair+0xaf/0x710 kernel/sched/fair.c:8876 | __pick_next_task kernel/sched/core.c:5955 [inline] | pick_next_task kernel/sched/core.c:6477 [inline] | __schedule+0x47a/0x1130 kernel/sched/core.c:6629 | __schedule_loop kernel/sched/core.c:6752 [inline] | schedule+0x7b/0x130 kernel/sched/core.c:6767 | do_nanosleep+0xdb/0x310 kernel/time/hrtimer.c:2032 | hrtimer_nanosleep+0xa0/0x180 kernel/time/hrtimer.c:2080 | common_nsleep+0x52/0x70 kernel/time/posix-timers.c:1365 | __do_sys_clock_nanosleep kernel/time/posix-timers.c:1411 [inline] | __se_sys_clock_nanosleep+0x1b2/0x1f0 kernel/time/posix-timers.c:1388 | __x64_sys_clock_nanosleep+0x55/0x70 kernel/time/posix-timers.c:1388 | x64_sys_call+0x2612/0x2f00 arch/x86/include/generated/asm/syscalls_64.h:231 | do_syscall_x64 arch/x86/entry/common.c:52 [inline] | do_syscall_64+0xd0/0x1a0 arch/x86/entry/common.c:83 | entry_SYSCALL_64_after_hwframe+0x77/0x7f | | Reported by Kernel Concurrency Sanitizer on: | CPU: 3 UID: 0 PID: 10571 Comm: syz.3.1083 Not tainted 6.12.0-rc2-00003-g44423ac48780-dirty #7 | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 | ================================================================== ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [syzbot] [kernel?] KCSAN: assert: race in dequeue_entities 2024-10-22 13:40 ` Marco Elver @ 2024-10-22 14:31 ` Paul E. McKenney 2024-10-23 14:13 ` Paul E. McKenney 2024-10-22 19:12 ` Peter Zijlstra 1 sibling, 1 reply; 13+ messages in thread From: Paul E. McKenney @ 2024-10-22 14:31 UTC (permalink / raw) To: Marco Elver Cc: Peter Zijlstra, Alexander Potapenko, syzbot, audit, eparis, linux-kernel, paul, syzkaller-bugs On Tue, Oct 22, 2024 at 03:40:52PM +0200, Marco Elver wrote: > On Tue, Oct 22, 2024 at 01:31PM +0200, Peter Zijlstra wrote: > > On Tue, Oct 22, 2024 at 10:06:23AM +0200, Alexander Potapenko wrote: > > > On Fri, Sep 27, 2024 at 4:57 PM syzbot > > > <syzbot+0ec1e96c2cdf5c0e512a@syzkaller.appspotmail.com> wrote: > > > > > > > > Hello, > > > > > > > > syzbot found the following issue on: > > > > > > > > HEAD commit: 075dbe9f6e3c Merge tag 'soc-ep93xx-dt-6.12' of git://git.k.. > > > > git tree: upstream > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=15f07a80580000 > > > > kernel config: https://syzkaller.appspot.com/x/.config?x=86254f9e0a8f2c98 > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=0ec1e96c2cdf5c0e512a > > > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 > > > > > > > > Unfortunately, I don't have any reproducer for this issue yet. > > > > > > > > Downloadable assets: > > > > disk image: https://storage.googleapis.com/syzbot-assets/1be80941df60/disk-075dbe9f.raw.xz > > > > vmlinux: https://storage.googleapis.com/syzbot-assets/494a9ac89c09/vmlinux-075dbe9f.xz > > > > kernel image: https://storage.googleapis.com/syzbot-assets/919788d8c731/bzImage-075dbe9f.xz > > > > > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > > > Reported-by: syzbot+0ec1e96c2cdf5c0e512a@syzkaller.appspotmail.com > [...] > > > +PeterZ, who added the KCSAN assertion. > > > > Well, PaulMck did in d6111cf45c57 ("sched: Use WRITE_ONCE() for > > p->on_rq"), I just moved it around in e8901061ca0c ("sched: Split > > DEQUEUE_SLEEP from deactivate_task()"). > > > > I'm not at all sure I have any inkling as to what the annotation does > > nor what KCSAN is trying to tell us above. > > ASSERT_EXCLUSIVE_WRITER(var) is to say that that there should be no > concurrent writes to var; other readers are allowed. If KCSAN is > enabled, it then goes and reports any violations of that assertion. > Main usecase is for already marked accesses where concurrent accesses > are _not_ data races, but the algorithm does not assume concurrent > writers regardless. > > In this case it seems that Paul was trying to say that there should be > no concurrent writers to this variable. But KCSAN disproved that. Just confirming that this was my intent. And for all I know, maybe it is now OK to have concurrent writers to that variable, but if so, would we please have an explanatory comment (or a reference to one)? Thanx, Paul > > Can someone please translate? > > We can get the 2nd stack trace with: > > --- a/kernel/sched/Makefile > +++ b/kernel/sched/Makefile > @@ -10,8 +10,8 @@ KCOV_INSTRUMENT := n > > # Disable KCSAN to avoid excessive noise and performance degradation. To avoid > # false positives ensure barriers implied by sched functions are instrumented. > -KCSAN_SANITIZE := n > -KCSAN_INSTRUMENT_BARRIERS := y > +#KCSAN_SANITIZE := n > +#KCSAN_INSTRUMENT_BARRIERS := y > > Which gives us: > > | ================================================================== > | BUG: KCSAN: assert: race in dequeue_entities / ttwu_do_activate > | > | write (marked) to 0xffff9e100329c628 of 4 bytes by interrupt on cpu 0: > | activate_task kernel/sched/core.c:2064 [inline] > > This is this one: > > void activate_task(struct rq *rq, struct task_struct *p, int flags) > { > if (task_on_rq_migrating(p)) > flags |= ENQUEUE_MIGRATED; > if (flags & ENQUEUE_MIGRATED) > sched_mm_cid_migrate_to(rq, p); > > enqueue_task(rq, p, flags); > > WRITE_ONCE(p->on_rq, TASK_ON_RQ_QUEUED); > ASSERT_EXCLUSIVE_WRITER(p->on_rq); > } > > | ttwu_do_activate+0x153/0x3e0 kernel/sched/core.c:3671 > | ttwu_queue kernel/sched/core.c:3944 [inline] > | try_to_wake_up+0x60f/0xaf0 kernel/sched/core.c:4270 > | default_wake_function+0x25/0x30 kernel/sched/core.c:7009 > | __pollwake fs/select.c:205 [inline] > | pollwake+0xc0/0x100 fs/select.c:215 > | __wake_up_common kernel/sched/wait.c:89 [inline] > | __wake_up_common_lock kernel/sched/wait.c:106 [inline] > | __wake_up_sync_key+0x85/0xc0 kernel/sched/wait.c:173 > | sock_def_readable+0x6f/0x180 net/core/sock.c:3442 > | tcp_data_ready+0x194/0x230 net/ipv4/tcp_input.c:5193 > | tcp_data_queue+0x1052/0x2710 net/ipv4/tcp_input.c:5283 > | tcp_rcv_established+0x7e3/0xd60 net/ipv4/tcp_input.c:6237 > | tcp_v4_do_rcv+0x545/0x600 net/ipv4/tcp_ipv4.c:1915 > | tcp_v4_rcv+0x159c/0x1890 net/ipv4/tcp_ipv4.c:2350 > | ip_protocol_deliver_rcu+0x2d8/0x620 net/ipv4/ip_input.c:205 > | ip_local_deliver_finish+0x11a/0x150 net/ipv4/ip_input.c:233 > | NF_HOOK include/linux/netfilter.h:314 [inline] > | ip_local_deliver+0xce/0x1a0 net/ipv4/ip_input.c:254 > | dst_input include/net/dst.h:460 [inline] > | ip_sublist_rcv_finish net/ipv4/ip_input.c:580 [inline] > | ip_list_rcv_finish net/ipv4/ip_input.c:630 [inline] > | ip_sublist_rcv+0x43d/0x520 net/ipv4/ip_input.c:638 > | ip_list_rcv+0x262/0x2a0 net/ipv4/ip_input.c:672 > | __netif_receive_skb_list_ptype net/core/dev.c:5709 [inline] > | __netif_receive_skb_list_core+0x4fc/0x520 net/core/dev.c:5756 > | __netif_receive_skb_list net/core/dev.c:5808 [inline] > | netif_receive_skb_list_internal+0x46d/0x5e0 net/core/dev.c:5899 > | gro_normal_list include/net/gro.h:515 [inline] > | napi_complete_done+0x161/0x3a0 net/core/dev.c:6250 > | e1000_clean+0x7c7/0x1a70 drivers/net/ethernet/intel/e1000/e1000_main.c:3808 > | __napi_poll+0x66/0x360 net/core/dev.c:6775 > | napi_poll net/core/dev.c:6844 [inline] > | net_rx_action+0x3d9/0x820 net/core/dev.c:6966 > | handle_softirqs+0xe6/0x2d0 kernel/softirq.c:554 > | __do_softirq kernel/softirq.c:588 [inline] > | invoke_softirq kernel/softirq.c:428 [inline] > | __irq_exit_rcu+0x45/0xc0 kernel/softirq.c:637 > | common_interrupt+0x4f/0xc0 arch/x86/kernel/irq.c:278 > | asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:693 > | > | assert no writes to 0xffff9e100329c628 of 4 bytes by task 10571 on cpu 3: > | __block_task kernel/sched/sched.h:2770 [inline] > > And that's: > > static inline void __block_task(struct rq *rq, struct task_struct *p) > { > WRITE_ONCE(p->on_rq, 0); > ASSERT_EXCLUSIVE_WRITER(p->on_rq); > if (p->sched_contributes_to_load) > rq->nr_uninterruptible++; > > | dequeue_entities+0xd83/0xe70 kernel/sched/fair.c:7177 > | pick_next_entity kernel/sched/fair.c:5627 [inline] > | pick_task_fair kernel/sched/fair.c:8856 [inline] > | pick_next_task_fair+0xaf/0x710 kernel/sched/fair.c:8876 > | __pick_next_task kernel/sched/core.c:5955 [inline] > | pick_next_task kernel/sched/core.c:6477 [inline] > | __schedule+0x47a/0x1130 kernel/sched/core.c:6629 > | __schedule_loop kernel/sched/core.c:6752 [inline] > | schedule+0x7b/0x130 kernel/sched/core.c:6767 > | do_nanosleep+0xdb/0x310 kernel/time/hrtimer.c:2032 > | hrtimer_nanosleep+0xa0/0x180 kernel/time/hrtimer.c:2080 > | common_nsleep+0x52/0x70 kernel/time/posix-timers.c:1365 > | __do_sys_clock_nanosleep kernel/time/posix-timers.c:1411 [inline] > | __se_sys_clock_nanosleep+0x1b2/0x1f0 kernel/time/posix-timers.c:1388 > | __x64_sys_clock_nanosleep+0x55/0x70 kernel/time/posix-timers.c:1388 > | x64_sys_call+0x2612/0x2f00 arch/x86/include/generated/asm/syscalls_64.h:231 > | do_syscall_x64 arch/x86/entry/common.c:52 [inline] > | do_syscall_64+0xd0/0x1a0 arch/x86/entry/common.c:83 > | entry_SYSCALL_64_after_hwframe+0x77/0x7f > | > | Reported by Kernel Concurrency Sanitizer on: > | CPU: 3 UID: 0 PID: 10571 Comm: syz.3.1083 Not tainted 6.12.0-rc2-00003-g44423ac48780-dirty #7 > | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 > | ================================================================== ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [syzbot] [kernel?] KCSAN: assert: race in dequeue_entities 2024-10-22 14:31 ` Paul E. McKenney @ 2024-10-23 14:13 ` Paul E. McKenney 0 siblings, 0 replies; 13+ messages in thread From: Paul E. McKenney @ 2024-10-23 14:13 UTC (permalink / raw) To: Marco Elver Cc: Peter Zijlstra, Alexander Potapenko, syzbot, audit, eparis, linux-kernel, paul, syzkaller-bugs On Tue, Oct 22, 2024 at 07:31:28AM -0700, Paul E. McKenney wrote: > On Tue, Oct 22, 2024 at 03:40:52PM +0200, Marco Elver wrote: > > On Tue, Oct 22, 2024 at 01:31PM +0200, Peter Zijlstra wrote: > > > On Tue, Oct 22, 2024 at 10:06:23AM +0200, Alexander Potapenko wrote: > > > > On Fri, Sep 27, 2024 at 4:57 PM syzbot > > > > <syzbot+0ec1e96c2cdf5c0e512a@syzkaller.appspotmail.com> wrote: > > > > > > > > > > Hello, > > > > > > > > > > syzbot found the following issue on: > > > > > > > > > > HEAD commit: 075dbe9f6e3c Merge tag 'soc-ep93xx-dt-6.12' of git://git.k.. > > > > > git tree: upstream > > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=15f07a80580000 > > > > > kernel config: https://syzkaller.appspot.com/x/.config?x=86254f9e0a8f2c98 > > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=0ec1e96c2cdf5c0e512a > > > > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 > > > > > > > > > > Unfortunately, I don't have any reproducer for this issue yet. > > > > > > > > > > Downloadable assets: > > > > > disk image: https://storage.googleapis.com/syzbot-assets/1be80941df60/disk-075dbe9f.raw.xz > > > > > vmlinux: https://storage.googleapis.com/syzbot-assets/494a9ac89c09/vmlinux-075dbe9f.xz > > > > > kernel image: https://storage.googleapis.com/syzbot-assets/919788d8c731/bzImage-075dbe9f.xz > > > > > > > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > > > > Reported-by: syzbot+0ec1e96c2cdf5c0e512a@syzkaller.appspotmail.com > > [...] > > > > +PeterZ, who added the KCSAN assertion. > > > > > > Well, PaulMck did in d6111cf45c57 ("sched: Use WRITE_ONCE() for > > > p->on_rq"), I just moved it around in e8901061ca0c ("sched: Split > > > DEQUEUE_SLEEP from deactivate_task()"). > > > > > > I'm not at all sure I have any inkling as to what the annotation does > > > nor what KCSAN is trying to tell us above. > > > > ASSERT_EXCLUSIVE_WRITER(var) is to say that that there should be no > > concurrent writes to var; other readers are allowed. If KCSAN is > > enabled, it then goes and reports any violations of that assertion. > > Main usecase is for already marked accesses where concurrent accesses > > are _not_ data races, but the algorithm does not assume concurrent > > writers regardless. > > > > In this case it seems that Paul was trying to say that there should be > > no concurrent writers to this variable. But KCSAN disproved that. > > Just confirming that this was my intent. Except that it looks to me that Peter added this one, not me. ;-) > And for all I know, maybe it is now OK to have concurrent writers to > that variable, but if so, would we please have an explanatory comment > (or a reference to one)? Thanx, Paul > > > Can someone please translate? > > > > We can get the 2nd stack trace with: > > > > --- a/kernel/sched/Makefile > > +++ b/kernel/sched/Makefile > > @@ -10,8 +10,8 @@ KCOV_INSTRUMENT := n > > > > # Disable KCSAN to avoid excessive noise and performance degradation. To avoid > > # false positives ensure barriers implied by sched functions are instrumented. > > -KCSAN_SANITIZE := n > > -KCSAN_INSTRUMENT_BARRIERS := y > > +#KCSAN_SANITIZE := n > > +#KCSAN_INSTRUMENT_BARRIERS := y > > > > Which gives us: > > > > | ================================================================== > > | BUG: KCSAN: assert: race in dequeue_entities / ttwu_do_activate > > | > > | write (marked) to 0xffff9e100329c628 of 4 bytes by interrupt on cpu 0: > > | activate_task kernel/sched/core.c:2064 [inline] > > > > This is this one: > > > > void activate_task(struct rq *rq, struct task_struct *p, int flags) > > { > > if (task_on_rq_migrating(p)) > > flags |= ENQUEUE_MIGRATED; > > if (flags & ENQUEUE_MIGRATED) > > sched_mm_cid_migrate_to(rq, p); > > > > enqueue_task(rq, p, flags); > > > > WRITE_ONCE(p->on_rq, TASK_ON_RQ_QUEUED); > > ASSERT_EXCLUSIVE_WRITER(p->on_rq); > > } > > > > | ttwu_do_activate+0x153/0x3e0 kernel/sched/core.c:3671 > > | ttwu_queue kernel/sched/core.c:3944 [inline] > > | try_to_wake_up+0x60f/0xaf0 kernel/sched/core.c:4270 > > | default_wake_function+0x25/0x30 kernel/sched/core.c:7009 > > | __pollwake fs/select.c:205 [inline] > > | pollwake+0xc0/0x100 fs/select.c:215 > > | __wake_up_common kernel/sched/wait.c:89 [inline] > > | __wake_up_common_lock kernel/sched/wait.c:106 [inline] > > | __wake_up_sync_key+0x85/0xc0 kernel/sched/wait.c:173 > > | sock_def_readable+0x6f/0x180 net/core/sock.c:3442 > > | tcp_data_ready+0x194/0x230 net/ipv4/tcp_input.c:5193 > > | tcp_data_queue+0x1052/0x2710 net/ipv4/tcp_input.c:5283 > > | tcp_rcv_established+0x7e3/0xd60 net/ipv4/tcp_input.c:6237 > > | tcp_v4_do_rcv+0x545/0x600 net/ipv4/tcp_ipv4.c:1915 > > | tcp_v4_rcv+0x159c/0x1890 net/ipv4/tcp_ipv4.c:2350 > > | ip_protocol_deliver_rcu+0x2d8/0x620 net/ipv4/ip_input.c:205 > > | ip_local_deliver_finish+0x11a/0x150 net/ipv4/ip_input.c:233 > > | NF_HOOK include/linux/netfilter.h:314 [inline] > > | ip_local_deliver+0xce/0x1a0 net/ipv4/ip_input.c:254 > > | dst_input include/net/dst.h:460 [inline] > > | ip_sublist_rcv_finish net/ipv4/ip_input.c:580 [inline] > > | ip_list_rcv_finish net/ipv4/ip_input.c:630 [inline] > > | ip_sublist_rcv+0x43d/0x520 net/ipv4/ip_input.c:638 > > | ip_list_rcv+0x262/0x2a0 net/ipv4/ip_input.c:672 > > | __netif_receive_skb_list_ptype net/core/dev.c:5709 [inline] > > | __netif_receive_skb_list_core+0x4fc/0x520 net/core/dev.c:5756 > > | __netif_receive_skb_list net/core/dev.c:5808 [inline] > > | netif_receive_skb_list_internal+0x46d/0x5e0 net/core/dev.c:5899 > > | gro_normal_list include/net/gro.h:515 [inline] > > | napi_complete_done+0x161/0x3a0 net/core/dev.c:6250 > > | e1000_clean+0x7c7/0x1a70 drivers/net/ethernet/intel/e1000/e1000_main.c:3808 > > | __napi_poll+0x66/0x360 net/core/dev.c:6775 > > | napi_poll net/core/dev.c:6844 [inline] > > | net_rx_action+0x3d9/0x820 net/core/dev.c:6966 > > | handle_softirqs+0xe6/0x2d0 kernel/softirq.c:554 > > | __do_softirq kernel/softirq.c:588 [inline] > > | invoke_softirq kernel/softirq.c:428 [inline] > > | __irq_exit_rcu+0x45/0xc0 kernel/softirq.c:637 > > | common_interrupt+0x4f/0xc0 arch/x86/kernel/irq.c:278 > > | asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:693 > > | > > | assert no writes to 0xffff9e100329c628 of 4 bytes by task 10571 on cpu 3: > > | __block_task kernel/sched/sched.h:2770 [inline] > > > > And that's: > > > > static inline void __block_task(struct rq *rq, struct task_struct *p) > > { > > WRITE_ONCE(p->on_rq, 0); > > ASSERT_EXCLUSIVE_WRITER(p->on_rq); > > if (p->sched_contributes_to_load) > > rq->nr_uninterruptible++; > > > > | dequeue_entities+0xd83/0xe70 kernel/sched/fair.c:7177 > > | pick_next_entity kernel/sched/fair.c:5627 [inline] > > | pick_task_fair kernel/sched/fair.c:8856 [inline] > > | pick_next_task_fair+0xaf/0x710 kernel/sched/fair.c:8876 > > | __pick_next_task kernel/sched/core.c:5955 [inline] > > | pick_next_task kernel/sched/core.c:6477 [inline] > > | __schedule+0x47a/0x1130 kernel/sched/core.c:6629 > > | __schedule_loop kernel/sched/core.c:6752 [inline] > > | schedule+0x7b/0x130 kernel/sched/core.c:6767 > > | do_nanosleep+0xdb/0x310 kernel/time/hrtimer.c:2032 > > | hrtimer_nanosleep+0xa0/0x180 kernel/time/hrtimer.c:2080 > > | common_nsleep+0x52/0x70 kernel/time/posix-timers.c:1365 > > | __do_sys_clock_nanosleep kernel/time/posix-timers.c:1411 [inline] > > | __se_sys_clock_nanosleep+0x1b2/0x1f0 kernel/time/posix-timers.c:1388 > > | __x64_sys_clock_nanosleep+0x55/0x70 kernel/time/posix-timers.c:1388 > > | x64_sys_call+0x2612/0x2f00 arch/x86/include/generated/asm/syscalls_64.h:231 > > | do_syscall_x64 arch/x86/entry/common.c:52 [inline] > > | do_syscall_64+0xd0/0x1a0 arch/x86/entry/common.c:83 > > | entry_SYSCALL_64_after_hwframe+0x77/0x7f > > | > > | Reported by Kernel Concurrency Sanitizer on: > > | CPU: 3 UID: 0 PID: 10571 Comm: syz.3.1083 Not tainted 6.12.0-rc2-00003-g44423ac48780-dirty #7 > > | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 > > | ================================================================== ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [syzbot] [kernel?] KCSAN: assert: race in dequeue_entities 2024-10-22 13:40 ` Marco Elver 2024-10-22 14:31 ` Paul E. McKenney @ 2024-10-22 19:12 ` Peter Zijlstra 2024-10-22 19:57 ` Marco Elver 1 sibling, 1 reply; 13+ messages in thread From: Peter Zijlstra @ 2024-10-22 19:12 UTC (permalink / raw) To: Marco Elver Cc: paulmck, Alexander Potapenko, syzbot, audit, eparis, linux-kernel, paul, syzkaller-bugs On Tue, Oct 22, 2024 at 03:40:52PM +0200, Marco Elver wrote: > Which gives us: > > | ================================================================== > | BUG: KCSAN: assert: race in dequeue_entities / ttwu_do_activate > | > | write (marked) to 0xffff9e100329c628 of 4 bytes by interrupt on cpu 0: > | activate_task kernel/sched/core.c:2064 [inline] > > This is this one: > > void activate_task(struct rq *rq, struct task_struct *p, int flags) > { > if (task_on_rq_migrating(p)) > flags |= ENQUEUE_MIGRATED; > if (flags & ENQUEUE_MIGRATED) > sched_mm_cid_migrate_to(rq, p); > > enqueue_task(rq, p, flags); > > WRITE_ONCE(p->on_rq, TASK_ON_RQ_QUEUED); > ASSERT_EXCLUSIVE_WRITER(p->on_rq); > } > > | ttwu_do_activate+0x153/0x3e0 kernel/sched/core.c:3671 > | ttwu_queue kernel/sched/core.c:3944 [inline] > | try_to_wake_up+0x60f/0xaf0 kernel/sched/core.c:4270 > | assert no writes to 0xffff9e100329c628 of 4 bytes by task 10571 on cpu 3: > | __block_task kernel/sched/sched.h:2770 [inline] > > And that's: > > static inline void __block_task(struct rq *rq, struct task_struct *p) > { > WRITE_ONCE(p->on_rq, 0); > ASSERT_EXCLUSIVE_WRITER(p->on_rq); > if (p->sched_contributes_to_load) > rq->nr_uninterruptible++; > > | dequeue_entities+0xd83/0xe70 kernel/sched/fair.c:7177 > | pick_next_entity kernel/sched/fair.c:5627 [inline] > | pick_task_fair kernel/sched/fair.c:8856 [inline] > | pick_next_task_fair+0xaf/0x710 kernel/sched/fair.c:8876 > | __pick_next_task kernel/sched/core.c:5955 [inline] > | pick_next_task kernel/sched/core.c:6477 [inline] > | __schedule+0x47a/0x1130 kernel/sched/core.c:6629 > | __schedule_loop kernel/sched/core.c:6752 [inline] > | schedule+0x7b/0x130 kernel/sched/core.c:6767 So KCSAn is trying to tell me these two paths run concurrently on the same 'p' ?!? That would be a horrible bug -- both these call chains should be holding rq->__lock (for task_rq(p)). ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [syzbot] [kernel?] KCSAN: assert: race in dequeue_entities 2024-10-22 19:12 ` Peter Zijlstra @ 2024-10-22 19:57 ` Marco Elver 2024-10-23 8:54 ` Marco Elver 0 siblings, 1 reply; 13+ messages in thread From: Marco Elver @ 2024-10-22 19:57 UTC (permalink / raw) To: Peter Zijlstra Cc: paulmck, Alexander Potapenko, syzbot, audit, eparis, linux-kernel, paul, syzkaller-bugs On Tue, 22 Oct 2024 at 21:12, Peter Zijlstra <peterz@infradead.org> wrote: > > On Tue, Oct 22, 2024 at 03:40:52PM +0200, Marco Elver wrote: > > > Which gives us: > > > > | ================================================================== > > | BUG: KCSAN: assert: race in dequeue_entities / ttwu_do_activate > > | > > | write (marked) to 0xffff9e100329c628 of 4 bytes by interrupt on cpu 0: > > | activate_task kernel/sched/core.c:2064 [inline] > > > > This is this one: > > > > void activate_task(struct rq *rq, struct task_struct *p, int flags) > > { > > if (task_on_rq_migrating(p)) > > flags |= ENQUEUE_MIGRATED; > > if (flags & ENQUEUE_MIGRATED) > > sched_mm_cid_migrate_to(rq, p); > > > > enqueue_task(rq, p, flags); > > > > WRITE_ONCE(p->on_rq, TASK_ON_RQ_QUEUED); > > ASSERT_EXCLUSIVE_WRITER(p->on_rq); > > } > > > > | ttwu_do_activate+0x153/0x3e0 kernel/sched/core.c:3671 > > | ttwu_queue kernel/sched/core.c:3944 [inline] > > | try_to_wake_up+0x60f/0xaf0 kernel/sched/core.c:4270 > > > | assert no writes to 0xffff9e100329c628 of 4 bytes by task 10571 on cpu 3: > > | __block_task kernel/sched/sched.h:2770 [inline] > > > > And that's: > > > > static inline void __block_task(struct rq *rq, struct task_struct *p) > > { > > WRITE_ONCE(p->on_rq, 0); > > ASSERT_EXCLUSIVE_WRITER(p->on_rq); > > if (p->sched_contributes_to_load) > > rq->nr_uninterruptible++; > > > > | dequeue_entities+0xd83/0xe70 kernel/sched/fair.c:7177 > > | pick_next_entity kernel/sched/fair.c:5627 [inline] > > | pick_task_fair kernel/sched/fair.c:8856 [inline] > > | pick_next_task_fair+0xaf/0x710 kernel/sched/fair.c:8876 > > | __pick_next_task kernel/sched/core.c:5955 [inline] > > | pick_next_task kernel/sched/core.c:6477 [inline] > > | __schedule+0x47a/0x1130 kernel/sched/core.c:6629 > > | __schedule_loop kernel/sched/core.c:6752 [inline] > > | schedule+0x7b/0x130 kernel/sched/core.c:6767 > > > So KCSAn is trying to tell me these two paths run concurrently on the > same 'p' ?!? That would be a horrible bug -- both these call chains > should be holding rq->__lock (for task_rq(p)). Yes correct. And just to confirm this is no false positive, the way KCSAN works _requires_ the race to actually happen before it reports anything; this can also be seen in Alexander's report with just 1 stack trace where it saw the value transition from 0 to 1 (TASK_ON_RQ_QUEUED) but didn't know who did the write because kernel/sched was uninstrumented. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [syzbot] [kernel?] KCSAN: assert: race in dequeue_entities 2024-10-22 19:57 ` Marco Elver @ 2024-10-23 8:54 ` Marco Elver 2024-10-23 9:03 ` Marco Elver 0 siblings, 1 reply; 13+ messages in thread From: Marco Elver @ 2024-10-23 8:54 UTC (permalink / raw) To: Peter Zijlstra Cc: paulmck, Alexander Potapenko, syzbot, audit, eparis, linux-kernel, paul, syzkaller-bugs On Tue, Oct 22, 2024 at 09:57PM +0200, Marco Elver wrote: > On Tue, 22 Oct 2024 at 21:12, Peter Zijlstra <peterz@infradead.org> wrote: [...] > > So KCSAn is trying to tell me these two paths run concurrently on the > > same 'p' ?!? That would be a horrible bug -- both these call chains > > should be holding rq->__lock (for task_rq(p)). > > Yes correct. > > And just to confirm this is no false positive, the way KCSAN works > _requires_ the race to actually happen before it reports anything; > this can also be seen in Alexander's report with just 1 stack trace > where it saw the value transition from 0 to 1 (TASK_ON_RQ_QUEUED) but > didn't know who did the write because kernel/sched was uninstrumented. Got another version of the splat with CONFIG_KCSAN_VERBOSE=y. Lockdep seems to think that both threads here are holding rq->__lock. | ================================================================== | BUG: KCSAN: assert: race in activate_task / dequeue_entities | | write (marked) to 0xffffa2e506240068 of 4 bytes by interrupt on cpu 0: | activate_task+0xfb/0x130 kernel/sched/core.c:2064 | ttwu_do_activate+0xee/0x2b0 kernel/sched/core.c:3671 | ttwu_queue kernel/sched/core.c:3944 [inline] | try_to_wake_up+0x509/0x930 kernel/sched/core.c:4270 | default_wake_function+0x25/0x30 kernel/sched/core.c:7009 | __pollwake fs/select.c:205 [inline] | pollwake+0xc0/0x100 fs/select.c:215 | __wake_up_common kernel/sched/wait.c:89 [inline] | __wake_up_common_lock+0xc0/0x110 kernel/sched/wait.c:106 | __wake_up_sync_key+0x1b/0x30 kernel/sched/wait.c:173 | sock_def_readable+0x19a/0x460 net/core/sock.c:3442 | tcp_data_ready+0x194/0x230 net/ipv4/tcp_input.c:5193 | tcp_data_queue+0xfd8/0x26b0 net/ipv4/tcp_input.c:5283 | tcp_rcv_established+0x8e5/0xeb0 net/ipv4/tcp_input.c:6237 | tcp_v4_do_rcv+0x606/0x620 net/ipv4/tcp_ipv4.c:1915 | tcp_v4_rcv+0x16e9/0x1890 net/ipv4/tcp_ipv4.c:2350 | ip_protocol_deliver_rcu+0x382/0x6a0 net/ipv4/ip_input.c:205 | ip_local_deliver_finish+0x1bf/0x290 net/ipv4/ip_input.c:233 | NF_HOOK+0x274/0x2f0 include/linux/netfilter.h:314 | ip_local_deliver net/ipv4/ip_input.c:254 [inline] | dst_input include/net/dst.h:460 [inline] | ip_sublist_rcv_finish+0x22a/0x2c0 net/ipv4/ip_input.c:580 | ip_list_rcv_finish net/ipv4/ip_input.c:630 [inline] | ip_sublist_rcv+0x2e3/0x560 net/ipv4/ip_input.c:638 | ip_list_rcv+0x275/0x2b0 net/ipv4/ip_input.c:672 | __netif_receive_skb_list_ptype net/core/dev.c:5709 [inline] | __netif_receive_skb_list_core+0x4fc/0x520 net/core/dev.c:5756 | __netif_receive_skb_list net/core/dev.c:5808 [inline] | netif_receive_skb_list_internal+0x516/0x720 net/core/dev.c:5899 | gro_normal_list include/net/gro.h:515 [inline] | napi_complete_done+0x161/0x3c0 net/core/dev.c:6250 | e1000_clean+0x7c7/0x1a70 drivers/net/ethernet/intel/e1000/e1000_main.c:3808 | __napi_poll+0x6c/0x2c0 net/core/dev.c:6775 | napi_poll net/core/dev.c:6844 [inline] | net_rx_action+0x433/0x8e0 net/core/dev.c:6966 | handle_softirqs+0x15c/0x450 kernel/softirq.c:554 | __do_softirq kernel/softirq.c:588 [inline] | invoke_softirq kernel/softirq.c:428 [inline] | __irq_exit_rcu+0x72/0x110 kernel/softirq.c:637 | irq_exit_rcu+0x9/0x30 kernel/softirq.c:649 | common_interrupt+0xa7/0xc0 arch/x86/kernel/irq.c:278 | asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:693 | finish_task_switch+0x11c/0x3e0 kernel/sched/core.c:5189 | context_switch kernel/sched/core.c:5318 [inline] | __schedule+0xf23/0x14a0 kernel/sched/core.c:6675 | preempt_schedule_irq+0x9a/0xf0 kernel/sched/core.c:6997 | raw_irqentry_exit_cond_resched+0x52/0x60 kernel/entry/common.c:311 | irqentry_exit+0x6a/0xa0 kernel/entry/common.c:354 | sysvec_apic_timer_interrupt+0x50/0xb0 arch/x86/kernel/apic/apic.c:1037 | asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702 | __seqprop_sequence include/linux/seqlock.h:211 [inline] | get_counters net/ipv6/netfilter/ip6_tables.c:774 [inline] | alloc_counters net/ipv6/netfilter/ip6_tables.c:820 [inline] | copy_entries_to_user net/ipv6/netfilter/ip6_tables.c:837 [inline] | get_entries net/ipv6/netfilter/ip6_tables.c:1039 [inline] | do_ip6t_get_ctl+0x665/0x9d0 net/ipv6/netfilter/ip6_tables.c:1677 | nf_getsockopt+0x14e/0x170 net/netfilter/nf_sockopt.c:116 | ipv6_getsockopt+0x13c/0x1b0 net/ipv6/ipv6_sockglue.c:1493 | tcp_getsockopt+0x9e/0xd0 net/ipv4/tcp.c:4670 | sock_common_getsockopt+0x5d/0x70 net/core/sock.c:3776 | do_sock_getsockopt+0x18f/0x220 net/socket.c:2391 | __sys_getsockopt+0xf1/0x170 net/socket.c:2420 | __do_sys_getsockopt net/socket.c:2430 [inline] | __se_sys_getsockopt net/socket.c:2427 [inline] | __x64_sys_getsockopt+0x64/0x80 net/socket.c:2427 | x64_sys_call+0x120e/0x2f00 arch/x86/include/generated/asm/syscalls_64.h:56 | do_syscall_x64 arch/x86/entry/common.c:52 [inline] | do_syscall_64+0xec/0x1d0 arch/x86/entry/common.c:83 | entry_SYSCALL_64_after_hwframe+0x77/0x7f | | 8 locks held by syz-executor/24825: | #0: ffffa2e5031a2f48 (&xt[i].mutex){+.+.}-{3:3}, at: xt_find_table_lock+0x5e/0x230 net/netfilter/x_tables.c:1243 | #1: ffffffff9a357d18 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline] | #1: ffffffff9a357d18 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline] | #1: ffffffff9a357d18 (rcu_read_lock){....}-{1:2}, at: netif_receive_skb_list_internal+0x24e/0x720 net/core/dev.c:5884 | #2: ffffffff9a357d18 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline] | #2: ffffffff9a357d18 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline] | #2: ffffffff9a357d18 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0xe1/0x290 net/ipv4/ip_input.c:232 | #3: ffffa2e5072201d8 (slock-AF_INET/1){+.-.}-{2:2}, at: tcp_v4_rcv+0x1677/0x1890 net/ipv4/tcp_ipv4.c:2346 | #4: ffffffff9a357d18 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline] | #4: ffffffff9a357d18 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline] | #4: ffffffff9a357d18 (rcu_read_lock){....}-{1:2}, at: sock_def_readable+0x91/0x460 net/core/sock.c:3439 | #5: ffffa2e503a65858 (&ei->socket.wq.wait){..-.}-{2:2}, at: __wake_up_common_lock+0x26/0x110 kernel/sched/wait.c:105 | #6: ffffa2e506240930 (&p->pi_lock){-.-.}-{2:2}, at: class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:551 [inline] | #6: ffffa2e506240930 (&p->pi_lock){-.-.}-{2:2}, at: try_to_wake_up+0x50/0x930 kernel/sched/core.c:4154 | #7: ffffa2e57dc2f398 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested kernel/sched/core.c:593 [inline] | #7: ffffa2e57dc2f398 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock kernel/sched/sched.h:1505 [inline] | #7: ffffa2e57dc2f398 (&rq->__lock){-.-.}-{2:2}, at: rq_lock kernel/sched/sched.h:1804 [inline] | #7: ffffa2e57dc2f398 (&rq->__lock){-.-.}-{2:2}, at: ttwu_queue kernel/sched/core.c:3942 [inline] | #7: ffffa2e57dc2f398 (&rq->__lock){-.-.}-{2:2}, at: try_to_wake_up+0x4ce/0x930 kernel/sched/core.c:4270 | irq event stamp: 89470 | hardirqs last enabled at (89469): [<ffffffff978623b3>] seqcount_lockdep_reader_access include/linux/seqlock.h:74 [inline] | hardirqs last enabled at (89469): [<ffffffff978623b3>] ktime_get+0xf3/0x2c0 kernel/time/timekeeping.c:848 | hardirqs last disabled at (89470): [<ffffffff9971605e>] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline] | hardirqs last disabled at (89470): [<ffffffff9971605e>] _raw_spin_lock_irqsave+0x4e/0xb0 kernel/locking/spinlock.c:162 | softirqs last enabled at (89413): [<ffffffff98f80605>] spin_unlock_bh include/linux/spinlock.h:396 [inline] | softirqs last enabled at (89413): [<ffffffff98f80605>] release_sock+0x105/0x130 net/core/sock.c:3635 | softirqs last disabled at (89462): [<ffffffff976fb3f2>] __do_softirq kernel/softirq.c:588 [inline] | softirqs last disabled at (89462): [<ffffffff976fb3f2>] invoke_softirq kernel/softirq.c:428 [inline] | softirqs last disabled at (89462): [<ffffffff976fb3f2>] __irq_exit_rcu+0x72/0x110 kernel/softirq.c:637 | | assert no writes to 0xffffa2e506240068 of 4 bytes by task 16 on cpu 2: | __block_task kernel/sched/sched.h:2770 [inline] | dequeue_entities+0xde4/0xed0 kernel/sched/fair.c:7177 | pick_next_entity kernel/sched/fair.c:5627 [inline] | pick_task_fair kernel/sched/fair.c:8856 [inline] | pick_next_task_fair+0xaf/0x820 kernel/sched/fair.c:8876 | __pick_next_task kernel/sched/core.c:5955 [inline] | pick_next_task kernel/sched/core.c:6477 [inline] | __schedule+0x5e0/0x14a0 kernel/sched/core.c:6629 | __schedule_loop kernel/sched/core.c:6752 [inline] | schedule+0xe7/0x1a0 kernel/sched/core.c:6767 | schedule_timeout+0xaf/0x160 kernel/time/timer.c:2615 | rcu_gp_fqs_loop+0x2d8/0xd40 kernel/rcu/tree.c:2045 | rcu_gp_kthread+0x28/0x2f0 kernel/rcu/tree.c:2247 | kthread+0x1dd/0x220 kernel/kthread.c:389 | ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:147 | ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 | | 1 lock held by rcu_preempt/16: | #0: ffffa2e57dd2f398 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested kernel/sched/core.c:593 [inline] | #0: ffffa2e57dd2f398 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock kernel/sched/sched.h:1505 [inline] | #0: ffffa2e57dd2f398 (&rq->__lock){-.-.}-{2:2}, at: rq_lock kernel/sched/sched.h:1804 [inline] | #0: ffffa2e57dd2f398 (&rq->__lock){-.-.}-{2:2}, at: __schedule+0x1f9/0x14a0 kernel/sched/core.c:6575 | irq event stamp: 791056 | hardirqs last enabled at (791055): [<ffffffff99716270>] __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:151 [inline] | hardirqs last enabled at (791055): [<ffffffff99716270>] _raw_spin_unlock_irqrestore+0x30/0x60 kernel/locking/spinlock.c:194 | hardirqs last disabled at (791056): [<ffffffff99706ad7>] __schedule+0x1d7/0x14a0 kernel/sched/core.c:6555 | softirqs last enabled at (788540): [<ffffffff976fb3f2>] __do_softirq kernel/softirq.c:588 [inline] | softirqs last enabled at (788540): [<ffffffff976fb3f2>] invoke_softirq kernel/softirq.c:428 [inline] | softirqs last enabled at (788540): [<ffffffff976fb3f2>] __irq_exit_rcu+0x72/0x110 kernel/softirq.c:637 | softirqs last disabled at (788529): [<ffffffff976fb3f2>] __do_softirq kernel/softirq.c:588 [inline] | softirqs last disabled at (788529): [<ffffffff976fb3f2>] invoke_softirq kernel/softirq.c:428 [inline] | softirqs last disabled at (788529): [<ffffffff976fb3f2>] __irq_exit_rcu+0x72/0x110 kernel/softirq.c:637 | | Reported by Kernel Concurrency Sanitizer on: | CPU: 2 UID: 0 PID: 16 Comm: rcu_preempt Not tainted 6.12.0-rc2-00003-g44423ac48780-dirty #8 | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 | ================================================================== ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [syzbot] [kernel?] KCSAN: assert: race in dequeue_entities 2024-10-23 8:54 ` Marco Elver @ 2024-10-23 9:03 ` Marco Elver 2024-10-23 9:36 ` Peter Zijlstra 0 siblings, 1 reply; 13+ messages in thread From: Marco Elver @ 2024-10-23 9:03 UTC (permalink / raw) To: Peter Zijlstra Cc: paulmck, Alexander Potapenko, syzbot, audit, eparis, linux-kernel, paul, syzkaller-bugs On Wed, 23 Oct 2024 at 10:54, Marco Elver <elver@google.com> wrote: > > On Tue, Oct 22, 2024 at 09:57PM +0200, Marco Elver wrote: > > On Tue, 22 Oct 2024 at 21:12, Peter Zijlstra <peterz@infradead.org> wrote: > [...] > > > So KCSAn is trying to tell me these two paths run concurrently on the > > > same 'p' ?!? That would be a horrible bug -- both these call chains > > > should be holding rq->__lock (for task_rq(p)). > > > > Yes correct. > > > > And just to confirm this is no false positive, the way KCSAN works > > _requires_ the race to actually happen before it reports anything; > > this can also be seen in Alexander's report with just 1 stack trace > > where it saw the value transition from 0 to 1 (TASK_ON_RQ_QUEUED) but > > didn't know who did the write because kernel/sched was uninstrumented. > > Got another version of the splat with CONFIG_KCSAN_VERBOSE=y. Lockdep seems to > think that both threads here are holding rq->__lock. Gotta read more carefully, one instance is ffffa2e57dc2f398 another is ffffa2e57dd2f398. If I read it right, then they're not actually the same lock. > | ================================================================== > | BUG: KCSAN: assert: race in activate_task / dequeue_entities > | > | write (marked) to 0xffffa2e506240068 of 4 bytes by interrupt on cpu 0: > | activate_task+0xfb/0x130 kernel/sched/core.c:2064 > | ttwu_do_activate+0xee/0x2b0 kernel/sched/core.c:3671 > | ttwu_queue kernel/sched/core.c:3944 [inline] > | try_to_wake_up+0x509/0x930 kernel/sched/core.c:4270 > | default_wake_function+0x25/0x30 kernel/sched/core.c:7009 > | __pollwake fs/select.c:205 [inline] > | pollwake+0xc0/0x100 fs/select.c:215 > | __wake_up_common kernel/sched/wait.c:89 [inline] > | __wake_up_common_lock+0xc0/0x110 kernel/sched/wait.c:106 > | __wake_up_sync_key+0x1b/0x30 kernel/sched/wait.c:173 > | sock_def_readable+0x19a/0x460 net/core/sock.c:3442 > | tcp_data_ready+0x194/0x230 net/ipv4/tcp_input.c:5193 > | tcp_data_queue+0xfd8/0x26b0 net/ipv4/tcp_input.c:5283 > | tcp_rcv_established+0x8e5/0xeb0 net/ipv4/tcp_input.c:6237 > | tcp_v4_do_rcv+0x606/0x620 net/ipv4/tcp_ipv4.c:1915 > | tcp_v4_rcv+0x16e9/0x1890 net/ipv4/tcp_ipv4.c:2350 > | ip_protocol_deliver_rcu+0x382/0x6a0 net/ipv4/ip_input.c:205 > | ip_local_deliver_finish+0x1bf/0x290 net/ipv4/ip_input.c:233 > | NF_HOOK+0x274/0x2f0 include/linux/netfilter.h:314 > | ip_local_deliver net/ipv4/ip_input.c:254 [inline] > | dst_input include/net/dst.h:460 [inline] > | ip_sublist_rcv_finish+0x22a/0x2c0 net/ipv4/ip_input.c:580 > | ip_list_rcv_finish net/ipv4/ip_input.c:630 [inline] > | ip_sublist_rcv+0x2e3/0x560 net/ipv4/ip_input.c:638 > | ip_list_rcv+0x275/0x2b0 net/ipv4/ip_input.c:672 > | __netif_receive_skb_list_ptype net/core/dev.c:5709 [inline] > | __netif_receive_skb_list_core+0x4fc/0x520 net/core/dev.c:5756 > | __netif_receive_skb_list net/core/dev.c:5808 [inline] > | netif_receive_skb_list_internal+0x516/0x720 net/core/dev.c:5899 > | gro_normal_list include/net/gro.h:515 [inline] > | napi_complete_done+0x161/0x3c0 net/core/dev.c:6250 > | e1000_clean+0x7c7/0x1a70 drivers/net/ethernet/intel/e1000/e1000_main.c:3808 > | __napi_poll+0x6c/0x2c0 net/core/dev.c:6775 > | napi_poll net/core/dev.c:6844 [inline] > | net_rx_action+0x433/0x8e0 net/core/dev.c:6966 > | handle_softirqs+0x15c/0x450 kernel/softirq.c:554 > | __do_softirq kernel/softirq.c:588 [inline] > | invoke_softirq kernel/softirq.c:428 [inline] > | __irq_exit_rcu+0x72/0x110 kernel/softirq.c:637 > | irq_exit_rcu+0x9/0x30 kernel/softirq.c:649 > | common_interrupt+0xa7/0xc0 arch/x86/kernel/irq.c:278 > | asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:693 > | finish_task_switch+0x11c/0x3e0 kernel/sched/core.c:5189 > | context_switch kernel/sched/core.c:5318 [inline] > | __schedule+0xf23/0x14a0 kernel/sched/core.c:6675 > | preempt_schedule_irq+0x9a/0xf0 kernel/sched/core.c:6997 > | raw_irqentry_exit_cond_resched+0x52/0x60 kernel/entry/common.c:311 > | irqentry_exit+0x6a/0xa0 kernel/entry/common.c:354 > | sysvec_apic_timer_interrupt+0x50/0xb0 arch/x86/kernel/apic/apic.c:1037 > | asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702 > | __seqprop_sequence include/linux/seqlock.h:211 [inline] > | get_counters net/ipv6/netfilter/ip6_tables.c:774 [inline] > | alloc_counters net/ipv6/netfilter/ip6_tables.c:820 [inline] > | copy_entries_to_user net/ipv6/netfilter/ip6_tables.c:837 [inline] > | get_entries net/ipv6/netfilter/ip6_tables.c:1039 [inline] > | do_ip6t_get_ctl+0x665/0x9d0 net/ipv6/netfilter/ip6_tables.c:1677 > | nf_getsockopt+0x14e/0x170 net/netfilter/nf_sockopt.c:116 > | ipv6_getsockopt+0x13c/0x1b0 net/ipv6/ipv6_sockglue.c:1493 > | tcp_getsockopt+0x9e/0xd0 net/ipv4/tcp.c:4670 > | sock_common_getsockopt+0x5d/0x70 net/core/sock.c:3776 > | do_sock_getsockopt+0x18f/0x220 net/socket.c:2391 > | __sys_getsockopt+0xf1/0x170 net/socket.c:2420 > | __do_sys_getsockopt net/socket.c:2430 [inline] > | __se_sys_getsockopt net/socket.c:2427 [inline] > | __x64_sys_getsockopt+0x64/0x80 net/socket.c:2427 > | x64_sys_call+0x120e/0x2f00 arch/x86/include/generated/asm/syscalls_64.h:56 > | do_syscall_x64 arch/x86/entry/common.c:52 [inline] > | do_syscall_64+0xec/0x1d0 arch/x86/entry/common.c:83 > | entry_SYSCALL_64_after_hwframe+0x77/0x7f > | > | 8 locks held by syz-executor/24825: > | #0: ffffa2e5031a2f48 (&xt[i].mutex){+.+.}-{3:3}, at: xt_find_table_lock+0x5e/0x230 net/netfilter/x_tables.c:1243 > | #1: ffffffff9a357d18 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline] > | #1: ffffffff9a357d18 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline] > | #1: ffffffff9a357d18 (rcu_read_lock){....}-{1:2}, at: netif_receive_skb_list_internal+0x24e/0x720 net/core/dev.c:5884 > | #2: ffffffff9a357d18 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline] > | #2: ffffffff9a357d18 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline] > | #2: ffffffff9a357d18 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0xe1/0x290 net/ipv4/ip_input.c:232 > | #3: ffffa2e5072201d8 (slock-AF_INET/1){+.-.}-{2:2}, at: tcp_v4_rcv+0x1677/0x1890 net/ipv4/tcp_ipv4.c:2346 > | #4: ffffffff9a357d18 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline] > | #4: ffffffff9a357d18 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline] > | #4: ffffffff9a357d18 (rcu_read_lock){....}-{1:2}, at: sock_def_readable+0x91/0x460 net/core/sock.c:3439 > | #5: ffffa2e503a65858 (&ei->socket.wq.wait){..-.}-{2:2}, at: __wake_up_common_lock+0x26/0x110 kernel/sched/wait.c:105 > | #6: ffffa2e506240930 (&p->pi_lock){-.-.}-{2:2}, at: class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:551 [inline] > | #6: ffffa2e506240930 (&p->pi_lock){-.-.}-{2:2}, at: try_to_wake_up+0x50/0x930 kernel/sched/core.c:4154 > | #7: ffffa2e57dc2f398 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested kernel/sched/core.c:593 [inline] > | #7: ffffa2e57dc2f398 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock kernel/sched/sched.h:1505 [inline] > | #7: ffffa2e57dc2f398 (&rq->__lock){-.-.}-{2:2}, at: rq_lock kernel/sched/sched.h:1804 [inline] > | #7: ffffa2e57dc2f398 (&rq->__lock){-.-.}-{2:2}, at: ttwu_queue kernel/sched/core.c:3942 [inline] > | #7: ffffa2e57dc2f398 (&rq->__lock){-.-.}-{2:2}, at: try_to_wake_up+0x4ce/0x930 kernel/sched/core.c:4270 > | irq event stamp: 89470 > | hardirqs last enabled at (89469): [<ffffffff978623b3>] seqcount_lockdep_reader_access include/linux/seqlock.h:74 [inline] > | hardirqs last enabled at (89469): [<ffffffff978623b3>] ktime_get+0xf3/0x2c0 kernel/time/timekeeping.c:848 > | hardirqs last disabled at (89470): [<ffffffff9971605e>] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline] > | hardirqs last disabled at (89470): [<ffffffff9971605e>] _raw_spin_lock_irqsave+0x4e/0xb0 kernel/locking/spinlock.c:162 > | softirqs last enabled at (89413): [<ffffffff98f80605>] spin_unlock_bh include/linux/spinlock.h:396 [inline] > | softirqs last enabled at (89413): [<ffffffff98f80605>] release_sock+0x105/0x130 net/core/sock.c:3635 > | softirqs last disabled at (89462): [<ffffffff976fb3f2>] __do_softirq kernel/softirq.c:588 [inline] > | softirqs last disabled at (89462): [<ffffffff976fb3f2>] invoke_softirq kernel/softirq.c:428 [inline] > | softirqs last disabled at (89462): [<ffffffff976fb3f2>] __irq_exit_rcu+0x72/0x110 kernel/softirq.c:637 > | > | assert no writes to 0xffffa2e506240068 of 4 bytes by task 16 on cpu 2: > | __block_task kernel/sched/sched.h:2770 [inline] > | dequeue_entities+0xde4/0xed0 kernel/sched/fair.c:7177 > | pick_next_entity kernel/sched/fair.c:5627 [inline] > | pick_task_fair kernel/sched/fair.c:8856 [inline] > | pick_next_task_fair+0xaf/0x820 kernel/sched/fair.c:8876 > | __pick_next_task kernel/sched/core.c:5955 [inline] > | pick_next_task kernel/sched/core.c:6477 [inline] > | __schedule+0x5e0/0x14a0 kernel/sched/core.c:6629 > | __schedule_loop kernel/sched/core.c:6752 [inline] > | schedule+0xe7/0x1a0 kernel/sched/core.c:6767 > | schedule_timeout+0xaf/0x160 kernel/time/timer.c:2615 > | rcu_gp_fqs_loop+0x2d8/0xd40 kernel/rcu/tree.c:2045 > | rcu_gp_kthread+0x28/0x2f0 kernel/rcu/tree.c:2247 > | kthread+0x1dd/0x220 kernel/kthread.c:389 > | ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:147 > | ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 > | > | 1 lock held by rcu_preempt/16: > | #0: ffffa2e57dd2f398 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested kernel/sched/core.c:593 [inline] > | #0: ffffa2e57dd2f398 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock kernel/sched/sched.h:1505 [inline] > | #0: ffffa2e57dd2f398 (&rq->__lock){-.-.}-{2:2}, at: rq_lock kernel/sched/sched.h:1804 [inline] > | #0: ffffa2e57dd2f398 (&rq->__lock){-.-.}-{2:2}, at: __schedule+0x1f9/0x14a0 kernel/sched/core.c:6575 > | irq event stamp: 791056 > | hardirqs last enabled at (791055): [<ffffffff99716270>] __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:151 [inline] > | hardirqs last enabled at (791055): [<ffffffff99716270>] _raw_spin_unlock_irqrestore+0x30/0x60 kernel/locking/spinlock.c:194 > | hardirqs last disabled at (791056): [<ffffffff99706ad7>] __schedule+0x1d7/0x14a0 kernel/sched/core.c:6555 > | softirqs last enabled at (788540): [<ffffffff976fb3f2>] __do_softirq kernel/softirq.c:588 [inline] > | softirqs last enabled at (788540): [<ffffffff976fb3f2>] invoke_softirq kernel/softirq.c:428 [inline] > | softirqs last enabled at (788540): [<ffffffff976fb3f2>] __irq_exit_rcu+0x72/0x110 kernel/softirq.c:637 > | softirqs last disabled at (788529): [<ffffffff976fb3f2>] __do_softirq kernel/softirq.c:588 [inline] > | softirqs last disabled at (788529): [<ffffffff976fb3f2>] invoke_softirq kernel/softirq.c:428 [inline] > | softirqs last disabled at (788529): [<ffffffff976fb3f2>] __irq_exit_rcu+0x72/0x110 kernel/softirq.c:637 > | > | Reported by Kernel Concurrency Sanitizer on: > | CPU: 2 UID: 0 PID: 16 Comm: rcu_preempt Not tainted 6.12.0-rc2-00003-g44423ac48780-dirty #8 > | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 > | ================================================================== ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [syzbot] [kernel?] KCSAN: assert: race in dequeue_entities 2024-10-23 9:03 ` Marco Elver @ 2024-10-23 9:36 ` Peter Zijlstra 2024-10-23 13:18 ` Marco Elver 2024-10-24 10:22 ` [tip: sched/urgent] sched: Fix pick_next_task_fair() vs try_to_wake_up() race tip-bot2 for Peter Zijlstra 0 siblings, 2 replies; 13+ messages in thread From: Peter Zijlstra @ 2024-10-23 9:36 UTC (permalink / raw) To: Marco Elver Cc: paulmck, Alexander Potapenko, syzbot, audit, eparis, linux-kernel, paul, syzkaller-bugs, kent.overstreet On Wed, Oct 23, 2024 at 11:03:11AM +0200, Marco Elver wrote: > On Wed, 23 Oct 2024 at 10:54, Marco Elver <elver@google.com> wrote: > > > > On Tue, Oct 22, 2024 at 09:57PM +0200, Marco Elver wrote: > > > On Tue, 22 Oct 2024 at 21:12, Peter Zijlstra <peterz@infradead.org> wrote: > > [...] > > > > So KCSAn is trying to tell me these two paths run concurrently on the > > > > same 'p' ?!? That would be a horrible bug -- both these call chains > > > > should be holding rq->__lock (for task_rq(p)). > > > > > > Yes correct. > > > > > > And just to confirm this is no false positive, the way KCSAN works > > > _requires_ the race to actually happen before it reports anything; > > > this can also be seen in Alexander's report with just 1 stack trace > > > where it saw the value transition from 0 to 1 (TASK_ON_RQ_QUEUED) but > > > didn't know who did the write because kernel/sched was uninstrumented. > > > > Got another version of the splat with CONFIG_KCSAN_VERBOSE=y. Lockdep seems to > > think that both threads here are holding rq->__lock. > > Gotta read more carefully, one instance is ffffa2e57dc2f398 another is > ffffa2e57dd2f398. If I read it right, then they're not actually the > same lock. Yeah, as explained in the diagram below, the moment the ->on_rq = 0 store goes through, we no longer own the task. And since ASSERT_EXCLUSIVE_WRITER is after that, we go splat. The below patch changes this order and switches to using smp_store_release() and ensures to not reference the task after it. I've boot tested it, but not much else. Could you please give this a go (on top of -rc3)? This also explains the SCHED_WARN_ON() Kent saw, that is subject to the same race. --- kernel/sched/fair.c | 21 ++++++++++++++------- kernel/sched/sched.h | 34 ++++++++++++++++++++++++++++++++-- 2 files changed, 46 insertions(+), 9 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 6512258dc71f..8edac978edb2 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -5625,8 +5625,9 @@ pick_next_entity(struct rq *rq, struct cfs_rq *cfs_rq) struct sched_entity *se = pick_eevdf(cfs_rq); if (se->sched_delayed) { dequeue_entities(rq, se, DEQUEUE_SLEEP | DEQUEUE_DELAYED); - SCHED_WARN_ON(se->sched_delayed); - SCHED_WARN_ON(se->on_rq); + /* + * Must not reference @se again, see __block_task(). + */ return NULL; } return se; @@ -7170,7 +7171,11 @@ static int dequeue_entities(struct rq *rq, struct sched_entity *se, int flags) /* Fix-up what dequeue_task_fair() skipped */ hrtick_update(rq); - /* Fix-up what block_task() skipped. */ + /* + * Fix-up what block_task() skipped. + * + * Must be last, @p might not be valid after this. + */ __block_task(rq, p); } @@ -7187,12 +7192,14 @@ static bool dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags) if (!(p->se.sched_delayed && (task_on_rq_migrating(p) || (flags & DEQUEUE_SAVE)))) util_est_dequeue(&rq->cfs, p); - if (dequeue_entities(rq, &p->se, flags) < 0) { - util_est_update(&rq->cfs, p, DEQUEUE_SLEEP); + util_est_update(&rq->cfs, p, flags & DEQUEUE_SLEEP); + if (dequeue_entities(rq, &p->se, flags) < 0) return false; - } - util_est_update(&rq->cfs, p, flags & DEQUEUE_SLEEP); + /* + * Must not reference @p after dequeue_entities(DEQUEUE_DELAYED). + */ + hrtick_update(rq); return true; } diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 7b139016cbd9..32e9c41b7ec0 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -2791,8 +2791,6 @@ static inline void sub_nr_running(struct rq *rq, unsigned count) static inline void __block_task(struct rq *rq, struct task_struct *p) { - WRITE_ONCE(p->on_rq, 0); - ASSERT_EXCLUSIVE_WRITER(p->on_rq); if (p->sched_contributes_to_load) rq->nr_uninterruptible++; @@ -2800,6 +2798,38 @@ static inline void __block_task(struct rq *rq, struct task_struct *p) atomic_inc(&rq->nr_iowait); delayacct_blkio_start(); } + + ASSERT_EXCLUSIVE_WRITER(p->on_rq); + + /* + * The moment this write goes through, ttwu() can swoop in and migrate + * this task, rendering our rq->__lock ineffective. + * + * __schedule() try_to_wake_up() + * LOCK rq->__lock LOCK p->pi_lock + * pick_next_task() + * pick_next_task_fair() + * pick_next_entity() + * dequeue_entities() + * __block_task() + * RELEASE p->on_rq = 0; if (p->on_rq && ...) + * break; + * + * ACQUIRE (after ctrl-dep) + * + * cpu = select_task_rq(); + * set_task_cpu(p, cpu); + * ttwu_queue() + * ttwu_do_activate() + * LOCK rq->__lock + * activate_task() + * STORE p->on_rq = 1 + * UNLOCK rq->__lock + * + * Callers must ensure to not reference @p after this -- we no longer + * own it. + */ + smp_store_release(&p->on_rq, 0); } extern void activate_task(struct rq *rq, struct task_struct *p, int flags); ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [syzbot] [kernel?] KCSAN: assert: race in dequeue_entities 2024-10-23 9:36 ` Peter Zijlstra @ 2024-10-23 13:18 ` Marco Elver 2024-10-24 10:22 ` [tip: sched/urgent] sched: Fix pick_next_task_fair() vs try_to_wake_up() race tip-bot2 for Peter Zijlstra 1 sibling, 0 replies; 13+ messages in thread From: Marco Elver @ 2024-10-23 13:18 UTC (permalink / raw) To: Peter Zijlstra Cc: paulmck, Alexander Potapenko, syzbot, audit, eparis, linux-kernel, paul, syzkaller-bugs, kent.overstreet On Wed, 23 Oct 2024 at 11:36, Peter Zijlstra <peterz@infradead.org> wrote: > On Wed, Oct 23, 2024 at 11:03:11AM +0200, Marco Elver wrote: > > On Wed, 23 Oct 2024 at 10:54, Marco Elver <elver@google.com> wrote: > > > On Tue, Oct 22, 2024 at 09:57PM +0200, Marco Elver wrote: > > > > On Tue, 22 Oct 2024 at 21:12, Peter Zijlstra <peterz@infradead.org> wrote: > > > [...] > > > > > So KCSAn is trying to tell me these two paths run concurrently on the > > > > > same 'p' ?!? That would be a horrible bug -- both these call chains > > > > > should be holding rq->__lock (for task_rq(p)). > > > > > > > > Yes correct. > > > > > > > > And just to confirm this is no false positive, the way KCSAN works > > > > _requires_ the race to actually happen before it reports anything; > > > > this can also be seen in Alexander's report with just 1 stack trace > > > > where it saw the value transition from 0 to 1 (TASK_ON_RQ_QUEUED) but > > > > didn't know who did the write because kernel/sched was uninstrumented. > > > > > > Got another version of the splat with CONFIG_KCSAN_VERBOSE=y. Lockdep seems to > > > think that both threads here are holding rq->__lock. > > > > Gotta read more carefully, one instance is ffffa2e57dc2f398 another is > > ffffa2e57dd2f398. If I read it right, then they're not actually the > > same lock. > > Yeah, as explained in the diagram below, the moment the ->on_rq = 0 > store goes through, we no longer own the task. And since > ASSERT_EXCLUSIVE_WRITER is after that, we go splat. > > The below patch changes this order and switches to using > smp_store_release() and ensures to not reference the task after it. > > I've boot tested it, but not much else. > > Could you please give this a go (on top of -rc3)? > > This also explains the SCHED_WARN_ON() Kent saw, that is subject to the > same race. > > --- > kernel/sched/fair.c | 21 ++++++++++++++------- > kernel/sched/sched.h | 34 ++++++++++++++++++++++++++++++++-- > 2 files changed, 46 insertions(+), 9 deletions(-) [...] Tested-by: Marco Elver <elver@google.com> Previously syzkaller would give us that report within ~1h of fuzzing. Have been fuzzing with your patch applied for 3h now, and this report has not resurfaced. ^ permalink raw reply [flat|nested] 13+ messages in thread
* [tip: sched/urgent] sched: Fix pick_next_task_fair() vs try_to_wake_up() race 2024-10-23 9:36 ` Peter Zijlstra 2024-10-23 13:18 ` Marco Elver @ 2024-10-24 10:22 ` tip-bot2 for Peter Zijlstra 1 sibling, 0 replies; 13+ messages in thread From: tip-bot2 for Peter Zijlstra @ 2024-10-24 10:22 UTC (permalink / raw) To: linux-tip-commits Cc: syzbot+0ec1e96c2cdf5c0e512a, Kent Overstreet, Peter Zijlstra (Intel), Marco Elver, x86, linux-kernel The following commit has been merged into the sched/urgent branch of tip: Commit-ID: b55945c500c5723992504aa03b362fab416863a6 Gitweb: https://git.kernel.org/tip/b55945c500c5723992504aa03b362fab416863a6 Author: Peter Zijlstra <peterz@infradead.org> AuthorDate: Wed, 23 Oct 2024 11:36:41 +02:00 Committer: Peter Zijlstra <peterz@infradead.org> CommitterDate: Wed, 23 Oct 2024 20:52:26 +02:00 sched: Fix pick_next_task_fair() vs try_to_wake_up() race Syzkaller robot reported KCSAN tripping over the ASSERT_EXCLUSIVE_WRITER(p->on_rq) in __block_task(). The report noted that both pick_next_task_fair() and try_to_wake_up() were concurrently trying to write to the same p->on_rq, violating the assertion -- even though both paths hold rq->__lock. The logical consequence is that both code paths end up holding a different rq->__lock. And looking through ttwu(), this is possible when the __block_task() 'p->on_rq = 0' store is visible to the ttwu() 'p->on_rq' load, which then assumes the task is not queued and continues to migrate it. Rearrange things such that __block_task() releases @p with the store and no code thereafter will use @p again. Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue") Reported-by: syzbot+0ec1e96c2cdf5c0e512a@syzkaller.appspotmail.com Reported-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Marco Elver <elver@google.com> Link: https://lkml.kernel.org/r/20241023093641.GE16066@noisy.programming.kicks-ass.net --- kernel/sched/fair.c | 21 ++++++++++++++------- kernel/sched/sched.h | 34 ++++++++++++++++++++++++++++++++-- 2 files changed, 46 insertions(+), 9 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index c157d48..8796146 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -5625,8 +5625,9 @@ pick_next_entity(struct rq *rq, struct cfs_rq *cfs_rq) struct sched_entity *se = pick_eevdf(cfs_rq); if (se->sched_delayed) { dequeue_entities(rq, se, DEQUEUE_SLEEP | DEQUEUE_DELAYED); - SCHED_WARN_ON(se->sched_delayed); - SCHED_WARN_ON(se->on_rq); + /* + * Must not reference @se again, see __block_task(). + */ return NULL; } return se; @@ -7176,7 +7177,11 @@ static int dequeue_entities(struct rq *rq, struct sched_entity *se, int flags) /* Fix-up what dequeue_task_fair() skipped */ hrtick_update(rq); - /* Fix-up what block_task() skipped. */ + /* + * Fix-up what block_task() skipped. + * + * Must be last, @p might not be valid after this. + */ __block_task(rq, p); } @@ -7193,12 +7198,14 @@ static bool dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags) if (!(p->se.sched_delayed && (task_on_rq_migrating(p) || (flags & DEQUEUE_SAVE)))) util_est_dequeue(&rq->cfs, p); - if (dequeue_entities(rq, &p->se, flags) < 0) { - util_est_update(&rq->cfs, p, DEQUEUE_SLEEP); + util_est_update(&rq->cfs, p, flags & DEQUEUE_SLEEP); + if (dequeue_entities(rq, &p->se, flags) < 0) return false; - } - util_est_update(&rq->cfs, p, flags & DEQUEUE_SLEEP); + /* + * Must not reference @p after dequeue_entities(DEQUEUE_DELAYED). + */ + hrtick_update(rq); return true; } diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 081519f..9f9d1cc 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -2769,8 +2769,6 @@ static inline void sub_nr_running(struct rq *rq, unsigned count) static inline void __block_task(struct rq *rq, struct task_struct *p) { - WRITE_ONCE(p->on_rq, 0); - ASSERT_EXCLUSIVE_WRITER(p->on_rq); if (p->sched_contributes_to_load) rq->nr_uninterruptible++; @@ -2778,6 +2776,38 @@ static inline void __block_task(struct rq *rq, struct task_struct *p) atomic_inc(&rq->nr_iowait); delayacct_blkio_start(); } + + ASSERT_EXCLUSIVE_WRITER(p->on_rq); + + /* + * The moment this write goes through, ttwu() can swoop in and migrate + * this task, rendering our rq->__lock ineffective. + * + * __schedule() try_to_wake_up() + * LOCK rq->__lock LOCK p->pi_lock + * pick_next_task() + * pick_next_task_fair() + * pick_next_entity() + * dequeue_entities() + * __block_task() + * RELEASE p->on_rq = 0 if (p->on_rq && ...) + * break; + * + * ACQUIRE (after ctrl-dep) + * + * cpu = select_task_rq(); + * set_task_cpu(p, cpu); + * ttwu_queue() + * ttwu_do_activate() + * LOCK rq->__lock + * activate_task() + * STORE p->on_rq = 1 + * UNLOCK rq->__lock + * + * Callers must ensure to not reference @p after this -- we no longer + * own it. + */ + smp_store_release(&p->on_rq, 0); } extern void activate_task(struct rq *rq, struct task_struct *p, int flags); ^ permalink raw reply related [flat|nested] 13+ messages in thread
end of thread, other threads:[~2024-10-24 10:22 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-09-27 14:57 [syzbot] [kernel?] KCSAN: assert: race in dequeue_entities syzbot 2024-10-22 8:06 ` Alexander Potapenko 2024-10-22 11:31 ` Peter Zijlstra 2024-10-22 13:40 ` Marco Elver 2024-10-22 14:31 ` Paul E. McKenney 2024-10-23 14:13 ` Paul E. McKenney 2024-10-22 19:12 ` Peter Zijlstra 2024-10-22 19:57 ` Marco Elver 2024-10-23 8:54 ` Marco Elver 2024-10-23 9:03 ` Marco Elver 2024-10-23 9:36 ` Peter Zijlstra 2024-10-23 13:18 ` Marco Elver 2024-10-24 10:22 ` [tip: sched/urgent] sched: Fix pick_next_task_fair() vs try_to_wake_up() race tip-bot2 for Peter Zijlstra
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox