* [PATCH] drm/sched: Prevent stopped entities from being added to the run queue. @ 2025-07-20 23:56 James Flowers 2025-07-21 7:52 ` Philipp Stanner 0 siblings, 1 reply; 17+ messages in thread From: James Flowers @ 2025-07-20 23:56 UTC (permalink / raw) To: matthew.brost, dakr, phasta, ckoenig.leichtzumerken, maarten.lankhorst, mripard, tzimmermann, airlied, simona, skhan Cc: James Flowers, dri-devel, linux-kernel, linux-kernel-mentees Fixes an issue where entities are added to the run queue in drm_sched_rq_update_fifo_locked after being killed, causing a slab-use-after-free error. Signed-off-by: James Flowers <bold.zone2373@fastmail.com> --- This issue was detected by syzkaller running on a Steam Deck OLED. Unfortunately I don't have a reproducer for it. I've included the KASAN reports below: ================================================================== BUG: KASAN: slab-use-after-free in rb_next+0xda/0x160 lib/rbtree.c:505 Read of size 8 at addr ffff8881805085e0 by task kworker/u32:12/192 CPU: 3 UID: 0 PID: 192 Comm: kworker/u32:12 Not tainted 6.14.0-flowejam-+ #1 Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024 Workqueue: sdma0 drm_sched_run_job_work [gpu_sched] Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0xd2/0x130 lib/dump_stack.c:120 print_address_description.constprop.0+0x88/0x380 mm/kasan/report.c:408 print_report+0xfc/0x1ff mm/kasan/report.c:521 kasan_report+0xdd/0x1b0 mm/kasan/report.c:634 rb_next+0xda/0x160 lib/rbtree.c:505 drm_sched_rq_select_entity_fifo drivers/gpu/drm/scheduler/sched_main.c:332 [inline] [gpu_sched] drm_sched_select_entity+0x497/0x720 drivers/gpu/drm/scheduler/sched_main.c:1081 [gpu_sched] drm_sched_run_job_work+0x2e/0x710 drivers/gpu/drm/scheduler/sched_main.c:1206 [gpu_sched] process_one_work+0x9c0/0x17e0 kernel/workqueue.c:3238 process_scheduled_works kernel/workqueue.c:3319 [inline] worker_thread+0x734/0x1060 kernel/workqueue.c:3400 kthread+0x3fd/0x810 kernel/kthread.c:464 ret_from_fork+0x53/0x80 arch/x86/kernel/process.c:148 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Allocated by task 73472: kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 kasan_save_track+0x14/0x30 mm/kasan/common.c:68 poison_kmalloc_redzone mm/kasan/common.c:377 [inline] __kasan_kmalloc+0x9a/0xb0 mm/kasan/common.c:394 kmalloc_noprof include/linux/slab.h:901 [inline] [amdgpu] kzalloc_noprof include/linux/slab.h:1037 [inline] [amdgpu] amdgpu_driver_open_kms+0x151/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1402 [amdgpu] drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 do_dentry_open+0x743/0x1bf0 fs/open.c:956 vfs_open+0x87/0x3f0 fs/open.c:1086 do_open+0x72f/0xf80 fs/namei.c:3830 path_openat+0x2ec/0x770 fs/namei.c:3989 do_filp_open+0x1ff/0x420 fs/namei.c:4016 do_sys_openat2+0x181/0x1e0 fs/open.c:1428 do_sys_open fs/open.c:1443 [inline] __do_sys_openat fs/open.c:1459 [inline] __se_sys_openat fs/open.c:1454 [inline] __x64_sys_openat+0x149/0x210 fs/open.c:1454 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 73472: kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 kasan_save_track+0x14/0x30 mm/kasan/common.c:68 kasan_save_free_info+0x3b/0x70 mm/kasan/generic.c:576 poison_slab_object 
mm/kasan/common.c:247 [inline] __kasan_slab_free+0x52/0x70 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:233 [inline] slab_free_hook mm/slub.c:2353 [inline] slab_free mm/slub.c:4609 [inline] kfree+0x14f/0x4d0 mm/slub.c:4757 amdgpu_driver_postclose_kms+0x43d/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1538 [amdgpu] drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 __fput+0x402/0xb50 fs/file_table.c:464 task_work_run+0x155/0x250 kernel/task_work.c:227 get_signal+0x1be/0x19d0 kernel/signal.c:2809 arch_do_signal_or_restart+0x96/0x3a0 arch/x86/kernel/signal.c:337 exit_to_user_mode_loop kernel/entry/common.c:111 [inline] exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] syscall_exit_to_user_mode+0x1fc/0x290 kernel/entry/common.c:218 do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 entry_SYSCALL_64_after_hwframe+0x76/0x7e The buggy address belongs to the object at ffff888180508000 The buggy address is located 1504 bytes inside of The buggy address belongs to the physical page: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x180508 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) page_type: f5(slab) raw: 0017ffffc0000040 ffff888100043180 dead000000000100 dead000000000122 raw: 0000000000000000 0000000080020002 00000000f5000000 0000000000000000 head: 0017ffffc0000040 ffff888100043180 dead000000000100 dead000000000122 head: 0000000000000000 0000000080020002 00000000f5000000 0000000000000000 head: 0017ffffc0000003 ffffea0006014201 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888180508480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888180508500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff888180508580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff888180508600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888180508680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== ================================================================== BUG: KASAN: slab-use-after-free in rb_set_parent_color include/linux/rbtree_augmented.h:191 [inline] BUG: KASAN: slab-use-after-free in __rb_erase_augmented include/linux/rbtree_augmented.h:312 [inline] BUG: KASAN: slab-use-after-free in rb_erase+0x157c/0x1b10 lib/rbtree.c:443 Write of size 8 at addr ffff88816414c5d0 by task syz.2.3004/12376 CPU: 7 UID: 65534 PID: 12376 Comm: syz.2.3004 Not tainted 6.14.0-flowejam-+ #1 Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0xd2/0x130 lib/dump_stack.c:120 print_address_description.constprop.0+0x88/0x380 mm/kasan/report.c:408 print_report+0xfc/0x1ff mm/kasan/report.c:521 kasan_report+0xdd/0x1b0 mm/kasan/report.c:634 rb_set_parent_color include/linux/rbtree_augmented.h:191 [inline] __rb_erase_augmented include/linux/rbtree_augmented.h:312 [inline] rb_erase+0x157c/0x1b10 lib/rbtree.c:443 rb_erase_cached include/linux/rbtree.h:126 [inline] [gpu_sched] drm_sched_rq_remove_fifo_locked 
drivers/gpu/drm/scheduler/sched_main.c:154 [inline] [gpu_sched] drm_sched_rq_remove_entity+0x2d3/0x480 drivers/gpu/drm/scheduler/sched_main.c:243 [gpu_sched] drm_sched_entity_kill.part.0+0x82/0x5e0 drivers/gpu/drm/scheduler/sched_entity.c:237 [gpu_sched] drm_sched_entity_kill drivers/gpu/drm/scheduler/sched_entity.c:232 [inline] [gpu_sched] drm_sched_entity_fini+0x4c/0x290 drivers/gpu/drm/scheduler/sched_entity.c:331 [gpu_sched] amdgpu_vm_fini_entities drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:529 [inline] [amdgpu] amdgpu_vm_fini+0x862/0x1180 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:2752 [amdgpu] amdgpu_driver_postclose_kms+0x3db/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1526 [amdgpu] drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 __fput+0x402/0xb50 fs/file_table.c:464 task_work_run+0x155/0x250 kernel/task_work.c:227 exit_task_work include/linux/task_work.h:40 [inline] do_exit+0x841/0xf60 kernel/exit.c:938 do_group_exit+0xda/0x2b0 kernel/exit.c:1087 get_signal+0x171f/0x19d0 kernel/signal.c:3036 arch_do_signal_or_restart+0x96/0x3a0 arch/x86/kernel/signal.c:337 exit_to_user_mode_loop kernel/entry/common.c:111 [inline] exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] syscall_exit_to_user_mode+0x1fc/0x290 kernel/entry/common.c:218 do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f2d90da36ed Code: Unable to access opcode bytes at 0x7f2d90da36c3. RSP: 002b:00007f2d91b710d8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca RAX: 0000000000000000 RBX: 00007f2d90fe6088 RCX: 00007f2d90da36ed RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f2d90fe6088 RBP: 00007f2d90fe6080 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f2d90fe608c R13: 0000000000000000 R14: 0000000000000002 R15: 00007ffc34a67bd0 </TASK> Allocated by task 12381: kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 kasan_save_track+0x14/0x30 mm/kasan/common.c:68 poison_kmalloc_redzone mm/kasan/common.c:377 [inline] __kasan_kmalloc+0x9a/0xb0 mm/kasan/common.c:394 kmalloc_noprof include/linux/slab.h:901 [inline] [amdgpu] kzalloc_noprof include/linux/slab.h:1037 [inline] [amdgpu] amdgpu_driver_open_kms+0x151/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1402 [amdgpu] drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 do_dentry_open+0x743/0x1bf0 fs/open.c:956 vfs_open+0x87/0x3f0 fs/open.c:1086 do_open+0x72f/0xf80 fs/namei.c:3830 path_openat+0x2ec/0x770 fs/namei.c:3989 do_filp_open+0x1ff/0x420 fs/namei.c:4016 do_sys_openat2+0x181/0x1e0 fs/open.c:1428 do_sys_open fs/open.c:1443 [inline] __do_sys_openat fs/open.c:1459 [inline] __se_sys_openat fs/open.c:1454 [inline] __x64_sys_openat+0x149/0x210 fs/open.c:1454 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 12381: kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 kasan_save_track+0x14/0x30 mm/kasan/common.c:68 kasan_save_free_info+0x3b/0x70 mm/kasan/generic.c:576 poison_slab_object 
mm/kasan/common.c:247 [inline] __kasan_slab_free+0x52/0x70 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:233 [inline] slab_free_hook mm/slub.c:2353 [inline] slab_free mm/slub.c:4609 [inline] kfree+0x14f/0x4d0 mm/slub.c:4757 amdgpu_driver_postclose_kms+0x43d/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1538 [amdgpu] drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 __fput+0x402/0xb50 fs/file_table.c:464 task_work_run+0x155/0x250 kernel/task_work.c:227 get_signal+0x1be/0x19d0 kernel/signal.c:2809 arch_do_signal_or_restart+0x96/0x3a0 arch/x86/kernel/signal.c:337 exit_to_user_mode_loop kernel/entry/common.c:111 [inline] exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] syscall_exit_to_user_mode+0x1fc/0x290 kernel/entry/common.c:218 do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 entry_SYSCALL_64_after_hwframe+0x76/0x7e The buggy address belongs to the object at ffff88816414c000 The buggy address is located 1488 bytes inside of The buggy address belongs to the physical page: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x164148 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) page_type: f5(slab) raw: 0017ffffc0000040 ffff88810005c8c0 dead000000000122 0000000000000000 raw: 0000000000000000 0000000080020002 00000000f5000000 0000000000000000 head: 0017ffffc0000040 ffff88810005c8c0 dead000000000122 0000000000000000 head: 0000000000000000 0000000080020002 00000000f5000000 0000000000000000 head: 0017ffffc0000003 ffffea0005905201 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88816414c480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88816414c500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff88816414c580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88816414c600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88816414c680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== ================================================================== BUG: KASAN: slab-use-after-free in __rb_erase_augmented include/linux/rbtree_augmented.h:259 [inline] BUG: KASAN: slab-use-after-free in rb_erase+0xf5d/0x1b10 lib/rbtree.c:443 Read of size 8 at addr ffff88812ebcc5e0 by task syz.1.814/6553 CPU: 0 UID: 65534 PID: 6553 Comm: syz.1.814 Not tainted 6.14.0-flowejam-+ #1 Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0xd2/0x130 lib/dump_stack.c:120 print_address_description.constprop.0+0x88/0x380 mm/kasan/report.c:408 print_report+0xfc/0x1ff mm/kasan/report.c:521 kasan_report+0xdd/0x1b0 mm/kasan/report.c:634 __rb_erase_augmented include/linux/rbtree_augmented.h:259 [inline] rb_erase+0xf5d/0x1b10 lib/rbtree.c:443 rb_erase_cached include/linux/rbtree.h:126 [inline] [gpu_sched] drm_sched_rq_remove_fifo_locked drivers/gpu/drm/scheduler/sched_main.c:154 [inline] [gpu_sched] drm_sched_rq_remove_entity+0x2d3/0x480 drivers/gpu/drm/scheduler/sched_main.c:243 [gpu_sched] 
drm_sched_entity_kill.part.0+0x82/0x5e0 drivers/gpu/drm/scheduler/sched_entity.c:237 [gpu_sched] drm_sched_entity_kill drivers/gpu/drm/scheduler/sched_entity.c:232 [inline] [gpu_sched] drm_sched_entity_fini+0x4c/0x290 drivers/gpu/drm/scheduler/sched_entity.c:331 [gpu_sched] amdgpu_vm_fini_entities drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:529 [inline] [amdgpu] amdgpu_vm_fini+0x862/0x1180 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:2752 [amdgpu] amdgpu_driver_postclose_kms+0x3db/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1526 [amdgpu] drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 __fput+0x402/0xb50 fs/file_table.c:464 task_work_run+0x155/0x250 kernel/task_work.c:227 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline] exit_to_user_mode_loop kernel/entry/common.c:114 [inline] exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] syscall_exit_to_user_mode+0x26b/0x290 kernel/entry/common.c:218 do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7fd23eba36ed Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffc2943a358 EFLAGS: 00000246 ORIG_RAX: 00000000000001b4 RAX: 0000000000000000 RBX: 00007ffc2943a428 RCX: 00007fd23eba36ed RDX: 0000000000000000 RSI: 000000000000001e RDI: 0000000000000003 RBP: 00007fd23ede7ba0 R08: 0000000000000001 R09: 0000000c00000000 R10: 00007fd23ea00000 R11: 0000000000000246 R12: 00007fd23ede5fac R13: 00007fd23ede5fa0 R14: 0000000000059ad1 R15: 0000000000059a8e </TASK> Allocated by task 6559: kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 kasan_save_track+0x14/0x30 mm/kasan/common.c:68 poison_kmalloc_redzone mm/kasan/common.c:377 [inline] __kasan_kmalloc+0x9a/0xb0 mm/kasan/common.c:394 kmalloc_noprof include/linux/slab.h:901 [inline] [amdgpu] kzalloc_noprof include/linux/slab.h:1037 [inline] [amdgpu] amdgpu_driver_open_kms+0x151/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1402 [amdgpu] drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 do_dentry_open+0x743/0x1bf0 fs/open.c:956 vfs_open+0x87/0x3f0 fs/open.c:1086 do_open+0x72f/0xf80 fs/namei.c:3830 path_openat+0x2ec/0x770 fs/namei.c:3989 do_filp_open+0x1ff/0x420 fs/namei.c:4016 do_sys_openat2+0x181/0x1e0 fs/open.c:1428 do_sys_open fs/open.c:1443 [inline] __do_sys_openat fs/open.c:1459 [inline] __se_sys_openat fs/open.c:1454 [inline] __x64_sys_openat+0x149/0x210 fs/open.c:1454 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 6559: kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 kasan_save_track+0x14/0x30 mm/kasan/common.c:68 kasan_save_free_info+0x3b/0x70 mm/kasan/generic.c:576 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x52/0x70 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:233 [inline] slab_free_hook mm/slub.c:2353 [inline] slab_free mm/slub.c:4609 [inline] 
kfree+0x14f/0x4d0 mm/slub.c:4757 amdgpu_driver_postclose_kms+0x43d/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1538 [amdgpu] drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 __fput+0x402/0xb50 fs/file_table.c:464 task_work_run+0x155/0x250 kernel/task_work.c:227 get_signal+0x1be/0x19d0 kernel/signal.c:2809 arch_do_signal_or_restart+0x96/0x3a0 arch/x86/kernel/signal.c:337 exit_to_user_mode_loop kernel/entry/common.c:111 [inline] exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] syscall_exit_to_user_mode+0x1fc/0x290 kernel/entry/common.c:218 do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 entry_SYSCALL_64_after_hwframe+0x76/0x7e The buggy address belongs to the object at ffff88812ebcc000 The buggy address is located 1504 bytes inside of The buggy address belongs to the physical page: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12ebc8 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) page_type: f5(slab) raw: 0017ffffc0000040 ffff888100058780 dead000000000122 0000000000000000 raw: 0000000000000000 0000000000020002 00000000f5000000 0000000000000000 head: 0017ffffc0000040 ffff888100058780 dead000000000122 0000000000000000 head: 0000000000000000 0000000000020002 00000000f5000000 0000000000000000 head: 0017ffffc0000003 ffffea0004baf201 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88812ebcc480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88812ebcc500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff88812ebcc580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88812ebcc600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88812ebcc680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== ================================================================== BUG: KASAN: slab-use-after-free in drm_sched_entity_compare_before drivers/gpu/drm/scheduler/sched_main.c:147 [inline] [gpu_sched] BUG: KASAN: slab-use-after-free in rb_add_cached include/linux/rbtree.h:174 [inline] [gpu_sched] BUG: KASAN: slab-use-after-free in drm_sched_rq_update_fifo_locked+0x47b/0x540 drivers/gpu/drm/scheduler/sched_main.c:175 [gpu_sched] Read of size 8 at addr ffff8881208445c8 by task syz.1.49115/146644 CPU: 7 UID: 65534 PID: 146644 Comm: syz.1.49115 Not tainted 6.14.0-flowejam-+ #1 Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0xd2/0x130 lib/dump_stack.c:120 print_address_description.constprop.0+0x88/0x380 mm/kasan/report.c:408 print_report+0xfc/0x1ff mm/kasan/report.c:521 kasan_report+0xdd/0x1b0 mm/kasan/report.c:634 drm_sched_entity_compare_before drivers/gpu/drm/scheduler/sched_main.c:147 [inline] [gpu_sched] rb_add_cached include/linux/rbtree.h:174 [inline] [gpu_sched] drm_sched_rq_update_fifo_locked+0x47b/0x540 drivers/gpu/drm/scheduler/sched_main.c:175 [gpu_sched] drm_sched_entity_push_job+0x509/0x5d0 drivers/gpu/drm/scheduler/sched_entity.c:623 [gpu_sched] amdgpu_job_submit+0x1a4/0x270 
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c:314 [amdgpu] amdgpu_vm_sdma_commit+0x1f9/0x7d0 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c:122 [amdgpu] amdgpu_vm_pt_clear+0x540/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c:422 [amdgpu] amdgpu_vm_init+0x9c2/0x12f0 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:2609 [amdgpu] amdgpu_driver_open_kms+0x274/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1418 [amdgpu] drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 do_dentry_open+0x743/0x1bf0 fs/open.c:956 vfs_open+0x87/0x3f0 fs/open.c:1086 do_open+0x72f/0xf80 fs/namei.c:3830 path_openat+0x2ec/0x770 fs/namei.c:3989 do_filp_open+0x1ff/0x420 fs/namei.c:4016 do_sys_openat2+0x181/0x1e0 fs/open.c:1428 do_sys_open fs/open.c:1443 [inline] __do_sys_openat fs/open.c:1459 [inline] __se_sys_openat fs/open.c:1454 [inline] __x64_sys_openat+0x149/0x210 fs/open.c:1454 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7feb303a36ed Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007feb3123c018 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 RAX: ffffffffffffffda RBX: 00007feb305e5fa0 RCX: 00007feb303a36ed RDX: 0000000000000002 RSI: 0000200000000140 RDI: ffffffffffffff9c RBP: 00007feb30447722 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000001 R14: 00007feb305e5fa0 R15: 00007ffcfd0a3460 </TASK> Allocated by task 146638: kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 kasan_save_track+0x14/0x30 mm/kasan/common.c:68 poison_kmalloc_redzone mm/kasan/common.c:377 [inline] __kasan_kmalloc+0x9a/0xb0 mm/kasan/common.c:394 kmalloc_noprof include/linux/slab.h:901 [inline] [amdgpu] kzalloc_noprof include/linux/slab.h:1037 [inline] [amdgpu] amdgpu_driver_open_kms+0x151/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1402 [amdgpu] drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 do_dentry_open+0x743/0x1bf0 fs/open.c:956 vfs_open+0x87/0x3f0 fs/open.c:1086 do_open+0x72f/0xf80 fs/namei.c:3830 path_openat+0x2ec/0x770 fs/namei.c:3989 do_filp_open+0x1ff/0x420 fs/namei.c:4016 do_sys_openat2+0x181/0x1e0 fs/open.c:1428 do_sys_open fs/open.c:1443 [inline] __do_sys_openat fs/open.c:1459 [inline] __se_sys_openat fs/open.c:1454 [inline] __x64_sys_openat+0x149/0x210 fs/open.c:1454 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 146638: kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 kasan_save_track+0x14/0x30 mm/kasan/common.c:68 kasan_save_free_info+0x3b/0x70 mm/kasan/generic.c:576 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x52/0x70 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:233 [inline] slab_free_hook mm/slub.c:2353 [inline] slab_free mm/slub.c:4609 [inline] kfree+0x14f/0x4d0 mm/slub.c:4757 amdgpu_driver_postclose_kms+0x43d/0x6b0 
drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1538 [amdgpu] drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 __fput+0x402/0xb50 fs/file_table.c:464 task_work_run+0x155/0x250 kernel/task_work.c:227 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline] exit_to_user_mode_loop kernel/entry/common.c:114 [inline] exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] syscall_exit_to_user_mode+0x26b/0x290 kernel/entry/common.c:218 do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 entry_SYSCALL_64_after_hwframe+0x76/0x7e The buggy address belongs to the object at ffff888120844000 The buggy address is located 1480 bytes inside of The buggy address belongs to the physical page: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x120840 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) page_type: f5(slab) raw: 0017ffffc0000040 ffff88810005c8c0 ffffea0005744c00 dead000000000002 raw: 0000000000000000 0000000000020002 00000000f5000000 0000000000000000 head: 0017ffffc0000040 ffff88810005c8c0 ffffea0005744c00 dead000000000002 head: 0000000000000000 0000000000020002 00000000f5000000 0000000000000000 head: 0017ffffc0000003 ffffea0004821001 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888120844480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888120844500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff888120844580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff888120844600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888120844680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== drivers/gpu/drm/scheduler/sched_main.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index bfea608a7106..997a2cc1a635 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -172,8 +172,10 @@ void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity, entity->oldest_job_waiting = ts; - rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, - drm_sched_entity_compare_before); + if (!entity->stopped) { + rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, + drm_sched_entity_compare_before); + } } /** -- 2.49.0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH] drm/sched: Prevent stopped entities from being added to the run queue. 2025-07-20 23:56 [PATCH] drm/sched: Prevent stopped entities from being added to the run queue James Flowers @ 2025-07-21 7:52 ` Philipp Stanner 2025-07-21 8:16 ` Philipp Stanner 2025-08-14 10:42 ` Tvrtko Ursulin 0 siblings, 2 replies; 17+ messages in thread From: Philipp Stanner @ 2025-07-21 7:52 UTC (permalink / raw) To: James Flowers, matthew.brost, dakr, phasta, ckoenig.leichtzumerken, maarten.lankhorst, mripard, tzimmermann, airlied, simona, skhan Cc: dri-devel, linux-kernel, linux-kernel-mentees, Tvrtko Ursulin +Cc Tvrtko, who's currently reworking FIFO and RR. On Sun, 2025-07-20 at 16:56 -0700, James Flowers wrote: > Fixes an issue where entities are added to the run queue in > drm_sched_rq_update_fifo_locked after being killed, causing a > slab-use-after-free error. > > Signed-off-by: James Flowers <bold.zone2373@fastmail.com> > --- > This issue was detected by syzkaller running on a Steam Deck OLED. > Unfortunately I don't have a reproducer for it. I've Well, now that's kind of an issue – if you don't have a reproducer, how can you know that your patch is correct? How can we? It would certainly be good to know what the fuzz testing framework does. > included the KASAN reports below: Anyways, KASAN reports look interesting. But those might be many different issues. Again, would be good to know what the fuzzer has been testing. Can you maybe split this fuzz test into sub-tests? I suspsect those might be different faults. Anyways, taking a first look… > > ================================================================== > BUG: KASAN: slab-use-after-free in rb_next+0xda/0x160 lib/rbtree.c:505 > Read of size 8 at addr ffff8881805085e0 by task kworker/u32:12/192 > CPU: 3 UID: 0 PID: 192 Comm: kworker/u32:12 Not tainted 6.14.0-flowejam-+ #1 > Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024 > Workqueue: sdma0 drm_sched_run_job_work [gpu_sched] > Call Trace: > <TASK> > __dump_stack lib/dump_stack.c:94 [inline] > dump_stack_lvl+0xd2/0x130 lib/dump_stack.c:120 > print_address_description.constprop.0+0x88/0x380 mm/kasan/report.c:408 > print_report+0xfc/0x1ff mm/kasan/report.c:521 > kasan_report+0xdd/0x1b0 mm/kasan/report.c:634 > rb_next+0xda/0x160 lib/rbtree.c:505 > drm_sched_rq_select_entity_fifo drivers/gpu/drm/scheduler/sched_main.c:332 [inline] [gpu_sched] > drm_sched_select_entity+0x497/0x720 drivers/gpu/drm/scheduler/sched_main.c:1081 [gpu_sched] > drm_sched_run_job_work+0x2e/0x710 drivers/gpu/drm/scheduler/sched_main.c:1206 [gpu_sched] > process_one_work+0x9c0/0x17e0 kernel/workqueue.c:3238 > process_scheduled_works kernel/workqueue.c:3319 [inline] > worker_thread+0x734/0x1060 kernel/workqueue.c:3400 > kthread+0x3fd/0x810 kernel/kthread.c:464 > ret_from_fork+0x53/0x80 arch/x86/kernel/process.c:148 > ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 > </TASK> > Allocated by task 73472: > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 > kasan_save_track+0x14/0x30 mm/kasan/common.c:68 > poison_kmalloc_redzone mm/kasan/common.c:377 [inline] > __kasan_kmalloc+0x9a/0xb0 mm/kasan/common.c:394 > kmalloc_noprof include/linux/slab.h:901 [inline] [amdgpu] > kzalloc_noprof include/linux/slab.h:1037 [inline] [amdgpu] > amdgpu_driver_open_kms+0x151/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1402 [amdgpu] > drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 > drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 > drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 > 
drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 > chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 > do_dentry_open+0x743/0x1bf0 fs/open.c:956 > vfs_open+0x87/0x3f0 fs/open.c:1086 > do_open+0x72f/0xf80 fs/namei.c:3830 > path_openat+0x2ec/0x770 fs/namei.c:3989 > do_filp_open+0x1ff/0x420 fs/namei.c:4016 > do_sys_openat2+0x181/0x1e0 fs/open.c:1428 > do_sys_open fs/open.c:1443 [inline] > __do_sys_openat fs/open.c:1459 [inline] > __se_sys_openat fs/open.c:1454 [inline] > __x64_sys_openat+0x149/0x210 fs/open.c:1454 > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > Freed by task 73472: > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 > kasan_save_track+0x14/0x30 mm/kasan/common.c:68 > kasan_save_free_info+0x3b/0x70 mm/kasan/generic.c:576 > poison_slab_object mm/kasan/common.c:247 [inline] > __kasan_slab_free+0x52/0x70 mm/kasan/common.c:264 > kasan_slab_free include/linux/kasan.h:233 [inline] > slab_free_hook mm/slub.c:2353 [inline] > slab_free mm/slub.c:4609 [inline] > kfree+0x14f/0x4d0 mm/slub.c:4757 > amdgpu_driver_postclose_kms+0x43d/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1538 [amdgpu] > drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 > drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] > drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 > drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 > __fput+0x402/0xb50 fs/file_table.c:464 > task_work_run+0x155/0x250 kernel/task_work.c:227 > get_signal+0x1be/0x19d0 kernel/signal.c:2809 > arch_do_signal_or_restart+0x96/0x3a0 arch/x86/kernel/signal.c:337 > exit_to_user_mode_loop kernel/entry/common.c:111 [inline] > exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] > __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] > syscall_exit_to_user_mode+0x1fc/0x290 kernel/entry/common.c:218 > do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > The buggy address belongs to the object at ffff888180508000 > The buggy address is located 1504 bytes inside of > The buggy address belongs to the physical page: > page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x180508 > head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 > flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) > page_type: f5(slab) > raw: 0017ffffc0000040 ffff888100043180 dead000000000100 dead000000000122 > raw: 0000000000000000 0000000080020002 00000000f5000000 0000000000000000 > head: 0017ffffc0000040 ffff888100043180 dead000000000100 dead000000000122 > head: 0000000000000000 0000000080020002 00000000f5000000 0000000000000000 > head: 0017ffffc0000003 ffffea0006014201 ffffffffffffffff 0000000000000000 > head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 > page dumped because: kasan: bad access detected > Memory state around the buggy address: > ffff888180508480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ffff888180508500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > ffff888180508580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ^ > ffff888180508600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ffff888180508680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ================================================================== > ================================================================== > BUG: KASAN: slab-use-after-free in rb_set_parent_color 
include/linux/rbtree_augmented.h:191 [inline] > BUG: KASAN: slab-use-after-free in __rb_erase_augmented include/linux/rbtree_augmented.h:312 [inline] > BUG: KASAN: slab-use-after-free in rb_erase+0x157c/0x1b10 lib/rbtree.c:443 > Write of size 8 at addr ffff88816414c5d0 by task syz.2.3004/12376 > CPU: 7 UID: 65534 PID: 12376 Comm: syz.2.3004 Not tainted 6.14.0-flowejam-+ #1 > Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024 > Call Trace: > <TASK> > __dump_stack lib/dump_stack.c:94 [inline] > dump_stack_lvl+0xd2/0x130 lib/dump_stack.c:120 > print_address_description.constprop.0+0x88/0x380 mm/kasan/report.c:408 > print_report+0xfc/0x1ff mm/kasan/report.c:521 > kasan_report+0xdd/0x1b0 mm/kasan/report.c:634 > rb_set_parent_color include/linux/rbtree_augmented.h:191 [inline] > __rb_erase_augmented include/linux/rbtree_augmented.h:312 [inline] > rb_erase+0x157c/0x1b10 lib/rbtree.c:443 > rb_erase_cached include/linux/rbtree.h:126 [inline] [gpu_sched] > drm_sched_rq_remove_fifo_locked drivers/gpu/drm/scheduler/sched_main.c:154 [inline] [gpu_sched] > drm_sched_rq_remove_entity+0x2d3/0x480 drivers/gpu/drm/scheduler/sched_main.c:243 [gpu_sched] > drm_sched_entity_kill.part.0+0x82/0x5e0 drivers/gpu/drm/scheduler/sched_entity.c:237 [gpu_sched] > drm_sched_entity_kill drivers/gpu/drm/scheduler/sched_entity.c:232 [inline] [gpu_sched] > drm_sched_entity_fini+0x4c/0x290 drivers/gpu/drm/scheduler/sched_entity.c:331 [gpu_sched] > amdgpu_vm_fini_entities drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:529 [inline] [amdgpu] > amdgpu_vm_fini+0x862/0x1180 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:2752 [amdgpu] > amdgpu_driver_postclose_kms+0x3db/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1526 [amdgpu] > drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 > drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] > drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 > drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 > __fput+0x402/0xb50 fs/file_table.c:464 > task_work_run+0x155/0x250 kernel/task_work.c:227 > exit_task_work include/linux/task_work.h:40 [inline] > do_exit+0x841/0xf60 kernel/exit.c:938 > do_group_exit+0xda/0x2b0 kernel/exit.c:1087 > get_signal+0x171f/0x19d0 kernel/signal.c:3036 > arch_do_signal_or_restart+0x96/0x3a0 arch/x86/kernel/signal.c:337 > exit_to_user_mode_loop kernel/entry/common.c:111 [inline] > exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] > __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] > syscall_exit_to_user_mode+0x1fc/0x290 kernel/entry/common.c:218 > do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > RIP: 0033:0x7f2d90da36ed > Code: Unable to access opcode bytes at 0x7f2d90da36c3. 
> RSP: 002b:00007f2d91b710d8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca > RAX: 0000000000000000 RBX: 00007f2d90fe6088 RCX: 00007f2d90da36ed > RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f2d90fe6088 > RBP: 00007f2d90fe6080 R08: 0000000000000000 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000246 R12: 00007f2d90fe608c > R13: 0000000000000000 R14: 0000000000000002 R15: 00007ffc34a67bd0 > </TASK> > Allocated by task 12381: > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 > kasan_save_track+0x14/0x30 mm/kasan/common.c:68 > poison_kmalloc_redzone mm/kasan/common.c:377 [inline] > __kasan_kmalloc+0x9a/0xb0 mm/kasan/common.c:394 > kmalloc_noprof include/linux/slab.h:901 [inline] [amdgpu] > kzalloc_noprof include/linux/slab.h:1037 [inline] [amdgpu] > amdgpu_driver_open_kms+0x151/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1402 [amdgpu] > drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 > drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 > drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 > drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 > chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 > do_dentry_open+0x743/0x1bf0 fs/open.c:956 > vfs_open+0x87/0x3f0 fs/open.c:1086 > do_open+0x72f/0xf80 fs/namei.c:3830 > path_openat+0x2ec/0x770 fs/namei.c:3989 > do_filp_open+0x1ff/0x420 fs/namei.c:4016 > do_sys_openat2+0x181/0x1e0 fs/open.c:1428 > do_sys_open fs/open.c:1443 [inline] > __do_sys_openat fs/open.c:1459 [inline] > __se_sys_openat fs/open.c:1454 [inline] > __x64_sys_openat+0x149/0x210 fs/open.c:1454 > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > Freed by task 12381: > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 > kasan_save_track+0x14/0x30 mm/kasan/common.c:68 > kasan_save_free_info+0x3b/0x70 mm/kasan/generic.c:576 > poison_slab_object mm/kasan/common.c:247 [inline] > __kasan_slab_free+0x52/0x70 mm/kasan/common.c:264 > kasan_slab_free include/linux/kasan.h:233 [inline] > slab_free_hook mm/slub.c:2353 [inline] > slab_free mm/slub.c:4609 [inline] > kfree+0x14f/0x4d0 mm/slub.c:4757 > amdgpu_driver_postclose_kms+0x43d/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1538 [amdgpu] > drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 > drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] > drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 > drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 > __fput+0x402/0xb50 fs/file_table.c:464 > task_work_run+0x155/0x250 kernel/task_work.c:227 > get_signal+0x1be/0x19d0 kernel/signal.c:2809 > arch_do_signal_or_restart+0x96/0x3a0 arch/x86/kernel/signal.c:337 > exit_to_user_mode_loop kernel/entry/common.c:111 [inline] > exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] > __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] > syscall_exit_to_user_mode+0x1fc/0x290 kernel/entry/common.c:218 > do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > The buggy address belongs to the object at ffff88816414c000 > The buggy address is located 1488 bytes inside of > The buggy address belongs to the physical page: > page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x164148 > head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 > flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) > page_type: f5(slab) > raw: 0017ffffc0000040 ffff88810005c8c0 dead000000000122 0000000000000000 > 
raw: 0000000000000000 0000000080020002 00000000f5000000 0000000000000000 > head: 0017ffffc0000040 ffff88810005c8c0 dead000000000122 0000000000000000 > head: 0000000000000000 0000000080020002 00000000f5000000 0000000000000000 > head: 0017ffffc0000003 ffffea0005905201 ffffffffffffffff 0000000000000000 > head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 > page dumped because: kasan: bad access detected > Memory state around the buggy address: > ffff88816414c480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ffff88816414c500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > ffff88816414c580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ^ > ffff88816414c600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ffff88816414c680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ================================================================== > ================================================================== > BUG: KASAN: slab-use-after-free in __rb_erase_augmented include/linux/rbtree_augmented.h:259 [inline] > BUG: KASAN: slab-use-after-free in rb_erase+0xf5d/0x1b10 lib/rbtree.c:443 > Read of size 8 at addr ffff88812ebcc5e0 by task syz.1.814/6553 > CPU: 0 UID: 65534 PID: 6553 Comm: syz.1.814 Not tainted 6.14.0-flowejam-+ #1 > Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024 > Call Trace: > <TASK> > __dump_stack lib/dump_stack.c:94 [inline] > dump_stack_lvl+0xd2/0x130 lib/dump_stack.c:120 > print_address_description.constprop.0+0x88/0x380 mm/kasan/report.c:408 > print_report+0xfc/0x1ff mm/kasan/report.c:521 > kasan_report+0xdd/0x1b0 mm/kasan/report.c:634 > __rb_erase_augmented include/linux/rbtree_augmented.h:259 [inline] > rb_erase+0xf5d/0x1b10 lib/rbtree.c:443 > rb_erase_cached include/linux/rbtree.h:126 [inline] [gpu_sched] > drm_sched_rq_remove_fifo_locked drivers/gpu/drm/scheduler/sched_main.c:154 [inline] [gpu_sched] > drm_sched_rq_remove_entity+0x2d3/0x480 drivers/gpu/drm/scheduler/sched_main.c:243 [gpu_sched] > drm_sched_entity_kill.part.0+0x82/0x5e0 drivers/gpu/drm/scheduler/sched_entity.c:237 [gpu_sched] > drm_sched_entity_kill drivers/gpu/drm/scheduler/sched_entity.c:232 [inline] [gpu_sched] > drm_sched_entity_fini+0x4c/0x290 drivers/gpu/drm/scheduler/sched_entity.c:331 [gpu_sched] > amdgpu_vm_fini_entities drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:529 [inline] [amdgpu] > amdgpu_vm_fini+0x862/0x1180 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:2752 [amdgpu] > amdgpu_driver_postclose_kms+0x3db/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1526 [amdgpu] > drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 > drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] > drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 > drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 > __fput+0x402/0xb50 fs/file_table.c:464 > task_work_run+0x155/0x250 kernel/task_work.c:227 > resume_user_mode_work include/linux/resume_user_mode.h:50 [inline] > exit_to_user_mode_loop kernel/entry/common.c:114 [inline] > exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] > __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] > syscall_exit_to_user_mode+0x26b/0x290 kernel/entry/common.c:218 > do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > RIP: 0033:0x7fd23eba36ed > Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 > RSP: 
002b:00007ffc2943a358 EFLAGS: 00000246 ORIG_RAX: 00000000000001b4 > RAX: 0000000000000000 RBX: 00007ffc2943a428 RCX: 00007fd23eba36ed > RDX: 0000000000000000 RSI: 000000000000001e RDI: 0000000000000003 > RBP: 00007fd23ede7ba0 R08: 0000000000000001 R09: 0000000c00000000 > R10: 00007fd23ea00000 R11: 0000000000000246 R12: 00007fd23ede5fac > R13: 00007fd23ede5fa0 R14: 0000000000059ad1 R15: 0000000000059a8e > </TASK> > Allocated by task 6559: > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 > kasan_save_track+0x14/0x30 mm/kasan/common.c:68 > poison_kmalloc_redzone mm/kasan/common.c:377 [inline] > __kasan_kmalloc+0x9a/0xb0 mm/kasan/common.c:394 > kmalloc_noprof include/linux/slab.h:901 [inline] [amdgpu] > kzalloc_noprof include/linux/slab.h:1037 [inline] [amdgpu] > amdgpu_driver_open_kms+0x151/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1402 [amdgpu] > drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 > drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 > drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 > drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 > chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 > do_dentry_open+0x743/0x1bf0 fs/open.c:956 > vfs_open+0x87/0x3f0 fs/open.c:1086 > do_open+0x72f/0xf80 fs/namei.c:3830 > path_openat+0x2ec/0x770 fs/namei.c:3989 > do_filp_open+0x1ff/0x420 fs/namei.c:4016 > do_sys_openat2+0x181/0x1e0 fs/open.c:1428 > do_sys_open fs/open.c:1443 [inline] > __do_sys_openat fs/open.c:1459 [inline] > __se_sys_openat fs/open.c:1454 [inline] > __x64_sys_openat+0x149/0x210 fs/open.c:1454 > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > Freed by task 6559: > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 > kasan_save_track+0x14/0x30 mm/kasan/common.c:68 > kasan_save_free_info+0x3b/0x70 mm/kasan/generic.c:576 > poison_slab_object mm/kasan/common.c:247 [inline] > __kasan_slab_free+0x52/0x70 mm/kasan/common.c:264 > kasan_slab_free include/linux/kasan.h:233 [inline] > slab_free_hook mm/slub.c:2353 [inline] > slab_free mm/slub.c:4609 [inline] > kfree+0x14f/0x4d0 mm/slub.c:4757 > amdgpu_driver_postclose_kms+0x43d/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1538 [amdgpu] > drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 > drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] > drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 > drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 > __fput+0x402/0xb50 fs/file_table.c:464 > task_work_run+0x155/0x250 kernel/task_work.c:227 > get_signal+0x1be/0x19d0 kernel/signal.c:2809 > arch_do_signal_or_restart+0x96/0x3a0 arch/x86/kernel/signal.c:337 > exit_to_user_mode_loop kernel/entry/common.c:111 [inline] > exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] > __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] > syscall_exit_to_user_mode+0x1fc/0x290 kernel/entry/common.c:218 > do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > The buggy address belongs to the object at ffff88812ebcc000 > The buggy address is located 1504 bytes inside of > The buggy address belongs to the physical page: > page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12ebc8 > head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 > flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) > page_type: f5(slab) > raw: 0017ffffc0000040 ffff888100058780 dead000000000122 0000000000000000 > raw: 
0000000000000000 0000000000020002 00000000f5000000 0000000000000000 > head: 0017ffffc0000040 ffff888100058780 dead000000000122 0000000000000000 > head: 0000000000000000 0000000000020002 00000000f5000000 0000000000000000 > head: 0017ffffc0000003 ffffea0004baf201 ffffffffffffffff 0000000000000000 > head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 > page dumped because: kasan: bad access detected > Memory state around the buggy address: > ffff88812ebcc480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ffff88812ebcc500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > ffff88812ebcc580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ^ > ffff88812ebcc600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ffff88812ebcc680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ================================================================== > ================================================================== > BUG: KASAN: slab-use-after-free in drm_sched_entity_compare_before drivers/gpu/drm/scheduler/sched_main.c:147 [inline] [gpu_sched] > BUG: KASAN: slab-use-after-free in rb_add_cached include/linux/rbtree.h:174 [inline] [gpu_sched] > BUG: KASAN: slab-use-after-free in drm_sched_rq_update_fifo_locked+0x47b/0x540 drivers/gpu/drm/scheduler/sched_main.c:175 [gpu_sched] > Read of size 8 at addr ffff8881208445c8 by task syz.1.49115/146644 > CPU: 7 UID: 65534 PID: 146644 Comm: syz.1.49115 Not tainted 6.14.0-flowejam-+ #1 > Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024 > Call Trace: > <TASK> > __dump_stack lib/dump_stack.c:94 [inline] > dump_stack_lvl+0xd2/0x130 lib/dump_stack.c:120 > print_address_description.constprop.0+0x88/0x380 mm/kasan/report.c:408 > print_report+0xfc/0x1ff mm/kasan/report.c:521 > kasan_report+0xdd/0x1b0 mm/kasan/report.c:634 > drm_sched_entity_compare_before drivers/gpu/drm/scheduler/sched_main.c:147 [inline] [gpu_sched] > rb_add_cached include/linux/rbtree.h:174 [inline] [gpu_sched] > drm_sched_rq_update_fifo_locked+0x47b/0x540 drivers/gpu/drm/scheduler/sched_main.c:175 [gpu_sched] > drm_sched_entity_push_job+0x509/0x5d0 drivers/gpu/drm/scheduler/sched_entity.c:623 [gpu_sched] This might be a race between entity killing and the push_job. 
Let's look at your patch below… > amdgpu_job_submit+0x1a4/0x270 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c:314 [amdgpu] > amdgpu_vm_sdma_commit+0x1f9/0x7d0 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c:122 [amdgpu] > amdgpu_vm_pt_clear+0x540/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c:422 [amdgpu] > amdgpu_vm_init+0x9c2/0x12f0 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:2609 [amdgpu] > amdgpu_driver_open_kms+0x274/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1418 [amdgpu] > drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 > drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 > drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 > drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 > chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 > do_dentry_open+0x743/0x1bf0 fs/open.c:956 > vfs_open+0x87/0x3f0 fs/open.c:1086 > do_open+0x72f/0xf80 fs/namei.c:3830 > path_openat+0x2ec/0x770 fs/namei.c:3989 > do_filp_open+0x1ff/0x420 fs/namei.c:4016 > do_sys_openat2+0x181/0x1e0 fs/open.c:1428 > do_sys_open fs/open.c:1443 [inline] > __do_sys_openat fs/open.c:1459 [inline] > __se_sys_openat fs/open.c:1454 [inline] > __x64_sys_openat+0x149/0x210 fs/open.c:1454 > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > RIP: 0033:0x7feb303a36ed > Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 > RSP: 002b:00007feb3123c018 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 > RAX: ffffffffffffffda RBX: 00007feb305e5fa0 RCX: 00007feb303a36ed > RDX: 0000000000000002 RSI: 0000200000000140 RDI: ffffffffffffff9c > RBP: 00007feb30447722 R08: 0000000000000000 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 > R13: 0000000000000001 R14: 00007feb305e5fa0 R15: 00007ffcfd0a3460 > </TASK> > Allocated by task 146638: > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 > kasan_save_track+0x14/0x30 mm/kasan/common.c:68 > poison_kmalloc_redzone mm/kasan/common.c:377 [inline] > __kasan_kmalloc+0x9a/0xb0 mm/kasan/common.c:394 > kmalloc_noprof include/linux/slab.h:901 [inline] [amdgpu] > kzalloc_noprof include/linux/slab.h:1037 [inline] [amdgpu] > amdgpu_driver_open_kms+0x151/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1402 [amdgpu] > drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 > drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 > drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 > drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 > chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 > do_dentry_open+0x743/0x1bf0 fs/open.c:956 > vfs_open+0x87/0x3f0 fs/open.c:1086 > do_open+0x72f/0xf80 fs/namei.c:3830 > path_openat+0x2ec/0x770 fs/namei.c:3989 > do_filp_open+0x1ff/0x420 fs/namei.c:4016 > do_sys_openat2+0x181/0x1e0 fs/open.c:1428 > do_sys_open fs/open.c:1443 [inline] > __do_sys_openat fs/open.c:1459 [inline] > __se_sys_openat fs/open.c:1454 [inline] > __x64_sys_openat+0x149/0x210 fs/open.c:1454 > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > Freed by task 146638: > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 > kasan_save_track+0x14/0x30 mm/kasan/common.c:68 > kasan_save_free_info+0x3b/0x70 mm/kasan/generic.c:576 > poison_slab_object mm/kasan/common.c:247 [inline] > __kasan_slab_free+0x52/0x70 mm/kasan/common.c:264 > 
kasan_slab_free include/linux/kasan.h:233 [inline] > slab_free_hook mm/slub.c:2353 [inline] > slab_free mm/slub.c:4609 [inline] > kfree+0x14f/0x4d0 mm/slub.c:4757 > amdgpu_driver_postclose_kms+0x43d/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1538 [amdgpu] > drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 > drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] > drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 > drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 > __fput+0x402/0xb50 fs/file_table.c:464 > task_work_run+0x155/0x250 kernel/task_work.c:227 > resume_user_mode_work include/linux/resume_user_mode.h:50 [inline] > exit_to_user_mode_loop kernel/entry/common.c:114 [inline] > exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] > __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] > syscall_exit_to_user_mode+0x26b/0x290 kernel/entry/common.c:218 > do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > The buggy address belongs to the object at ffff888120844000 > The buggy address is located 1480 bytes inside of > The buggy address belongs to the physical page: > page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x120840 > head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 > flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) > page_type: f5(slab) > raw: 0017ffffc0000040 ffff88810005c8c0 ffffea0005744c00 dead000000000002 > raw: 0000000000000000 0000000000020002 00000000f5000000 0000000000000000 > head: 0017ffffc0000040 ffff88810005c8c0 ffffea0005744c00 dead000000000002 > head: 0000000000000000 0000000000020002 00000000f5000000 0000000000000000 > head: 0017ffffc0000003 ffffea0004821001 ffffffffffffffff 0000000000000000 > head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 > page dumped because: kasan: bad access detected > Memory state around the buggy address: > ffff888120844480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ffff888120844500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > ffff888120844580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ^ > ffff888120844600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ffff888120844680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ================================================================== > > drivers/gpu/drm/scheduler/sched_main.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c > index bfea608a7106..997a2cc1a635 100644 > --- a/drivers/gpu/drm/scheduler/sched_main.c > +++ b/drivers/gpu/drm/scheduler/sched_main.c > @@ -172,8 +172,10 @@ void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity, > > entity->oldest_job_waiting = ts; > > - rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, > - drm_sched_entity_compare_before); > + if (!entity->stopped) { > + rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, > + drm_sched_entity_compare_before); > + } If this is a race, then this patch here is broken, too, because you're checking the 'stopped' boolean as the callers of that function do, too – just later. :O Could still race, just less likely. The proper way to fix it would then be to address the issue where the locking is supposed to happen. 
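To spell the interleaving out, a deliberately simplified sketch (not the
actual call chain; "L" below is just a stand-in for whatever lock is meant
to serialize the two sides -- all other names are as in sched_main.c):

	/* pusher side: L held across the check *and* the insert */
	spin_lock(L);
	if (!entity->stopped)
		rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root,
			      drm_sched_entity_compare_before);
	spin_unlock(L);

	/* stopper side: only closes the race if it takes the same L */
	spin_lock(L);
	entity->stopped = true;
	spin_unlock(L);

If the stopper flips ->stopped (and later frees the entity) without holding
L, it can still run between the check and rb_add_cached() above, leaving a
freed rb node linked into rq->rb_tree_root -- which is what the UAF reports
show. Moving the check into drm_sched_rq_update_fifo_locked() only narrows
that window.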
Let's look at, for example, drm_sched_entity_push_job(): void drm_sched_entity_push_job(struct drm_sched_job *sched_job) { (Bla bla bla) ………… /* first job wakes up scheduler */ if (first) { struct drm_gpu_scheduler *sched; struct drm_sched_rq *rq; /* Add the entity to the run queue */ spin_lock(&entity->lock); if (entity->stopped) { <---- Aha! spin_unlock(&entity->lock); DRM_ERROR("Trying to push to a killed entity\n"); return; } rq = entity->rq; sched = rq->sched; spin_lock(&rq->lock); drm_sched_rq_add_entity(rq, entity); if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) drm_sched_rq_update_fifo_locked(entity, rq, submit_ts); <---- bumm! spin_unlock(&rq->lock); spin_unlock(&entity->lock); But the locks are still being held. So that "shouldn't be happening"(tm). Interesting. AFAICS only drm_sched_entity_kill() and drm_sched_fini() stop entities. The former holds appropriate locks, but drm_sched_fini() doesn't. So that looks like a hot candidate to me. Opinions? On the other hand, aren't drivers prohibited from calling drm_sched_entity_push_job() after calling drm_sched_fini()? If the fuzzer does that, then it's not the scheduler's fault. Could you test adding spin_lock(&entity->lock) to drm_sched_fini()? Would be cool if Tvrtko and Christian take a look. Maybe we even have a fundamental design issue. Regards P. > } > > /** ^ permalink raw reply [flat|nested] 17+ messages in thread
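For concreteness, here is a rough sketch of the experiment suggested above, i.e. taking the entity lock in drm_sched_fini()'s teardown loop the way drm_sched_entity_kill() does. This is purely hypothetical (not a proposed patch), and the follow-up message below explains why this particular nesting cannot work as-is:

/* Hypothetical change to the drm_sched_fini() teardown loop; sketch only.
 * Note the nesting: entity->lock would be taken inside rq->lock here. */
for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) {
	struct drm_sched_rq *rq = sched->sched_rq[i];

	spin_lock(&rq->lock);
	list_for_each_entry(s_entity, &rq->entities, list) {
		spin_lock(&s_entity->lock);	/* new: serialize with push_job */
		s_entity->stopped = true;
		spin_unlock(&s_entity->lock);
	}
	spin_unlock(&rq->lock);
	kfree(sched->sched_rq[i]);
}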
* Re: [PATCH] drm/sched: Prevent stopped entities from being added to the run queue. 2025-07-21 7:52 ` Philipp Stanner @ 2025-07-21 8:16 ` Philipp Stanner 2025-07-21 10:14 ` Danilo Krummrich 2025-07-22 20:05 ` James 2025-08-14 10:42 ` Tvrtko Ursulin 1 sibling, 2 replies; 17+ messages in thread From: Philipp Stanner @ 2025-07-21 8:16 UTC (permalink / raw) To: phasta, James Flowers, matthew.brost, dakr, ckoenig.leichtzumerken, maarten.lankhorst, mripard, tzimmermann, airlied, simona, skhan Cc: dri-devel, linux-kernel, linux-kernel-mentees, Tvrtko Ursulin On Mon, 2025-07-21 at 09:52 +0200, Philipp Stanner wrote: > +Cc Tvrtko, who's currently reworking FIFO and RR. > > On Sun, 2025-07-20 at 16:56 -0700, James Flowers wrote: > > Fixes an issue where entities are added to the run queue in > > drm_sched_rq_update_fifo_locked after being killed, causing a > > slab-use-after-free error. > > > > Signed-off-by: James Flowers <bold.zone2373@fastmail.com> > > --- > > This issue was detected by syzkaller running on a Steam Deck OLED. > > Unfortunately I don't have a reproducer for it. I've > > Well, now that's kind of an issue – if you don't have a reproducer, how > can you know that your patch is correct? How can we? > > It would certainly be good to know what the fuzz testing framework > does. > > > included the KASAN reports below: > > > Anyways, KASAN reports look interesting. But those might be many > different issues. Again, would be good to know what the fuzzer has been > testing. Can you maybe split this fuzz test into sub-tests? I suspsect > those might be different faults. > > > Anyways, taking a first look… > > > > > > ================================================================== > > BUG: KASAN: slab-use-after-free in rb_next+0xda/0x160 lib/rbtree.c:505 > > Read of size 8 at addr ffff8881805085e0 by task kworker/u32:12/192 > > CPU: 3 UID: 0 PID: 192 Comm: kworker/u32:12 Not tainted 6.14.0-flowejam-+ #1 > > Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024 > > Workqueue: sdma0 drm_sched_run_job_work [gpu_sched] > > Call Trace: > > <TASK> > > __dump_stack lib/dump_stack.c:94 [inline] > > dump_stack_lvl+0xd2/0x130 lib/dump_stack.c:120 > > print_address_description.constprop.0+0x88/0x380 mm/kasan/report.c:408 > > print_report+0xfc/0x1ff mm/kasan/report.c:521 > > kasan_report+0xdd/0x1b0 mm/kasan/report.c:634 > > rb_next+0xda/0x160 lib/rbtree.c:505 > > drm_sched_rq_select_entity_fifo drivers/gpu/drm/scheduler/sched_main.c:332 [inline] [gpu_sched] > > drm_sched_select_entity+0x497/0x720 drivers/gpu/drm/scheduler/sched_main.c:1081 [gpu_sched] > > drm_sched_run_job_work+0x2e/0x710 drivers/gpu/drm/scheduler/sched_main.c:1206 [gpu_sched] > > process_one_work+0x9c0/0x17e0 kernel/workqueue.c:3238 > > process_scheduled_works kernel/workqueue.c:3319 [inline] > > worker_thread+0x734/0x1060 kernel/workqueue.c:3400 > > kthread+0x3fd/0x810 kernel/kthread.c:464 > > ret_from_fork+0x53/0x80 arch/x86/kernel/process.c:148 > > ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 > > </TASK> > > Allocated by task 73472: > > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 > > kasan_save_track+0x14/0x30 mm/kasan/common.c:68 > > poison_kmalloc_redzone mm/kasan/common.c:377 [inline] > > __kasan_kmalloc+0x9a/0xb0 mm/kasan/common.c:394 > > kmalloc_noprof include/linux/slab.h:901 [inline] [amdgpu] > > kzalloc_noprof include/linux/slab.h:1037 [inline] [amdgpu] > > amdgpu_driver_open_kms+0x151/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1402 [amdgpu] > > drm_file_alloc+0x5d0/0xa00 
drivers/gpu/drm/drm_file.c:171 > > drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 > > drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 > > drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 > > chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 > > do_dentry_open+0x743/0x1bf0 fs/open.c:956 > > vfs_open+0x87/0x3f0 fs/open.c:1086 > > do_open+0x72f/0xf80 fs/namei.c:3830 > > path_openat+0x2ec/0x770 fs/namei.c:3989 > > do_filp_open+0x1ff/0x420 fs/namei.c:4016 > > do_sys_openat2+0x181/0x1e0 fs/open.c:1428 > > do_sys_open fs/open.c:1443 [inline] > > __do_sys_openat fs/open.c:1459 [inline] > > __se_sys_openat fs/open.c:1454 [inline] > > __x64_sys_openat+0x149/0x210 fs/open.c:1454 > > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > > do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 > > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > Freed by task 73472: > > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 > > kasan_save_track+0x14/0x30 mm/kasan/common.c:68 > > kasan_save_free_info+0x3b/0x70 mm/kasan/generic.c:576 > > poison_slab_object mm/kasan/common.c:247 [inline] > > __kasan_slab_free+0x52/0x70 mm/kasan/common.c:264 > > kasan_slab_free include/linux/kasan.h:233 [inline] > > slab_free_hook mm/slub.c:2353 [inline] > > slab_free mm/slub.c:4609 [inline] > > kfree+0x14f/0x4d0 mm/slub.c:4757 > > amdgpu_driver_postclose_kms+0x43d/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1538 [amdgpu] > > drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 > > drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] > > drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 > > drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 > > __fput+0x402/0xb50 fs/file_table.c:464 > > task_work_run+0x155/0x250 kernel/task_work.c:227 > > get_signal+0x1be/0x19d0 kernel/signal.c:2809 > > arch_do_signal_or_restart+0x96/0x3a0 arch/x86/kernel/signal.c:337 > > exit_to_user_mode_loop kernel/entry/common.c:111 [inline] > > exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] > > __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] > > syscall_exit_to_user_mode+0x1fc/0x290 kernel/entry/common.c:218 > > do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 > > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > The buggy address belongs to the object at ffff888180508000 > > The buggy address is located 1504 bytes inside of > > The buggy address belongs to the physical page: > > page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x180508 > > head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 > > flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) > > page_type: f5(slab) > > raw: 0017ffffc0000040 ffff888100043180 dead000000000100 dead000000000122 > > raw: 0000000000000000 0000000080020002 00000000f5000000 0000000000000000 > > head: 0017ffffc0000040 ffff888100043180 dead000000000100 dead000000000122 > > head: 0000000000000000 0000000080020002 00000000f5000000 0000000000000000 > > head: 0017ffffc0000003 ffffea0006014201 ffffffffffffffff 0000000000000000 > > head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 > > page dumped because: kasan: bad access detected > > Memory state around the buggy address: > > ffff888180508480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > ffff888180508500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > > ffff888180508580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > ^ > > ffff888180508600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > ffff888180508680: fb 
fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > ================================================================== > > ================================================================== > > BUG: KASAN: slab-use-after-free in rb_set_parent_color include/linux/rbtree_augmented.h:191 [inline] > > BUG: KASAN: slab-use-after-free in __rb_erase_augmented include/linux/rbtree_augmented.h:312 [inline] > > BUG: KASAN: slab-use-after-free in rb_erase+0x157c/0x1b10 lib/rbtree.c:443 > > Write of size 8 at addr ffff88816414c5d0 by task syz.2.3004/12376 > > CPU: 7 UID: 65534 PID: 12376 Comm: syz.2.3004 Not tainted 6.14.0-flowejam-+ #1 > > Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024 > > Call Trace: > > <TASK> > > __dump_stack lib/dump_stack.c:94 [inline] > > dump_stack_lvl+0xd2/0x130 lib/dump_stack.c:120 > > print_address_description.constprop.0+0x88/0x380 mm/kasan/report.c:408 > > print_report+0xfc/0x1ff mm/kasan/report.c:521 > > kasan_report+0xdd/0x1b0 mm/kasan/report.c:634 > > rb_set_parent_color include/linux/rbtree_augmented.h:191 [inline] > > __rb_erase_augmented include/linux/rbtree_augmented.h:312 [inline] > > rb_erase+0x157c/0x1b10 lib/rbtree.c:443 > > rb_erase_cached include/linux/rbtree.h:126 [inline] [gpu_sched] > > drm_sched_rq_remove_fifo_locked drivers/gpu/drm/scheduler/sched_main.c:154 [inline] [gpu_sched] > > drm_sched_rq_remove_entity+0x2d3/0x480 drivers/gpu/drm/scheduler/sched_main.c:243 [gpu_sched] > > drm_sched_entity_kill.part.0+0x82/0x5e0 drivers/gpu/drm/scheduler/sched_entity.c:237 [gpu_sched] > > drm_sched_entity_kill drivers/gpu/drm/scheduler/sched_entity.c:232 [inline] [gpu_sched] > > drm_sched_entity_fini+0x4c/0x290 drivers/gpu/drm/scheduler/sched_entity.c:331 [gpu_sched] > > amdgpu_vm_fini_entities drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:529 [inline] [amdgpu] > > amdgpu_vm_fini+0x862/0x1180 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:2752 [amdgpu] > > amdgpu_driver_postclose_kms+0x3db/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1526 [amdgpu] > > drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 > > drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] > > drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 > > drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 > > __fput+0x402/0xb50 fs/file_table.c:464 > > task_work_run+0x155/0x250 kernel/task_work.c:227 > > exit_task_work include/linux/task_work.h:40 [inline] > > do_exit+0x841/0xf60 kernel/exit.c:938 > > do_group_exit+0xda/0x2b0 kernel/exit.c:1087 > > get_signal+0x171f/0x19d0 kernel/signal.c:3036 > > arch_do_signal_or_restart+0x96/0x3a0 arch/x86/kernel/signal.c:337 > > exit_to_user_mode_loop kernel/entry/common.c:111 [inline] > > exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] > > __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] > > syscall_exit_to_user_mode+0x1fc/0x290 kernel/entry/common.c:218 > > do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 > > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > RIP: 0033:0x7f2d90da36ed > > Code: Unable to access opcode bytes at 0x7f2d90da36c3. 
> > RSP: 002b:00007f2d91b710d8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca > > RAX: 0000000000000000 RBX: 00007f2d90fe6088 RCX: 00007f2d90da36ed > > RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f2d90fe6088 > > RBP: 00007f2d90fe6080 R08: 0000000000000000 R09: 0000000000000000 > > R10: 0000000000000000 R11: 0000000000000246 R12: 00007f2d90fe608c > > R13: 0000000000000000 R14: 0000000000000002 R15: 00007ffc34a67bd0 > > </TASK> > > Allocated by task 12381: > > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 > > kasan_save_track+0x14/0x30 mm/kasan/common.c:68 > > poison_kmalloc_redzone mm/kasan/common.c:377 [inline] > > __kasan_kmalloc+0x9a/0xb0 mm/kasan/common.c:394 > > kmalloc_noprof include/linux/slab.h:901 [inline] [amdgpu] > > kzalloc_noprof include/linux/slab.h:1037 [inline] [amdgpu] > > amdgpu_driver_open_kms+0x151/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1402 [amdgpu] > > drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 > > drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 > > drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 > > drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 > > chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 > > do_dentry_open+0x743/0x1bf0 fs/open.c:956 > > vfs_open+0x87/0x3f0 fs/open.c:1086 > > do_open+0x72f/0xf80 fs/namei.c:3830 > > path_openat+0x2ec/0x770 fs/namei.c:3989 > > do_filp_open+0x1ff/0x420 fs/namei.c:4016 > > do_sys_openat2+0x181/0x1e0 fs/open.c:1428 > > do_sys_open fs/open.c:1443 [inline] > > __do_sys_openat fs/open.c:1459 [inline] > > __se_sys_openat fs/open.c:1454 [inline] > > __x64_sys_openat+0x149/0x210 fs/open.c:1454 > > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > > do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 > > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > Freed by task 12381: > > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 > > kasan_save_track+0x14/0x30 mm/kasan/common.c:68 > > kasan_save_free_info+0x3b/0x70 mm/kasan/generic.c:576 > > poison_slab_object mm/kasan/common.c:247 [inline] > > __kasan_slab_free+0x52/0x70 mm/kasan/common.c:264 > > kasan_slab_free include/linux/kasan.h:233 [inline] > > slab_free_hook mm/slub.c:2353 [inline] > > slab_free mm/slub.c:4609 [inline] > > kfree+0x14f/0x4d0 mm/slub.c:4757 > > amdgpu_driver_postclose_kms+0x43d/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1538 [amdgpu] > > drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 > > drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] > > drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 > > drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 > > __fput+0x402/0xb50 fs/file_table.c:464 > > task_work_run+0x155/0x250 kernel/task_work.c:227 > > get_signal+0x1be/0x19d0 kernel/signal.c:2809 > > arch_do_signal_or_restart+0x96/0x3a0 arch/x86/kernel/signal.c:337 > > exit_to_user_mode_loop kernel/entry/common.c:111 [inline] > > exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] > > __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] > > syscall_exit_to_user_mode+0x1fc/0x290 kernel/entry/common.c:218 > > do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 > > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > The buggy address belongs to the object at ffff88816414c000 > > The buggy address is located 1488 bytes inside of > > The buggy address belongs to the physical page: > > page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x164148 > > head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 > > flags: 
0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) > > page_type: f5(slab) > > raw: 0017ffffc0000040 ffff88810005c8c0 dead000000000122 0000000000000000 > > raw: 0000000000000000 0000000080020002 00000000f5000000 0000000000000000 > > head: 0017ffffc0000040 ffff88810005c8c0 dead000000000122 0000000000000000 > > head: 0000000000000000 0000000080020002 00000000f5000000 0000000000000000 > > head: 0017ffffc0000003 ffffea0005905201 ffffffffffffffff 0000000000000000 > > head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 > > page dumped because: kasan: bad access detected > > Memory state around the buggy address: > > ffff88816414c480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > ffff88816414c500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > > ffff88816414c580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > ^ > > ffff88816414c600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > ffff88816414c680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > ================================================================== > > ================================================================== > > BUG: KASAN: slab-use-after-free in __rb_erase_augmented include/linux/rbtree_augmented.h:259 [inline] > > BUG: KASAN: slab-use-after-free in rb_erase+0xf5d/0x1b10 lib/rbtree.c:443 > > Read of size 8 at addr ffff88812ebcc5e0 by task syz.1.814/6553 > > CPU: 0 UID: 65534 PID: 6553 Comm: syz.1.814 Not tainted 6.14.0-flowejam-+ #1 > > Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024 > > Call Trace: > > <TASK> > > __dump_stack lib/dump_stack.c:94 [inline] > > dump_stack_lvl+0xd2/0x130 lib/dump_stack.c:120 > > print_address_description.constprop.0+0x88/0x380 mm/kasan/report.c:408 > > print_report+0xfc/0x1ff mm/kasan/report.c:521 > > kasan_report+0xdd/0x1b0 mm/kasan/report.c:634 > > __rb_erase_augmented include/linux/rbtree_augmented.h:259 [inline] > > rb_erase+0xf5d/0x1b10 lib/rbtree.c:443 > > rb_erase_cached include/linux/rbtree.h:126 [inline] [gpu_sched] > > drm_sched_rq_remove_fifo_locked drivers/gpu/drm/scheduler/sched_main.c:154 [inline] [gpu_sched] > > drm_sched_rq_remove_entity+0x2d3/0x480 drivers/gpu/drm/scheduler/sched_main.c:243 [gpu_sched] > > drm_sched_entity_kill.part.0+0x82/0x5e0 drivers/gpu/drm/scheduler/sched_entity.c:237 [gpu_sched] > > drm_sched_entity_kill drivers/gpu/drm/scheduler/sched_entity.c:232 [inline] [gpu_sched] > > drm_sched_entity_fini+0x4c/0x290 drivers/gpu/drm/scheduler/sched_entity.c:331 [gpu_sched] > > amdgpu_vm_fini_entities drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:529 [inline] [amdgpu] > > amdgpu_vm_fini+0x862/0x1180 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:2752 [amdgpu] > > amdgpu_driver_postclose_kms+0x3db/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1526 [amdgpu] > > drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 > > drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] > > drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 > > drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 > > __fput+0x402/0xb50 fs/file_table.c:464 > > task_work_run+0x155/0x250 kernel/task_work.c:227 > > resume_user_mode_work include/linux/resume_user_mode.h:50 [inline] > > exit_to_user_mode_loop kernel/entry/common.c:114 [inline] > > exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] > > __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] > > syscall_exit_to_user_mode+0x26b/0x290 kernel/entry/common.c:218 > > do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 > > 
entry_SYSCALL_64_after_hwframe+0x76/0x7e > > RIP: 0033:0x7fd23eba36ed > > Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 > > RSP: 002b:00007ffc2943a358 EFLAGS: 00000246 ORIG_RAX: 00000000000001b4 > > RAX: 0000000000000000 RBX: 00007ffc2943a428 RCX: 00007fd23eba36ed > > RDX: 0000000000000000 RSI: 000000000000001e RDI: 0000000000000003 > > RBP: 00007fd23ede7ba0 R08: 0000000000000001 R09: 0000000c00000000 > > R10: 00007fd23ea00000 R11: 0000000000000246 R12: 00007fd23ede5fac > > R13: 00007fd23ede5fa0 R14: 0000000000059ad1 R15: 0000000000059a8e > > </TASK> > > Allocated by task 6559: > > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 > > kasan_save_track+0x14/0x30 mm/kasan/common.c:68 > > poison_kmalloc_redzone mm/kasan/common.c:377 [inline] > > __kasan_kmalloc+0x9a/0xb0 mm/kasan/common.c:394 > > kmalloc_noprof include/linux/slab.h:901 [inline] [amdgpu] > > kzalloc_noprof include/linux/slab.h:1037 [inline] [amdgpu] > > amdgpu_driver_open_kms+0x151/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1402 [amdgpu] > > drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 > > drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 > > drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 > > drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 > > chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 > > do_dentry_open+0x743/0x1bf0 fs/open.c:956 > > vfs_open+0x87/0x3f0 fs/open.c:1086 > > do_open+0x72f/0xf80 fs/namei.c:3830 > > path_openat+0x2ec/0x770 fs/namei.c:3989 > > do_filp_open+0x1ff/0x420 fs/namei.c:4016 > > do_sys_openat2+0x181/0x1e0 fs/open.c:1428 > > do_sys_open fs/open.c:1443 [inline] > > __do_sys_openat fs/open.c:1459 [inline] > > __se_sys_openat fs/open.c:1454 [inline] > > __x64_sys_openat+0x149/0x210 fs/open.c:1454 > > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > > do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 > > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > Freed by task 6559: > > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 > > kasan_save_track+0x14/0x30 mm/kasan/common.c:68 > > kasan_save_free_info+0x3b/0x70 mm/kasan/generic.c:576 > > poison_slab_object mm/kasan/common.c:247 [inline] > > __kasan_slab_free+0x52/0x70 mm/kasan/common.c:264 > > kasan_slab_free include/linux/kasan.h:233 [inline] > > slab_free_hook mm/slub.c:2353 [inline] > > slab_free mm/slub.c:4609 [inline] > > kfree+0x14f/0x4d0 mm/slub.c:4757 > > amdgpu_driver_postclose_kms+0x43d/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1538 [amdgpu] > > drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 > > drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] > > drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 > > drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 > > __fput+0x402/0xb50 fs/file_table.c:464 > > task_work_run+0x155/0x250 kernel/task_work.c:227 > > get_signal+0x1be/0x19d0 kernel/signal.c:2809 > > arch_do_signal_or_restart+0x96/0x3a0 arch/x86/kernel/signal.c:337 > > exit_to_user_mode_loop kernel/entry/common.c:111 [inline] > > exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] > > __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] > > syscall_exit_to_user_mode+0x1fc/0x290 kernel/entry/common.c:218 > > do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 > > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > The buggy address belongs to the object at ffff88812ebcc000 > > The buggy address is 
located 1504 bytes inside of > > The buggy address belongs to the physical page: > > page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12ebc8 > > head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 > > flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) > > page_type: f5(slab) > > raw: 0017ffffc0000040 ffff888100058780 dead000000000122 0000000000000000 > > raw: 0000000000000000 0000000000020002 00000000f5000000 0000000000000000 > > head: 0017ffffc0000040 ffff888100058780 dead000000000122 0000000000000000 > > head: 0000000000000000 0000000000020002 00000000f5000000 0000000000000000 > > head: 0017ffffc0000003 ffffea0004baf201 ffffffffffffffff 0000000000000000 > > head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 > > page dumped because: kasan: bad access detected > > Memory state around the buggy address: > > ffff88812ebcc480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > ffff88812ebcc500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > > ffff88812ebcc580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > ^ > > ffff88812ebcc600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > ffff88812ebcc680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > ================================================================== > > ================================================================== > > BUG: KASAN: slab-use-after-free in drm_sched_entity_compare_before drivers/gpu/drm/scheduler/sched_main.c:147 [inline] [gpu_sched] > > BUG: KASAN: slab-use-after-free in rb_add_cached include/linux/rbtree.h:174 [inline] [gpu_sched] > > BUG: KASAN: slab-use-after-free in drm_sched_rq_update_fifo_locked+0x47b/0x540 drivers/gpu/drm/scheduler/sched_main.c:175 [gpu_sched] > > Read of size 8 at addr ffff8881208445c8 by task syz.1.49115/146644 > > CPU: 7 UID: 65534 PID: 146644 Comm: syz.1.49115 Not tainted 6.14.0-flowejam-+ #1 > > Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024 > > Call Trace: > > <TASK> > > __dump_stack lib/dump_stack.c:94 [inline] > > dump_stack_lvl+0xd2/0x130 lib/dump_stack.c:120 > > print_address_description.constprop.0+0x88/0x380 mm/kasan/report.c:408 > > print_report+0xfc/0x1ff mm/kasan/report.c:521 > > kasan_report+0xdd/0x1b0 mm/kasan/report.c:634 > > drm_sched_entity_compare_before drivers/gpu/drm/scheduler/sched_main.c:147 [inline] [gpu_sched] > > rb_add_cached include/linux/rbtree.h:174 [inline] [gpu_sched] > > drm_sched_rq_update_fifo_locked+0x47b/0x540 drivers/gpu/drm/scheduler/sched_main.c:175 [gpu_sched] > > drm_sched_entity_push_job+0x509/0x5d0 drivers/gpu/drm/scheduler/sched_entity.c:623 [gpu_sched] > > This might be a race between entity killing and the push_job. 
Let's > look at your patch below… > > > amdgpu_job_submit+0x1a4/0x270 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c:314 [amdgpu] > > amdgpu_vm_sdma_commit+0x1f9/0x7d0 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c:122 [amdgpu] > > amdgpu_vm_pt_clear+0x540/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c:422 [amdgpu] > > amdgpu_vm_init+0x9c2/0x12f0 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:2609 [amdgpu] > > amdgpu_driver_open_kms+0x274/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1418 [amdgpu] > > drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 > > drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 > > drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 > > drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 > > chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 > > do_dentry_open+0x743/0x1bf0 fs/open.c:956 > > vfs_open+0x87/0x3f0 fs/open.c:1086 > > do_open+0x72f/0xf80 fs/namei.c:3830 > > path_openat+0x2ec/0x770 fs/namei.c:3989 > > do_filp_open+0x1ff/0x420 fs/namei.c:4016 > > do_sys_openat2+0x181/0x1e0 fs/open.c:1428 > > do_sys_open fs/open.c:1443 [inline] > > __do_sys_openat fs/open.c:1459 [inline] > > __se_sys_openat fs/open.c:1454 [inline] > > __x64_sys_openat+0x149/0x210 fs/open.c:1454 > > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > > do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 > > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > RIP: 0033:0x7feb303a36ed > > Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 > > RSP: 002b:00007feb3123c018 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 > > RAX: ffffffffffffffda RBX: 00007feb305e5fa0 RCX: 00007feb303a36ed > > RDX: 0000000000000002 RSI: 0000200000000140 RDI: ffffffffffffff9c > > RBP: 00007feb30447722 R08: 0000000000000000 R09: 0000000000000000 > > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 > > R13: 0000000000000001 R14: 00007feb305e5fa0 R15: 00007ffcfd0a3460 > > </TASK> > > Allocated by task 146638: > > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 > > kasan_save_track+0x14/0x30 mm/kasan/common.c:68 > > poison_kmalloc_redzone mm/kasan/common.c:377 [inline] > > __kasan_kmalloc+0x9a/0xb0 mm/kasan/common.c:394 > > kmalloc_noprof include/linux/slab.h:901 [inline] [amdgpu] > > kzalloc_noprof include/linux/slab.h:1037 [inline] [amdgpu] > > amdgpu_driver_open_kms+0x151/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1402 [amdgpu] > > drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 > > drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 > > drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 > > drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 > > chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 > > do_dentry_open+0x743/0x1bf0 fs/open.c:956 > > vfs_open+0x87/0x3f0 fs/open.c:1086 > > do_open+0x72f/0xf80 fs/namei.c:3830 > > path_openat+0x2ec/0x770 fs/namei.c:3989 > > do_filp_open+0x1ff/0x420 fs/namei.c:4016 > > do_sys_openat2+0x181/0x1e0 fs/open.c:1428 > > do_sys_open fs/open.c:1443 [inline] > > __do_sys_openat fs/open.c:1459 [inline] > > __se_sys_openat fs/open.c:1454 [inline] > > __x64_sys_openat+0x149/0x210 fs/open.c:1454 > > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > > do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 > > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > Freed by task 146638: > > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 > > kasan_save_track+0x14/0x30 mm/kasan/common.c:68 > > kasan_save_free_info+0x3b/0x70 
mm/kasan/generic.c:576 > > poison_slab_object mm/kasan/common.c:247 [inline] > > __kasan_slab_free+0x52/0x70 mm/kasan/common.c:264 > > kasan_slab_free include/linux/kasan.h:233 [inline] > > slab_free_hook mm/slub.c:2353 [inline] > > slab_free mm/slub.c:4609 [inline] > > kfree+0x14f/0x4d0 mm/slub.c:4757 > > amdgpu_driver_postclose_kms+0x43d/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1538 [amdgpu] > > drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 > > drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] > > drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 > > drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 > > __fput+0x402/0xb50 fs/file_table.c:464 > > task_work_run+0x155/0x250 kernel/task_work.c:227 > > resume_user_mode_work include/linux/resume_user_mode.h:50 [inline] > > exit_to_user_mode_loop kernel/entry/common.c:114 [inline] > > exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] > > __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] > > syscall_exit_to_user_mode+0x26b/0x290 kernel/entry/common.c:218 > > do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 > > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > The buggy address belongs to the object at ffff888120844000 > > The buggy address is located 1480 bytes inside of > > The buggy address belongs to the physical page: > > page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x120840 > > head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 > > flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) > > page_type: f5(slab) > > raw: 0017ffffc0000040 ffff88810005c8c0 ffffea0005744c00 dead000000000002 > > raw: 0000000000000000 0000000000020002 00000000f5000000 0000000000000000 > > head: 0017ffffc0000040 ffff88810005c8c0 ffffea0005744c00 dead000000000002 > > head: 0000000000000000 0000000000020002 00000000f5000000 0000000000000000 > > head: 0017ffffc0000003 ffffea0004821001 ffffffffffffffff 0000000000000000 > > head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 > > page dumped because: kasan: bad access detected > > Memory state around the buggy address: > > ffff888120844480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > ffff888120844500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > > ffff888120844580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > ^ > > ffff888120844600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > ffff888120844680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > ================================================================== > > > > drivers/gpu/drm/scheduler/sched_main.c | 6 ++++-- > > 1 file changed, 4 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c > > index bfea608a7106..997a2cc1a635 100644 > > --- a/drivers/gpu/drm/scheduler/sched_main.c > > +++ b/drivers/gpu/drm/scheduler/sched_main.c > > @@ -172,8 +172,10 @@ void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity, > > > > entity->oldest_job_waiting = ts; > > > > - rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, > > - drm_sched_entity_compare_before); > > + if (!entity->stopped) { > > + rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, > > + drm_sched_entity_compare_before); > > + } > > If this is a race, then this patch here is broken, too, because you're > checking the 'stopped' boolean as the callers of that function do, too > – just later. :O > > Could still race, just less likely. 
> > The proper way to fix it would then be to address the issue where the > locking is supposed to happen. Let's look at, for example, > drm_sched_entity_push_job(): > > > void drm_sched_entity_push_job(struct drm_sched_job *sched_job) > { > (Bla bla bla) > > ………… > > /* first job wakes up scheduler */ > if (first) { > struct drm_gpu_scheduler *sched; > struct drm_sched_rq *rq; > > /* Add the entity to the run queue */ > spin_lock(&entity->lock); > if (entity->stopped) { <---- Aha! > spin_unlock(&entity->lock); > > DRM_ERROR("Trying to push to a killed entity\n"); > return; > } > > rq = entity->rq; > sched = rq->sched; > > spin_lock(&rq->lock); > drm_sched_rq_add_entity(rq, entity); > > if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) > drm_sched_rq_update_fifo_locked(entity, rq, submit_ts); <---- bumm! > > spin_unlock(&rq->lock); > spin_unlock(&entity->lock); > > But the locks are still being hold. So that "shouldn't be happening"(tm). > > Interesting. AFAICS only drm_sched_entity_kill() and drm_sched_fini() > stop entities. The former holds appropriate locks, but drm_sched_fini() > doesn't. So that looks like a hot candidate to me. Opinions? > > On the other hand, aren't drivers prohibited from calling > drm_sched_entity_push_job() after calling drm_sched_fini()? If the > fuzzer does that, then it's not the scheduler's fault. > > Could you test adding spin_lock(&entity->lock) to drm_sched_fini()? Ah no, forget about that. In drm_sched_fini(), you'd have to take the locks in reverse order as in drm_sched_entity_push/pop_job(), thereby replacing race with deadlock. I suspect that this is an issue in amdgpu. But let's wait for Christian. P. > > Would be cool if Tvrtko and Christian take a look. Maybe we even have a > fundamental design issue. > > > Regards > P. > > > > } > > > > /** > ^ permalink raw reply [flat|nested] 17+ messages in thread
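To spell out the inversion described above, a sketch of the two lock orders (the ordering only, not the literal code paths):

/*
 * drm_sched_entity_push_job():          drm_sched_fini() + entity locking:
 *   spin_lock(&entity->lock);             spin_lock(&rq->lock);
 *   spin_lock(&rq->lock);                 spin_lock(&entity->lock);   <-- reversed
 *   ...                                   ...
 *   spin_unlock(&rq->lock);               spin_unlock(&entity->lock);
 *   spin_unlock(&entity->lock);           spin_unlock(&rq->lock);
 *
 * With the orders reversed, a thread in push_job() and a thread in fini()
 * can each end up holding the lock the other one is waiting for -- the
 * classic ABBA deadlock, which is why adding entity locking to
 * drm_sched_fini() would merely trade the race for a deadlock.
 */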
* Re: [PATCH] drm/sched: Prevent stopped entities from being added to the run queue. 2025-07-21 8:16 ` Philipp Stanner @ 2025-07-21 10:14 ` Danilo Krummrich 2025-07-21 18:07 ` Matthew Brost 2025-07-22 20:05 ` James 1 sibling, 1 reply; 17+ messages in thread From: Danilo Krummrich @ 2025-07-21 10:14 UTC (permalink / raw) To: Philipp Stanner Cc: phasta, James Flowers, matthew.brost, ckoenig.leichtzumerken, maarten.lankhorst, mripard, tzimmermann, airlied, simona, skhan, dri-devel, linux-kernel, linux-kernel-mentees, Tvrtko Ursulin On Mon Jul 21, 2025 at 10:16 AM CEST, Philipp Stanner wrote: > On Mon, 2025-07-21 at 09:52 +0200, Philipp Stanner wrote: >> On Sun, 2025-07-20 at 16:56 -0700, James Flowers wrote: >> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c >> > index bfea608a7106..997a2cc1a635 100644 >> > --- a/drivers/gpu/drm/scheduler/sched_main.c >> > +++ b/drivers/gpu/drm/scheduler/sched_main.c >> > @@ -172,8 +172,10 @@ void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity, >> > >> > entity->oldest_job_waiting = ts; >> > >> > - rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, >> > - drm_sched_entity_compare_before); >> > + if (!entity->stopped) { >> > + rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, >> > + drm_sched_entity_compare_before); >> > + } >> >> If this is a race, then this patch here is broken, too, because you're >> checking the 'stopped' boolean as the callers of that function do, too >> – just later. :O >> >> Could still race, just less likely. >> >> The proper way to fix it would then be to address the issue where the >> locking is supposed to happen. Let's look at, for example, >> drm_sched_entity_push_job(): >> >> >> void drm_sched_entity_push_job(struct drm_sched_job *sched_job) >> { >> (Bla bla bla) >> >> ………… >> >> /* first job wakes up scheduler */ >> if (first) { >> struct drm_gpu_scheduler *sched; >> struct drm_sched_rq *rq; >> >> /* Add the entity to the run queue */ >> spin_lock(&entity->lock); >> if (entity->stopped) { <---- Aha! >> spin_unlock(&entity->lock); >> >> DRM_ERROR("Trying to push to a killed entity\n"); >> return; >> } >> >> rq = entity->rq; >> sched = rq->sched; >> >> spin_lock(&rq->lock); >> drm_sched_rq_add_entity(rq, entity); >> >> if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) >> drm_sched_rq_update_fifo_locked(entity, rq, submit_ts); <---- bumm! >> >> spin_unlock(&rq->lock); >> spin_unlock(&entity->lock); >> >> But the locks are still being hold. So that "shouldn't be happening"(tm). >> >> Interesting. AFAICS only drm_sched_entity_kill() and drm_sched_fini() >> stop entities. The former holds appropriate locks, but drm_sched_fini() >> doesn't. So that looks like a hot candidate to me. Opinions? >> >> On the other hand, aren't drivers prohibited from calling >> drm_sched_entity_push_job() after calling drm_sched_fini()? If the >> fuzzer does that, then it's not the scheduler's fault. Exactly, this is the first question to ask. 
And I think it's even more restrictive: In drm_sched_fini() for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) { struct drm_sched_rq *rq = sched->sched_rq[i]; spin_lock(&rq->lock); list_for_each_entry(s_entity, &rq->entities, list) /* * Prevents reinsertion and marks job_queue as idle, * it will be removed from the rq in drm_sched_entity_fini() * eventually */ s_entity->stopped = true; spin_unlock(&rq->lock); kfree(sched->sched_rq[i]); } In drm_sched_entity_kill() static void drm_sched_entity_kill(struct drm_sched_entity *entity) { struct drm_sched_job *job; struct dma_fence *prev; if (!entity->rq) return; spin_lock(&entity->lock); entity->stopped = true; drm_sched_rq_remove_entity(entity->rq, entity); spin_unlock(&entity->lock); [...] } If this runs concurrently, this is a UAF as well. Personally, I have always been working with the assumption that entities have to be torn down *before* the scheduler, but those lifetimes are not documented properly. There are two solutions: (1) Strictly require all entities to be torn down before drm_sched_fini(), i.e. stick to the natural ownership and lifetime rules here (see below). (2) Actually protect *any* changes of the relevant fields of the entity structure with the entity lock. While (2) seems rather obvious, we run into lock inversion with this approach, as you note below as well. And I think drm_sched_fini() should not mess with entities anyways. The ownership here seems obvious: The scheduler *owns* a resource that is used by entities. Consequently, entities are not allowed to out-live the scheduler. Surely, the current implementation to just take the resource away from the entity under the hood can work as well with appropriate locking, but that's a mess. If the resource *really* needs to be shared for some reason (which I don't see), shared ownership, i.e. reference counting, is much less error prone. ^ permalink raw reply [flat|nested] 17+ messages in thread
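A minimal sketch of what option (1) means on the driver side; the struct names my_file and my_device are invented for illustration, while drm_sched_entity_fini() and drm_sched_fini() are the existing API entry points:

/* Per-open-file teardown: entities go first. */
static void my_file_close(struct my_file *f)
{
	/* An entity must not outlive the scheduler it is attached to. */
	drm_sched_entity_fini(&f->entity);
}

/* Device teardown: only after every entity (and job) is gone. */
static void my_device_fini(struct my_device *dev)
{
	drm_sched_fini(&dev->sched);
}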
* Re: [PATCH] drm/sched: Prevent stopped entities from being added to the run queue. 2025-07-21 10:14 ` Danilo Krummrich @ 2025-07-21 18:07 ` Matthew Brost 2025-07-22 7:37 ` Philipp Stanner 0 siblings, 1 reply; 17+ messages in thread From: Matthew Brost @ 2025-07-21 18:07 UTC (permalink / raw) To: Danilo Krummrich Cc: Philipp Stanner, phasta, James Flowers, ckoenig.leichtzumerken, maarten.lankhorst, mripard, tzimmermann, airlied, simona, skhan, dri-devel, linux-kernel, linux-kernel-mentees, Tvrtko Ursulin On Mon, Jul 21, 2025 at 12:14:31PM +0200, Danilo Krummrich wrote: > On Mon Jul 21, 2025 at 10:16 AM CEST, Philipp Stanner wrote: > > On Mon, 2025-07-21 at 09:52 +0200, Philipp Stanner wrote: > >> On Sun, 2025-07-20 at 16:56 -0700, James Flowers wrote: > >> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c > >> > index bfea608a7106..997a2cc1a635 100644 > >> > --- a/drivers/gpu/drm/scheduler/sched_main.c > >> > +++ b/drivers/gpu/drm/scheduler/sched_main.c > >> > @@ -172,8 +172,10 @@ void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity, > >> > > >> > entity->oldest_job_waiting = ts; > >> > > >> > - rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, > >> > - drm_sched_entity_compare_before); > >> > + if (!entity->stopped) { > >> > + rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, > >> > + drm_sched_entity_compare_before); > >> > + } > >> > >> If this is a race, then this patch here is broken, too, because you're > >> checking the 'stopped' boolean as the callers of that function do, too > >> – just later. :O > >> > >> Could still race, just less likely. > >> > >> The proper way to fix it would then be to address the issue where the > >> locking is supposed to happen. Let's look at, for example, > >> drm_sched_entity_push_job(): > >> > >> > >> void drm_sched_entity_push_job(struct drm_sched_job *sched_job) > >> { > >> (Bla bla bla) > >> > >> ………… > >> > >> /* first job wakes up scheduler */ > >> if (first) { > >> struct drm_gpu_scheduler *sched; > >> struct drm_sched_rq *rq; > >> > >> /* Add the entity to the run queue */ > >> spin_lock(&entity->lock); > >> if (entity->stopped) { <---- Aha! > >> spin_unlock(&entity->lock); > >> > >> DRM_ERROR("Trying to push to a killed entity\n"); > >> return; > >> } > >> > >> rq = entity->rq; > >> sched = rq->sched; > >> > >> spin_lock(&rq->lock); > >> drm_sched_rq_add_entity(rq, entity); > >> > >> if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) > >> drm_sched_rq_update_fifo_locked(entity, rq, submit_ts); <---- bumm! > >> > >> spin_unlock(&rq->lock); > >> spin_unlock(&entity->lock); > >> > >> But the locks are still being hold. So that "shouldn't be happening"(tm). > >> > >> Interesting. AFAICS only drm_sched_entity_kill() and drm_sched_fini() > >> stop entities. The former holds appropriate locks, but drm_sched_fini() > >> doesn't. So that looks like a hot candidate to me. Opinions? > >> > >> On the other hand, aren't drivers prohibited from calling > >> drm_sched_entity_push_job() after calling drm_sched_fini()? If the > >> fuzzer does that, then it's not the scheduler's fault. > > Exactly, this is the first question to ask. 
> > And I think it's even more restrictive: > > In drm_sched_fini() > > for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) { > struct drm_sched_rq *rq = sched->sched_rq[i]; > > spin_lock(&rq->lock); > list_for_each_entry(s_entity, &rq->entities, list) > /* > * Prevents reinsertion and marks job_queue as idle, > * it will be removed from the rq in drm_sched_entity_fini() > * eventually > */ > s_entity->stopped = true; > spin_unlock(&rq->lock); > kfree(sched->sched_rq[i]); > } > > In drm_sched_entity_kill() > > static void drm_sched_entity_kill(struct drm_sched_entity *entity) > { > struct drm_sched_job *job; > struct dma_fence *prev; > > if (!entity->rq) > return; > > spin_lock(&entity->lock); > entity->stopped = true; > drm_sched_rq_remove_entity(entity->rq, entity); > spin_unlock(&entity->lock); > > [...] > } > > If this runs concurrently, this is a UAF as well. > > Personally, I have always been working with the assupmtion that entites have to > be torn down *before* the scheduler, but those lifetimes are not documented > properly. Yes, this is my assumption too. I would even take it further: an entity shouldn't be torn down until all jobs associated with it are freed as well. I think this would solve a lot of issues I've seen on the list related to UAF, teardown, etc. > > There are two solutions: > > (1) Strictly require all entities to be torn down before drm_sched_fini(), > i.e. stick to the natural ownership and lifetime rules here (see below). > > (2) Actually protect *any* changes of the relevent fields of the entity > structure with the entity lock. > > While (2) seems rather obvious, we run into lock inversion with this approach, > as you note below as well. And I think drm_sched_fini() should not mess with > entities anyways. > > The ownership here seems obvious: > > The scheduler *owns* a resource that is used by entities. Consequently, entities > are not allowed to out-live the scheduler. > > Surely, the current implementation to just take the resource away from the > entity under the hood can work as well with appropriate locking, but that's a > mess. > > If the resource *really* needs to be shared for some reason (which I don't see), > shared ownership, i.e. reference counting, is much less error prone. Yes, Xe solves all of this via reference counting (jobs refcount the entity). It's a bit easier in Xe since the scheduler and entities are the same object due to their 1:1 relationship. But even in non-1:1 relationships, an entity could refcount the scheduler. The teardown sequence would then be: all jobs complete on the entity → teardown the entity → all entities torn down → teardown the scheduler. Matt ^ permalink raw reply [flat|nested] 17+ messages in thread
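A sketch of that reference-counting chain, with made-up structure and function names rather than the actual Xe code; each object pins the object it depends on, so release order is forced to be jobs, then entities, then the scheduler:

struct my_sched {
	struct kref ref;
	struct drm_gpu_scheduler base;
};

struct my_entity {
	struct kref ref;
	struct drm_sched_entity base;
	struct my_sched *sched;		/* counted reference */
};

struct my_job {
	struct drm_sched_job base;
	struct my_entity *entity;	/* counted reference */
};

static void my_sched_release(struct kref *ref)
{
	struct my_sched *s = container_of(ref, struct my_sched, ref);

	drm_sched_fini(&s->base);	/* no entity can reach it anymore */
	kfree(s);
}

static void my_entity_release(struct kref *ref)
{
	struct my_entity *e = container_of(ref, struct my_entity, ref);

	drm_sched_entity_fini(&e->base);
	kref_put(&e->sched->ref, my_sched_release);	/* entity -> scheduler */
	kfree(e);
}

static void my_job_free(struct my_job *job)
{
	drm_sched_job_cleanup(&job->base);
	kref_put(&job->entity->ref, my_entity_release);	/* job -> entity */
	kfree(job);
}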
* Re: [PATCH] drm/sched: Prevent stopped entities from being added to the run queue. 2025-07-21 18:07 ` Matthew Brost @ 2025-07-22 7:37 ` Philipp Stanner 2025-07-22 8:07 ` Matthew Brost 0 siblings, 1 reply; 17+ messages in thread From: Philipp Stanner @ 2025-07-22 7:37 UTC (permalink / raw) To: Matthew Brost, Danilo Krummrich Cc: phasta, James Flowers, ckoenig.leichtzumerken, maarten.lankhorst, mripard, tzimmermann, airlied, simona, skhan, dri-devel, linux-kernel, linux-kernel-mentees, Tvrtko Ursulin On Mon, 2025-07-21 at 11:07 -0700, Matthew Brost wrote: > On Mon, Jul 21, 2025 at 12:14:31PM +0200, Danilo Krummrich wrote: > > On Mon Jul 21, 2025 at 10:16 AM CEST, Philipp Stanner wrote: > > > On Mon, 2025-07-21 at 09:52 +0200, Philipp Stanner wrote: > > > > On Sun, 2025-07-20 at 16:56 -0700, James Flowers wrote: > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c > > > > > index bfea608a7106..997a2cc1a635 100644 > > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c > > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c > > > > > @@ -172,8 +172,10 @@ void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity, > > > > > > > > > > entity->oldest_job_waiting = ts; > > > > > > > > > > - rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, > > > > > - drm_sched_entity_compare_before); > > > > > + if (!entity->stopped) { > > > > > + rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, > > > > > + drm_sched_entity_compare_before); > > > > > + } > > > > > > > > If this is a race, then this patch here is broken, too, because you're > > > > checking the 'stopped' boolean as the callers of that function do, too > > > > – just later. :O > > > > > > > > Could still race, just less likely. > > > > > > > > The proper way to fix it would then be to address the issue where the > > > > locking is supposed to happen. Let's look at, for example, > > > > drm_sched_entity_push_job(): > > > > > > > > > > > > void drm_sched_entity_push_job(struct drm_sched_job *sched_job) > > > > { > > > > (Bla bla bla) > > > > > > > > ………… > > > > > > > > /* first job wakes up scheduler */ > > > > if (first) { > > > > struct drm_gpu_scheduler *sched; > > > > struct drm_sched_rq *rq; > > > > > > > > /* Add the entity to the run queue */ > > > > spin_lock(&entity->lock); > > > > if (entity->stopped) { <---- Aha! > > > > spin_unlock(&entity->lock); > > > > > > > > DRM_ERROR("Trying to push to a killed entity\n"); > > > > return; > > > > } > > > > > > > > rq = entity->rq; > > > > sched = rq->sched; > > > > > > > > spin_lock(&rq->lock); > > > > drm_sched_rq_add_entity(rq, entity); > > > > > > > > if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) > > > > drm_sched_rq_update_fifo_locked(entity, rq, submit_ts); <---- bumm! > > > > > > > > spin_unlock(&rq->lock); > > > > spin_unlock(&entity->lock); > > > > > > > > But the locks are still being hold. So that "shouldn't be happening"(tm). > > > > > > > > Interesting. AFAICS only drm_sched_entity_kill() and drm_sched_fini() > > > > stop entities. The former holds appropriate locks, but drm_sched_fini() > > > > doesn't. So that looks like a hot candidate to me. Opinions? > > > > > > > > On the other hand, aren't drivers prohibited from calling > > > > drm_sched_entity_push_job() after calling drm_sched_fini()? If the > > > > fuzzer does that, then it's not the scheduler's fault. > > > > Exactly, this is the first question to ask. 
> > > > And I think it's even more restrictive: > > > > In drm_sched_fini() > > > > for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) { > > struct drm_sched_rq *rq = sched->sched_rq[i]; > > > > spin_lock(&rq->lock); > > list_for_each_entry(s_entity, &rq->entities, list) > > /* > > * Prevents reinsertion and marks job_queue as idle, > > * it will be removed from the rq in drm_sched_entity_fini() > > * eventually > > */ > > s_entity->stopped = true; > > spin_unlock(&rq->lock); > > kfree(sched->sched_rq[i]); > > } > > > > In drm_sched_entity_kill() > > > > static void drm_sched_entity_kill(struct drm_sched_entity *entity) > > { > > struct drm_sched_job *job; > > struct dma_fence *prev; > > > > if (!entity->rq) > > return; > > > > spin_lock(&entity->lock); > > entity->stopped = true; > > drm_sched_rq_remove_entity(entity->rq, entity); > > spin_unlock(&entity->lock); > > > > [...] > > } > > > > If this runs concurrently, this is a UAF as well. > > > > Personally, I have always been working with the assupmtion that entites have to > > be torn down *before* the scheduler, but those lifetimes are not documented > > properly. > > Yes, this is my assumption too. I would even take it further: an entity > shouldn't be torn down until all jobs associated with it are freed as > well. I think this would solve a lot of issues I've seen on the list > related to UAF, teardown, etc. That's kind of impossible with the new tear down design, because drm_sched_fini() ensures that all jobs are freed on teardown. And drm_sched_fini() wouldn't be called before all jobs are gone, effectively resulting in a chicken-egg-problem, or rather: the driver implementing its own solution for teardown. P. > > > > > There are two solutions: > > > > (1) Strictly require all entities to be torn down before drm_sched_fini(), > > i.e. stick to the natural ownership and lifetime rules here (see below). > > > > (2) Actually protect *any* changes of the relevent fields of the entity > > structure with the entity lock. > > > > While (2) seems rather obvious, we run into lock inversion with this approach, > > as you note below as well. And I think drm_sched_fini() should not mess with > > entities anyways. > > > > The ownership here seems obvious: > > > > The scheduler *owns* a resource that is used by entities. Consequently, entities > > are not allowed to out-live the scheduler. > > > > Surely, the current implementation to just take the resource away from the > > entity under the hood can work as well with appropriate locking, but that's a > > mess. > > > > If the resource *really* needs to be shared for some reason (which I don't see), > > shared ownership, i.e. reference counting, is much less error prone. > > Yes, Xe solves all of this via reference counting (jobs refcount the > entity). It's a bit easier in Xe since the scheduler and entities are > the same object due to their 1:1 relationship. But even in non-1:1 > relationships, an entity could refcount the scheduler. The teardown > sequence would then be: all jobs complete on the entity → teardown the > entity → all entities torn down → teardown the scheduler. > > Matt ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] drm/sched: Prevent stopped entities from being added to the run queue. 2025-07-22 7:37 ` Philipp Stanner @ 2025-07-22 8:07 ` Matthew Brost 2025-07-22 8:45 ` Matthew Brost 0 siblings, 1 reply; 17+ messages in thread From: Matthew Brost @ 2025-07-22 8:07 UTC (permalink / raw) To: phasta Cc: Danilo Krummrich, James Flowers, ckoenig.leichtzumerken, maarten.lankhorst, mripard, tzimmermann, airlied, simona, skhan, dri-devel, linux-kernel, linux-kernel-mentees, Tvrtko Ursulin On Tue, Jul 22, 2025 at 09:37:11AM +0200, Philipp Stanner wrote: > On Mon, 2025-07-21 at 11:07 -0700, Matthew Brost wrote: > > On Mon, Jul 21, 2025 at 12:14:31PM +0200, Danilo Krummrich wrote: > > > On Mon Jul 21, 2025 at 10:16 AM CEST, Philipp Stanner wrote: > > > > On Mon, 2025-07-21 at 09:52 +0200, Philipp Stanner wrote: > > > > > On Sun, 2025-07-20 at 16:56 -0700, James Flowers wrote: > > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c > > > > > > index bfea608a7106..997a2cc1a635 100644 > > > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c > > > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c > > > > > > @@ -172,8 +172,10 @@ void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity, > > > > > > > > > > > > entity->oldest_job_waiting = ts; > > > > > > > > > > > > - rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, > > > > > > - drm_sched_entity_compare_before); > > > > > > + if (!entity->stopped) { > > > > > > + rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, > > > > > > + drm_sched_entity_compare_before); > > > > > > + } > > > > > > > > > > If this is a race, then this patch here is broken, too, because you're > > > > > checking the 'stopped' boolean as the callers of that function do, too > > > > > – just later. :O > > > > > > > > > > Could still race, just less likely. > > > > > > > > > > The proper way to fix it would then be to address the issue where the > > > > > locking is supposed to happen. Let's look at, for example, > > > > > drm_sched_entity_push_job(): > > > > > > > > > > > > > > > void drm_sched_entity_push_job(struct drm_sched_job *sched_job) > > > > > { > > > > > (Bla bla bla) > > > > > > > > > > ………… > > > > > > > > > > /* first job wakes up scheduler */ > > > > > if (first) { > > > > > struct drm_gpu_scheduler *sched; > > > > > struct drm_sched_rq *rq; > > > > > > > > > > /* Add the entity to the run queue */ > > > > > spin_lock(&entity->lock); > > > > > if (entity->stopped) { <---- Aha! > > > > > spin_unlock(&entity->lock); > > > > > > > > > > DRM_ERROR("Trying to push to a killed entity\n"); > > > > > return; > > > > > } > > > > > > > > > > rq = entity->rq; > > > > > sched = rq->sched; > > > > > > > > > > spin_lock(&rq->lock); > > > > > drm_sched_rq_add_entity(rq, entity); > > > > > > > > > > if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) > > > > > drm_sched_rq_update_fifo_locked(entity, rq, submit_ts); <---- bumm! > > > > > > > > > > spin_unlock(&rq->lock); > > > > > spin_unlock(&entity->lock); > > > > > > > > > > But the locks are still being hold. So that "shouldn't be happening"(tm). > > > > > > > > > > Interesting. AFAICS only drm_sched_entity_kill() and drm_sched_fini() > > > > > stop entities. The former holds appropriate locks, but drm_sched_fini() > > > > > doesn't. So that looks like a hot candidate to me. Opinions? > > > > > > > > > > On the other hand, aren't drivers prohibited from calling > > > > > drm_sched_entity_push_job() after calling drm_sched_fini()? 
If the > > > > > fuzzer does that, then it's not the scheduler's fault. > > > > > > Exactly, this is the first question to ask. > > > > > > And I think it's even more restrictive: > > > > > > In drm_sched_fini() > > > > > > for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) { > > > struct drm_sched_rq *rq = sched->sched_rq[i]; > > > > > > spin_lock(&rq->lock); > > > list_for_each_entry(s_entity, &rq->entities, list) > > > /* > > > * Prevents reinsertion and marks job_queue as idle, > > > * it will be removed from the rq in drm_sched_entity_fini() > > > * eventually > > > */ > > > s_entity->stopped = true; > > > spin_unlock(&rq->lock); > > > kfree(sched->sched_rq[i]); > > > } > > > > > > In drm_sched_entity_kill() > > > > > > static void drm_sched_entity_kill(struct drm_sched_entity *entity) > > > { > > > struct drm_sched_job *job; > > > struct dma_fence *prev; > > > > > > if (!entity->rq) > > > return; > > > > > > spin_lock(&entity->lock); > > > entity->stopped = true; > > > drm_sched_rq_remove_entity(entity->rq, entity); > > > spin_unlock(&entity->lock); > > > > > > [...] > > > } > > > > > > If this runs concurrently, this is a UAF as well. > > > > > > Personally, I have always been working with the assupmtion that entites have to > > > be torn down *before* the scheduler, but those lifetimes are not documented > > > properly. > > > > Yes, this is my assumption too. I would even take it further: an entity > > shouldn't be torn down until all jobs associated with it are freed as > > well. I think this would solve a lot of issues I've seen on the list > > related to UAF, teardown, etc. > > That's kind of impossible with the new tear down design, because > drm_sched_fini() ensures that all jobs are freed on teardown. And > drm_sched_fini() wouldn't be called before all jobs are gone, > effectively resulting in a chicken-egg-problem, or rather: the driver > implementing its own solution for teardown. > I've read this four times and I'm still generally confused. "drm_sched_fini ensures that all jobs are freed on teardown" — Yes, that's how a refcounting-based solution works. drm_sched_fini would never be called if there were pending jobs. "drm_sched_fini() wouldn't be called before all jobs are gone" — See above. "effectively resulting in a chicken-and-egg problem" — A job is created after the scheduler, and it holds a reference to the scheduler until it's freed. I don't see how this idiom applies. "the driver implementing its own solution for teardown" — It’s just following the basic lifetime rules I outlined below. Perhaps Xe was ahead of its time, but the number of DRM scheduler blowups we've had is zero — maybe a strong indication that this design is correct. Matt > P. > > > > > > > > > > There are two solutions: > > > > > > (1) Strictly require all entities to be torn down before drm_sched_fini(), > > > i.e. stick to the natural ownership and lifetime rules here (see below). > > > > > > (2) Actually protect *any* changes of the relevent fields of the entity > > > structure with the entity lock. > > > > > > While (2) seems rather obvious, we run into lock inversion with this approach, > > > as you note below as well. And I think drm_sched_fini() should not mess with > > > entities anyways. > > > > > > The ownership here seems obvious: > > > > > > The scheduler *owns* a resource that is used by entities. Consequently, entities > > > are not allowed to out-live the scheduler. 
> > > > > > Surely, the current implementation to just take the resource away from the > > > entity under the hood can work as well with appropriate locking, but that's a > > > mess. > > > > > > If the resource *really* needs to be shared for some reason (which I don't see), > > > shared ownership, i.e. reference counting, is much less error prone. > > > > Yes, Xe solves all of this via reference counting (jobs refcount the > > entity). It's a bit easier in Xe since the scheduler and entities are > > the same object due to their 1:1 relationship. But even in non-1:1 > > relationships, an entity could refcount the scheduler. The teardown > > sequence would then be: all jobs complete on the entity → teardown the > > entity → all entities torn down → teardown the scheduler. > > > > Matt > ^ permalink raw reply [flat|nested] 17+ messages in thread
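
A minimal sketch of the refcounting model Matt describes above (jobs hold a reference on their entity, entities hold a reference on the scheduler); the wrapper types and helper names are hypothetical, this is an illustration of the lifetime rules, not the actual Xe code:

    #include <linux/kref.h>
    #include <linux/slab.h>
    #include <drm/gpu_scheduler.h>

    /* Illustration only: hypothetical driver-side wrappers. */
    struct my_sched {
            struct drm_gpu_scheduler base;
            struct kref refcount;
    };

    struct my_entity {
            struct drm_sched_entity base;
            struct kref refcount;
            struct my_sched *sched;         /* entity holds a scheduler reference */
    };

    struct my_job {
            struct drm_sched_job base;
            struct my_entity *entity;       /* job holds an entity reference */
    };

    static void my_sched_release(struct kref *kref)
    {
            struct my_sched *s = container_of(kref, struct my_sched, refcount);

            /* Last entity reference is gone: no entity outlives the scheduler. */
            drm_sched_fini(&s->base);
            kfree(s);
    }

    static void my_entity_release(struct kref *kref)
    {
            struct my_entity *e = container_of(kref, struct my_entity, refcount);

            /* Last job reference is gone: no job outlives the entity. */
            drm_sched_entity_fini(&e->base);
            kref_put(&e->sched->refcount, my_sched_release);
            kfree(e);
    }

    /* Called from the driver's free_job path, after drm_sched_job_cleanup(). */
    static void my_job_free(struct my_job *job)
    {
            kref_put(&job->entity->refcount, my_entity_release);
            kfree(job);
    }

With this shape, the teardown order quoted above falls out automatically: the driver drops its own entity references once it stops submitting, the last freed job drops the last entity reference, and the last finalized entity drops the last scheduler reference.
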
* Re: [PATCH] drm/sched: Prevent stopped entities from being added to the run queue. 2025-07-22 8:07 ` Matthew Brost @ 2025-07-22 8:45 ` Matthew Brost 2025-07-23 6:56 ` Philipp Stanner 0 siblings, 1 reply; 17+ messages in thread From: Matthew Brost @ 2025-07-22 8:45 UTC (permalink / raw) To: phasta Cc: Danilo Krummrich, James Flowers, ckoenig.leichtzumerken, maarten.lankhorst, mripard, tzimmermann, airlied, simona, skhan, dri-devel, linux-kernel, linux-kernel-mentees, Tvrtko Ursulin On Tue, Jul 22, 2025 at 01:07:29AM -0700, Matthew Brost wrote: > On Tue, Jul 22, 2025 at 09:37:11AM +0200, Philipp Stanner wrote: > > On Mon, 2025-07-21 at 11:07 -0700, Matthew Brost wrote: > > > On Mon, Jul 21, 2025 at 12:14:31PM +0200, Danilo Krummrich wrote: > > > > On Mon Jul 21, 2025 at 10:16 AM CEST, Philipp Stanner wrote: > > > > > On Mon, 2025-07-21 at 09:52 +0200, Philipp Stanner wrote: > > > > > > On Sun, 2025-07-20 at 16:56 -0700, James Flowers wrote: > > > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c > > > > > > > index bfea608a7106..997a2cc1a635 100644 > > > > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c > > > > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c > > > > > > > @@ -172,8 +172,10 @@ void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity, > > > > > > > > > > > > > > entity->oldest_job_waiting = ts; > > > > > > > > > > > > > > - rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, > > > > > > > - drm_sched_entity_compare_before); > > > > > > > + if (!entity->stopped) { > > > > > > > + rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, > > > > > > > + drm_sched_entity_compare_before); > > > > > > > + } > > > > > > > > > > > > If this is a race, then this patch here is broken, too, because you're > > > > > > checking the 'stopped' boolean as the callers of that function do, too > > > > > > – just later. :O > > > > > > > > > > > > Could still race, just less likely. > > > > > > > > > > > > The proper way to fix it would then be to address the issue where the > > > > > > locking is supposed to happen. Let's look at, for example, > > > > > > drm_sched_entity_push_job(): > > > > > > > > > > > > > > > > > > void drm_sched_entity_push_job(struct drm_sched_job *sched_job) > > > > > > { > > > > > > (Bla bla bla) > > > > > > > > > > > > ………… > > > > > > > > > > > > /* first job wakes up scheduler */ > > > > > > if (first) { > > > > > > struct drm_gpu_scheduler *sched; > > > > > > struct drm_sched_rq *rq; > > > > > > > > > > > > /* Add the entity to the run queue */ > > > > > > spin_lock(&entity->lock); > > > > > > if (entity->stopped) { <---- Aha! > > > > > > spin_unlock(&entity->lock); > > > > > > > > > > > > DRM_ERROR("Trying to push to a killed entity\n"); > > > > > > return; > > > > > > } > > > > > > > > > > > > rq = entity->rq; > > > > > > sched = rq->sched; > > > > > > > > > > > > spin_lock(&rq->lock); > > > > > > drm_sched_rq_add_entity(rq, entity); > > > > > > > > > > > > if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) > > > > > > drm_sched_rq_update_fifo_locked(entity, rq, submit_ts); <---- bumm! > > > > > > > > > > > > spin_unlock(&rq->lock); > > > > > > spin_unlock(&entity->lock); > > > > > > > > > > > > But the locks are still being hold. So that "shouldn't be happening"(tm). > > > > > > > > > > > > Interesting. AFAICS only drm_sched_entity_kill() and drm_sched_fini() > > > > > > stop entities. The former holds appropriate locks, but drm_sched_fini() > > > > > > doesn't. 
So that looks like a hot candidate to me. Opinions? > > > > > > > > > > > > On the other hand, aren't drivers prohibited from calling > > > > > > drm_sched_entity_push_job() after calling drm_sched_fini()? If the > > > > > > fuzzer does that, then it's not the scheduler's fault. > > > > > > > > Exactly, this is the first question to ask. > > > > > > > > And I think it's even more restrictive: > > > > > > > > In drm_sched_fini() > > > > > > > > for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) { > > > > struct drm_sched_rq *rq = sched->sched_rq[i]; > > > > > > > > spin_lock(&rq->lock); > > > > list_for_each_entry(s_entity, &rq->entities, list) > > > > /* > > > > * Prevents reinsertion and marks job_queue as idle, > > > > * it will be removed from the rq in drm_sched_entity_fini() > > > > * eventually > > > > */ > > > > s_entity->stopped = true; > > > > spin_unlock(&rq->lock); > > > > kfree(sched->sched_rq[i]); > > > > } > > > > > > > > In drm_sched_entity_kill() > > > > > > > > static void drm_sched_entity_kill(struct drm_sched_entity *entity) > > > > { > > > > struct drm_sched_job *job; > > > > struct dma_fence *prev; > > > > > > > > if (!entity->rq) > > > > return; > > > > > > > > spin_lock(&entity->lock); > > > > entity->stopped = true; > > > > drm_sched_rq_remove_entity(entity->rq, entity); > > > > spin_unlock(&entity->lock); > > > > > > > > [...] > > > > } > > > > > > > > If this runs concurrently, this is a UAF as well. > > > > > > > > Personally, I have always been working with the assupmtion that entites have to > > > > be torn down *before* the scheduler, but those lifetimes are not documented > > > > properly. > > > > > > Yes, this is my assumption too. I would even take it further: an entity > > > shouldn't be torn down until all jobs associated with it are freed as > > > well. I think this would solve a lot of issues I've seen on the list > > > related to UAF, teardown, etc. > > > > That's kind of impossible with the new tear down design, because > > drm_sched_fini() ensures that all jobs are freed on teardown. And > > drm_sched_fini() wouldn't be called before all jobs are gone, > > effectively resulting in a chicken-egg-problem, or rather: the driver > > implementing its own solution for teardown. > > > > I've read this four times and I'm still generally confused. > > "drm_sched_fini ensures that all jobs are freed on teardown" — Yes, > that's how a refcounting-based solution works. drm_sched_fini would > never be called if there were pending jobs. > > "drm_sched_fini() wouldn't be called before all jobs are gone" — See > above. > > "effectively resulting in a chicken-and-egg problem" — A job is created > after the scheduler, and it holds a reference to the scheduler until > it's freed. I don't see how this idiom applies. > > "the driver implementing its own solution for teardown" — It’s just > following the basic lifetime rules I outlined below. Perhaps Xe was > ahead of its time, but the number of DRM scheduler blowups we've had is > zero — maybe a strong indication that this design is correct. > Sorry—self-reply. To expand on this: the reason Xe implemented a refcount-based teardown solution is because the internals of the DRM scheduler during teardown looked wildly scary. A lower layer should not impose its will on upper layers. I think that’s the root cause of all the problems I've listed. In my opinion, we should document the lifetime rules I’ve outlined, fix all drivers accordingly, and assert these rules in the scheduler layer. Matt > Matt > > > P. 
> > > > > > > > > > > > > > > There are two solutions: > > > > > > > > (1) Strictly require all entities to be torn down before drm_sched_fini(), > > > > i.e. stick to the natural ownership and lifetime rules here (see below). > > > > > > > > (2) Actually protect *any* changes of the relevent fields of the entity > > > > structure with the entity lock. > > > > > > > > While (2) seems rather obvious, we run into lock inversion with this approach, > > > > as you note below as well. And I think drm_sched_fini() should not mess with > > > > entities anyways. > > > > > > > > The ownership here seems obvious: > > > > > > > > The scheduler *owns* a resource that is used by entities. Consequently, entities > > > > are not allowed to out-live the scheduler. > > > > > > > > Surely, the current implementation to just take the resource away from the > > > > entity under the hood can work as well with appropriate locking, but that's a > > > > mess. > > > > > > > > If the resource *really* needs to be shared for some reason (which I don't see), > > > > shared ownership, i.e. reference counting, is much less error prone. > > > > > > Yes, Xe solves all of this via reference counting (jobs refcount the > > > entity). It's a bit easier in Xe since the scheduler and entities are > > > the same object due to their 1:1 relationship. But even in non-1:1 > > > relationships, an entity could refcount the scheduler. The teardown > > > sequence would then be: all jobs complete on the entity → teardown the > > > entity → all entities torn down → teardown the scheduler. > > > > > > Matt > > ^ permalink raw reply [flat|nested] 17+ messages in thread
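
As an illustration of "assert these rules in the scheduler layer": if entities must be torn down before the scheduler, drm_sched_fini() could warn instead of silently stopping leftover entities. A sketch of such a check, based on the loop quoted in this thread, not a tested patch:

    /* Sketch only: assumes the rule "all entities are finalized before
     * drm_sched_fini()" is documented and enforced in the drivers.
     */
    void drm_sched_fini(struct drm_gpu_scheduler *sched)
    {
            unsigned int i;

            for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) {
                    struct drm_sched_rq *rq = sched->sched_rq[i];

                    spin_lock(&rq->lock);
                    /* With the lifetime rules above, this list is already empty;
                     * a non-empty list means a driver bug, not something to
                     * paper over by marking entities stopped here.
                     */
                    WARN_ON(!list_empty(&rq->entities));
                    spin_unlock(&rq->lock);

                    kfree(sched->sched_rq[i]);
            }
            /* ... rest of teardown ... */
    }
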
* Re: [PATCH] drm/sched: Prevent stopped entities from being added to the run queue. 2025-07-22 8:45 ` Matthew Brost @ 2025-07-23 6:56 ` Philipp Stanner 2025-07-24 4:13 ` Matthew Brost 0 siblings, 1 reply; 17+ messages in thread From: Philipp Stanner @ 2025-07-23 6:56 UTC (permalink / raw) To: Matthew Brost, phasta Cc: Danilo Krummrich, James Flowers, ckoenig.leichtzumerken, maarten.lankhorst, mripard, tzimmermann, airlied, simona, skhan, dri-devel, linux-kernel, linux-kernel-mentees, Tvrtko Ursulin On Tue, 2025-07-22 at 01:45 -0700, Matthew Brost wrote: > On Tue, Jul 22, 2025 at 01:07:29AM -0700, Matthew Brost wrote: > > On Tue, Jul 22, 2025 at 09:37:11AM +0200, Philipp Stanner wrote: > > > On Mon, 2025-07-21 at 11:07 -0700, Matthew Brost wrote: > > > > On Mon, Jul 21, 2025 at 12:14:31PM +0200, Danilo Krummrich wrote: > > > > > On Mon Jul 21, 2025 at 10:16 AM CEST, Philipp Stanner wrote: > > > > > > On Mon, 2025-07-21 at 09:52 +0200, Philipp Stanner wrote: > > > > > > > On Sun, 2025-07-20 at 16:56 -0700, James Flowers wrote: > > > > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c > > > > > > > > index bfea608a7106..997a2cc1a635 100644 > > > > > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c > > > > > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c > > > > > > > > @@ -172,8 +172,10 @@ void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity, > > > > > > > > > > > > > > > > entity->oldest_job_waiting = ts; > > > > > > > > > > > > > > > > - rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, > > > > > > > > - drm_sched_entity_compare_before); > > > > > > > > + if (!entity->stopped) { > > > > > > > > + rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, > > > > > > > > + drm_sched_entity_compare_before); > > > > > > > > + } > > > > > > > > > > > > > > If this is a race, then this patch here is broken, too, because you're > > > > > > > checking the 'stopped' boolean as the callers of that function do, too > > > > > > > – just later. :O > > > > > > > > > > > > > > Could still race, just less likely. > > > > > > > > > > > > > > The proper way to fix it would then be to address the issue where the > > > > > > > locking is supposed to happen. Let's look at, for example, > > > > > > > drm_sched_entity_push_job(): > > > > > > > > > > > > > > > > > > > > > void drm_sched_entity_push_job(struct drm_sched_job *sched_job) > > > > > > > { > > > > > > > (Bla bla bla) > > > > > > > > > > > > > > ………… > > > > > > > > > > > > > > /* first job wakes up scheduler */ > > > > > > > if (first) { > > > > > > > struct drm_gpu_scheduler *sched; > > > > > > > struct drm_sched_rq *rq; > > > > > > > > > > > > > > /* Add the entity to the run queue */ > > > > > > > spin_lock(&entity->lock); > > > > > > > if (entity->stopped) { <---- Aha! > > > > > > > spin_unlock(&entity->lock); > > > > > > > > > > > > > > DRM_ERROR("Trying to push to a killed entity\n"); > > > > > > > return; > > > > > > > } > > > > > > > > > > > > > > rq = entity->rq; > > > > > > > sched = rq->sched; > > > > > > > > > > > > > > spin_lock(&rq->lock); > > > > > > > drm_sched_rq_add_entity(rq, entity); > > > > > > > > > > > > > > if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) > > > > > > > drm_sched_rq_update_fifo_locked(entity, rq, submit_ts); <---- bumm! > > > > > > > > > > > > > > spin_unlock(&rq->lock); > > > > > > > spin_unlock(&entity->lock); > > > > > > > > > > > > > > But the locks are still being hold. So that "shouldn't be happening"(tm). 
> > > > > > > > > > > > > > Interesting. AFAICS only drm_sched_entity_kill() and drm_sched_fini() > > > > > > > stop entities. The former holds appropriate locks, but drm_sched_fini() > > > > > > > doesn't. So that looks like a hot candidate to me. Opinions? > > > > > > > > > > > > > > On the other hand, aren't drivers prohibited from calling > > > > > > > drm_sched_entity_push_job() after calling drm_sched_fini()? If the > > > > > > > fuzzer does that, then it's not the scheduler's fault. > > > > > > > > > > Exactly, this is the first question to ask. > > > > > > > > > > And I think it's even more restrictive: > > > > > > > > > > In drm_sched_fini() > > > > > > > > > > for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) { > > > > > struct drm_sched_rq *rq = sched->sched_rq[i]; > > > > > > > > > > spin_lock(&rq->lock); > > > > > list_for_each_entry(s_entity, &rq->entities, list) > > > > > /* > > > > > * Prevents reinsertion and marks job_queue as idle, > > > > > * it will be removed from the rq in drm_sched_entity_fini() > > > > > * eventually > > > > > */ > > > > > s_entity->stopped = true; > > > > > spin_unlock(&rq->lock); > > > > > kfree(sched->sched_rq[i]); > > > > > } > > > > > > > > > > In drm_sched_entity_kill() > > > > > > > > > > static void drm_sched_entity_kill(struct drm_sched_entity *entity) > > > > > { > > > > > struct drm_sched_job *job; > > > > > struct dma_fence *prev; > > > > > > > > > > if (!entity->rq) > > > > > return; > > > > > > > > > > spin_lock(&entity->lock); > > > > > entity->stopped = true; > > > > > drm_sched_rq_remove_entity(entity->rq, entity); > > > > > spin_unlock(&entity->lock); > > > > > > > > > > [...] > > > > > } > > > > > > > > > > If this runs concurrently, this is a UAF as well. > > > > > > > > > > Personally, I have always been working with the assupmtion that entites have to > > > > > be torn down *before* the scheduler, but those lifetimes are not documented > > > > > properly. > > > > > > > > Yes, this is my assumption too. I would even take it further: an entity > > > > shouldn't be torn down until all jobs associated with it are freed as > > > > well. I think this would solve a lot of issues I've seen on the list > > > > related to UAF, teardown, etc. > > > > > > That's kind of impossible with the new tear down design, because > > > drm_sched_fini() ensures that all jobs are freed on teardown. And > > > drm_sched_fini() wouldn't be called before all jobs are gone, > > > effectively resulting in a chicken-egg-problem, or rather: the driver > > > implementing its own solution for teardown. > > > > > > > I've read this four times and I'm still generally confused. > > > > "drm_sched_fini ensures that all jobs are freed on teardown" — Yes, > > that's how a refcounting-based solution works. drm_sched_fini would > > never be called if there were pending jobs. > > > > "drm_sched_fini() wouldn't be called before all jobs are gone" — See > > above. > > > > "effectively resulting in a chicken-and-egg problem" — A job is created > > after the scheduler, and it holds a reference to the scheduler until > > it's freed. I don't see how this idiom applies. > > > > "the driver implementing its own solution for teardown" — It’s just > > following the basic lifetime rules I outlined below. Perhaps Xe was > > ahead of its time, but the number of DRM scheduler blowups we've had is > > zero — maybe a strong indication that this design is correct. > > > > Sorry—self-reply. 
> > To expand on this: the reason Xe implemented a refcount-based teardown > solution is because the internals of the DRM scheduler during teardown > looked wildly scary. A lower layer should not impose its will on upper > layers. I think that’s the root cause of all the problems I've listed. > > In my opinion, we should document the lifetime rules I’ve outlined, fix > all drivers accordingly, and assert these rules in the scheduler layer. Everyone had a separate solution for that. Nouveau used a waitqueue. That's what happens when there's no centralized mechanism for solving a problem. Did you see the series we recently merged which repairs the memory leaks of drm/sched? It had been around for quite some time. https://lore.kernel.org/dri-devel/20250701132142.76899-3-phasta@kernel.org/ P. > > Matt > > > Matt > > > > > P. > > > > > > > > > > > > > > > > > > > > There are two solutions: > > > > > > > > > > (1) Strictly require all entities to be torn down before drm_sched_fini(), > > > > > i.e. stick to the natural ownership and lifetime rules here (see below). > > > > > > > > > > (2) Actually protect *any* changes of the relevent fields of the entity > > > > > structure with the entity lock. > > > > > > > > > > While (2) seems rather obvious, we run into lock inversion with this approach, > > > > > as you note below as well. And I think drm_sched_fini() should not mess with > > > > > entities anyways. > > > > > > > > > > The ownership here seems obvious: > > > > > > > > > > The scheduler *owns* a resource that is used by entities. Consequently, entities > > > > > are not allowed to out-live the scheduler. > > > > > > > > > > Surely, the current implementation to just take the resource away from the > > > > > entity under the hood can work as well with appropriate locking, but that's a > > > > > mess. > > > > > > > > > > If the resource *really* needs to be shared for some reason (which I don't see), > > > > > shared ownership, i.e. reference counting, is much less error prone. > > > > > > > > Yes, Xe solves all of this via reference counting (jobs refcount the > > > > entity). It's a bit easier in Xe since the scheduler and entities are > > > > the same object due to their 1:1 relationship. But even in non-1:1 > > > > relationships, an entity could refcount the scheduler. The teardown > > > > sequence would then be: all jobs complete on the entity → teardown the > > > > entity → all entities torn down → teardown the scheduler. > > > > > > > > Matt > > > ^ permalink raw reply [flat|nested] 17+ messages in thread
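
For reference, the waitqueue approach mentioned here amounts to the driver gating drm_sched_fini() on its own pending-job count; a generic sketch with made-up names, not Nouveau's actual implementation:

    #include <linux/atomic.h>
    #include <linux/wait.h>
    #include <drm/gpu_scheduler.h>

    /* Hypothetical driver-side teardown gate. */
    struct my_sched {
            struct drm_gpu_scheduler base;
            atomic_t pending_job_count;
            wait_queue_head_t job_done_wq;
    };

    /* Called from the driver's free_job callback for every job. */
    static void my_job_freed(struct my_sched *s)
    {
            if (atomic_dec_and_test(&s->pending_job_count))
                    wake_up(&s->job_done_wq);
    }

    static void my_sched_teardown(struct my_sched *s)
    {
            /* Block until every job has gone through free_job ... */
            wait_event(s->job_done_wq, atomic_read(&s->pending_job_count) == 0);
            /* ... so drm_sched_fini() never sees a pending job. */
            drm_sched_fini(&s->base);
    }
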
* Re: [PATCH] drm/sched: Prevent stopped entities from being added to the run queue. 2025-07-23 6:56 ` Philipp Stanner @ 2025-07-24 4:13 ` Matthew Brost 2025-07-24 4:17 ` Matthew Brost 0 siblings, 1 reply; 17+ messages in thread From: Matthew Brost @ 2025-07-24 4:13 UTC (permalink / raw) To: phasta Cc: Danilo Krummrich, James Flowers, ckoenig.leichtzumerken, maarten.lankhorst, mripard, tzimmermann, airlied, simona, skhan, dri-devel, linux-kernel, linux-kernel-mentees, Tvrtko Ursulin On Wed, Jul 23, 2025 at 08:56:01AM +0200, Philipp Stanner wrote: > On Tue, 2025-07-22 at 01:45 -0700, Matthew Brost wrote: > > On Tue, Jul 22, 2025 at 01:07:29AM -0700, Matthew Brost wrote: > > > On Tue, Jul 22, 2025 at 09:37:11AM +0200, Philipp Stanner wrote: > > > > On Mon, 2025-07-21 at 11:07 -0700, Matthew Brost wrote: > > > > > On Mon, Jul 21, 2025 at 12:14:31PM +0200, Danilo Krummrich wrote: > > > > > > On Mon Jul 21, 2025 at 10:16 AM CEST, Philipp Stanner wrote: > > > > > > > On Mon, 2025-07-21 at 09:52 +0200, Philipp Stanner wrote: > > > > > > > > On Sun, 2025-07-20 at 16:56 -0700, James Flowers wrote: > > > > > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c > > > > > > > > > index bfea608a7106..997a2cc1a635 100644 > > > > > > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c > > > > > > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c > > > > > > > > > @@ -172,8 +172,10 @@ void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity, > > > > > > > > > > > > > > > > > > entity->oldest_job_waiting = ts; > > > > > > > > > > > > > > > > > > - rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, > > > > > > > > > - drm_sched_entity_compare_before); > > > > > > > > > + if (!entity->stopped) { > > > > > > > > > + rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, > > > > > > > > > + drm_sched_entity_compare_before); > > > > > > > > > + } > > > > > > > > > > > > > > > > If this is a race, then this patch here is broken, too, because you're > > > > > > > > checking the 'stopped' boolean as the callers of that function do, too > > > > > > > > – just later. :O > > > > > > > > > > > > > > > > Could still race, just less likely. > > > > > > > > > > > > > > > > The proper way to fix it would then be to address the issue where the > > > > > > > > locking is supposed to happen. Let's look at, for example, > > > > > > > > drm_sched_entity_push_job(): > > > > > > > > > > > > > > > > > > > > > > > > void drm_sched_entity_push_job(struct drm_sched_job *sched_job) > > > > > > > > { > > > > > > > > (Bla bla bla) > > > > > > > > > > > > > > > > ………… > > > > > > > > > > > > > > > > /* first job wakes up scheduler */ > > > > > > > > if (first) { > > > > > > > > struct drm_gpu_scheduler *sched; > > > > > > > > struct drm_sched_rq *rq; > > > > > > > > > > > > > > > > /* Add the entity to the run queue */ > > > > > > > > spin_lock(&entity->lock); > > > > > > > > if (entity->stopped) { <---- Aha! > > > > > > > > spin_unlock(&entity->lock); > > > > > > > > > > > > > > > > DRM_ERROR("Trying to push to a killed entity\n"); > > > > > > > > return; > > > > > > > > } > > > > > > > > > > > > > > > > rq = entity->rq; > > > > > > > > sched = rq->sched; > > > > > > > > > > > > > > > > spin_lock(&rq->lock); > > > > > > > > drm_sched_rq_add_entity(rq, entity); > > > > > > > > > > > > > > > > if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) > > > > > > > > drm_sched_rq_update_fifo_locked(entity, rq, submit_ts); <---- bumm! 
> > > > > > > > > > > > > > > > spin_unlock(&rq->lock); > > > > > > > > spin_unlock(&entity->lock); > > > > > > > > > > > > > > > > But the locks are still being hold. So that "shouldn't be happening"(tm). > > > > > > > > > > > > > > > > Interesting. AFAICS only drm_sched_entity_kill() and drm_sched_fini() > > > > > > > > stop entities. The former holds appropriate locks, but drm_sched_fini() > > > > > > > > doesn't. So that looks like a hot candidate to me. Opinions? > > > > > > > > > > > > > > > > On the other hand, aren't drivers prohibited from calling > > > > > > > > drm_sched_entity_push_job() after calling drm_sched_fini()? If the > > > > > > > > fuzzer does that, then it's not the scheduler's fault. > > > > > > > > > > > > Exactly, this is the first question to ask. > > > > > > > > > > > > And I think it's even more restrictive: > > > > > > > > > > > > In drm_sched_fini() > > > > > > > > > > > > for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) { > > > > > > struct drm_sched_rq *rq = sched->sched_rq[i]; > > > > > > > > > > > > spin_lock(&rq->lock); > > > > > > list_for_each_entry(s_entity, &rq->entities, list) > > > > > > /* > > > > > > * Prevents reinsertion and marks job_queue as idle, > > > > > > * it will be removed from the rq in drm_sched_entity_fini() > > > > > > * eventually > > > > > > */ > > > > > > s_entity->stopped = true; > > > > > > spin_unlock(&rq->lock); > > > > > > kfree(sched->sched_rq[i]); > > > > > > } > > > > > > > > > > > > In drm_sched_entity_kill() > > > > > > > > > > > > static void drm_sched_entity_kill(struct drm_sched_entity *entity) > > > > > > { > > > > > > struct drm_sched_job *job; > > > > > > struct dma_fence *prev; > > > > > > > > > > > > if (!entity->rq) > > > > > > return; > > > > > > > > > > > > spin_lock(&entity->lock); > > > > > > entity->stopped = true; > > > > > > drm_sched_rq_remove_entity(entity->rq, entity); > > > > > > spin_unlock(&entity->lock); > > > > > > > > > > > > [...] > > > > > > } > > > > > > > > > > > > If this runs concurrently, this is a UAF as well. > > > > > > > > > > > > Personally, I have always been working with the assupmtion that entites have to > > > > > > be torn down *before* the scheduler, but those lifetimes are not documented > > > > > > properly. > > > > > > > > > > Yes, this is my assumption too. I would even take it further: an entity > > > > > shouldn't be torn down until all jobs associated with it are freed as > > > > > well. I think this would solve a lot of issues I've seen on the list > > > > > related to UAF, teardown, etc. > > > > > > > > That's kind of impossible with the new tear down design, because > > > > drm_sched_fini() ensures that all jobs are freed on teardown. And > > > > drm_sched_fini() wouldn't be called before all jobs are gone, > > > > effectively resulting in a chicken-egg-problem, or rather: the driver > > > > implementing its own solution for teardown. > > > > > > > > > > I've read this four times and I'm still generally confused. > > > > > > "drm_sched_fini ensures that all jobs are freed on teardown" — Yes, > > > that's how a refcounting-based solution works. drm_sched_fini would > > > never be called if there were pending jobs. > > > > > > "drm_sched_fini() wouldn't be called before all jobs are gone" — See > > > above. > > > > > > "effectively resulting in a chicken-and-egg problem" — A job is created > > > after the scheduler, and it holds a reference to the scheduler until > > > it's freed. I don't see how this idiom applies. 
> > > > > > "the driver implementing its own solution for teardown" — It’s just > > > following the basic lifetime rules I outlined below. Perhaps Xe was > > > ahead of its time, but the number of DRM scheduler blowups we've had is > > > zero — maybe a strong indication that this design is correct. > > > > > > > Sorry—self-reply. > > > > To expand on this: the reason Xe implemented a refcount-based teardown > > solution is because the internals of the DRM scheduler during teardown > > looked wildly scary. A lower layer should not impose its will on upper > > layers. I think that’s the root cause of all the problems I've listed. > > > > In my opinion, we should document the lifetime rules I’ve outlined, fix > > all drivers accordingly, and assert these rules in the scheduler layer. > > > Everyone had a separate solution for that. Nouveau used a waitqueue. > That's what happens when there's no centralized mechanism for solving a > problem. > Right, this is essentially my point — I think refcounting on the driver side is what the long-term solution really needs to be. To recap the basic rules: - Entities should not be finalized or freed until all jobs associated with them are freed. - Schedulers should not be finalized or freed until all associated entities are finalized. - Jobs should hold a reference to the entity. - Entities should hold a reference to the scheduler. I understand this won’t happen overnight — or perhaps ever — but adopting this model would solve a lot of problems across the subsystem and reduce a significant amount of complexity in the DRM scheduler. I’ll also acknowledge that part of this is my fault — years ago, I worked around problems (implemented above ref count model) in the scheduler related to teardown rather than proposing a common, unified solution, and clear lifetime rules. For drivers with a 1:1 entity-to-scheduler relationship, teardown becomes fairly simple: set the TDR timeout to zero and naturally let the remaining jobs flush out via TDR + the timedout_job callback, which signals the job’s fence. Free job, is called after that. For non-1:1 setups, we could introduce something like drm_sched_entity_kill, which would move all jobs on the pending list of a given entity to a kill list. A worker could then process that kill list — calling timedout_job and signaling the associated fences. Similarly, any jobs that had unresolved dependencies could be immediately added to the kill list. The kill list would have to be checked in drm_sched_free_job_work too. This would ensure that all jobs submitted would go through the full lifecycle: - run_job is called - free_job is called - If the fence returned from run_job needs to be artificially signaled, timedout_job is called We can add the infrastructure for this and once all driver adhere this model, clean up ugliness in the scheduler related to teardown and all races here. > Did you see the series we recently merged which repairs the memory > leaks of drm/sched? It had been around for quite some time. > > https://lore.kernel.org/dri-devel/20250701132142.76899-3-phasta@kernel.org/ > I would say this is just hacking around the fundamental issues with the lifetime of these objects. Do you see anything in Nouveau that would prevent the approach I described above from working? Also, what if jobs have dependencies that aren't even on the pending list yet? This further illustrates the problems with trying to finalize objects while child objects (entities, job) are still around. Matt > > P. 
> > > > > Matt > > > > > Matt > > > > > > > P. > > > > > > > > > > > > > > > > > > > > > > > > > There are two solutions: > > > > > > > > > > > > (1) Strictly require all entities to be torn down before drm_sched_fini(), > > > > > > i.e. stick to the natural ownership and lifetime rules here (see below). > > > > > > > > > > > > (2) Actually protect *any* changes of the relevent fields of the entity > > > > > > structure with the entity lock. > > > > > > > > > > > > While (2) seems rather obvious, we run into lock inversion with this approach, > > > > > > as you note below as well. And I think drm_sched_fini() should not mess with > > > > > > entities anyways. > > > > > > > > > > > > The ownership here seems obvious: > > > > > > > > > > > > The scheduler *owns* a resource that is used by entities. Consequently, entities > > > > > > are not allowed to out-live the scheduler. > > > > > > > > > > > > Surely, the current implementation to just take the resource away from the > > > > > > entity under the hood can work as well with appropriate locking, but that's a > > > > > > mess. > > > > > > > > > > > > If the resource *really* needs to be shared for some reason (which I don't see), > > > > > > shared ownership, i.e. reference counting, is much less error prone. > > > > > > > > > > Yes, Xe solves all of this via reference counting (jobs refcount the > > > > > entity). It's a bit easier in Xe since the scheduler and entities are > > > > > the same object due to their 1:1 relationship. But even in non-1:1 > > > > > relationships, an entity could refcount the scheduler. The teardown > > > > > sequence would then be: all jobs complete on the entity → teardown the > > > > > entity → all entities torn down → teardown the scheduler. > > > > > > > > > > Matt > > > > > ^ permalink raw reply [flat|nested] 17+ messages in thread
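
A rough sketch of the kill-list idea for non-1:1 setups described above. The kill_list and kill_work members and both function names are hypothetical additions, not existing drm_sched code; kill-list locking is elided, and per Matt's follow-up, jobs whose dependencies never resolved would first get run_job() called and then be moved to the kill list as well (only the pending-list half is shown):

    /* Move an entity's pending jobs onto a (hypothetical) kill list. */
    static void drm_sched_entity_kill_pending_jobs(struct drm_sched_entity *entity)
    {
            struct drm_gpu_scheduler *sched = entity->rq->sched;
            struct drm_sched_job *job, *tmp;

            spin_lock(&sched->job_list_lock);
            list_for_each_entry_safe(job, tmp, &sched->pending_list, list) {
                    if (job->entity != entity)
                            continue;
                    /* Hand the entity's pending jobs over to the kill worker. */
                    list_move_tail(&job->list, &sched->kill_list);
            }
            spin_unlock(&sched->job_list_lock);

            queue_work(sched->submit_wq, &sched->kill_work);
    }

    static void drm_sched_kill_work_fn(struct work_struct *w)
    {
            struct drm_gpu_scheduler *sched =
                    container_of(w, struct drm_gpu_scheduler, kill_work);
            struct drm_sched_job *job;

            /* Artificially complete each killed job: timedout_job() signals the
             * job's fence, after which drm_sched_free_job_work() -- which would
             * also need to scan the kill list -- calls free_job() as usual.
             */
            list_for_each_entry(job, &sched->kill_list, list)
                    sched->ops->timedout_job(job);
    }

This keeps the full lifecycle intact for every submitted job: run_job is called, the fence is signaled (artificially via timedout_job if needed), and free_job runs last, so neither the entity nor the scheduler has to be finalized with jobs still in flight.
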
* Re: [PATCH] drm/sched: Prevent stopped entities from being added to the run queue. 2025-07-24 4:13 ` Matthew Brost @ 2025-07-24 4:17 ` Matthew Brost 0 siblings, 0 replies; 17+ messages in thread From: Matthew Brost @ 2025-07-24 4:17 UTC (permalink / raw) To: phasta Cc: Danilo Krummrich, James Flowers, ckoenig.leichtzumerken, maarten.lankhorst, mripard, tzimmermann, airlied, simona, skhan, dri-devel, linux-kernel, linux-kernel-mentees, Tvrtko Ursulin On Wed, Jul 23, 2025 at 09:13:34PM -0700, Matthew Brost wrote: > On Wed, Jul 23, 2025 at 08:56:01AM +0200, Philipp Stanner wrote: > > On Tue, 2025-07-22 at 01:45 -0700, Matthew Brost wrote: > > > On Tue, Jul 22, 2025 at 01:07:29AM -0700, Matthew Brost wrote: > > > > On Tue, Jul 22, 2025 at 09:37:11AM +0200, Philipp Stanner wrote: > > > > > On Mon, 2025-07-21 at 11:07 -0700, Matthew Brost wrote: > > > > > > On Mon, Jul 21, 2025 at 12:14:31PM +0200, Danilo Krummrich wrote: > > > > > > > On Mon Jul 21, 2025 at 10:16 AM CEST, Philipp Stanner wrote: > > > > > > > > On Mon, 2025-07-21 at 09:52 +0200, Philipp Stanner wrote: > > > > > > > > > On Sun, 2025-07-20 at 16:56 -0700, James Flowers wrote: > > > > > > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c > > > > > > > > > > index bfea608a7106..997a2cc1a635 100644 > > > > > > > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c > > > > > > > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c > > > > > > > > > > @@ -172,8 +172,10 @@ void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity, > > > > > > > > > > > > > > > > > > > > entity->oldest_job_waiting = ts; > > > > > > > > > > > > > > > > > > > > - rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, > > > > > > > > > > - drm_sched_entity_compare_before); > > > > > > > > > > + if (!entity->stopped) { > > > > > > > > > > + rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, > > > > > > > > > > + drm_sched_entity_compare_before); > > > > > > > > > > + } > > > > > > > > > > > > > > > > > > If this is a race, then this patch here is broken, too, because you're > > > > > > > > > checking the 'stopped' boolean as the callers of that function do, too > > > > > > > > > – just later. :O > > > > > > > > > > > > > > > > > > Could still race, just less likely. > > > > > > > > > > > > > > > > > > The proper way to fix it would then be to address the issue where the > > > > > > > > > locking is supposed to happen. Let's look at, for example, > > > > > > > > > drm_sched_entity_push_job(): > > > > > > > > > > > > > > > > > > > > > > > > > > > void drm_sched_entity_push_job(struct drm_sched_job *sched_job) > > > > > > > > > { > > > > > > > > > (Bla bla bla) > > > > > > > > > > > > > > > > > > ………… > > > > > > > > > > > > > > > > > > /* first job wakes up scheduler */ > > > > > > > > > if (first) { > > > > > > > > > struct drm_gpu_scheduler *sched; > > > > > > > > > struct drm_sched_rq *rq; > > > > > > > > > > > > > > > > > > /* Add the entity to the run queue */ > > > > > > > > > spin_lock(&entity->lock); > > > > > > > > > if (entity->stopped) { <---- Aha! 
> > > > > > > > > spin_unlock(&entity->lock); > > > > > > > > > > > > > > > > > > DRM_ERROR("Trying to push to a killed entity\n"); > > > > > > > > > return; > > > > > > > > > } > > > > > > > > > > > > > > > > > > rq = entity->rq; > > > > > > > > > sched = rq->sched; > > > > > > > > > > > > > > > > > > spin_lock(&rq->lock); > > > > > > > > > drm_sched_rq_add_entity(rq, entity); > > > > > > > > > > > > > > > > > > if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) > > > > > > > > > drm_sched_rq_update_fifo_locked(entity, rq, submit_ts); <---- bumm! > > > > > > > > > > > > > > > > > > spin_unlock(&rq->lock); > > > > > > > > > spin_unlock(&entity->lock); > > > > > > > > > > > > > > > > > > But the locks are still being hold. So that "shouldn't be happening"(tm). > > > > > > > > > > > > > > > > > > Interesting. AFAICS only drm_sched_entity_kill() and drm_sched_fini() > > > > > > > > > stop entities. The former holds appropriate locks, but drm_sched_fini() > > > > > > > > > doesn't. So that looks like a hot candidate to me. Opinions? > > > > > > > > > > > > > > > > > > On the other hand, aren't drivers prohibited from calling > > > > > > > > > drm_sched_entity_push_job() after calling drm_sched_fini()? If the > > > > > > > > > fuzzer does that, then it's not the scheduler's fault. > > > > > > > > > > > > > > Exactly, this is the first question to ask. > > > > > > > > > > > > > > And I think it's even more restrictive: > > > > > > > > > > > > > > In drm_sched_fini() > > > > > > > > > > > > > > for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) { > > > > > > > struct drm_sched_rq *rq = sched->sched_rq[i]; > > > > > > > > > > > > > > spin_lock(&rq->lock); > > > > > > > list_for_each_entry(s_entity, &rq->entities, list) > > > > > > > /* > > > > > > > * Prevents reinsertion and marks job_queue as idle, > > > > > > > * it will be removed from the rq in drm_sched_entity_fini() > > > > > > > * eventually > > > > > > > */ > > > > > > > s_entity->stopped = true; > > > > > > > spin_unlock(&rq->lock); > > > > > > > kfree(sched->sched_rq[i]); > > > > > > > } > > > > > > > > > > > > > > In drm_sched_entity_kill() > > > > > > > > > > > > > > static void drm_sched_entity_kill(struct drm_sched_entity *entity) > > > > > > > { > > > > > > > struct drm_sched_job *job; > > > > > > > struct dma_fence *prev; > > > > > > > > > > > > > > if (!entity->rq) > > > > > > > return; > > > > > > > > > > > > > > spin_lock(&entity->lock); > > > > > > > entity->stopped = true; > > > > > > > drm_sched_rq_remove_entity(entity->rq, entity); > > > > > > > spin_unlock(&entity->lock); > > > > > > > > > > > > > > [...] > > > > > > > } > > > > > > > > > > > > > > If this runs concurrently, this is a UAF as well. > > > > > > > > > > > > > > Personally, I have always been working with the assupmtion that entites have to > > > > > > > be torn down *before* the scheduler, but those lifetimes are not documented > > > > > > > properly. > > > > > > > > > > > > Yes, this is my assumption too. I would even take it further: an entity > > > > > > shouldn't be torn down until all jobs associated with it are freed as > > > > > > well. I think this would solve a lot of issues I've seen on the list > > > > > > related to UAF, teardown, etc. > > > > > > > > > > That's kind of impossible with the new tear down design, because > > > > > drm_sched_fini() ensures that all jobs are freed on teardown. 
And > > > > > drm_sched_fini() wouldn't be called before all jobs are gone, > > > > > effectively resulting in a chicken-egg-problem, or rather: the driver > > > > > implementing its own solution for teardown. > > > > > > > > > > > > > I've read this four times and I'm still generally confused. > > > > > > > > "drm_sched_fini ensures that all jobs are freed on teardown" — Yes, > > > > that's how a refcounting-based solution works. drm_sched_fini would > > > > never be called if there were pending jobs. > > > > > > > > "drm_sched_fini() wouldn't be called before all jobs are gone" — See > > > > above. > > > > > > > > "effectively resulting in a chicken-and-egg problem" — A job is created > > > > after the scheduler, and it holds a reference to the scheduler until > > > > it's freed. I don't see how this idiom applies. > > > > > > > > "the driver implementing its own solution for teardown" — It’s just > > > > following the basic lifetime rules I outlined below. Perhaps Xe was > > > > ahead of its time, but the number of DRM scheduler blowups we've had is > > > > zero — maybe a strong indication that this design is correct. > > > > > > > > > > Sorry—self-reply. > > > > > > To expand on this: the reason Xe implemented a refcount-based teardown > > > solution is because the internals of the DRM scheduler during teardown > > > looked wildly scary. A lower layer should not impose its will on upper > > > layers. I think that’s the root cause of all the problems I've listed. > > > > > > In my opinion, we should document the lifetime rules I’ve outlined, fix > > > all drivers accordingly, and assert these rules in the scheduler layer. > > > > > > Everyone had a separate solution for that. Nouveau used a waitqueue. > > That's what happens when there's no centralized mechanism for solving a > > problem. > > > > Right, this is essentially my point — I think refcounting on the driver > side is what the long-term solution really needs to be. > > To recap the basic rules: > > - Entities should not be finalized or freed until all jobs associated > with them are freed. > - Schedulers should not be finalized or freed until all associated > entities are finalized. > - Jobs should hold a reference to the entity. > - Entities should hold a reference to the scheduler. > > I understand this won’t happen overnight — or perhaps ever — but > adopting this model would solve a lot of problems across the subsystem > and reduce a significant amount of complexity in the DRM scheduler. I’ll > also acknowledge that part of this is my fault — years ago, I worked > around problems (implemented above ref count model) in the scheduler > related to teardown rather than proposing a common, unified solution, > and clear lifetime rules. > > For drivers with a 1:1 entity-to-scheduler relationship, teardown > becomes fairly simple: set the TDR timeout to zero and naturally let the > remaining jobs flush out via TDR + the timedout_job callback, which > signals the job’s fence. Free job, is called after that. > > For non-1:1 setups, we could introduce something like > drm_sched_entity_kill, which would move all jobs on the pending list of > a given entity to a kill list. A worker could then process that kill > list — calling timedout_job and signaling the associated fences. > Similarly, any jobs that had unresolved dependencies could be > immediately added to the kill list. The kill list would have to be s/added to the kill list/added to the kill list after calling run_job/ Matt > checked in drm_sched_free_job_work too. 
> > This would ensure that all jobs submitted would go through the full > lifecycle: > > - run_job is called > - free_job is called > - If the fence returned from run_job needs to be artificially signaled, > timedout_job is called > > We can add the infrastructure for this and once all driver adhere this > model, clean up ugliness in the scheduler related to teardown and all > races here. > > > Did you see the series we recently merged which repairs the memory > > leaks of drm/sched? It had been around for quite some time. > > > > https://lore.kernel.org/dri-devel/20250701132142.76899-3-phasta@kernel.org/ > > > > I would say this is just hacking around the fundamental issues with the > lifetime of these objects. Do you see anything in Nouveau that would > prevent the approach I described above from working? > > Also, what if jobs have dependencies that aren't even on the pending > list yet? This further illustrates the problems with trying to finalize > objects while child objects (entities, job) are still around. > > Matt > > > > > P. > > > > > > > > Matt > > > > > > > Matt > > > > > > > > > P. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > There are two solutions: > > > > > > > > > > > > > > (1) Strictly require all entities to be torn down before drm_sched_fini(), > > > > > > > i.e. stick to the natural ownership and lifetime rules here (see below). > > > > > > > > > > > > > > (2) Actually protect *any* changes of the relevent fields of the entity > > > > > > > structure with the entity lock. > > > > > > > > > > > > > > While (2) seems rather obvious, we run into lock inversion with this approach, > > > > > > > as you note below as well. And I think drm_sched_fini() should not mess with > > > > > > > entities anyways. > > > > > > > > > > > > > > The ownership here seems obvious: > > > > > > > > > > > > > > The scheduler *owns* a resource that is used by entities. Consequently, entities > > > > > > > are not allowed to out-live the scheduler. > > > > > > > > > > > > > > Surely, the current implementation to just take the resource away from the > > > > > > > entity under the hood can work as well with appropriate locking, but that's a > > > > > > > mess. > > > > > > > > > > > > > > If the resource *really* needs to be shared for some reason (which I don't see), > > > > > > > shared ownership, i.e. reference counting, is much less error prone. > > > > > > > > > > > > Yes, Xe solves all of this via reference counting (jobs refcount the > > > > > > entity). It's a bit easier in Xe since the scheduler and entities are > > > > > > the same object due to their 1:1 relationship. But even in non-1:1 > > > > > > relationships, an entity could refcount the scheduler. The teardown > > > > > > sequence would then be: all jobs complete on the entity → teardown the > > > > > > entity → all entities torn down → teardown the scheduler. > > > > > > > > > > > > Matt > > > > > > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] drm/sched: Prevent stopped entities from being added to the run queue. 2025-07-21 8:16 ` Philipp Stanner 2025-07-21 10:14 ` Danilo Krummrich @ 2025-07-22 20:05 ` James 2025-07-23 14:41 ` Philipp Stanner 1 sibling, 1 reply; 17+ messages in thread From: James @ 2025-07-22 20:05 UTC (permalink / raw) To: phasta, matthew.brost, dakr, Christian König, maarten.lankhorst, mripard, tzimmermann, airlied, simona, Shuah Khan Cc: dri-devel, linux-kernel, linux-kernel-mentees, Tvrtko Ursulin On Mon, Jul 21, 2025, at 1:16 AM, Philipp Stanner wrote: > On Mon, 2025-07-21 at 09:52 +0200, Philipp Stanner wrote: >> +Cc Tvrtko, who's currently reworking FIFO and RR. >> >> On Sun, 2025-07-20 at 16:56 -0700, James Flowers wrote: >> > Fixes an issue where entities are added to the run queue in >> > drm_sched_rq_update_fifo_locked after being killed, causing a >> > slab-use-after-free error. >> > >> > Signed-off-by: James Flowers <bold.zone2373@fastmail.com> >> > --- >> > This issue was detected by syzkaller running on a Steam Deck OLED. >> > Unfortunately I don't have a reproducer for it. I've >> >> Well, now that's kind of an issue – if you don't have a reproducer, how >> can you know that your patch is correct? How can we? >> >> It would certainly be good to know what the fuzz testing framework >> does. >> >> > included the KASAN reports below: >> >> >> Anyways, KASAN reports look interesting. But those might be many >> different issues. Again, would be good to know what the fuzzer has been >> testing. Can you maybe split this fuzz test into sub-tests? I suspsect >> those might be different faults. >> >> >> Anyways, taking a first look… >> >> >> > >> > ================================================================== >> > BUG: KASAN: slab-use-after-free in rb_next+0xda/0x160 lib/rbtree.c:505 >> > Read of size 8 at addr ffff8881805085e0 by task kworker/u32:12/192 >> > CPU: 3 UID: 0 PID: 192 Comm: kworker/u32:12 Not tainted 6.14.0-flowejam-+ #1 >> > Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024 >> > Workqueue: sdma0 drm_sched_run_job_work [gpu_sched] >> > Call Trace: >> > <TASK> >> > __dump_stack lib/dump_stack.c:94 [inline] >> > dump_stack_lvl+0xd2/0x130 lib/dump_stack.c:120 >> > print_address_description.constprop.0+0x88/0x380 mm/kasan/report.c:408 >> > print_report+0xfc/0x1ff mm/kasan/report.c:521 >> > kasan_report+0xdd/0x1b0 mm/kasan/report.c:634 >> > rb_next+0xda/0x160 lib/rbtree.c:505 >> > drm_sched_rq_select_entity_fifo drivers/gpu/drm/scheduler/sched_main.c:332 [inline] [gpu_sched] >> > drm_sched_select_entity+0x497/0x720 drivers/gpu/drm/scheduler/sched_main.c:1081 [gpu_sched] >> > drm_sched_run_job_work+0x2e/0x710 drivers/gpu/drm/scheduler/sched_main.c:1206 [gpu_sched] >> > process_one_work+0x9c0/0x17e0 kernel/workqueue.c:3238 >> > process_scheduled_works kernel/workqueue.c:3319 [inline] >> > worker_thread+0x734/0x1060 kernel/workqueue.c:3400 >> > kthread+0x3fd/0x810 kernel/kthread.c:464 >> > ret_from_fork+0x53/0x80 arch/x86/kernel/process.c:148 >> > ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 >> > </TASK> >> > Allocated by task 73472: >> > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 >> > kasan_save_track+0x14/0x30 mm/kasan/common.c:68 >> > poison_kmalloc_redzone mm/kasan/common.c:377 [inline] >> > __kasan_kmalloc+0x9a/0xb0 mm/kasan/common.c:394 >> > kmalloc_noprof include/linux/slab.h:901 [inline] [amdgpu] >> > kzalloc_noprof include/linux/slab.h:1037 [inline] [amdgpu] >> > amdgpu_driver_open_kms+0x151/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1402 
[amdgpu] >> > drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 >> > drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 >> > drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 >> > drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 >> > chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 >> > do_dentry_open+0x743/0x1bf0 fs/open.c:956 >> > vfs_open+0x87/0x3f0 fs/open.c:1086 >> > do_open+0x72f/0xf80 fs/namei.c:3830 >> > path_openat+0x2ec/0x770 fs/namei.c:3989 >> > do_filp_open+0x1ff/0x420 fs/namei.c:4016 >> > do_sys_openat2+0x181/0x1e0 fs/open.c:1428 >> > do_sys_open fs/open.c:1443 [inline] >> > __do_sys_openat fs/open.c:1459 [inline] >> > __se_sys_openat fs/open.c:1454 [inline] >> > __x64_sys_openat+0x149/0x210 fs/open.c:1454 >> > do_syscall_x64 arch/x86/entry/common.c:52 [inline] >> > do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 >> > entry_SYSCALL_64_after_hwframe+0x76/0x7e >> > Freed by task 73472: >> > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 >> > kasan_save_track+0x14/0x30 mm/kasan/common.c:68 >> > kasan_save_free_info+0x3b/0x70 mm/kasan/generic.c:576 >> > poison_slab_object mm/kasan/common.c:247 [inline] >> > __kasan_slab_free+0x52/0x70 mm/kasan/common.c:264 >> > kasan_slab_free include/linux/kasan.h:233 [inline] >> > slab_free_hook mm/slub.c:2353 [inline] >> > slab_free mm/slub.c:4609 [inline] >> > kfree+0x14f/0x4d0 mm/slub.c:4757 >> > amdgpu_driver_postclose_kms+0x43d/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1538 [amdgpu] >> > drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 >> > drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] >> > drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 >> > drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 >> > __fput+0x402/0xb50 fs/file_table.c:464 >> > task_work_run+0x155/0x250 kernel/task_work.c:227 >> > get_signal+0x1be/0x19d0 kernel/signal.c:2809 >> > arch_do_signal_or_restart+0x96/0x3a0 arch/x86/kernel/signal.c:337 >> > exit_to_user_mode_loop kernel/entry/common.c:111 [inline] >> > exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] >> > __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] >> > syscall_exit_to_user_mode+0x1fc/0x290 kernel/entry/common.c:218 >> > do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 >> > entry_SYSCALL_64_after_hwframe+0x76/0x7e >> > The buggy address belongs to the object at ffff888180508000 >> > The buggy address is located 1504 bytes inside of >> > The buggy address belongs to the physical page: >> > page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x180508 >> > head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 >> > flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) >> > page_type: f5(slab) >> > raw: 0017ffffc0000040 ffff888100043180 dead000000000100 dead000000000122 >> > raw: 0000000000000000 0000000080020002 00000000f5000000 0000000000000000 >> > head: 0017ffffc0000040 ffff888100043180 dead000000000100 dead000000000122 >> > head: 0000000000000000 0000000080020002 00000000f5000000 0000000000000000 >> > head: 0017ffffc0000003 ffffea0006014201 ffffffffffffffff 0000000000000000 >> > head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 >> > page dumped because: kasan: bad access detected >> > Memory state around the buggy address: >> > ffff888180508480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> > ffff888180508500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> > > ffff888180508580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb 
>> > ^ >> > ffff888180508600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> > ffff888180508680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> > ================================================================== >> > ================================================================== >> > BUG: KASAN: slab-use-after-free in rb_set_parent_color include/linux/rbtree_augmented.h:191 [inline] >> > BUG: KASAN: slab-use-after-free in __rb_erase_augmented include/linux/rbtree_augmented.h:312 [inline] >> > BUG: KASAN: slab-use-after-free in rb_erase+0x157c/0x1b10 lib/rbtree.c:443 >> > Write of size 8 at addr ffff88816414c5d0 by task syz.2.3004/12376 >> > CPU: 7 UID: 65534 PID: 12376 Comm: syz.2.3004 Not tainted 6.14.0-flowejam-+ #1 >> > Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024 >> > Call Trace: >> > <TASK> >> > __dump_stack lib/dump_stack.c:94 [inline] >> > dump_stack_lvl+0xd2/0x130 lib/dump_stack.c:120 >> > print_address_description.constprop.0+0x88/0x380 mm/kasan/report.c:408 >> > print_report+0xfc/0x1ff mm/kasan/report.c:521 >> > kasan_report+0xdd/0x1b0 mm/kasan/report.c:634 >> > rb_set_parent_color include/linux/rbtree_augmented.h:191 [inline] >> > __rb_erase_augmented include/linux/rbtree_augmented.h:312 [inline] >> > rb_erase+0x157c/0x1b10 lib/rbtree.c:443 >> > rb_erase_cached include/linux/rbtree.h:126 [inline] [gpu_sched] >> > drm_sched_rq_remove_fifo_locked drivers/gpu/drm/scheduler/sched_main.c:154 [inline] [gpu_sched] >> > drm_sched_rq_remove_entity+0x2d3/0x480 drivers/gpu/drm/scheduler/sched_main.c:243 [gpu_sched] >> > drm_sched_entity_kill.part.0+0x82/0x5e0 drivers/gpu/drm/scheduler/sched_entity.c:237 [gpu_sched] >> > drm_sched_entity_kill drivers/gpu/drm/scheduler/sched_entity.c:232 [inline] [gpu_sched] >> > drm_sched_entity_fini+0x4c/0x290 drivers/gpu/drm/scheduler/sched_entity.c:331 [gpu_sched] >> > amdgpu_vm_fini_entities drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:529 [inline] [amdgpu] >> > amdgpu_vm_fini+0x862/0x1180 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:2752 [amdgpu] >> > amdgpu_driver_postclose_kms+0x3db/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1526 [amdgpu] >> > drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 >> > drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] >> > drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 >> > drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 >> > __fput+0x402/0xb50 fs/file_table.c:464 >> > task_work_run+0x155/0x250 kernel/task_work.c:227 >> > exit_task_work include/linux/task_work.h:40 [inline] >> > do_exit+0x841/0xf60 kernel/exit.c:938 >> > do_group_exit+0xda/0x2b0 kernel/exit.c:1087 >> > get_signal+0x171f/0x19d0 kernel/signal.c:3036 >> > arch_do_signal_or_restart+0x96/0x3a0 arch/x86/kernel/signal.c:337 >> > exit_to_user_mode_loop kernel/entry/common.c:111 [inline] >> > exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] >> > __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] >> > syscall_exit_to_user_mode+0x1fc/0x290 kernel/entry/common.c:218 >> > do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 >> > entry_SYSCALL_64_after_hwframe+0x76/0x7e >> > RIP: 0033:0x7f2d90da36ed >> > Code: Unable to access opcode bytes at 0x7f2d90da36c3. 
>> > RSP: 002b:00007f2d91b710d8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca >> > RAX: 0000000000000000 RBX: 00007f2d90fe6088 RCX: 00007f2d90da36ed >> > RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f2d90fe6088 >> > RBP: 00007f2d90fe6080 R08: 0000000000000000 R09: 0000000000000000 >> > R10: 0000000000000000 R11: 0000000000000246 R12: 00007f2d90fe608c >> > R13: 0000000000000000 R14: 0000000000000002 R15: 00007ffc34a67bd0 >> > </TASK> >> > Allocated by task 12381: >> > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 >> > kasan_save_track+0x14/0x30 mm/kasan/common.c:68 >> > poison_kmalloc_redzone mm/kasan/common.c:377 [inline] >> > __kasan_kmalloc+0x9a/0xb0 mm/kasan/common.c:394 >> > kmalloc_noprof include/linux/slab.h:901 [inline] [amdgpu] >> > kzalloc_noprof include/linux/slab.h:1037 [inline] [amdgpu] >> > amdgpu_driver_open_kms+0x151/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1402 [amdgpu] >> > drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 >> > drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 >> > drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 >> > drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 >> > chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 >> > do_dentry_open+0x743/0x1bf0 fs/open.c:956 >> > vfs_open+0x87/0x3f0 fs/open.c:1086 >> > do_open+0x72f/0xf80 fs/namei.c:3830 >> > path_openat+0x2ec/0x770 fs/namei.c:3989 >> > do_filp_open+0x1ff/0x420 fs/namei.c:4016 >> > do_sys_openat2+0x181/0x1e0 fs/open.c:1428 >> > do_sys_open fs/open.c:1443 [inline] >> > __do_sys_openat fs/open.c:1459 [inline] >> > __se_sys_openat fs/open.c:1454 [inline] >> > __x64_sys_openat+0x149/0x210 fs/open.c:1454 >> > do_syscall_x64 arch/x86/entry/common.c:52 [inline] >> > do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 >> > entry_SYSCALL_64_after_hwframe+0x76/0x7e >> > Freed by task 12381: >> > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 >> > kasan_save_track+0x14/0x30 mm/kasan/common.c:68 >> > kasan_save_free_info+0x3b/0x70 mm/kasan/generic.c:576 >> > poison_slab_object mm/kasan/common.c:247 [inline] >> > __kasan_slab_free+0x52/0x70 mm/kasan/common.c:264 >> > kasan_slab_free include/linux/kasan.h:233 [inline] >> > slab_free_hook mm/slub.c:2353 [inline] >> > slab_free mm/slub.c:4609 [inline] >> > kfree+0x14f/0x4d0 mm/slub.c:4757 >> > amdgpu_driver_postclose_kms+0x43d/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1538 [amdgpu] >> > drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 >> > drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] >> > drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 >> > drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 >> > __fput+0x402/0xb50 fs/file_table.c:464 >> > task_work_run+0x155/0x250 kernel/task_work.c:227 >> > get_signal+0x1be/0x19d0 kernel/signal.c:2809 >> > arch_do_signal_or_restart+0x96/0x3a0 arch/x86/kernel/signal.c:337 >> > exit_to_user_mode_loop kernel/entry/common.c:111 [inline] >> > exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] >> > __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] >> > syscall_exit_to_user_mode+0x1fc/0x290 kernel/entry/common.c:218 >> > do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 >> > entry_SYSCALL_64_after_hwframe+0x76/0x7e >> > The buggy address belongs to the object at ffff88816414c000 >> > The buggy address is located 1488 bytes inside of >> > The buggy address belongs to the physical page: >> > page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x164148 >> > head: order:3 mapcount:0 entire_mapcount:0 
nr_pages_mapped:0 pincount:0 >> > flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) >> > page_type: f5(slab) >> > raw: 0017ffffc0000040 ffff88810005c8c0 dead000000000122 0000000000000000 >> > raw: 0000000000000000 0000000080020002 00000000f5000000 0000000000000000 >> > head: 0017ffffc0000040 ffff88810005c8c0 dead000000000122 0000000000000000 >> > head: 0000000000000000 0000000080020002 00000000f5000000 0000000000000000 >> > head: 0017ffffc0000003 ffffea0005905201 ffffffffffffffff 0000000000000000 >> > head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 >> > page dumped because: kasan: bad access detected >> > Memory state around the buggy address: >> > ffff88816414c480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> > ffff88816414c500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> > > ffff88816414c580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> > ^ >> > ffff88816414c600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> > ffff88816414c680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> > ================================================================== >> > ================================================================== >> > BUG: KASAN: slab-use-after-free in __rb_erase_augmented include/linux/rbtree_augmented.h:259 [inline] >> > BUG: KASAN: slab-use-after-free in rb_erase+0xf5d/0x1b10 lib/rbtree.c:443 >> > Read of size 8 at addr ffff88812ebcc5e0 by task syz.1.814/6553 >> > CPU: 0 UID: 65534 PID: 6553 Comm: syz.1.814 Not tainted 6.14.0-flowejam-+ #1 >> > Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024 >> > Call Trace: >> > <TASK> >> > __dump_stack lib/dump_stack.c:94 [inline] >> > dump_stack_lvl+0xd2/0x130 lib/dump_stack.c:120 >> > print_address_description.constprop.0+0x88/0x380 mm/kasan/report.c:408 >> > print_report+0xfc/0x1ff mm/kasan/report.c:521 >> > kasan_report+0xdd/0x1b0 mm/kasan/report.c:634 >> > __rb_erase_augmented include/linux/rbtree_augmented.h:259 [inline] >> > rb_erase+0xf5d/0x1b10 lib/rbtree.c:443 >> > rb_erase_cached include/linux/rbtree.h:126 [inline] [gpu_sched] >> > drm_sched_rq_remove_fifo_locked drivers/gpu/drm/scheduler/sched_main.c:154 [inline] [gpu_sched] >> > drm_sched_rq_remove_entity+0x2d3/0x480 drivers/gpu/drm/scheduler/sched_main.c:243 [gpu_sched] >> > drm_sched_entity_kill.part.0+0x82/0x5e0 drivers/gpu/drm/scheduler/sched_entity.c:237 [gpu_sched] >> > drm_sched_entity_kill drivers/gpu/drm/scheduler/sched_entity.c:232 [inline] [gpu_sched] >> > drm_sched_entity_fini+0x4c/0x290 drivers/gpu/drm/scheduler/sched_entity.c:331 [gpu_sched] >> > amdgpu_vm_fini_entities drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:529 [inline] [amdgpu] >> > amdgpu_vm_fini+0x862/0x1180 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:2752 [amdgpu] >> > amdgpu_driver_postclose_kms+0x3db/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1526 [amdgpu] >> > drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 >> > drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] >> > drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 >> > drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 >> > __fput+0x402/0xb50 fs/file_table.c:464 >> > task_work_run+0x155/0x250 kernel/task_work.c:227 >> > resume_user_mode_work include/linux/resume_user_mode.h:50 [inline] >> > exit_to_user_mode_loop kernel/entry/common.c:114 [inline] >> > exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] >> > __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] >> > syscall_exit_to_user_mode+0x26b/0x290 
kernel/entry/common.c:218 >> > do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 >> > entry_SYSCALL_64_after_hwframe+0x76/0x7e >> > RIP: 0033:0x7fd23eba36ed >> > Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 >> > RSP: 002b:00007ffc2943a358 EFLAGS: 00000246 ORIG_RAX: 00000000000001b4 >> > RAX: 0000000000000000 RBX: 00007ffc2943a428 RCX: 00007fd23eba36ed >> > RDX: 0000000000000000 RSI: 000000000000001e RDI: 0000000000000003 >> > RBP: 00007fd23ede7ba0 R08: 0000000000000001 R09: 0000000c00000000 >> > R10: 00007fd23ea00000 R11: 0000000000000246 R12: 00007fd23ede5fac >> > R13: 00007fd23ede5fa0 R14: 0000000000059ad1 R15: 0000000000059a8e >> > </TASK> >> > Allocated by task 6559: >> > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 >> > kasan_save_track+0x14/0x30 mm/kasan/common.c:68 >> > poison_kmalloc_redzone mm/kasan/common.c:377 [inline] >> > __kasan_kmalloc+0x9a/0xb0 mm/kasan/common.c:394 >> > kmalloc_noprof include/linux/slab.h:901 [inline] [amdgpu] >> > kzalloc_noprof include/linux/slab.h:1037 [inline] [amdgpu] >> > amdgpu_driver_open_kms+0x151/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1402 [amdgpu] >> > drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 >> > drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 >> > drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 >> > drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 >> > chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 >> > do_dentry_open+0x743/0x1bf0 fs/open.c:956 >> > vfs_open+0x87/0x3f0 fs/open.c:1086 >> > do_open+0x72f/0xf80 fs/namei.c:3830 >> > path_openat+0x2ec/0x770 fs/namei.c:3989 >> > do_filp_open+0x1ff/0x420 fs/namei.c:4016 >> > do_sys_openat2+0x181/0x1e0 fs/open.c:1428 >> > do_sys_open fs/open.c:1443 [inline] >> > __do_sys_openat fs/open.c:1459 [inline] >> > __se_sys_openat fs/open.c:1454 [inline] >> > __x64_sys_openat+0x149/0x210 fs/open.c:1454 >> > do_syscall_x64 arch/x86/entry/common.c:52 [inline] >> > do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 >> > entry_SYSCALL_64_after_hwframe+0x76/0x7e >> > Freed by task 6559: >> > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 >> > kasan_save_track+0x14/0x30 mm/kasan/common.c:68 >> > kasan_save_free_info+0x3b/0x70 mm/kasan/generic.c:576 >> > poison_slab_object mm/kasan/common.c:247 [inline] >> > __kasan_slab_free+0x52/0x70 mm/kasan/common.c:264 >> > kasan_slab_free include/linux/kasan.h:233 [inline] >> > slab_free_hook mm/slub.c:2353 [inline] >> > slab_free mm/slub.c:4609 [inline] >> > kfree+0x14f/0x4d0 mm/slub.c:4757 >> > amdgpu_driver_postclose_kms+0x43d/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1538 [amdgpu] >> > drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 >> > drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] >> > drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 >> > drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 >> > __fput+0x402/0xb50 fs/file_table.c:464 >> > task_work_run+0x155/0x250 kernel/task_work.c:227 >> > get_signal+0x1be/0x19d0 kernel/signal.c:2809 >> > arch_do_signal_or_restart+0x96/0x3a0 arch/x86/kernel/signal.c:337 >> > exit_to_user_mode_loop kernel/entry/common.c:111 [inline] >> > exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] >> > __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] >> > syscall_exit_to_user_mode+0x1fc/0x290 kernel/entry/common.c:218 >> > do_syscall_64+0x9f/0x180 
arch/x86/entry/common.c:89 >> > entry_SYSCALL_64_after_hwframe+0x76/0x7e >> > The buggy address belongs to the object at ffff88812ebcc000 >> > The buggy address is located 1504 bytes inside of >> > The buggy address belongs to the physical page: >> > page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12ebc8 >> > head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 >> > flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) >> > page_type: f5(slab) >> > raw: 0017ffffc0000040 ffff888100058780 dead000000000122 0000000000000000 >> > raw: 0000000000000000 0000000000020002 00000000f5000000 0000000000000000 >> > head: 0017ffffc0000040 ffff888100058780 dead000000000122 0000000000000000 >> > head: 0000000000000000 0000000000020002 00000000f5000000 0000000000000000 >> > head: 0017ffffc0000003 ffffea0004baf201 ffffffffffffffff 0000000000000000 >> > head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 >> > page dumped because: kasan: bad access detected >> > Memory state around the buggy address: >> > ffff88812ebcc480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> > ffff88812ebcc500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> > > ffff88812ebcc580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> > ^ >> > ffff88812ebcc600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> > ffff88812ebcc680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> > ================================================================== >> > ================================================================== >> > BUG: KASAN: slab-use-after-free in drm_sched_entity_compare_before drivers/gpu/drm/scheduler/sched_main.c:147 [inline] [gpu_sched] >> > BUG: KASAN: slab-use-after-free in rb_add_cached include/linux/rbtree.h:174 [inline] [gpu_sched] >> > BUG: KASAN: slab-use-after-free in drm_sched_rq_update_fifo_locked+0x47b/0x540 drivers/gpu/drm/scheduler/sched_main.c:175 [gpu_sched] >> > Read of size 8 at addr ffff8881208445c8 by task syz.1.49115/146644 >> > CPU: 7 UID: 65534 PID: 146644 Comm: syz.1.49115 Not tainted 6.14.0-flowejam-+ #1 >> > Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024 >> > Call Trace: >> > <TASK> >> > __dump_stack lib/dump_stack.c:94 [inline] >> > dump_stack_lvl+0xd2/0x130 lib/dump_stack.c:120 >> > print_address_description.constprop.0+0x88/0x380 mm/kasan/report.c:408 >> > print_report+0xfc/0x1ff mm/kasan/report.c:521 >> > kasan_report+0xdd/0x1b0 mm/kasan/report.c:634 >> > drm_sched_entity_compare_before drivers/gpu/drm/scheduler/sched_main.c:147 [inline] [gpu_sched] >> > rb_add_cached include/linux/rbtree.h:174 [inline] [gpu_sched] >> > drm_sched_rq_update_fifo_locked+0x47b/0x540 drivers/gpu/drm/scheduler/sched_main.c:175 [gpu_sched] >> > drm_sched_entity_push_job+0x509/0x5d0 drivers/gpu/drm/scheduler/sched_entity.c:623 [gpu_sched] >> >> This might be a race between entity killing and the push_job. 
Let's >> look at your patch below… >> >> > amdgpu_job_submit+0x1a4/0x270 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c:314 [amdgpu] >> > amdgpu_vm_sdma_commit+0x1f9/0x7d0 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c:122 [amdgpu] >> > amdgpu_vm_pt_clear+0x540/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c:422 [amdgpu] >> > amdgpu_vm_init+0x9c2/0x12f0 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:2609 [amdgpu] >> > amdgpu_driver_open_kms+0x274/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1418 [amdgpu] >> > drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 >> > drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 >> > drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 >> > drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 >> > chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 >> > do_dentry_open+0x743/0x1bf0 fs/open.c:956 >> > vfs_open+0x87/0x3f0 fs/open.c:1086 >> > do_open+0x72f/0xf80 fs/namei.c:3830 >> > path_openat+0x2ec/0x770 fs/namei.c:3989 >> > do_filp_open+0x1ff/0x420 fs/namei.c:4016 >> > do_sys_openat2+0x181/0x1e0 fs/open.c:1428 >> > do_sys_open fs/open.c:1443 [inline] >> > __do_sys_openat fs/open.c:1459 [inline] >> > __se_sys_openat fs/open.c:1454 [inline] >> > __x64_sys_openat+0x149/0x210 fs/open.c:1454 >> > do_syscall_x64 arch/x86/entry/common.c:52 [inline] >> > do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 >> > entry_SYSCALL_64_after_hwframe+0x76/0x7e >> > RIP: 0033:0x7feb303a36ed >> > Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 >> > RSP: 002b:00007feb3123c018 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 >> > RAX: ffffffffffffffda RBX: 00007feb305e5fa0 RCX: 00007feb303a36ed >> > RDX: 0000000000000002 RSI: 0000200000000140 RDI: ffffffffffffff9c >> > RBP: 00007feb30447722 R08: 0000000000000000 R09: 0000000000000000 >> > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 >> > R13: 0000000000000001 R14: 00007feb305e5fa0 R15: 00007ffcfd0a3460 >> > </TASK> >> > Allocated by task 146638: >> > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 >> > kasan_save_track+0x14/0x30 mm/kasan/common.c:68 >> > poison_kmalloc_redzone mm/kasan/common.c:377 [inline] >> > __kasan_kmalloc+0x9a/0xb0 mm/kasan/common.c:394 >> > kmalloc_noprof include/linux/slab.h:901 [inline] [amdgpu] >> > kzalloc_noprof include/linux/slab.h:1037 [inline] [amdgpu] >> > amdgpu_driver_open_kms+0x151/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1402 [amdgpu] >> > drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 >> > drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 >> > drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 >> > drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 >> > chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 >> > do_dentry_open+0x743/0x1bf0 fs/open.c:956 >> > vfs_open+0x87/0x3f0 fs/open.c:1086 >> > do_open+0x72f/0xf80 fs/namei.c:3830 >> > path_openat+0x2ec/0x770 fs/namei.c:3989 >> > do_filp_open+0x1ff/0x420 fs/namei.c:4016 >> > do_sys_openat2+0x181/0x1e0 fs/open.c:1428 >> > do_sys_open fs/open.c:1443 [inline] >> > __do_sys_openat fs/open.c:1459 [inline] >> > __se_sys_openat fs/open.c:1454 [inline] >> > __x64_sys_openat+0x149/0x210 fs/open.c:1454 >> > do_syscall_x64 arch/x86/entry/common.c:52 [inline] >> > do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 >> > entry_SYSCALL_64_after_hwframe+0x76/0x7e >> > Freed by task 146638: >> > kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 >> > kasan_save_track+0x14/0x30 
mm/kasan/common.c:68 >> > kasan_save_free_info+0x3b/0x70 mm/kasan/generic.c:576 >> > poison_slab_object mm/kasan/common.c:247 [inline] >> > __kasan_slab_free+0x52/0x70 mm/kasan/common.c:264 >> > kasan_slab_free include/linux/kasan.h:233 [inline] >> > slab_free_hook mm/slub.c:2353 [inline] >> > slab_free mm/slub.c:4609 [inline] >> > kfree+0x14f/0x4d0 mm/slub.c:4757 >> > amdgpu_driver_postclose_kms+0x43d/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1538 [amdgpu] >> > drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 >> > drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] >> > drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 >> > drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 >> > __fput+0x402/0xb50 fs/file_table.c:464 >> > task_work_run+0x155/0x250 kernel/task_work.c:227 >> > resume_user_mode_work include/linux/resume_user_mode.h:50 [inline] >> > exit_to_user_mode_loop kernel/entry/common.c:114 [inline] >> > exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] >> > __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] >> > syscall_exit_to_user_mode+0x26b/0x290 kernel/entry/common.c:218 >> > do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 >> > entry_SYSCALL_64_after_hwframe+0x76/0x7e >> > The buggy address belongs to the object at ffff888120844000 >> > The buggy address is located 1480 bytes inside of >> > The buggy address belongs to the physical page: >> > page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x120840 >> > head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 >> > flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) >> > page_type: f5(slab) >> > raw: 0017ffffc0000040 ffff88810005c8c0 ffffea0005744c00 dead000000000002 >> > raw: 0000000000000000 0000000000020002 00000000f5000000 0000000000000000 >> > head: 0017ffffc0000040 ffff88810005c8c0 ffffea0005744c00 dead000000000002 >> > head: 0000000000000000 0000000000020002 00000000f5000000 0000000000000000 >> > head: 0017ffffc0000003 ffffea0004821001 ffffffffffffffff 0000000000000000 >> > head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 >> > page dumped because: kasan: bad access detected >> > Memory state around the buggy address: >> > ffff888120844480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> > ffff888120844500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> > > ffff888120844580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> > ^ >> > ffff888120844600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> > ffff888120844680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> > ================================================================== >> > >> > drivers/gpu/drm/scheduler/sched_main.c | 6 ++++-- >> > 1 file changed, 4 insertions(+), 2 deletions(-) >> > >> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c >> > index bfea608a7106..997a2cc1a635 100644 >> > --- a/drivers/gpu/drm/scheduler/sched_main.c >> > +++ b/drivers/gpu/drm/scheduler/sched_main.c >> > @@ -172,8 +172,10 @@ void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity, >> > >> > entity->oldest_job_waiting = ts; >> > >> > - rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, >> > - drm_sched_entity_compare_before); >> > + if (!entity->stopped) { >> > + rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, >> > + drm_sched_entity_compare_before); >> > + } >> >> If this is a race, then this patch here is broken, too, because you're >> checking the 
'stopped' boolean as the callers of that function do, too >> – just later. :O >> >> Could still race, just less likely. >> >> The proper way to fix it would then be to address the issue where the >> locking is supposed to happen. Let's look at, for example, >> drm_sched_entity_push_job(): >> >> >> void drm_sched_entity_push_job(struct drm_sched_job *sched_job) >> { >> (Bla bla bla) >> >> ………… >> >> /* first job wakes up scheduler */ >> if (first) { >> struct drm_gpu_scheduler *sched; >> struct drm_sched_rq *rq; >> >> /* Add the entity to the run queue */ >> spin_lock(&entity->lock); >> if (entity->stopped) { <---- Aha! >> spin_unlock(&entity->lock); >> >> DRM_ERROR("Trying to push to a killed entity\n"); >> return; >> } >> >> rq = entity->rq; >> sched = rq->sched; >> >> spin_lock(&rq->lock); >> drm_sched_rq_add_entity(rq, entity); >> >> if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) >> drm_sched_rq_update_fifo_locked(entity, rq, submit_ts); <---- bumm! >> >> spin_unlock(&rq->lock); >> spin_unlock(&entity->lock); >> >> But the locks are still being hold. So that "shouldn't be happening"(tm). >> >> Interesting. AFAICS only drm_sched_entity_kill() and drm_sched_fini() >> stop entities. The former holds appropriate locks, but drm_sched_fini() >> doesn't. So that looks like a hot candidate to me. Opinions? >> >> On the other hand, aren't drivers prohibited from calling >> drm_sched_entity_push_job() after calling drm_sched_fini()? If the >> fuzzer does that, then it's not the scheduler's fault. >> >> Could you test adding spin_lock(&entity->lock) to drm_sched_fini()? > > Ah no, forget about that. > > In drm_sched_fini(), you'd have to take the locks in reverse order as > in drm_sched_entity_push/pop_job(), thereby replacing race with > deadlock. > > I suspect that this is an issue in amdgpu. But let's wait for > Christian. > > > P. > > >> >> Would be cool if Tvrtko and Christian take a look. Maybe we even have a >> fundamental design issue. >> >> >> Regards >> P. >> >> >> > } >> > >> > /** >> Thanks for taking a look at this. I did try to get a reproducer using syzkaller, without success. I can attempt it myself but I expect it will take me some time, if I'm able to at all with this bug. I did run some of the igt-gpu-tools tests (amdgpu and drm ones), and there was no difference after the changes on my system. After this change I wasn't running into the UAF errors after 100k+ executions but I see what you mean, Philipp - perhaps it's missing the root issue. FYI, as an experiment I forced the use of RR with "drm_sched_policy = DRM_SCHED_POLICY_RR", and I'm not seeing any slab-use-after-frees, so maybe the problem is with the FIFO implementation? For now, the closest thing to a reproducer I can provide is my syzkaller config, in case anyone else is able to try this with a Steam Deck OLED. I've included this below along with an example program run by syzkaller (in generated C code and a Syz language version). 
--------------------------------------------------- { "target": "linux/amd64", "http": "127.0.0.1:56741", "sshkey" : "/path", "workdir": "/path", "kernel_obj": "/path", "kernel_src": "/path", "syzkaller": "/path", "sandbox": "setuid", "type": "isolated", "enable_syscalls": ["openat$drirender128", "ioctl$DRM_*", "close"], "disable_syscalls": ["ioctl$DRM_IOCTL_SYNCOBJ_*"], "reproduce": false, "vm": { "targets" : [ "10.0.0.1" ], "pstore": false, "target_dir" : "/path", "target_reboot" : true } } --------------------------------------------------- Generated C program: // autogenerated by syzkaller (https://github.com/google/syzkaller) #define _GNU_SOURCE #include <endian.h> #include <stdint.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <sys/syscall.h> #include <sys/types.h> #include <unistd.h> uint64_t r[15] = {0xffffffffffffffff, 0xffffffffffffffff, 0xffffffffffffffff, 0xffffffffffffffff, 0xffffffffffffffff, 0x0, 0xffffffffffffffff, 0x0, 0x0, 0xffffffffffffffff, 0x0, 0x0, 0xffffffffffffffff, 0xffffffffffffffff, 0xffffffffffffffff}; int main(void) { syscall(__NR_mmap, /*addr=*/0x1ffffffff000ul, /*len=*/0x1000ul, /*prot=*/0ul, /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/0x32ul, /*fd=*/(intptr_t)-1, /*offset=*/0ul); syscall(__NR_mmap, /*addr=*/0x200000000000ul, /*len=*/0x1000000ul, /*prot=PROT_WRITE|PROT_READ|PROT_EXEC*/7ul, /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/0x32ul, /*fd=*/(intptr_t)-1, /*offset=*/0ul); syscall(__NR_mmap, /*addr=*/0x200001000000ul, /*len=*/0x1000ul, /*prot=*/0ul, /*flags=MAP_FIXED|MAP_ANONYMOUS|MAP_PRIVATE*/0x32ul, /*fd=*/(intptr_t)-1, /*offset=*/0ul); const char* reason; (void)reason; intptr_t res = 0; if (write(1, "executing program\n", sizeof("executing program\n") - 1)) {} memcpy((void*)0x200000000200, "/dev/dri/renderD128\000", 20); res = syscall(__NR_openat, /*fd=*/0xffffffffffffff9cul, /*file=*/0x200000000200ul, /*flags=O_SYNC*/0x101000, /*mode=*/0); if (res != -1) r[0] = res; *(uint32_t*)0x200000002440 = 0; *(uint32_t*)0x200000002444 = 0x80000; syscall(__NR_ioctl, /*fd=*/r[0], /*cmd=*/0xc00c642d, /*arg=*/0x200000002440ul); memcpy((void*)0x2000000001c0, "/dev/dri/renderD128\000", 20); res = syscall(__NR_openat, /*fd=*/0xffffffffffffff9cul, /*file=*/0x2000000001c0ul, /*flags=O_NOFOLLOW|O_CREAT|O_CLOEXEC*/0xa0040, /*mode=*/0); if (res != -1) r[1] = res; syscall(__NR_ioctl, /*fd=*/r[1], /*cmd=*/0x4b47, /*arg=*/0ul); memcpy((void*)0x200000000300, "/dev/dri/renderD128\000", 20); res = syscall(__NR_openat, /*fd=*/0xffffffffffffff9cul, /*file=*/0x200000000300ul, /*flags=O_SYNC|O_NONBLOCK|O_LARGEFILE*/0x109800, /*mode=*/0); if (res != -1) r[2] = res; memcpy((void*)0x200000000100, "/dev/dri/renderD128\000", 20); res = syscall(__NR_openat, /*fd=*/0xffffffffffffff9cul, /*file=*/0x200000000100ul, /*flags=*/0, /*mode=*/0); if (res != -1) r[3] = res; syscall(__NR_ioctl, /*fd=*/r[3], /*cmd=*/0x80f86406, /*arg=*/0ul); memcpy((void*)0x200000000440, "/dev/dri/renderD128\000", 20); res = syscall(__NR_openat, /*fd=*/0xffffffffffffff9cul, /*file=*/0x200000000440ul, /*flags=O_NONBLOCK|O_CREAT*/0x840, /*mode=*/0); if (res != -1) r[4] = res; *(uint64_t*)0x200000000fc0 = 0; *(uint32_t*)0x200000000fc8 = 0; *(uint32_t*)0x200000000fcc = 0; res = syscall(__NR_ioctl, /*fd=*/r[4], /*cmd=*/0xc06864a1, /*arg=*/0x200000000fc0ul); if (res != -1) r[5] = *(uint32_t*)0x200000000fd0; syscall(__NR_close, /*fd=*/r[2]); *(uint32_t*)0x200000000000 = 0; *(uint32_t*)0x200000000004 = 0; syscall(__NR_ioctl, /*fd=*/r[4], /*cmd=*/0xc00c642d, /*arg=*/0x200000000000ul); 
*(uint32_t*)0x200000000040 = 0; *(uint32_t*)0x200000000044 = 0; res = syscall(__NR_ioctl, /*fd=*/r[3], /*cmd=*/0xc00c642d, /*arg=*/0x200000000040ul); if (res != -1) r[6] = *(uint32_t*)0x200000000048; *(uint32_t*)0x200000000080 = 0; res = syscall(__NR_ioctl, /*fd=*/r[2], /*cmd=*/0xc010640b, /*arg=*/0x200000000080ul); if (res != -1) r[7] = *(uint32_t*)0x200000000084; *(uint32_t*)0x2000000000c0 = 0; res = syscall(__NR_ioctl, /*fd=*/r[4], /*cmd=*/0xc010640b, /*arg=*/0x2000000000c0ul); if (res != -1) r[8] = *(uint32_t*)0x2000000000c4; memcpy((void*)0x2000000001c0, "/dev/dri/renderD128\000", 20); res = syscall(__NR_openat, /*fd=*/0xffffffffffffff9cul, /*file=*/0x2000000001c0ul, /*flags=O_NOFOLLOW|O_CREAT|O_CLOEXEC*/0xa0040, /*mode=*/0); if (res != -1) r[9] = res; syscall(__NR_ioctl, /*fd=*/r[9], /*cmd=*/0x5421, /*arg=*/0ul); *(uint32_t*)0x200000000180 = r[5]; *(uint32_t*)0x200000000184 = 0x2534dd8; *(uint32_t*)0x200000000188 = 7; *(uint32_t*)0x20000000018c = 3; *(uint32_t*)0x200000000190 = 2; *(uint32_t*)0x2000000001a4 = 0x97; *(uint32_t*)0x2000000001a8 = 0x74c83423; *(uint32_t*)0x2000000001ac = 4; *(uint32_t*)0x2000000001b0 = 8; *(uint32_t*)0x2000000001b4 = 6; *(uint32_t*)0x2000000001b8 = 0x7f; *(uint32_t*)0x2000000001bc = 0; *(uint32_t*)0x2000000001c0 = 9; *(uint64_t*)0x2000000001c8 = 3; *(uint64_t*)0x2000000001d0 = 0; *(uint64_t*)0x2000000001d8 = 1; *(uint64_t*)0x2000000001e0 = 1; res = syscall(__NR_ioctl, /*fd=*/r[3], /*cmd=*/0xc06864ce, /*arg=*/0x200000000180ul); if (res != -1) r[10] = *(uint32_t*)0x200000000198; *(uint32_t*)0x200000000200 = r[5]; *(uint32_t*)0x200000000204 = 1; *(uint32_t*)0x200000000208 = 1; *(uint32_t*)0x20000000020c = 0; *(uint32_t*)0x200000000210 = 1; *(uint32_t*)0x200000000214 = r[7]; *(uint32_t*)0x200000000218 = r[8]; *(uint32_t*)0x20000000021c = 0; *(uint32_t*)0x200000000220 = r[10]; *(uint32_t*)0x200000000224 = 9; *(uint32_t*)0x200000000228 = 7; *(uint32_t*)0x20000000022c = 2; *(uint32_t*)0x200000000230 = 2; *(uint32_t*)0x200000000234 = 0x400; *(uint32_t*)0x200000000238 = 0x367; *(uint32_t*)0x20000000023c = 7; *(uint32_t*)0x200000000240 = 8; *(uint64_t*)0x200000000248 = 0x3e; *(uint64_t*)0x200000000250 = 3; *(uint64_t*)0x200000000258 = 9; *(uint64_t*)0x200000000260 = 6; syscall(__NR_ioctl, /*fd=*/r[6], /*cmd=*/0xc06864b8, /*arg=*/0x200000000200ul); res = syscall(__NR_ioctl, /*fd=*/r[6], /*cmd=*/0xc0086420, /*arg=*/0x200000000140ul); if (res != -1) r[11] = *(uint32_t*)0x200000000140; *(uint32_t*)0x200000000280 = r[11]; *(uint32_t*)0x200000000284 = 0x26; syscall(__NR_ioctl, /*fd=*/r[3], /*cmd=*/0x4008642a, /*arg=*/0x200000000280ul); memcpy((void*)0x200000000300, "/dev/dri/renderD128\000", 20); res = syscall(__NR_openat, /*fd=*/0xffffffffffffff9cul, /*file=*/0x200000000300ul, /*flags=O_SYNC|O_NONBLOCK|O_LARGEFILE*/0x109800, /*mode=*/0); if (res != -1) r[12] = res; memcpy((void*)0x200000000100, "/dev/dri/renderD128\000", 20); res = syscall(__NR_openat, /*fd=*/0xffffffffffffff9cul, /*file=*/0x200000000100ul, /*flags=*/0, /*mode=*/0); if (res != -1) r[13] = res; syscall(__NR_ioctl, /*fd=*/r[13], /*cmd=*/0x80f86406, /*arg=*/0ul); memcpy((void*)0x200000000440, "/dev/dri/renderD128\000", 20); res = syscall(__NR_openat, /*fd=*/0xffffffffffffff9cul, /*file=*/0x200000000440ul, /*flags=O_NONBLOCK|O_CREAT*/0x840, /*mode=*/0); if (res != -1) r[14] = res; *(uint64_t*)0x200000000fc0 = 0; *(uint32_t*)0x200000000fc8 = 0; *(uint32_t*)0x200000000fcc = 0; syscall(__NR_ioctl, /*fd=*/r[14], /*cmd=*/0xc06864a1, /*arg=*/0x200000000fc0ul); syscall(__NR_close, /*fd=*/r[12]); 
*(uint32_t*)0x200000000000 = 0; *(uint32_t*)0x200000000004 = 0; syscall(__NR_ioctl, /*fd=*/r[14], /*cmd=*/0xc00c642d, /*arg=*/0x200000000000ul); *(uint32_t*)0x200000000040 = 0; *(uint32_t*)0x200000000044 = 0; syscall(__NR_ioctl, /*fd=*/r[13], /*cmd=*/0xc00c642d, /*arg=*/0x200000000040ul); *(uint32_t*)0x200000000080 = 0; syscall(__NR_ioctl, /*fd=*/r[12], /*cmd=*/0xc010640b, /*arg=*/0x200000000080ul); *(uint32_t*)0x2000000000c0 = 0; syscall(__NR_ioctl, /*fd=*/r[14], /*cmd=*/0xc010640b, /*arg=*/0x2000000000c0ul); return 0; } --------------------------------------------------- Syzkaller program (Syz language): r0 = openat$drirender128(0xffffffffffffff9c, &(0x7f0000000200), 0x101000, 0x0) ioctl$DRM_IOCTL_PRIME_HANDLE_TO_FD(r0, 0xc00c642d, &(0x7f0000002440)={0x0, 0x80000}) r1 = openat$drirender128(0xffffffffffffff9c, &(0x7f00000001c0), 0xa0040, 0x0) ioctl$DRM_IOCTL_RES_CTX(r1, 0x4b47, 0x0) r2 = openat$drirender128(0xffffffffffffff9c, &(0x7f0000000300), 0x109800, 0x0) r3 = openat$drirender128(0xffffffffffffff9c, &(0x7f0000000100), 0x0, 0x0) ioctl$DRM_IOCTL_GET_STATS(r3, 0x80f86406, 0x0) r4 = openat$drirender128(0xffffffffffffff9c, &(0x7f0000000440), 0x840, 0x0) ioctl$DRM_IOCTL_MODE_GETCRTC(r4, 0xc06864a1, &(0x7f0000000fc0)={0x0, 0x0, 0x0, <r5=>0x0}) close(r2) ioctl$DRM_IOCTL_PRIME_HANDLE_TO_FD(r4, 0xc00c642d, &(0x7f0000000000)) ioctl$DRM_IOCTL_PRIME_HANDLE_TO_FD(r3, 0xc00c642d, &(0x7f0000000040)={0x0, 0x0, <r6=>0xffffffffffffffff}) ioctl$DRM_IOCTL_GEM_OPEN(r2, 0xc010640b, &(0x7f0000000080)={0x0, <r7=>0x0}) ioctl$DRM_IOCTL_GEM_OPEN(r4, 0xc010640b, &(0x7f00000000c0)={0x0, <r8=>0x0}) r9 = openat$drirender128(0xffffffffffffff9c, &(0x7f00000001c0), 0xa0040, 0x0) ioctl$DRM_IOCTL_RES_CTX(r9, 0x5421, 0x0) ioctl$DRM_IOCTL_MODE_GETFB2(r3, 0xc06864ce, &(0x7f0000000180)={r5, 0x2534dd8, 0x7, 0x3, 0x2, [0x0, <r10=>0x0], [0x97, 0x74c83423, 0x4, 0x8], [0x6, 0x7f, 0x0, 0x9], [0x3, 0x0, 0x1, 0x1]}) ioctl$DRM_IOCTL_MODE_ADDFB2(r6, 0xc06864b8, &(0x7f0000000200)={r5, 0x1, 0x1, 0x0, 0x1, [r7, r8, 0x0, r10], [0x9, 0x7, 0x2, 0x2], [0x400, 0x367, 0x7, 0x8], [0x3e, 0x3, 0x9, 0x6]}) ioctl$DRM_IOCTL_ADD_CTX(r6, 0xc0086420, &(0x7f0000000140)={<r11=>0x0}) ioctl$DRM_IOCTL_LOCK(r3, 0x4008642a, &(0x7f0000000280)={r11, 0x26}) r12 = openat$drirender128(0xffffffffffffff9c, &(0x7f0000000300), 0x109800, 0x0) r13 = openat$drirender128(0xffffffffffffff9c, &(0x7f0000000100), 0x0, 0x0) ioctl$DRM_IOCTL_GET_STATS(r13, 0x80f86406, 0x0) r14 = openat$drirender128(0xffffffffffffff9c, &(0x7f0000000440), 0x840, 0x0) ioctl$DRM_IOCTL_MODE_GETCRTC(r14, 0xc06864a1, &(0x7f0000000fc0)={0x0}) close(r12) ioctl$DRM_IOCTL_PRIME_HANDLE_TO_FD(r14, 0xc00c642d, &(0x7f0000000000)) ioctl$DRM_IOCTL_PRIME_HANDLE_TO_FD(r13, 0xc00c642d, &(0x7f0000000040)) ioctl$DRM_IOCTL_GEM_OPEN(r12, 0xc010640b, &(0x7f0000000080)) ioctl$DRM_IOCTL_GEM_OPEN(r14, 0xc010640b, &(0x7f00000000c0)) ^ permalink raw reply [flat|nested] 17+ messages in thread
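For readers skimming the thread, the four KASAN reports quoted above share one shape. The summary below is an editor's sketch of that shape written as comments; it hypothesises the interleaving rather than stating a confirmed sequence, and it only uses functions and objects that already appear in the traces:

	/*
	 * Common shape of the KASAN reports in this thread (a hypothesis drawn
	 * from the traces above, not a confirmed sequence):
	 *
	 *   open()  -> amdgpu_driver_open_kms(): kzalloc() of the per-file fpriv;
	 *              the scheduler entities used for VM updates live inside it.
	 *   close() -> amdgpu_driver_postclose_kms(): kfree() of that fpriv.
	 *
	 *   Concurrently, a run queue's FIFO rb-tree still reaches an
	 *   rb_tree_node embedded in the freed fpriv, so rb_next() (run-job
	 *   worker), rb_erase() (entity kill) and rb_add_cached() (push_job)
	 *   can dereference freed memory - the three access sites flagged above.
	 */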
* Re: [PATCH] drm/sched: Prevent stopped entities from being added to the run queue. 2025-07-22 20:05 ` James @ 2025-07-23 14:41 ` Philipp Stanner 0 siblings, 0 replies; 17+ messages in thread From: Philipp Stanner @ 2025-07-23 14:41 UTC (permalink / raw) To: James, phasta, matthew.brost, dakr, Christian König, maarten.lankhorst, mripard, tzimmermann, airlied, simona, Shuah Khan Cc: dri-devel, linux-kernel, linux-kernel-mentees, Tvrtko Ursulin Hello, On Tue, 2025-07-22 at 13:05 -0700, James wrote: > On Mon, Jul 21, 2025, at 1:16 AM, Philipp Stanner wrote: > > On Mon, 2025-07-21 at 09:52 +0200, Philipp Stanner wrote: > > > +Cc Tvrtko, who's currently reworking FIFO and RR. > > > > > > On Sun, 2025-07-20 at 16:56 -0700, James Flowers wrote: > > > > Fixes an issue where entities are added to the run queue in > > > > drm_sched_rq_update_fifo_locked after being killed, causing a > > > > slab-use-after-free error. > > > > > > > > Signed-off-by: James Flowers <bold.zone2373@fastmail.com> > > > > --- > > > > This issue was detected by syzkaller running on a Steam Deck OLED. > > > > Unfortunately I don't have a reproducer for it. I've > > > > > > Well, now that's kind of an issue – if you don't have a reproducer, how > > > can you know that your patch is correct? How can we? > > > > > > It would certainly be good to know what the fuzz testing framework > > > does. > > > > > > > included the KASAN reports below: > > > > > > > > > Anyways, KASAN reports look interesting. But those might be many > > > different issues. Again, would be good to know what the fuzzer has been > > > testing. Can you maybe split this fuzz test into sub-tests? I suspsect > > > those might be different faults. > > > > > > > > > Anyways, taking a first look… > > > > > > > > > > > > > > ================================================================== > > > > BUG: KASAN: slab-use-after-free in rb_next+0xda/0x160 lib/rbtree.c:505 > > > > Read of size 8 at addr ffff8881805085e0 by task kworker/u32:12/192 [SNIP] > > > > > > > > drivers/gpu/drm/scheduler/sched_main.c | 6 ++++-- > > > > 1 file changed, 4 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c > > > > index bfea608a7106..997a2cc1a635 100644 > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c > > > > @@ -172,8 +172,10 @@ void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity, > > > > > > > > entity->oldest_job_waiting = ts; > > > > > > > > - rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, > > > > - drm_sched_entity_compare_before); > > > > + if (!entity->stopped) { > > > > + rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, > > > > + drm_sched_entity_compare_before); > > > > + } > > > > > > If this is a race, then this patch here is broken, too, because you're > > > checking the 'stopped' boolean as the callers of that function do, too > > > – just later. :O > > > > > > Could still race, just less likely. > > > > > > The proper way to fix it would then be to address the issue where the > > > locking is supposed to happen. 
Let's look at, for example, > > > drm_sched_entity_push_job(): > > > > > > > > > void drm_sched_entity_push_job(struct drm_sched_job *sched_job) > > > { > > > (Bla bla bla) > > > > > > ………… > > > > > > /* first job wakes up scheduler */ > > > if (first) { > > > struct drm_gpu_scheduler *sched; > > > struct drm_sched_rq *rq; > > > > > > /* Add the entity to the run queue */ > > > spin_lock(&entity->lock); > > > if (entity->stopped) { <---- Aha! > > > spin_unlock(&entity->lock); > > > > > > DRM_ERROR("Trying to push to a killed entity\n"); > > > return; > > > } > > > > > > rq = entity->rq; > > > sched = rq->sched; > > > > > > spin_lock(&rq->lock); > > > drm_sched_rq_add_entity(rq, entity); > > > > > > if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) > > > drm_sched_rq_update_fifo_locked(entity, rq, submit_ts); <---- bumm! > > > > > > spin_unlock(&rq->lock); > > > spin_unlock(&entity->lock); > > > > > > But the locks are still being hold. So that "shouldn't be happening"(tm). > > > > > > Interesting. AFAICS only drm_sched_entity_kill() and drm_sched_fini() > > > stop entities. The former holds appropriate locks, but drm_sched_fini() > > > doesn't. So that looks like a hot candidate to me. Opinions? > > > > > > On the other hand, aren't drivers prohibited from calling > > > drm_sched_entity_push_job() after calling drm_sched_fini()? If the > > > fuzzer does that, then it's not the scheduler's fault. > > > > > > Could you test adding spin_lock(&entity->lock) to drm_sched_fini()? > > > > Ah no, forget about that. > > > > In drm_sched_fini(), you'd have to take the locks in reverse order as > > in drm_sched_entity_push/pop_job(), thereby replacing race with > > deadlock. > > > > I suspect that this is an issue in amdgpu. But let's wait for > > Christian. > > > > > > P. > > > > > > > > > > Would be cool if Tvrtko and Christian take a look. Maybe we even have a > > > fundamental design issue. > > > > > > > > > Regards > > > P. > > > > > > > > > > } > > > > > > > > /** > > > > > Thanks for taking a look at this. I did try to get a reproducer using syzkaller, without success. I can attempt it myself but I expect it will take me some time, if I'm able to at all with this bug. I did run some of the igt-gpu-tools tests (amdgpu and drm ones), and there was no difference after the changes on my system. After this change I wasn't running into the UAF errors after 100k+ executions but I see what you mean, Philipp - perhaps it's missing the root issue. > > FYI, as an experiment I forced the use of RR with "drm_sched_policy = DRM_SCHED_POLICY_RR", and I'm not seeing any slab-use-after-frees, so maybe the problem is with the FIFO implementation? I can't imagine that. The issue your encountering is most likely a race caused by the driver tearing down entities after the scheduler, so different scheduler runtime behavior might hide ("fix") the race (that's the nature of races, actually: sometimes they're there, sometimes not). RR running with different time patterns than FIFO doesn't mean that FIFO has a bug. > > For now, the closest thing to a reproducer I can provide is my syzkaller config, in case anyone else is able to try this with a Steam Deck OLED. I've included this below along with an example program run by syzkaller (in generated C code and a Syz language version). Thanks for investigating this. My recommendation for now is that you write a reproducer program, possibly inspired by the syzkaller code you showed. Reproduce it cleanly and (optionally) try a fix. 
Then another mail would be good, especially with the amdgpu maintainers on Cc, since I suspect that this is a driver issue. Don't get me wrong, a UAF definitely needs to be fixed; but since it currently only occurs under fuzzing and we can't reproduce it, there isn't much we can do until that changes. I will in the meantime provide a patch improving the memory-lifetime documentation for scheduler objects. Thx P. ^ permalink raw reply [flat|nested] 17+ messages in thread
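If the diagnosis above is right, the fix belongs in the driver's teardown ordering rather than in the scheduler. A minimal sketch of that ordering follows; the example_priv structure and example_fini() function are hypothetical, and only the two scheduler calls are real API entry points already referenced in this thread:

	static void example_fini(struct example_priv *priv)
	{
		/*
		 * Finish every entity first: drm_sched_entity_fini() marks the
		 * entity stopped and removes it from its run queue under
		 * entity->lock and rq->lock, so later push_job or run-job work
		 * cannot reach it through the FIFO rb-tree.
		 */
		drm_sched_entity_fini(&priv->entity);

		/* Only then tear down the scheduler itself. */
		drm_sched_fini(&priv->sched);
	}

Whether amdgpu actually violates this ordering is exactly the open question in the thread, so the sketch is illustrative only.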
* Re: [PATCH] drm/sched: Prevent stopped entities from being added to the run queue. 2025-07-21 7:52 ` Philipp Stanner 2025-07-21 8:16 ` Philipp Stanner @ 2025-08-14 10:42 ` Tvrtko Ursulin 2025-08-14 11:45 ` Tvrtko Ursulin 1 sibling, 1 reply; 17+ messages in thread From: Tvrtko Ursulin @ 2025-08-14 10:42 UTC (permalink / raw) To: phasta, James Flowers, matthew.brost, dakr, ckoenig.leichtzumerken, maarten.lankhorst, mripard, tzimmermann, airlied, simona, skhan Cc: dri-devel, linux-kernel, linux-kernel-mentees On 21/07/2025 08:52, Philipp Stanner wrote: > +Cc Tvrtko, who's currently reworking FIFO and RR. > > On Sun, 2025-07-20 at 16:56 -0700, James Flowers wrote: >> Fixes an issue where entities are added to the run queue in >> drm_sched_rq_update_fifo_locked after being killed, causing a >> slab-use-after-free error. >> >> Signed-off-by: James Flowers <bold.zone2373@fastmail.com> >> --- >> This issue was detected by syzkaller running on a Steam Deck OLED. >> Unfortunately I don't have a reproducer for it. I've > > Well, now that's kind of an issue – if you don't have a reproducer, how > can you know that your patch is correct? How can we? > > It would certainly be good to know what the fuzz testing framework > does. > >> included the KASAN reports below: > > > Anyways, KASAN reports look interesting. But those might be many > different issues. Again, would be good to know what the fuzzer has been > testing. Can you maybe split this fuzz test into sub-tests? I suspsect > those might be different faults. > > > Anyways, taking a first look… > > >> >> ================================================================== >> BUG: KASAN: slab-use-after-free in rb_next+0xda/0x160 lib/rbtree.c:505 >> Read of size 8 at addr ffff8881805085e0 by task kworker/u32:12/192 >> CPU: 3 UID: 0 PID: 192 Comm: kworker/u32:12 Not tainted 6.14.0-flowejam-+ #1 >> Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024 >> Workqueue: sdma0 drm_sched_run_job_work [gpu_sched] >> Call Trace: >> <TASK> >> __dump_stack lib/dump_stack.c:94 [inline] >> dump_stack_lvl+0xd2/0x130 lib/dump_stack.c:120 >> print_address_description.constprop.0+0x88/0x380 mm/kasan/report.c:408 >> print_report+0xfc/0x1ff mm/kasan/report.c:521 >> kasan_report+0xdd/0x1b0 mm/kasan/report.c:634 >> rb_next+0xda/0x160 lib/rbtree.c:505 >> drm_sched_rq_select_entity_fifo drivers/gpu/drm/scheduler/sched_main.c:332 [inline] [gpu_sched] >> drm_sched_select_entity+0x497/0x720 drivers/gpu/drm/scheduler/sched_main.c:1081 [gpu_sched] >> drm_sched_run_job_work+0x2e/0x710 drivers/gpu/drm/scheduler/sched_main.c:1206 [gpu_sched] >> process_one_work+0x9c0/0x17e0 kernel/workqueue.c:3238 >> process_scheduled_works kernel/workqueue.c:3319 [inline] >> worker_thread+0x734/0x1060 kernel/workqueue.c:3400 >> kthread+0x3fd/0x810 kernel/kthread.c:464 >> ret_from_fork+0x53/0x80 arch/x86/kernel/process.c:148 >> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 >> </TASK> >> Allocated by task 73472: >> kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 >> kasan_save_track+0x14/0x30 mm/kasan/common.c:68 >> poison_kmalloc_redzone mm/kasan/common.c:377 [inline] >> __kasan_kmalloc+0x9a/0xb0 mm/kasan/common.c:394 >> kmalloc_noprof include/linux/slab.h:901 [inline] [amdgpu] >> kzalloc_noprof include/linux/slab.h:1037 [inline] [amdgpu] >> amdgpu_driver_open_kms+0x151/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1402 [amdgpu] >> drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 >> drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 >> drm_open+0x1a7/0x400 
drivers/gpu/drm/drm_file.c:376 >> drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 >> chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 >> do_dentry_open+0x743/0x1bf0 fs/open.c:956 >> vfs_open+0x87/0x3f0 fs/open.c:1086 >> do_open+0x72f/0xf80 fs/namei.c:3830 >> path_openat+0x2ec/0x770 fs/namei.c:3989 >> do_filp_open+0x1ff/0x420 fs/namei.c:4016 >> do_sys_openat2+0x181/0x1e0 fs/open.c:1428 >> do_sys_open fs/open.c:1443 [inline] >> __do_sys_openat fs/open.c:1459 [inline] >> __se_sys_openat fs/open.c:1454 [inline] >> __x64_sys_openat+0x149/0x210 fs/open.c:1454 >> do_syscall_x64 arch/x86/entry/common.c:52 [inline] >> do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 >> entry_SYSCALL_64_after_hwframe+0x76/0x7e >> Freed by task 73472: >> kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 >> kasan_save_track+0x14/0x30 mm/kasan/common.c:68 >> kasan_save_free_info+0x3b/0x70 mm/kasan/generic.c:576 >> poison_slab_object mm/kasan/common.c:247 [inline] >> __kasan_slab_free+0x52/0x70 mm/kasan/common.c:264 >> kasan_slab_free include/linux/kasan.h:233 [inline] >> slab_free_hook mm/slub.c:2353 [inline] >> slab_free mm/slub.c:4609 [inline] >> kfree+0x14f/0x4d0 mm/slub.c:4757 >> amdgpu_driver_postclose_kms+0x43d/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1538 [amdgpu] >> drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 >> drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] >> drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 >> drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 >> __fput+0x402/0xb50 fs/file_table.c:464 >> task_work_run+0x155/0x250 kernel/task_work.c:227 >> get_signal+0x1be/0x19d0 kernel/signal.c:2809 >> arch_do_signal_or_restart+0x96/0x3a0 arch/x86/kernel/signal.c:337 >> exit_to_user_mode_loop kernel/entry/common.c:111 [inline] >> exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] >> __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] >> syscall_exit_to_user_mode+0x1fc/0x290 kernel/entry/common.c:218 >> do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 >> entry_SYSCALL_64_after_hwframe+0x76/0x7e >> The buggy address belongs to the object at ffff888180508000 >> The buggy address is located 1504 bytes inside of >> The buggy address belongs to the physical page: >> page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x180508 >> head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 >> flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) >> page_type: f5(slab) >> raw: 0017ffffc0000040 ffff888100043180 dead000000000100 dead000000000122 >> raw: 0000000000000000 0000000080020002 00000000f5000000 0000000000000000 >> head: 0017ffffc0000040 ffff888100043180 dead000000000100 dead000000000122 >> head: 0000000000000000 0000000080020002 00000000f5000000 0000000000000000 >> head: 0017ffffc0000003 ffffea0006014201 ffffffffffffffff 0000000000000000 >> head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 >> page dumped because: kasan: bad access detected >> Memory state around the buggy address: >> ffff888180508480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> ffff888180508500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >>> ffff888180508580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> ^ >> ffff888180508600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> ffff888180508680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> ================================================================== >> 
================================================================== >> BUG: KASAN: slab-use-after-free in rb_set_parent_color include/linux/rbtree_augmented.h:191 [inline] >> BUG: KASAN: slab-use-after-free in __rb_erase_augmented include/linux/rbtree_augmented.h:312 [inline] >> BUG: KASAN: slab-use-after-free in rb_erase+0x157c/0x1b10 lib/rbtree.c:443 >> Write of size 8 at addr ffff88816414c5d0 by task syz.2.3004/12376 >> CPU: 7 UID: 65534 PID: 12376 Comm: syz.2.3004 Not tainted 6.14.0-flowejam-+ #1 >> Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024 >> Call Trace: >> <TASK> >> __dump_stack lib/dump_stack.c:94 [inline] >> dump_stack_lvl+0xd2/0x130 lib/dump_stack.c:120 >> print_address_description.constprop.0+0x88/0x380 mm/kasan/report.c:408 >> print_report+0xfc/0x1ff mm/kasan/report.c:521 >> kasan_report+0xdd/0x1b0 mm/kasan/report.c:634 >> rb_set_parent_color include/linux/rbtree_augmented.h:191 [inline] >> __rb_erase_augmented include/linux/rbtree_augmented.h:312 [inline] >> rb_erase+0x157c/0x1b10 lib/rbtree.c:443 >> rb_erase_cached include/linux/rbtree.h:126 [inline] [gpu_sched] >> drm_sched_rq_remove_fifo_locked drivers/gpu/drm/scheduler/sched_main.c:154 [inline] [gpu_sched] >> drm_sched_rq_remove_entity+0x2d3/0x480 drivers/gpu/drm/scheduler/sched_main.c:243 [gpu_sched] >> drm_sched_entity_kill.part.0+0x82/0x5e0 drivers/gpu/drm/scheduler/sched_entity.c:237 [gpu_sched] >> drm_sched_entity_kill drivers/gpu/drm/scheduler/sched_entity.c:232 [inline] [gpu_sched] >> drm_sched_entity_fini+0x4c/0x290 drivers/gpu/drm/scheduler/sched_entity.c:331 [gpu_sched] >> amdgpu_vm_fini_entities drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:529 [inline] [amdgpu] >> amdgpu_vm_fini+0x862/0x1180 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:2752 [amdgpu] >> amdgpu_driver_postclose_kms+0x3db/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1526 [amdgpu] >> drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 >> drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] >> drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 >> drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 >> __fput+0x402/0xb50 fs/file_table.c:464 >> task_work_run+0x155/0x250 kernel/task_work.c:227 >> exit_task_work include/linux/task_work.h:40 [inline] >> do_exit+0x841/0xf60 kernel/exit.c:938 >> do_group_exit+0xda/0x2b0 kernel/exit.c:1087 >> get_signal+0x171f/0x19d0 kernel/signal.c:3036 >> arch_do_signal_or_restart+0x96/0x3a0 arch/x86/kernel/signal.c:337 >> exit_to_user_mode_loop kernel/entry/common.c:111 [inline] >> exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] >> __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] >> syscall_exit_to_user_mode+0x1fc/0x290 kernel/entry/common.c:218 >> do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 >> entry_SYSCALL_64_after_hwframe+0x76/0x7e >> RIP: 0033:0x7f2d90da36ed >> Code: Unable to access opcode bytes at 0x7f2d90da36c3. 
>> RSP: 002b:00007f2d91b710d8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca >> RAX: 0000000000000000 RBX: 00007f2d90fe6088 RCX: 00007f2d90da36ed >> RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f2d90fe6088 >> RBP: 00007f2d90fe6080 R08: 0000000000000000 R09: 0000000000000000 >> R10: 0000000000000000 R11: 0000000000000246 R12: 00007f2d90fe608c >> R13: 0000000000000000 R14: 0000000000000002 R15: 00007ffc34a67bd0 >> </TASK> >> Allocated by task 12381: >> kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 >> kasan_save_track+0x14/0x30 mm/kasan/common.c:68 >> poison_kmalloc_redzone mm/kasan/common.c:377 [inline] >> __kasan_kmalloc+0x9a/0xb0 mm/kasan/common.c:394 >> kmalloc_noprof include/linux/slab.h:901 [inline] [amdgpu] >> kzalloc_noprof include/linux/slab.h:1037 [inline] [amdgpu] >> amdgpu_driver_open_kms+0x151/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1402 [amdgpu] >> drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 >> drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 >> drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 >> drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 >> chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 >> do_dentry_open+0x743/0x1bf0 fs/open.c:956 >> vfs_open+0x87/0x3f0 fs/open.c:1086 >> do_open+0x72f/0xf80 fs/namei.c:3830 >> path_openat+0x2ec/0x770 fs/namei.c:3989 >> do_filp_open+0x1ff/0x420 fs/namei.c:4016 >> do_sys_openat2+0x181/0x1e0 fs/open.c:1428 >> do_sys_open fs/open.c:1443 [inline] >> __do_sys_openat fs/open.c:1459 [inline] >> __se_sys_openat fs/open.c:1454 [inline] >> __x64_sys_openat+0x149/0x210 fs/open.c:1454 >> do_syscall_x64 arch/x86/entry/common.c:52 [inline] >> do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 >> entry_SYSCALL_64_after_hwframe+0x76/0x7e >> Freed by task 12381: >> kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 >> kasan_save_track+0x14/0x30 mm/kasan/common.c:68 >> kasan_save_free_info+0x3b/0x70 mm/kasan/generic.c:576 >> poison_slab_object mm/kasan/common.c:247 [inline] >> __kasan_slab_free+0x52/0x70 mm/kasan/common.c:264 >> kasan_slab_free include/linux/kasan.h:233 [inline] >> slab_free_hook mm/slub.c:2353 [inline] >> slab_free mm/slub.c:4609 [inline] >> kfree+0x14f/0x4d0 mm/slub.c:4757 >> amdgpu_driver_postclose_kms+0x43d/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1538 [amdgpu] >> drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 >> drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] >> drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 >> drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 >> __fput+0x402/0xb50 fs/file_table.c:464 >> task_work_run+0x155/0x250 kernel/task_work.c:227 >> get_signal+0x1be/0x19d0 kernel/signal.c:2809 >> arch_do_signal_or_restart+0x96/0x3a0 arch/x86/kernel/signal.c:337 >> exit_to_user_mode_loop kernel/entry/common.c:111 [inline] >> exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] >> __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] >> syscall_exit_to_user_mode+0x1fc/0x290 kernel/entry/common.c:218 >> do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 >> entry_SYSCALL_64_after_hwframe+0x76/0x7e >> The buggy address belongs to the object at ffff88816414c000 >> The buggy address is located 1488 bytes inside of >> The buggy address belongs to the physical page: >> page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x164148 >> head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 >> flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) >> page_type: f5(slab) >> raw: 
0017ffffc0000040 ffff88810005c8c0 dead000000000122 0000000000000000 >> raw: 0000000000000000 0000000080020002 00000000f5000000 0000000000000000 >> head: 0017ffffc0000040 ffff88810005c8c0 dead000000000122 0000000000000000 >> head: 0000000000000000 0000000080020002 00000000f5000000 0000000000000000 >> head: 0017ffffc0000003 ffffea0005905201 ffffffffffffffff 0000000000000000 >> head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 >> page dumped because: kasan: bad access detected >> Memory state around the buggy address: >> ffff88816414c480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> ffff88816414c500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >>> ffff88816414c580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> ^ >> ffff88816414c600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> ffff88816414c680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> ================================================================== >> ================================================================== >> BUG: KASAN: slab-use-after-free in __rb_erase_augmented include/linux/rbtree_augmented.h:259 [inline] >> BUG: KASAN: slab-use-after-free in rb_erase+0xf5d/0x1b10 lib/rbtree.c:443 >> Read of size 8 at addr ffff88812ebcc5e0 by task syz.1.814/6553 >> CPU: 0 UID: 65534 PID: 6553 Comm: syz.1.814 Not tainted 6.14.0-flowejam-+ #1 >> Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024 >> Call Trace: >> <TASK> >> __dump_stack lib/dump_stack.c:94 [inline] >> dump_stack_lvl+0xd2/0x130 lib/dump_stack.c:120 >> print_address_description.constprop.0+0x88/0x380 mm/kasan/report.c:408 >> print_report+0xfc/0x1ff mm/kasan/report.c:521 >> kasan_report+0xdd/0x1b0 mm/kasan/report.c:634 >> __rb_erase_augmented include/linux/rbtree_augmented.h:259 [inline] >> rb_erase+0xf5d/0x1b10 lib/rbtree.c:443 >> rb_erase_cached include/linux/rbtree.h:126 [inline] [gpu_sched] >> drm_sched_rq_remove_fifo_locked drivers/gpu/drm/scheduler/sched_main.c:154 [inline] [gpu_sched] >> drm_sched_rq_remove_entity+0x2d3/0x480 drivers/gpu/drm/scheduler/sched_main.c:243 [gpu_sched] >> drm_sched_entity_kill.part.0+0x82/0x5e0 drivers/gpu/drm/scheduler/sched_entity.c:237 [gpu_sched] >> drm_sched_entity_kill drivers/gpu/drm/scheduler/sched_entity.c:232 [inline] [gpu_sched] >> drm_sched_entity_fini+0x4c/0x290 drivers/gpu/drm/scheduler/sched_entity.c:331 [gpu_sched] >> amdgpu_vm_fini_entities drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:529 [inline] [amdgpu] >> amdgpu_vm_fini+0x862/0x1180 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:2752 [amdgpu] >> amdgpu_driver_postclose_kms+0x3db/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1526 [amdgpu] >> drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 >> drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] >> drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 >> drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 >> __fput+0x402/0xb50 fs/file_table.c:464 >> task_work_run+0x155/0x250 kernel/task_work.c:227 >> resume_user_mode_work include/linux/resume_user_mode.h:50 [inline] >> exit_to_user_mode_loop kernel/entry/common.c:114 [inline] >> exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] >> __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] >> syscall_exit_to_user_mode+0x26b/0x290 kernel/entry/common.c:218 >> do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 >> entry_SYSCALL_64_after_hwframe+0x76/0x7e >> RIP: 0033:0x7fd23eba36ed >> Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 
ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 >> RSP: 002b:00007ffc2943a358 EFLAGS: 00000246 ORIG_RAX: 00000000000001b4 >> RAX: 0000000000000000 RBX: 00007ffc2943a428 RCX: 00007fd23eba36ed >> RDX: 0000000000000000 RSI: 000000000000001e RDI: 0000000000000003 >> RBP: 00007fd23ede7ba0 R08: 0000000000000001 R09: 0000000c00000000 >> R10: 00007fd23ea00000 R11: 0000000000000246 R12: 00007fd23ede5fac >> R13: 00007fd23ede5fa0 R14: 0000000000059ad1 R15: 0000000000059a8e >> </TASK> >> Allocated by task 6559: >> kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 >> kasan_save_track+0x14/0x30 mm/kasan/common.c:68 >> poison_kmalloc_redzone mm/kasan/common.c:377 [inline] >> __kasan_kmalloc+0x9a/0xb0 mm/kasan/common.c:394 >> kmalloc_noprof include/linux/slab.h:901 [inline] [amdgpu] >> kzalloc_noprof include/linux/slab.h:1037 [inline] [amdgpu] >> amdgpu_driver_open_kms+0x151/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1402 [amdgpu] >> drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 >> drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 >> drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 >> drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 >> chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 >> do_dentry_open+0x743/0x1bf0 fs/open.c:956 >> vfs_open+0x87/0x3f0 fs/open.c:1086 >> do_open+0x72f/0xf80 fs/namei.c:3830 >> path_openat+0x2ec/0x770 fs/namei.c:3989 >> do_filp_open+0x1ff/0x420 fs/namei.c:4016 >> do_sys_openat2+0x181/0x1e0 fs/open.c:1428 >> do_sys_open fs/open.c:1443 [inline] >> __do_sys_openat fs/open.c:1459 [inline] >> __se_sys_openat fs/open.c:1454 [inline] >> __x64_sys_openat+0x149/0x210 fs/open.c:1454 >> do_syscall_x64 arch/x86/entry/common.c:52 [inline] >> do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 >> entry_SYSCALL_64_after_hwframe+0x76/0x7e >> Freed by task 6559: >> kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 >> kasan_save_track+0x14/0x30 mm/kasan/common.c:68 >> kasan_save_free_info+0x3b/0x70 mm/kasan/generic.c:576 >> poison_slab_object mm/kasan/common.c:247 [inline] >> __kasan_slab_free+0x52/0x70 mm/kasan/common.c:264 >> kasan_slab_free include/linux/kasan.h:233 [inline] >> slab_free_hook mm/slub.c:2353 [inline] >> slab_free mm/slub.c:4609 [inline] >> kfree+0x14f/0x4d0 mm/slub.c:4757 >> amdgpu_driver_postclose_kms+0x43d/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1538 [amdgpu] >> drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 >> drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] >> drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 >> drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 >> __fput+0x402/0xb50 fs/file_table.c:464 >> task_work_run+0x155/0x250 kernel/task_work.c:227 >> get_signal+0x1be/0x19d0 kernel/signal.c:2809 >> arch_do_signal_or_restart+0x96/0x3a0 arch/x86/kernel/signal.c:337 >> exit_to_user_mode_loop kernel/entry/common.c:111 [inline] >> exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] >> __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] >> syscall_exit_to_user_mode+0x1fc/0x290 kernel/entry/common.c:218 >> do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 >> entry_SYSCALL_64_after_hwframe+0x76/0x7e >> The buggy address belongs to the object at ffff88812ebcc000 >> The buggy address is located 1504 bytes inside of >> The buggy address belongs to the physical page: >> page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12ebc8 >> head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 
pincount:0 >> flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) >> page_type: f5(slab) >> raw: 0017ffffc0000040 ffff888100058780 dead000000000122 0000000000000000 >> raw: 0000000000000000 0000000000020002 00000000f5000000 0000000000000000 >> head: 0017ffffc0000040 ffff888100058780 dead000000000122 0000000000000000 >> head: 0000000000000000 0000000000020002 00000000f5000000 0000000000000000 >> head: 0017ffffc0000003 ffffea0004baf201 ffffffffffffffff 0000000000000000 >> head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 >> page dumped because: kasan: bad access detected >> Memory state around the buggy address: >> ffff88812ebcc480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> ffff88812ebcc500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >>> ffff88812ebcc580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> ^ >> ffff88812ebcc600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> ffff88812ebcc680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> ================================================================== >> ================================================================== >> BUG: KASAN: slab-use-after-free in drm_sched_entity_compare_before drivers/gpu/drm/scheduler/sched_main.c:147 [inline] [gpu_sched] >> BUG: KASAN: slab-use-after-free in rb_add_cached include/linux/rbtree.h:174 [inline] [gpu_sched] >> BUG: KASAN: slab-use-after-free in drm_sched_rq_update_fifo_locked+0x47b/0x540 drivers/gpu/drm/scheduler/sched_main.c:175 [gpu_sched] >> Read of size 8 at addr ffff8881208445c8 by task syz.1.49115/146644 >> CPU: 7 UID: 65534 PID: 146644 Comm: syz.1.49115 Not tainted 6.14.0-flowejam-+ #1 >> Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024 >> Call Trace: >> <TASK> >> __dump_stack lib/dump_stack.c:94 [inline] >> dump_stack_lvl+0xd2/0x130 lib/dump_stack.c:120 >> print_address_description.constprop.0+0x88/0x380 mm/kasan/report.c:408 >> print_report+0xfc/0x1ff mm/kasan/report.c:521 >> kasan_report+0xdd/0x1b0 mm/kasan/report.c:634 >> drm_sched_entity_compare_before drivers/gpu/drm/scheduler/sched_main.c:147 [inline] [gpu_sched] >> rb_add_cached include/linux/rbtree.h:174 [inline] [gpu_sched] >> drm_sched_rq_update_fifo_locked+0x47b/0x540 drivers/gpu/drm/scheduler/sched_main.c:175 [gpu_sched] >> drm_sched_entity_push_job+0x509/0x5d0 drivers/gpu/drm/scheduler/sched_entity.c:623 [gpu_sched] > > This might be a race between entity killing and the push_job. 
Let's > look at your patch below… > >> amdgpu_job_submit+0x1a4/0x270 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c:314 [amdgpu] >> amdgpu_vm_sdma_commit+0x1f9/0x7d0 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c:122 [amdgpu] >> amdgpu_vm_pt_clear+0x540/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c:422 [amdgpu] >> amdgpu_vm_init+0x9c2/0x12f0 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:2609 [amdgpu] >> amdgpu_driver_open_kms+0x274/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1418 [amdgpu] >> drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 >> drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 >> drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 >> drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 >> chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 >> do_dentry_open+0x743/0x1bf0 fs/open.c:956 >> vfs_open+0x87/0x3f0 fs/open.c:1086 >> do_open+0x72f/0xf80 fs/namei.c:3830 >> path_openat+0x2ec/0x770 fs/namei.c:3989 >> do_filp_open+0x1ff/0x420 fs/namei.c:4016 >> do_sys_openat2+0x181/0x1e0 fs/open.c:1428 >> do_sys_open fs/open.c:1443 [inline] >> __do_sys_openat fs/open.c:1459 [inline] >> __se_sys_openat fs/open.c:1454 [inline] >> __x64_sys_openat+0x149/0x210 fs/open.c:1454 >> do_syscall_x64 arch/x86/entry/common.c:52 [inline] >> do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 >> entry_SYSCALL_64_after_hwframe+0x76/0x7e >> RIP: 0033:0x7feb303a36ed >> Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 >> RSP: 002b:00007feb3123c018 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 >> RAX: ffffffffffffffda RBX: 00007feb305e5fa0 RCX: 00007feb303a36ed >> RDX: 0000000000000002 RSI: 0000200000000140 RDI: ffffffffffffff9c >> RBP: 00007feb30447722 R08: 0000000000000000 R09: 0000000000000000 >> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 >> R13: 0000000000000001 R14: 00007feb305e5fa0 R15: 00007ffcfd0a3460 >> </TASK> >> Allocated by task 146638: >> kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 >> kasan_save_track+0x14/0x30 mm/kasan/common.c:68 >> poison_kmalloc_redzone mm/kasan/common.c:377 [inline] >> __kasan_kmalloc+0x9a/0xb0 mm/kasan/common.c:394 >> kmalloc_noprof include/linux/slab.h:901 [inline] [amdgpu] >> kzalloc_noprof include/linux/slab.h:1037 [inline] [amdgpu] >> amdgpu_driver_open_kms+0x151/0x660 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1402 [amdgpu] >> drm_file_alloc+0x5d0/0xa00 drivers/gpu/drm/drm_file.c:171 >> drm_open_helper+0x1fe/0x540 drivers/gpu/drm/drm_file.c:323 >> drm_open+0x1a7/0x400 drivers/gpu/drm/drm_file.c:376 >> drm_stub_open+0x21a/0x390 drivers/gpu/drm/drm_drv.c:1149 >> chrdev_open+0x23b/0x6b0 fs/char_dev.c:414 >> do_dentry_open+0x743/0x1bf0 fs/open.c:956 >> vfs_open+0x87/0x3f0 fs/open.c:1086 >> do_open+0x72f/0xf80 fs/namei.c:3830 >> path_openat+0x2ec/0x770 fs/namei.c:3989 >> do_filp_open+0x1ff/0x420 fs/namei.c:4016 >> do_sys_openat2+0x181/0x1e0 fs/open.c:1428 >> do_sys_open fs/open.c:1443 [inline] >> __do_sys_openat fs/open.c:1459 [inline] >> __se_sys_openat fs/open.c:1454 [inline] >> __x64_sys_openat+0x149/0x210 fs/open.c:1454 >> do_syscall_x64 arch/x86/entry/common.c:52 [inline] >> do_syscall_64+0x92/0x180 arch/x86/entry/common.c:83 >> entry_SYSCALL_64_after_hwframe+0x76/0x7e >> Freed by task 146638: >> kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 >> kasan_save_track+0x14/0x30 mm/kasan/common.c:68 >> kasan_save_free_info+0x3b/0x70 mm/kasan/generic.c:576 >> poison_slab_object mm/kasan/common.c:247 
[inline] >> __kasan_slab_free+0x52/0x70 mm/kasan/common.c:264 >> kasan_slab_free include/linux/kasan.h:233 [inline] >> slab_free_hook mm/slub.c:2353 [inline] >> slab_free mm/slub.c:4609 [inline] >> kfree+0x14f/0x4d0 mm/slub.c:4757 >> amdgpu_driver_postclose_kms+0x43d/0x6b0 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:1538 [amdgpu] >> drm_file_free.part.0+0x72d/0xbc0 drivers/gpu/drm/drm_file.c:255 >> drm_file_free drivers/gpu/drm/drm_file.c:228 [inline] >> drm_close_helper.isra.0+0x197/0x230 drivers/gpu/drm/drm_file.c:278 >> drm_release+0x1b0/0x3d0 drivers/gpu/drm/drm_file.c:426 >> __fput+0x402/0xb50 fs/file_table.c:464 >> task_work_run+0x155/0x250 kernel/task_work.c:227 >> resume_user_mode_work include/linux/resume_user_mode.h:50 [inline] >> exit_to_user_mode_loop kernel/entry/common.c:114 [inline] >> exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline] >> __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline] >> syscall_exit_to_user_mode+0x26b/0x290 kernel/entry/common.c:218 >> do_syscall_64+0x9f/0x180 arch/x86/entry/common.c:89 >> entry_SYSCALL_64_after_hwframe+0x76/0x7e >> The buggy address belongs to the object at ffff888120844000 >> The buggy address is located 1480 bytes inside of >> The buggy address belongs to the physical page: >> page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x120840 >> head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 >> flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) >> page_type: f5(slab) >> raw: 0017ffffc0000040 ffff88810005c8c0 ffffea0005744c00 dead000000000002 >> raw: 0000000000000000 0000000000020002 00000000f5000000 0000000000000000 >> head: 0017ffffc0000040 ffff88810005c8c0 ffffea0005744c00 dead000000000002 >> head: 0000000000000000 0000000000020002 00000000f5000000 0000000000000000 >> head: 0017ffffc0000003 ffffea0004821001 ffffffffffffffff 0000000000000000 >> head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 >> page dumped because: kasan: bad access detected >> Memory state around the buggy address: >> ffff888120844480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> ffff888120844500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >>> ffff888120844580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> ^ >> ffff888120844600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> ffff888120844680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >> ================================================================== >> >> drivers/gpu/drm/scheduler/sched_main.c | 6 ++++-- >> 1 file changed, 4 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c >> index bfea608a7106..997a2cc1a635 100644 >> --- a/drivers/gpu/drm/scheduler/sched_main.c >> +++ b/drivers/gpu/drm/scheduler/sched_main.c >> @@ -172,8 +172,10 @@ void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity, >> >> entity->oldest_job_waiting = ts; >> >> - rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, >> - drm_sched_entity_compare_before); >> + if (!entity->stopped) { >> + rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root, >> + drm_sched_entity_compare_before); >> + } > > If this is a race, then this patch here is broken, too, because you're > checking the 'stopped' boolean as the callers of that function do, too > – just later. :O > > Could still race, just less likely. > > The proper way to fix it would then be to address the issue where the > locking is supposed to happen. 
Let's look at, for example,
> drm_sched_entity_push_job():
>
>
> void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
> {
> 	(Bla bla bla)
>
> 	…………
>
> 	/* first job wakes up scheduler */
> 	if (first) {
> 		struct drm_gpu_scheduler *sched;
> 		struct drm_sched_rq *rq;
>
> 		/* Add the entity to the run queue */
> 		spin_lock(&entity->lock);
> 		if (entity->stopped) {			<---- Aha!
> 			spin_unlock(&entity->lock);
>
> 			DRM_ERROR("Trying to push to a killed entity\n");
> 			return;
> 		}
>
> 		rq = entity->rq;
> 		sched = rq->sched;
>
> 		spin_lock(&rq->lock);
> 		drm_sched_rq_add_entity(rq, entity);
>
> 		if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
> 			drm_sched_rq_update_fifo_locked(entity, rq, submit_ts);	<---- bumm!
>
> 		spin_unlock(&rq->lock);
> 		spin_unlock(&entity->lock);
>
> But the locks are still being held. So that "shouldn't be happening"(tm).
>
> Interesting. AFAICS only drm_sched_entity_kill() and drm_sched_fini()
> stop entities. The former holds appropriate locks, but drm_sched_fini()
> doesn't. So that looks like a hot candidate to me. Opinions?
>
> On the other hand, aren't drivers prohibited from calling
> drm_sched_entity_push_job() after calling drm_sched_fini()? If the
> fuzzer does that, then it's not the scheduler's fault.
>
> Could you test adding spin_lock(&entity->lock) to drm_sched_fini()?
>
> Would be cool if Tvrtko and Christian take a look. Maybe we even have a
> fundamental design issue.

It would be nice to have a reproducer, and from this thread I did not
manage to figure out if the syzkaller snippet James posted was it, or
not quite it.

In either case, I think one race I see relates to the early exit
!entity->rq check before setting entity->stopped in drm_sched_entity_kill().

If the entity was not submitted at all yet (at the time of process
exit / entity kill), entity->stopped will therefore not be set. A
parallel job submit can then re-add the entity to the tree, as process
exit / file close / entity kill is continuing and is about to kfree the
entity (in the case of the amdgpu report there are two entities embedded
in file_priv).

One way to make this more robust is to make the entity->rq check in
drm_sched_entity_kill() stronger. Or actually to remove it altogether.
But I think it also requires checking for entity->stopped in
drm_sched_entity_select_rq() and propagating the error code all the way
out from drm_sched_job_arm().

That way entity->stopped is properly serialized and acted upon early
enough to avoid dereferencing a freed entity and to avoid creating jobs
not attached to anything (rather than only getting a warning from
push_job).

Disclaimer: I haven't tried to experiment with this yet, so I may be
missing something. At least writing a reproducer for the race I
described sounds easy, so unless someone shouts I am talking nonsense I
can do that and also sketch out a fix. *If* the theory will hold water
after I write the test case.

Regards,

Tvrtko

>
>> 	}
>>
>>  /**
>

^ permalink raw reply	[flat|nested] 17+ messages in thread
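[A rough sketch, for illustration only, of the serialization Tvrtko outlines above. This is not a posted patch: it assumes entity->stopped may be set unconditionally under entity->lock in drm_sched_entity_kill(), that drm_sched_entity_select_rq() can grow an error return which drm_sched_job_arm() then propagates, and it abbreviates the unchanged parts of both functions.]

	/* Stop the entity even if it never made it onto a run queue, so a
	 * racing submitter observes ->stopped before the entity is freed. */
	static void drm_sched_entity_kill(struct drm_sched_entity *entity)
	{
		spin_lock(&entity->lock);
		entity->stopped = true;
		if (entity->rq)
			drm_sched_rq_remove_entity(entity->rq, entity);
		spin_unlock(&entity->lock);

		/* ... existing pending-job cleanup continues unchanged ... */
	}

	/* Refuse stopped entities at run-queue selection time, so the error
	 * can be returned from drm_sched_job_arm() instead of only producing
	 * a warning later in drm_sched_entity_push_job(). */
	static int drm_sched_entity_select_rq(struct drm_sched_entity *entity)
	{
		int ret = 0;

		spin_lock(&entity->lock);
		if (entity->stopped)
			ret = -ENOENT;
		/* ... existing run-queue re-selection otherwise unchanged ... */
		spin_unlock(&entity->lock);

		return ret;
	}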
* Re: [PATCH] drm/sched: Prevent stopped entities from being added to the run queue. 2025-08-14 10:42 ` Tvrtko Ursulin @ 2025-08-14 11:45 ` Tvrtko Ursulin 2025-08-14 11:49 ` Philipp Stanner 0 siblings, 1 reply; 17+ messages in thread From: Tvrtko Ursulin @ 2025-08-14 11:45 UTC (permalink / raw) To: phasta, James Flowers, matthew.brost, dakr, ckoenig.leichtzumerken, maarten.lankhorst, mripard, tzimmermann, airlied, simona, skhan Cc: dri-devel, linux-kernel, linux-kernel-mentees On 14/08/2025 11:42, Tvrtko Ursulin wrote: > > On 21/07/2025 08:52, Philipp Stanner wrote: >> +Cc Tvrtko, who's currently reworking FIFO and RR. >> >> On Sun, 2025-07-20 at 16:56 -0700, James Flowers wrote: >>> Fixes an issue where entities are added to the run queue in >>> drm_sched_rq_update_fifo_locked after being killed, causing a >>> slab-use-after-free error. >>> >>> Signed-off-by: James Flowers <bold.zone2373@fastmail.com> >>> --- >>> This issue was detected by syzkaller running on a Steam Deck OLED. >>> Unfortunately I don't have a reproducer for it. I've >> >> Well, now that's kind of an issue – if you don't have a reproducer, how >> can you know that your patch is correct? How can we? >> >> It would certainly be good to know what the fuzz testing framework >> does. >> >>> included the KASAN reports below: >> >> >> Anyways, KASAN reports look interesting. But those might be many >> different issues. Again, would be good to know what the fuzzer has been >> testing. Can you maybe split this fuzz test into sub-tests? I suspsect >> those might be different faults. >> >> >> Anyways, taking a first look… >> >> >>> >>> ================================================================== >>> BUG: KASAN: slab-use-after-free in rb_next+0xda/0x160 lib/rbtree.c:505 >>> Read of size 8 at addr ffff8881805085e0 by task kworker/u32:12/192 >>> CPU: 3 UID: 0 PID: 192 Comm: kworker/u32:12 Not tainted 6.14.0- >>> flowejam-+ #1 >>> Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024 >>> Workqueue: sdma0 drm_sched_run_job_work [gpu_sched] >>> Call Trace: >>> <TASK> >>> __dump_stack lib/dump_stack.c:94 [inline] >>> dump_stack_lvl+0xd2/0x130 lib/dump_stack.c:120 >>> print_address_description.constprop.0+0x88/0x380 mm/kasan/report.c:408 >>> print_report+0xfc/0x1ff mm/kasan/report.c:521 >>> kasan_report+0xdd/0x1b0 mm/kasan/report.c:634 >>> rb_next+0xda/0x160 lib/rbtree.c:505 >>> drm_sched_rq_select_entity_fifo drivers/gpu/drm/scheduler/ >>> sched_main.c:332 [inline] [gpu_sched] >>> drm_sched_select_entity+0x497/0x720 drivers/gpu/drm/scheduler/ >>> sched_main.c:1081 [gpu_sched] >>> drm_sched_run_job_work+0x2e/0x710 drivers/gpu/drm/scheduler/ >>> sched_main.c:1206 [gpu_sched] >>> process_one_work+0x9c0/0x17e0 kernel/workqueue.c:3238 >>> process_scheduled_works kernel/workqueue.c:3319 [inline] >>> worker_thread+0x734/0x1060 kernel/workqueue.c:3400 >>> kthread+0x3fd/0x810 kernel/kthread.c:464 >>> ret_from_fork+0x53/0x80 arch/x86/kernel/process.c:148 >>> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 >>> </TASK> >>> Allocated by task 73472: >>> kasan_save_stack+0x30/0x50 mm/kasan/common.c:47 >>> kasan_save_track+0x14/0x30 mm/kasan/common.c:68 >>> poison_kmalloc_redzone mm/kasan/common.c:377 [inline] >>> __kasan_kmalloc+0x9a/0xb0 mm/kasan/common.c:394 >>> kmalloc_noprof include/linux/slab.h:901 [inline] [amdgpu] >>> kzalloc_noprof include/linux/slab.h:1037 [inline] [amdgpu] >>> amdgpu_driver_open_kms+0x151/0x660 drivers/gpu/drm/amd/amdgpu/ >>> amdgpu_kms.c:1402 [amdgpu] >>> drm_file_alloc+0x5d0/0xa00 
>>> [SNIP]
>>
>> This might be a race between entity killing and the push_job. Let's
>> look at your patch below…
>>
>>> [SNIP]
>>>
>>>   drivers/gpu/drm/scheduler/sched_main.c | 6 ++++--
>>>   1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>> index bfea608a7106..997a2cc1a635 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -172,8 +172,10 @@ void drm_sched_rq_update_fifo_locked(struct drm_sched_entity *entity,
>>>   	entity->oldest_job_waiting = ts;
>>> -	rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root,
>>> -		      drm_sched_entity_compare_before);
>>> +	if (!entity->stopped) {
>>> +		rb_add_cached(&entity->rb_tree_node, &rq->rb_tree_root,
>>> +			      drm_sched_entity_compare_before);
>>> +	}
>>
>> If this is a race, then this patch here is broken, too, because you're
>> checking the 'stopped' boolean as the callers of that function do, too
>> –
just later. :O >> >> Could still race, just less likely. >> >> The proper way to fix it would then be to address the issue where the >> locking is supposed to happen. Let's look at, for example, >> drm_sched_entity_push_job(): >> >> >> void drm_sched_entity_push_job(struct drm_sched_job *sched_job) >> { >> (Bla bla bla) >> >> ………… >> >> /* first job wakes up scheduler */ >> if (first) { >> struct drm_gpu_scheduler *sched; >> struct drm_sched_rq *rq; >> >> /* Add the entity to the run queue */ >> spin_lock(&entity->lock); >> if (entity->stopped) { <---- Aha! >> spin_unlock(&entity->lock); >> >> DRM_ERROR("Trying to push to a killed entity\n"); >> return; >> } >> >> rq = entity->rq; >> sched = rq->sched; >> >> spin_lock(&rq->lock); >> drm_sched_rq_add_entity(rq, entity); >> >> if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) >> drm_sched_rq_update_fifo_locked(entity, rq, submit_ts); >> <---- bumm! >> >> spin_unlock(&rq->lock); >> spin_unlock(&entity->lock); >> >> But the locks are still being hold. So that "shouldn't be happening"(tm). >> >> Interesting. AFAICS only drm_sched_entity_kill() and drm_sched_fini() >> stop entities. The former holds appropriate locks, but drm_sched_fini() >> doesn't. So that looks like a hot candidate to me. Opinions? >> >> On the other hand, aren't drivers prohibited from calling >> drm_sched_entity_push_job() after calling drm_sched_fini()? If the >> fuzzer does that, then it's not the scheduler's fault. >> >> Could you test adding spin_lock(&entity->lock) to drm_sched_fini()? >> >> Would be cool if Tvrtko and Christian take a look. Maybe we even have a >> fundamental design issue. > > It would be nice to have a reproducer and from this thread I did not > manage to figure out if the syzkaller snipper James posted was it, or > not quite it. > > In either case, I think one race I see relates to the early exit ! > entity->rq check before setting entity->stopped in drm_sched_entity_kill(). > > If the entity was not submitted at all yet (at the time of process > exit / entity kill), entity->stopped will therefore not be set. A > parallel job submit can then re-add the entity to the tree, as process > exit / file close / entity kill is continuing and is about to kfree the > entity (in the case of amdgpu report there are two entities embedded in > file_priv). > > One way to make this more robust is to make the entity->rq check in > drm_sched_entity_kill() stronger. Or actually to remove it altogether. > But I think it also requires checking for entity->stopped in > drm_sched_entity_select_rq() and propagating the error code all the way > out from drm_sched_job_arm(). > > That was entity->stopped is properly serialized and acted upon early > enough to avoid dereferencing a freed entity and avoid creating jobs not > attached to anything (but only have a warning from push job). > > Disclaimer I haven't tried to experiment with this yet, so I may be > missing something. At least writing a reproducer for the race I > described sounds easy so unless someone shouts I am talking nonsense I > can do that and also sketch out a fix. *If* the theory will hold water > after I write the test case. Nah I was talking nonsense. Forgot entity->rq is assigned on entity init and jobs cannot be created unless it is set. Okay, I have no theories as to what bug syzkaller found. Regards, Tvrtko > >> >>> } >>> /** >> > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] drm/sched: Prevent stopped entities from being added to the run queue. 2025-08-14 11:45 ` Tvrtko Ursulin @ 2025-08-14 11:49 ` Philipp Stanner 2025-08-14 12:17 ` Tvrtko Ursulin 0 siblings, 1 reply; 17+ messages in thread From: Philipp Stanner @ 2025-08-14 11:49 UTC (permalink / raw) To: Tvrtko Ursulin, phasta, James Flowers, matthew.brost, dakr, ckoenig.leichtzumerken, maarten.lankhorst, mripard, tzimmermann, airlied, simona, skhan Cc: dri-devel, linux-kernel, linux-kernel-mentees On Thu, 2025-08-14 at 12:45 +0100, Tvrtko Ursulin wrote: > > On 14/08/2025 11:42, Tvrtko Ursulin wrote: > > > > On 21/07/2025 08:52, Philipp Stanner wrote: > > > +Cc Tvrtko, who's currently reworking FIFO and RR. > > > > > > On Sun, 2025-07-20 at 16:56 -0700, James Flowers wrote: > > > > Fixes an issue where entities are added to the run queue in > > > > drm_sched_rq_update_fifo_locked after being killed, causing a > > > > slab-use-after-free error. > > > > > > > > Signed-off-by: James Flowers <bold.zone2373@fastmail.com> > > > > --- > > > > This issue was detected by syzkaller running on a Steam Deck OLED. > > > > Unfortunately I don't have a reproducer for it. I've > > > > > > Well, now that's kind of an issue – if you don't have a reproducer, how > > > can you know that your patch is correct? How can we? > > > > > > It would certainly be good to know what the fuzz testing framework > > > does. > > > > > > > included the KASAN reports below: > > > > > > > > > Anyways, KASAN reports look interesting. But those might be many > > > different issues. Again, would be good to know what the fuzzer has been > > > testing. Can you maybe split this fuzz test into sub-tests? I suspsect > > > those might be different faults. > > > > > > > > > Anyways, taking a first look… > > > > > > [SNIP] > > > > > > > > ================================================================== > > > > > > If this is a race, then this patch here is broken, too, because you're > > > checking the 'stopped' boolean as the callers of that function do, too > > > – just later. :O > > > > > > Could still race, just less likely. > > > > > > The proper way to fix it would then be to address the issue where the > > > locking is supposed to happen. Let's look at, for example, > > > drm_sched_entity_push_job(): > > > > > > > > > void drm_sched_entity_push_job(struct drm_sched_job *sched_job) > > > { > > > (Bla bla bla) > > > > > > ………… > > > > > > /* first job wakes up scheduler */ > > > if (first) { > > > struct drm_gpu_scheduler *sched; > > > struct drm_sched_rq *rq; > > > > > > /* Add the entity to the run queue */ > > > spin_lock(&entity->lock); > > > if (entity->stopped) { <---- Aha! > > > spin_unlock(&entity->lock); > > > > > > DRM_ERROR("Trying to push to a killed entity\n"); > > > return; > > > } > > > > > > rq = entity->rq; > > > sched = rq->sched; > > > > > > spin_lock(&rq->lock); > > > drm_sched_rq_add_entity(rq, entity); > > > > > > if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) > > > drm_sched_rq_update_fifo_locked(entity, rq, submit_ts); > > > <---- bumm! > > > > > > spin_unlock(&rq->lock); > > > spin_unlock(&entity->lock); > > > > > > But the locks are still being hold. So that "shouldn't be happening"(tm). > > > > > > Interesting. AFAICS only drm_sched_entity_kill() and drm_sched_fini() > > > stop entities. The former holds appropriate locks, but drm_sched_fini() > > > doesn't. So that looks like a hot candidate to me. Opinions? 
> > > > > > On the other hand, aren't drivers prohibited from calling > > > drm_sched_entity_push_job() after calling drm_sched_fini()? If the > > > fuzzer does that, then it's not the scheduler's fault. > > > > > > Could you test adding spin_lock(&entity->lock) to drm_sched_fini()? > > > > > > Would be cool if Tvrtko and Christian take a look. Maybe we even have a > > > fundamental design issue. > > > > It would be nice to have a reproducer and from this thread I did not > > manage to figure out if the syzkaller snipper James posted was it, or > > not quite it. > > > > In either case, I think one race I see relates to the early exit ! > > entity->rq check before setting entity->stopped in drm_sched_entity_kill(). > > > > If the entity was not submitted at all yet (at the time of process > > exit / entity kill), entity->stopped will therefore not be set. A > > parallel job submit can then re-add the entity to the tree, as process > > exit / file close / entity kill is continuing and is about to kfree the > > entity (in the case of amdgpu report there are two entities embedded in > > file_priv). > > > > One way to make this more robust is to make the entity->rq check in > > drm_sched_entity_kill() stronger. Or actually to remove it altogether. > > But I think it also requires checking for entity->stopped in > > drm_sched_entity_select_rq() and propagating the error code all the way > > out from drm_sched_job_arm(). > > > > That was entity->stopped is properly serialized and acted upon early > > enough to avoid dereferencing a freed entity and avoid creating jobs not > > attached to anything (but only have a warning from push job). > > > > Disclaimer I haven't tried to experiment with this yet, so I may be > > missing something. At least writing a reproducer for the race I > > described sounds easy so unless someone shouts I am talking nonsense I > > can do that and also sketch out a fix. *If* the theory will hold water > > after I write the test case. > > Nah I was talking nonsense. Forgot entity->rq is assigned on entity init > and jobs cannot be created unless it is set. > > Okay, I have no theories as to what bug syzkaller found. I just was about to answer. I agree that the rq check should be fine. As you can see in the thread, I suspect that this is a race between drm_sched_entity_push_job() and drm_sched_fini(). See here: https://lore.kernel.org/dri-devel/20250813085654.102504-2-phasta@kernel.org/ I think as long as there's no reproducer there is not much to do for us here. A long term goal, though, is to enforce the life time rules. Entities must be torn down before their scheduler. Checking this for all drivers will be quite some work, though.. P. > > Regards, > > Tvrtko > > > > > > > > > > } > > > > /** > > > > > > ^ permalink raw reply [flat|nested] 17+ messages in thread
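[For context, a heavily abbreviated sketch of the entity-stopping loop in drm_sched_fini() that Philipp refers to, as it looks around this kernel version — an approximate reconstruction, not a quote; see the lore link above for his actual proposal. The relevant point is that ->stopped is written here under rq->lock only, while drm_sched_entity_push_job() reads it under entity->lock, so the store and the check are not serialized by the same lock.]

	void drm_sched_fini(struct drm_gpu_scheduler *sched)
	{
		struct drm_sched_entity *s_entity;
		int i;

		drm_sched_wqueue_stop(sched);

		for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) {
			struct drm_sched_rq *rq = sched->sched_rq[i];

			spin_lock(&rq->lock);
			list_for_each_entry(s_entity, &rq->entities, list)
				/* Prevents reinsertion; note: no entity->lock taken here */
				s_entity->stopped = true;
			spin_unlock(&rq->lock);

			kfree(sched->sched_rq[i]);
			sched->sched_rq[i] = NULL;
		}

		/* ... remaining teardown (workqueues, tdr, rq array) omitted ... */
	}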
* Re: [PATCH] drm/sched: Prevent stopped entities from being added to the run queue. 2025-08-14 11:49 ` Philipp Stanner @ 2025-08-14 12:17 ` Tvrtko Ursulin 0 siblings, 0 replies; 17+ messages in thread From: Tvrtko Ursulin @ 2025-08-14 12:17 UTC (permalink / raw) To: phasta, James Flowers, matthew.brost, dakr, ckoenig.leichtzumerken, maarten.lankhorst, mripard, tzimmermann, airlied, simona, skhan Cc: dri-devel, linux-kernel, linux-kernel-mentees On 14/08/2025 12:49, Philipp Stanner wrote: > On Thu, 2025-08-14 at 12:45 +0100, Tvrtko Ursulin wrote: >> >> On 14/08/2025 11:42, Tvrtko Ursulin wrote: >>> >>> On 21/07/2025 08:52, Philipp Stanner wrote: >>>> +Cc Tvrtko, who's currently reworking FIFO and RR. >>>> >>>> On Sun, 2025-07-20 at 16:56 -0700, James Flowers wrote: >>>>> Fixes an issue where entities are added to the run queue in >>>>> drm_sched_rq_update_fifo_locked after being killed, causing a >>>>> slab-use-after-free error. >>>>> >>>>> Signed-off-by: James Flowers <bold.zone2373@fastmail.com> >>>>> --- >>>>> This issue was detected by syzkaller running on a Steam Deck OLED. >>>>> Unfortunately I don't have a reproducer for it. I've >>>> >>>> Well, now that's kind of an issue – if you don't have a reproducer, how >>>> can you know that your patch is correct? How can we? >>>> >>>> It would certainly be good to know what the fuzz testing framework >>>> does. >>>> >>>>> included the KASAN reports below: >>>> >>>> >>>> Anyways, KASAN reports look interesting. But those might be many >>>> different issues. Again, would be good to know what the fuzzer has been >>>> testing. Can you maybe split this fuzz test into sub-tests? I suspsect >>>> those might be different faults. >>>> >>>> >>>> Anyways, taking a first look… >>>> >>>> > > > [SNIP] > >>>>> >>>>> ================================================================== >>>> >>>> If this is a race, then this patch here is broken, too, because you're >>>> checking the 'stopped' boolean as the callers of that function do, too >>>> – just later. :O >>>> >>>> Could still race, just less likely. >>>> >>>> The proper way to fix it would then be to address the issue where the >>>> locking is supposed to happen. Let's look at, for example, >>>> drm_sched_entity_push_job(): >>>> >>>> >>>> void drm_sched_entity_push_job(struct drm_sched_job *sched_job) >>>> { >>>> (Bla bla bla) >>>> >>>> ………… >>>> >>>> /* first job wakes up scheduler */ >>>> if (first) { >>>> struct drm_gpu_scheduler *sched; >>>> struct drm_sched_rq *rq; >>>> >>>> /* Add the entity to the run queue */ >>>> spin_lock(&entity->lock); >>>> if (entity->stopped) { <---- Aha! >>>> spin_unlock(&entity->lock); >>>> >>>> DRM_ERROR("Trying to push to a killed entity\n"); >>>> return; >>>> } >>>> >>>> rq = entity->rq; >>>> sched = rq->sched; >>>> >>>> spin_lock(&rq->lock); >>>> drm_sched_rq_add_entity(rq, entity); >>>> >>>> if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) >>>> drm_sched_rq_update_fifo_locked(entity, rq, submit_ts); >>>> <---- bumm! >>>> >>>> spin_unlock(&rq->lock); >>>> spin_unlock(&entity->lock); >>>> >>>> But the locks are still being hold. So that "shouldn't be happening"(tm). >>>> >>>> Interesting. AFAICS only drm_sched_entity_kill() and drm_sched_fini() >>>> stop entities. The former holds appropriate locks, but drm_sched_fini() >>>> doesn't. So that looks like a hot candidate to me. Opinions? >>>> >>>> On the other hand, aren't drivers prohibited from calling >>>> drm_sched_entity_push_job() after calling drm_sched_fini()? 
If the >>>> fuzzer does that, then it's not the scheduler's fault. >>>> >>>> Could you test adding spin_lock(&entity->lock) to drm_sched_fini()? >>>> >>>> Would be cool if Tvrtko and Christian take a look. Maybe we even have a >>>> fundamental design issue. >>> >>> It would be nice to have a reproducer and from this thread I did not >>> manage to figure out if the syzkaller snipper James posted was it, or >>> not quite it. >>> >>> In either case, I think one race I see relates to the early exit ! >>> entity->rq check before setting entity->stopped in drm_sched_entity_kill(). >>> >>> If the entity was not submitted at all yet (at the time of process >>> exit / entity kill), entity->stopped will therefore not be set. A >>> parallel job submit can then re-add the entity to the tree, as process >>> exit / file close / entity kill is continuing and is about to kfree the >>> entity (in the case of amdgpu report there are two entities embedded in >>> file_priv). >>> >>> One way to make this more robust is to make the entity->rq check in >>> drm_sched_entity_kill() stronger. Or actually to remove it altogether. >>> But I think it also requires checking for entity->stopped in >>> drm_sched_entity_select_rq() and propagating the error code all the way >>> out from drm_sched_job_arm(). >>> >>> That was entity->stopped is properly serialized and acted upon early >>> enough to avoid dereferencing a freed entity and avoid creating jobs not >>> attached to anything (but only have a warning from push job). >>> >>> Disclaimer I haven't tried to experiment with this yet, so I may be >>> missing something. At least writing a reproducer for the race I >>> described sounds easy so unless someone shouts I am talking nonsense I >>> can do that and also sketch out a fix. *If* the theory will hold water >>> after I write the test case. >> >> Nah I was talking nonsense. Forgot entity->rq is assigned on entity init >> and jobs cannot be created unless it is set. >> >> Okay, I have no theories as to what bug syzkaller found. > > I just was about to answer. > > I agree that the rq check should be fine. > > As you can see in the thread, I suspect that this is a race between > drm_sched_entity_push_job() and drm_sched_fini(). > > See here: > https://lore.kernel.org/dri-devel/20250813085654.102504-2-phasta@kernel.org/ Yeah I read it. Problem with the amdgpu angle and this KASAN report is that to me it looked the UAF is about the two VM update entities embedded in struct file priv. And the schedulers used to initialize those are not torn down until driver unload. So I didn't think syzkaller would have hit that and was looking for alternative ideas. Regards, Tvrtko > I think as long as there's no reproducer there is not much to do for us > here. A long term goal, though, is to enforce the life time rules. > Entities must be torn down before their scheduler. Checking this for > all drivers will be quite some work, though.. > > > P. > > >> >> Regards, >> >> Tvrtko >> >>> >>>> >>>>> } >>>>> /** >>>> >>> >> > ^ permalink raw reply [flat|nested] 17+ messages in thread
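[For reference, a rough sketch of how the two VM-update entities Tvrtko mentions are embedded in amdgpu — abbreviated, field layout approximate: the DRM file private data owns the amdgpu_vm by value, and the VM embeds its scheduler entities, so freeing file_priv frees the entities with it.]

	struct amdgpu_fpriv {
		struct amdgpu_vm	vm;	/* freed together with file_priv */
		/* ... */
	};

	struct amdgpu_vm {
		/* ... */
		/* scheduler entities for page-table updates */
		struct drm_sched_entity	immediate;
		struct drm_sched_entity	delayed;
		/* ... */
	};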
end of thread, other threads:[~2025-08-14 12:17 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-07-20 23:56 [PATCH] drm/sched: Prevent stopped entities from being added to the run queue James Flowers 2025-07-21 7:52 ` Philipp Stanner 2025-07-21 8:16 ` Philipp Stanner 2025-07-21 10:14 ` Danilo Krummrich 2025-07-21 18:07 ` Matthew Brost 2025-07-22 7:37 ` Philipp Stanner 2025-07-22 8:07 ` Matthew Brost 2025-07-22 8:45 ` Matthew Brost 2025-07-23 6:56 ` Philipp Stanner 2025-07-24 4:13 ` Matthew Brost 2025-07-24 4:17 ` Matthew Brost 2025-07-22 20:05 ` James 2025-07-23 14:41 ` Philipp Stanner 2025-08-14 10:42 ` Tvrtko Ursulin 2025-08-14 11:45 ` Tvrtko Ursulin 2025-08-14 11:49 ` Philipp Stanner 2025-08-14 12:17 ` Tvrtko Ursulin