* [syzbot] [net?] [virt?] [kvm?] KASAN: slab-use-after-free Read in vhost_task_fn
@ 2024-04-30 8:25 syzbot
2024-04-30 13:05 ` [PATCH next] vhost_task: after freeing vhost_task it should not be accessed " Edward Adam Davis
2024-05-01 16:12 ` [syzbot] [net?] [virt?] [kvm?] KASAN: slab-use-after-free Read " Michael S. Tsirkin
0 siblings, 2 replies; 14+ messages in thread
From: syzbot @ 2024-04-30 8:25 UTC (permalink / raw)
To: jasowang, kvm, linux-kernel, michael.christie, mst, netdev,
syzkaller-bugs, virtualization
Hello,
syzbot found the following issue on:
HEAD commit: bb7a2467e6be Add linux-next specific files for 20240426
git tree: linux-next
console+strace: https://syzkaller.appspot.com/x/log.txt?x=123bf96b180000
kernel config: https://syzkaller.appspot.com/x/.config?x=5c6a0288262dd108
dashboard link: https://syzkaller.appspot.com/bug?extid=98edc2df894917b3431f
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11c8a4ef180000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=16c30028980000
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/5175af7dda64/disk-bb7a2467.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/70db0462e868/vmlinux-bb7a2467.xz
kernel image: https://storage.googleapis.com/syzbot-assets/3217fb825698/bzImage-bb7a2467.xz
The issue was bisected to:
commit a3df30984f4faf82d63d2a96f8ac773403ce935d
Author: Mike Christie <michael.christie@oracle.com>
Date: Sat Mar 16 00:47:06 2024 +0000
vhost_task: Handle SIGKILL by flushing work and exiting
bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=14423917180000
final oops: https://syzkaller.appspot.com/x/report.txt?x=16423917180000
console output: https://syzkaller.appspot.com/x/log.txt?x=12423917180000
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+98edc2df894917b3431f@syzkaller.appspotmail.com
Fixes: a3df30984f4f ("vhost_task: Handle SIGKILL by flushing work and exiting")
==================================================================
BUG: KASAN: slab-use-after-free in instrument_atomic_read include/linux/instrumented.h:68 [inline]
BUG: KASAN: slab-use-after-free in atomic_long_read include/linux/atomic/atomic-instrumented.h:3188 [inline]
BUG: KASAN: slab-use-after-free in __mutex_unlock_slowpath+0xef/0x750 kernel/locking/mutex.c:921
Read of size 8 at addr ffff88802a9d9080 by task vhost-5103/5104
CPU: 1 PID: 5104 Comm: vhost-5103 Not tainted 6.9.0-rc5-next-20240426-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x169/0x550 mm/kasan/report.c:488
kasan_report+0x143/0x180 mm/kasan/report.c:601
kasan_check_range+0x282/0x290 mm/kasan/generic.c:189
instrument_atomic_read include/linux/instrumented.h:68 [inline]
atomic_long_read include/linux/atomic/atomic-instrumented.h:3188 [inline]
__mutex_unlock_slowpath+0xef/0x750 kernel/locking/mutex.c:921
vhost_task_fn+0x3bc/0x3f0 kernel/vhost_task.c:65
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
Allocated by task 5103:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
poison_kmalloc_redzone mm/kasan/common.c:370 [inline]
__kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:387
kasan_kmalloc include/linux/kasan.h:211 [inline]
kmalloc_trace_noprof+0x19c/0x2b0 mm/slub.c:4146
kmalloc_noprof include/linux/slab.h:660 [inline]
kzalloc_noprof include/linux/slab.h:778 [inline]
vhost_task_create+0x149/0x300 kernel/vhost_task.c:134
vhost_worker_create+0x17b/0x3f0 drivers/vhost/vhost.c:667
vhost_dev_set_owner+0x563/0x940 drivers/vhost/vhost.c:945
vhost_dev_ioctl+0xda/0xda0 drivers/vhost/vhost.c:2108
vhost_vsock_dev_ioctl+0x2bb/0xfa0 drivers/vhost/vsock.c:875
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:907 [inline]
__se_sys_ioctl+0xfc/0x170 fs/ioctl.c:893
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Freed by task 5103:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
poison_slab_object+0xe0/0x150 mm/kasan/common.c:240
__kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
kasan_slab_free include/linux/kasan.h:184 [inline]
slab_free_hook mm/slub.c:2190 [inline]
slab_free mm/slub.c:4430 [inline]
kfree+0x149/0x350 mm/slub.c:4551
vhost_worker_destroy drivers/vhost/vhost.c:629 [inline]
vhost_workers_free drivers/vhost/vhost.c:648 [inline]
vhost_dev_cleanup+0x9b0/0xba0 drivers/vhost/vhost.c:1051
vhost_vsock_dev_release+0x3aa/0x410 drivers/vhost/vsock.c:751
__fput+0x406/0x8b0 fs/file_table.c:422
__do_sys_close fs/open.c:1555 [inline]
__se_sys_close fs/open.c:1540 [inline]
__x64_sys_close+0x7f/0x110 fs/open.c:1540
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
The buggy address belongs to the object at ffff88802a9d9000
which belongs to the cache kmalloc-512 of size 512
The buggy address is located 128 bytes inside of
freed 512-byte region [ffff88802a9d9000, ffff88802a9d9200)
The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x2a9d8
head: order:2 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0xfff80000000040(head|node=0|zone=1|lastcpupid=0xfff)
page_type: 0xffffefff(slab)
raw: 00fff80000000040 ffff888015041c80 ffffea00007e4200 0000000000000002
raw: 0000000000000000 0000000080100010 00000001ffffefff 0000000000000000
head: 00fff80000000040 ffff888015041c80 ffffea00007e4200 0000000000000002
head: 0000000000000000 0000000080100010 00000001ffffefff 0000000000000000
head: 00fff80000000002 ffffea0000aa7601 ffffffffffffffff 0000000000000000
head: 0000000000000004 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 2, migratetype Unmovable, gfp_mask 0xd2040(__GFP_IO|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 4750, tgid 4750 (S40network), ts 45835903170, free_ts 45731176473
set_page_owner include/linux/page_owner.h:32 [inline]
post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1468
prep_new_page mm/page_alloc.c:1476 [inline]
get_page_from_freelist+0x2ce2/0x2d90 mm/page_alloc.c:3438
__alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4696
__alloc_pages_node_noprof include/linux/gfp.h:244 [inline]
alloc_pages_node_noprof include/linux/gfp.h:271 [inline]
alloc_slab_page+0x5f/0x120 mm/slub.c:2259
allocate_slab+0x5a/0x2e0 mm/slub.c:2422
new_slab mm/slub.c:2475 [inline]
___slab_alloc+0xcd1/0x14b0 mm/slub.c:3661
__slab_alloc+0x58/0xa0 mm/slub.c:3751
__slab_alloc_node mm/slub.c:3804 [inline]
slab_alloc_node mm/slub.c:3982 [inline]
__do_kmalloc_node mm/slub.c:4114 [inline]
__kmalloc_noprof+0x25e/0x400 mm/slub.c:4128
kmalloc_noprof include/linux/slab.h:664 [inline]
kzalloc_noprof include/linux/slab.h:778 [inline]
tomoyo_init_log+0x1b3e/0x2050 security/tomoyo/audit.c:275
tomoyo_supervisor+0x38a/0x11f0 security/tomoyo/common.c:2089
tomoyo_audit_path_log security/tomoyo/file.c:168 [inline]
tomoyo_path_permission+0x243/0x360 security/tomoyo/file.c:587
tomoyo_check_open_permission+0x2fb/0x500 security/tomoyo/file.c:777
security_file_open+0x6a/0x730 security/security.c:2962
do_dentry_open+0x36c/0x1720 fs/open.c:942
do_open fs/namei.c:3650 [inline]
path_openat+0x289f/0x3280 fs/namei.c:3807
do_filp_open+0x235/0x490 fs/namei.c:3834
page last free pid 4749 tgid 4749 stack trace:
reset_page_owner include/linux/page_owner.h:25 [inline]
free_pages_prepare mm/page_alloc.c:1088 [inline]
free_unref_page+0xd22/0xea0 mm/page_alloc.c:2601
__slab_free+0x31b/0x3d0 mm/slub.c:4341
qlink_free mm/kasan/quarantine.c:163 [inline]
qlist_free_all+0x9e/0x140 mm/kasan/quarantine.c:179
kasan_quarantine_reduce+0x14f/0x170 mm/kasan/quarantine.c:286
__kasan_slab_alloc+0x23/0x80 mm/kasan/common.c:322
kasan_slab_alloc include/linux/kasan.h:201 [inline]
slab_post_alloc_hook mm/slub.c:3934 [inline]
slab_alloc_node mm/slub.c:3994 [inline]
kmem_cache_alloc_noprof+0x135/0x290 mm/slub.c:4001
getname_flags+0xbd/0x4f0 fs/namei.c:139
vfs_fstatat+0x11c/0x190 fs/stat.c:303
__do_sys_newfstatat fs/stat.c:468 [inline]
__se_sys_newfstatat fs/stat.c:462 [inline]
__x64_sys_newfstatat+0x125/0x1b0 fs/stat.c:462
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Memory state around the buggy address:
ffff88802a9d8f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88802a9d9000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88802a9d9080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88802a9d9100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88802a9d9180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.
If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH next] vhost_task: after freeing vhost_task it should not be accessed in vhost_task_fn
2024-04-30 8:25 [syzbot] [net?] [virt?] [kvm?] KASAN: slab-use-after-free Read in vhost_task_fn syzbot
@ 2024-04-30 13:05 ` Edward Adam Davis
2024-04-30 16:23 ` Mike Christie
2024-05-01 16:12 ` [syzbot] [net?] [virt?] [kvm?] KASAN: slab-use-after-free Read " Michael S. Tsirkin
1 sibling, 1 reply; 14+ messages in thread
From: Edward Adam Davis @ 2024-04-30 13:05 UTC (permalink / raw)
To: syzbot+98edc2df894917b3431f
Cc: jasowang, kvm, linux-kernel, michael.christie, mst, netdev,
syzkaller-bugs, virtualization
[syzbot reported]
BUG: KASAN: slab-use-after-free in instrument_atomic_read include/linux/instrumented.h:68 [inline]
BUG: KASAN: slab-use-after-free in atomic_long_read include/linux/atomic/atomic-instrumented.h:3188 [inline]
BUG: KASAN: slab-use-after-free in __mutex_unlock_slowpath+0xef/0x750 kernel/locking/mutex.c:921
Read of size 8 at addr ffff888023632880 by task vhost-5104/5105
CPU: 0 PID: 5105 Comm: vhost-5104 Not tainted 6.9.0-rc5-next-20240426-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x169/0x550 mm/kasan/report.c:488
kasan_report+0x143/0x180 mm/kasan/report.c:601
kasan_check_range+0x282/0x290 mm/kasan/generic.c:189
instrument_atomic_read include/linux/instrumented.h:68 [inline]
atomic_long_read include/linux/atomic/atomic-instrumented.h:3188 [inline]
__mutex_unlock_slowpath+0xef/0x750 kernel/locking/mutex.c:921
vhost_task_fn+0x3bc/0x3f0 kernel/vhost_task.c:65
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
Allocated by task 5104:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
poison_kmalloc_redzone mm/kasan/common.c:370 [inline]
__kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:387
kasan_kmalloc include/linux/kasan.h:211 [inline]
kmalloc_trace_noprof+0x19c/0x2b0 mm/slub.c:4146
kmalloc_noprof include/linux/slab.h:660 [inline]
kzalloc_noprof include/linux/slab.h:778 [inline]
vhost_task_create+0x149/0x300 kernel/vhost_task.c:134
vhost_worker_create+0x17b/0x3f0 drivers/vhost/vhost.c:667
vhost_dev_set_owner+0x563/0x940 drivers/vhost/vhost.c:945
vhost_dev_ioctl+0xda/0xda0 drivers/vhost/vhost.c:2108
vhost_vsock_dev_ioctl+0x2bb/0xfa0 drivers/vhost/vsock.c:875
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:907 [inline]
__se_sys_ioctl+0xfc/0x170 fs/ioctl.c:893
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Freed by task 5104:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
poison_slab_object+0xe0/0x150 mm/kasan/common.c:240
__kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
kasan_slab_free include/linux/kasan.h:184 [inline]
slab_free_hook mm/slub.c:2190 [inline]
slab_free mm/slub.c:4430 [inline]
kfree+0x149/0x350 mm/slub.c:4551
vhost_worker_destroy drivers/vhost/vhost.c:629 [inline]
vhost_workers_free drivers/vhost/vhost.c:648 [inline]
vhost_dev_cleanup+0x9b0/0xba0 drivers/vhost/vhost.c:1051
vhost_vsock_dev_release+0x3aa/0x410 drivers/vhost/vsock.c:751
__fput+0x406/0x8b0 fs/file_table.c:422
__do_sys_close fs/open.c:1555 [inline]
__se_sys_close fs/open.c:1540 [inline]
__x64_sys_close+0x7f/0x110 fs/open.c:1540
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
[Fix]
Delete the member exit_mutex from the struct vhost_task and replace it with a
declared global static mutex.
Fixes: a3df30984f4f ("vhost_task: Handle SIGKILL by flushing work and exiting")
Reported--by: syzbot+98edc2df894917b3431f@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
---
kernel/vhost_task.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/kernel/vhost_task.c b/kernel/vhost_task.c
index 48c289947b99..375356499867 100644
--- a/kernel/vhost_task.c
+++ b/kernel/vhost_task.c
@@ -20,10 +20,10 @@ struct vhost_task {
struct completion exited;
unsigned long flags;
struct task_struct *task;
- /* serialize SIGKILL and vhost_task_stop calls */
- struct mutex exit_mutex;
};
+static DEFINE_MUTEX(exit_mutex); //serialize SIGKILL and vhost_task_stop calls
+
static int vhost_task_fn(void *data)
{
struct vhost_task *vtsk = data;
@@ -51,7 +51,7 @@ static int vhost_task_fn(void *data)
schedule();
}
- mutex_lock(&vtsk->exit_mutex);
+ mutex_lock(&exit_mutex);
/*
* If a vhost_task_stop and SIGKILL race, we can ignore the SIGKILL.
* When the vhost layer has called vhost_task_stop it's already stopped
@@ -62,7 +62,7 @@ static int vhost_task_fn(void *data)
vtsk->handle_sigkill(vtsk->data);
}
complete(&vtsk->exited);
- mutex_unlock(&vtsk->exit_mutex);
+ mutex_unlock(&exit_mutex);
do_exit(0);
}
@@ -88,12 +88,12 @@ EXPORT_SYMBOL_GPL(vhost_task_wake);
*/
void vhost_task_stop(struct vhost_task *vtsk)
{
- mutex_lock(&vtsk->exit_mutex);
+ mutex_lock(&exit_mutex);
if (!test_bit(VHOST_TASK_FLAGS_KILLED, &vtsk->flags)) {
set_bit(VHOST_TASK_FLAGS_STOP, &vtsk->flags);
vhost_task_wake(vtsk);
}
- mutex_unlock(&vtsk->exit_mutex);
+ mutex_unlock(&exit_mutex);
/*
* Make sure vhost_task_fn is no longer accessing the vhost_task before
@@ -135,7 +135,6 @@ struct vhost_task *vhost_task_create(bool (*fn)(void *),
if (!vtsk)
return NULL;
init_completion(&vtsk->exited);
- mutex_init(&vtsk->exit_mutex);
vtsk->data = arg;
vtsk->fn = fn;
vtsk->handle_sigkill = handle_sigkill;
--
2.43.0
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH next] vhost_task: after freeing vhost_task it should not be accessed in vhost_task_fn
2024-04-30 13:05 ` [PATCH next] vhost_task: after freeing vhost_task it should not be accessed " Edward Adam Davis
@ 2024-04-30 16:23 ` Mike Christie
2024-04-30 18:06 ` Michael S. Tsirkin
2024-05-01 0:15 ` Hillf Danton
0 siblings, 2 replies; 14+ messages in thread
From: Mike Christie @ 2024-04-30 16:23 UTC (permalink / raw)
To: Edward Adam Davis, syzbot+98edc2df894917b3431f
Cc: jasowang, kvm, linux-kernel, mst, netdev, syzkaller-bugs,
virtualization
On 4/30/24 8:05 AM, Edward Adam Davis wrote:
> static int vhost_task_fn(void *data)
> {
> struct vhost_task *vtsk = data;
> @@ -51,7 +51,7 @@ static int vhost_task_fn(void *data)
> schedule();
> }
>
> - mutex_lock(&vtsk->exit_mutex);
> + mutex_lock(&exit_mutex);
> /*
> * If a vhost_task_stop and SIGKILL race, we can ignore the SIGKILL.
> * When the vhost layer has called vhost_task_stop it's already stopped
> @@ -62,7 +62,7 @@ static int vhost_task_fn(void *data)
> vtsk->handle_sigkill(vtsk->data);
> }
> complete(&vtsk->exited);
> - mutex_unlock(&vtsk->exit_mutex);
> + mutex_unlock(&exit_mutex);
>
Edward, thanks for the patch. I think though I just needed to swap the
order of the calls above.
Instead of:
complete(&vtsk->exited);
mutex_unlock(&vtsk->exit_mutex);
it should have been:
mutex_unlock(&vtsk->exit_mutex);
complete(&vtsk->exited);
If my analysis is correct, then Michael do you want me to resubmit a
patch on top of your vhost branch or resubmit the entire patchset?
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH next] vhost_task: after freeing vhost_task it should not be accessed in vhost_task_fn
2024-04-30 16:23 ` Mike Christie
@ 2024-04-30 18:06 ` Michael S. Tsirkin
2024-05-01 0:15 ` Hillf Danton
1 sibling, 0 replies; 14+ messages in thread
From: Michael S. Tsirkin @ 2024-04-30 18:06 UTC (permalink / raw)
To: Mike Christie
Cc: Edward Adam Davis, syzbot+98edc2df894917b3431f, jasowang, kvm,
linux-kernel, netdev, syzkaller-bugs, virtualization
On Tue, Apr 30, 2024 at 11:23:04AM -0500, Mike Christie wrote:
> On 4/30/24 8:05 AM, Edward Adam Davis wrote:
> > static int vhost_task_fn(void *data)
> > {
> > struct vhost_task *vtsk = data;
> > @@ -51,7 +51,7 @@ static int vhost_task_fn(void *data)
> > schedule();
> > }
> >
> > - mutex_lock(&vtsk->exit_mutex);
> > + mutex_lock(&exit_mutex);
> > /*
> > * If a vhost_task_stop and SIGKILL race, we can ignore the SIGKILL.
> > * When the vhost layer has called vhost_task_stop it's already stopped
> > @@ -62,7 +62,7 @@ static int vhost_task_fn(void *data)
> > vtsk->handle_sigkill(vtsk->data);
> > }
> > complete(&vtsk->exited);
> > - mutex_unlock(&vtsk->exit_mutex);
> > + mutex_unlock(&exit_mutex);
> >
>
> Edward, thanks for the patch. I think though I just needed to swap the
> order of the calls above.
>
> Instead of:
>
> complete(&vtsk->exited);
> mutex_unlock(&vtsk->exit_mutex);
>
> it should have been:
>
> mutex_unlock(&vtsk->exit_mutex);
> complete(&vtsk->exited);
>
> If my analysis is correct, then Michael do you want me to resubmit a
> patch on top of your vhost branch or resubmit the entire patchset?
Resubmit all please.
--
MST
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH next] vhost_task: after freeing vhost_task it should not be accessed in vhost_task_fn
2024-04-30 16:23 ` Mike Christie
2024-04-30 18:06 ` Michael S. Tsirkin
@ 2024-05-01 0:15 ` Hillf Danton
2024-05-01 1:01 ` Mike Christie
2024-05-01 6:01 ` Michael S. Tsirkin
1 sibling, 2 replies; 14+ messages in thread
From: Hillf Danton @ 2024-05-01 0:15 UTC (permalink / raw)
To: Mike Christie
Cc: Michael S. Tsirkin, Edward Adam Davis,
syzbot+98edc2df894917b3431f, jasowang, kvm, linux-kernel, netdev,
syzkaller-bugs, virtualization
On Tue, Apr 30, 2024 at 11:23:04AM -0500, Mike Christie wrote:
> On 4/30/24 8:05 AM, Edward Adam Davis wrote:
> > static int vhost_task_fn(void *data)
> > {
> > struct vhost_task *vtsk = data;
> > @@ -51,7 +51,7 @@ static int vhost_task_fn(void *data)
> > schedule();
> > }
> >
> > - mutex_lock(&vtsk->exit_mutex);
> > + mutex_lock(&exit_mutex);
> > /*
> > * If a vhost_task_stop and SIGKILL race, we can ignore the SIGKILL.
> > * When the vhost layer has called vhost_task_stop it's already stopped
> > @@ -62,7 +62,7 @@ static int vhost_task_fn(void *data)
> > vtsk->handle_sigkill(vtsk->data);
> > }
> > complete(&vtsk->exited);
> > - mutex_unlock(&vtsk->exit_mutex);
> > + mutex_unlock(&exit_mutex);
> >
>
> Edward, thanks for the patch. I think though I just needed to swap the
> order of the calls above.
>
> Instead of:
>
> complete(&vtsk->exited);
> mutex_unlock(&vtsk->exit_mutex);
>
> it should have been:
>
> mutex_unlock(&vtsk->exit_mutex);
> complete(&vtsk->exited);
JFYI Edward did it [1]
[1] https://lore.kernel.org/lkml/tencent_546DA49414E876EEBECF2C78D26D242EE50A@qq.com/
>
> If my analysis is correct, then Michael do you want me to resubmit a
> patch on top of your vhost branch or resubmit the entire patchset?
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH next] vhost_task: after freeing vhost_task it should not be accessed in vhost_task_fn
2024-05-01 0:15 ` Hillf Danton
@ 2024-05-01 1:01 ` Mike Christie
2024-05-01 5:52 ` Michael S. Tsirkin
2024-05-01 6:01 ` Michael S. Tsirkin
1 sibling, 1 reply; 14+ messages in thread
From: Mike Christie @ 2024-05-01 1:01 UTC (permalink / raw)
To: Hillf Danton
Cc: Michael S. Tsirkin, Edward Adam Davis,
syzbot+98edc2df894917b3431f, jasowang, kvm, linux-kernel, netdev,
syzkaller-bugs, virtualization
On 4/30/24 7:15 PM, Hillf Danton wrote:
> On Tue, Apr 30, 2024 at 11:23:04AM -0500, Mike Christie wrote:
>> On 4/30/24 8:05 AM, Edward Adam Davis wrote:
>>> static int vhost_task_fn(void *data)
>>> {
>>> struct vhost_task *vtsk = data;
>>> @@ -51,7 +51,7 @@ static int vhost_task_fn(void *data)
>>> schedule();
>>> }
>>>
>>> - mutex_lock(&vtsk->exit_mutex);
>>> + mutex_lock(&exit_mutex);
>>> /*
>>> * If a vhost_task_stop and SIGKILL race, we can ignore the SIGKILL.
>>> * When the vhost layer has called vhost_task_stop it's already stopped
>>> @@ -62,7 +62,7 @@ static int vhost_task_fn(void *data)
>>> vtsk->handle_sigkill(vtsk->data);
>>> }
>>> complete(&vtsk->exited);
>>> - mutex_unlock(&vtsk->exit_mutex);
>>> + mutex_unlock(&exit_mutex);
>>>
>>
>> Edward, thanks for the patch. I think though I just needed to swap the
>> order of the calls above.
>>
>> Instead of:
>>
>> complete(&vtsk->exited);
>> mutex_unlock(&vtsk->exit_mutex);
>>
>> it should have been:
>>
>> mutex_unlock(&vtsk->exit_mutex);
>> complete(&vtsk->exited);
>
> JFYI Edward did it [1]
>
> [1] https://lore.kernel.org/lkml/tencent_546DA49414E876EEBECF2C78D26D242EE50A@qq.com/
Thanks.
I tested the code with that change and it no longer triggers the UAF.
I've fixed up the original patch that had the bug and am going to
resubmit the patchset like how Michael requested.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH next] vhost_task: after freeing vhost_task it should not be accessed in vhost_task_fn
2024-05-01 1:01 ` Mike Christie
@ 2024-05-01 5:52 ` Michael S. Tsirkin
0 siblings, 0 replies; 14+ messages in thread
From: Michael S. Tsirkin @ 2024-05-01 5:52 UTC (permalink / raw)
To: Mike Christie
Cc: Hillf Danton, Edward Adam Davis, syzbot+98edc2df894917b3431f,
jasowang, kvm, linux-kernel, netdev, syzkaller-bugs,
virtualization
On Tue, Apr 30, 2024 at 08:01:11PM -0500, Mike Christie wrote:
> On 4/30/24 7:15 PM, Hillf Danton wrote:
> > On Tue, Apr 30, 2024 at 11:23:04AM -0500, Mike Christie wrote:
> >> On 4/30/24 8:05 AM, Edward Adam Davis wrote:
> >>> static int vhost_task_fn(void *data)
> >>> {
> >>> struct vhost_task *vtsk = data;
> >>> @@ -51,7 +51,7 @@ static int vhost_task_fn(void *data)
> >>> schedule();
> >>> }
> >>>
> >>> - mutex_lock(&vtsk->exit_mutex);
> >>> + mutex_lock(&exit_mutex);
> >>> /*
> >>> * If a vhost_task_stop and SIGKILL race, we can ignore the SIGKILL.
> >>> * When the vhost layer has called vhost_task_stop it's already stopped
> >>> @@ -62,7 +62,7 @@ static int vhost_task_fn(void *data)
> >>> vtsk->handle_sigkill(vtsk->data);
> >>> }
> >>> complete(&vtsk->exited);
> >>> - mutex_unlock(&vtsk->exit_mutex);
> >>> + mutex_unlock(&exit_mutex);
> >>>
> >>
> >> Edward, thanks for the patch. I think though I just needed to swap the
> >> order of the calls above.
> >>
> >> Instead of:
> >>
> >> complete(&vtsk->exited);
> >> mutex_unlock(&vtsk->exit_mutex);
> >>
> >> it should have been:
> >>
> >> mutex_unlock(&vtsk->exit_mutex);
> >> complete(&vtsk->exited);
> >
> > JFYI Edward did it [1]
> >
> > [1] https://lore.kernel.org/lkml/tencent_546DA49414E876EEBECF2C78D26D242EE50A@qq.com/
>
> Thanks.
>
> I tested the code with that change and it no longer triggers the UAF.
Weird but syzcaller said that yes it triggers.
Compare
000000000000dcc0ca06174e65d4@google.com
which tests the order
mutex_unlock(&vtsk->exit_mutex);
complete(&vtsk->exited);
that you like and says it triggers
and
00000000000097bda906175219bc@google.com
which says it does not trigger.
Whatever you do please send it to syzcaller in the original
thread and then when you post please include the syzcaller report.
Given this gets confusing I'm fine with just a fixup patch,
and note in the commit log where I should squash it.
> I've fixed up the original patch that had the bug and am going to
> resubmit the patchset like how Michael requested.
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH next] vhost_task: after freeing vhost_task it should not be accessed in vhost_task_fn
2024-05-01 0:15 ` Hillf Danton
2024-05-01 1:01 ` Mike Christie
@ 2024-05-01 6:01 ` Michael S. Tsirkin
2024-05-01 7:50 ` Hillf Danton
1 sibling, 1 reply; 14+ messages in thread
From: Michael S. Tsirkin @ 2024-05-01 6:01 UTC (permalink / raw)
To: Hillf Danton
Cc: Mike Christie, Edward Adam Davis, syzbot+98edc2df894917b3431f,
jasowang, kvm, linux-kernel, netdev, syzkaller-bugs,
virtualization
On Wed, May 01, 2024 at 08:15:44AM +0800, Hillf Danton wrote:
> On Tue, Apr 30, 2024 at 11:23:04AM -0500, Mike Christie wrote:
> > On 4/30/24 8:05 AM, Edward Adam Davis wrote:
> > > static int vhost_task_fn(void *data)
> > > {
> > > struct vhost_task *vtsk = data;
> > > @@ -51,7 +51,7 @@ static int vhost_task_fn(void *data)
> > > schedule();
> > > }
> > >
> > > - mutex_lock(&vtsk->exit_mutex);
> > > + mutex_lock(&exit_mutex);
> > > /*
> > > * If a vhost_task_stop and SIGKILL race, we can ignore the SIGKILL.
> > > * When the vhost layer has called vhost_task_stop it's already stopped
> > > @@ -62,7 +62,7 @@ static int vhost_task_fn(void *data)
> > > vtsk->handle_sigkill(vtsk->data);
> > > }
> > > complete(&vtsk->exited);
> > > - mutex_unlock(&vtsk->exit_mutex);
> > > + mutex_unlock(&exit_mutex);
> > >
> >
> > Edward, thanks for the patch. I think though I just needed to swap the
> > order of the calls above.
> >
> > Instead of:
> >
> > complete(&vtsk->exited);
> > mutex_unlock(&vtsk->exit_mutex);
> >
> > it should have been:
> >
> > mutex_unlock(&vtsk->exit_mutex);
> > complete(&vtsk->exited);
>
> JFYI Edward did it [1]
>
> [1] https://lore.kernel.org/lkml/tencent_546DA49414E876EEBECF2C78D26D242EE50A@qq.com/
and then it failed testing.
> >
> > If my analysis is correct, then Michael do you want me to resubmit a
> > patch on top of your vhost branch or resubmit the entire patchset?
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH next] vhost_task: after freeing vhost_task it should not be accessed in vhost_task_fn
2024-05-01 6:01 ` Michael S. Tsirkin
@ 2024-05-01 7:50 ` Hillf Danton
2024-05-01 15:57 ` Mike Christie
0 siblings, 1 reply; 14+ messages in thread
From: Hillf Danton @ 2024-05-01 7:50 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Mike Christie, Edward Adam Davis, syzbot+98edc2df894917b3431f,
jasowang, kvm, linux-kernel, netdev, syzkaller-bugs,
virtualization
On Wed, 1 May 2024 02:01:20 -0400 Michael S. Tsirkin <mst@redhat.com>
>
> and then it failed testing.
>
So did my patch [1] but then the reason was spotted [2,3]
[1] https://lore.kernel.org/lkml/20240430110209.4310-1-hdanton@sina.com/
[2] https://lore.kernel.org/lkml/20240430225005.4368-1-hdanton@sina.com/
[3] https://lore.kernel.org/lkml/000000000000a7f8470617589ff2@google.com/
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH next] vhost_task: after freeing vhost_task it should not be accessed in vhost_task_fn
2024-05-01 7:50 ` Hillf Danton
@ 2024-05-01 15:57 ` Mike Christie
2024-05-01 16:04 ` Michael S. Tsirkin
2024-05-01 16:15 ` Michael S. Tsirkin
0 siblings, 2 replies; 14+ messages in thread
From: Mike Christie @ 2024-05-01 15:57 UTC (permalink / raw)
To: Hillf Danton, Michael S. Tsirkin
Cc: Edward Adam Davis, syzbot+98edc2df894917b3431f, jasowang, kvm,
linux-kernel, netdev, syzkaller-bugs, virtualization
On 5/1/24 2:50 AM, Hillf Danton wrote:
> On Wed, 1 May 2024 02:01:20 -0400 Michael S. Tsirkin <mst@redhat.com>
>>
>> and then it failed testing.
>>
> So did my patch [1] but then the reason was spotted [2,3]
>
> [1] https://lore.kernel.org/lkml/20240430110209.4310-1-hdanton@sina.com/
> [2] https://lore.kernel.org/lkml/20240430225005.4368-1-hdanton@sina.com/
> [3] https://lore.kernel.org/lkml/000000000000a7f8470617589ff2@google.com/
Just to make sure I understand the conclusion.
Edward's patch that just swaps the order of the calls:
https://lore.kernel.org/lkml/tencent_546DA49414E876EEBECF2C78D26D242EE50A@qq.com/
fixes the UAF. I tested the same in my setup. However, when you guys tested it
with sysbot, it also triggered a softirq/RCU warning.
The softirq/RCU part of the issue is fixed with this commit:
https://lore.kernel.org/all/20240427102808.29356-1-qiang.zhang1211@gmail.com/
commit 1dd1eff161bd55968d3d46bc36def62d71fb4785
Author: Zqiang <qiang.zhang1211@gmail.com>
Date: Sat Apr 27 18:28:08 2024 +0800
softirq: Fix suspicious RCU usage in __do_softirq()
The problem was that I was testing with -next master which has that patch.
It looks like you guys were testing against bb7a2467e6be which didn't have
the patch, and so that's why you guys still hit the softirq/RCU issue. Later
when you added that patch to your patch, it worked with syzbot.
So is it safe to assume that the softirq/RCU patch above will be upstream
when the vhost changes go in or is there a tag I need to add to my patches?
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH next] vhost_task: after freeing vhost_task it should not be accessed in vhost_task_fn
2024-05-01 15:57 ` Mike Christie
@ 2024-05-01 16:04 ` Michael S. Tsirkin
2024-05-01 16:15 ` Michael S. Tsirkin
1 sibling, 0 replies; 14+ messages in thread
From: Michael S. Tsirkin @ 2024-05-01 16:04 UTC (permalink / raw)
To: Mike Christie
Cc: Hillf Danton, Edward Adam Davis, syzbot+98edc2df894917b3431f,
jasowang, kvm, linux-kernel, netdev, syzkaller-bugs,
virtualization
On Wed, May 01, 2024 at 10:57:38AM -0500, Mike Christie wrote:
> On 5/1/24 2:50 AM, Hillf Danton wrote:
> > On Wed, 1 May 2024 02:01:20 -0400 Michael S. Tsirkin <mst@redhat.com>
> >>
> >> and then it failed testing.
> >>
> > So did my patch [1] but then the reason was spotted [2,3]
> >
> > [1] https://lore.kernel.org/lkml/20240430110209.4310-1-hdanton@sina.com/
> > [2] https://lore.kernel.org/lkml/20240430225005.4368-1-hdanton@sina.com/
> > [3] https://lore.kernel.org/lkml/000000000000a7f8470617589ff2@google.com/
>
> Just to make sure I understand the conclusion.
>
> Edward's patch that just swaps the order of the calls:
>
> https://lore.kernel.org/lkml/tencent_546DA49414E876EEBECF2C78D26D242EE50A@qq.com/
>
> fixes the UAF. I tested the same in my setup. However, when you guys tested it
> with sysbot, it also triggered a softirq/RCU warning.
>
> The softirq/RCU part of the issue is fixed with this commit:
>
> https://lore.kernel.org/all/20240427102808.29356-1-qiang.zhang1211@gmail.com/
>
> commit 1dd1eff161bd55968d3d46bc36def62d71fb4785
> Author: Zqiang <qiang.zhang1211@gmail.com>
> Date: Sat Apr 27 18:28:08 2024 +0800
>
> softirq: Fix suspicious RCU usage in __do_softirq()
>
> The problem was that I was testing with -next master which has that patch.
> It looks like you guys were testing against bb7a2467e6be which didn't have
> the patch, and so that's why you guys still hit the softirq/RCU issue. Later
> when you added that patch to your patch, it worked with syzbot.
>
> So is it safe to assume that the softirq/RCU patch above will be upstream
> when the vhost changes go in or is there a tag I need to add to my patches?
Two points:
- I do not want bisect broken. If you depend on this patch either I pick
it too before your patch, or we defer until 1dd1eff161bd55968d3d46bc36def62d71fb4785
is merged. You can also ask for that patch to be merged in this cycle.
- Do not assume - pls push somewhere a hash based on vhost that syzbot can test
and confirm all is well. Thanks!
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [syzbot] [net?] [virt?] [kvm?] KASAN: slab-use-after-free Read in vhost_task_fn
2024-04-30 8:25 [syzbot] [net?] [virt?] [kvm?] KASAN: slab-use-after-free Read in vhost_task_fn syzbot
2024-04-30 13:05 ` [PATCH next] vhost_task: after freeing vhost_task it should not be accessed " Edward Adam Davis
@ 2024-05-01 16:12 ` Michael S. Tsirkin
2024-05-01 16:56 ` syzbot
1 sibling, 1 reply; 14+ messages in thread
From: Michael S. Tsirkin @ 2024-05-01 16:12 UTC (permalink / raw)
To: syzbot
Cc: jasowang, kvm, linux-kernel, michael.christie, netdev,
syzkaller-bugs, virtualization
#syz test https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git f138e94c1f0dbeae721917694fb2203446a68ea9
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH next] vhost_task: after freeing vhost_task it should not be accessed in vhost_task_fn
2024-05-01 15:57 ` Mike Christie
2024-05-01 16:04 ` Michael S. Tsirkin
@ 2024-05-01 16:15 ` Michael S. Tsirkin
1 sibling, 0 replies; 14+ messages in thread
From: Michael S. Tsirkin @ 2024-05-01 16:15 UTC (permalink / raw)
To: Mike Christie
Cc: Hillf Danton, Edward Adam Davis, syzbot+98edc2df894917b3431f,
jasowang, kvm, linux-kernel, netdev, syzkaller-bugs,
virtualization
On Wed, May 01, 2024 at 10:57:38AM -0500, Mike Christie wrote:
> On 5/1/24 2:50 AM, Hillf Danton wrote:
> > On Wed, 1 May 2024 02:01:20 -0400 Michael S. Tsirkin <mst@redhat.com>
> >>
> >> and then it failed testing.
> >>
> > So did my patch [1] but then the reason was spotted [2,3]
> >
> > [1] https://lore.kernel.org/lkml/20240430110209.4310-1-hdanton@sina.com/
> > [2] https://lore.kernel.org/lkml/20240430225005.4368-1-hdanton@sina.com/
> > [3] https://lore.kernel.org/lkml/000000000000a7f8470617589ff2@google.com/
>
> Just to make sure I understand the conclusion.
>
> Edward's patch that just swaps the order of the calls:
>
> https://lore.kernel.org/lkml/tencent_546DA49414E876EEBECF2C78D26D242EE50A@qq.com/
>
> fixes the UAF. I tested the same in my setup. However, when you guys tested it
> with sysbot, it also triggered a softirq/RCU warning.
>
> The softirq/RCU part of the issue is fixed with this commit:
>
> https://lore.kernel.org/all/20240427102808.29356-1-qiang.zhang1211@gmail.com/
>
> commit 1dd1eff161bd55968d3d46bc36def62d71fb4785
> Author: Zqiang <qiang.zhang1211@gmail.com>
> Date: Sat Apr 27 18:28:08 2024 +0800
>
> softirq: Fix suspicious RCU usage in __do_softirq()
>
> The problem was that I was testing with -next master which has that patch.
> It looks like you guys were testing against bb7a2467e6be which didn't have
> the patch, and so that's why you guys still hit the softirq/RCU issue. Later
> when you added that patch to your patch, it worked with syzbot.
>
> So is it safe to assume that the softirq/RCU patch above will be upstream
> when the vhost changes go in or is there a tag I need to add to my patches?
That patch is upstream now. I rebased and asked syzbot to test
https://lore.kernel.org/lkml/tencent_546DA49414E876EEBECF2C78D26D242EE50A@qq.com/
on top.
If that passes I will squash.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [syzbot] [net?] [virt?] [kvm?] KASAN: slab-use-after-free Read in vhost_task_fn
2024-05-01 16:12 ` [syzbot] [net?] [virt?] [kvm?] KASAN: slab-use-after-free Read " Michael S. Tsirkin
@ 2024-05-01 16:56 ` syzbot
0 siblings, 0 replies; 14+ messages in thread
From: syzbot @ 2024-05-01 16:56 UTC (permalink / raw)
To: jasowang, kvm, linux-kernel, michael.christie, mst, netdev,
syzkaller-bugs, virtualization
Hello,
syzbot has tested the proposed patch and the reproducer did not trigger any issue:
Reported-and-tested-by: syzbot+98edc2df894917b3431f@syzkaller.appspotmail.com
Tested on:
commit: f138e94c KASAN: slab-use-after-free Read in vhost_task..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git
console output: https://syzkaller.appspot.com/x/log.txt?x=10a152a7180000
kernel config: https://syzkaller.appspot.com/x/.config?x=3714fc09f933e505
dashboard link: https://syzkaller.appspot.com/bug?extid=98edc2df894917b3431f
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
Note: no patches were applied.
Note: testing is done by a robot and is best-effort only.
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2024-05-01 16:56 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-04-30 8:25 [syzbot] [net?] [virt?] [kvm?] KASAN: slab-use-after-free Read in vhost_task_fn syzbot
2024-04-30 13:05 ` [PATCH next] vhost_task: after freeing vhost_task it should not be accessed " Edward Adam Davis
2024-04-30 16:23 ` Mike Christie
2024-04-30 18:06 ` Michael S. Tsirkin
2024-05-01 0:15 ` Hillf Danton
2024-05-01 1:01 ` Mike Christie
2024-05-01 5:52 ` Michael S. Tsirkin
2024-05-01 6:01 ` Michael S. Tsirkin
2024-05-01 7:50 ` Hillf Danton
2024-05-01 15:57 ` Mike Christie
2024-05-01 16:04 ` Michael S. Tsirkin
2024-05-01 16:15 ` Michael S. Tsirkin
2024-05-01 16:12 ` [syzbot] [net?] [virt?] [kvm?] KASAN: slab-use-after-free Read " Michael S. Tsirkin
2024-05-01 16:56 ` syzbot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).