* [Syzkaller & bisect] There is general protection fault in iommu_device_unlink in upstream patch
@ 2023-08-01 13:46 Pengfei Xu
2023-08-02 7:53 ` Baolu Lu
2023-08-03 0:21 ` Jason Gunthorpe
0 siblings, 2 replies; 5+ messages in thread
From: Pengfei Xu @ 2023-08-01 13:46 UTC (permalink / raw)
To: jgg
Cc: iommu, baolu.lu, kevin.tian, nicolinc, heng.su, rafael.j.wysocki,
lenb, lkp
Hi Jason,
Greetings!
We tested the Intel internal kernel and found a general protection fault in
iommu_device_unlink. Bisection identified the following commit as the cause:
14891af3799e iommu: Move the iommu driver sysfs setup into iommu_init/deinit_device()
The above commit is the same as the following patch:
https://lore.kernel.org/linux-iommu/6-v3-328044aa278c+45e49-iommu_probe_jgg@nvidia.com/
All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/230801_150257_iommu_device_unlink
Reproducer code: https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/repro.c
Kconfig: https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/kconfig_origin
repro.prog: https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/repro.prog
repro.report: https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/repro.report
Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/bisect_info.log
Issue dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/a3fe9a2e1692a413d887a6a0f1184c26481d6a2b_dmesg.log
"
[ 14.525705] general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 14.526161] KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037]
[ 14.526438] CPU: 1 PID: 715 Comm: repro Not tainted 6.5.0-rc4-a3fe9a2e1692+ #1
[ 14.526706] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 14.527121] RIP: 0010:sysfs_remove_link_from_group+0x37/0x90
[ 14.527345] Code: 41 56 49 89 d6 41 55 49 89 f5 41 54 49 89 fc e8 4f 1a 6c ff 49 8d 7c 24 30 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 75 42 49 8b 7c 24 30 31 d2 4c 89 ee e8 54 e2 fe ff 49
[ 14.527994] RSP: 0018:ffff88802457fba8 EFLAGS: 00010206
[ 14.528201] RAX: dffffc0000000000 RBX: ffff888014f9b000 RCX: 0000000000000000
[ 14.528456] RDX: 0000000000000006 RSI: ffffffff81f39941 RDI: 0000000000000030
[ 14.528710] RBP: ffff88802457fbc0 R08: 0000000000000001 R09: ffffed100155b11b
[ 14.528964] R10: ffff88800aad88df R11: 0000000000000001 R12: 0000000000000000
[ 14.529222] R13: ffffffff85cc0180 R14: ffff8880109b6ac0 R15: ffffffff87070ae0
[ 14.529480] FS: 00007f354bf75600(0000) GS:ffff88806cb00000(0000) knlGS:0000000000000000
[ 14.529767] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 14.529974] CR2: 000055bf0ca7f990 CR3: 00000000135d2000 CR4: 0000000000750ee0
[ 14.530233] PKRU: 55555554
[ 14.530338] Call Trace:
[ 14.530433] <TASK>
[ 14.530517] ? show_regs+0x6e/0x80
[ 14.530658] ? die_addr+0x45/0xb0
[ 14.530797] ? exc_general_protection+0x159/0x250
[ 14.530988] ? asm_exc_general_protection+0x2b/0x30
[ 14.531178] ? sysfs_remove_link_from_group+0x21/0x90
[ 14.531365] ? sysfs_remove_link_from_group+0x37/0x90
[ 14.531558] ? sysfs_remove_link_from_group+0x21/0x90
[ 14.531746] iommu_device_unlink+0x85/0xd0
[ 14.531908] iommu_deinit_device+0x11f/0x4c0
[ 14.532075] __iommu_group_remove_device+0x26b/0x300
[ 14.532262] iommu_group_remove_device+0x8a/0xb0
[ 14.532447] iommufd_test+0x1f1a/0x2d10
[ 14.532593] ? __pfx_lock_release+0x10/0x10
[ 14.532752] ? __pfx_iommufd_test+0x10/0x10
[ 14.532908] ? __sanitizer_cov_trace_const_cmp8+0x1c/0x30
[ 14.533108] iommufd_fops_ioctl+0x377/0x510
[ 14.533265] ? __pfx_iommufd_fops_ioctl+0x10/0x10
[ 14.533438] ? trace_hardirqs_on+0x26/0x120
[ 14.533602] ? seqcount_lockdep_reader_access.constprop.0+0xc0/0xd0
[ 14.533832] ? __sanitizer_cov_trace_cmp4+0x1a/0x20
[ 14.534013] ? __sanitizer_cov_trace_const_cmp4+0x1a/0x20
[ 14.534208] ? security_file_ioctl+0x9d/0xd0
[ 14.534373] ? __pfx_iommufd_fops_ioctl+0x10/0x10
[ 14.534546] __x64_sys_ioctl+0x1b9/0x230
[ 14.534696] do_syscall_64+0x3c/0x90
[ 14.534836] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 14.535019] RIP: 0033:0x7f354bc3ee5d
[ 14.535153] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 93 af 1b 00 f7 d8 64 89 01 48
[ 14.535792] RSP: 002b:00007ffe42dd02d8 EFLAGS: 00000213 ORIG_RAX: 0000000000000010
[ 14.536061] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f354bc3ee5d
[ 14.536310] RDX: 0000000020000180 RSI: 0000000000003ba0 RDI: 0000000000000003
[ 14.536559] RBP: 00007ffe42dd02f0 R08: 0000000000000800 R09: 0000000000000800
[ 14.536809] R10: 0000000000000000 R11: 0000000000000213 R12: 00007ffe42dd0408
[ 14.537065] R13: 0000000000401136 R14: 0000000000403e08 R15: 00007f354bfb6000
[ 14.537320] </TASK>
[ 14.537403] Modules linked in:
[ 14.537544] ---[ end trace 0000000000000000 ]---
"
If this report is useful, please add a Reported-by tag to the updated patch.
Thanks!
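For readers decoding the first two lines of the log above: in KASAN's generic mode each shadow byte covers 8 bytes of memory, and the shadow address is (addr >> 3) plus a fixed offset. The following is a minimal sketch of the reverse mapping, assuming the x86_64 default shadow offset 0xdffffc0000000000 (the value in RAX in the dump) and bash's wrap-around 64-bit arithmetic. The function name kasan_range is only illustrative.

```shell
#!/usr/bin/env bash
# Assumption: KASAN shadow offset for x86_64 (matches RAX in the dump).
KASAN_SHADOW_OFFSET=0xdffffc0000000000

# Map a faulting shadow address back to the 8-byte range it describes.
# Relies on bash's 64-bit wrap-around integer arithmetic.
kasan_range() {
    local fault=$1
    local start=$(( (fault - KASAN_SHADOW_OFFSET) << 3 ))
    printf '0x%x-0x%x\n' "$start" "$(( start + 7 ))"
}

kasan_range 0xdffffc0000000006
```

This prints 0x30-0x37, which matches the range KASAN reports and the value in RDI. That is consistent with a read at offset 0x30 from a NULL pointer (R12 is 0 in the dump).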
---
If you don't need the following environment to reproduce the problem or if you
already have one, please ignore the following information.
How to reproduce:
git clone https://gitlab.com/xupengfe/repro_vm_env.git
cd repro_vm_env
tar -xvf repro_vm_env.tar.gz
cd repro_vm_env; ./start3.sh // it needs qemu-system-x86_64 and I used v7.1.0
// start3.sh will load bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 v6.2-rc5 kernel
// You could change the bzImage_xxx as you want
// You may need to remove the line "-drive if=pflash,format=raw,readonly=on,file=./OVMF_CODE.fd \" for a different QEMU version
Use the command below to log in; there is no password for root.
ssh -p 10023 root@localhost
After logging in to the VM (virtual machine), transfer the reproducer binary
to the VM as shown below and reproduce the problem there:
gcc -pthread -o repro repro.c
scp -P 10023 repro root@localhost:/root/
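The build, copy and run steps above can be sketched as one script. This is only a sketch under stated assumptions: the VM started by start3.sh is already booting, repro.c is in the current directory, and the binary is run as /root/repro in the guest. The helper names retry and run_repro are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command up to TRIES times, sleeping DELAY
# seconds after each failed attempt. Returns 0 on the first success,
# 1 if every attempt fails.
retry() {
    local tries=$1 delay=$2 i
    shift 2
    for (( i = 1; i <= tries; i++ )); do
        "$@" && return 0
        sleep "$delay"
    done
    return 1
}

# Assumptions: the guest listens for ssh on port 10023 (as in the steps
# above) and root needs no password.
run_repro() {
    local ssh_opts=(-p 10023 -o ConnectTimeout=5)
    gcc -pthread -o repro repro.c || return 1
    # Wait until the guest accepts ssh connections.
    retry 30 2 ssh "${ssh_opts[@]}" root@localhost true || return 1
    scp -P 10023 repro root@localhost:/root/ || return 1
    ssh "${ssh_opts[@]}" root@localhost /root/repro
}
```

Call run_repro once the VM has been started. The retry loop only avoids a failed scp while the guest is still booting; the retry count and delay are arbitrary.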
Build the bzImage for the target kernel:
Copy the target kconfig to kernel_src/.config, then run:
make olddefconfig
make -jx bzImage // x should be less than or equal to the number of CPUs on your machine
Point start3.sh at the new bzImage file to load the target kernel in the VM.
Tips:
If you already have qemu-system-x86_64, please ignore the information below.
To install QEMU v7.1.0:
git clone https://github.com/qemu/qemu.git
cd qemu
git checkout -f v7.1.0
mkdir build
cd build
yum install -y ninja-build.x86_64
yum -y install libslirp-devel.x86_64
../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl --enable-usb-redir --enable-slirp
make
make install
Best Regards,
Thanks!
* Re: [Syzkaller & bisect] There is general protection fault in iommu_device_unlink in upstream patch
From: Baolu Lu @ 2023-08-02 7:53 UTC (permalink / raw)
To: Pengfei Xu, jgg
Cc: baolu.lu, iommu, kevin.tian, nicolinc, heng.su, rafael.j.wysocki,
lenb, lkp
On 2023/8/1 21:46, Pengfei Xu wrote:
> Hi Jason,
>
> Greeting!
>
> We tested the intel internal kernel and found that:
> There was general protection fault in iommu_device_unlink issue in kernel
> and found the problem commit:
> 14891af3799e iommu: Move the iommu driver sysfs setup into iommu_init/deinit_device()
>
> Above commit is same as following patch:
> https://lore.kernel.org/linux-iommu/6-v3-328044aa278c+45e49-iommu_probe_jgg@nvidia.com/
>
> All detailed info:https://github.com/xupengfe/syzkaller_logs/tree/main/230801_150257_iommu_device_unlink
> Reproduced code:https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/repro.c
> Kconfig:https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/kconfig_origin
> repro.prog:https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/repro.prog
> repro.report:https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/repro.report
> Bisect info:https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/bisect_info.log
> Issue dmesg:https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/a3fe9a2e1692a413d887a6a0f1184c26481d6a2b_dmesg.log
>
> "
> [ 14.525705] general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] PREEMPT SMP KASAN NOPTI
> [ 14.526161] KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037]
> [ 14.526438] CPU: 1 PID: 715 Comm: repro Not tainted 6.5.0-rc4-a3fe9a2e1692+ #1
> [ 14.526706] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [ 14.527121] RIP: 0010:sysfs_remove_link_from_group+0x37/0x90
> [ 14.527345] Code: 41 56 49 89 d6 41 55 49 89 f5 41 54 49 89 fc e8 4f 1a 6c ff 49 8d 7c 24 30 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 75 42 49 8b 7c 24 30 31 d2 4c 89 ee e8 54 e2 fe ff 49
> [ 14.527994] RSP: 0018:ffff88802457fba8 EFLAGS: 00010206
> [ 14.528201] RAX: dffffc0000000000 RBX: ffff888014f9b000 RCX: 0000000000000000
> [ 14.528456] RDX: 0000000000000006 RSI: ffffffff81f39941 RDI: 0000000000000030
> [ 14.528710] RBP: ffff88802457fbc0 R08: 0000000000000001 R09: ffffed100155b11b
> [ 14.528964] R10: ffff88800aad88df R11: 0000000000000001 R12: 0000000000000000
> [ 14.529222] R13: ffffffff85cc0180 R14: ffff8880109b6ac0 R15: ffffffff87070ae0
> [ 14.529480] FS: 00007f354bf75600(0000) GS:ffff88806cb00000(0000) knlGS:0000000000000000
> [ 14.529767] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 14.529974] CR2: 000055bf0ca7f990 CR3: 00000000135d2000 CR4: 0000000000750ee0
> [ 14.530233] PKRU: 55555554
> [ 14.530338] Call Trace:
> [ 14.530433] <TASK>
> [ 14.530517] ? show_regs+0x6e/0x80
> [ 14.530658] ? die_addr+0x45/0xb0
> [ 14.530797] ? exc_general_protection+0x159/0x250
> [ 14.530988] ? asm_exc_general_protection+0x2b/0x30
> [ 14.531178] ? sysfs_remove_link_from_group+0x21/0x90
> [ 14.531365] ? sysfs_remove_link_from_group+0x37/0x90
> [ 14.531558] ? sysfs_remove_link_from_group+0x21/0x90
> [ 14.531746] iommu_device_unlink+0x85/0xd0
> [ 14.531908] iommu_deinit_device+0x11f/0x4c0
> [ 14.532075] __iommu_group_remove_device+0x26b/0x300
> [ 14.532262] iommu_group_remove_device+0x8a/0xb0
> [ 14.532447] iommufd_test+0x1f1a/0x2d10
> [ 14.532593] ? __pfx_lock_release+0x10/0x10
> [ 14.532752] ? __pfx_iommufd_test+0x10/0x10
> [ 14.532908] ? __sanitizer_cov_trace_const_cmp8+0x1c/0x30
> [ 14.533108] iommufd_fops_ioctl+0x377/0x510
> [ 14.533265] ? __pfx_iommufd_fops_ioctl+0x10/0x10
> [ 14.533438] ? trace_hardirqs_on+0x26/0x120
> [ 14.533602] ? seqcount_lockdep_reader_access.constprop.0+0xc0/0xd0
> [ 14.533832] ? __sanitizer_cov_trace_cmp4+0x1a/0x20
> [ 14.534013] ? __sanitizer_cov_trace_const_cmp4+0x1a/0x20
> [ 14.534208] ? security_file_ioctl+0x9d/0xd0
> [ 14.534373] ? __pfx_iommufd_fops_ioctl+0x10/0x10
> [ 14.534546] __x64_sys_ioctl+0x1b9/0x230
> [ 14.534696] do_syscall_64+0x3c/0x90
> [ 14.534836] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> [ 14.535019] RIP: 0033:0x7f354bc3ee5d
> [ 14.535153] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 93 af 1b 00 f7 d8 64 89 01 48
> [ 14.535792] RSP: 002b:00007ffe42dd02d8 EFLAGS: 00000213 ORIG_RAX: 0000000000000010
> [ 14.536061] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f354bc3ee5d
> [ 14.536310] RDX: 0000000020000180 RSI: 0000000000003ba0 RDI: 0000000000000003
> [ 14.536559] RBP: 00007ffe42dd02f0 R08: 0000000000000800 R09: 0000000000000800
> [ 14.536809] R10: 0000000000000000 R11: 0000000000000213 R12: 00007ffe42dd0408
> [ 14.537065] R13: 0000000000401136 R14: 0000000000403e08 R15: 00007f354bfb6000
> [ 14.537320] </TASK>
> [ 14.537403] Modules linked in:
> [ 14.537544] ---[ end trace 0000000000000000 ]---
> "
This bug does not reproduce in the testing of hot adding and removing
PCI devices on bare metal. Is it possible that this is related to the
implementation of the mock device for iommufd selftest?
Best regards,
baolu
* Re: [Syzkaller & bisect] There is general protection fault in iommu_device_unlink in upstream patch
From: Jason Gunthorpe @ 2023-08-02 12:26 UTC (permalink / raw)
To: Baolu Lu
Cc: Pengfei Xu, iommu, kevin.tian, nicolinc, heng.su,
rafael.j.wysocki, lenb, lkp
On Wed, Aug 02, 2023 at 03:53:03PM +0800, Baolu Lu wrote:
> This bug does not reproduce in the testing of hot adding and removing
> PCI devices on bare metal. Is it possible that this is related to the
> implementation of the mock device for iommufd selftest?
The reproducer works on my test VM; I intend to look into it today.
I guess it is because the iommufd mock doesn't use the probe path. I might
have already fixed it when I redid this in the series to fix the
default domains.
Jason
* Re: [Syzkaller & bisect] There is general protection fault in iommu_device_unlink in upstream patch
From: Jason Gunthorpe @ 2023-08-03 0:21 UTC (permalink / raw)
To: Pengfei Xu
Cc: iommu, baolu.lu, kevin.tian, nicolinc, heng.su, rafael.j.wysocki,
lenb, lkp
On Tue, Aug 01, 2023 at 09:46:31PM +0800, Pengfei Xu wrote:
> Hi Jason,
>
> Greeting!
>
> We tested the intel internal kernel and found that:
> There was general protection fault in iommu_device_unlink issue in kernel
> and found the problem commit:
> 14891af3799e iommu: Move the iommu driver sysfs setup into iommu_init/deinit_device()
>
> Above commit is same as following patch:
> https://lore.kernel.org/linux-iommu/6-v3-328044aa278c+45e49-iommu_probe_jgg@nvidia.com/
>
> All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/230801_150257_iommu_device_unlink
> Reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/repro.c
> Kconfig: https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/kconfig_origin
> repro.prog: https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/repro.prog
> repro.report: https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/repro.report
> Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/bisect_info.log
> Issue dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/a3fe9a2e1692a413d887a6a0f1184c26481d6a2b_dmesg.log
This is only an issue for the iommufd selftest, and it looks very hard
to fix with a small, targeted patch.
Fortunately I already fixed it here:
https://lore.kernel.org/linux-iommu/15-v6-e8114faedade+425-iommu_all_defdom_jgg@nvidia.com
If Joerg doesn't take that series for this cycle I'll feed the above
patch through the iommufd tree to fix this.
Thanks,
Jason
* Re: [Syzkaller & bisect] There is general protection fault in iommu_device_unlink in upstream patch
From: Pengfei Xu @ 2023-08-03 1:04 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: iommu, baolu.lu, kevin.tian, nicolinc, heng.su, rafael.j.wysocki,
lenb, lkp
Hi Jason,
On 2023-08-02 at 21:21:58 -0300, Jason Gunthorpe wrote:
> On Tue, Aug 01, 2023 at 09:46:31PM +0800, Pengfei Xu wrote:
> > Hi Jason,
> >
> > Greeting!
> >
> > We tested the intel internal kernel and found that:
> > There was general protection fault in iommu_device_unlink issue in kernel
> > and found the problem commit:
> > 14891af3799e iommu: Move the iommu driver sysfs setup into iommu_init/deinit_device()
> >
> > Above commit is same as following patch:
> > https://lore.kernel.org/linux-iommu/6-v3-328044aa278c+45e49-iommu_probe_jgg@nvidia.com/
> >
> > All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/230801_150257_iommu_device_unlink
> > Reproduced code: https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/repro.c
> > Kconfig: https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/kconfig_origin
> > repro.prog: https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/repro.prog
> > repro.report: https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/repro.report
> > Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/bisect_info.log
> > Issue dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/230801_150257_iommu_device_unlink/a3fe9a2e1692a413d887a6a0f1184c26481d6a2b_dmesg.log
>
> This is only an issue for the iommufd selftest and it looks very hard
> to fix in a micro way.
>
> Fortunately I already fixed it here:
>
> https://lore.kernel.org/linux-iommu/15-v6-e8114faedade+425-iommu_all_defdom_jgg@nvidia.com
>
> If Joerg doesn't take that series for this cycle I'll feed the above
> patch through the iommufd tree to fix this.
Great! Thanks for the information.
Best Regards,
Thanks!
>
> Thanks,
> Jason