* mmap_assert_write_locked warnings during for vhost_vdpa_fault
@ 2024-06-17 15:50 Dragos Tatulea
2024-06-18 1:17 ` Jason Wang
0 siblings, 1 reply; 13+ messages in thread
From: Dragos Tatulea @ 2024-06-17 15:50 UTC (permalink / raw)
To: jasowang@redhat.com, mst@redhat.com, eperezma@redhat.com
Cc: virtualization@lists.linux-foundation.org
Hi,
After commit ba168b52bf8e ("mm: use rwsem assertion macros for
mmap_lock") was submitted, we started getting a lot of the
following warnings about a missing mmap write lock during VM boot:
------------[ cut here ]------------
WARNING: CPU: 1 PID: 58633 at include/linux/rwsem.h:85
track_pfn_remap+0x12b/0x130
Modules linked in: act_mirred act_skbedit vhost_vdpa cls_matchall
nfnetlink_cttimeout act_gact cls_flower sch_ingress mlx5_vdpa vringh vdpa
openvswitch nsh vhost_net vhost vhost_iotlb tap ip6table_mangle ip6table_nat
iptable_mangle nf_tables ip6table_filter ip6_tables xt_conntrack xt_MASQUERADE
nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter
rpcsec_gss_krb5 auth_rpcgss oid_registry overlay rpcrdma rdma_ucm ib_iser
libiscsi ib_umad scsi_transport_iscsi ib_ipoib rdma_cm iw_cm ib_cm mlx5_ib
ib_uverbs ib_core fuse mlx5_core
CPU: 1 PID: 58633 Comm: CPU 0/KVM Tainted: G W
6.10.0-rc1_for_upstream_min_debug_2024_05_29_17_06 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:track_pfn_remap+0x12b/0x130
Code: 48 83 c4 08 b8 ea ff ff ff 5b 5d 41 5c 41 5d c3 48 83 c4 08 48 89 ef 48
89 f2 5b 31 c9 4c 89 c6 5d 41 5c 41 5d e9 f5 fb ff ff <0f> 0b eb 9b 90 0f 1f 44
00 00 80 3d ac 59 96 01 00 74 01 c3 48 89
RSP: 0018:ffff888350f8b8e0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000001000 RCX: 0000000000000000
RDX: ffff8881080ca300 RSI: 0000000000001000 RDI: 0000000544003000
RBP: 0000000544003000 R08: ffff888106730a60 R09: 0000000000000000
R10: ffff888116eeff60 R11: 0000000000000000 R12: ffff888350f8b918
R13: ffff888149f99da8 R14: 0000000000001000 R15: 0000000000001000
FS: 00007f678d800700(0000) GS:ffff88852c880000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004e54f8 CR3: 0000000112290004 CR4: 0000000000372eb0
Call Trace:
<TASK>
? __warn+0x78/0x110
? track_pfn_remap+0x12b/0x130
? report_bug+0x16d/0x180
? handle_bug+0x3c/0x60
? exc_invalid_op+0x14/0x70
? asm_exc_invalid_op+0x16/0x20
? track_pfn_remap+0x12b/0x130
remap_pfn_range+0x41/0xa0
vhost_vdpa_fault+0x6c/0xa0 [vhost_vdpa]
__do_fault+0x2f/0xb0
__handle_mm_fault+0x13d3/0x2210
handle_mm_fault+0xb0/0x260
fixup_user_fault+0x77/0x170
hva_to_pfn+0x2c5/0x4b0
kvm_faultin_pfn+0xd7/0x510
kvm_tdp_page_fault+0x111/0x190
kvm_mmu_do_page_fault+0x105/0x230
kvm_mmu_page_fault+0x7d/0x620
? vmx_deliver_interrupt+0x110/0x190
? __apic_accept_irq+0x16c/0x270
? vmx_vmexit+0x8d/0xc0
vmx_handle_exit+0x110/0x640
kvm_arch_vcpu_ioctl_run+0xdb0/0x1c20
kvm_vcpu_ioctl+0x263/0x6a0
? futex_wake+0x81/0x180
__x64_sys_ioctl+0x4a7/0x9d0
? __x64_sys_futex+0x73/0x1c0
? kvm_on_user_return+0x86/0x90
do_syscall_64+0x4c/0x100
entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7f679186a17b
Code: 0f 1e fa 48 8b 05 1d ad 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff
c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01
c3 48 8b 0d ed ac 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007f678d7ff788 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f679186a17b
RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000059
RBP: 000055da5ee22050 R08: 000055da44b28160 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
R13: 000055da452b05e0 R14: 0000000000000001 R15: 0000000000000000
</TASK>
---[ end trace 0000000000000000 ]---
The warnings show up only when the vdpa page-per-vq option is used (doorbell
mapping to guest).
The issue seems to have existed before, but was visible only with CONFIG_LOCKDEP
enabled. I tried finding if this was introduced in more recent kernels, but
stopped after going as far back as 6.5: the issue was still visible there.
The warning is triggered for the following call chain:
vhost_vdpa_fault()
-> remap_pfn_range()
-> remap_pfn_range_notrack()
-> vm_flags_set()
-> vma_start_write()
-> __is_vma_write_locked()
-> mmap_assert_write_locked()
I've been trying to follow how the mm write lock is dropped in the above call
chain or not taken at all. But I couldn't make much sense of it...
Any ideas of what could have gone wrong here?
Thanks,
Dragos
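To make the locking mismatch in the call chain above concrete, here is a toy userspace model. It is not kernel code, and every `toy_*` name and the flag value are made up for illustration. It assumes the fault path runs with mmap_lock held for read only (consistent with the fixup_user_fault() frame in the trace), while the flag update reached through remap_pfn_range() expects the lock to be held for write.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for how mm->mmap_lock is currently held. */
enum toy_lock_mode { TOY_UNLOCKED, TOY_READ_LOCKED, TOY_WRITE_LOCKED };

struct toy_mm {
	enum toy_lock_mode mode;
	unsigned int warnings;	/* failed assertions, like WARN_ON() hits */
};

/* Models mmap_assert_write_locked(): warn unless held for write. */
static void toy_assert_write_locked(struct toy_mm *mm)
{
	assert(mm != NULL);
	if (mm->mode != TOY_WRITE_LOCKED)
		mm->warnings++;
}

/* Models vm_flags_set(): VMA flags may only change under the write lock. */
static void toy_vm_flags_set(struct toy_mm *mm, unsigned long *vm_flags,
			     unsigned long set)
{
	toy_assert_write_locked(mm);
	*vm_flags |= set;
}

/* Models remap_pfn_range(): it updates the VMA flags before mapping. */
static void toy_remap_pfn_range(struct toy_mm *mm, unsigned long *vm_flags)
{
	toy_vm_flags_set(mm, vm_flags, 0x1UL);	/* 0x1 is a made-up flag bit */
}

/* Returns the number of warnings raised when remapping under `mode`. */
unsigned int toy_remap_warnings(enum toy_lock_mode mode)
{
	struct toy_mm mm = { mode, 0 };
	unsigned long vm_flags = 0;

	toy_remap_pfn_range(&mm, &vm_flags);
	return mm.warnings;
}
```

In this model, calling toy_remap_warnings() with TOY_WRITE_LOCKED (the mmap() path) raises no warning, while TOY_READ_LOCKED (the fault path) raises one, which mirrors the report: the lock is not dropped anywhere in the chain, it was only ever taken for read.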
* Re: mmap_assert_write_locked warnings during for vhost_vdpa_fault
2024-06-17 15:50 mmap_assert_write_locked warnings during for vhost_vdpa_fault Dragos Tatulea
@ 2024-06-18 1:17 ` Jason Wang
2024-06-18 2:03 ` Tian, Kevin
0 siblings, 1 reply; 13+ messages in thread
From: Jason Wang @ 2024-06-18 1:17 UTC (permalink / raw)
To: Dragos Tatulea
Cc: mst@redhat.com, eperezma@redhat.com, virtualization@lists.linux-foundation.org, Peter Xu
On Mon, Jun 17, 2024 at 11:51 PM Dragos Tatulea <dtatulea@nvidia.com> wrote:
> [...]
> The warning is triggered for the following call chain:
> vhost_vdpa_fault()
> -> remap_pfn_range()
> -> remap_pfn_range_notrack()
> -> vm_flags_set()
> -> vma_start_write()
> -> __is_vma_write_locked()
> -> mmap_assert_write_locked()
>
> I've been trying to follow how the mm write lock is dropped in the above call
> chain or not taken at all. But I couldn't make much sense of it...
I've also had a glance at vfio_pci_mmap_fault, it seems to do something similar.
> Any ideas of what could have gone wrong here?
Adding Peter for more thought here.
Thanks
* RE: mmap_assert_write_locked warnings during for vhost_vdpa_fault
2024-06-18 1:17 ` Jason Wang
@ 2024-06-18 2:03 ` Tian, Kevin
2024-06-18 2:39 ` Jason Wang
0 siblings, 1 reply; 13+ messages in thread
From: Tian, Kevin @ 2024-06-18 2:03 UTC (permalink / raw)
To: Jason Wang, Dragos Tatulea
Cc: mst@redhat.com, eperezma@redhat.com, virtualization@lists.linux-foundation.org, Peter Xu
> From: Jason Wang <jasowang@redhat.com>
> Sent: Tuesday, June 18, 2024 9:18 AM
>
> On Mon, Jun 17, 2024 at 11:51 PM Dragos Tatulea <dtatulea@nvidia.com>
> wrote:
> > [...]
> > I've been trying to follow how the mm write lock is dropped in the above
> > call chain or not taken at all. But I couldn't make much sense of it...
>
> I've also had a glance at vfio_pci_mmap_fault, it seems to do something
> similar.
>
> > Any ideas of what could have gone wrong here?
>
> Adding Peter for more thought here.
>
vfio-side fix was just queued for rc4:
https://lore.kernel.org/all/20240614155603.34567eb7.alex.williamson@redhat.com/T/
* Re: mmap_assert_write_locked warnings during for vhost_vdpa_fault
2024-06-18 2:03 ` Tian, Kevin
@ 2024-06-18 2:39 ` Jason Wang
2024-06-19 9:14 ` Dragos Tatulea
0 siblings, 1 reply; 13+ messages in thread
From: Jason Wang @ 2024-06-18 2:39 UTC (permalink / raw)
To: Tian, Kevin, Dragos Tatulea
Cc: mst@redhat.com, eperezma@redhat.com, virtualization@lists.linux-foundation.org, Peter Xu
On Tue, Jun 18, 2024 at 10:03 AM Tian, Kevin <kevin.tian@intel.com> wrote:
> [...]
> vfio-side fix was just queued for rc4:
>
> https://lore.kernel.org/all/20240614155603.34567eb7.alex.williamson@redhat.com/T/
Great, thanks for the pointer.
Dragos, do you want to propose a similar fix for vDPA?
Thanks
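As a rough illustration of what a "similar fix" could look like in principle, here is a second toy userspace model. It rests on assumptions that are not confirmed in this thread: that the VMA flags get set once at mmap() time, under the write lock, and that the fault handler then only inserts a single PFN without touching the flags (in the kernel, vmf_insert_pfn() is an existing helper of that kind; whether it fits vhost_vdpa_fault() is a guess here). All `sketch_*` names and SKETCH_VM_PFNMAP are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_lock { SKETCH_UNLOCKED, SKETCH_READ_LOCKED, SKETCH_WRITE_LOCKED };

#define SKETCH_VM_PFNMAP 0x1UL	/* made-up stand-in for VM_PFNMAP */

struct sketch_vma {
	enum sketch_lock mmap_lock;	/* how the owning mm's lock is held */
	unsigned long vm_flags;
	unsigned int warnings;		/* failed checks, like WARN_ON() hits */
	unsigned int mapped_pages;
};

/* mmap()-time setup: runs under the write lock, so flag updates are legal. */
static void sketch_mmap(struct sketch_vma *vma)
{
	assert(vma != NULL);
	vma->mmap_lock = SKETCH_WRITE_LOCKED;
	vma->vm_flags |= SKETCH_VM_PFNMAP;
	vma->mmap_lock = SKETCH_UNLOCKED;
}

/*
 * Models a single-PFN insert in the fault path: it needs the flag to be set
 * already and never modifies vm_flags, so a read-held lock is sufficient.
 */
static void sketch_insert_pfn(struct sketch_vma *vma)
{
	if (vma->mmap_lock == SKETCH_UNLOCKED ||
	    !(vma->vm_flags & SKETCH_VM_PFNMAP)) {
		vma->warnings++;
		return;
	}
	vma->mapped_pages++;
}

/* Runs one fault under the read lock and returns the warning count. */
unsigned int sketch_fault_warnings(int did_mmap_setup)
{
	struct sketch_vma vma = { SKETCH_UNLOCKED, 0, 0, 0 };

	if (did_mmap_setup)
		sketch_mmap(&vma);

	vma.mmap_lock = SKETCH_READ_LOCKED;
	sketch_insert_pfn(&vma);
	vma.mmap_lock = SKETCH_UNLOCKED;
	return vma.warnings;
}
```

With the flags prepared at mmap() time, sketch_fault_warnings(1) reports no warning even though the fault only holds the read lock; skipping the setup, sketch_fault_warnings(0), reports one. The point of the split is that all flag changes move to the path that already holds the write lock.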
* Re: mmap_assert_write_locked warnings during for vhost_vdpa_fault
2024-06-18 2:39 ` Jason Wang
@ 2024-06-19 9:14 ` Dragos Tatulea
2024-06-19 9:51 ` Michael S. Tsirkin
0 siblings, 1 reply; 13+ messages in thread
From: Dragos Tatulea @ 2024-06-19 9:14 UTC (permalink / raw)
To: kevin.tian@intel.com, jasowang@redhat.com
Cc: virtualization@lists.linux-foundation.org, mst@redhat.com, eperezma@redhat.com, peterx@redhat.com
On Tue, 2024-06-18 at 10:39 +0800, Jason Wang wrote:
> On Tue, Jun 18, 2024 at 10:03 AM Tian, Kevin <kevin.tian@intel.com> wrote:
> > [...]
> > vfio-side fix was just queued for rc4:
> >
> > https://lore.kernel.org/all/20240614155603.34567eb7.alex.williamson@redhat.com/T/
>
> Great, thanks for the pointer.
>
Yes, thanks!
> Dragos, do you want to propose a similar fix for vDPA?
>
Had a first look: the fixes look a bit daunting. I will try to "port" them, not
promising anything though.
Thanks,
Dragos
* Re: mmap_assert_write_locked warnings during for vhost_vdpa_fault 2024-06-19 9:14 ` Dragos Tatulea @ 2024-06-19 9:51 ` Michael S. Tsirkin 2024-06-20 4:07 ` Jason Wang 0 siblings, 1 reply; 13+ messages in thread From: Michael S. Tsirkin @ 2024-06-19 9:51 UTC (permalink / raw) To: Dragos Tatulea Cc: kevin.tian@intel.com, jasowang@redhat.com, virtualization@lists.linux-foundation.org, eperezma@redhat.com, peterx@redhat.com On Wed, Jun 19, 2024 at 09:14:41AM +0000, Dragos Tatulea wrote: > On Tue, 2024-06-18 at 10:39 +0800, Jason Wang wrote: > > On Tue, Jun 18, 2024 at 10:03 AM Tian, Kevin <kevin.tian@intel.com> wrote: > > > > > > > From: Jason Wang <jasowang@redhat.com> > > > > Sent: Tuesday, June 18, 2024 9:18 AM > > > > > > > > On Mon, Jun 17, 2024 at 11:51 PM Dragos Tatulea <dtatulea@nvidia.com> > > > > wrote: > > > > > > > > > > Hi, > > > > > > > > > > After commit ba168b52bf8e "mm: use rwsem assertion macros for > > > > > mmap_lock") was submitted, we started getting a lot of the > > > > > following warnings about a missing mmap write lock during VM boot: > > > > > > > > > > ------------[ cut here ]------------ > > > > > WARNING: CPU: 1 PID: 58633 at include/linux/rwsem.h:85 > > > > > track_pfn_remap+0x12b/0x130 > > > > > Modules linked in: act_mirred act_skbedit vhost_vdpa cls_matchall > > > > > nfnetlink_cttimeout act_gact cls_flower sch_ingress mlx5_vdpa vringh vdpa > > > > > openvswitch nsh vhost_net vhost vhost_iotlb tap ip6table_mangle > > > > ip6table_nat > > > > > iptable_mangle nf_tables ip6table_filter ip6_tables xt_conntrack > > > > xt_MASQUERADE > > > > > nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter > > > > > rpcsec_gss_krb5 auth_rpcgss oid_registry overlay rpcrdma rdma_ucm > > > > ib_iser > > > > > libiscsi ib_umad scsi_transport_iscsi ib_ipoib rdma_cm iw_cm ib_cm > > > > mlx5_ib > > > > > ib_uverbs ib_core fuse mlx5_core > > > > > CPU: 1 PID: 58633 Comm: CPU 0/KVM Tainted: G W > > > > > 
6.10.0-rc1_for_upstream_min_debug_2024_05_29_17_06 #1 > > > > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS > > > > > rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 > > > > > RIP: 0010:track_pfn_remap+0x12b/0x130 > > > > > Code: 48 83 c4 08 b8 ea ff ff ff 5b 5d 41 5c 41 5d c3 48 83 c4 08 48 89 ef 48 > > > > > 89 f2 5b 31 c9 4c 89 c6 5d 41 5c 41 5d e9 f5 fb ff ff <0f> 0b eb 9b 90 0f 1f 44 > > > > > 00 00 80 3d ac 59 96 01 00 74 01 c3 48 89 > > > > > RSP: 0018:ffff888350f8b8e0 EFLAGS: 00010246 > > > > > RAX: 0000000000000000 RBX: 0000000000001000 RCX: 0000000000000000 > > > > > RDX: ffff8881080ca300 RSI: 0000000000001000 RDI: 0000000544003000 > > > > > RBP: 0000000544003000 R08: ffff888106730a60 R09: 0000000000000000 > > > > > R10: ffff888116eeff60 R11: 0000000000000000 R12: ffff888350f8b918 > > > > > R13: ffff888149f99da8 R14: 0000000000001000 R15: 0000000000001000 > > > > > FS: 00007f678d800700(0000) GS:ffff88852c880000(0000) > > > > knlGS:0000000000000000 > > > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > > > > CR2: 00000000004e54f8 CR3: 0000000112290004 CR4: 0000000000372eb0 > > > > > Call Trace: > > > > > <TASK> > > > > > ? __warn+0x78/0x110 > > > > > ? track_pfn_remap+0x12b/0x130 > > > > > ? report_bug+0x16d/0x180 > > > > > ? handle_bug+0x3c/0x60 > > > > > ? exc_invalid_op+0x14/0x70 > > > > > ? asm_exc_invalid_op+0x16/0x20 > > > > > ? track_pfn_remap+0x12b/0x130 > > > > > remap_pfn_range+0x41/0xa0 > > > > > vhost_vdpa_fault+0x6c/0xa0 [vhost_vdpa] > > > > > __do_fault+0x2f/0xb0 > > > > > __handle_mm_fault+0x13d3/0x2210 > > > > > handle_mm_fault+0xb0/0x260 > > > > > fixup_user_fault+0x77/0x170 > > > > > hva_to_pfn+0x2c5/0x4b0 > > > > > kvm_faultin_pfn+0xd7/0x510 > > > > > kvm_tdp_page_fault+0x111/0x190 > > > > > kvm_mmu_do_page_fault+0x105/0x230 > > > > > kvm_mmu_page_fault+0x7d/0x620 > > > > > ? vmx_deliver_interrupt+0x110/0x190 > > > > > ? __apic_accept_irq+0x16c/0x270 > > > > > ? 
vmx_vmexit+0x8d/0xc0 > > > > > vmx_handle_exit+0x110/0x640 > > > > > kvm_arch_vcpu_ioctl_run+0xdb0/0x1c20 > > > > > kvm_vcpu_ioctl+0x263/0x6a0 > > > > > ? futex_wake+0x81/0x180 > > > > > __x64_sys_ioctl+0x4a7/0x9d0 > > > > > ? __x64_sys_futex+0x73/0x1c0 > > > > > ? kvm_on_user_return+0x86/0x90 > > > > > do_syscall_64+0x4c/0x100 > > > > > entry_SYSCALL_64_after_hwframe+0x4b/0x53 > > > > > RIP: 0033:0x7f679186a17b > > > > > Code: 0f 1e fa 48 8b 05 1d ad 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff > > > > > c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 > > > > > c3 48 8b 0d ed ac 0c 00 f7 d8 64 89 01 48 > > > > > RSP: 002b:00007f678d7ff788 EFLAGS: 00000246 ORIG_RAX: > > > > 0000000000000010 > > > > > RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f679186a17b > > > > > RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000059 > > > > > RBP: 000055da5ee22050 R08: 000055da44b28160 R09: 0000000000000000 > > > > > R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000 > > > > > R13: 000055da452b05e0 R14: 0000000000000001 R15: 0000000000000000 > > > > > </TASK> > > > > > ---[ end trace 0000000000000000 ]--- > > > > > > > > > > The warnings show up only when the vdpa page-per-vq option is used > > > > (doorbell > > > > > mapping to guest). > > > > > > > > > > The issue seems to have existed before, but was visible only with > > > > CONFIG_LOCKDEP > > > > > enabled. I tried finding if this was introduced in more recent kernels, but > > > > > stopped after going as far back as 6.5: the issue was still visible there. 
> > > > >
> > > > > The warning is triggered for the following call chain:
> > > > > vhost_vdpa_fault()
> > > > > -> remap_pfn_range()
> > > > > -> remap_pfn_range_notrack()
> > > > > -> vm_flags_set()
> > > > > -> vma_start_write()
> > > > > -> __is_vma_write_locked()
> > > > > -> mmap_assert_write_locked()
> > > > >
> > > > >
> > > > > I've been trying to follow how the mm write lock is dropped in the above
> > > > call
> > > > > chain or not taken at all. But I couldn't make much sense of it...
> > > >
> > > > I've also had a glance at vfio_pci_mmap_fault, it seems to do something
> > > > similar.
> > > >
> > > > > Any ideas of what could have gone wrong here?
> > > >
> > > > Adding Peter for more thought here.
> > > >
> > >
> > > vfio-side fix was just queued for rc4:
> > >
> > > https://lore.kernel.org/all/20240614155603.34567eb7.alex.williamson@redhat.com/T/
> >
> > Great, thanks for the pointer.
> >
> Yes, thanks!
>
> > Dragos, do you want to propose a similar fix for vDPA?
> >
> Had a first look: the fixes look a bit daunting. I will to "port" them, not
> promising anything though.
>
> Thanks,
> Dragos

Yea Jason, you coded this in ddd89d0a059d8e9740c75a97e0efe9bf07ee51f9,
seems a bit much to ask from a random reporter, this race
likely can bite anyone.

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: mmap_assert_write_locked warnings during for vhost_vdpa_fault
2024-06-19 9:51 ` Michael S. Tsirkin
@ 2024-06-20 4:07 ` Jason Wang
2024-06-20 5:44 ` Michael S. Tsirkin
0 siblings, 1 reply; 13+ messages in thread
From: Jason Wang @ 2024-06-20 4:07 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Dragos Tatulea, kevin.tian@intel.com, virtualization@lists.linux-foundation.org, eperezma@redhat.com, peterx@redhat.com

[-- Attachment #1: Type: text/plain, Size: 7149 bytes --]

On Wed, Jun 19, 2024 at 5:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Jun 19, 2024 at 09:14:41AM +0000, Dragos Tatulea wrote:
> > On Tue, 2024-06-18 at 10:39 +0800, Jason Wang wrote:
[...]
> > > Dragos, do you want to propose a similar fix for vDPA?
> > >
> > Had a first look: the fixes look a bit daunting. I will to "port" them, not
> > promising anything though.
> >
> > Thanks,
> > Dragos
>
> Yea Jason, you coded this in ddd89d0a059d8e9740c75a97e0efe9bf07ee51f9,
> seems a bit much to ask from a random reporter,

Probably, just asking since Dragos has done some investigation.

> this race
> likely can bite anyone.
>

Dragos, I've drafted a patch, please try to see if it works (I had
tested it with LOCKDEP via vp_vdpa in L2).

Thanks

[-- Attachment #2: 0001-vhost-vdpa-switch-to-use-vmf_insert_pfn-in-the-fault.patch --]
[-- Type: application/octet-stream, Size: 1500 bytes --]

From a94a70372b702246436cb33ecbaa07d5c6127ce7 Mon Sep 17 00:00:00 2001
From: Jason Wang <jasowang@redhat.com>
Date: Wed, 19 Jun 2024 21:25:32 -0400
Subject: [PATCH] vhost-vdpa: switch to use vmf_insert_pfn() in the fault handler

remap_pfn_range() should not be called in the fault handler as it may
change the vma->flags which may trigger lockdep warning since the vma
write lock is not held. Actually there's no need to modify the
vma->flags as it has been set in the mmap(). So this patch switches to
use vmf_insert_pfn() instead.

Reported-by: Dragos Tatulea <dtatulea@nvidia.com>
Fixes: ddd89d0a059d ("vhost_vdpa: support doorbell mapping via mmap")
Cc: stable@vger.kernel.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/vhost/vdpa.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 63a53680a85c..6b9c12acf438 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -1483,13 +1483,7 @@ static vm_fault_t vhost_vdpa_fault(struct vm_fault *vmf)
 
 	notify = ops->get_vq_notification(vdpa, index);
 
-	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
-	if (remap_pfn_range(vma, vmf->address & PAGE_MASK,
-			    PFN_DOWN(notify.addr), PAGE_SIZE,
-			    vma->vm_page_prot))
-		return VM_FAULT_SIGBUS;
-
-	return VM_FAULT_NOPAGE;
+	return vmf_insert_pfn(vma, vmf->address & PAGE_MASK, PFN_DOWN(notify.addr));
 }
 
 static const struct vm_operations_struct vhost_vdpa_vm_ops = {
-- 
2.31.1

^ permalink raw reply related [flat|nested] 13+ messages in thread
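The reasoning in the patch description can be illustrated with a small
user-space model. This is only a sketch, assuming the fault handler runs
with the mmap lock held for reading while mmap() itself runs with it held
for writing. Every toy_* name and TOY_* constant below is hypothetical: they
model, and do not call, the kernel's remap_pfn_range(), vmf_insert_pfn() and
mmap_assert_write_locked().

```c
#include <stdbool.h>

/* Toy model only: none of these types or functions are kernel APIs. */
enum toy_lock { TOY_UNLOCKED, TOY_READ_LOCKED, TOY_WRITE_LOCKED };

#define TOY_VM_PFNMAP 0x1u

struct toy_vma {
	enum toy_lock mmap_lock; /* stands in for the mmap lock state */
	unsigned int flags;      /* stands in for the vma flags */
	unsigned long pte;       /* a single fake page-table entry */
	int warnings;            /* failed write-lock assertions so far */
};

/* Models mmap_assert_write_locked(): count a warning instead of WARN(). */
static void toy_assert_write_locked(struct toy_vma *vma)
{
	if (vma->mmap_lock != TOY_WRITE_LOCKED)
		vma->warnings++;
}

/* Models the mmap() handler: write lock held, so setting flags is fine. */
static void toy_mmap(struct toy_vma *vma)
{
	vma->mmap_lock = TOY_WRITE_LOCKED;
	toy_assert_write_locked(vma);
	vma->flags |= TOY_VM_PFNMAP;
	vma->mmap_lock = TOY_UNLOCKED;
}

/* Models remap_pfn_range(): updates flags (asserts write lock), then maps. */
static int toy_remap_pfn_range(struct toy_vma *vma, unsigned long pfn)
{
	toy_assert_write_locked(vma);
	vma->flags |= TOY_VM_PFNMAP;
	vma->pte = pfn;
	return 0;
}

/* Models vmf_insert_pfn(): relies on flags set at mmap time, only maps. */
static int toy_vmf_insert_pfn(struct toy_vma *vma, unsigned long pfn)
{
	if (!(vma->flags & TOY_VM_PFNMAP))
		return -1;
	vma->pte = pfn;
	return 0;
}

/*
 * Simulate mmap() followed by one page fault taken under the read lock.
 * Returns the number of write-lock warnings the fault path raised.
 */
int toy_fault(bool use_insert_pfn, unsigned long pfn)
{
	struct toy_vma vma = { TOY_UNLOCKED, 0, 0, 0 };

	toy_mmap(&vma);
	vma.mmap_lock = TOY_READ_LOCKED;
	if (use_insert_pfn)
		toy_vmf_insert_pfn(&vma, pfn);
	else
		toy_remap_pfn_range(&vma, pfn);
	vma.mmap_lock = TOY_UNLOCKED;
	return vma.warnings;
}
```

In this model toy_fault(false, pfn) returns 1, because the remap-style path
touches the flags without the write lock, while toy_fault(true, pfn) returns
0, because the insert-style path leaves the flags as mmap() set them.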
* Re: mmap_assert_write_locked warnings during for vhost_vdpa_fault
2024-06-20 4:07 ` Jason Wang
@ 2024-06-20 5:44 ` Michael S. Tsirkin
2024-06-20 8:23 ` Jason Wang
0 siblings, 1 reply; 13+ messages in thread
From: Michael S. Tsirkin @ 2024-06-20 5:44 UTC (permalink / raw)
To: Jason Wang
Cc: Dragos Tatulea, kevin.tian@intel.com, virtualization@lists.linux-foundation.org, eperezma@redhat.com, peterx@redhat.com

On Thu, Jun 20, 2024 at 12:07:14PM +0800, Jason Wang wrote:
> On Wed, Jun 19, 2024 at 5:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
[...]
> > Yea Jason, you coded this in ddd89d0a059d8e9740c75a97e0efe9bf07ee51f9,
> > seems a bit much to ask from a random reporter,
>
> Probably, just asking since Dragos has done some investigation.
>
> > this race
> > likely can bite anyone.
> >
>
> Dragos, I've drafted a patch, please try to see if it works (I had
> tested it with LOCKDEP via vp_vdpa in L2).
>
> Thanks

What is going on here that you decided to do an attachment as
opposed to inlining normally?

-- 
MST

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: mmap_assert_write_locked warnings during for vhost_vdpa_fault
2024-06-20 5:44 ` Michael S. Tsirkin
@ 2024-06-20 8:23 ` Jason Wang
2024-06-20 9:05 ` Michael S. Tsirkin
2024-07-03 16:23 ` Michael S. Tsirkin
0 siblings, 2 replies; 13+ messages in thread
From: Jason Wang @ 2024-06-20 8:23 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Dragos Tatulea, kevin.tian@intel.com, virtualization@lists.linux-foundation.org, eperezma@redhat.com, peterx@redhat.com

On Thu, Jun 20, 2024 at 1:44 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Thu, Jun 20, 2024 at 12:07:14PM +0800, Jason Wang wrote:
> > On Wed, Jun 19, 2024 at 5:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
[...]
> > > Yea Jason, you coded this in ddd89d0a059d8e9740c75a97e0efe9bf07ee51f9,
> > > seems a bit much to ask from a random reporter,
> >
> > Probably, just asking since Dragos has done some investigation.
> >
> > > this race
> > > likely can bite anyone.
> > >
> >
> > Dragos, I've drafted a patch, please try to see if it works (I had
> > tested it with LOCKDEP via vp_vdpa in L2).
> >
> > Thanks
>
> What is going on here that you decided to do an attachment as
> opposed to inlining normally?

Actually, I plan to send a formal patch separately but stop at the
last seconds since it is just tested by L2 + vp_vdpa in L1.
If inline really matters, I will do that next time.

Thanks

>
> --
> MST
>

^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: mmap_assert_write_locked warnings during for vhost_vdpa_fault 2024-06-20 8:23 ` Jason Wang @ 2024-06-20 9:05 ` Michael S. Tsirkin 2024-06-26 10:54 ` Dragos Tatulea 2024-07-03 16:23 ` Michael S. Tsirkin 1 sibling, 1 reply; 13+ messages in thread From: Michael S. Tsirkin @ 2024-06-20 9:05 UTC (permalink / raw) To: Jason Wang Cc: Dragos Tatulea, kevin.tian@intel.com, virtualization@lists.linux-foundation.org, eperezma@redhat.com, peterx@redhat.com On Thu, Jun 20, 2024 at 04:23:30PM +0800, Jason Wang wrote: > On Thu, Jun 20, 2024 at 1:44 PM Michael S. Tsirkin <mst@redhat.com> wrote: > > > > On Thu, Jun 20, 2024 at 12:07:14PM +0800, Jason Wang wrote: > > > On Wed, Jun 19, 2024 at 5:52 PM Michael S. Tsirkin <mst@redhat.com> wrote: > > > > > > > > On Wed, Jun 19, 2024 at 09:14:41AM +0000, Dragos Tatulea wrote: > > > > > On Tue, 2024-06-18 at 10:39 +0800, Jason Wang wrote: > > > > > > On Tue, Jun 18, 2024 at 10:03 AM Tian, Kevin <kevin.tian@intel.com> wrote: > > > > > > > > > > > > > > > From: Jason Wang <jasowang@redhat.com> > > > > > > > > Sent: Tuesday, June 18, 2024 9:18 AM > > > > > > > > > > > > > > > > On Mon, Jun 17, 2024 at 11:51 PM Dragos Tatulea <dtatulea@nvidia.com> > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > After commit ba168b52bf8e "mm: use rwsem assertion macros for > > > > > > > > > mmap_lock") was submitted, we started getting a lot of the > > > > > > > > > following warnings about a missing mmap write lock during VM boot: > > > > > > > > > > > > > > > > > > ------------[ cut here ]------------ > > > > > > > > > WARNING: CPU: 1 PID: 58633 at include/linux/rwsem.h:85 > > > > > > > > > track_pfn_remap+0x12b/0x130 > > > > > > > > > Modules linked in: act_mirred act_skbedit vhost_vdpa cls_matchall > > > > > > > > > nfnetlink_cttimeout act_gact cls_flower sch_ingress mlx5_vdpa vringh vdpa > > > > > > > > > openvswitch nsh vhost_net vhost vhost_iotlb tap ip6table_mangle > > > > > > > > ip6table_nat > 
> > > > > > > > iptable_mangle nf_tables ip6table_filter ip6_tables xt_conntrack > > > > > > > > xt_MASQUERADE > > > > > > > > > nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter > > > > > > > > > rpcsec_gss_krb5 auth_rpcgss oid_registry overlay rpcrdma rdma_ucm > > > > > > > > ib_iser > > > > > > > > > libiscsi ib_umad scsi_transport_iscsi ib_ipoib rdma_cm iw_cm ib_cm > > > > > > > > mlx5_ib > > > > > > > > > ib_uverbs ib_core fuse mlx5_core > > > > > > > > > CPU: 1 PID: 58633 Comm: CPU 0/KVM Tainted: G W > > > > > > > > > 6.10.0-rc1_for_upstream_min_debug_2024_05_29_17_06 #1 > > > > > > > > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS > > > > > > > > > rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 > > > > > > > > > RIP: 0010:track_pfn_remap+0x12b/0x130 > > > > > > > > > Code: 48 83 c4 08 b8 ea ff ff ff 5b 5d 41 5c 41 5d c3 48 83 c4 08 48 89 ef 48 > > > > > > > > > 89 f2 5b 31 c9 4c 89 c6 5d 41 5c 41 5d e9 f5 fb ff ff <0f> 0b eb 9b 90 0f 1f 44 > > > > > > > > > 00 00 80 3d ac 59 96 01 00 74 01 c3 48 89 > > > > > > > > > RSP: 0018:ffff888350f8b8e0 EFLAGS: 00010246 > > > > > > > > > RAX: 0000000000000000 RBX: 0000000000001000 RCX: 0000000000000000 > > > > > > > > > RDX: ffff8881080ca300 RSI: 0000000000001000 RDI: 0000000544003000 > > > > > > > > > RBP: 0000000544003000 R08: ffff888106730a60 R09: 0000000000000000 > > > > > > > > > R10: ffff888116eeff60 R11: 0000000000000000 R12: ffff888350f8b918 > > > > > > > > > R13: ffff888149f99da8 R14: 0000000000001000 R15: 0000000000001000 > > > > > > > > > FS: 00007f678d800700(0000) GS:ffff88852c880000(0000) > > > > > > > > knlGS:0000000000000000 > > > > > > > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > > > > > > > > CR2: 00000000004e54f8 CR3: 0000000112290004 CR4: 0000000000372eb0 > > > > > > > > > Call Trace: > > > > > > > > > <TASK> > > > > > > > > > ? __warn+0x78/0x110 > > > > > > > > > ? track_pfn_remap+0x12b/0x130 > > > > > > > > > ? 
report_bug+0x16d/0x180 > > > > > > > > > ? handle_bug+0x3c/0x60 > > > > > > > > > ? exc_invalid_op+0x14/0x70 > > > > > > > > > ? asm_exc_invalid_op+0x16/0x20 > > > > > > > > > ? track_pfn_remap+0x12b/0x130 > > > > > > > > > remap_pfn_range+0x41/0xa0 > > > > > > > > > vhost_vdpa_fault+0x6c/0xa0 [vhost_vdpa] > > > > > > > > > __do_fault+0x2f/0xb0 > > > > > > > > > __handle_mm_fault+0x13d3/0x2210 > > > > > > > > > handle_mm_fault+0xb0/0x260 > > > > > > > > > fixup_user_fault+0x77/0x170 > > > > > > > > > hva_to_pfn+0x2c5/0x4b0 > > > > > > > > > kvm_faultin_pfn+0xd7/0x510 > > > > > > > > > kvm_tdp_page_fault+0x111/0x190 > > > > > > > > > kvm_mmu_do_page_fault+0x105/0x230 > > > > > > > > > kvm_mmu_page_fault+0x7d/0x620 > > > > > > > > > ? vmx_deliver_interrupt+0x110/0x190 > > > > > > > > > ? __apic_accept_irq+0x16c/0x270 > > > > > > > > > ? vmx_vmexit+0x8d/0xc0 > > > > > > > > > vmx_handle_exit+0x110/0x640 > > > > > > > > > kvm_arch_vcpu_ioctl_run+0xdb0/0x1c20 > > > > > > > > > kvm_vcpu_ioctl+0x263/0x6a0 > > > > > > > > > ? futex_wake+0x81/0x180 > > > > > > > > > __x64_sys_ioctl+0x4a7/0x9d0 > > > > > > > > > ? __x64_sys_futex+0x73/0x1c0 > > > > > > > > > ? 
kvm_on_user_return+0x86/0x90 > > > > > > > > > do_syscall_64+0x4c/0x100 > > > > > > > > > entry_SYSCALL_64_after_hwframe+0x4b/0x53 > > > > > > > > > RIP: 0033:0x7f679186a17b > > > > > > > > > Code: 0f 1e fa 48 8b 05 1d ad 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff > > > > > > > > > c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 > > > > > > > > > c3 48 8b 0d ed ac 0c 00 f7 d8 64 89 01 48 > > > > > > > > > RSP: 002b:00007f678d7ff788 EFLAGS: 00000246 ORIG_RAX: > > > > > > > > 0000000000000010 > > > > > > > > > RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f679186a17b > > > > > > > > > RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000059 > > > > > > > > > RBP: 000055da5ee22050 R08: 000055da44b28160 R09: 0000000000000000 > > > > > > > > > R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000 > > > > > > > > > R13: 000055da452b05e0 R14: 0000000000000001 R15: 0000000000000000 > > > > > > > > > </TASK> > > > > > > > > > ---[ end trace 0000000000000000 ]--- > > > > > > > > > > > > > > > > > > The warnings show up only when the vdpa page-per-vq option is used > > > > > > > > (doorbell > > > > > > > > > mapping to guest). > > > > > > > > > > > > > > > > > > The issue seems to have existed before, but was visible only with > > > > > > > > CONFIG_LOCKDEP > > > > > > > > > enabled. I tried finding if this was introduced in more recent kernels, but > > > > > > > > > stopped after going as far back as 6.5: the issue was still visible there. 
> > > > > > > > > > > > > > > > > > The warning is triggered for the following call chain: > > > > > > > > > vhost_vdpa_fault() > > > > > > > > > -> remap_pfn_range() > > > > > > > > > -> remap_pfn_range_notrack() > > > > > > > > > -> vm_flags_set() > > > > > > > > > -> vma_start_write() > > > > > > > > > -> __is_vma_write_locked() > > > > > > > > > -> mmap_assert_write_locked() > > > > > > > > > > > > > > > > > > > > > > > > > > > I've been trying to follow how the mm write lock is dropped in the above > > > > > > > > call > > > > > > > > > chain or not taken at all. But I couldn't make much sense of it... > > > > > > > > > > > > > > > > I've also had a glance at vfio_pci_mmap_fault, it seems to do something > > > > > > > > similar. > > > > > > > > > > > > > > > > > Any ideas of what could have gone wrong here? > > > > > > > > > > > > > > > > Adding Peter for more thought here. > > > > > > > > > > > > > > > > > > > > > > vfio-side fix was just queued for rc4: > > > > > > > > > > > > > > https://lore.kernel.org/all/20240614155603.34567eb7.alex.williamson@redhat.com/T/ > > > > > > > > > > > > Great, thanks for the pointer. > > > > > > > > > > > Yes, thanks! > > > > > > > > > > > Dragos, do you want to propose a similar fix for vDPA? > > > > > > > > > > > Had a first look: the fixes look a bit daunting. I will to "port" them, not > > > > > promising anything though. > > > > > > > > > > Thanks, > > > > > Dragos > > > > > > > > Yea Jason, you coded this in ddd89d0a059d8e9740c75a97e0efe9bf07ee51f9, > > > > seems a bit much to ask from a random reporter, > > > > > > Probably, just asking since Dragos has done some investigation. > > > > > > > this race > > > > likely can bite anyone. > > > > > > > > > > Dragos, I've drafted a patch, please try to see if it works (I had > > > tested it with LOCKDEP via vp_vdpa in L2). > > > > > > Thanks > > > > What is going on here that you decided to do an attachment as > > opposed to inlining normally? 
>
> Actually, I plan to send a formal patch separately but stop at the
> last seconds since it is just tested by L2 + vp_vdpa in L1.

tag it as RFC, explain the testing status in the mail.

> If inline really matters, I will do that next time.

yes, this way people can comment.

> Thanks
>
> >
> > --
> > MST
> >
* Re: mmap_assert_write_locked warnings during for vhost_vdpa_fault
  2024-06-20  9:05 ` Michael S. Tsirkin
@ 2024-06-26 10:54 ` Dragos Tatulea
  0 siblings, 0 replies; 13+ messages in thread
From: Dragos Tatulea @ 2024-06-26 10:54 UTC
To: mst@redhat.com, jasowang@redhat.com
Cc: kevin.tian@intel.com, virtualization@lists.linux-foundation.org,
    eperezma@redhat.com, peterx@redhat.com

On Thu, 2024-06-20 at 05:05 -0400, Michael S. Tsirkin wrote:
> On Thu, Jun 20, 2024 at 04:23:30PM +0800, Jason Wang wrote:
> > On Thu, Jun 20, 2024 at 1:44 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Thu, Jun 20, 2024 at 12:07:14PM +0800, Jason Wang wrote:
[...]
> > > > Dragos, I've drafted a patch, please try to see if it works (I had
> > > > tested it with LOCKDEP via vp_vdpa in L2).
> > > >
> > > > Thanks
> > >
> > > What is going on here that you decided to do an attachment as
> > > opposed to inlining normally?
> >
> > Actually, I plan to send a formal patch separately but stop at the
> > last seconds since it is just tested by L2 + vp_vdpa in L1.
>
> tag it as RFC, explain the testing status in the mail.
>
> > If inline really matters, I will do that next time.
>
>
> yes, this way people can comment.
>
The fix works. Thanks Jason! FWIW:

Tested-by: Dragos Tatulea <dtatulea@nvidia.com>

Thanks,
Dragos
* Re: mmap_assert_write_locked warnings during for vhost_vdpa_fault
  2024-06-20  8:23 ` Jason Wang
  2024-06-20  9:05 ` Michael S. Tsirkin
@ 2024-07-03 16:23 ` Michael S. Tsirkin
  2024-07-04  0:10 ` Jason Wang
  1 sibling, 1 reply; 13+ messages in thread
From: Michael S. Tsirkin @ 2024-07-03 16:23 UTC
To: Jason Wang
Cc: Dragos Tatulea, kevin.tian@intel.com,
    virtualization@lists.linux-foundation.org, eperezma@redhat.com,
    peterx@redhat.com

On Thu, Jun 20, 2024 at 04:23:30PM +0800, Jason Wang wrote:
> On Thu, Jun 20, 2024 at 1:44 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Thu, Jun 20, 2024 at 12:07:14PM +0800, Jason Wang wrote:
[...]
> > > Dragos, I've drafted a patch, please try to see if it works (I had
> > > tested it with LOCKDEP via vp_vdpa in L2).
> > >
> > > Thanks
> >
> > What is going on here that you decided to do an attachment as
> > opposed to inlining normally?
>
> Actually, I plan to send a formal patch separately but stop at the
> last seconds since it is just tested by L2 + vp_vdpa in L1.
>
> If inline really matters, I will do that next time.
>
> Thanks

Jason are you going to submit a patch, now it's been tested?
* Re: mmap_assert_write_locked warnings during for vhost_vdpa_fault
  2024-07-03 16:23 ` Michael S. Tsirkin
@ 2024-07-04  0:10 ` Jason Wang
  0 siblings, 0 replies; 13+ messages in thread
From: Jason Wang @ 2024-07-04 0:10 UTC
To: Michael S. Tsirkin
Cc: Dragos Tatulea, kevin.tian@intel.com,
    virtualization@lists.linux-foundation.org, eperezma@redhat.com,
    peterx@redhat.com

On Thu, Jul 4, 2024 at 12:23 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Thu, Jun 20, 2024 at 04:23:30PM +0800, Jason Wang wrote:
> > On Thu, Jun 20, 2024 at 1:44 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Thu, Jun 20, 2024 at 12:07:14PM +0800, Jason Wang wrote:
[...]
> > > > Dragos, I've drafted a patch, please try to see if it works (I had
> > > > tested it with LOCKDEP via vp_vdpa in L2).
> > > >
> > > > Thanks
> > >
> > > What is going on here that you decided to do an attachment as
> > > opposed to inlining normally?
> >
> > Actually, I plan to send a formal patch separately but stop at the
> > last seconds since it is just tested by L2 + vp_vdpa in L1.
> >
> > If inline really matters, I will do that next time.
> >
> > Thanks
>
> Jason are you going to submit a patch, now it's been tested?

I've posted it yesterday:

https://patchwork.kernel.org/project/netdevbpf/patch/20240701033159.18133-1-jasowang@redhat.com/

Thanks
end of thread, other threads:[~2024-07-04  0:11 UTC | newest]

Thread overview: 13+ messages
2024-06-17 15:50 mmap_assert_write_locked warnings during for vhost_vdpa_fault Dragos Tatulea
2024-06-18  1:17 ` Jason Wang
2024-06-18  2:03   ` Tian, Kevin
2024-06-18  2:39     ` Jason Wang
2024-06-19  9:14       ` Dragos Tatulea
2024-06-19  9:51         ` Michael S. Tsirkin
2024-06-20  4:07           ` Jason Wang
2024-06-20  5:44             ` Michael S. Tsirkin
2024-06-20  8:23               ` Jason Wang
2024-06-20  9:05                 ` Michael S. Tsirkin
2024-06-26 10:54                   ` Dragos Tatulea
2024-07-03 16:23                 ` Michael S. Tsirkin
2024-07-04  0:10                   ` Jason Wang