From: "Michael S. Tsirkin" <mst@redhat.com>
To: Dragos Tatulea <dtatulea@nvidia.com>
Cc: "kevin.tian@intel.com" <kevin.tian@intel.com>,
	"jasowang@redhat.com" <jasowang@redhat.com>,
	"virtualization@lists.linux-foundation.org"
	<virtualization@lists.linux-foundation.org>,
	"eperezma@redhat.com" <eperezma@redhat.com>,
	"peterx@redhat.com" <peterx@redhat.com>
Subject: Re: mmap_assert_write_locked warnings during for vhost_vdpa_fault
Date: Wed, 19 Jun 2024 05:51:52 -0400
Message-ID: <20240619055112-mutt-send-email-mst@kernel.org>
In-Reply-To: <8e540d6f7936852543957970797012ddb351d64d.camel@nvidia.com>

On Wed, Jun 19, 2024 at 09:14:41AM +0000, Dragos Tatulea wrote:
> On Tue, 2024-06-18 at 10:39 +0800, Jason Wang wrote:
> > On Tue, Jun 18, 2024 at 10:03 AM Tian, Kevin <kevin.tian@intel.com> wrote:
> > > 
> > > > From: Jason Wang <jasowang@redhat.com>
> > > > Sent: Tuesday, June 18, 2024 9:18 AM
> > > > 
> > > > On Mon, Jun 17, 2024 at 11:51 PM Dragos Tatulea <dtatulea@nvidia.com>
> > > > wrote:
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > > After commit ba168b52bf8e ("mm: use rwsem assertion macros for
> > > > > > mmap_lock") was submitted, we started getting a lot of the
> > > > > > following warnings about a missing mmap write lock during VM boot:
> > > > > 
> > > > > ------------[ cut here ]------------
> > > > > WARNING: CPU: 1 PID: 58633 at include/linux/rwsem.h:85
> > > > > track_pfn_remap+0x12b/0x130
> > > > > > Modules linked in: act_mirred act_skbedit vhost_vdpa cls_matchall
> > > > > > nfnetlink_cttimeout act_gact cls_flower sch_ingress mlx5_vdpa vringh vdpa
> > > > > > openvswitch nsh vhost_net vhost vhost_iotlb tap ip6table_mangle ip6table_nat
> > > > > > iptable_mangle nf_tables ip6table_filter ip6_tables xt_conntrack xt_MASQUERADE
> > > > > > nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter
> > > > > > rpcsec_gss_krb5 auth_rpcgss oid_registry overlay rpcrdma rdma_ucm ib_iser
> > > > > > libiscsi ib_umad scsi_transport_iscsi ib_ipoib rdma_cm iw_cm ib_cm mlx5_ib
> > > > > > ib_uverbs ib_core fuse mlx5_core
> > > > > CPU: 1 PID: 58633 Comm: CPU 0/KVM Tainted: G        W
> > > > > 6.10.0-rc1_for_upstream_min_debug_2024_05_29_17_06 #1
> > > > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> > > > > rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> > > > > RIP: 0010:track_pfn_remap+0x12b/0x130
> > > > > Code: 48 83 c4 08 b8 ea ff ff ff 5b 5d 41 5c 41 5d c3 48 83 c4 08 48 89 ef 48
> > > > > 89 f2 5b 31 c9 4c 89 c6 5d 41 5c 41 5d e9 f5 fb ff ff <0f> 0b eb 9b 90 0f 1f 44
> > > > > 00 00 80 3d ac 59 96 01 00 74 01 c3 48 89
> > > > > RSP: 0018:ffff888350f8b8e0 EFLAGS: 00010246
> > > > > RAX: 0000000000000000 RBX: 0000000000001000 RCX: 0000000000000000
> > > > > RDX: ffff8881080ca300 RSI: 0000000000001000 RDI: 0000000544003000
> > > > > RBP: 0000000544003000 R08: ffff888106730a60 R09: 0000000000000000
> > > > > R10: ffff888116eeff60 R11: 0000000000000000 R12: ffff888350f8b918
> > > > > R13: ffff888149f99da8 R14: 0000000000001000 R15: 0000000000001000
> > > > > > FS:  00007f678d800700(0000) GS:ffff88852c880000(0000) knlGS:0000000000000000
> > > > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > CR2: 00000000004e54f8 CR3: 0000000112290004 CR4: 0000000000372eb0
> > > > > Call Trace:
> > > > >  <TASK>
> > > > >  ? __warn+0x78/0x110
> > > > >  ? track_pfn_remap+0x12b/0x130
> > > > >  ? report_bug+0x16d/0x180
> > > > >  ? handle_bug+0x3c/0x60
> > > > >  ? exc_invalid_op+0x14/0x70
> > > > >  ? asm_exc_invalid_op+0x16/0x20
> > > > >  ? track_pfn_remap+0x12b/0x130
> > > > >  remap_pfn_range+0x41/0xa0
> > > > >  vhost_vdpa_fault+0x6c/0xa0 [vhost_vdpa]
> > > > >  __do_fault+0x2f/0xb0
> > > > >  __handle_mm_fault+0x13d3/0x2210
> > > > >  handle_mm_fault+0xb0/0x260
> > > > >  fixup_user_fault+0x77/0x170
> > > > >  hva_to_pfn+0x2c5/0x4b0
> > > > >  kvm_faultin_pfn+0xd7/0x510
> > > > >  kvm_tdp_page_fault+0x111/0x190
> > > > >  kvm_mmu_do_page_fault+0x105/0x230
> > > > >  kvm_mmu_page_fault+0x7d/0x620
> > > > >  ? vmx_deliver_interrupt+0x110/0x190
> > > > >  ? __apic_accept_irq+0x16c/0x270
> > > > >  ? vmx_vmexit+0x8d/0xc0
> > > > >  vmx_handle_exit+0x110/0x640
> > > > >  kvm_arch_vcpu_ioctl_run+0xdb0/0x1c20
> > > > >  kvm_vcpu_ioctl+0x263/0x6a0
> > > > >  ? futex_wake+0x81/0x180
> > > > >  __x64_sys_ioctl+0x4a7/0x9d0
> > > > >  ? __x64_sys_futex+0x73/0x1c0
> > > > >  ? kvm_on_user_return+0x86/0x90
> > > > >  do_syscall_64+0x4c/0x100
> > > > >  entry_SYSCALL_64_after_hwframe+0x4b/0x53
> > > > > RIP: 0033:0x7f679186a17b
> > > > > Code: 0f 1e fa 48 8b 05 1d ad 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff
> > > > > c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01
> > > > > c3 48 8b 0d ed ac 0c 00 f7 d8 64 89 01 48
> > > > > > RSP: 002b:00007f678d7ff788 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > > > > RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f679186a17b
> > > > > RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000059
> > > > > RBP: 000055da5ee22050 R08: 000055da44b28160 R09: 0000000000000000
> > > > > R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
> > > > > R13: 000055da452b05e0 R14: 0000000000000001 R15: 0000000000000000
> > > > >  </TASK>
> > > > > ---[ end trace 0000000000000000 ]---
> > > > > 
> > > > > > The warnings show up only when the vdpa page-per-vq option is used
> > > > > > (doorbell mapping to guest).
> > > > > 
> > > > > > The issue seems to have existed before, but was visible only with
> > > > > > CONFIG_LOCKDEP enabled. I tried finding if this was introduced in
> > > > > > more recent kernels, but stopped after going as far back as 6.5:
> > > > > > the issue was still visible there.
> > > > > 
> > > > > The warning is triggered for the following call chain:
> > > > > vhost_vdpa_fault()
> > > > >  -> remap_pfn_range()
> > > > >   -> remap_pfn_range_notrack()
> > > > >    -> vm_flags_set()
> > > > >     -> vma_start_write()
> > > > >      -> __is_vma_write_locked()
> > > > >       -> mmap_assert_write_locked()
> > > > > 
> > > > > 
> > > > > > I've been trying to follow how the mm write lock is dropped in the above
> > > > > > call chain or not taken at all. But I couldn't make much sense of it...
> > > > 
> > > > I've also had a glance at vfio_pci_mmap_fault, it seems to do something
> > > > similar.
> > > > 
> > > > > Any ideas of what could have gone wrong here?
> > > > 
> > > > Adding Peter for more thought here.
> > > > 
> > > 
> > > vfio-side fix was just queued for rc4:
> > > 
> > > https://lore.kernel.org/all/20240614155603.34567eb7.alex.williamson@redhat.com/T/
> > 
> > Great, thanks for the pointer.
> > 
> Yes, thanks!
> 
> > Dragos, do you want to propose a similar fix for vDPA?
> > 
> Had a first look: the fixes look a bit daunting. I will try to "port" them, not
> promising anything though.
> 
> Thanks,
> Dragos

Yeah. Jason, you wrote this code in commit
ddd89d0a059d8e9740c75a97e0efe9bf07ee51f9, so it seems a bit much to ask
of a random reporter. This race can likely bite anyone.

