From: Matthew Rosato <mjrosato@linux.ibm.com>
To: Jason Gunthorpe <jgg@nvidia.com>,
Alex Williamson <alex.williamson@redhat.com>,
Cornelia Huck <cohuck@redhat.com>,
iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
kvm@vger.kernel.org, Will Deacon <will@kernel.org>
Cc: Qian Cai <cai@lca.pw>, Joerg Roedel <jroedel@suse.de>,
Marek Szyprowski <m.szyprowski@samsung.com>,
Robin Murphy <robin.murphy@arm.com>
Subject: Re: [PATCH 3/4] vfio: Follow a strict lifetime for struct iommu_group *
Date: Tue, 20 Sep 2022 15:32:06 -0400 [thread overview]
Message-ID: <d15a08f6-fd0d-d28d-2b35-34c445d9b11e@linux.ibm.com> (raw)
In-Reply-To: <3-v1-ef00ffecea52+2cb-iommu_group_lifetime_jgg@nvidia.com>
On 9/8/22 2:45 PM, Jason Gunthorpe wrote:
> The iommu_group comes from the struct device that a driver has been bound
> to, and against which a struct vfio_device was then created. To keep the
> iommu layer sane we want a simple rule: only an attached driver should be
> using the iommu API, and in particular only an attached driver should hold
> ownership.
>
> In VFIO's case it is a bit more complicated, since VFIO uses the group
> APIs and shares the group between different drivers, but the principle
> still holds.
>
> Solve this by waiting for all users of the vfio_group to stop before
> allowing vfio_unregister_group_dev() to complete. This is done with a new
> completion to know when the users go away and an additional refcount to
> keep track of how many device drivers are sharing the vfio group. The last
> driver to be unregistered will clean up the group.
>
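The completion-plus-refcount scheme described above can be sketched as a rough userspace analogy. Names such as fake_group, group_get(), and group_put() are illustrative only, not the actual vfio symbols; in the kernel the 'completed' flag corresponds to signalling a struct completion that the unregister path waits on.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/*
 * Userspace analogy: every device driver that registers a vfio_device
 * takes a reference on the shared group, and the last driver to
 * unregister observes the count hitting zero and cleans the group up.
 */
struct fake_group {
	pthread_mutex_t lock;
	int drivers;		/* how many drivers share this group */
	bool completed;		/* all users gone, safe to free */
};

static void group_get(struct fake_group *g)
{
	pthread_mutex_lock(&g->lock);
	g->drivers++;
	pthread_mutex_unlock(&g->lock);
}

/* Returns true when the caller was the last user and must clean up. */
static bool group_put(struct fake_group *g)
{
	bool last;

	pthread_mutex_lock(&g->lock);
	last = (--g->drivers == 0);
	if (last)
		g->completed = true;	/* stands in for complete() */
	pthread_mutex_unlock(&g->lock);
	return last;
}
```

With this shape, the first unbinding driver's group_put() returns false and leaves the group alone; only the last one returns true and tears the group down, which is the ordering the patch enforces for vfio_unregister_group_dev().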
> This solves crashes in the S390 iommu driver that come because VFIO ends
> up racing releasing ownership (which attaches the default iommu_domain to
> the device) with the removal of that same device from the iommu
> driver. This is a side case that iommu drivers should not have to cope
> with.
>
> iommu driver failed to attach the default/blocking domain
> WARNING: CPU: 0 PID: 5082 at drivers/iommu/iommu.c:1961 iommu_detach_group+0x6c/0x80
> Modules linked in: macvtap macvlan tap vfio_pci vfio_pci_core irqbypass vfio_virqfd kvm nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink mlx5_ib sunrpc ib_uverbs ism smc uvdevice ib_core s390_trng eadm_sch tape_3590 tape tape_class vfio_ccw mdev vfio_iommu_type1 vfio zcrypt_cex4 sch_fq_codel configfs ghash_s390 prng chacha_s390 libchacha aes_s390 mlx5_core des_s390 libdes sha3_512_s390 nvme sha3_256_s390 sha512_s390 sha256_s390 sha1_s390 sha_common nvme_core zfcp scsi_transport_fc pkey zcrypt rng_core autofs4
> CPU: 0 PID: 5082 Comm: qemu-system-s39 Tainted: G W 6.0.0-rc3 #5
> Hardware name: IBM 3931 A01 782 (LPAR)
> Krnl PSW : 0704c00180000000 000000095bb10d28 (iommu_detach_group+0x70/0x80)
> R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
> Krnl GPRS: 0000000000000001 0000000900000027 0000000000000039 000000095c97ffe0
> 00000000fffeffff 00000009fc290000 00000000af1fda50 00000000af590b58
> 00000000af1fdaf0 0000000135c7a320 0000000135e52258 0000000135e52200
> 00000000a29e8000 00000000af590b40 000000095bb10d24 0000038004b13c98
> Krnl Code: 000000095bb10d18: c020003d56fc larl %r2,000000095c2bbb10
> 000000095bb10d1e: c0e50019d901 brasl %r14,000000095be4bf20
> #000000095bb10d24: af000000 mc 0,0
> >000000095bb10d28: b904002a lgr %r2,%r10
> 000000095bb10d2c: ebaff0a00004 lmg %r10,%r15,160(%r15)
> 000000095bb10d32: c0f4001aa867 brcl 15,000000095be65e00
> 000000095bb10d38: c004002168e0 brcl 0,000000095bf3def8
> 000000095bb10d3e: eb6ff0480024 stmg %r6,%r15,72(%r15)
> Call Trace:
> [<000000095bb10d28>] iommu_detach_group+0x70/0x80
> ([<000000095bb10d24>] iommu_detach_group+0x6c/0x80)
> [<000003ff80243b0e>] vfio_iommu_type1_detach_group+0x136/0x6c8 [vfio_iommu_type1]
> [<000003ff80137780>] __vfio_group_unset_container+0x58/0x158 [vfio]
> [<000003ff80138a16>] vfio_group_fops_unl_ioctl+0x1b6/0x210 [vfio]
> pci 0004:00:00.0: Removing from iommu group 4
> [<000000095b5b62e8>] __s390x_sys_ioctl+0xc0/0x100
> [<000000095be5d3b4>] __do_syscall+0x1d4/0x200
> [<000000095be6c072>] system_call+0x82/0xb0
> Last Breaking-Event-Address:
> [<000000095be4bf80>] __warn_printk+0x60/0x68
>
> It reflects that domain->ops->attach_dev() failed because the driver has
> already passed the point of destroying the device.
>
> Fixes: 9ac8545199a1 ("iommu: Fix use-after-free in iommu_release_device")
> Reported-by: Matthew Rosato <mjrosato@linux.ibm.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
I've been running with only the first 3 patches in this series (the vfio changes) and can confirm that they resolve the reported issue for me.
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com> # s390
...
> +static void vfio_group_remove(struct vfio_group *group)
> +{
> + /* Pairs with vfio_create_group() */
Nit: vfio_create_group() no longer exists as of patch 1
Thread overview: 24+ messages
2022-09-08 18:44 [PATCH 0/4] Fix splats releated to using the iommu_group after destroying devices Jason Gunthorpe
2022-09-08 18:44 ` [PATCH 1/4] vfio: Simplify vfio_create_group() Jason Gunthorpe
2022-09-20 19:45 ` Matthew Rosato
2022-09-08 18:44 ` [PATCH 2/4] vfio: Move the sanity check of the group to vfio_create_group() Jason Gunthorpe
2022-09-22 19:10 ` Alex Williamson
2022-09-22 19:36 ` Jason Gunthorpe
2022-09-22 21:23 ` Alex Williamson
2022-09-22 23:12 ` Jason Gunthorpe
2022-09-08 18:45 ` [PATCH 3/4] vfio: Follow a strict lifetime for struct iommu_group * Jason Gunthorpe
2022-09-20 19:32 ` Matthew Rosato [this message]
2022-09-08 18:45 ` [PATCH 4/4] iommu: Fix ordering of iommu_release_device() Jason Gunthorpe
2022-09-08 21:05 ` Robin Murphy
2022-09-08 21:27 ` Robin Murphy
2022-09-08 21:43 ` Jason Gunthorpe
2022-09-09 9:05 ` Robin Murphy
2022-09-09 13:25 ` Jason Gunthorpe
2022-09-09 17:57 ` Robin Murphy
2022-09-09 18:30 ` Jason Gunthorpe
2022-09-09 19:55 ` Robin Murphy
2022-09-09 23:45 ` Jason Gunthorpe
2022-09-12 11:13 ` Robin Murphy
2022-09-22 16:56 ` Jason Gunthorpe
2022-09-09 12:49 ` [PATCH 0/4] Fix splats releated to using the iommu_group after destroying devices Matthew Rosato
2022-09-09 16:24 ` Jason Gunthorpe