From: Yi Liu <yi.l.liu@intel.com>
To: Jason Gunthorpe <jgg@nvidia.com>, <bpf@vger.kernel.org>,
Jonathan Corbet <corbet@lwn.net>,
David Woodhouse <dwmw2@infradead.org>, <iommu@lists.linux.dev>,
Joerg Roedel <joro@8bytes.org>, Kevin Tian <kevin.tian@intel.com>,
<linux-doc@vger.kernel.org>, <linux-kselftest@vger.kernel.org>,
<llvm@lists.linux.dev>, Nathan Chancellor <nathan@kernel.org>,
Nick Desaulniers <ndesaulniers@google.com>,
Miguel Ojeda <ojeda@kernel.org>,
Robin Murphy <robin.murphy@arm.com>,
Shuah Khan <shuah@kernel.org>,
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
Tom Rix <trix@redhat.com>, Will Deacon <will@kernel.org>
Cc: Alex Williamson <alex.williamson@redhat.com>,
Lu Baolu <baolu.lu@linux.intel.com>,
Chaitanya Kulkarni <chaitanyak@nvidia.com>,
Cornelia Huck <cohuck@redhat.com>,
Daniel Jordan <daniel.m.jordan@oracle.com>,
David Gibson <david@gibson.dropbear.id.au>,
Eric Auger <eric.auger@redhat.com>,
Eric Farman <farman@linux.ibm.com>,
Jason Wang <jasowang@redhat.com>,
Jean-Philippe Brucker <jean-philippe@linaro.org>,
Joao Martins <joao.m.martins@oracle.com>, <kvm@vger.kernel.org>,
Matthew Rosato <mjrosato@linux.ibm.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Nicolin Chen <nicolinc@nvidia.com>,
"Niklas Schnelle" <schnelle@linux.ibm.com>,
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>,
Keqian Zhu <zhukeqian1@huawei.com>
Subject: Re: [PATCH v4 12/17] iommufd: Add kAPI toward external drivers for physical devices
Date: Tue, 8 Nov 2022 22:34:05 +0800
Message-ID: <2cbd00ff-a51f-bd0f-1bd9-67db5f5d22f4@intel.com>
In-Reply-To: <12-v4-0de2f6c78ed0+9d1-iommufd_jgg@nvidia.com>
On 2022/11/8 08:49, Jason Gunthorpe wrote:
> Add the four functions external drivers need to connect physical DMA to
> the IOMMUFD:
>
> iommufd_device_bind() / iommufd_device_unbind()
> Register the device with iommufd and establish security isolation.
>
> iommufd_device_attach() / iommufd_device_detach()
> Connect a bound device to a page table
>
> Binding a device creates a device object ID in the uAPI, however the
> generic API provides no IOCTLs to manipulate them.
>
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
> drivers/iommu/iommufd/Makefile | 1 +
> drivers/iommu/iommufd/device.c | 402 ++++++++++++++++++++++++
> drivers/iommu/iommufd/iommufd_private.h | 5 +
> drivers/iommu/iommufd/main.c | 3 +
> include/linux/iommufd.h | 13 +
> 5 files changed, 424 insertions(+)
> create mode 100644 drivers/iommu/iommufd/device.c
>
> diff --git a/drivers/iommu/iommufd/Makefile b/drivers/iommu/iommufd/Makefile
> index e13e971aa28c60..ca28a135b9675f 100644
> --- a/drivers/iommu/iommufd/Makefile
> +++ b/drivers/iommu/iommufd/Makefile
> @@ -1,5 +1,6 @@
> # SPDX-License-Identifier: GPL-2.0-only
> iommufd-y := \
> + device.o \
> hw_pagetable.o \
> io_pagetable.o \
> ioas.o \
> diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
> new file mode 100644
> index 00000000000000..a3bf3c07d3f800
> --- /dev/null
> +++ b/drivers/iommu/iommufd/device.c
> @@ -0,0 +1,402 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright (c) 2021-2022, NVIDIA CORPORATION & AFFILIATES
> + */
> +#include <linux/iommufd.h>
> +#include <linux/slab.h>
> +#include <linux/iommu.h>
> +#include <linux/irqdomain.h>
> +
> +#include "iommufd_private.h"
> +
> +/*
> + * A iommufd_device object represents the binding relationship between a
> + * consuming driver and the iommufd. These objects are created/destroyed by
> + * external drivers, not by userspace.
> + */
> +struct iommufd_device {
> + struct iommufd_object obj;
> + struct iommufd_ctx *ictx;
> + struct iommufd_hw_pagetable *hwpt;
> + /* Head at iommufd_hw_pagetable::devices */
> + struct list_head devices_item;
> + /* always the physical device */
> + struct device *dev;
> + struct iommu_group *group;
> + bool enforce_cache_coherency;
> +};
> +
> +void iommufd_device_destroy(struct iommufd_object *obj)
> +{
> + struct iommufd_device *idev =
> + container_of(obj, struct iommufd_device, obj);
> +
> + iommu_device_release_dma_owner(idev->dev);
> + iommu_group_put(idev->group);
> + iommufd_ctx_put(idev->ictx);
> +}
> +
> +/**
> + * iommufd_device_bind - Bind a physical device to an iommu fd
> + * @ictx: iommufd file descriptor
> + * @dev: Pointer to a physical PCI device struct
> + * @id: Output ID number to return to userspace for this device
> + *
> + * A successful bind establishes an ownership over the device and returns
> + * struct iommufd_device pointer, otherwise returns error pointer.
> + *
> + * A driver using this API must set driver_managed_dma and must not touch
> + * the device until this routine succeeds and establishes ownership.
> + *
> + * Binding a PCI device places the entire RID under iommufd control.
> + *
> + * The caller must undo this with iommufd_unbind_device()

This should be iommufd_device_unbind() now.

> + */
> +struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
> + struct device *dev, u32 *id)
> +{
> + struct iommufd_device *idev;
> + struct iommu_group *group;
> + int rc;
> +
> + /*
> + * iommufd always sets IOMMU_CACHE because we offer no way for userspace
> + * to restore cache coherency.
> + */
> + if (!device_iommu_capable(dev, IOMMU_CAP_CACHE_COHERENCY))
> + return ERR_PTR(-EINVAL);
> +
> + group = iommu_group_get(dev);
> + if (!group)
> + return ERR_PTR(-ENODEV);
> +
> + rc = iommu_device_claim_dma_owner(dev, ictx);
> + if (rc)
> + goto out_group_put;
> +
> + idev = iommufd_object_alloc(ictx, idev, IOMMUFD_OBJ_DEVICE);
> + if (IS_ERR(idev)) {
> + rc = PTR_ERR(idev);
> + goto out_release_owner;
> + }
> + idev->ictx = ictx;
> + iommufd_ctx_get(ictx);
> + idev->dev = dev;
> + idev->enforce_cache_coherency =
> + device_iommu_capable(dev, IOMMU_CAP_ENFORCE_CACHE_COHERENCY);
> + /* The calling driver is a user until iommufd_device_unbind() */
> + refcount_inc(&idev->obj.users);
> + /* group refcount moves into iommufd_device */
> + idev->group = group;
> +
> + /*
> + * If the caller fails after this success it must call
> + * iommufd_unbind_device() which is safe since we hold this refcount.
> + * This also means the device is a leaf in the graph and no other object
> + * can take a reference on it.
> + */
> + iommufd_object_finalize(ictx, &idev->obj);
> + *id = idev->obj.id;
> + return idev;
> +
> +out_release_owner:
> + iommu_device_release_dma_owner(dev);
> +out_group_put:
> + iommu_group_put(group);
> + return ERR_PTR(rc);
> +}
> +EXPORT_SYMBOL_NS_GPL(iommufd_device_bind, IOMMUFD);
> +
> +void iommufd_device_unbind(struct iommufd_device *idev)
> +{
> + bool was_destroyed;
> +
> + was_destroyed = iommufd_object_destroy_user(idev->ictx, &idev->obj);
> + WARN_ON(!was_destroyed);
> +}
> +EXPORT_SYMBOL_NS_GPL(iommufd_device_unbind, IOMMUFD);
> +
> +static int iommufd_device_setup_msi(struct iommufd_device *idev,
> + struct iommufd_hw_pagetable *hwpt,
> + phys_addr_t sw_msi_start,
> + unsigned int flags)
> +{
> + int rc;
> +
> + /*
> + * IOMMU_CAP_INTR_REMAP means that the platform is isolating MSI, and it
> + * creates the MSI window by default in the iommu domain. Nothing
> + * further to do.
> + */
> + if (device_iommu_capable(idev->dev, IOMMU_CAP_INTR_REMAP))
> + return 0;
> +
> + /*
> + * On ARM systems that set the global IRQ_DOMAIN_FLAG_MSI_REMAP every
> + * allocated iommu_domain will block interrupts by default and this
> + * special flow is needed to turn them back on. iommu_dma_prepare_msi()
> + * will install pages into our domain after request_irq() to make this
> + * work.
> + *
> + * FIXME: This is conceptually broken for iommufd since we want to allow
> + * userspace to change the domains, eg switch from an identity IOAS to a
> + * DMA IOAS. There is currently no way to create a MSI window that
> + * matches what the IRQ layer actually expects in a newly created
> + * domain.
> + */
> + if (irq_domain_check_msi_remap()) {
> + if (WARN_ON(!sw_msi_start))
> + return -EPERM;
> + /*
> + * iommu_get_msi_cookie() can only be called once per domain,
> + * it returns -EBUSY on later calls.
> + */
> + if (hwpt->msi_cookie)
> + return 0;
> + rc = iommu_get_msi_cookie(hwpt->domain, sw_msi_start);
> + if (rc)
> + return rc;
> + hwpt->msi_cookie = true;
> + return 0;
> + }
> +
> + /*
> + * Otherwise the platform has a MSI window that is not isolated. For
> + * historical compat with VFIO allow a module parameter to ignore the
> + * insecurity.
> + */
> + if (!(flags & IOMMUFD_ATTACH_FLAGS_ALLOW_UNSAFE_INTERRUPT))
> + return -EPERM;
> + else
> + dev_warn(
> + idev->dev,
> + "Device interrupts cannot be isolated by the IOMMU, this platform in insecure. Use an \"allow_unsafe_interrupts\" module parameter to override\n");
> +
> + return 0;
> +}
> +
> +static bool iommufd_hw_pagetable_has_group(struct iommufd_hw_pagetable *hwpt,
> + struct iommu_group *group)
> +{
> + struct iommufd_device *cur_dev;
> +
> + list_for_each_entry(cur_dev, &hwpt->devices, devices_item)
> + if (cur_dev->group == group)
> + return true;
> + return false;
> +}
> +
> +static int iommufd_device_do_attach(struct iommufd_device *idev,
> + struct iommufd_hw_pagetable *hwpt,
> + unsigned int flags)
> +{
> + phys_addr_t sw_msi_start = 0;
> + int rc;
> +
> + mutex_lock(&hwpt->devices_lock);
> +
> + /*
> + * Try to upgrade the domain we have, it is an iommu driver bug to
> + * report IOMMU_CAP_ENFORCE_CACHE_COHERENCY but fail
> + * enforce_cache_coherency when there are no devices attached to the
> + * domain.
> + */
> + if (idev->enforce_cache_coherency && !hwpt->enforce_cache_coherency) {
> + if (hwpt->domain->ops->enforce_cache_coherency)
> + hwpt->enforce_cache_coherency =
> + hwpt->domain->ops->enforce_cache_coherency(
> + hwpt->domain);
> + if (!hwpt->enforce_cache_coherency) {
> + WARN_ON(list_empty(&hwpt->devices));
> + rc = -EINVAL;
> + goto out_unlock;
> + }
> + }
> +
> + rc = iopt_table_enforce_group_resv_regions(&hwpt->ioas->iopt, idev->dev,
> + idev->group, &sw_msi_start);
> + if (rc)
> + goto out_unlock;
> +
> + rc = iommufd_device_setup_msi(idev, hwpt, sw_msi_start, flags);
> + if (rc)
> + goto out_iova;

Aren't the above two operations needed only once per group? I remember you
did these two after iommu_attach_group().

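To make the question concrete, a rough userspace model (not kernel code) of the ordering being asked about could look like the sketch below. Everything named model_* is made up for illustration; the two counters merely stand in for iommu_attach_group() and for the iopt_table_enforce_group_resv_regions() + iommufd_device_setup_msi() pair. It only shows the "first device of the group" gating, and is not a claim that this ordering is the right one for the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_DEVICES 8

/* Hypothetical stand-in for struct iommufd_device: only the group matters. */
struct model_device {
	int group_id;
};

/* Hypothetical stand-in for struct iommufd_hw_pagetable. */
struct model_hwpt {
	const struct model_device *devices[MODEL_MAX_DEVICES];
	size_t ndevices;
	unsigned int group_attach_calls; /* stands in for iommu_attach_group() */
	unsigned int group_setup_calls;  /* stands in for resv regions + MSI setup */
};

/* Same idea as iommufd_hw_pagetable_has_group(): scan the attached devices. */
static bool model_hwpt_has_group(const struct model_hwpt *hwpt, int group_id)
{
	for (size_t i = 0; i < hwpt->ndevices; i++)
		if (hwpt->devices[i]->group_id == group_id)
			return true;
	return false;
}

/*
 * Attach a device to the model hwpt. The per-group steps run only when no
 * other device of the same group is attached yet. Returns 0 on success and
 * -1 when the (arbitrary) device table is full.
 */
static int model_attach(struct model_hwpt *hwpt, const struct model_device *dev)
{
	assert(hwpt && dev);

	if (hwpt->ndevices == MODEL_MAX_DEVICES)
		return -1;

	if (!model_hwpt_has_group(hwpt, dev->group_id)) {
		hwpt->group_attach_calls++;
		hwpt->group_setup_calls++;
	}

	hwpt->devices[hwpt->ndevices++] = dev;
	return 0;
}
```

Attaching two devices from one group and a third from another group bumps each counter twice in this model, once per group rather than once per device.
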
> + /*
> + * FIXME: Hack around missing a device-centric iommu api, only attach to
> + * the group once for the first device that is in the group.
> + */
> + if (!iommufd_hw_pagetable_has_group(hwpt, idev->group)) {
> + rc = iommu_attach_group(hwpt->domain, idev->group);
> + if (rc)
> + goto out_iova;
> +
> + if (list_empty(&hwpt->devices)) {
> + rc = iopt_table_add_domain(&hwpt->ioas->iopt,
> + hwpt->domain);
> + if (rc)
> + goto out_detach;
> + }
> + }
> +
> + idev->hwpt = hwpt;
> + refcount_inc(&hwpt->obj.users);
> + list_add(&idev->devices_item, &hwpt->devices);
> + mutex_unlock(&hwpt->devices_lock);
> + return 0;
> +
> +out_detach:
> + iommu_detach_group(hwpt->domain, idev->group);
> +out_iova:
> + iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->dev);
> +out_unlock:
> + mutex_unlock(&hwpt->devices_lock);
> + return rc;
> +}
> +
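
On the naming comment for the iommufd_device_bind() kernel-doc: the pairing a caller is expected to follow can be sketched with a userspace mock. This is not the kernel API. mock_device_bind()/mock_device_unbind() only imitate the shape of iommufd_device_bind()/iommufd_device_unbind() from the patch, uint32_t replaces u32, a NULL return replaces ERR_PTR(), and the model_driver_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Mock stand-ins for the kernel structures, for illustration only. */
struct mock_ictx {
	int bound; /* number of currently bound devices */
};

struct mock_dev {
	int id;
};

struct mock_idev {
	struct mock_ictx *ictx;
	struct mock_dev *dev;
	uint32_t id;
};

/* Imitates iommufd_device_bind(); returns NULL on failure (assumption). */
static struct mock_idev *mock_device_bind(struct mock_ictx *ictx,
					  struct mock_dev *dev, uint32_t *id)
{
	struct mock_idev *idev = malloc(sizeof(*idev));

	if (!idev)
		return NULL;
	idev->ictx = ictx;
	idev->dev = dev;
	idev->id = (uint32_t)dev->id;
	ictx->bound++;
	*id = idev->id;
	return idev;
}

/* Imitates iommufd_device_unbind(); undoes exactly one successful bind. */
static void mock_device_unbind(struct mock_idev *idev)
{
	assert(idev);
	idev->ictx->bound--;
	free(idev);
}

/* Hypothetical per-device state of a consuming driver. */
struct model_driver {
	struct mock_idev *idev;
	uint32_t dev_id;
};

/* Bind on open; the device must not be touched unless this returns 0. */
static int model_driver_open(struct model_driver *drv, struct mock_ictx *ictx,
			     struct mock_dev *dev)
{
	drv->idev = mock_device_bind(ictx, dev, &drv->dev_id);
	return drv->idev ? 0 : -1;
}

/* Unbind on close; safe to call again after the binding is gone. */
static void model_driver_close(struct model_driver *drv)
{
	if (drv->idev) {
		mock_device_unbind(drv->idev);
		drv->idev = NULL;
	}
}
```

The point of the sketch is only that every successful bind has exactly one matching unbind on the teardown path, which is what the kernel-doc sentence is meant to say once the function name is corrected.
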
--
Regards,
Yi Liu