From: Yi Liu <yi.l.liu@intel.com>
To: Jason Gunthorpe <jgg@nvidia.com>, <bpf@vger.kernel.org>,
	Jonathan Corbet <corbet@lwn.net>,
	David Woodhouse <dwmw2@infradead.org>, <iommu@lists.linux.dev>,
	Joerg Roedel <joro@8bytes.org>, Kevin Tian <kevin.tian@intel.com>,
	<linux-doc@vger.kernel.org>, <linux-kselftest@vger.kernel.org>,
	<llvm@lists.linux.dev>, Nathan Chancellor <nathan@kernel.org>,
	Nick Desaulniers <ndesaulniers@google.com>,
	Miguel Ojeda <ojeda@kernel.org>,
	Robin Murphy <robin.murphy@arm.com>,
	Shuah Khan <shuah@kernel.org>,
	Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
	Tom Rix <trix@redhat.com>, Will Deacon <will@kernel.org>
Cc: Alex Williamson <alex.williamson@redhat.com>,
	Lu Baolu <baolu.lu@linux.intel.com>,
	Chaitanya Kulkarni <chaitanyak@nvidia.com>,
	Cornelia Huck <cohuck@redhat.com>,
	Daniel Jordan <daniel.m.jordan@oracle.com>,
	David Gibson <david@gibson.dropbear.id.au>,
	Eric Auger <eric.auger@redhat.com>,
	Eric Farman <farman@linux.ibm.com>,
	Jason Wang <jasowang@redhat.com>,
	Jean-Philippe Brucker <jean-philippe@linaro.org>,
	Joao Martins <joao.m.martins@oracle.com>, <kvm@vger.kernel.org>,
	Matthew Rosato <mjrosato@linux.ibm.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Nicolin Chen <nicolinc@nvidia.com>,
	"Niklas Schnelle" <schnelle@linux.ibm.com>,
	Shameerali Kolothum Thodi  <shameerali.kolothum.thodi@huawei.com>,
	Keqian Zhu <zhukeqian1@huawei.com>
Subject: Re: [PATCH v4 12/17] iommufd: Add kAPI toward external drivers for physical devices
Date: Tue, 8 Nov 2022 22:34:05 +0800
Message-ID: <2cbd00ff-a51f-bd0f-1bd9-67db5f5d22f4@intel.com>
In-Reply-To: <12-v4-0de2f6c78ed0+9d1-iommufd_jgg@nvidia.com>

On 2022/11/8 08:49, Jason Gunthorpe wrote:
> Add the four functions external drivers need to connect physical DMA to
> the IOMMUFD:
> 
> iommufd_device_bind() / iommufd_device_unbind()
>    Register the device with iommufd and establish security isolation.
> 
> iommufd_device_attach() / iommufd_device_detach()
>    Connect a bound device to a page table
> 
> Binding a device creates a device object ID in the uAPI, however the
> generic API provides no IOCTLs to manipulate them.
> 
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   drivers/iommu/iommufd/Makefile          |   1 +
>   drivers/iommu/iommufd/device.c          | 402 ++++++++++++++++++++++++
>   drivers/iommu/iommufd/iommufd_private.h |   5 +
>   drivers/iommu/iommufd/main.c            |   3 +
>   include/linux/iommufd.h                 |  13 +
>   5 files changed, 424 insertions(+)
>   create mode 100644 drivers/iommu/iommufd/device.c
> 
> diff --git a/drivers/iommu/iommufd/Makefile b/drivers/iommu/iommufd/Makefile
> index e13e971aa28c60..ca28a135b9675f 100644
> --- a/drivers/iommu/iommufd/Makefile
> +++ b/drivers/iommu/iommufd/Makefile
> @@ -1,5 +1,6 @@
>   # SPDX-License-Identifier: GPL-2.0-only
>   iommufd-y := \
> +	device.o \
>   	hw_pagetable.o \
>   	io_pagetable.o \
>   	ioas.o \
> diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
> new file mode 100644
> index 00000000000000..a3bf3c07d3f800
> --- /dev/null
> +++ b/drivers/iommu/iommufd/device.c
> @@ -0,0 +1,402 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright (c) 2021-2022, NVIDIA CORPORATION & AFFILIATES
> + */
> +#include <linux/iommufd.h>
> +#include <linux/slab.h>
> +#include <linux/iommu.h>
> +#include <linux/irqdomain.h>
> +
> +#include "iommufd_private.h"
> +
> +/*
> + * A iommufd_device object represents the binding relationship between a
> + * consuming driver and the iommufd. These objects are created/destroyed by
> + * external drivers, not by userspace.
> + */
> +struct iommufd_device {
> +	struct iommufd_object obj;
> +	struct iommufd_ctx *ictx;
> +	struct iommufd_hw_pagetable *hwpt;
> +	/* Head at iommufd_hw_pagetable::devices */
> +	struct list_head devices_item;
> +	/* always the physical device */
> +	struct device *dev;
> +	struct iommu_group *group;
> +	bool enforce_cache_coherency;
> +};
> +
> +void iommufd_device_destroy(struct iommufd_object *obj)
> +{
> +	struct iommufd_device *idev =
> +		container_of(obj, struct iommufd_device, obj);
> +
> +	iommu_device_release_dma_owner(idev->dev);
> +	iommu_group_put(idev->group);
> +	iommufd_ctx_put(idev->ictx);
> +}
> +
> +/**
> + * iommufd_device_bind - Bind a physical device to an iommu fd
> + * @ictx: iommufd file descriptor
> + * @dev: Pointer to a physical PCI device struct
> + * @id: Output ID number to return to userspace for this device
> + *
> + * A successful bind establishes an ownership over the device and returns
> + * struct iommufd_device pointer, otherwise returns error pointer.
> + *
> + * A driver using this API must set driver_managed_dma and must not touch
> + * the device until this routine succeeds and establishes ownership.
> + *
> + * Binding a PCI device places the entire RID under iommufd control.
> + *
> + * The caller must undo this with iommufd_unbind_device()

This should be iommufd_device_unbind() now. The comment above
iommufd_object_finalize() further down uses the same stale name.

> + */
> +struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
> +					   struct device *dev, u32 *id)
> +{
> +	struct iommufd_device *idev;
> +	struct iommu_group *group;
> +	int rc;
> +
> +	/*
> +	 * iommufd always sets IOMMU_CACHE because we offer no way for userspace
> +	 * to restore cache coherency.
> +	 */
> +	if (!device_iommu_capable(dev, IOMMU_CAP_CACHE_COHERENCY))
> +		return ERR_PTR(-EINVAL);
> +
> +	group = iommu_group_get(dev);
> +	if (!group)
> +		return ERR_PTR(-ENODEV);
> +
> +	rc = iommu_device_claim_dma_owner(dev, ictx);
> +	if (rc)
> +		goto out_group_put;
> +
> +	idev = iommufd_object_alloc(ictx, idev, IOMMUFD_OBJ_DEVICE);
> +	if (IS_ERR(idev)) {
> +		rc = PTR_ERR(idev);
> +		goto out_release_owner;
> +	}
> +	idev->ictx = ictx;
> +	iommufd_ctx_get(ictx);
> +	idev->dev = dev;
> +	idev->enforce_cache_coherency =
> +		device_iommu_capable(dev, IOMMU_CAP_ENFORCE_CACHE_COHERENCY);
> +	/* The calling driver is a user until iommufd_device_unbind() */
> +	refcount_inc(&idev->obj.users);
> +	/* group refcount moves into iommufd_device */
> +	idev->group = group;
> +
> +	/*
> +	 * If the caller fails after this success it must call
> +	 * iommufd_unbind_device() which is safe since we hold this refcount.
> +	 * This also means the device is a leaf in the graph and no other object
> +	 * can take a reference on it.
> +	 */
> +	iommufd_object_finalize(ictx, &idev->obj);
> +	*id = idev->obj.id;
> +	return idev;
> +
> +out_release_owner:
> +	iommu_device_release_dma_owner(dev);
> +out_group_put:
> +	iommu_group_put(group);
> +	return ERR_PTR(rc);
> +}
> +EXPORT_SYMBOL_NS_GPL(iommufd_device_bind, IOMMUFD);
> +
> +void iommufd_device_unbind(struct iommufd_device *idev)
> +{
> +	bool was_destroyed;
> +
> +	was_destroyed = iommufd_object_destroy_user(idev->ictx, &idev->obj);
> +	WARN_ON(!was_destroyed);
> +}
> +EXPORT_SYMBOL_NS_GPL(iommufd_device_unbind, IOMMUFD);
> +
> +static int iommufd_device_setup_msi(struct iommufd_device *idev,
> +				    struct iommufd_hw_pagetable *hwpt,
> +				    phys_addr_t sw_msi_start,
> +				    unsigned int flags)
> +{
> +	int rc;
> +
> +	/*
> +	 * IOMMU_CAP_INTR_REMAP means that the platform is isolating MSI, and it
> +	 * creates the MSI window by default in the iommu domain. Nothing
> +	 * further to do.
> +	 */
> +	if (device_iommu_capable(idev->dev, IOMMU_CAP_INTR_REMAP))
> +		return 0;
> +
> +	/*
> +	 * On ARM systems that set the global IRQ_DOMAIN_FLAG_MSI_REMAP every
> +	 * allocated iommu_domain will block interrupts by default and this
> +	 * special flow is needed to turn them back on. iommu_dma_prepare_msi()
> +	 * will install pages into our domain after request_irq() to make this
> +	 * work.
> +	 *
> +	 * FIXME: This is conceptually broken for iommufd since we want to allow
> +	 * userspace to change the domains, eg switch from an identity IOAS to a
> +	 * DMA IOAS. There is currently no way to create a MSI window that
> +	 * matches what the IRQ layer actually expects in a newly created
> +	 * domain.
> +	 */
> +	if (irq_domain_check_msi_remap()) {
> +		if (WARN_ON(!sw_msi_start))
> +			return -EPERM;
> +		/*
> +		 * iommu_get_msi_cookie() can only be called once per domain,
> +		 * it returns -EBUSY on later calls.
> +		 */
> +		if (hwpt->msi_cookie)
> +			return 0;
> +		rc = iommu_get_msi_cookie(hwpt->domain, sw_msi_start);
> +		if (rc)
> +			return rc;
> +		hwpt->msi_cookie = true;
> +		return 0;
> +	}
> +
> +	/*
> +	 * Otherwise the platform has a MSI window that is not isolated. For
> +	 * historical compat with VFIO allow a module parameter to ignore the
> +	 * insecurity.
> +	 */
> +	if (!(flags & IOMMUFD_ATTACH_FLAGS_ALLOW_UNSAFE_INTERRUPT))
> +		return -EPERM;
> +	else
> +		dev_warn(
> +			idev->dev,
> +			"Device interrupts cannot be isolated by the IOMMU, this platform in insecure. Use an \"allow_unsafe_interrupts\" module parameter to override\n");
> +
> +	return 0;
> +}
> +
> +static bool iommufd_hw_pagetable_has_group(struct iommufd_hw_pagetable *hwpt,
> +					   struct iommu_group *group)
> +{
> +	struct iommufd_device *cur_dev;
> +
> +	list_for_each_entry(cur_dev, &hwpt->devices, devices_item)
> +		if (cur_dev->group == group)
> +			return true;
> +	return false;
> +}
> +
> +static int iommufd_device_do_attach(struct iommufd_device *idev,
> +				    struct iommufd_hw_pagetable *hwpt,
> +				    unsigned int flags)
> +{
> +	phys_addr_t sw_msi_start = 0;
> +	int rc;
> +
> +	mutex_lock(&hwpt->devices_lock);
> +
> +	/*
> +	 * Try to upgrade the domain we have, it is an iommu driver bug to
> +	 * report IOMMU_CAP_ENFORCE_CACHE_COHERENCY but fail
> +	 * enforce_cache_coherency when there are no devices attached to the
> +	 * domain.
> +	 */
> +	if (idev->enforce_cache_coherency && !hwpt->enforce_cache_coherency) {
> +		if (hwpt->domain->ops->enforce_cache_coherency)
> +			hwpt->enforce_cache_coherency =
> +				hwpt->domain->ops->enforce_cache_coherency(
> +					hwpt->domain);
> +		if (!hwpt->enforce_cache_coherency) {
> +			WARN_ON(list_empty(&hwpt->devices));
> +			rc = -EINVAL;
> +			goto out_unlock;
> +		}
> +	}
> +
> +	rc = iopt_table_enforce_group_resv_regions(&hwpt->ioas->iopt, idev->dev,
> +						   idev->group, &sw_msi_start);
> +	if (rc)
> +		goto out_unlock;
> +
> +	rc = iommufd_device_setup_msi(idev, hwpt, sw_msi_start, flags);
> +	if (rc)
> +		goto out_iova;

Aren't the above two operations needed only once per group? I remember you
previously did both after iommu_attach_group().
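
To make the question concrete, here is a rough userspace model of the flow
I have in mind. It is only a sketch and not kernel code: struct model_hwpt,
model_has_group() and model_attach() are hypothetical names, and the counter
stands in for the reserved-region setup, the MSI setup and
iommu_attach_group(). It assumes all three steps are really per-group, which
is exactly what I am asking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_DEVS 8

/* Hypothetical stand-in for a hw pagetable; not the kernel structure. */
struct model_hwpt {
	int groups[MODEL_MAX_DEVS];	/* group id of each attached device */
	size_t ndevs;			/* number of attached devices */
	int group_setup_calls;		/* times the per-group setup ran */
};

/* Mirrors the idea of iommufd_hw_pagetable_has_group(): scan attached devices. */
static bool model_has_group(const struct model_hwpt *hwpt, int group)
{
	for (size_t i = 0; i < hwpt->ndevs; i++)
		if (hwpt->groups[i] == group)
			return true;
	return false;
}

/* Returns 0 on success, -1 if the model's device table is full. */
static int model_attach(struct model_hwpt *hwpt, int group)
{
	assert(hwpt != NULL);

	if (hwpt->ndevs >= MODEL_MAX_DEVS)
		return -1;

	/*
	 * Assumption under discussion: resv regions, MSI setup and the
	 * group attach are all done only for the first device of a group.
	 */
	if (!model_has_group(hwpt, group))
		hwpt->group_setup_calls++;

	hwpt->groups[hwpt->ndevs++] = group;
	return 0;
}
```

With two devices from one group and a third from another group, the
per-group step in this model runs twice, not three times.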

> +	/*
> +	 * FIXME: Hack around missing a device-centric iommu api, only attach to
> +	 * the group once for the first device that is in the group.
> +	 */
> +	if (!iommufd_hw_pagetable_has_group(hwpt, idev->group)) {
> +		rc = iommu_attach_group(hwpt->domain, idev->group);
> +		if (rc)
> +			goto out_iova;
> +
> +		if (list_empty(&hwpt->devices)) {
> +			rc = iopt_table_add_domain(&hwpt->ioas->iopt,
> +						   hwpt->domain);
> +			if (rc)
> +				goto out_detach;
> +		}
> +	}
> +
> +	idev->hwpt = hwpt;
> +	refcount_inc(&hwpt->obj.users);
> +	list_add(&idev->devices_item, &hwpt->devices);
> +	mutex_unlock(&hwpt->devices_lock);
> +	return 0;
> +
> +out_detach:
> +	iommu_detach_group(hwpt->domain, idev->group);
> +out_iova:
> +	iopt_remove_reserved_iova(&hwpt->ioas->iopt, idev->dev);
> +out_unlock:
> +	mutex_unlock(&hwpt->devices_lock);
> +	return rc;
> +}
> +

-- 
Regards,
Yi Liu
