From: Alex Williamson <alex@shazbot.org>
To: Jacob Pan <jacob.pan@linux.microsoft.com>
Cc: linux-kernel@vger.kernel.org,
"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
Jason Gunthorpe <jgg@nvidia.com>, Joerg Roedel <joro@8bytes.org>,
Mostafa Saleh <smostafa@google.com>,
David Matlack <dmatlack@google.com>,
Robin Murphy <robin.murphy@arm.com>,
Nicolin Chen <nicolinc@nvidia.com>,
"Tian, Kevin" <kevin.tian@intel.com>, Yi Liu <yi.l.liu@intel.com>,
skhawaja@google.com, pasha.tatashin@soleen.com,
Will Deacon <will@kernel.org>,
Baolu Lu <baolu.lu@linux.intel.com>,
alex@shazbot.org
Subject: Re: [PATCH V4 04/10] iommufd: Add an ioctl IOMMU_IOAS_GET_PA to query PA from IOVA
Date: Thu, 16 Apr 2026 13:32:57 -0600 [thread overview]
Message-ID: <20260416133257.4d2e8818@shazbot.org> (raw)
In-Reply-To: <20260414211412.2729-5-jacob.pan@linux.microsoft.com>
On Tue, 14 Apr 2026 14:14:06 -0700
Jacob Pan <jacob.pan@linux.microsoft.com> wrote:
> To support no-IOMMU mode where userspace drivers perform unsafe DMA
> using physical addresses, introduce a new API to retrieve the
> physical address of a user-allocated DMA buffer that has been mapped to
> an IOVA via IOAS. The mapping is backed by mock I/O page tables maintained
> by generic IOMMUPT framework.
>
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
> v4:
> - Fix unaligned IOVA length (Sashiko)
> v2:
> - Scan the contiguous physical-address span beyond the first page and return its length.
> ---
> drivers/iommu/iommufd/io_pagetable.c | 60 +++++++++++++++++++++++++
> drivers/iommu/iommufd/ioas.c | 25 +++++++++++
> drivers/iommu/iommufd/iommufd_private.h | 3 ++
> drivers/iommu/iommufd/main.c | 3 ++
> include/uapi/linux/iommufd.h | 25 +++++++++++
> 5 files changed, 116 insertions(+)
>
> diff --git a/drivers/iommu/iommufd/io_pagetable.c b/drivers/iommu/iommufd/io_pagetable.c
> index ee003bb2f647..04336a8e12f5 100644
> --- a/drivers/iommu/iommufd/io_pagetable.c
> +++ b/drivers/iommu/iommufd/io_pagetable.c
> @@ -849,6 +849,66 @@ int iopt_unmap_iova(struct io_pagetable *iopt, unsigned long iova,
> return iopt_unmap_iova_range(iopt, iova, iova_last, unmapped);
> }
>
> +int iopt_get_phys(struct io_pagetable *iopt, unsigned long iova, u64 *paddr,
> + u64 *length)
> +{
> + struct iopt_area *area;
> + u64 tmp_length = 0;
> + u64 tmp_paddr = 0;
> + int rc = 0;
> +
> + if (!IS_ENABLED(CONFIG_VFIO_NOIOMMU))
> + return -EOPNOTSUPP;
> +
> + down_read(&iopt->iova_rwsem);
> + area = iopt_area_iter_first(iopt, iova, iova);
> + if (!area || !area->pages) {
> + rc = -ENOENT;
> + goto unlock_exit;
> + }
> +
> + if (!area->storage_domain ||
> + area->storage_domain->owner != &iommufd_noiommu_ops) {
> + rc = -EOPNOTSUPP;
> + goto unlock_exit;
> + }
> +
> + *paddr = iommu_iova_to_phys(area->storage_domain, iova);
> + if (!*paddr) {
> + rc = -EINVAL;
> + goto unlock_exit;
> + }
> +
> + tmp_length = PAGE_SIZE - offset_in_page(iova);
> + tmp_paddr = *paddr;
> + /*
> + * Scan the domain for the contiguous physical address length so that
> + * userspace search can be optimized for fewer ioctls.
> + */
> + while (iova < iopt_area_last_iova(area)) {
> + unsigned long next_iova;
> + u64 next_paddr;
> +
> + if (check_add_overflow(iova, PAGE_SIZE, &next_iova))
> + break;
> +
> + next_paddr = iommu_iova_to_phys(area->storage_domain, next_iova);
> +
> + if (!next_paddr || next_paddr != tmp_paddr + PAGE_SIZE)
> + break;
> +
> + iova = next_iova;
> + tmp_paddr += PAGE_SIZE;
> + tmp_length += PAGE_SIZE;
> + }
> + *length = tmp_length;
If next_iova exceeds iopt_area_last_iova(area) AND exists in
storage_domain AND happens to be physically contiguous, tmp_length is
advanced and the return length exceeds the mapping. Otherwise this
also always does an iommu_iova_to_phys() one iteration beyond the area
last iova. Thanks,
Alex
> +
> +unlock_exit:
> + up_read(&iopt->iova_rwsem);
> +
> + return rc;
> +}
> +
> int iopt_unmap_all(struct io_pagetable *iopt, unsigned long *unmapped)
> {
> /* If the IOVAs are empty then unmap all succeeds */
> diff --git a/drivers/iommu/iommufd/ioas.c b/drivers/iommu/iommufd/ioas.c
> index fed06c2b728e..93cebb4c23bd 100644
> --- a/drivers/iommu/iommufd/ioas.c
> +++ b/drivers/iommu/iommufd/ioas.c
> @@ -375,6 +375,31 @@ int iommufd_ioas_unmap(struct iommufd_ucmd *ucmd)
> return rc;
> }
>
> +int iommufd_ioas_get_pa(struct iommufd_ucmd *ucmd)
> +{
> + struct iommu_ioas_get_pa *cmd = ucmd->cmd;
> + struct iommufd_ioas *ioas;
> + int rc;
> +
> + if (cmd->flags || cmd->__reserved)
> + return -EOPNOTSUPP;
> +
> + ioas = iommufd_get_ioas(ucmd->ictx, cmd->ioas_id);
> + if (IS_ERR(ioas))
> + return PTR_ERR(ioas);
> +
> + rc = iopt_get_phys(&ioas->iopt, cmd->iova, &cmd->out_phys,
> + &cmd->out_length);
> + if (rc)
> + goto out_put;
> +
> + rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
> +out_put:
> + iommufd_put_object(ucmd->ictx, &ioas->obj);
> +
> + return rc;
> +}
> +
> static void iommufd_release_all_iova_rwsem(struct iommufd_ctx *ictx,
> struct xarray *ioas_list)
> {
> diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
> index 2682b5baa6e9..0e772882aee9 100644
> --- a/drivers/iommu/iommufd/iommufd_private.h
> +++ b/drivers/iommu/iommufd/iommufd_private.h
> @@ -118,6 +118,8 @@ int iopt_map_pages(struct io_pagetable *iopt, struct list_head *pages_list,
> int iopt_unmap_iova(struct io_pagetable *iopt, unsigned long iova,
> unsigned long length, unsigned long *unmapped);
> int iopt_unmap_all(struct io_pagetable *iopt, unsigned long *unmapped);
> +int iopt_get_phys(struct io_pagetable *iopt, unsigned long iova, u64 *paddr,
> + u64 *length);
>
> int iopt_read_and_clear_dirty_data(struct io_pagetable *iopt,
> struct iommu_domain *domain,
> @@ -346,6 +348,7 @@ int iommufd_ioas_map_file(struct iommufd_ucmd *ucmd);
> int iommufd_ioas_change_process(struct iommufd_ucmd *ucmd);
> int iommufd_ioas_copy(struct iommufd_ucmd *ucmd);
> int iommufd_ioas_unmap(struct iommufd_ucmd *ucmd);
> +int iommufd_ioas_get_pa(struct iommufd_ucmd *ucmd);
> int iommufd_ioas_option(struct iommufd_ucmd *ucmd);
> int iommufd_option_rlimit_mode(struct iommu_option *cmd,
> struct iommufd_ctx *ictx);
> diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
> index 8c6d43601afb..ebae01ed947d 100644
> --- a/drivers/iommu/iommufd/main.c
> +++ b/drivers/iommu/iommufd/main.c
> @@ -432,6 +432,7 @@ union ucmd_buffer {
> struct iommu_veventq_alloc veventq;
> struct iommu_vfio_ioas vfio_ioas;
> struct iommu_viommu_alloc viommu;
> + struct iommu_ioas_get_pa get_pa;
> #ifdef CONFIG_IOMMUFD_TEST
> struct iommu_test_cmd test;
> #endif
> @@ -484,6 +485,8 @@ static const struct iommufd_ioctl_op iommufd_ioctl_ops[] = {
> struct iommu_ioas_map_file, iova),
> IOCTL_OP(IOMMU_IOAS_UNMAP, iommufd_ioas_unmap, struct iommu_ioas_unmap,
> length),
> + IOCTL_OP(IOMMU_IOAS_GET_PA, iommufd_ioas_get_pa, struct iommu_ioas_get_pa,
> + out_phys),
> IOCTL_OP(IOMMU_OPTION, iommufd_option, struct iommu_option, val64),
> IOCTL_OP(IOMMU_VDEVICE_ALLOC, iommufd_vdevice_alloc_ioctl,
> struct iommu_vdevice_alloc, virt_id),
> diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
> index 1dafbc552d37..9afe0a1b11a0 100644
> --- a/include/uapi/linux/iommufd.h
> +++ b/include/uapi/linux/iommufd.h
> @@ -57,6 +57,7 @@ enum {
> IOMMUFD_CMD_IOAS_CHANGE_PROCESS = 0x92,
> IOMMUFD_CMD_VEVENTQ_ALLOC = 0x93,
> IOMMUFD_CMD_HW_QUEUE_ALLOC = 0x94,
> + IOMMUFD_CMD_IOAS_GET_PA = 0x95,
> };
>
> /**
> @@ -219,6 +220,30 @@ struct iommu_ioas_map {
> };
> #define IOMMU_IOAS_MAP _IO(IOMMUFD_TYPE, IOMMUFD_CMD_IOAS_MAP)
>
> +/**
> + * struct iommu_ioas_get_pa - ioctl(IOMMU_IOAS_GET_PA)
> + * @size: sizeof(struct iommu_ioas_get_pa)
> + * @flags: Reserved, must be 0 for now
> + * @ioas_id: IOAS ID to query IOVA to PA mapping from
> + * @__reserved: Must be 0
> + * @iova: IOVA to query
> + * @out_length: Number of bytes contiguous physical address starting from phys
> + * @out_phys: Output physical address the IOVA maps to
> + *
> + * Query the physical address backing an IOVA range. The entire range must be
> + * mapped already. For noiommu devices doing unsafe DMA only.
> + */
> +struct iommu_ioas_get_pa {
> + __u32 size;
> + __u32 flags;
> + __u32 ioas_id;
> + __u32 __reserved;
> + __aligned_u64 iova;
> + __aligned_u64 out_length;
> + __aligned_u64 out_phys;
> +};
> +#define IOMMU_IOAS_GET_PA _IO(IOMMUFD_TYPE, IOMMUFD_CMD_IOAS_GET_PA)
> +
> /**
> * struct iommu_ioas_map_file - ioctl(IOMMU_IOAS_MAP_FILE)
> * @size: sizeof(struct iommu_ioas_map_file)
next prev parent reply other threads:[~2026-04-16 19:33 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-14 21:14 [PATCH V4 00/10] iommufd: Enable noiommu mode for cdev Jacob Pan
2026-04-14 21:14 ` [PATCH V4 01/10] iommufd: Support a HWPT without an iommu driver for noiommu Jacob Pan
2026-04-16 7:25 ` Tian, Kevin
2026-04-17 21:59 ` Jacob Pan
2026-04-14 21:14 ` [PATCH V4 02/10] iommufd: Move igroup allocation to a function Jacob Pan
2026-04-16 7:48 ` Tian, Kevin
2026-04-14 21:14 ` [PATCH V4 03/10] iommufd: Allow binding to a noiommu device Jacob Pan
2026-04-16 7:56 ` Tian, Kevin
2026-04-14 21:14 ` [PATCH V4 04/10] iommufd: Add an ioctl IOMMU_IOAS_GET_PA to query PA from IOVA Jacob Pan
2026-04-16 8:02 ` Tian, Kevin
2026-04-16 19:32 ` Alex Williamson [this message]
2026-04-14 21:14 ` [PATCH V4 05/10] vfio: Allow null group for noiommu without containers Jacob Pan
2026-04-16 8:13 ` Tian, Kevin
2026-04-16 21:33 ` Jacob Pan
2026-04-16 20:06 ` Alex Williamson
2026-04-17 17:06 ` Jacob Pan
2026-04-17 23:04 ` Alex Williamson
2026-04-14 21:14 ` [PATCH V4 06/10] vfio: Introduce and set noiommu flag on vfio_device Jacob Pan
2026-04-14 21:14 ` [PATCH V4 07/10] vfio: Enable cdev noiommu mode under iommufd Jacob Pan
2026-04-16 20:49 ` Alex Williamson
2026-04-14 21:14 ` [PATCH V4 08/10] vfio:selftest: Handle VFIO noiommu cdev Jacob Pan
2026-04-14 21:14 ` [PATCH V4 09/10] selftests/vfio: Add iommufd noiommu mode selftest for cdev Jacob Pan
2026-04-14 21:14 ` [PATCH V4 10/10] Documentation: Update VFIO NOIOMMU mode Jacob Pan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260416133257.4d2e8818@shazbot.org \
--to=alex@shazbot.org \
--cc=baolu.lu@linux.intel.com \
--cc=dmatlack@google.com \
--cc=iommu@lists.linux.dev \
--cc=jacob.pan@linux.microsoft.com \
--cc=jgg@nvidia.com \
--cc=joro@8bytes.org \
--cc=kevin.tian@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=nicolinc@nvidia.com \
--cc=pasha.tatashin@soleen.com \
--cc=robin.murphy@arm.com \
--cc=skhawaja@google.com \
--cc=smostafa@google.com \
--cc=will@kernel.org \
--cc=yi.l.liu@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox