From: Jacob Pan <jacob.pan@linux.microsoft.com>
To: Yi Liu <yi.l.liu@intel.com>
Cc: <linux-kernel@vger.kernel.org>,
	"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
	Jason Gunthorpe <jgg@nvidia.com>,
	Alex Williamson <alex@shazbot.org>,
	Joerg Roedel <joro@8bytes.org>,
	Mostafa Saleh <smostafa@google.com>,
	David Matlack <dmatlack@google.com>,
	Robin Murphy <robin.murphy@arm.com>,
	Nicolin Chen <nicolinc@nvidia.com>,
	"Tian, Kevin" <kevin.tian@intel.com>,
	Baolu Lu <baolu.lu@linux.intel.com>,
	Saurabh Sengar <ssengar@linux.microsoft.com>,
	<skhawaja@google.com>, <pasha.tatashin@soleen.com>,
	Will Deacon <will@kernel.org>,
	jacob.pan@linux.microsoft.com
Subject: Re: [PATCH v6 4/7] iommufd: Add an ioctl to query PA from IOVA for noiommu mode
Date: Sat, 23 May 2026 15:09:12 -0700
Message-ID: <20260523150912.00002e02@linux.microsoft.com>
In-Reply-To: <dec4651f-ad7b-4a9c-8c25-2b68f8a85088@intel.com>

Hi Yi,

On Fri, 22 May 2026 17:22:51 +0800
Yi Liu <yi.l.liu@intel.com> wrote:

> On 5/22/26 06:11, Jacob Pan wrote:
> > To support no-IOMMU mode where userspace drivers perform unsafe DMA
> > using physical addresses, introduce a new API to retrieve the
> > physical address of a user-allocated DMA buffer that has been
> > mapped to an IOVA via IOAS. The mapping is backed by SW-only I/O
> > page tables  
> 
> nit: /via IOAS/via IOMMU_IOAS_MAP/
> 
> > maintained by the generic IOMMUPT framework.
> > 
> > Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
> > Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> > Co-developed-by: Jason Gunthorpe <jgg@nvidia.com>
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>
> > ---
> > v6:
> >     - Limit search length (Baolu, Jason)
> > v5:
> >     - Fix next_iova exceeds iopt_area_last_iova (Alex)
> >     - Rename IOCTL more specific to NOIOMMU, i.e.
> > 	 IOMMUFD_CMD_IOAS_NOIOMMU_GET_PA (Kevin)
> >     - Add header stubs for iopt_get_phys()
> > v4:
> >     - Fix ioctl return type (Yi Liu)
> > ---
> >   drivers/iommu/iommufd/io_pagetable.c    | 72 +++++++++++++++++++++++++
> >   drivers/iommu/iommufd/ioas.c            | 30 +++++++++++
> >   drivers/iommu/iommufd/iommufd_private.h | 18 +++++++
> >   drivers/iommu/iommufd/main.c            |  3 ++
> >   include/uapi/linux/iommufd.h            | 27 ++++++++++
> >   5 files changed, 150 insertions(+)
> > 
> > diff --git a/drivers/iommu/iommufd/io_pagetable.c b/drivers/iommu/iommufd/io_pagetable.c
> > index 24d4917105d9..4369447e2125 100644
> > --- a/drivers/iommu/iommufd/io_pagetable.c
> > +++ b/drivers/iommu/iommufd/io_pagetable.c
> > @@ -859,6 +859,78 @@ int iopt_unmap_iova(struct io_pagetable *iopt, unsigned long iova,
> >   	return iopt_unmap_iova_range(iopt, iova, iova_last, unmapped);
> >   }
> >   
> > +#ifdef CONFIG_IOMMUFD_NOIOMMU
> > +int iopt_get_phys(struct io_pagetable *iopt, unsigned long iova, u64 *paddr,
> > +		  u64 *length)
> > +{
> > +	struct iopt_area *area;
> > +	u64 max_length = *length;
> > +	u64 tmp_length = 0;
> > +	u64 tmp_paddr = 0;
> > +	int rc = 0;
> > +
> > +	down_read(&iopt->iova_rwsem);
> > +	area = iopt_area_iter_first(iopt, iova, iova);
> > +	if (!area || !area->pages) {
> > +		rc = -ENOENT;
> > +		goto unlock_exit;
> > +	}
> > +
> > +	if (!area->storage_domain ||
> > +	    area->storage_domain->owner != &iommufd_noiommu_ops) {
> > +		rc = -EOPNOTSUPP;
> > +		goto unlock_exit;
> > +	}
> > +
> > +	*paddr = iommu_iova_to_phys(area->storage_domain, iova);
> > +	if (!*paddr) {
> > +		rc = -EINVAL;
> > +		goto unlock_exit;
> > +	}
> > +
> > +	tmp_length = PAGE_SIZE - offset_in_page(iova);
> > +	tmp_paddr = *paddr;
> > +	/*
> > +	 * Scan the domain for the contiguous physical address length so that
> > +	 * userspace search can be optimized for fewer ioctls. A max_length of
> > +	 * 0 means no limit.
> > +	 */
> > +	while (iova < iopt_area_last_iova(area)) {
> > +		unsigned long next_iova;
> > +		u64 next_paddr;
> > +
> > +		if (max_length && tmp_length >= max_length) {
> > +			tmp_length = max_length;  
> 
> nit: is this value setting duplicated with the one outside this loop?
Yes, you are right. I will drop the assignment inside the loop and keep only the break:

         if (max_length && tmp_length >= max_length)
                 break;
Thanks!

Jacob

> > +			break;
> > +		}
> > +
> > +		if (check_add_overflow(iova, PAGE_SIZE, &next_iova))
> > +			break;
> > +
> > +		if (next_iova > iopt_area_last_iova(area))
> > +			break;
> > +
> > +		next_paddr = iommu_iova_to_phys(area->storage_domain, next_iova);
> > +
> > +		if (!next_paddr || next_paddr != tmp_paddr + PAGE_SIZE)
> > +			break;
> > +
> > +		iova = next_iova;
> > +		tmp_paddr += PAGE_SIZE;
> > +		tmp_length += PAGE_SIZE;
> > +	}
> > +
> > +	if (max_length && tmp_length > max_length)
> > +		tmp_length = max_length;
> > +	*length = tmp_length;
> > +
> > +unlock_exit:
> > +	up_read(&iopt->iova_rwsem);
> > +
> > +	return rc;
> > +}
> > +#endif
> > +
> >   int iopt_unmap_all(struct io_pagetable *iopt, unsigned long *unmapped)
> >   {
> >   	/* If the IOVAs are empty then unmap all succeeds */
> > diff --git a/drivers/iommu/iommufd/ioas.c b/drivers/iommu/iommufd/ioas.c
> > index fed06c2b728e..82bbc0c2357e 100644
> > --- a/drivers/iommu/iommufd/ioas.c
> > +++ b/drivers/iommu/iommufd/ioas.c
> > @@ -375,6 +375,36 @@ int iommufd_ioas_unmap(struct iommufd_ucmd *ucmd)
> >   	return rc;
> >   }
> >   
> > +#ifdef CONFIG_IOMMUFD_NOIOMMU
> > +int iommufd_ioas_noiommu_get_pa(struct iommufd_ucmd *ucmd)
> > +{
> > +	struct iommu_ioas_noiommu_get_pa *cmd = ucmd->cmd;
> > +	struct iommufd_ioas *ioas;
> > +	int rc;
> > +
> > +	if (!capable(CAP_SYS_RAWIO))
> > +		return -EPERM;
> > +
> > +	if (cmd->flags || cmd->__reserved)
> > +		return -EOPNOTSUPP;
> > +
> > +	ioas = iommufd_get_ioas(ucmd->ictx, cmd->ioas_id);
> > +	if (IS_ERR(ioas))
> > +		return PTR_ERR(ioas);
> > +
> > +	rc = iopt_get_phys(&ioas->iopt, cmd->iova, &cmd->out_phys,
> > +			   &cmd->length);
> > +	if (rc)
> > +		goto out_put;
> > +
> > +	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
> > +out_put:
> > +	iommufd_put_object(ucmd->ictx, &ioas->obj);
> > +
> > +	return rc;
> > +}
> > +#endif
> > +
> >   static void iommufd_release_all_iova_rwsem(struct iommufd_ctx *ictx,
> >   					   struct xarray *ioas_list)
> >   {
> > diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
> > index 2682b5baa6e9..13f1506d8066 100644
> > --- a/drivers/iommu/iommufd/iommufd_private.h
> > +++ b/drivers/iommu/iommufd/iommufd_private.h
> > @@ -118,6 +118,16 @@ int iopt_map_pages(struct io_pagetable *iopt, struct list_head *pages_list,
> >   int iopt_unmap_iova(struct io_pagetable *iopt, unsigned long iova,
> >   		    unsigned long length, unsigned long *unmapped);
> >   int iopt_unmap_all(struct io_pagetable *iopt, unsigned long *unmapped);
> > +#ifdef CONFIG_IOMMUFD_NOIOMMU
> > +int iopt_get_phys(struct io_pagetable *iopt, unsigned long iova, u64 *paddr,
> > +		  u64 *length);
> > +#else
> > +static inline int iopt_get_phys(struct io_pagetable *iopt, unsigned long iova,
> > +				u64 *paddr, u64 *length)
> > +{
> > +	return -EOPNOTSUPP;
> > +}
> > +#endif
> >   
> >   int iopt_read_and_clear_dirty_data(struct io_pagetable *iopt,
> >   				   struct iommu_domain *domain,
> > @@ -346,6 +356,14 @@ int iommufd_ioas_map_file(struct iommufd_ucmd *ucmd);
> >   int iommufd_ioas_change_process(struct iommufd_ucmd *ucmd);
> >   int iommufd_ioas_copy(struct iommufd_ucmd *ucmd);
> >   int iommufd_ioas_unmap(struct iommufd_ucmd *ucmd);
> > +#ifdef CONFIG_IOMMUFD_NOIOMMU
> > +int iommufd_ioas_noiommu_get_pa(struct iommufd_ucmd *ucmd);
> > +#else
> > +static inline int iommufd_ioas_noiommu_get_pa(struct iommufd_ucmd *ucmd)
> > +{
> > +	return -EOPNOTSUPP;
> > +}
> > +#endif
> >   int iommufd_ioas_option(struct iommufd_ucmd *ucmd);
> >   int iommufd_option_rlimit_mode(struct iommu_option *cmd,
> >   			       struct iommufd_ctx *ictx);
> > diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
> > index 8c6d43601afb..3b4192d70570 100644
> > --- a/drivers/iommu/iommufd/main.c
> > +++ b/drivers/iommu/iommufd/main.c
> > @@ -424,6 +424,7 @@ union ucmd_buffer {
> >   	struct iommu_ioas_alloc alloc;
> >   	struct iommu_ioas_allow_iovas allow_iovas;
> >   	struct iommu_ioas_copy ioas_copy;
> > +	struct iommu_ioas_noiommu_get_pa noiommu_get_pa;
> >   	struct iommu_ioas_iova_ranges iova_ranges;
> >   	struct iommu_ioas_map map;
> >   	struct iommu_ioas_unmap unmap;
> > @@ -482,6 +483,8 @@ static const struct iommufd_ioctl_op iommufd_ioctl_ops[] = {
> >   	IOCTL_OP(IOMMU_IOAS_MAP, iommufd_ioas_map, struct iommu_ioas_map, iova),
> >   	IOCTL_OP(IOMMU_IOAS_MAP_FILE, iommufd_ioas_map_file,
> >   		 struct iommu_ioas_map_file, iova),
> > +	IOCTL_OP(IOMMU_IOAS_NOIOMMU_GET_PA, iommufd_ioas_noiommu_get_pa, struct iommu_ioas_noiommu_get_pa,
> > +		 out_phys),
> >   	IOCTL_OP(IOMMU_IOAS_UNMAP, iommufd_ioas_unmap, struct iommu_ioas_unmap,
> >   		 length),
> >   	IOCTL_OP(IOMMU_OPTION, iommufd_option, struct iommu_option, val64),
> > diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
> > index e998dfbd6960..26b4998439e8 100644
> > --- a/include/uapi/linux/iommufd.h
> > +++ b/include/uapi/linux/iommufd.h
> > @@ -57,6 +57,7 @@ enum {
> >   	IOMMUFD_CMD_IOAS_CHANGE_PROCESS = 0x92,
> >   	IOMMUFD_CMD_VEVENTQ_ALLOC = 0x93,
> >   	IOMMUFD_CMD_HW_QUEUE_ALLOC = 0x94,
> > +	IOMMUFD_CMD_IOAS_NOIOMMU_GET_PA = 0x95,
> >   };
> >   
> >   /**
> > @@ -219,6 +220,32 @@ struct iommu_ioas_map {
> >   };
> >   #define IOMMU_IOAS_MAP _IO(IOMMUFD_TYPE, IOMMUFD_CMD_IOAS_MAP)
> >   
> > +/**
> > + * struct iommu_ioas_noiommu_get_pa - ioctl(IOMMU_IOAS_NOIOMMU_GET_PA)
> > + * @size: sizeof(struct iommu_ioas_noiommu_get_pa)
> > + * @flags: Reserved, must be 0 for now
> > + * @ioas_id: IOAS ID to query IOVA to PA mapping from
> > + * @__reserved: Must be 0
> > + * @iova: IOVA to query
> > + * @length: On input, maximum number of bytes to scan for contiguity (0 means
> > + *          no limit). On output, actual number of contiguous bytes starting
> > + *          from out_phys.
> > + * @out_phys: Output physical address the IOVA maps to
> > + *
> > + * Query the physical address backing an IOVA range. The entire range must be
> > + * mapped already. For noiommu devices doing unsafe DMA only.
> > + */
> > +struct iommu_ioas_noiommu_get_pa {
> > +	__u32 size;
> > +	__u32 flags;
> > +	__u32 ioas_id;
> > +	__u32 __reserved;
> > +	__aligned_u64 iova;
> > +	__aligned_u64 length;
> > +	__aligned_u64 out_phys;
> > +};
> > +#define IOMMU_IOAS_NOIOMMU_GET_PA _IO(IOMMUFD_TYPE, IOMMUFD_CMD_IOAS_NOIOMMU_GET_PA)
> > +
> >   /**
> >    * struct iommu_ioas_map_file - ioctl(IOMMU_IOAS_MAP_FILE)
> >    * @size: sizeof(struct iommu_ioas_map_file)  
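
For readers following the UAPI part of the quoted patch, here is a minimal userspace sketch of how the proposed command structure lays out and how a caller might fill it in. It is an illustration only: the type name model_noiommu_get_pa and the helper model_prepare_query() are hypothetical, __aligned_u64 is modelled with a GCC/Clang alignment attribute, and nothing here issues a real ioctl. The one point taken from the patch is that the handler returns -EOPNOTSUPP when flags or __reserved is nonzero, so both must be zeroed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace mirror of the proposed
 * struct iommu_ioas_noiommu_get_pa. Field order follows the quoted patch;
 * __aligned_u64 is approximated with an explicit 8-byte alignment.
 */
struct model_noiommu_get_pa {
	uint32_t size;
	uint32_t flags;		/* must be 0 */
	uint32_t ioas_id;
	uint32_t reserved;	/* stands in for __reserved, must be 0 */
	uint64_t iova __attribute__((aligned(8)));
	uint64_t length __attribute__((aligned(8)));	/* in: scan limit, 0 = none */
	uint64_t out_phys __attribute__((aligned(8)));	/* out: physical address */
};

/* Four 32-bit fields plus three 64-bit fields: 40 bytes with no padding. */
static_assert(sizeof(struct model_noiommu_get_pa) == 40,
	      "unexpected padding in command struct");

/*
 * Hypothetical helper: build a query with every field not named here
 * zeroed, so flags and reserved pass the handler's checks.
 */
static struct model_noiommu_get_pa
model_prepare_query(uint32_t ioas_id, uint64_t iova, uint64_t max_len)
{
	struct model_noiommu_get_pa cmd = {
		.size = sizeof(cmd),
		.ioas_id = ioas_id,
		.iova = iova,
		.length = max_len,
	};

	return cmd;
}
```

A designated initializer is used so that any field not listed, including the reserved one, starts at zero without an explicit memset().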

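
The change agreed on above (break out of the loop without assigning tmp_length, and rely on the clamp after the loop) can be checked with a small userspace model of the scan. This is a sketch under stated assumptions, not the kernel code: model_domain, model_iova_to_phys() and model_get_phys() are hypothetical stand-ins for the storage domain, iommu_iova_to_phys() and iopt_get_phys(); a 4 KiB page size is assumed; locking and the area lookup are left out; and __builtin_add_overflow (GCC/Clang) stands in for check_add_overflow().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096ULL	/* assumption: 4 KiB pages */

/* Hypothetical flat "page table": physical base per IOVA page, 0 = unmapped. */
struct model_domain {
	const uint64_t *page_phys;
	size_t nr_pages;
};

/* Stand-in for iommu_iova_to_phys(): returns 0 when the IOVA is unmapped. */
static uint64_t model_iova_to_phys(const struct model_domain *d, uint64_t iova)
{
	uint64_t idx = iova / MODEL_PAGE_SIZE;

	if (idx >= d->nr_pages || !d->page_phys[idx])
		return 0;
	return d->page_phys[idx] + (iova % MODEL_PAGE_SIZE);
}

/*
 * Model of the contiguity scan. *length is the scan limit on input
 * (0 = no limit) and the contiguous byte count on output.
 * Returns 0 on success, -1 if the starting IOVA is unmapped.
 */
static int model_get_phys(const struct model_domain *d, uint64_t iova,
			  uint64_t last_iova, uint64_t *paddr, uint64_t *length)
{
	uint64_t max_length, tmp_length, tmp_paddr;

	assert(paddr && length);
	max_length = *length;

	*paddr = model_iova_to_phys(d, iova);
	if (!*paddr)
		return -1;

	tmp_length = MODEL_PAGE_SIZE - (iova % MODEL_PAGE_SIZE);
	tmp_paddr = *paddr;

	while (iova < last_iova) {
		uint64_t next_iova, next_paddr;

		/* Simplified exit: no assignment here, only the break. */
		if (max_length && tmp_length >= max_length)
			break;
		if (__builtin_add_overflow(iova, MODEL_PAGE_SIZE, &next_iova))
			break;
		if (next_iova > last_iova)
			break;

		next_paddr = model_iova_to_phys(d, next_iova);
		if (!next_paddr || next_paddr != tmp_paddr + MODEL_PAGE_SIZE)
			break;

		iova = next_iova;
		tmp_paddr += MODEL_PAGE_SIZE;
		tmp_length += MODEL_PAGE_SIZE;
	}

	/* The single clamp that makes the in-loop assignment redundant. */
	if (max_length && tmp_length > max_length)
		tmp_length = max_length;
	*length = tmp_length;
	return 0;
}
```

In this model, the clamp after the loop covers every path on which tmp_length can exceed max_length, including the case where the very first partial page already exceeds the limit. That is why the in-loop assignment can be dropped without changing the result.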
