Linux IOMMU Development
From: Jacob Pan <jacob.pan@linux.microsoft.com>
To: linux-kernel@vger.kernel.org,
	"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
	Jason Gunthorpe <jgg@nvidia.com>,
	Alex Williamson <alex@shazbot.org>,
	Joerg Roedel <joro@8bytes.org>,
	Mostafa Saleh <smostafa@google.com>,
	David Matlack <dmatlack@google.com>,
	Robin Murphy <robin.murphy@arm.com>,
	Nicolin Chen <nicolinc@nvidia.com>,
	"Tian, Kevin" <kevin.tian@intel.com>, Yi Liu <yi.l.liu@intel.com>,
	Baolu Lu <baolu.lu@linux.intel.com>
Cc: Saurabh Sengar <ssengar@linux.microsoft.com>,
	skhawaja@google.com, pasha.tatashin@soleen.com,
	Will Deacon <will@kernel.org>,
	Jacob Pan <jacob.pan@linux.microsoft.com>
Subject: [PATCH v9 0/6] iommufd: Enable noiommu mode for cdev
Date: Thu, 11 Jun 2026 10:26:52 -0700
Message-ID: <20260611172658.3421138-1-jacob.pan@linux.microsoft.com>

VFIO's unsafe_noiommu_mode has long provided a way for userspace drivers
to operate on platforms lacking a hardware IOMMU. Today, IOMMUFD also
supports No-IOMMU mode for group-based devices under vfio_compat mode.
However, IOMMUFD's native character device (cdev) interface does not yet
support No-IOMMU mode; adding that support is the purpose of this series.

In summary, we have:

|-------------------------+------+---------------|
| Device access mode      | VFIO | IOMMUFD       |
|-------------------------+------+---------------|
| group /dev/vfio/$GROUP  | Yes  | Yes           |
|-------------------------+------+---------------|
| cdev /dev/vfio/devices/ | No   | This patch    |
|-------------------------+------+---------------|

Beyond enabling No-IOMMU mode for the IOMMUFD cdev, this series also
addresses, as suggested by Jason [1], the following deficiencies in the
current No-IOMMU mode:
- Devices operating under No-IOMMU mode are limited to device-level UAPI
  access, without container or IOAS-level capabilities. Consequently,
  user-space drivers lack structured mechanisms for page pinning and often
  resort to mlock(), which is less robust than pin_user_pages() used for
  devices backed by a physical IOMMU. For example, mlock() does not prevent
  page migration.
- There is no architectural mechanism for obtaining physical addresses for
  DMA. As a workaround, user-space drivers frequently rely on /proc/pagemap
  tricks or hardcoded values.
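
For reference, the /proc/pagemap workaround mentioned above typically
looks something like the sketch below. It is not part of this series;
the function names are hypothetical and only the documented pagemap entry
layout is assumed (bit 63 = page present, bits 0-54 = PFN). The kernel
reports a PFN of zero to callers without CAP_SYS_ADMIN, so the sketch
treats a zero PFN as "hidden", and nothing here keeps the page from being
migrated after the lookup.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

#define PM_PRESENT  (1ULL << 63)        /* page is resident in RAM */
#define PM_PFN_MASK ((1ULL << 55) - 1)  /* bits 0-54 hold the PFN */

/*
 * Decode one 64-bit pagemap entry into a physical address.
 * Returns 0 on success, -1 if the page is not present or the PFN is
 * reported as zero (which is what unprivileged readers get).
 */
static int pagemap_entry_to_pa(uint64_t entry, uint64_t page_size,
                               uint64_t offset_in_page, uint64_t *pa)
{
    uint64_t pfn = entry & PM_PFN_MASK;

    if (!(entry & PM_PRESENT) || pfn == 0)
        return -1;
    *pa = pfn * page_size + offset_in_page;
    return 0;
}

/*
 * Look up the physical address behind a virtual address of the calling
 * process by reading its entry from /proc/self/pagemap.
 */
static int virt_to_phys_pagemap(const void *va, uint64_t *pa)
{
    uint64_t page_size = (uint64_t)sysconf(_SC_PAGESIZE);
    uint64_t vaddr = (uint64_t)(uintptr_t)va;
    uint64_t entry;
    ssize_t n;
    int fd;

    fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0)
        return -1;
    /* One 8-byte entry per virtual page, indexed by virtual page number. */
    n = pread(fd, &entry, sizeof(entry),
              (off_t)((vaddr / page_size) * sizeof(entry)));
    close(fd);
    if (n != (ssize_t)sizeof(entry))
        return -1;
    return pagemap_entry_to_pa(entry, page_size, vaddr % page_size, pa);
}
```

A caller would mlock() the buffer, touch it, and then call
virt_to_phys_pagemap() on each page, which illustrates both deficiencies:
the pinning is not robust and the address lookup is a side channel rather
than an API.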

By allowing noiommu device access to IOMMUFD IOAS and HWPT objects, this
patch brings No-IOMMU mode closer to full citizenship within the IOMMU
subsystem. In addition to addressing the two deficiencies mentioned above,
the expectation is that it will also enable No-IOMMU devices to seamlessly
participate in live update sessions via KHO [2].

Furthermore, these devices will use the IOMMUFD-based ownership checking
model for VFIO_DEVICE_PCI_HOT_RESET, eliminating the need for an
iommufd_access object as required in a previous attempt [3].

ChangeLog:
v9:
  - Leave device->device.devt unset for no-IOMMU dev so cdev_device_add()
    registers only the struct device and does not expose an unsupported
    cdev. (Alex, Sashiko)
  - Clarify VFIO cdev no-IOMMU Kconfig limits in documentation
  - Hold registration while checking cdev no-IOMMU access (Sashiko)
  - Make no-IOMMU GET_PA length a real upper bound and reject zero length,
    avoiding an unbounded scan while holding IOAS locks. This matches the
    bounded-range semantics expected by the incoming
    iommu_iova_to_phys_length() helper.
  - Guard replace path for noiommu device (Sashiko)
v8:
  - Guard noiommu for vdevice viommu alloc (Kevin)
v7:
  - Handle Sashiko reviews.
  - Dropped selftest for now, will submit separately for v7.2 to use
    new lib helpers
v6:
  - Undo CDEV-GROUP NOIOMMU split, use Kconfig to restrict unwanted
    combo.
v5:
  - Split CONFIG_VFIO_NOIOMMU into CONFIG_VFIO_GROUP_NOIOMMU and
    CONFIG_VFIO_CDEV_NOIOMMU so cdev noiommu is independent of
    VFIO_GROUP (Alex)
  - Add CAP_SYS_RAWIO check for cdev open and bind under noiommu,
    security parity with group noiommu (Alex)
  - Add IS_ENABLED(CONFIG_IOMMUFD_NOIOMMU) guard in
    iommufd_device_is_noiommu() to prevent noiommu bind when feature
    is disabled
  - Add prep patch to tolerate NULL group for cdev noiommu devices
    when CONFIG_VFIO_GROUP_NOIOMMU is not set [7/9]
  - Rename IOCTL to IOMMUFD_CMD_IOAS_NOIOMMU_GET_PA to be more
    specific (Kevin)
  - Simplify iommufd_device_is_noiommu, use iommufd_bind_noiommu
    helper (Kevin, Yi)
  - Move IOMMU cap check under iommufd_bind_iommu() (Yi)
  - Fix next_iova exceeding iopt_area_last_iova in GET_PA (Alex)
  - Fix const hwpt, copyright date, typo in moved comment (Kevin)
  - Add Reviewed-by tags
  - Squash noiommu cdev selftest fix into selftest patch
  - Drop DSA selftest patch
  - Details in each patch changelog.

v4:
  - Fix various corner cases pointed out by Sashiko.
    Details in each patch changelog.

v3:
  - Improve error handling [3/10] (Mostafa)
  - Simplify vfio_device_is_noiommu logic and merged in [6/10] (Mostafa)
  - Add comment to explain the design difference over the legacy noiommu
    VFIO code. [1/10]

v2:
  - Fix build dependency by adding IOMMU_SUPPORT in [8/11]
  - Add an optimization to scan beyond the first page for a contiguous
    physical address range and return its length instead of a single
    page. [4/11]

Since RFC [4]:
  - Abandoned the dummy iommu driver approach, as patches 1-3 absorbed
    the changes into iommufd.
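
As a rough illustration of the GET_PA semantics described in the
changelog above (scan past the first page for a physically contiguous
run, treat the caller's length as a hard upper bound, reject a zero
length), here is a userspace sketch. It is not the kernel code from this
series: sketch_get_pa(), the pages[] array standing in for the pinned
pages of an IOAS area, and the fixed 4 KiB page size are all assumptions
made for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL  /* assumed page size for this sketch */

/*
 * pages[i] is the page-aligned physical address of the i-th page backing
 * a mapping; offset is the byte offset of the queried IOVA into pages[0].
 * max_len is the caller's upper bound in bytes.
 *
 * On success, *out_pa is the physical address of the IOVA and *out_len
 * is the length of the physically contiguous range starting there,
 * never larger than max_len.
 */
static int sketch_get_pa(const uint64_t *pages, size_t npages,
                         uint64_t offset, uint64_t max_len,
                         uint64_t *out_pa, uint64_t *out_len)
{
    uint64_t len;
    size_t i = 1;

    /* A zero length would make the scan bound meaningless: reject it. */
    if (max_len == 0 || npages == 0 || offset >= SKETCH_PAGE_SIZE)
        return -EINVAL;

    len = SKETCH_PAGE_SIZE - offset;  /* bytes left in the first page */

    /* Extend only while bounded by max_len and physically contiguous. */
    while (len < max_len && i < npages &&
           pages[i] == pages[i - 1] + SKETCH_PAGE_SIZE) {
        len += SKETCH_PAGE_SIZE;
        i++;
    }

    if (len > max_len)
        len = max_len;

    *out_pa = pages[0] + offset;
    *out_len = len;
    return 0;
}
```

Because the loop stops as soon as len reaches max_len, the amount of work
done per call is bounded by the caller-supplied length rather than by the
size of the mapping.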

[1] https://lore.kernel.org/linux-iommu/20250603175403.GA407344@nvidia.com/
[2] https://lore.kernel.org/linux-pci/20251027134430.00007e46@linux.microsoft.com/
[3] https://lore.kernel.org/kvm/20230522115751.326947-1-yi.l.liu@intel.com/
[4] https://lore.kernel.org/linux-iommu/20251201173012.18371-1-jacob.pan@linux.microsoft.com/

Future cleanup: consolidate all CONFIG_IOMMUFD_NOIOMMU code
(iopt_get_phys, iommufd_ioas_noiommu_get_pa, iommufd_noiommu_ops) into
hwpt_noiommu.c to eliminate #ifdef guards from ioas.c and io_pagetable.c.

Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>


---
      3	Jacob Pan
      3	Jason Gunthorpe

  iommufd: Support a HWPT without an iommu driver for noiommu
  iommufd: Move igroup allocation to a function
  iommufd: Allow binding to a noiommu device
  iommufd: Add an ioctl to query PA from IOVA for noiommu mode
  vfio: Enable cdev noiommu mode under iommufd
  Documentation: Update VFIO NOIOMMU mode

 Documentation/driver-api/vfio.rst       |  89 +++++++++++++-
 drivers/iommu/iommufd/Kconfig           |  12 ++
 drivers/iommu/iommufd/Makefile          |   1 +
 drivers/iommu/iommufd/device.c          | 201 +++++++++++++++++++++++---------
 drivers/iommu/iommufd/hw_pagetable.c    |  19 ++-
 drivers/iommu/iommufd/hwpt_noiommu.c    | 105 +++++++++++++++++
 drivers/iommu/iommufd/io_pagetable.c    |  78 +++++++++++++
 drivers/iommu/iommufd/ioas.c            |  36 ++++++
 drivers/iommu/iommufd/iommufd_private.h |  30 +++++
 drivers/iommu/iommufd/main.c            |   4 +
 drivers/iommu/iommufd/viommu.c          |  14 ++-
 drivers/vfio/Kconfig                    |   7 +-
 drivers/vfio/device_cdev.c              |   9 ++
 drivers/vfio/iommufd.c                  |  12 +-
 drivers/vfio/vfio.h                     |  23 ++--
 drivers/vfio/vfio_main.c                |  26 ++++-
 include/linux/vfio.h                    |   1 +
 include/uapi/linux/iommufd.h            |  28 +++++
 18 files changed, 609 insertions(+), 86 deletions(-)

-- 
2.43.0

Thread overview: 18+ messages
2026-06-11 17:26 Jacob Pan [this message]
2026-06-11 17:26 ` [PATCH v9 1/6] iommufd: Support a HWPT without an iommu driver for noiommu Jacob Pan
2026-06-16  6:00   ` Yi Liu
2026-06-16 20:18   ` Pranjal Shrivastava
2026-06-17  0:09     ` Jason Gunthorpe
2026-06-17 10:59       ` Pranjal Shrivastava
2026-06-11 17:26 ` [PATCH v9 2/6] iommufd: Move igroup allocation to a function Jacob Pan
2026-06-16 20:23   ` Pranjal Shrivastava
2026-06-11 17:26 ` [PATCH v9 3/6] iommufd: Allow binding to a noiommu device Jacob Pan
2026-06-16 20:38   ` Pranjal Shrivastava
2026-06-11 17:26 ` [PATCH v9 4/6] iommufd: Add an ioctl to query PA from IOVA for noiommu mode Jacob Pan
2026-06-16  6:00   ` Yi Liu
2026-06-16 21:40   ` Pranjal Shrivastava
2026-06-11 17:26 ` [PATCH v9 5/6] vfio: Enable cdev noiommu mode under iommufd Jacob Pan
2026-06-11 23:14   ` Alex Williamson
2026-06-16  6:00   ` Yi Liu
2026-06-16 22:03   ` Pranjal Shrivastava
2026-06-11 17:26 ` [PATCH v9 6/6] Documentation: Update VFIO NOIOMMU mode Jacob Pan
