qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Zhenzhong Duan <zhenzhong.duan@intel.com>
To: qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com,
	nicolinc@nvidia.com, joao.m.martins@oracle.com,
	eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com,
	kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com,
	chao.p.peng@intel.com
Subject: [PATCH v7 27/27] docs/devel: Add VFIO iommufd backend documentation
Date: Tue, 21 Nov 2023 16:44:26 +0800	[thread overview]
Message-ID: <20231121084426.1286987-28-zhenzhong.duan@intel.com> (raw)
In-Reply-To: <20231121084426.1286987-1-zhenzhong.duan@intel.com>

Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 MAINTAINERS                    |   1 +
 docs/devel/index-internals.rst |   1 +
 docs/devel/vfio-iommufd.rst    | 166 +++++++++++++++++++++++++++++++++
 3 files changed, 168 insertions(+)
 create mode 100644 docs/devel/vfio-iommufd.rst

diff --git a/MAINTAINERS b/MAINTAINERS
index ca70bb4e64..0ddb20a35f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2176,6 +2176,7 @@ F: backends/iommufd.c
 F: include/sysemu/iommufd.h
 F: include/qemu/chardev_open.h
 F: util/chardev_open.c
+F: docs/devel/vfio-iommufd.rst
 
 vhost
 M: Michael S. Tsirkin <mst@redhat.com>
diff --git a/docs/devel/index-internals.rst b/docs/devel/index-internals.rst
index 6f81df92bc..3def4a138b 100644
--- a/docs/devel/index-internals.rst
+++ b/docs/devel/index-internals.rst
@@ -18,5 +18,6 @@ Details about QEMU's various subsystems including how to add features to them.
    s390-dasd-ipl
    tracing
    vfio-migration
+   vfio-iommufd
    writing-monitor-commands
    virtio-backends
diff --git a/docs/devel/vfio-iommufd.rst b/docs/devel/vfio-iommufd.rst
new file mode 100644
index 0000000000..3d1c11f175
--- /dev/null
+++ b/docs/devel/vfio-iommufd.rst
@@ -0,0 +1,166 @@
+===============================
+IOMMUFD BACKEND usage with VFIO
+===============================
+
+(Same meaning for backend/container/BE)
+
+With the introduction of iommufd, the Linux kernel provides a generic
+interface for user space drivers to propagate their DMA mappings to kernel
+for assigned devices. While the legacy kernel interface is group-centric,
+the new iommufd interface is device-centric, relying on device fd and iommufd.
+
+To support both interfaces in the QEMU VFIO device, introduce a base container
+to abstract the common part of VFIO legacy and iommufd container. So that the
+generic VFIO code can use either container.
+
+The base container implements generic functions such as memory_listener and
+address space management whereas the derived container implements callbacks
+specific to either legacy or iommufd. Each container has its own way to setup
+secure context and dma management interface. The below diagram shows how it
+looks like with both containers.
+
+::
+
+                      VFIO                           AddressSpace/Memory
+      +-------+  +----------+  +-----+  +-----+
+      |  pci  |  | platform |  |  ap |  | ccw |
+      +---+---+  +----+-----+  +--+--+  +--+--+     +----------------------+
+          |           |           |        |        |   AddressSpace       |
+          |           |           |        |        +------------+---------+
+      +---V-----------V-----------V--------V----+               /
+      |           VFIOAddressSpace              | <------------+
+      |                  |                      |  MemoryListener
+      |        VFIOContainerBase list           |
+      +-------+----------------------------+----+
+              |                            |
+              |                            |
+      +-------V------+            +--------V----------+
+      |   iommufd    |            |    vfio legacy    |
+      |  container   |            |     container     |
+      +-------+------+            +--------+----------+
+              |                            |
+              | /dev/iommu                 | /dev/vfio/vfio
+              | /dev/vfio/devices/vfioX    | /dev/vfio/$group_id
+  Userspace   |                            |
+  ============+============================+===========================
+  Kernel      |  device fd                 |
+              +---------------+            | group/container fd
+              | (BIND_IOMMUFD |            | (SET_CONTAINER/SET_IOMMU)
+              |  ATTACH_IOAS) |            | device fd
+              |               |            |
+              |       +-------V------------V-----------------+
+      iommufd |       |                vfio                  |
+  (map/unmap  |       +---------+--------------------+-------+
+  ioas_copy)  |                 |                    | map/unmap
+              |                 |                    |
+       +------V------+    +-----V------+      +------V--------+
+       | iommfd core |    |  device    |      |  vfio iommu   |
+       +-------------+    +------------+      +---------------+
+
+* Secure Context setup
+
+  - iommufd BE: uses device fd and iommufd to setup secure context
+    (bind_iommufd, attach_ioas)
+  - vfio legacy BE: uses group fd and container fd to setup secure context
+    (set_container, set_iommu)
+
+* Device access
+
+  - iommufd BE: device fd is opened through ``/dev/vfio/devices/vfioX``
+  - vfio legacy BE: device fd is retrieved from group fd ioctl
+
+* DMA Mapping flow
+
+  1. VFIOAddressSpace receives MemoryRegion add/del via MemoryListener
+  2. VFIO populates DMA map/unmap via the container BEs
+     * iommufd BE: uses iommufd
+     * vfio legacy BE: uses container fd
+
+Example configuration
+=====================
+
+Step 1: configure the host device
+---------------------------------
+
+It's exactly same as the VFIO device with legacy VFIO container.
+
+Step 2: configure QEMU
+----------------------
+
+Interactions with the ``/dev/iommu`` are abstracted by a new iommufd
+object (compiled in with the ``CONFIG_IOMMUFD`` option).
+
+Any QEMU device (e.g. VFIO device) wishing to use ``/dev/iommu`` must
+be linked with an iommufd object. It gets a new optional property
+named iommufd which allows to pass an iommufd object. Take ``vfio-pci``
+device for example:
+
+.. code-block:: bash
+
+    -object iommufd,id=iommufd0
+    -device vfio-pci,host=0000:02:00.0,iommufd=iommufd0
+
+Note the ``/dev/iommu`` and VFIO cdev can be externally opened by a
+management layer. In such a case the fd is passed, the fd supports a
+string naming the fd or a number, for example:
+
+.. code-block:: bash
+
+    -object iommufd,id=iommufd0,fd=22
+    -device vfio-pci,iommufd=iommufd0,fd=23
+
+If the ``fd`` property is not passed, the fd is opened by QEMU.
+
+If no ``iommufd`` object is passed to the ``vfio-pci`` device, iommufd
+is not used and the user gets the behavior based on the legacy VFIO
+container:
+
+.. code-block:: bash
+
+    -device vfio-pci,host=0000:02:00.0
+
+Supported platform
+==================
+
+Supports x86, ARM and s390x currently.
+
+Caveats
+=======
+
+Dirty page sync
+---------------
+
+Dirty page sync with iommufd backend is unsupported yet, live migration is
+disabled by default. But it can be force enabled like below, low efficient
+though.
+
+.. code-block:: bash
+
+    -object iommufd,id=iommufd0
+    -device vfio-pci,host=0000:02:00.0,iommufd=iommufd0,enable-migration=on
+
+P2P DMA
+-------
+
+PCI p2p DMA is unsupported as IOMMUFD doesn't support mapping hardware PCI
+BAR region yet. Below warning shows for assigned PCI device, it's not a bug.
+
+.. code-block:: none
+
+    qemu-system-x86_64: warning: IOMMU_IOAS_MAP failed: Bad address, PCI BAR?
+    qemu-system-x86_64: vfio_container_dma_map(0x560cb6cb1620, 0xe000000021000, 0x3000, 0x7f32ed55c000) = -14 (Bad address)
+
+FD passing with mdev
+--------------------
+
+``vfio-pci`` device checks sysfsdev property to decide if backend is a mdev.
+If FD passing is used, there is no way to know that and the mdev is treated
+like a real PCI device. There is an error as below if user wants to enable
+RAM discarding for mdev.
+
+.. code-block:: none
+
+    qemu-system-x86_64: -device vfio-pci,iommufd=iommufd0,x-balloon-allowed=on,fd=9: vfio VFIO_FD9: x-balloon-allowed only potentially compatible with mdev devices
+
+``vfio-ap`` and ``vfio-ccw`` devices don't have same issue as their backend
+devices are always mdev and RAM discarding is force enabled.
-- 
2.34.1



  parent reply	other threads:[~2023-11-21  8:49 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-21  8:43 [PATCH v7 00/27] vfio: Adopt iommufd Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 01/27] backends/iommufd: Introduce the iommufd object Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 02/27] util/char_dev: Add open_cdev() Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 03/27] vfio/common: return early if space isn't empty Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 04/27] vfio/iommufd: Implement the iommufd backend Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 05/27] vfio/iommufd: Relax assert check for " Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 06/27] vfio/iommufd: Add support for iova_ranges and pgsizes Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 07/27] vfio/pci: Extract out a helper vfio_pci_get_pci_hot_reset_info Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 08/27] vfio/pci: Introduce a vfio pci hot reset interface Zhenzhong Duan
2023-11-21 18:38   ` Philippe Mathieu-Daudé
2023-11-22  3:32     ` Duan, Zhenzhong
2023-11-21  8:44 ` [PATCH v7 09/27] vfio/iommufd: Enable pci hot reset through iommufd cdev interface Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 10/27] vfio/pci: Allow the selection of a given iommu backend Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 11/27] vfio/pci: Make vfio cdev pre-openable by passing a file handle Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 12/27] vfio/platform: Allow the selection of a given iommu backend Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 13/27] vfio/platform: Make vfio cdev pre-openable by passing a file handle Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 14/27] vfio/ap: Allow the selection of a given iommu backend Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 15/27] vfio/ap: Make vfio cdev pre-openable by passing a file handle Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 16/27] vfio/ccw: Allow the selection of a given iommu backend Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 17/27] vfio/ccw: Make vfio cdev pre-openable by passing a file handle Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 18/27] vfio: Make VFIOContainerBase poiner parameter const in VFIOIOMMUOps callbacks Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 19/27] hw/arm: Activate IOMMUFD for virt machines Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 20/27] kconfig: Activate IOMMUFD for s390x machines Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 21/27] hw/i386: Activate IOMMUFD for q35 machines Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 22/27] vfio/pci: Move VFIODevice initializations in vfio_instance_init Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 23/27] vfio/platform: Move VFIODevice initializations in vfio_platform_instance_init Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 24/27] vfio/ap: Move VFIODevice initializations in vfio_ap_instance_init Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 25/27] vfio/ccw: Move VFIODevice initializations in vfio_ccw_instance_init Zhenzhong Duan
2023-11-21  8:44 ` [PATCH v7 26/27] vfio: Introduce a helper function to initialize VFIODevice Zhenzhong Duan
2023-11-21  8:44 ` Zhenzhong Duan [this message]
2023-11-21 17:22 ` [PATCH v7 00/27] vfio: Adopt iommufd Cédric Le Goater
2023-11-22  3:21   ` Duan, Zhenzhong
2023-11-22  8:06     ` Cédric Le Goater
2023-11-22 11:49       ` Duan, Zhenzhong
2023-11-21 22:56 ` Nicolin Chen
2023-11-22  3:32   ` Duan, Zhenzhong
2023-11-22 13:48 ` Joao Martins
2023-11-28 17:10 ` Cédric Le Goater

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231121084426.1286987-28-zhenzhong.duan@intel.com \
    --to=zhenzhong.duan@intel.com \
    --cc=alex.williamson@redhat.com \
    --cc=chao.p.peng@intel.com \
    --cc=clg@redhat.com \
    --cc=eric.auger@redhat.com \
    --cc=jasowang@redhat.com \
    --cc=jgg@nvidia.com \
    --cc=joao.m.martins@oracle.com \
    --cc=kevin.tian@intel.com \
    --cc=nicolinc@nvidia.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=yi.l.liu@intel.com \
    --cc=yi.y.sun@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).