From: Zhenzhong Duan <zhenzhong.duan@intel.com>
To: qemu-devel@nongnu.org
Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com,
nicolinc@nvidia.com, joao.m.martins@oracle.com,
eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com,
kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com,
chao.p.peng@intel.com
Subject: [PATCH v7 27/27] docs/devel: Add VFIO iommufd backend documentation
Date: Tue, 21 Nov 2023 16:44:26 +0800
Message-ID: <20231121084426.1286987-28-zhenzhong.duan@intel.com>
In-Reply-To: <20231121084426.1286987-1-zhenzhong.duan@intel.com>
Suggested-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
MAINTAINERS | 1 +
docs/devel/index-internals.rst | 1 +
docs/devel/vfio-iommufd.rst | 166 +++++++++++++++++++++++++++++++++
3 files changed, 168 insertions(+)
create mode 100644 docs/devel/vfio-iommufd.rst
diff --git a/MAINTAINERS b/MAINTAINERS
index ca70bb4e64..0ddb20a35f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2176,6 +2176,7 @@ F: backends/iommufd.c
 F: include/sysemu/iommufd.h
 F: include/qemu/chardev_open.h
 F: util/chardev_open.c
+F: docs/devel/vfio-iommufd.rst
 
 vhost
 M: Michael S. Tsirkin <mst@redhat.com>
diff --git a/docs/devel/index-internals.rst b/docs/devel/index-internals.rst
index 6f81df92bc..3def4a138b 100644
--- a/docs/devel/index-internals.rst
+++ b/docs/devel/index-internals.rst
@@ -18,5 +18,6 @@ Details about QEMU's various subsystems including how to add features to them.
    s390-dasd-ipl
    tracing
    vfio-migration
+   vfio-iommufd
    writing-monitor-commands
    virtio-backends
diff --git a/docs/devel/vfio-iommufd.rst b/docs/devel/vfio-iommufd.rst
new file mode 100644
index 0000000000..3d1c11f175
--- /dev/null
+++ b/docs/devel/vfio-iommufd.rst
@@ -0,0 +1,166 @@
+===============================
+IOMMUFD BACKEND usage with VFIO
+===============================
+
+(In this document, "backend", "container" and "BE" are used interchangeably.)
+
+With the introduction of iommufd, the Linux kernel provides a generic interface
+for user space drivers to propagate their DMA mappings to the kernel for
+assigned devices. While the legacy kernel interface is group-centric, the new
+iommufd interface is device-centric, relying on a device fd and an iommufd.
+
+To support both interfaces in the QEMU VFIO device, a base container is
+introduced to abstract the common parts of the VFIO legacy and iommufd
+containers, so that the generic VFIO code can use either container.
+
+The base container implements generic functions such as memory_listener and
+address space management, whereas the derived containers implement callbacks
+specific to either legacy or iommufd. Each container has its own way to set up
+a secure context and a DMA management interface. The diagram below shows how
+this looks with both containers.
+
+::
+
+                      VFIO                           AddressSpace/Memory
+      +-------+  +----------+  +-----+  +-----+
+      |  pci  |  | platform |  |  ap |  | ccw |
+      +---+---+  +----+-----+  +--+--+  +--+--+     +----------------------+
+          |           |           |        |        |   AddressSpace       |
+          |           |           |        |        +------------+---------+
+      +---V-----------V-----------V--------V----+               /
+      |           VFIOAddressSpace              | <------------+
+      |                  |                      |  MemoryListener
+      |        VFIOContainerBase list           |
+      +-------+----------------------------+----+
+              |                            |
+              |                            |
+      +-------V------+            +--------V----------+
+      |   iommufd    |            |    vfio legacy    |
+      |  container   |            |     container     |
+      +-------+------+            +--------+----------+
+              |                            |
+              | /dev/iommu                 | /dev/vfio/vfio
+              | /dev/vfio/devices/vfioX    | /dev/vfio/$group_id
+  Userspace   |                            |
+  ============+============================+===========================
+  Kernel      |  device fd                 |
+              +---------------+            | group/container fd
+              | (BIND_IOMMUFD |            | (SET_CONTAINER/SET_IOMMU)
+              |  ATTACH_IOAS) |            | device fd
+              |               |            |
+              |       +-------V------------V-----------------+
+      iommufd |       |                vfio                  |
+  (map/unmap  |       +---------+--------------------+-------+
+  ioas_copy)  |                 |                    | map/unmap
+              |                 |                    |
+      +-------V------+    +-----V------+      +------V--------+
+      | iommufd core |    |  device    |      |  vfio iommu   |
+      +--------------+    +------------+      +---------------+
+
+* Secure Context setup
+
+  - iommufd BE: uses the device fd and iommufd to set up the secure context
+    (bind_iommufd, attach_ioas)
+  - vfio legacy BE: uses the group fd and container fd to set up the secure
+    context (set_container, set_iommu)
+
+* Device access
+
+  - iommufd BE: the device fd is opened through ``/dev/vfio/devices/vfioX``
+  - vfio legacy BE: the device fd is retrieved through an ioctl on the group fd
+
+* DMA Mapping flow
+
+  1. VFIOAddressSpace receives MemoryRegion add/del via MemoryListener
+  2. VFIO populates DMA map/unmap via the container BEs
+
+     * iommufd BE: uses iommufd
+     * vfio legacy BE: uses container fd
+
+Example configuration
+=====================
+
+Step 1: configure the host device
+---------------------------------
+
+This is exactly the same as for a VFIO device with the legacy VFIO container.
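+
+A minimal sketch of that host-side setup, assuming the example PCI device
+``0000:02:00.0`` and a host kernel built with VFIO device cdev support (the
+``dev`` shell variable is only illustrative, and the commands need root):
+
+.. code-block:: bash
+
+    dev=0000:02:00.0
+    modprobe vfio-pci
+    # Detach the device from its current host driver, if it has one
+    echo "$dev" > /sys/bus/pci/devices/$dev/driver/unbind
+    # Make vfio-pci the only driver allowed to bind, then re-probe
+    echo vfio-pci > /sys/bus/pci/devices/$dev/driver_override
+    echo "$dev" > /sys/bus/pci/drivers_probe
+    # With cdev support, this lists the vfioX name that also appears
+    # under /dev/vfio/devices/
+    ls /sys/bus/pci/devices/$dev/vfio-dev/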
+
+Step 2: configure QEMU
+----------------------
+
+Interactions with ``/dev/iommu`` are abstracted by a new iommufd
+object (compiled in with the ``CONFIG_IOMMUFD`` option).
+
+Any QEMU device (e.g. a VFIO device) wishing to use ``/dev/iommu`` must
+be linked with an iommufd object. Such a device gets a new optional property
+named ``iommufd``, which allows an iommufd object to be passed. Taking the
+``vfio-pci`` device as an example:
+
+.. code-block:: bash
+
+    -object iommufd,id=iommufd0
+    -device vfio-pci,host=0000:02:00.0,iommufd=iommufd0
+
+Note that ``/dev/iommu`` and the VFIO cdev can be opened externally by a
+management layer. In that case the fd is passed through the ``fd`` property,
+which accepts either a number or a string naming the fd, for example:
+
+.. code-block:: bash
+
+    -object iommufd,id=iommufd0,fd=22
+    -device vfio-pci,iommufd=iommufd0,fd=23
+
+If the ``fd`` property is not passed, the fd is opened by QEMU.
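+
+As a rough sketch of the fd-passing case, a management layer written as a
+bash wrapper could open both character devices and let QEMU inherit them.
+The ``vfio0`` name and the fd numbers 22 and 23 are assumptions chosen to
+match the example above; the remaining QEMU options are omitted:
+
+.. code-block:: bash
+
+    # Open /dev/iommu and the VFIO cdev read-write on fixed fd numbers
+    exec 22<>/dev/iommu
+    exec 23<>/dev/vfio/devices/vfio0
+    # The child QEMU process inherits fd 22 and fd 23
+    qemu-system-x86_64 \
+        -object iommufd,id=iommufd0,fd=22 \
+        -device vfio-pci,iommufd=iommufd0,fd=23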
+
+If no ``iommufd`` object is passed to the ``vfio-pci`` device, iommufd
+is not used and the user gets the behavior based on the legacy VFIO
+container:
+
+.. code-block:: bash
+
+    -device vfio-pci,host=0000:02:00.0
+
+Supported platforms
+===================
+
+x86, ARM and s390x are currently supported.
+
+Caveats
+=======
+
+Dirty page sync
+---------------
+
+Dirty page sync is not yet supported with the iommufd backend, so live
+migration is disabled by default. It can still be force-enabled as shown
+below, although it is inefficient.
+
+.. code-block:: bash
+
+    -object iommufd,id=iommufd0
+    -device vfio-pci,host=0000:02:00.0,iommufd=iommufd0,enable-migration=on
+
+P2P DMA
+-------
+
+PCI P2P DMA is unsupported because IOMMUFD does not yet support mapping hardware
+PCI BAR regions. The warning below is shown for an assigned PCI device; it is not a bug.
+
+.. code-block:: none
+
+    qemu-system-x86_64: warning: IOMMU_IOAS_MAP failed: Bad address, PCI BAR?
+    qemu-system-x86_64: vfio_container_dma_map(0x560cb6cb1620, 0xe000000021000, 0x3000, 0x7f32ed55c000) = -14 (Bad address)
+
+FD passing with mdev
+--------------------
+
+The ``vfio-pci`` device checks the ``sysfsdev`` property to decide whether the
+backend is an mdev. If FD passing is used, there is no way to know that, and
+the mdev is treated like a real PCI device. The error below is reported if the
+user tries to enable RAM discarding for the mdev.
+
+.. code-block:: none
+
+    qemu-system-x86_64: -device vfio-pci,iommufd=iommufd0,x-balloon-allowed=on,fd=9: vfio VFIO_FD9: x-balloon-allowed only potentially compatible with mdev devices
+
+The ``vfio-ap`` and ``vfio-ccw`` devices do not have this issue, as their
+backend devices are always mdevs and RAM discarding is force-enabled.
--
2.34.1