From: "Michael S. Tsirkin" <mst@redhat.com>
To: qemu-devel@nongnu.org
Cc: "Peter Maydell" <peter.maydell@linaro.org>,
"Peter Xu" <peterx@redhat.com>,
"Jason Wang" <jasowang@redhat.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"David Hildenbrand" <david@redhat.com>,
"Philippe Mathieu-Daudé" <philmd@linaro.org>,
"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>
Subject: [PULL 09/56] intel-iommu: Document iova_tree
Date: Mon, 30 Jan 2023 15:19:11 -0500 [thread overview]
Message-ID: <20230130201810.11518-10-mst@redhat.com> (raw)
In-Reply-To: <20230130201810.11518-1-mst@redhat.com>
From: Peter Xu <peterx@redhat.com>
It seems not super clear on when iova_tree is used, and why. Add a rich
comment above iova_tree to track why we needed the iova_tree, and when we
need it.
Also comment for the map/unmap messages, on how they're used and
implications (e.g. unmap can be larger than the mapped ranges).
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20230109193727.1360190-1-peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
include/exec/memory.h | 26 ++++++++++++++++++++++++
include/hw/i386/intel_iommu.h | 38 ++++++++++++++++++++++++++++++++++-
2 files changed, 63 insertions(+), 1 deletion(-)
diff --git a/include/exec/memory.h b/include/exec/memory.h
index c37ffdbcd1..2e602a2fad 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -129,6 +129,32 @@ struct IOMMUTLBEntry {
/*
* Bitmap for different IOMMUNotifier capabilities. Each notifier can
* register with one or multiple IOMMU Notifier capability bit(s).
+ *
+ * Normally there're two use cases for the notifiers:
+ *
+ * (1) When the device needs accurate synchronizations of the vIOMMU page
+ * tables, it needs to register with both MAP|UNMAP notifies (which
+ * is defined as IOMMU_NOTIFIER_IOTLB_EVENTS below).
+ *
+ * Regarding to accurate synchronization, it's when the notified
+ * device maintains a shadow page table and must be notified on each
+ * guest MAP (page table entry creation) and UNMAP (invalidation)
+ * events (e.g. VFIO). Both notifications must be accurate so that
+ * the shadow page table is fully in sync with the guest view.
+ *
+ * (2) When the device doesn't need accurate synchronizations of the
+ * vIOMMU page tables, it needs to register only with UNMAP or
+ * DEVIOTLB_UNMAP notifies.
+ *
+ * It's when the device maintains a cache of IOMMU translations
+ * (IOTLB) and is able to fill that cache by requesting translations
+ * from the vIOMMU through a protocol similar to ATS (Address
+ * Translation Service).
+ *
+ * Note that in this mode the vIOMMU will not maintain a shadowed
+ * page table for the address space, and the UNMAP messages can cover
+ * more than the pages that used to get mapped. The IOMMU notifiee
+ * should be able to take care of over-sized invalidations.
*/
typedef enum {
IOMMU_NOTIFIER_NONE = 0,
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 46d973e629..89dcbc5e1e 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -109,7 +109,43 @@ struct VTDAddressSpace {
QLIST_ENTRY(VTDAddressSpace) next;
/* Superset of notifier flags that this address space has */
IOMMUNotifierFlag notifier_flags;
- IOVATree *iova_tree; /* Traces mapped IOVA ranges */
+ /*
+ * @iova_tree traces mapped IOVA ranges.
+ *
+ * The tree is not needed if no MAP notifier is registered with current
+ * VTD address space, because all guest invalidate commands can be
+ * directly passed to the IOMMU UNMAP notifiers without any further
+ * reshuffling.
+ *
+ * The tree OTOH is required for MAP typed iommu notifiers for a few
+ * reasons.
+ *
+ * Firstly, there's no way to identify whether an PSI (Page Selective
+ * Invalidations) or DSI (Domain Selective Invalidations) event is an
+ * MAP or UNMAP event within the message itself. Without having prior
+ * knowledge of existing state vIOMMU doesn't know whether it should
+ * notify MAP or UNMAP for a PSI message it received when caching mode
+ * is enabled (for MAP notifiers).
+ *
+ * Secondly, PSI messages received from guest driver can be enlarged in
+ * range, covers but not limited to what the guest driver wanted to
+ * invalidate. When the range to invalidates gets bigger than the
+ * limit of a PSI message, it can even become a DSI which will
+ * invalidate the whole domain. If the vIOMMU directly notifies the
+ * registered device with the unmodified range, it may confuse the
+ * registered drivers (e.g. vfio-pci) on either:
+ *
+ * (1) Trying to map the same region more than once (for
+ * VFIO_IOMMU_MAP_DMA, -EEXIST will trigger), or,
+ *
+ * (2) Trying to UNMAP a range that is still partially mapped.
+ *
+ * That accuracy is not required for UNMAP-only notifiers, but it is a
+ * must-to-have for notifiers registered with MAP events, because the
+ * vIOMMU needs to make sure the shadow page table is always in sync
+ * with the guest IOMMU pgtables for a device.
+ */
+ IOVATree *iova_tree;
};
struct VTDIOTLBEntry {
--
MST
next prev parent reply other threads:[~2023-01-30 20:27 UTC|newest]
Thread overview: 66+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-01-30 20:18 [PULL 00/56] virtio,pc,pci: features, cleanups, fixes Michael S. Tsirkin
2023-01-30 20:18 ` [PULL 01/56] shpc: disallow unplug when power indicator is blinking Michael S. Tsirkin
2023-01-30 20:18 ` [PULL 02/56] hw/i386/acpi-build: Remove unused attributes Michael S. Tsirkin
2023-01-30 20:18 ` [PULL 03/56] hw/isa/isa-bus: Turn isa_build_aml() into qbus_build_aml() Michael S. Tsirkin
2023-01-30 20:18 ` [PULL 04/56] hw/acpi/piix4: No need to #include "hw/southbridge/piix.h" Michael S. Tsirkin
2023-01-30 20:18 ` [PULL 05/56] hw/acpi/acpi_dev_interface: Remove unused parameter from AcpiDeviceIfClass::madt_cpu Michael S. Tsirkin
2023-01-30 20:19 ` [PULL 06/56] vhost-user: Correct a reference of TARGET_AARCH64 Michael S. Tsirkin
2023-01-30 20:19 ` [PULL 07/56] hw/pci-host: Use register definitions from PCI standard Michael S. Tsirkin
2023-01-30 20:19 ` [PULL 08/56] virtio-rng-pci: fix migration compat for vectors Michael S. Tsirkin
2023-01-30 20:19 ` Michael S. Tsirkin [this message]
2023-01-30 20:19 ` [PULL 10/56] x86: don't let decompressed kernel image clobber setup_data Michael S. Tsirkin
2023-01-30 20:19 ` Michael S. Tsirkin
2023-01-30 20:19 ` Michael S. Tsirkin
2023-01-31 19:39 ` Jason A. Donenfeld
2023-01-31 21:27 ` Michael S. Tsirkin
2023-01-31 20:54 ` H. Peter Anvin
2023-01-31 21:22 ` Jason A. Donenfeld
2023-02-01 5:40 ` H. Peter Anvin
2023-01-31 23:32 ` Jason A. Donenfeld
2023-01-30 20:19 ` [PULL 12/56] tests: acpi: cleanup arguments to make them more readable Michael S. Tsirkin
2023-01-30 20:19 ` [PULL 13/56] tests: acpi: whitelist DSDT blobs for tests that use pci-bridges Michael S. Tsirkin
2023-01-30 20:19 ` [PULL 14/56] tests: acpi: extend pcihp with nested bridges Michael S. Tsirkin
2023-01-30 20:19 ` [PULL 15/56] tests: acpi: update expected blobs Michael S. Tsirkin
2023-01-30 20:19 ` [PULL 16/56] tests: acpi: cleanup use_uefi argument usage Michael S. Tsirkin
2023-01-30 20:19 ` [PULL 17/56] pci_bridge: remove whitespace Michael S. Tsirkin
2023-01-30 20:19 ` [PULL 18/56] x86: acpi: pcihp: clean up duplicate bridge_in_acpi assignment Michael S. Tsirkin
2023-01-30 20:19 ` [PULL 19/56] pci: acpi hotplug: rename x-native-hotplug to x-do-not-expose-native-hotplug-cap Michael S. Tsirkin
2023-01-30 20:19 ` [PULL 20/56] pcihp: piix4: do not call acpi_pcihp_reset() when ACPI PCI hotplug is disabled Michael S. Tsirkin
2023-01-30 20:19 ` [PULL 21/56] pci: acpihp: assign BSEL only to coldplugged bridges Michael S. Tsirkin
2023-01-30 20:19 ` [PULL 22/56] x86: pcihp: fix invalid AML PCNT calls to hotplugged bridges Michael S. Tsirkin
2023-01-30 20:19 ` [PULL 23/56] tests: boot_sector_test: avoid crashing if status is not available yet Michael S. Tsirkin
2023-01-30 20:20 ` [PULL 24/56] tests: acpi: extend bridge tests with hotplugged bridges Michael S. Tsirkin
2023-01-30 20:20 ` Michael S. Tsirkin
2023-01-30 20:20 ` Michael S. Tsirkin
2023-01-30 20:20 ` [PULL 11/56] tests: qtest: print device_add error before failing test Michael S. Tsirkin
2023-01-30 20:20 ` [PULL 26/56] tests: acpi: add reboot cycle to bridge test Michael S. Tsirkin
2023-01-30 20:20 ` [PULL 27/56] tests: acpi: whitelist DSDT before refactoring acpi based PCI hotplug machinery Michael S. Tsirkin
2023-01-30 20:20 ` [PULL 28/56] pcihp: drop pcihp_bridge_en dependency when composing PCNT method Michael S. Tsirkin
2023-01-30 20:20 ` [PULL 29/56] tests: acpi: update expected blobs Michael S. Tsirkin
2023-01-30 20:20 ` [PULL 30/56] tests: acpi: whitelist DSDT before refactoring acpi based PCI hotplug machinery Michael S. Tsirkin
2023-01-30 20:20 ` [PULL 31/56] pcihp: compose PCNT callchain right before its user _GPE._E01 Michael S. Tsirkin
2023-01-30 20:20 ` [PULL 32/56] pcihp: do not put empty PCNT in DSDT Michael S. Tsirkin
2023-01-30 20:20 ` [PULL 25/56] tests: boot_sector_test(): make it multi-shot Michael S. Tsirkin
2023-01-30 20:20 ` [PULL 33/56] tests: acpi: update expected blobs Michael S. Tsirkin
2023-01-30 20:20 ` [PULL 34/56] whitelist DSDT before adding endpoint devices to bridge testcases Michael S. Tsirkin
2023-01-30 20:20 ` [PULL 35/56] tests: acpi: add endpoint devices to bridges Michael S. Tsirkin
2023-01-30 20:20 ` [PULL 36/56] tests: acpi: update expected blobs Michael S. Tsirkin
2023-01-30 20:20 ` [PULL 37/56] x86: pcihp: acpi: prepare slot ignore rule to work with self describing bridges Michael S. Tsirkin
2023-01-30 20:20 ` [PULL 38/56] pci: acpi: wire up AcpiDevAmlIf interface to generic bridge Michael S. Tsirkin
2023-01-30 20:20 ` [PULL 39/56] pcihp: make bridge describe itself using AcpiDevAmlIfClass:build_dev_aml Michael S. Tsirkin
2023-01-30 20:20 ` [PULL 40/56] pci: make sure pci_bus_is_express() won't error out with "discards ‘const’ qualifier" Michael S. Tsirkin
2023-01-30 20:20 ` [PULL 41/56] pcihp: isolate rule whether slot should be described in DSDT Michael S. Tsirkin
2023-01-30 20:21 ` [PULL 42/56] tests: acpi: whitelist DSDT before decoupling PCI hotplug code from basic slots description Michael S. Tsirkin
2023-01-30 20:21 ` [PULL 43/56] pcihp: acpi: decouple hotplug and generic " Michael S. Tsirkin
2023-01-30 20:21 ` [PULL 44/56] tests: acpi: update expected blobs Michael S. Tsirkin
2023-01-30 20:21 ` [PULL 45/56] tests: acpi: whitelist DSDT blobs before removing dynamic _DSM on coldplugged bridges Michael S. Tsirkin
2023-01-30 20:21 ` [PULL 46/56] pcihp: acpi: ignore coldplugged bridges when composing hotpluggable slots Michael S. Tsirkin
2023-01-30 20:21 ` [PULL 47/56] tests: acpi: update expected blobs Michael S. Tsirkin
2023-01-30 20:21 ` [PULL 48/56] tests: acpi: whitelist DSDT before moving non-hotpluggble slots description from hotplug path Michael S. Tsirkin
2023-01-30 20:21 ` [PULL 49/56] pcihp: generate populated non-hotpluggble slot descriptions on non-hotplug path Michael S. Tsirkin
2023-01-30 20:21 ` [PULL 50/56] tests: acpi: update expected blobs Michael S. Tsirkin
2023-01-30 20:21 ` [PULL 51/56] vhost-user: Skip unnecessary duplicated VHOST_USER_ADD/REM_MEM_REG requests Michael S. Tsirkin
2023-01-30 20:21 ` [PULL 52/56] hw: Use TYPE_PCI_BUS definition where appropriate Michael S. Tsirkin
2023-01-30 20:21 ` [PULL 53/56] tests/qtest/bios-tables-test: Make the test less verbose by default Michael S. Tsirkin
2023-01-30 20:21 ` [PULL 54/56] Revert "vhost-user: Monitor slave channel in vhost_user_read()" Michael S. Tsirkin
2023-01-30 20:21 ` [PULL 55/56] Revert "vhost-user: Introduce nested event loop " Michael S. Tsirkin
2023-01-30 20:21 ` [PULL 56/56] docs/pcie.txt: Replace ioh3420 with pcie-root-port Michael S. Tsirkin
2023-02-02 13:42 ` [PULL 00/56] virtio,pc,pci: features, cleanups, fixes Peter Maydell
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230130201810.11518-10-mst@redhat.com \
--to=mst@redhat.com \
--cc=david@redhat.com \
--cc=jasowang@redhat.com \
--cc=marcel.apfelbaum@gmail.com \
--cc=pbonzini@redhat.com \
--cc=peter.maydell@linaro.org \
--cc=peterx@redhat.com \
--cc=philmd@linaro.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.