From: "Michael S. Tsirkin" <mst@redhat.com>
To: qemu-devel@nongnu.org
Cc: Peter Maydell <peter.maydell@linaro.org>,
Eric Auger <eric.auger@redhat.com>,
Jean-Philippe Brucker <jean-philippe@linaro.org>,
Zhenzhong Duan <zhenzhong.duan@intel.com>
Subject: [PULL 46/66] virtio-iommu: Fix 64kB host page size VFIO device assignment
Date: Mon, 10 Jul 2023 19:04:38 -0400 [thread overview]
Message-ID: <94df5b2180d61fb2ee2b04cc007981e58b6479a9.1689030052.git.mst@redhat.com> (raw)
In-Reply-To: <cover.1689030052.git.mst@redhat.com>
From: Eric Auger <eric.auger@redhat.com>
When running on a 64kB page size host and protecting a VFIO device
with the virtio-iommu, qemu crashes with this kind of message:
qemu-kvm: virtio-iommu page mask 0xfffffffffffff000 is incompatible
with mask 0x20010000
qemu: hardware error: vfio: DMA mapping failed, unable to continue
This is due to the fact the IOMMU MR corresponding to the VFIO device
is enabled very late on domain attach, after the machine init.
The device reports a minimal 64kB page size but it is too late to be
applied. virtio_iommu_set_page_size_mask() fails and this causes
vfio_listener_region_add() to end up with hw_error();
To work around this issue, we transiently enable the IOMMU MR on
machine init to collect the page size requirements and then restore
the bypass state.
Fixes: 90519b9053 ("virtio-iommu: Add bypass mode support to assigned device")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20230705165118.28194-2-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
include/hw/virtio/virtio-iommu.h | 2 ++
hw/virtio/virtio-iommu.c | 31 +++++++++++++++++++++++++++++--
hw/virtio/trace-events | 1 +
3 files changed, 32 insertions(+), 2 deletions(-)
diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h
index 2ad5ee320b..a93fc5383e 100644
--- a/include/hw/virtio/virtio-iommu.h
+++ b/include/hw/virtio/virtio-iommu.h
@@ -61,6 +61,8 @@ struct VirtIOIOMMU {
QemuRecMutex mutex;
GTree *endpoints;
bool boot_bypass;
+ Notifier machine_done;
+ bool granule_frozen;
};
#endif
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 1bbad23f4a..8d0c5e3f32 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -25,6 +25,7 @@
#include "hw/virtio/virtio.h"
#include "sysemu/kvm.h"
#include "sysemu/reset.h"
+#include "sysemu/sysemu.h"
#include "qapi/error.h"
#include "qemu/error-report.h"
#include "trace.h"
@@ -1106,12 +1107,12 @@ static int virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr,
}
/*
- * After the machine is finalized, we can't change the mask anymore. If by
+ * Once the granule is frozen we can't change the mask anymore. If by
* chance the hotplugged device supports the same granule, we can still
* accept it. Having a different masks is possible but the guest will use
* sub-optimal block sizes, so warn about it.
*/
- if (phase_check(PHASE_MACHINE_READY)) {
+ if (s->granule_frozen) {
int new_granule = ctz64(new_mask);
int cur_granule = ctz64(cur_mask);
@@ -1146,6 +1147,28 @@ static void virtio_iommu_system_reset(void *opaque)
}
+static void virtio_iommu_freeze_granule(Notifier *notifier, void *data)
+{
+ VirtIOIOMMU *s = container_of(notifier, VirtIOIOMMU, machine_done);
+ int granule;
+
+ if (likely(s->config.bypass)) {
+ /*
+ * Transient IOMMU MR enable to collect page_size_mask requirements
+ * through memory_region_iommu_set_page_size_mask() called by
+ * VFIO region_add() callback
+ */
+ s->config.bypass = false;
+ virtio_iommu_switch_address_space_all(s);
+ /* restore default */
+ s->config.bypass = true;
+ virtio_iommu_switch_address_space_all(s);
+ }
+ s->granule_frozen = true;
+ granule = ctz64(s->config.page_size_mask);
+ trace_virtio_iommu_freeze_granule(BIT(granule));
+}
+
static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
{
VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -1189,6 +1212,9 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
error_setg(errp, "VIRTIO-IOMMU is not attached to any PCI bus!");
}
+ s->machine_done.notify = virtio_iommu_freeze_granule;
+ qemu_add_machine_init_done_notifier(&s->machine_done);
+
qemu_register_reset(virtio_iommu_system_reset, s);
}
@@ -1198,6 +1224,7 @@ static void virtio_iommu_device_unrealize(DeviceState *dev)
VirtIOIOMMU *s = VIRTIO_IOMMU(dev);
qemu_unregister_reset(virtio_iommu_system_reset, s);
+ qemu_remove_machine_init_done_notifier(&s->machine_done);
g_hash_table_destroy(s->as_by_busptr);
if (s->domains) {
diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 4e39ed8a95..7109cf1a3b 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -133,6 +133,7 @@ virtio_iommu_set_page_size_mask(const char *name, uint64_t old, uint64_t new) "m
virtio_iommu_notify_flag_add(const char *name) "add notifier to mr %s"
virtio_iommu_notify_flag_del(const char *name) "del notifier from mr %s"
virtio_iommu_switch_address_space(uint8_t bus, uint8_t slot, uint8_t fn, bool on) "Device %02x:%02x.%x switching address space (iommu enabled=%d)"
+virtio_iommu_freeze_granule(uint64_t page_size_mask) "granule set to 0x%"PRIx64
# virtio-mem.c
virtio_mem_send_response(uint16_t type) "type=%" PRIu16
--
MST
next prev parent reply other threads:[~2023-07-10 23:09 UTC|newest]
Thread overview: 82+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-10 23:02 [PULL 00/66] pc,pci,virtio: cleanups, fixes, features Michael S. Tsirkin
2023-07-10 23:02 ` [PULL 01/66] vdpa: Remove status in reset tracing Michael S. Tsirkin
2023-07-10 23:02 ` [PULL 02/66] vhost: register and change IOMMU flag depending on Device-TLB state Michael S. Tsirkin
2023-07-10 23:02 ` [PULL 03/66] virtio-net: pass Device-TLB enable/disable events to vhost Michael S. Tsirkin
2023-07-10 23:02 ` [PULL 04/66] virtio-gpu: refactor generate_edid function to virtio_gpu_base Michael S. Tsirkin
2023-07-10 23:02 ` [PULL 05/66] docs: vhost-user-gpu: add protocol changes for EDID Michael S. Tsirkin
2023-07-10 23:02 ` [PULL 06/66] contrib/vhost-user-gpu: implement get_edid feature Michael S. Tsirkin
2023-07-10 23:02 ` [PULL 07/66] vhost-user-gpu: implement get_edid frontend feature Michael S. Tsirkin
2023-07-10 23:02 ` [PULL 08/66] hw/virtio: Add boilerplate for vhost-user-scmi device Michael S. Tsirkin
2023-07-10 23:02 ` [PULL 09/66] hw/virtio: Add vhost-user-scmi-pci boilerplate Michael S. Tsirkin
2023-07-10 23:02 ` [PULL 10/66] tests/qtest: enable tests for virtio-scmi Michael S. Tsirkin
2023-07-18 12:42 ` Thomas Huth
2023-07-18 12:55 ` Milan Zamazal
2023-07-19 13:02 ` Thomas Huth
2023-07-19 19:56 ` Milan Zamazal
2023-07-20 8:40 ` Thomas Huth
2023-07-20 9:25 ` Milan Zamazal
2023-07-19 20:51 ` Fabiano Rosas
2023-07-20 8:34 ` Milan Zamazal
2023-07-10 23:02 ` [PULL 11/66] machine: Add helpers to get cores/threads per socket Michael S. Tsirkin
2023-07-10 23:02 ` [PULL 12/66] hw/smbios: Fix smbios_smp_sockets caculation Michael S. Tsirkin
2023-07-10 23:03 ` [PULL 13/66] hw/smbios: Fix thread count in type4 Michael S. Tsirkin
2023-07-10 23:03 ` [PULL 14/66] hw/smbios: Fix core " Michael S. Tsirkin
2023-07-10 23:03 ` [PULL 15/66] vhost-user: Change one_time to per_device request Michael S. Tsirkin
2023-07-10 23:03 ` [PULL 16/66] vhost-user: Make RESET_DEVICE a per device message Michael S. Tsirkin
2023-07-10 23:03 ` [PULL 17/66] hw/i386/pc_q35: Resolve redundant q35_host variable Michael S. Tsirkin
2023-07-10 23:03 ` [PULL 18/66] hw/pci-host/q35: Fix double, contradicting .endianness assignment Michael S. Tsirkin
2023-07-10 23:03 ` [PULL 19/66] hw/pci-host/q35: Initialize PCMachineState::bus in board code Michael S. Tsirkin
2023-07-10 23:03 ` [PULL 20/66] hw/pci/pci_host: Introduce PCI_HOST_BYPASS_IOMMU macro Michael S. Tsirkin
2023-07-10 23:03 ` [PULL 21/66] hw/pci-host/q35: Initialize PCI_HOST_BYPASS_IOMMU property from board code Michael S. Tsirkin
2023-07-10 23:03 ` [PULL 22/66] hw/pci-host/q35: Make some property name macros reusable by i440fx Michael S. Tsirkin
2023-07-10 23:03 ` [PULL 23/66] hw/i386/pc_piix: Turn some local variables into initializers Michael S. Tsirkin
2023-07-10 23:03 ` [PULL 24/66] hw/pci-host/i440fx: Add "i440fx" child property in board code Michael S. Tsirkin
2023-07-10 23:03 ` [PULL 25/66] hw/pci-host/i440fx: Replace magic values by existing constants Michael S. Tsirkin
2023-07-10 23:03 ` [PULL 26/66] hw/pci-host/i440fx: Have common names for some local variables Michael S. Tsirkin
2023-07-10 23:03 ` [PULL 27/66] hw/pci-host/i440fx: Move i440fx_realize() into PCII440FXState section Michael S. Tsirkin
2023-07-10 23:03 ` [PULL 28/66] hw/pci-host/i440fx: Make MemoryRegion pointers accessible as properties Michael S. Tsirkin
2023-07-10 23:03 ` [PULL 29/66] hw/pci-host/i440fx: Add PCI_HOST_PROP_IO_MEM property Michael S. Tsirkin
2023-07-10 23:03 ` [PULL 30/66] hw/pci-host/i440fx: Add PCI_HOST_{ABOVE, BELOW}_4G_MEM_SIZE properties Michael S. Tsirkin
2023-07-10 23:03 ` [PULL 31/66] hw/pci-host/i440fx: Add I440FX_HOST_PROP_PCI_TYPE property Michael S. Tsirkin
2023-07-10 23:03 ` [PULL 32/66] hw/pci-host/i440fx: Resolve i440fx_init() Michael S. Tsirkin
2023-07-10 23:03 ` [PULL 33/66] hw/i386/pc_piix: Move i440fx' realize near its qdev_new() Michael S. Tsirkin
2023-07-10 23:04 ` [PULL 34/66] hw/pci/pci: Remove multifunction parameter from pci_create_simple_multifunction() Michael S. Tsirkin
2023-07-10 23:04 ` [PULL 35/66] hw/pci/pci: Remove multifunction parameter from pci_new_multifunction() Michael S. Tsirkin
2023-07-10 23:04 ` [PULL 36/66] pcie: Release references of virtual functions Michael S. Tsirkin
2023-07-10 23:04 ` [PULL 37/66] vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mac() Michael S. Tsirkin
2023-07-10 23:04 ` [PULL 38/66] vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mq() Michael S. Tsirkin
2023-07-10 23:04 ` [PULL 39/66] vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_offloads() Michael S. Tsirkin
2023-07-10 23:04 ` [PULL 40/66] vhost-vdpa: mute unaligned memory error report Michael S. Tsirkin
2023-07-10 23:04 ` [PULL 41/66] tests/acpi: allow changes in DSDT.noacpihp table blob Michael S. Tsirkin
2023-07-10 23:04 ` [PULL 42/66] tests/acpi/bios-tables-test: use the correct slot on the pcie-root-port Michael S. Tsirkin
2023-07-10 23:04 ` [PULL 43/66] tests/acpi/bios-tables-test: update acpi blob q35/DSDT.noacpihp Michael S. Tsirkin
2023-07-10 23:04 ` [PULL 44/66] tests/qtest/hd-geo-test: fix incorrect pcie-root-port usage and simplify test Michael S. Tsirkin
2023-07-10 23:04 ` [PULL 45/66] hw/pci: warn when PCIe device is plugged into non-zero slot of downstream port Michael S. Tsirkin
2023-07-10 23:04 ` Michael S. Tsirkin [this message]
2023-07-17 10:50 ` [PULL 46/66] virtio-iommu: Fix 64kB host page size VFIO device assignment Peter Maydell
2023-07-17 11:51 ` Michael S. Tsirkin
2023-07-17 16:57 ` Eric Auger
2023-07-17 16:56 ` Eric Auger
2023-07-17 17:07 ` Peter Maydell
2023-07-17 17:20 ` Eric Auger
2023-07-10 23:04 ` [PULL 47/66] virtio-iommu: Rework the traces in virtio_iommu_set_page_size_mask() Michael S. Tsirkin
2023-07-10 23:04 ` [PULL 48/66] pcie: Add hotplug detect state register to cmask Michael S. Tsirkin
2023-07-10 23:04 ` [PULL 49/66] vdpa: Fix possible use-after-free for VirtQueueElement Michael S. Tsirkin
2023-07-10 23:04 ` [PULL 50/66] include: attempt to document device_class_set_props Michael S. Tsirkin
2023-07-10 23:04 ` [PULL 51/66] include/hw: document the device_class_set_parent_* fns Michael S. Tsirkin
2023-07-10 23:04 ` [PULL 52/66] hw/virtio: fix typo in VIRTIO_CONFIG_IRQ_IDX comments Michael S. Tsirkin
2023-07-10 23:04 ` [PULL 53/66] include/hw/virtio: document virtio_notify_config Michael S. Tsirkin
2023-07-10 23:05 ` [PULL 54/66] include/hw/virtio: add kerneldoc for virtio_init Michael S. Tsirkin
2023-07-10 23:05 ` [PULL 55/66] include/hw/virtio: document some more usage of notifiers Michael S. Tsirkin
2023-07-10 23:05 ` [PULL 56/66] pcie: Use common ARI next function number Michael S. Tsirkin
2023-07-10 23:05 ` [PULL 57/66] pcie: Specify 0 for ARI next function numbers Michael S. Tsirkin
2023-07-10 23:05 ` [PULL 58/66] vdpa: Use iovec for vhost_vdpa_net_load_cmd() Michael S. Tsirkin
2023-07-10 23:05 ` [PULL 59/66] vdpa: Restore MAC address filtering state Michael S. Tsirkin
2023-07-10 23:05 ` [PULL 60/66] vdpa: Restore packet receive filtering state relative with _F_CTRL_RX feature Michael S. Tsirkin
2023-07-10 23:05 ` [PULL 61/66] vhost: Fix false positive out-of-bounds Michael S. Tsirkin
2023-07-10 23:05 ` [PULL 62/66] vdpa: Accessing CVQ header through its structure Michael S. Tsirkin
2023-07-10 23:05 ` [PULL 63/66] vdpa: Avoid forwarding large CVQ command failures Michael S. Tsirkin
2023-07-10 23:05 ` [PULL 64/66] vdpa: Allow VIRTIO_NET_F_CTRL_RX in SVQ Michael S. Tsirkin
2023-07-10 23:05 ` [PULL 65/66] vdpa: Restore packet receive filtering state relative with _F_CTRL_RX_EXTRA feature Michael S. Tsirkin
2023-07-10 23:05 ` [PULL 66/66] vdpa: Allow VIRTIO_NET_F_CTRL_RX_EXTRA in SVQ Michael S. Tsirkin
2023-07-11 10:07 ` [PULL 00/66] pc,pci,virtio: cleanups, fixes, features Richard Henderson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=94df5b2180d61fb2ee2b04cc007981e58b6479a9.1689030052.git.mst@redhat.com \
--to=mst@redhat.com \
--cc=eric.auger@redhat.com \
--cc=jean-philippe@linaro.org \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=zhenzhong.duan@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).