From: "Michael S. Tsirkin" <mst@redhat.com>
To: qemu-devel@nongnu.org
Cc: Peter Maydell <peter.maydell@linaro.org>,
Peter Xu <peterx@redhat.com>,
QEMU Stable <qemu-stable@nongnu.org>, Cong Li <coli@redhat.com>,
Eric Auger <eric.auger@redhat.com>,
Jason Wang <jasowang@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Richard Henderson <rth@twiddle.net>,
Eduardo Habkost <ehabkost@redhat.com>,
Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Subject: [Qemu-devel] [PULL 07/33] intel_iommu: better handling of dmar state switch
Date: Mon, 5 Nov 2018 13:15:36 -0500
Message-ID: <20181105181353.39804-8-mst@redhat.com>
In-Reply-To: <20181105181353.39804-1-mst@redhat.com>
From: Peter Xu <peterx@redhat.com>

QEMU does not handle the global DMAR switch well, especially when it
goes from "on" to "off".

Let's first take the example of system reset.

Assume that a guest has the IOMMU enabled. When it reboots, we drop
all the existing DMAR mappings to handle the system reset, but we
still keep the existing memory layout, which has the IOMMU memory
region enabled. So after the reboot and before the kernel loads again,
there is no mapping at all for the host device. That's problematic
since any software (for example, SeaBIOS) that runs earlier than the
kernel after the reboot will assume the IOMMU is disabled, so any DMA
from that software will fail.

For example, a guest that boots from an assigned NVMe device might fail
to find the boot device after a system reboot/reset, and we'll be able
to observe SeaBIOS errors if we capture the debugging log:

  WARNING - Timeout at nvme_wait:144!

Meanwhile, we should see DMAR errors on the host for that NVMe device.
It's the DMA fault that caused the NVMe driver timeout.

The correct fix is to properly switch the device DMA address spaces
when the system resets, which sets up the correct memory regions and
notifies the backends of the devices. This might not matter much for
non-assigned devices, since the QEMU VT-d emulation assumes a default
passthrough mapping if DMAR is not enabled in the GCMD register (please
refer to vtd_iommu_translate). However, it is required for assigned
devices, since it rebuilds the correct GPA to HPA mapping that is
needed for any DMA operation during guest bootstrap.
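
As a rough illustration of the reset problem and of the fix, here is a
toy model in C. It is not QEMU code: ToyAddressSpace, toy_dma_translate()
and the other toy_* names are hypothetical, the "mapping table" is
collapsed to a single entry, and the model simply assumes that reset
clears the DMAR enable state. It only mirrors the unmap-then-switch
ordering that vtd_address_space_refresh_all() introduces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model, NOT QEMU code: every name below is hypothetical. */
typedef struct {
    bool dmar_enabled;     /* models the global DMAR enable state */
    bool iommu_region_on;  /* which memory region the device DMA goes through */
    bool have_mapping;     /* whether the single toy mapping is present */
    uint64_t iova;         /* input address of the single mapping */
    uint64_t target;       /* translated address of the single mapping */
} ToyAddressSpace;

/* Returns true and fills *out on success, false on a DMA fault. */
static bool toy_dma_translate(const ToyAddressSpace *as, uint64_t addr,
                              uint64_t *out)
{
    if (!as->iommu_region_on) {
        *out = addr;       /* passthrough: address is used unchanged */
        return true;
    }
    if (as->have_mapping && as->iova == addr) {
        *out = as->target;
        return true;
    }
    return false;          /* IOMMU region enabled but nothing mapped */
}

static void toy_unmap_all(ToyAddressSpace *as)
{
    as->have_mapping = false;
}

static void toy_switch_address_space(ToyAddressSpace *as)
{
    /* Pick the region that matches the current DMAR state. */
    as->iommu_region_on = as->dmar_enabled;
}

/* Old behaviour: mappings are dropped, the region layout is left alone. */
static void toy_reset_old(ToyAddressSpace *as)
{
    as->dmar_enabled = false;
    toy_unmap_all(as);
}

/* New behaviour: drop mappings, then re-select the region. */
static void toy_reset_new(ToyAddressSpace *as)
{
    as->dmar_enabled = false;
    toy_unmap_all(as);
    toy_switch_address_space(as);
}
```

In this model, a DMA issued after toy_reset_old() faults because the
IOMMU region is still selected with nothing mapped, while the same DMA
after toy_reset_new() passes through unchanged.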

Besides system reset, there are other places that might change the
global DMAR status, and we'd better do the same thing there: for
example, when we change the state of the GCMD register or the DMAR
root pointer. Do the same refresh in all these places. For these two
places we also need to explicitly invalidate the context entry cache
and the IOTLB cache.
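
To show why the caches need an explicit invalidation when the root
pointer (or the enable state) changes, here is another toy sketch in C.
Again, this is not QEMU code: ToyIommu and the toy_* helpers are
hypothetical, and the context entry cache and IOTLB are reduced to a
single cached value. The point is only the ordering: update the state,
reset the caches, then refresh the address spaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model, NOT QEMU code: every name below is hypothetical. */
typedef struct {
    uint64_t root;         /* current root table pointer */
    bool cache_valid;      /* one-entry stand-in for the caches */
    uint64_t cached_root;  /* root pointer the cache was filled from */
    unsigned refreshes;    /* how many address space refreshes happened */
} ToyIommu;

/* Returns the root a translation would use, filling the cache on a miss. */
static uint64_t toy_lookup_root(ToyIommu *s)
{
    if (!s->cache_valid) {
        s->cached_root = s->root;
        s->cache_valid = true;
    }
    return s->cached_root;
}

static void toy_reset_caches(ToyIommu *s)
{
    s->cache_valid = false;
}

static void toy_refresh_all(ToyIommu *s)
{
    s->refreshes++;        /* stands in for the unmap + switch of all devices */
}

/* Without invalidation: later lookups keep using the stale cached root. */
static void toy_set_root_stale(ToyIommu *s, uint64_t root)
{
    s->root = root;
}

/* With invalidation and refresh, in the order used by this patch. */
static void toy_set_root(ToyIommu *s, uint64_t root)
{
    s->root = root;
    toy_reset_caches(s);
    toy_refresh_all(s);
}
```

After toy_set_root_stale() a lookup still returns the old root, whereas
after toy_set_root() the next lookup sees the new one.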
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1625173
CC: QEMU Stable <qemu-stable@nongnu.org>
Reported-by: Cong Li <coli@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
--
v2:
- do the same for GCMD write, or root pointer update [Alex]
- test is carried out by me this time, by observing the
vtd_switch_address_space tracepoint after system reboot
v3:
- rewrite commit message as suggested by Alex
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
hw/i386/intel_iommu.c | 21 ++++++++++++++-------
1 file changed, 14 insertions(+), 7 deletions(-)
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 1137861a9d..306708eb3b 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -37,6 +37,8 @@
 #include "kvm_i386.h"
 #include "trace.h"
 
+static void vtd_address_space_refresh_all(IntelIOMMUState *s);
+
 static void vtd_define_quad(IntelIOMMUState *s, hwaddr addr, uint64_t val,
                             uint64_t wmask, uint64_t w1cmask)
 {
@@ -1436,7 +1438,7 @@ static void vtd_context_global_invalidate(IntelIOMMUState *s)
         vtd_reset_context_cache_locked(s);
     }
     vtd_iommu_unlock(s);
-    vtd_switch_address_space_all(s);
+    vtd_address_space_refresh_all(s);
     /*
      * From VT-d spec 6.5.2.1, a global context entry invalidation
      * should be followed by a IOTLB global invalidation, so we should
@@ -1727,6 +1729,8 @@ static void vtd_handle_gcmd_srtp(IntelIOMMUState *s)
     vtd_root_table_setup(s);
     /* Ok - report back to driver */
     vtd_set_clear_mask_long(s, DMAR_GSTS_REG, 0, VTD_GSTS_RTPS);
+    vtd_reset_caches(s);
+    vtd_address_space_refresh_all(s);
 }
 
 /* Set Interrupt Remap Table Pointer */
@@ -1759,7 +1763,8 @@ static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
         vtd_set_clear_mask_long(s, DMAR_GSTS_REG, VTD_GSTS_TES, 0);
     }
 
-    vtd_switch_address_space_all(s);
+    vtd_reset_caches(s);
+    vtd_address_space_refresh_all(s);
 }
 
 /* Handle Interrupt Remap Enable/Disable */
@@ -3059,6 +3064,12 @@ static void vtd_address_space_unmap_all(IntelIOMMUState *s)
     }
 }
 
+static void vtd_address_space_refresh_all(IntelIOMMUState *s)
+{
+    vtd_address_space_unmap_all(s);
+    vtd_switch_address_space_all(s);
+}
+
 static int vtd_replay_hook(IOMMUTLBEntry *entry, void *private)
 {
     memory_region_notify_one((IOMMUNotifier *)private, entry);
@@ -3231,11 +3242,7 @@ static void vtd_reset(DeviceState *dev)
     IntelIOMMUState *s = INTEL_IOMMU_DEVICE(dev);
 
     vtd_init(s);
-
-    /*
-     * When device reset, throw away all mappings and external caches
-     */
-    vtd_address_space_unmap_all(s);
+    vtd_address_space_refresh_all(s);
 }
 
 static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int devfn)
--
MST