From: David Hildenbrand <david@redhat.com>
To: qemu-devel@nongnu.org
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>,
Wei Yang <richard.weiyang@linux.alibaba.com>,
"Michael S . Tsirkin" <mst@redhat.com>,
David Hildenbrand <david@redhat.com>,
"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
Peter Xu <peterx@redhat.com>, Auger Eric <eric.auger@redhat.com>,
Alex Williamson <alex.williamson@redhat.com>,
teawater <teawaterz@linux.alibaba.com>,
Igor Mammedov <imammedo@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Marek Kedzierski <mkedzier@redhat.com>
Subject: [PATCH v6 05/12] vfio: Support for RamDiscardMgr in the !vIOMMU case
Date: Mon, 22 Feb 2021 12:57:01 +0100
Message-ID: <20210222115708.7623-6-david@redhat.com>
In-Reply-To: <20210222115708.7623-1-david@redhat.com>

Implement support for RamDiscardMgr to prepare for virtio-mem support.
Instead of mapping the whole memory section, map only the "populated"
parts and update the mapping when the RamDiscardListener notifies us that
memory was discarded or populated. Similarly, when syncing the dirty
bitmaps, sync only the parts that are actually mapped (populated), by
replaying the populated ranges via the listener.

Using virtio-mem with vfio is still blocked via
ram_block_discard_disable()/ram_block_discard_require() after this patch.

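
To illustrate the address arithmetic behind the populate path, here is a
minimal standalone sketch. It clamps a notified range to the section,
splits it at granularity boundaries, and translates each chunk to an IOVA.
The names struct section, struct chunk and split_populate() are
hypothetical and exist only for this illustration; the real code calls
vfio_dma_map() for each chunk instead of recording it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the section fields kept per listener. */
struct section {
    uint64_t offset_within_region;
    uint64_t offset_within_address_space;
    uint64_t size;
    uint64_t granularity;   /* power of two */
};

/* One would-be DMA mapping (illustration only). */
struct chunk {
    uint64_t iova;
    uint64_t len;
};

static uint64_t min_u64(uint64_t a, uint64_t b) { return a < b ? a : b; }
static uint64_t max_u64(uint64_t a, uint64_t b) { return a > b ? a : b; }

/* Round x up to a multiple of align; align must be a power of two. */
static uint64_t round_up(uint64_t x, uint64_t align)
{
    return (x + align - 1) & ~(align - 1);
}

/*
 * Clamp [offset, offset + size) to the section and split it at granularity
 * boundaries within the memory region. Stores at most max chunks in out and
 * returns the total number of chunks the range needs.
 */
static size_t split_populate(const struct section *s, uint64_t offset,
                             uint64_t size, struct chunk *out, size_t max)
{
    const uint64_t end = min_u64(offset + size,
                                 s->offset_within_region + s->size);
    uint64_t start = max_u64(offset, s->offset_within_region);
    uint64_t next;
    size_t n = 0;

    assert(s->granularity && !(s->granularity & (s->granularity - 1)));

    for (; start < end; start = next) {
        next = min_u64(round_up(start + 1, s->granularity), end);
        if (n < max) {
            out[n].iova = start - s->offset_within_region +
                          s->offset_within_address_space;
            out[n].len = next - start;
        }
        n++;
    }
    return n;
}
```

Because every chunk ends at a granularity boundary (or at the end of the
section), a later discard notification of minimum granularity never has to
split an existing mapping.
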
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Auger Eric <eric.auger@redhat.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: teawater <teawaterz@linux.alibaba.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
hw/vfio/common.c | 203 ++++++++++++++++++++++++++++++++++
include/hw/vfio/vfio-common.h | 12 ++
2 files changed, 215 insertions(+)
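
The dirty bitmap sync relies on the manager replaying every populated range
to a temporary listener. The following toy model sketches that replay
pattern under simplifying assumptions: the manager tracks one flag per
fixed-size block, and the callback only adds up the bytes it is asked to
sync. struct toy_discard_mgr, toy_replay_populated() and toy_count_synced()
are hypothetical names; they are not the RamDiscardMgr API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Called once per populated range; a nonzero return aborts the replay. */
typedef int (*populated_fn)(uint64_t offset, uint64_t size, void *opaque);

/* Hypothetical manager state: one "populated" flag per fixed-size block. */
struct toy_discard_mgr {
    const bool *populated;
    size_t nr_blocks;
    uint64_t block_size;
};

/*
 * Walk the block flags and report each maximal run of populated blocks
 * with a single callback invocation.
 */
static int toy_replay_populated(const struct toy_discard_mgr *m,
                                populated_fn fn, void *opaque)
{
    size_t i = 0;

    assert(m && fn && m->block_size);

    while (i < m->nr_blocks) {
        size_t first;
        int ret;

        if (!m->populated[i]) {
            i++;
            continue;
        }
        first = i;
        while (i < m->nr_blocks && m->populated[i]) {
            i++;
        }
        ret = fn(first * m->block_size, (i - first) * m->block_size, opaque);
        if (ret) {
            return ret;
        }
    }
    return 0;
}

/* Stand-in for the dirty bitmap callback: only totals the synced bytes. */
static int toy_count_synced(uint64_t offset, uint64_t size, void *opaque)
{
    (void)offset;
    *(uint64_t *)opaque += size;
    return 0;
}
```

Discarded blocks are never passed to the callback, which is the property
the sync path depends on: only ranges that currently have a DMA mapping get
queried.
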
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 6ff1daa763..f68370de6c 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -654,6 +654,139 @@ out:
rcu_read_unlock();
}
+static void vfio_ram_discard_notify_discard(RamDiscardListener *rdl,
+ const MemoryRegion *mr,
+ ram_addr_t offset, ram_addr_t size)
+{
+ VFIORamDiscardListener *vrdl = container_of(rdl, VFIORamDiscardListener,
+ listener);
+ const hwaddr mr_start = MAX(offset, vrdl->offset_within_region);
+ const hwaddr mr_end = MIN(offset + size,
+ vrdl->offset_within_region + vrdl->size);
+ const hwaddr iova = mr_start - vrdl->offset_within_region +
+ vrdl->offset_within_address_space;
+ int ret;
+
+ if (mr_start >= mr_end) {
+ return;
+ }
+
+ /* Unmap with a single call. */
+ ret = vfio_dma_unmap(vrdl->container, iova, mr_end - mr_start, NULL);
+ if (ret) {
+ error_report("%s: vfio_dma_unmap() failed: %s", __func__,
+ strerror(-ret));
+ }
+}
+
+static int vfio_ram_discard_notify_populate(RamDiscardListener *rdl,
+ const MemoryRegion *mr,
+ ram_addr_t offset, ram_addr_t size)
+{
+ VFIORamDiscardListener *vrdl = container_of(rdl, VFIORamDiscardListener,
+ listener);
+ const hwaddr mr_end = MIN(offset + size,
+ vrdl->offset_within_region + vrdl->size);
+ hwaddr mr_start = MAX(offset, vrdl->offset_within_region);
+ hwaddr mr_next, iova;
+ void *vaddr;
+ int ret;
+
+ /*
+ * Map in (aligned within memory region) minimum granularity, so we can
+ * unmap in minimum granularity later.
+ */
+ for (; mr_start < mr_end; mr_start = mr_next) {
+ mr_next = ROUND_UP(mr_start + 1, vrdl->granularity);
+ mr_next = MIN(mr_next, mr_end);
+
+ iova = mr_start - vrdl->offset_within_region +
+ vrdl->offset_within_address_space;
+ vaddr = memory_region_get_ram_ptr(vrdl->mr) + mr_start;
+
+ ret = vfio_dma_map(vrdl->container, iova, mr_next - mr_start,
+ vaddr, mr->readonly);
+ if (ret) {
+ /* Rollback */
+ vfio_ram_discard_notify_discard(rdl, mr, offset, size);
+ return ret;
+ }
+ }
+ return 0;
+}
+
+static void vfio_ram_discard_notify_discard_all(RamDiscardListener *rdl,
+ const MemoryRegion *mr)
+{
+ VFIORamDiscardListener *vrdl = container_of(rdl, VFIORamDiscardListener,
+ listener);
+ int ret;
+
+ /* Unmap with a single call. */
+ ret = vfio_dma_unmap(vrdl->container, vrdl->offset_within_address_space,
+ vrdl->size, NULL);
+ if (ret) {
+ error_report("%s: vfio_dma_unmap() failed: %s", __func__,
+ strerror(-ret));
+ }
+}
+
+static void vfio_register_ram_discard_notifier(VFIOContainer *container,
+ MemoryRegionSection *section)
+{
+ RamDiscardMgr *rdm = memory_region_get_ram_discard_mgr(section->mr);
+ RamDiscardMgrClass *rdmc = RAM_DISCARD_MGR_GET_CLASS(rdm);
+ VFIORamDiscardListener *vrdl;
+
+ vrdl = g_new0(VFIORamDiscardListener, 1);
+ vrdl->container = container;
+ vrdl->mr = section->mr;
+ vrdl->offset_within_region = section->offset_within_region;
+ vrdl->offset_within_address_space = section->offset_within_address_space;
+ vrdl->size = int128_get64(section->size);
+ vrdl->granularity = rdmc->get_min_granularity(rdm, section->mr);
+
+ g_assert(vrdl->granularity && is_power_of_2(vrdl->granularity));
+ g_assert(vrdl->granularity >= 1ULL << ctz64(container->pgsizes));
+
+ /* Ignore some corner cases not relevant in practice. */
+ g_assert(QEMU_IS_ALIGNED(vrdl->offset_within_region, TARGET_PAGE_SIZE));
+ g_assert(QEMU_IS_ALIGNED(vrdl->offset_within_address_space,
+ TARGET_PAGE_SIZE));
+ g_assert(QEMU_IS_ALIGNED(vrdl->size, TARGET_PAGE_SIZE));
+
+ ram_discard_listener_init(&vrdl->listener,
+ vfio_ram_discard_notify_populate,
+ vfio_ram_discard_notify_discard,
+ vfio_ram_discard_notify_discard_all);
+ rdmc->register_listener(rdm, section->mr, &vrdl->listener);
+ QLIST_INSERT_HEAD(&container->vrdl_list, vrdl, next);
+}
+
+static void vfio_unregister_ram_discard_listener(VFIOContainer *container,
+ MemoryRegionSection *section)
+{
+ RamDiscardMgr *rdm = memory_region_get_ram_discard_mgr(section->mr);
+ RamDiscardMgrClass *rdmc = RAM_DISCARD_MGR_GET_CLASS(rdm);
+ VFIORamDiscardListener *vrdl = NULL;
+
+ QLIST_FOREACH(vrdl, &container->vrdl_list, next) {
+ if (vrdl->mr == section->mr &&
+ vrdl->offset_within_region == section->offset_within_region) {
+ break;
+ }
+ }
+
+ if (!vrdl) {
+ hw_error("vfio: Trying to unregister missing RAM discard listener");
+ }
+
+ rdmc->unregister_listener(rdm, section->mr, &vrdl->listener);
+ QLIST_REMOVE(vrdl, next);
+
+ g_free(vrdl);
+}
+
static void vfio_listener_region_add(MemoryListener *listener,
MemoryRegionSection *section)
{
@@ -814,6 +947,16 @@ static void vfio_listener_region_add(MemoryListener *listener,
/* Here we assume that memory_region_is_ram(section->mr)==true */
+ /*
+ * For RAM memory regions with a RamDiscardMgr, we only want to map the
+ * actually populated parts - and update the mapping whenever we're notified
+ * about changes.
+ */
+ if (memory_region_has_ram_discard_mgr(section->mr)) {
+ vfio_register_ram_discard_notifier(container, section);
+ return;
+ }
+
vaddr = memory_region_get_ram_ptr(section->mr) +
section->offset_within_region +
(iova - section->offset_within_address_space);
@@ -950,6 +1093,10 @@ static void vfio_listener_region_del(MemoryListener *listener,
pgmask = (1ULL << ctz64(hostwin->iova_pgsizes)) - 1;
try_unmap = !((iova & pgmask) || (int128_get64(llsize) & pgmask));
+ } else if (memory_region_has_ram_discard_mgr(section->mr)) {
+ vfio_unregister_ram_discard_listener(container, section);
+ /* Unregistering will trigger an unmap. */
+ try_unmap = false;
}
if (try_unmap) {
@@ -1077,6 +1224,59 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
rcu_read_unlock();
}
+static int vfio_ram_discard_notify_dirty_bitmap(RamDiscardListener *rdl,
+ const MemoryRegion *mr,
+ ram_addr_t offset,
+ ram_addr_t size)
+{
+ VFIORamDiscardListener *vrdl = container_of(rdl, VFIORamDiscardListener,
+ listener);
+ const hwaddr mr_start = MAX(offset, vrdl->offset_within_region);
+ const hwaddr mr_end = MIN(offset + size,
+ vrdl->offset_within_region + vrdl->size);
+ const hwaddr iova = mr_start - vrdl->offset_within_region +
+ vrdl->offset_within_address_space;
+ ram_addr_t ram_addr;
+ int ret;
+
+ if (mr_start >= mr_end) {
+ return 0;
+ }
+
+ /*
+ * Sync the whole mapped region (spanning multiple individual mappings)
+ * in one go.
+ */
+ ram_addr = memory_region_get_ram_addr(vrdl->mr) + mr_start;
+ ret = vfio_get_dirty_bitmap(vrdl->container, iova, mr_end - mr_start,
+ ram_addr);
+ return ret;
+}
+
+static int vfio_sync_ram_discard_listener_dirty_bitmap(VFIOContainer *container,
+ MemoryRegionSection *section)
+{
+ RamDiscardMgr *rdm = memory_region_get_ram_discard_mgr(section->mr);
+ RamDiscardMgrClass *rdmc = RAM_DISCARD_MGR_GET_CLASS(rdm);
+ VFIORamDiscardListener tmp_vrdl, *vrdl = NULL;
+
+ QLIST_FOREACH(vrdl, &container->vrdl_list, next) {
+ if (vrdl->mr == section->mr &&
+ vrdl->offset_within_region == section->offset_within_region) {
+ break;
+ }
+ }
+
+ if (!vrdl) {
+ hw_error("vfio: Trying to sync missing RAM discard listener");
+ }
+
+ tmp_vrdl = *vrdl;
+ ram_discard_listener_init(&tmp_vrdl.listener,
+ vfio_ram_discard_notify_dirty_bitmap, NULL, NULL);
+ return rdmc->replay_populated(rdm, section->mr, &tmp_vrdl.listener);
+}
+
static int vfio_sync_dirty_bitmap(VFIOContainer *container,
MemoryRegionSection *section)
{
@@ -1108,6 +1308,8 @@ static int vfio_sync_dirty_bitmap(VFIOContainer *container,
}
}
return 0;
+ } else if (memory_region_has_ram_discard_mgr(section->mr)) {
+ return vfio_sync_ram_discard_listener_dirty_bitmap(container, section);
}
ram_addr = memory_region_get_ram_addr(section->mr) +
@@ -1737,6 +1939,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
container->dirty_pages_supported = false;
QLIST_INIT(&container->giommu_list);
QLIST_INIT(&container->hostwin_list);
+ QLIST_INIT(&container->vrdl_list);
ret = vfio_init_container(container, group->fd, errp);
if (ret) {
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 6141162d7a..af6f8d1b22 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -91,6 +91,7 @@ typedef struct VFIOContainer {
QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
QLIST_HEAD(, VFIOGroup) group_list;
+ QLIST_HEAD(, VFIORamDiscardListener) vrdl_list;
QLIST_ENTRY(VFIOContainer) next;
} VFIOContainer;
@@ -102,6 +103,17 @@ typedef struct VFIOGuestIOMMU {
QLIST_ENTRY(VFIOGuestIOMMU) giommu_next;
} VFIOGuestIOMMU;
+typedef struct VFIORamDiscardListener {
+ VFIOContainer *container;
+ MemoryRegion *mr;
+ hwaddr offset_within_region;
+ hwaddr offset_within_address_space;
+ hwaddr size;
+ uint64_t granularity;
+ RamDiscardListener listener;
+ QLIST_ENTRY(VFIORamDiscardListener) next;
+} VFIORamDiscardListener;
+
typedef struct VFIOHostDMAWindow {
hwaddr min_iova;
hwaddr max_iova;
--
2.29.2