* [QEMU][PATCHv2 0/8] Xen: support grant mappings.
@ 2023-10-25 21:24 Vikram Garhwal
  2023-10-25 21:24 ` [QEMU][PATCHv2 1/8] xen: when unplugging emulated devices skip virtio devices Vikram Garhwal
                   ` (8 more replies)
  0 siblings, 9 replies; 28+ messages in thread
From: Vikram Garhwal @ 2023-10-25 21:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: sstabellini, vikram.garhwal, jgross

Hi,
This patch series adds support for grant mappings as a pseudo RAM region for Xen.

The patches that enable grant mappings (the first six) were written by Juergen in 2021.

QEMU's virtio devices provide emulated backends for virtio frontend devices in
Xen. Please set the "iommu_platform=on" option when invoking QEMU, as this sets
the VIRTIO_F_ACCESS_PLATFORM feature, which the virtio frontend in Xen uses to
know whether the backend supports grants.
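
For example (illustrative only; the device type and drive id are placeholders,
the relevant option is iommu_platform=on):

    -device virtio-blk-device,drive=hd0,iommu_platform=on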

Changelog:
v1->v2:
    Split v1 patch 2/7 so that the physmem.c changes are in a separate patch.
    In patch "xen: add map and unmap callbacks for grant region", add a check
        that the total number of mapped grants stays below XEN_MAX_VIRTIO_GRANTS.
    Fix formatting issues and rebase on latest master.

Regards,
Vikram
Juergen Gross (6):
  xen: when unplugging emulated devices skip virtio devices
  xen: add pseudo RAM region for grant mappings
  softmmu: let qemu_map_ram_ptr() use qemu_ram_ptr_length()
  xen: let xen_ram_addr_from_mapcache() return -1 in case of not found
    entry
  memory: add MemoryRegion map and unmap callbacks
  xen: add map and unmap callbacks for grant region

Vikram Garhwal (2):
  softmmu: physmem: Split ram_block_add()
  hw: arm: Add grant mapping.

 docs/system/i386/xen.rst        |   3 -
 hw/arm/xen_arm.c                |   3 +
 hw/i386/xen/xen-hvm.c           |   3 +
 hw/i386/xen/xen_platform.c      |  10 +-
 hw/xen/xen-hvm-common.c         |   4 +-
 hw/xen/xen-mapcache.c           | 214 ++++++++++++++++++++++++++++++--
 include/exec/memory.h           |  21 ++++
 include/exec/ram_addr.h         |   1 +
 include/hw/xen/xen-hvm-common.h |   2 +
 include/hw/xen/xen_pvdev.h      |   3 +
 include/sysemu/xen-mapcache.h   |   3 +
 system/physmem.c                | 181 ++++++++++++++++-----------
 12 files changed, 358 insertions(+), 90 deletions(-)

-- 
2.17.1



^ permalink raw reply	[flat|nested] 28+ messages in thread

* [QEMU][PATCHv2 1/8] xen: when unplugging emulated devices skip virtio devices
  2023-10-25 21:24 [QEMU][PATCHv2 0/8] Xen: support grant mappings Vikram Garhwal
@ 2023-10-25 21:24 ` Vikram Garhwal
  2023-10-25 23:22   ` David Woodhouse
  2023-10-25 21:24 ` [QEMU][PATCHv2 2/8] softmmu: physmem: Split ram_block_add() Vikram Garhwal
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 28+ messages in thread
From: Vikram Garhwal @ 2023-10-25 21:24 UTC (permalink / raw)
  To: qemu-devel
  Cc: sstabellini, vikram.garhwal, jgross, Anthony Perard, Paul Durrant,
	Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	Michael S. Tsirkin, Marcel Apfelbaum, open list:X86 Xen CPUs

From: Juergen Gross <jgross@suse.com>

Virtio devices should never be unplugged at boot time, as they are
similar to pci passthrough devices.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
---
 docs/system/i386/xen.rst   |  3 ---
 hw/i386/xen/xen_platform.c | 10 ++++++++--
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/docs/system/i386/xen.rst b/docs/system/i386/xen.rst
index f06765e88c..b86d57af6e 100644
--- a/docs/system/i386/xen.rst
+++ b/docs/system/i386/xen.rst
@@ -52,9 +52,6 @@ It is necessary to use the pc machine type, as the q35 machine uses AHCI instead
 of legacy IDE, and AHCI disks are not unplugged through the Xen PV unplug
 mechanism.
 
-VirtIO devices can also be used; Linux guests may need to be dissuaded from
-umplugging them by adding 'xen_emul_unplug=never' on their command line.
-
 Properties
 ----------
 
diff --git a/hw/i386/xen/xen_platform.c b/hw/i386/xen/xen_platform.c
index 17457ff3de..0187b73eeb 100644
--- a/hw/i386/xen/xen_platform.c
+++ b/hw/i386/xen/xen_platform.c
@@ -28,6 +28,7 @@
 #include "hw/ide/pci.h"
 #include "hw/pci/pci.h"
 #include "migration/vmstate.h"
+#include "hw/virtio/virtio-bus.h"
 #include "net/net.h"
 #include "trace.h"
 #include "sysemu/xen.h"
@@ -129,10 +130,11 @@ static bool pci_device_is_passthrough(PCIDevice *d)
 
 static void unplug_nic(PCIBus *b, PCIDevice *d, void *o)
 {
-    /* We have to ignore passthrough devices */
+    /* We have to ignore passthrough devices  and virtio devices. */
     if (pci_get_word(d->config + PCI_CLASS_DEVICE) ==
             PCI_CLASS_NETWORK_ETHERNET
-            && !pci_device_is_passthrough(d)) {
+            && !pci_device_is_passthrough(d)
+            && !qdev_get_child_bus(&d->qdev, TYPE_VIRTIO_BUS)) {
         object_unparent(OBJECT(d));
     }
 }
@@ -208,6 +210,10 @@ static void unplug_disks(PCIBus *b, PCIDevice *d, void *opaque)
     /* We have to ignore passthrough devices */
     if (pci_device_is_passthrough(d))
         return;
+    /* Ignore virtio devices */
+    if (qdev_get_child_bus(&d->qdev, TYPE_VIRTIO_BUS)) {
+        return;
+    }
 
     switch (pci_get_word(d->config + PCI_CLASS_DEVICE)) {
     case PCI_CLASS_STORAGE_IDE:
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [QEMU][PATCHv2 2/8] softmmu: physmem: Split ram_block_add()
  2023-10-25 21:24 [QEMU][PATCHv2 0/8] Xen: support grant mappings Vikram Garhwal
  2023-10-25 21:24 ` [QEMU][PATCHv2 1/8] xen: when unplugging emulated devices skip virtio devices Vikram Garhwal
@ 2023-10-25 21:24 ` Vikram Garhwal
  2023-10-26  1:25   ` Stefano Stabellini
  2023-10-25 21:24 ` [QEMU][PATCHv2 3/8] xen: add pseudo RAM region for grant mappings Vikram Garhwal
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 28+ messages in thread
From: Vikram Garhwal @ 2023-10-25 21:24 UTC (permalink / raw)
  To: qemu-devel
  Cc: sstabellini, vikram.garhwal, jgross, Paolo Bonzini, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé

Extract the RAM block list update into a new function, ram_block_add_list(). This
is done to support grant mappings, which add a memory region for granted memory
and update the ram_block list.
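
A rough sketch of the intended use (my_region and my_size are placeholders;
the real caller, xen_init_grant_ram(), is added in the next patch):

    RAMBlock *block = g_malloc0(sizeof(*block));

    block->mr = &my_region;       /* region backing the hand-built block */
    block->used_length = my_size;
    block->max_length = my_size;
    block->fd = -1;
    /* insert into the global RAM block list without allocating host memory */
    ram_block_add_list(block);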

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
---
 include/exec/ram_addr.h |  1 +
 system/physmem.c        | 62 ++++++++++++++++++++++++++---------------
 2 files changed, 40 insertions(+), 23 deletions(-)

diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 90676093f5..c0b5f9a7d0 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -139,6 +139,7 @@ void qemu_ram_free(RAMBlock *block);
 int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp);
 
 void qemu_ram_msync(RAMBlock *block, ram_addr_t start, ram_addr_t length);
+void ram_block_add_list(RAMBlock *new_block);
 
 /* Clear whole block of mem */
 static inline void qemu_ram_block_writeback(RAMBlock *block)
diff --git a/system/physmem.c b/system/physmem.c
index fc2b0fee01..7a7f95b8b9 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -1803,12 +1803,47 @@ static void dirty_memory_extend(ram_addr_t old_ram_size,
     }
 }
 
+static void ram_block_add_list_locked(RAMBlock *new_block)
+ {
+     RAMBlock *block;
+     RAMBlock *last_block = NULL;
+
+    /*
+     * Keep the list sorted from biggest to smallest block.  Unlike QTAILQ,
+     * QLIST (which has an RCU-friendly variant) does not have insertion at
+     * tail, so save the last element in last_block.
+     */
+    RAMBLOCK_FOREACH(block) {
+        last_block = block;
+        if (block->max_length < new_block->max_length) {
+            break;
+        }
+    }
+    if (block) {
+        QLIST_INSERT_BEFORE_RCU(block, new_block, next);
+    } else if (last_block) {
+        QLIST_INSERT_AFTER_RCU(last_block, new_block, next);
+    } else { /* list is empty */
+        QLIST_INSERT_HEAD_RCU(&ram_list.blocks, new_block, next);
+    }
+    ram_list.mru_block = NULL;
+
+    /* Write list before version */
+    smp_wmb();
+    ram_list.version++;
+}
+
+void ram_block_add_list(RAMBlock *new_block)
+{
+    qemu_mutex_lock_ramlist();
+    ram_block_add_list_locked(new_block);
+    qemu_mutex_unlock_ramlist();
+}
+
 static void ram_block_add(RAMBlock *new_block, Error **errp)
 {
     const bool noreserve = qemu_ram_is_noreserve(new_block);
     const bool shared = qemu_ram_is_shared(new_block);
-    RAMBlock *block;
-    RAMBlock *last_block = NULL;
     ram_addr_t old_ram_size, new_ram_size;
     Error *err = NULL;
 
@@ -1846,28 +1881,9 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
     if (new_ram_size > old_ram_size) {
         dirty_memory_extend(old_ram_size, new_ram_size);
     }
-    /* Keep the list sorted from biggest to smallest block.  Unlike QTAILQ,
-     * QLIST (which has an RCU-friendly variant) does not have insertion at
-     * tail, so save the last element in last_block.
-     */
-    RAMBLOCK_FOREACH(block) {
-        last_block = block;
-        if (block->max_length < new_block->max_length) {
-            break;
-        }
-    }
-    if (block) {
-        QLIST_INSERT_BEFORE_RCU(block, new_block, next);
-    } else if (last_block) {
-        QLIST_INSERT_AFTER_RCU(last_block, new_block, next);
-    } else { /* list is empty */
-        QLIST_INSERT_HEAD_RCU(&ram_list.blocks, new_block, next);
-    }
-    ram_list.mru_block = NULL;
 
-    /* Write list before version */
-    smp_wmb();
-    ram_list.version++;
+    ram_block_add_list_locked(new_block);
+
     qemu_mutex_unlock_ramlist();
 
     cpu_physical_memory_set_dirty_range(new_block->offset,
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [QEMU][PATCHv2 3/8] xen: add pseudo RAM region for grant mappings
  2023-10-25 21:24 [QEMU][PATCHv2 0/8] Xen: support grant mappings Vikram Garhwal
  2023-10-25 21:24 ` [QEMU][PATCHv2 1/8] xen: when unplugging emulated devices skip virtio devices Vikram Garhwal
  2023-10-25 21:24 ` [QEMU][PATCHv2 2/8] softmmu: physmem: Split ram_block_add() Vikram Garhwal
@ 2023-10-25 21:24 ` Vikram Garhwal
  2023-10-26  1:27   ` Stefano Stabellini
  2023-10-25 21:24 ` [QEMU][PATCHv2 4/8] softmmu: let qemu_map_ram_ptr() use qemu_ram_ptr_length() Vikram Garhwal
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 28+ messages in thread
From: Vikram Garhwal @ 2023-10-25 21:24 UTC (permalink / raw)
  To: qemu-devel
  Cc: sstabellini, vikram.garhwal, jgross, Anthony Perard, Paul Durrant,
	Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	Michael S. Tsirkin, Marcel Apfelbaum, open list:X86 Xen CPUs

From: Juergen Gross <jgross@suse.com>

Add a memory region which can be used to automatically map granted
memory. It starts at 0x8000000000000000ULL in order to be
distinguishable from normal RAM.

For this reason the xen.ram memory region is expanded, which has no
further impact, as it is used just as a container for the real RAM
regions and now also the grant region.
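
As an illustration (these helpers are hypothetical and not part of the series;
the arithmetic mirrors what xen_map_grant_dyn() does in a later patch), an
address inside this region encodes the grant reference directly:

    /* XEN_GRANT_ADDR_OFF is 0x8000000000000000ULL, defined in this patch. */
    static inline bool xen_is_grant_addr(hwaddr addr)
    {
        return addr >= XEN_GRANT_ADDR_OFF;
    }

    static inline uint32_t xen_grant_ref_from_addr(hwaddr addr)
    {
        return (addr - XEN_GRANT_ADDR_OFF) >> XC_PAGE_SHIFT;
    }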

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
---
 hw/i386/xen/xen-hvm.c           |  3 +++
 hw/xen/xen-hvm-common.c         |  4 ++--
 hw/xen/xen-mapcache.c           | 27 +++++++++++++++++++++++++++
 include/hw/xen/xen-hvm-common.h |  2 ++
 include/hw/xen/xen_pvdev.h      |  3 +++
 include/sysemu/xen-mapcache.h   |  3 +++
 6 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
index f42621e674..67a55558a6 100644
--- a/hw/i386/xen/xen-hvm.c
+++ b/hw/i386/xen/xen-hvm.c
@@ -172,6 +172,9 @@ static void xen_ram_init(PCMachineState *pcms,
                                  x86ms->above_4g_mem_size);
         memory_region_add_subregion(sysmem, 0x100000000ULL, &ram_hi);
     }
+
+    /* Add grant mappings as a pseudo RAM region. */
+    ram_grants = *xen_init_grant_ram();
 }
 
 static XenPhysmap *get_physmapping(hwaddr start_addr, ram_addr_t size)
diff --git a/hw/xen/xen-hvm-common.c b/hw/xen/xen-hvm-common.c
index 565dc39c8f..b7255977a5 100644
--- a/hw/xen/xen-hvm-common.c
+++ b/hw/xen/xen-hvm-common.c
@@ -9,7 +9,7 @@
 #include "hw/boards.h"
 #include "hw/xen/arch_hvm.h"
 
-MemoryRegion ram_memory;
+MemoryRegion ram_memory, ram_grants;
 
 void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr,
                    Error **errp)
@@ -26,7 +26,7 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr,
         return;
     }
 
-    if (mr == &ram_memory) {
+    if (mr == &ram_memory || mr == &ram_grants) {
         return;
     }
 
diff --git a/hw/xen/xen-mapcache.c b/hw/xen/xen-mapcache.c
index f7d974677d..8115c44c00 100644
--- a/hw/xen/xen-mapcache.c
+++ b/hw/xen/xen-mapcache.c
@@ -14,7 +14,9 @@
 
 #include <sys/resource.h>
 
+#include "hw/xen/xen-hvm-common.h"
 #include "hw/xen/xen_native.h"
+#include "hw/xen/xen_pvdev.h"
 #include "qemu/bitmap.h"
 
 #include "sysemu/runstate.h"
@@ -597,3 +599,28 @@ uint8_t *xen_replace_cache_entry(hwaddr old_phys_addr,
     mapcache_unlock();
     return p;
 }
+
+MemoryRegion *xen_init_grant_ram(void)
+{
+    RAMBlock *block;
+
+    memory_region_init(&ram_grants, NULL, "xen.grants",
+                       XEN_MAX_VIRTIO_GRANTS * XC_PAGE_SIZE);
+    block = g_malloc0(sizeof(*block));
+    block->mr = &ram_grants;
+    block->used_length = XEN_MAX_VIRTIO_GRANTS * XC_PAGE_SIZE;
+    block->max_length = XEN_MAX_VIRTIO_GRANTS * XC_PAGE_SIZE;
+    block->fd = -1;
+    block->page_size = XC_PAGE_SIZE;
+    block->host = (void *)XEN_GRANT_ADDR_OFF;
+    block->offset = XEN_GRANT_ADDR_OFF;
+    block->flags = RAM_PREALLOC;
+    ram_grants.ram_block = block;
+    ram_grants.ram = true;
+    ram_grants.terminates = true;
+    ram_block_add_list(block);
+    memory_region_add_subregion(get_system_memory(), XEN_GRANT_ADDR_OFF,
+                                &ram_grants);
+
+    return &ram_grants;
+}
diff --git a/include/hw/xen/xen-hvm-common.h b/include/hw/xen/xen-hvm-common.h
index 4e9904f1a6..0d300ba898 100644
--- a/include/hw/xen/xen-hvm-common.h
+++ b/include/hw/xen/xen-hvm-common.h
@@ -17,6 +17,8 @@
 #include <xen/hvm/ioreq.h>
 
 extern MemoryRegion ram_memory;
+
+extern MemoryRegion ram_grants;
 extern MemoryListener xen_io_listener;
 extern DeviceListener xen_device_listener;
 
diff --git a/include/hw/xen/xen_pvdev.h b/include/hw/xen/xen_pvdev.h
index ddad4b9f36..0f1b5edfa9 100644
--- a/include/hw/xen/xen_pvdev.h
+++ b/include/hw/xen/xen_pvdev.h
@@ -80,4 +80,7 @@ int xen_pv_send_notify(struct XenLegacyDevice *xendev);
 void xen_pv_printf(struct XenLegacyDevice *xendev, int msg_level,
                    const char *fmt, ...)  G_GNUC_PRINTF(3, 4);
 
+#define XEN_GRANT_ADDR_OFF    0x8000000000000000ULL
+#define XEN_MAX_VIRTIO_GRANTS 65536
+
 #endif /* QEMU_HW_XEN_PVDEV_H */
diff --git a/include/sysemu/xen-mapcache.h b/include/sysemu/xen-mapcache.h
index c8e7c2f6cf..f4bedb1c11 100644
--- a/include/sysemu/xen-mapcache.h
+++ b/include/sysemu/xen-mapcache.h
@@ -10,6 +10,7 @@
 #define XEN_MAPCACHE_H
 
 #include "exec/cpu-common.h"
+#include "exec/ram_addr.h"
 
 typedef hwaddr (*phys_offset_to_gaddr_t)(hwaddr phys_offset,
                                          ram_addr_t size);
@@ -25,6 +26,8 @@ void xen_invalidate_map_cache(void);
 uint8_t *xen_replace_cache_entry(hwaddr old_phys_addr,
                                  hwaddr new_phys_addr,
                                  hwaddr size);
+MemoryRegion *xen_init_grant_ram(void);
+
 #else
 
 static inline void xen_map_cache_init(phys_offset_to_gaddr_t f,
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [QEMU][PATCHv2 4/8] softmmu: let qemu_map_ram_ptr() use qemu_ram_ptr_length()
  2023-10-25 21:24 [QEMU][PATCHv2 0/8] Xen: support grant mappings Vikram Garhwal
                   ` (2 preceding siblings ...)
  2023-10-25 21:24 ` [QEMU][PATCHv2 3/8] xen: add pseudo RAM region for grant mappings Vikram Garhwal
@ 2023-10-25 21:24 ` Vikram Garhwal
  2023-10-26  1:28   ` Stefano Stabellini
  2023-10-25 21:24 ` [QEMU][PATCHv2 5/8] xen: let xen_ram_addr_from_mapcache() return -1 in case of not found entry Vikram Garhwal
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 28+ messages in thread
From: Vikram Garhwal @ 2023-10-25 21:24 UTC (permalink / raw)
  To: qemu-devel
  Cc: sstabellini, vikram.garhwal, jgross, Paolo Bonzini, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé

From: Juergen Gross <jgross@suse.com>

qemu_map_ram_ptr() and qemu_ram_ptr_length() share quite a bit of code, so
modify qemu_ram_ptr_length() slightly and use it for qemu_map_ram_ptr(),
too.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
---
 system/physmem.c | 58 +++++++++++++++++++-----------------------------
 1 file changed, 23 insertions(+), 35 deletions(-)

diff --git a/system/physmem.c b/system/physmem.c
index 7a7f95b8b9..667a695078 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -2163,38 +2163,8 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length)
 }
 #endif /* !_WIN32 */
 
-/* Return a host pointer to ram allocated with qemu_ram_alloc.
- * This should not be used for general purpose DMA.  Use address_space_map
- * or address_space_rw instead. For local memory (e.g. video ram) that the
- * device owns, use memory_region_get_ram_ptr.
- *
- * Called within RCU critical section.
- */
-void *qemu_map_ram_ptr(RAMBlock *ram_block, ram_addr_t addr)
-{
-    RAMBlock *block = ram_block;
-
-    if (block == NULL) {
-        block = qemu_get_ram_block(addr);
-        addr -= block->offset;
-    }
-
-    if (xen_enabled() && block->host == NULL) {
-        /* We need to check if the requested address is in the RAM
-         * because we don't want to map the entire memory in QEMU.
-         * In that case just map until the end of the page.
-         */
-        if (block->offset == 0) {
-            return xen_map_cache(addr, 0, 0, false);
-        }
-
-        block->host = xen_map_cache(block->offset, block->max_length, 1, false);
-    }
-    return ramblock_ptr(block, addr);
-}
-
-/* Return a host pointer to guest's ram. Similar to qemu_map_ram_ptr
- * but takes a size argument.
+/*
+ * Return a host pointer to guest's ram.
  *
  * Called within RCU critical section.
  */
@@ -2202,7 +2172,9 @@ static void *qemu_ram_ptr_length(RAMBlock *ram_block, ram_addr_t addr,
                                  hwaddr *size, bool lock)
 {
     RAMBlock *block = ram_block;
-    if (*size == 0) {
+    hwaddr len = 0;
+
+    if (size && *size == 0) {
         return NULL;
     }
 
@@ -2210,7 +2182,10 @@ static void *qemu_ram_ptr_length(RAMBlock *ram_block, ram_addr_t addr,
         block = qemu_get_ram_block(addr);
         addr -= block->offset;
     }
-    *size = MIN(*size, block->max_length - addr);
+    if (size) {
+        *size = MIN(*size, block->max_length - addr);
+        len = *size;
+    }
 
     if (xen_enabled() && block->host == NULL) {
         /* We need to check if the requested address is in the RAM
@@ -2218,7 +2193,7 @@ static void *qemu_ram_ptr_length(RAMBlock *ram_block, ram_addr_t addr,
          * In that case just map the requested area.
          */
         if (block->offset == 0) {
-            return xen_map_cache(addr, *size, lock, lock);
+            return xen_map_cache(addr, len, lock, lock);
         }
 
         block->host = xen_map_cache(block->offset, block->max_length, 1, lock);
@@ -2227,6 +2202,19 @@ static void *qemu_ram_ptr_length(RAMBlock *ram_block, ram_addr_t addr,
     return ramblock_ptr(block, addr);
 }
 
+/*
+ * Return a host pointer to ram allocated with qemu_ram_alloc.
+ * This should not be used for general purpose DMA.  Use address_space_map
+ * or address_space_rw instead. For local memory (e.g. video ram) that the
+ * device owns, use memory_region_get_ram_ptr.
+ *
+ * Called within RCU critical section.
+ */
+void *qemu_map_ram_ptr(RAMBlock *ram_block, ram_addr_t addr)
+{
+    return qemu_ram_ptr_length(ram_block, addr, NULL, false);
+}
+
 /* Return the offset of a hostpointer within a ramblock */
 ram_addr_t qemu_ram_block_host_offset(RAMBlock *rb, void *host)
 {
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [QEMU][PATCHv2 5/8] xen: let xen_ram_addr_from_mapcache() return -1 in case of not found entry
  2023-10-25 21:24 [QEMU][PATCHv2 0/8] Xen: support grant mappings Vikram Garhwal
                   ` (3 preceding siblings ...)
  2023-10-25 21:24 ` [QEMU][PATCHv2 4/8] softmmu: let qemu_map_ram_ptr() use qemu_ram_ptr_length() Vikram Garhwal
@ 2023-10-25 21:24 ` Vikram Garhwal
  2023-10-25 21:24 ` [QEMU][PATCHv2 6/8] memory: add MemoryRegion map and unmap callbacks Vikram Garhwal
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 28+ messages in thread
From: Vikram Garhwal @ 2023-10-25 21:24 UTC (permalink / raw)
  To: qemu-devel
  Cc: sstabellini, vikram.garhwal, jgross, Anthony Perard, Paul Durrant,
	open list:X86 Xen CPUs

From: Juergen Gross <jgross@suse.com>

Today xen_ram_addr_from_mapcache() will either abort() or return 0 in
case it can't find a matching entry for a pointer value. Both cases
are bad, so change that to return an invalid address instead.
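
The caller-side pattern this enables looks roughly like the
qemu_ram_block_from_host() hunk in patch 7:

    ram_addr = xen_ram_addr_from_mapcache(ptr);
    if (ram_addr == RAM_ADDR_INVALID) {
        /* ptr is not tracked by the mapcache; fall back to another lookup */
    }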

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
 hw/xen/xen-mapcache.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/hw/xen/xen-mapcache.c b/hw/xen/xen-mapcache.c
index 8115c44c00..8a61c7dde6 100644
--- a/hw/xen/xen-mapcache.c
+++ b/hw/xen/xen-mapcache.c
@@ -404,13 +404,8 @@ ram_addr_t xen_ram_addr_from_mapcache(void *ptr)
         }
     }
     if (!found) {
-        fprintf(stderr, "%s, could not find %p\n", __func__, ptr);
-        QTAILQ_FOREACH(reventry, &mapcache->locked_entries, next) {
-            DPRINTF("   "HWADDR_FMT_plx" -> %p is present\n", reventry->paddr_index,
-                    reventry->vaddr_req);
-        }
-        abort();
-        return 0;
+        mapcache_unlock();
+        return RAM_ADDR_INVALID;
     }
 
     entry = &mapcache->entry[paddr_index % mapcache->nr_buckets];
@@ -418,8 +413,7 @@ ram_addr_t xen_ram_addr_from_mapcache(void *ptr)
         entry = entry->next;
     }
     if (!entry) {
-        DPRINTF("Trying to find address %p that is not in the mapcache!\n", ptr);
-        raddr = 0;
+        raddr = RAM_ADDR_INVALID;
     } else {
         raddr = (reventry->paddr_index << MCACHE_BUCKET_SHIFT) +
              ((unsigned long) ptr - (unsigned long) entry->vaddr_base);
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [QEMU][PATCHv2 6/8] memory: add MemoryRegion map and unmap callbacks
  2023-10-25 21:24 [QEMU][PATCHv2 0/8] Xen: support grant mappings Vikram Garhwal
                   ` (4 preceding siblings ...)
  2023-10-25 21:24 ` [QEMU][PATCHv2 5/8] xen: let xen_ram_addr_from_mapcache() return -1 in case of not found entry Vikram Garhwal
@ 2023-10-25 21:24 ` Vikram Garhwal
  2023-10-25 21:24 ` [QEMU][PATCHv2 7/8] xen: add map and unmap callbacks for grant region Vikram Garhwal
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 28+ messages in thread
From: Vikram Garhwal @ 2023-10-25 21:24 UTC (permalink / raw)
  To: qemu-devel
  Cc: sstabellini, vikram.garhwal, jgross, Paolo Bonzini, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé

From: Juergen Gross <jgross@suse.com>

In order to support mapping and unmapping guest memory dynamically to
and from QEMU during address_space_[un]map() operations, add the map()
and unmap() callbacks to MemoryRegionOps.

Those will be used e.g. for Xen grant mappings when performing guest
I/Os.
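
A minimal sketch of a region using the new callbacks (my_backend_map() and
my_backend_unmap() are placeholders; the real Xen grant implementation comes
in a later patch of this series):

    static void *my_region_map(MemoryRegion **mr, hwaddr addr, hwaddr *plen,
                               bool is_write, MemTxAttrs attrs)
    {
        /* Map up to *plen bytes at guest address addr and return a host
         * pointer; *plen may be reduced to the length actually mapped. */
        return my_backend_map(addr, plen, is_write);
    }

    static void my_region_unmap(MemoryRegion *mr, void *buffer, ram_addr_t addr,
                                hwaddr len, bool is_write, hwaddr access_len)
    {
        my_backend_unmap(buffer, len);
    }

    static const MemoryRegionOps my_region_ops = {
        .map = my_region_map,
        .unmap = my_region_unmap,
        .endianness = DEVICE_LITTLE_ENDIAN,
    };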

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
---
 include/exec/memory.h | 21 ++++++++++++++++++
 system/physmem.c      | 50 +++++++++++++++++++++++++++++++++----------
 2 files changed, 60 insertions(+), 11 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 9087d02769..7c5444d46f 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -282,6 +282,27 @@ struct MemoryRegionOps {
                                     unsigned size,
                                     MemTxAttrs attrs);
 
+    /*
+     * Dynamically create mapping. @addr is the guest address to map; @plen
+     * is the pointer to the usable length of the buffer.
+     * @mr contents can be changed in case a new memory region is created for
+     * the mapping.
+     * Returns the buffer address for accessing the data.
+     */
+    void *(*map)(MemoryRegion **mr,
+                 hwaddr addr,
+                 hwaddr *plen,
+                 bool is_write,
+                 MemTxAttrs attrs);
+
+    /* Unmap an area obtained via map() before. */
+    void (*unmap)(MemoryRegion *mr,
+                  void *buffer,
+                  ram_addr_t addr,
+                  hwaddr len,
+                  bool is_write,
+                  hwaddr access_len);
+
     enum device_endian endianness;
     /* Guest-visible constraints: */
     struct {
diff --git a/system/physmem.c b/system/physmem.c
index 667a695078..5db1b32823 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -3120,6 +3120,7 @@ void *address_space_map(AddressSpace *as,
     hwaddr len = *plen;
     hwaddr l, xlat;
     MemoryRegion *mr;
+    void *ptr = NULL;
     FlatView *fv;
 
     if (len == 0) {
@@ -3153,12 +3154,20 @@ void *address_space_map(AddressSpace *as,
         return bounce.buffer;
     }
 
-
     memory_region_ref(mr);
+
+    if (mr->ops && mr->ops->map) {
+        ptr = mr->ops->map(&mr, addr, plen, is_write, attrs);
+    }
+
     *plen = flatview_extend_translation(fv, addr, len, mr, xlat,
                                         l, is_write, attrs);
     fuzz_dma_read_cb(addr, *plen, mr);
-    return qemu_ram_ptr_length(mr->ram_block, xlat, plen, true);
+    if (ptr == NULL) {
+        ptr = qemu_ram_ptr_length(mr->ram_block, xlat, plen, true);
+    }
+
+    return ptr;
 }
 
 /* Unmaps a memory region previously mapped by address_space_map().
@@ -3174,11 +3183,16 @@ void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len,
 
         mr = memory_region_from_host(buffer, &addr1);
         assert(mr != NULL);
-        if (is_write) {
-            invalidate_and_set_dirty(mr, addr1, access_len);
-        }
-        if (xen_enabled()) {
-            xen_invalidate_map_cache_entry(buffer);
+
+        if (mr->ops && mr->ops->unmap) {
+            mr->ops->unmap(mr, buffer, addr1, len, is_write, access_len);
+        } else {
+            if (is_write) {
+                invalidate_and_set_dirty(mr, addr1, access_len);
+            }
+            if (xen_enabled()) {
+                xen_invalidate_map_cache_entry(buffer);
+            }
         }
         memory_region_unref(mr);
         return;
@@ -3251,10 +3265,18 @@ int64_t address_space_cache_init(MemoryRegionCache *cache,
          * doing this if we found actual RAM, which behaves the same
          * regardless of attributes; so UNSPECIFIED is fine.
          */
+        if (mr->ops && mr->ops->map) {
+            cache->ptr = mr->ops->map(&mr, addr, &l, is_write,
+                                      MEMTXATTRS_UNSPECIFIED);
+        }
+
         l = flatview_extend_translation(cache->fv, addr, len, mr,
                                         cache->xlat, l, is_write,
                                         MEMTXATTRS_UNSPECIFIED);
-        cache->ptr = qemu_ram_ptr_length(mr->ram_block, cache->xlat, &l, true);
+        if (!cache->ptr) {
+            cache->ptr = qemu_ram_ptr_length(mr->ram_block, cache->xlat, &l,
+                                             true);
+        }
     } else {
         cache->ptr = NULL;
     }
@@ -3276,14 +3298,20 @@ void address_space_cache_invalidate(MemoryRegionCache *cache,
 
 void address_space_cache_destroy(MemoryRegionCache *cache)
 {
-    if (!cache->mrs.mr) {
+    MemoryRegion *mr = cache->mrs.mr;
+
+    if (!mr) {
         return;
     }
 
-    if (xen_enabled()) {
+    if (mr->ops && mr->ops->unmap) {
+            mr->ops->unmap(mr, cache->ptr, cache->xlat, cache->len,
+                           cache->is_write, cache->len);
+    } else if (xen_enabled()) {
         xen_invalidate_map_cache_entry(cache->ptr);
     }
-    memory_region_unref(cache->mrs.mr);
+
+    memory_region_unref(mr);
     flatview_unref(cache->fv);
     cache->mrs.mr = NULL;
     cache->fv = NULL;
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [QEMU][PATCHv2 7/8] xen: add map and unmap callbacks for grant region
  2023-10-25 21:24 [QEMU][PATCHv2 0/8] Xen: support grant mappings Vikram Garhwal
                   ` (5 preceding siblings ...)
  2023-10-25 21:24 ` [QEMU][PATCHv2 6/8] memory: add MemoryRegion map and unmap callbacks Vikram Garhwal
@ 2023-10-25 21:24 ` Vikram Garhwal
  2023-10-26  1:32   ` Stefano Stabellini
  2023-10-25 21:24 ` [QEMU][PATCHv2 8/8] hw: arm: Add grant mapping Vikram Garhwal
  2023-10-26 16:12 ` [QEMU][PATCHv2 0/8] Xen: support grant mappings David Woodhouse
  8 siblings, 1 reply; 28+ messages in thread
From: Vikram Garhwal @ 2023-10-25 21:24 UTC (permalink / raw)
  To: qemu-devel
  Cc: sstabellini, vikram.garhwal, jgross, Anthony Perard, Paul Durrant,
	Paolo Bonzini, Peter Xu, David Hildenbrand,
	Philippe Mathieu-Daudé, open list:X86 Xen CPUs

From: Juergen Gross <jgross@suse.com>

Add the callbacks for mapping/unmapping guest memory via grants to the
special grant memory region.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
---
 hw/xen/xen-mapcache.c | 175 +++++++++++++++++++++++++++++++++++++++++-
 system/physmem.c      |  11 ++-
 2 files changed, 181 insertions(+), 5 deletions(-)

diff --git a/hw/xen/xen-mapcache.c b/hw/xen/xen-mapcache.c
index 8a61c7dde6..feb4a3b886 100644
--- a/hw/xen/xen-mapcache.c
+++ b/hw/xen/xen-mapcache.c
@@ -9,6 +9,8 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/queue.h"
+#include "qemu/thread.h"
 #include "qemu/units.h"
 #include "qemu/error-report.h"
 
@@ -23,6 +25,8 @@
 #include "sysemu/xen-mapcache.h"
 #include "trace.h"
 
+#include <xenevtchn.h>
+#include <xengnttab.h>
 
 //#define MAPCACHE_DEBUG
 
@@ -385,7 +389,7 @@ uint8_t *xen_map_cache(hwaddr phys_addr, hwaddr size,
     return p;
 }
 
-ram_addr_t xen_ram_addr_from_mapcache(void *ptr)
+static ram_addr_t xen_ram_addr_from_mapcache_try(void *ptr)
 {
     MapCacheEntry *entry = NULL;
     MapCacheRev *reventry;
@@ -594,10 +598,178 @@ uint8_t *xen_replace_cache_entry(hwaddr old_phys_addr,
     return p;
 }
 
+struct XENMappedGrantRegion {
+    void *addr;
+    unsigned int pages;
+    unsigned int refs;
+    unsigned int prot;
+    uint32_t idx;
+    QLIST_ENTRY(XENMappedGrantRegion) list;
+};
+
+static xengnttab_handle *xen_region_gnttabdev;
+static QLIST_HEAD(GrantRegionList, XENMappedGrantRegion) xen_grant_mappings =
+    QLIST_HEAD_INITIALIZER(xen_grant_mappings);
+static QemuMutex xen_map_mutex;
+
+static void *xen_map_grant_dyn(MemoryRegion **mr, hwaddr addr, hwaddr *plen,
+                               bool is_write, MemTxAttrs attrs)
+{
+    unsigned int page_off = addr & (XC_PAGE_SIZE - 1);
+    unsigned int i;
+    unsigned int total_grants = 0;
+    unsigned int nrefs = (page_off + *plen + XC_PAGE_SIZE - 1) >> XC_PAGE_SHIFT;
+    uint32_t ref = (addr - XEN_GRANT_ADDR_OFF) >> XC_PAGE_SHIFT;
+    uint32_t *refs = NULL;
+    unsigned int prot = PROT_READ;
+    struct XENMappedGrantRegion *mgr = NULL;
+
+    if (is_write) {
+        prot |= PROT_WRITE;
+    }
+
+    qemu_mutex_lock(&xen_map_mutex);
+
+    QLIST_FOREACH(mgr, &xen_grant_mappings, list) {
+        if (mgr->idx == ref &&
+            mgr->pages == nrefs &&
+            (mgr->prot & prot) == prot) {
+            break;
+        }
+
+        total_grants += mgr->pages;
+    }
+
+    if (!mgr) {
+        if (nrefs + total_grants >= XEN_MAX_VIRTIO_GRANTS) {
+            return NULL;
+        }
+
+        mgr = g_new(struct XENMappedGrantRegion, 1);
+
+        if (nrefs == 1) {
+            refs = &ref;
+        } else {
+            refs = g_new(uint32_t, nrefs);
+            for (i = 0; i < nrefs; i++) {
+                refs[i] = ref + i;
+            }
+        }
+        mgr->addr = xengnttab_map_domain_grant_refs(xen_region_gnttabdev, nrefs,
+                                                    xen_domid, refs, prot);
+        if (mgr->addr) {
+            mgr->pages = nrefs;
+            mgr->refs = 1;
+            mgr->prot = prot;
+            mgr->idx = ref;
+
+            QLIST_INSERT_HEAD(&xen_grant_mappings, mgr, list);
+        } else {
+            g_free(mgr);
+            mgr = NULL;
+        }
+    } else {
+        mgr->refs++;
+    }
+
+    qemu_mutex_unlock(&xen_map_mutex);
+
+    if (nrefs > 1) {
+        g_free(refs);
+    }
+
+    return mgr ? mgr->addr + page_off : NULL;
+}
+
+static void xen_unmap_grant_dyn(MemoryRegion *mr, void *buffer, ram_addr_t addr,
+                                hwaddr len, bool is_write, hwaddr access_len)
+{
+    unsigned int page_off = (unsigned long)buffer & (XC_PAGE_SIZE - 1);
+    unsigned int nrefs = (page_off + len + XC_PAGE_SIZE - 1) >> XC_PAGE_SHIFT;
+    unsigned int prot = PROT_READ;
+    struct XENMappedGrantRegion *mgr = NULL;
+
+    if (is_write) {
+        prot |= PROT_WRITE;
+    }
+
+    qemu_mutex_lock(&xen_map_mutex);
+
+    QLIST_FOREACH(mgr, &xen_grant_mappings, list) {
+        if (mgr->addr == buffer - page_off &&
+            mgr->pages == nrefs &&
+            (mgr->prot & prot) == prot) {
+            break;
+        }
+    }
+    if (mgr) {
+        mgr->refs--;
+        if (!mgr->refs) {
+            xengnttab_unmap(xen_region_gnttabdev, mgr->addr, nrefs);
+
+            QLIST_REMOVE(mgr, list);
+            g_free(mgr);
+        }
+    } else {
+        error_report("xen_unmap_grant_dyn() trying to unmap unknown buffer");
+    }
+
+    qemu_mutex_unlock(&xen_map_mutex);
+}
+
+static ram_addr_t xen_ram_addr_from_grant_cache(void *ptr)
+{
+    unsigned int page_off = (unsigned long)ptr & (XC_PAGE_SIZE - 1);
+    struct XENMappedGrantRegion *mgr = NULL;
+    ram_addr_t raddr = RAM_ADDR_INVALID;
+
+    qemu_mutex_lock(&xen_map_mutex);
+
+    QLIST_FOREACH(mgr, &xen_grant_mappings, list) {
+        if (mgr->addr == ptr - page_off) {
+            break;
+        }
+    }
+
+    if (mgr) {
+        raddr = (mgr->idx << XC_PAGE_SHIFT) + page_off + XEN_GRANT_ADDR_OFF;
+    }
+
+    qemu_mutex_unlock(&xen_map_mutex);
+
+    return raddr;
+}
+
+ram_addr_t xen_ram_addr_from_mapcache(void *ptr)
+{
+    ram_addr_t raddr;
+
+    raddr = xen_ram_addr_from_mapcache_try(ptr);
+    if (raddr == RAM_ADDR_INVALID) {
+        raddr = xen_ram_addr_from_grant_cache(ptr);
+    }
+
+    return raddr;
+}
+
+static const struct MemoryRegionOps xen_grant_mr_ops = {
+    .map = xen_map_grant_dyn,
+    .unmap = xen_unmap_grant_dyn,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+};
+
 MemoryRegion *xen_init_grant_ram(void)
 {
     RAMBlock *block;
 
+    qemu_mutex_init(&xen_map_mutex);
+
+    xen_region_gnttabdev = xengnttab_open(NULL, 0);
+    if (xen_region_gnttabdev == NULL) {
+        fprintf(stderr, "can't open gnttab device\n");
+        return NULL;
+    }
+
     memory_region_init(&ram_grants, NULL, "xen.grants",
                        XEN_MAX_VIRTIO_GRANTS * XC_PAGE_SIZE);
     block = g_malloc0(sizeof(*block));
@@ -612,6 +784,7 @@ MemoryRegion *xen_init_grant_ram(void)
     ram_grants.ram_block = block;
     ram_grants.ram = true;
     ram_grants.terminates = true;
+    ram_grants.ops = &xen_grant_mr_ops;
     ram_block_add_list(block);
     memory_region_add_subregion(get_system_memory(), XEN_GRANT_ADDR_OFF,
                                 &ram_grants);
diff --git a/system/physmem.c b/system/physmem.c
index 5db1b32823..155a8c05fb 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -2233,13 +2233,16 @@ RAMBlock *qemu_ram_block_from_host(void *ptr, bool round_offset,
 
     if (xen_enabled()) {
         ram_addr_t ram_addr;
+
         RCU_READ_LOCK_GUARD();
         ram_addr = xen_ram_addr_from_mapcache(ptr);
-        block = qemu_get_ram_block(ram_addr);
-        if (block) {
-            *offset = ram_addr - block->offset;
+        if (ram_addr != RAM_ADDR_INVALID) {
+            block = qemu_get_ram_block(ram_addr);
+            if (block) {
+                *offset = ram_addr - block->offset;
+            }
+            return block;
         }
-        return block;
     }
 
     RCU_READ_LOCK_GUARD();
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [QEMU][PATCHv2 8/8] hw: arm: Add grant mapping.
  2023-10-25 21:24 [QEMU][PATCHv2 0/8] Xen: support grant mappings Vikram Garhwal
                   ` (6 preceding siblings ...)
  2023-10-25 21:24 ` [QEMU][PATCHv2 7/8] xen: add map and unmap callbacks for grant region Vikram Garhwal
@ 2023-10-25 21:24 ` Vikram Garhwal
  2023-10-26 16:12 ` [QEMU][PATCHv2 0/8] Xen: support grant mappings David Woodhouse
  8 siblings, 0 replies; 28+ messages in thread
From: Vikram Garhwal @ 2023-10-25 21:24 UTC (permalink / raw)
  To: qemu-devel
  Cc: sstabellini, vikram.garhwal, jgross, Peter Maydell,
	open list:ARM TCG CPUs

Enable grant RAM mapping support for the xenpvh machine on ARM.

Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
 hw/arm/xen_arm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/arm/xen_arm.c b/hw/arm/xen_arm.c
index f83b983ec5..553c289720 100644
--- a/hw/arm/xen_arm.c
+++ b/hw/arm/xen_arm.c
@@ -125,6 +125,9 @@ static void xen_init_ram(MachineState *machine)
         DPRINTF("Initialized region xen.ram.hi: base 0x%llx size 0x%lx\n",
                 GUEST_RAM1_BASE, ram_size[1]);
     }
+
+    DPRINTF("init grant ram mapping for XEN\n");
+    ram_grants = *xen_init_grant_ram();
 }
 
 void arch_handle_ioreq(XenIOState *state, ioreq_t *req)
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [QEMU][PATCHv2 1/8] xen: when unplugging emulated devices skip virtio devices
  2023-10-25 21:24 ` [QEMU][PATCHv2 1/8] xen: when unplugging emulated devices skip virtio devices Vikram Garhwal
@ 2023-10-25 23:22   ` David Woodhouse
  2023-10-26  1:23     ` Stefano Stabellini
  0 siblings, 1 reply; 28+ messages in thread
From: David Woodhouse @ 2023-10-25 23:22 UTC (permalink / raw)
  To: Vikram Garhwal, qemu-devel
  Cc: sstabellini, jgross, Anthony Perard, Paul Durrant, Paolo Bonzini,
	Richard Henderson, Eduardo Habkost, Michael S. Tsirkin,
	Marcel Apfelbaum, open list:X86 Xen CPUs


On Wed, 2023-10-25 at 14:24 -0700, Vikram Garhwal wrote:
> From: Juergen Gross <jgross@suse.com>
> 
> Virtio devices should never be unplugged at boot time, as they are
> similar to pci passthrough devices.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>

Hm, do your virtio NICs still actually *work* after that? Or are they
all disconnected from their netdev peers? 

I suspect you're going to want a variant of
https://lore.kernel.org/qemu-devel/20231025145042.627381-19-dwmw2@infradead.org/T/#u
which also leaves the peers of your virtio devices intact?


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [QEMU][PATCHv2 1/8] xen: when unplugging emulated devices skip virtio devices
  2023-10-25 23:22   ` David Woodhouse
@ 2023-10-26  1:23     ` Stefano Stabellini
  2023-10-26 15:45       ` David Woodhouse
  0 siblings, 1 reply; 28+ messages in thread
From: Stefano Stabellini @ 2023-10-26  1:23 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Vikram Garhwal, qemu-devel, sstabellini, jgross, Anthony Perard,
	Paul Durrant, Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	Michael S. Tsirkin, Marcel Apfelbaum, open list:X86 Xen CPUs

On Thu, 26 Oct 2023, David Woodhouse wrote:
> On Wed, 2023-10-25 at 14:24 -0700, Vikram Garhwal wrote:
> > From: Juergen Gross <jgross@suse.com>
> > 
> > Virtio devices should never be unplugged at boot time, as they are
> > similar to pci passthrough devices.
> > 
> > Signed-off-by: Juergen Gross <jgross@suse.com>
> > Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
> 
> Hm, do your virtio NICs still actually *work* after that? Or are they
> all disconnected from their netdev peers? 
> 
> I suspect you're going to want a variant of
> https://lore.kernel.org/qemu-devel/20231025145042.627381-19-dwmw2@infradead.org/T/#u
> which also leave the peers of your virtio devices intact?

Hi David, device unplug is an x86-only thing (see the definition of
xen_emul_unplug in Linux under arch/x86/xen/platform-pci-unplug.c). I
suspect Vikram, who is working on ARM, hasn't tested it.

Vikram, a simple option is to drop this patch if you don't need it.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [QEMU][PATCHv2 2/8] softmmu: physmem: Split ram_block_add()
  2023-10-25 21:24 ` [QEMU][PATCHv2 2/8] softmmu: physmem: Split ram_block_add() Vikram Garhwal
@ 2023-10-26  1:25   ` Stefano Stabellini
  0 siblings, 0 replies; 28+ messages in thread
From: Stefano Stabellini @ 2023-10-26  1:25 UTC (permalink / raw)
  To: Vikram Garhwal
  Cc: qemu-devel, sstabellini, jgross, Paolo Bonzini, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé

On Wed, 25 Oct 2023, Vikram Garhwal wrote:
> Extract ram block list update to a new function ram_block_add_list(). This is
> done to support grant mappings which adds a memory region for granted memory and
> updates the ram_block list.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
>  include/exec/ram_addr.h |  1 +
>  system/physmem.c        | 62 ++++++++++++++++++++++++++---------------
>  2 files changed, 40 insertions(+), 23 deletions(-)
> 
> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
> index 90676093f5..c0b5f9a7d0 100644
> --- a/include/exec/ram_addr.h
> +++ b/include/exec/ram_addr.h
> @@ -139,6 +139,7 @@ void qemu_ram_free(RAMBlock *block);
>  int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp);
>  
>  void qemu_ram_msync(RAMBlock *block, ram_addr_t start, ram_addr_t length);
> +void ram_block_add_list(RAMBlock *new_block);
>  
>  /* Clear whole block of mem */
>  static inline void qemu_ram_block_writeback(RAMBlock *block)
> diff --git a/system/physmem.c b/system/physmem.c
> index fc2b0fee01..7a7f95b8b9 100644
> --- a/system/physmem.c
> +++ b/system/physmem.c
> @@ -1803,12 +1803,47 @@ static void dirty_memory_extend(ram_addr_t old_ram_size,
>      }
>  }
>  
> +static void ram_block_add_list_locked(RAMBlock *new_block)
> + {
> +     RAMBlock *block;
> +     RAMBlock *last_block = NULL;
> +
> +    /*
> +     * Keep the list sorted from biggest to smallest block.  Unlike QTAILQ,
> +     * QLIST (which has an RCU-friendly variant) does not have insertion at
> +     * tail, so save the last element in last_block.
> +     */
> +    RAMBLOCK_FOREACH(block) {
> +        last_block = block;
> +        if (block->max_length < new_block->max_length) {
> +            break;
> +        }
> +    }
> +    if (block) {
> +        QLIST_INSERT_BEFORE_RCU(block, new_block, next);
> +    } else if (last_block) {
> +        QLIST_INSERT_AFTER_RCU(last_block, new_block, next);
> +    } else { /* list is empty */
> +        QLIST_INSERT_HEAD_RCU(&ram_list.blocks, new_block, next);
> +    }
> +    ram_list.mru_block = NULL;
> +
> +    /* Write list before version */
> +    smp_wmb();
> +    ram_list.version++;
> +}
> +
> +void ram_block_add_list(RAMBlock *new_block)
> +{
> +    qemu_mutex_lock_ramlist();
> +    ram_block_add_list_locked(new_block);
> +    qemu_mutex_unlock_ramlist();
> +}
> +
>  static void ram_block_add(RAMBlock *new_block, Error **errp)
>  {
>      const bool noreserve = qemu_ram_is_noreserve(new_block);
>      const bool shared = qemu_ram_is_shared(new_block);
> -    RAMBlock *block;
> -    RAMBlock *last_block = NULL;
>      ram_addr_t old_ram_size, new_ram_size;
>      Error *err = NULL;
>  
> @@ -1846,28 +1881,9 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
>      if (new_ram_size > old_ram_size) {
>          dirty_memory_extend(old_ram_size, new_ram_size);
>      }
> -    /* Keep the list sorted from biggest to smallest block.  Unlike QTAILQ,
> -     * QLIST (which has an RCU-friendly variant) does not have insertion at
> -     * tail, so save the last element in last_block.
> -     */
> -    RAMBLOCK_FOREACH(block) {
> -        last_block = block;
> -        if (block->max_length < new_block->max_length) {
> -            break;
> -        }
> -    }
> -    if (block) {
> -        QLIST_INSERT_BEFORE_RCU(block, new_block, next);
> -    } else if (last_block) {
> -        QLIST_INSERT_AFTER_RCU(last_block, new_block, next);
> -    } else { /* list is empty */
> -        QLIST_INSERT_HEAD_RCU(&ram_list.blocks, new_block, next);
> -    }
> -    ram_list.mru_block = NULL;
>  
> -    /* Write list before version */
> -    smp_wmb();
> -    ram_list.version++;
> +    ram_block_add_list_locked(new_block);
> +
>      qemu_mutex_unlock_ramlist();
>  
>      cpu_physical_memory_set_dirty_range(new_block->offset,
> -- 
> 2.17.1
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [QEMU][PATCHv2 3/8] xen: add pseudo RAM region for grant mappings
  2023-10-25 21:24 ` [QEMU][PATCHv2 3/8] xen: add pseudo RAM region for grant mappings Vikram Garhwal
@ 2023-10-26  1:27   ` Stefano Stabellini
  0 siblings, 0 replies; 28+ messages in thread
From: Stefano Stabellini @ 2023-10-26  1:27 UTC (permalink / raw)
  To: Vikram Garhwal
  Cc: qemu-devel, sstabellini, jgross, Anthony Perard, Paul Durrant,
	Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	Michael S. Tsirkin, Marcel Apfelbaum, open list:X86 Xen CPUs

On Wed, 25 Oct 2023, Vikram Garhwal wrote:
> From: Juergen Gross <jgross@suse.com>
> 
> Add a memory region which can be used to automatically map granted
> memory. It is starting at 0x8000000000000000ULL in order to be able to
> distinguish it from normal RAM.
> 
> For this reason the xen.ram memory region is expanded, which has no
> further impact as it is used just as a container of the real RAM
> regions and now the grant region.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
>  hw/i386/xen/xen-hvm.c           |  3 +++
>  hw/xen/xen-hvm-common.c         |  4 ++--
>  hw/xen/xen-mapcache.c           | 27 +++++++++++++++++++++++++++
>  include/hw/xen/xen-hvm-common.h |  2 ++
>  include/hw/xen/xen_pvdev.h      |  3 +++
>  include/sysemu/xen-mapcache.h   |  3 +++
>  6 files changed, 40 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
> index f42621e674..67a55558a6 100644
> --- a/hw/i386/xen/xen-hvm.c
> +++ b/hw/i386/xen/xen-hvm.c
> @@ -172,6 +172,9 @@ static void xen_ram_init(PCMachineState *pcms,
>                                   x86ms->above_4g_mem_size);
>          memory_region_add_subregion(sysmem, 0x100000000ULL, &ram_hi);
>      }
> +
> +    /* Add grant mappings as a pseudo RAM region. */
> +    ram_grants = *xen_init_grant_ram();
>  }
>  
>  static XenPhysmap *get_physmapping(hwaddr start_addr, ram_addr_t size)
> diff --git a/hw/xen/xen-hvm-common.c b/hw/xen/xen-hvm-common.c
> index 565dc39c8f..b7255977a5 100644
> --- a/hw/xen/xen-hvm-common.c
> +++ b/hw/xen/xen-hvm-common.c
> @@ -9,7 +9,7 @@
>  #include "hw/boards.h"
>  #include "hw/xen/arch_hvm.h"
>  
> -MemoryRegion ram_memory;
> +MemoryRegion ram_memory, ram_grants;
>  
>  void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr,
>                     Error **errp)
> @@ -26,7 +26,7 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr,
>          return;
>      }
>  
> -    if (mr == &ram_memory) {
> +    if (mr == &ram_memory || mr == &ram_grants) {
>          return;
>      }
>  
> diff --git a/hw/xen/xen-mapcache.c b/hw/xen/xen-mapcache.c
> index f7d974677d..8115c44c00 100644
> --- a/hw/xen/xen-mapcache.c
> +++ b/hw/xen/xen-mapcache.c
> @@ -14,7 +14,9 @@
>  
>  #include <sys/resource.h>
>  
> +#include "hw/xen/xen-hvm-common.h"
>  #include "hw/xen/xen_native.h"
> +#include "hw/xen/xen_pvdev.h"
>  #include "qemu/bitmap.h"
>  
>  #include "sysemu/runstate.h"
> @@ -597,3 +599,28 @@ uint8_t *xen_replace_cache_entry(hwaddr old_phys_addr,
>      mapcache_unlock();
>      return p;
>  }
> +
> +MemoryRegion *xen_init_grant_ram(void)
> +{
> +    RAMBlock *block;
> +
> +    memory_region_init(&ram_grants, NULL, "xen.grants",
> +                       XEN_MAX_VIRTIO_GRANTS * XC_PAGE_SIZE);
> +    block = g_malloc0(sizeof(*block));
> +    block->mr = &ram_grants;
> +    block->used_length = XEN_MAX_VIRTIO_GRANTS * XC_PAGE_SIZE;
> +    block->max_length = XEN_MAX_VIRTIO_GRANTS * XC_PAGE_SIZE;
> +    block->fd = -1;
> +    block->page_size = XC_PAGE_SIZE;
> +    block->host = (void *)XEN_GRANT_ADDR_OFF;
> +    block->offset = XEN_GRANT_ADDR_OFF;
> +    block->flags = RAM_PREALLOC;
> +    ram_grants.ram_block = block;
> +    ram_grants.ram = true;
> +    ram_grants.terminates = true;
> +    ram_block_add_list(block);
> +    memory_region_add_subregion(get_system_memory(), XEN_GRANT_ADDR_OFF,
> +                                &ram_grants);
> +
> +    return &ram_grants;
> +}
> diff --git a/include/hw/xen/xen-hvm-common.h b/include/hw/xen/xen-hvm-common.h
> index 4e9904f1a6..0d300ba898 100644
> --- a/include/hw/xen/xen-hvm-common.h
> +++ b/include/hw/xen/xen-hvm-common.h
> @@ -17,6 +17,8 @@
>  #include <xen/hvm/ioreq.h>
>  
>  extern MemoryRegion ram_memory;
> +
> +extern MemoryRegion ram_grants;
>  extern MemoryListener xen_io_listener;
>  extern DeviceListener xen_device_listener;
>  
> diff --git a/include/hw/xen/xen_pvdev.h b/include/hw/xen/xen_pvdev.h
> index ddad4b9f36..0f1b5edfa9 100644
> --- a/include/hw/xen/xen_pvdev.h
> +++ b/include/hw/xen/xen_pvdev.h
> @@ -80,4 +80,7 @@ int xen_pv_send_notify(struct XenLegacyDevice *xendev);
>  void xen_pv_printf(struct XenLegacyDevice *xendev, int msg_level,
>                     const char *fmt, ...)  G_GNUC_PRINTF(3, 4);
>  
> +#define XEN_GRANT_ADDR_OFF    0x8000000000000000ULL
> +#define XEN_MAX_VIRTIO_GRANTS 65536
> +
>  #endif /* QEMU_HW_XEN_PVDEV_H */
> diff --git a/include/sysemu/xen-mapcache.h b/include/sysemu/xen-mapcache.h
> index c8e7c2f6cf..f4bedb1c11 100644
> --- a/include/sysemu/xen-mapcache.h
> +++ b/include/sysemu/xen-mapcache.h
> @@ -10,6 +10,7 @@
>  #define XEN_MAPCACHE_H
>  
>  #include "exec/cpu-common.h"
> +#include "exec/ram_addr.h"
>  
>  typedef hwaddr (*phys_offset_to_gaddr_t)(hwaddr phys_offset,
>                                           ram_addr_t size);
> @@ -25,6 +26,8 @@ void xen_invalidate_map_cache(void);
>  uint8_t *xen_replace_cache_entry(hwaddr old_phys_addr,
>                                   hwaddr new_phys_addr,
>                                   hwaddr size);
> +MemoryRegion *xen_init_grant_ram(void);
> +
>  #else
>  
>  static inline void xen_map_cache_init(phys_offset_to_gaddr_t f,
> -- 
> 2.17.1
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [QEMU][PATCHv2 4/8] softmmu: let qemu_map_ram_ptr() use qemu_ram_ptr_length()
  2023-10-25 21:24 ` [QEMU][PATCHv2 4/8] softmmu: let qemu_map_ram_ptr() use qemu_ram_ptr_length() Vikram Garhwal
@ 2023-10-26  1:28   ` Stefano Stabellini
  0 siblings, 0 replies; 28+ messages in thread
From: Stefano Stabellini @ 2023-10-26  1:28 UTC (permalink / raw)
  To: Vikram Garhwal
  Cc: qemu-devel, sstabellini, jgross, Paolo Bonzini, Peter Xu,
	David Hildenbrand, Philippe Mathieu-Daudé

On Wed, 25 Oct 2023, Vikram Garhwal wrote:
> From: Juergen Gross <jgross@suse.com>
> 
> qemu_map_ram_ptr() and qemu_ram_ptr_length() share quite some code, so
> modify qemu_ram_ptr_length() a little bit and use it for
> qemu_map_ram_ptr(), too.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
>  system/physmem.c | 58 +++++++++++++++++++-----------------------------
>  1 file changed, 23 insertions(+), 35 deletions(-)
> 
> diff --git a/system/physmem.c b/system/physmem.c
> index 7a7f95b8b9..667a695078 100644
> --- a/system/physmem.c
> +++ b/system/physmem.c
> @@ -2163,38 +2163,8 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length)
>  }
>  #endif /* !_WIN32 */
>  
> -/* Return a host pointer to ram allocated with qemu_ram_alloc.
> - * This should not be used for general purpose DMA.  Use address_space_map
> - * or address_space_rw instead. For local memory (e.g. video ram) that the
> - * device owns, use memory_region_get_ram_ptr.
> - *
> - * Called within RCU critical section.
> - */
> -void *qemu_map_ram_ptr(RAMBlock *ram_block, ram_addr_t addr)
> -{
> -    RAMBlock *block = ram_block;
> -
> -    if (block == NULL) {
> -        block = qemu_get_ram_block(addr);
> -        addr -= block->offset;
> -    }
> -
> -    if (xen_enabled() && block->host == NULL) {
> -        /* We need to check if the requested address is in the RAM
> -         * because we don't want to map the entire memory in QEMU.
> -         * In that case just map until the end of the page.
> -         */
> -        if (block->offset == 0) {
> -            return xen_map_cache(addr, 0, 0, false);
> -        }
> -
> -        block->host = xen_map_cache(block->offset, block->max_length, 1, false);
> -    }
> -    return ramblock_ptr(block, addr);
> -}
> -
> -/* Return a host pointer to guest's ram. Similar to qemu_map_ram_ptr
> - * but takes a size argument.
> +/*
> + * Return a host pointer to guest's ram.
>   *
>   * Called within RCU critical section.
>   */
> @@ -2202,7 +2172,9 @@ static void *qemu_ram_ptr_length(RAMBlock *ram_block, ram_addr_t addr,
>                                   hwaddr *size, bool lock)
>  {
>      RAMBlock *block = ram_block;
> -    if (*size == 0) {
> +    hwaddr len = 0;
> +
> +    if (size && *size == 0) {
>          return NULL;
>      }
>  
> @@ -2210,7 +2182,10 @@ static void *qemu_ram_ptr_length(RAMBlock *ram_block, ram_addr_t addr,
>          block = qemu_get_ram_block(addr);
>          addr -= block->offset;
>      }
> -    *size = MIN(*size, block->max_length - addr);
> +    if (size) {
> +        *size = MIN(*size, block->max_length - addr);
> +        len = *size;
> +    }
>  
>      if (xen_enabled() && block->host == NULL) {
>          /* We need to check if the requested address is in the RAM
> @@ -2218,7 +2193,7 @@ static void *qemu_ram_ptr_length(RAMBlock *ram_block, ram_addr_t addr,
>           * In that case just map the requested area.
>           */
>          if (block->offset == 0) {
> -            return xen_map_cache(addr, *size, lock, lock);
> +            return xen_map_cache(addr, len, lock, lock);
>          }
>  
>          block->host = xen_map_cache(block->offset, block->max_length, 1, lock);
> @@ -2227,6 +2202,19 @@ static void *qemu_ram_ptr_length(RAMBlock *ram_block, ram_addr_t addr,
>      return ramblock_ptr(block, addr);
>  }
>  
> +/*
> + * Return a host pointer to ram allocated with qemu_ram_alloc.
> + * This should not be used for general purpose DMA.  Use address_space_map
> + * or address_space_rw instead. For local memory (e.g. video ram) that the
> + * device owns, use memory_region_get_ram_ptr.
> + *
> + * Called within RCU critical section.
> + */
> +void *qemu_map_ram_ptr(RAMBlock *ram_block, ram_addr_t addr)
> +{
> +    return qemu_ram_ptr_length(ram_block, addr, NULL, false);
> +}
> +
>  /* Return the offset of a hostpointer within a ramblock */
>  ram_addr_t qemu_ram_block_host_offset(RAMBlock *rb, void *host)
>  {
> -- 
> 2.17.1
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [QEMU][PATCHv2 7/8] xen: add map and unmap callbacks for grant region
  2023-10-25 21:24 ` [QEMU][PATCHv2 7/8] xen: add map and unmap callbacks for grant region Vikram Garhwal
@ 2023-10-26  1:32   ` Stefano Stabellini
  2023-10-26  4:35     ` Vikram Garhwal
  0 siblings, 1 reply; 28+ messages in thread
From: Stefano Stabellini @ 2023-10-26  1:32 UTC (permalink / raw)
  To: Vikram Garhwal
  Cc: qemu-devel, sstabellini, jgross, Anthony Perard, Paul Durrant,
	Paolo Bonzini, Peter Xu, David Hildenbrand,
	Philippe Mathieu-Daudé, open list:X86 Xen CPUs

On Wed, 25 Oct 2023, Vikram Garhwal wrote:
> From: Juergen Gross <jgross@suse.com>
> 
> Add the callbacks for mapping/unmapping guest memory via grants to the
> special grant memory region.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
> ---
>  hw/xen/xen-mapcache.c | 175 +++++++++++++++++++++++++++++++++++++++++-
>  system/physmem.c      |  11 ++-
>  2 files changed, 181 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/xen/xen-mapcache.c b/hw/xen/xen-mapcache.c
> index 8a61c7dde6..feb4a3b886 100644
> --- a/hw/xen/xen-mapcache.c
> +++ b/hw/xen/xen-mapcache.c
> @@ -9,6 +9,8 @@
>   */
>  
>  #include "qemu/osdep.h"
> +#include "qemu/queue.h"
> +#include "qemu/thread.h"
>  #include "qemu/units.h"
>  #include "qemu/error-report.h"
>  
> @@ -23,6 +25,8 @@
>  #include "sysemu/xen-mapcache.h"
>  #include "trace.h"
>  
> +#include <xenevtchn.h>
> +#include <xengnttab.h>
>  
>  //#define MAPCACHE_DEBUG
>  
> @@ -385,7 +389,7 @@ uint8_t *xen_map_cache(hwaddr phys_addr, hwaddr size,
>      return p;
>  }
>  
> -ram_addr_t xen_ram_addr_from_mapcache(void *ptr)
> +static ram_addr_t xen_ram_addr_from_mapcache_try(void *ptr)
>  {
>      MapCacheEntry *entry = NULL;
>      MapCacheRev *reventry;
> @@ -594,10 +598,178 @@ uint8_t *xen_replace_cache_entry(hwaddr old_phys_addr,
>      return p;
>  }
>  
> +struct XENMappedGrantRegion {
> +    void *addr;
> +    unsigned int pages;
> +    unsigned int refs;
> +    unsigned int prot;
> +    uint32_t idx;
> +    QLIST_ENTRY(XENMappedGrantRegion) list;
> +};
> +
> +static xengnttab_handle *xen_region_gnttabdev;
> +static QLIST_HEAD(GrantRegionList, XENMappedGrantRegion) xen_grant_mappings =
> +    QLIST_HEAD_INITIALIZER(xen_grant_mappings);
> +static QemuMutex xen_map_mutex;
> +
> +static void *xen_map_grant_dyn(MemoryRegion **mr, hwaddr addr, hwaddr *plen,
> +                               bool is_write, MemTxAttrs attrs)
> +{
> +    unsigned int page_off = addr & (XC_PAGE_SIZE - 1);
> +    unsigned int i;
> +    unsigned int total_grants = 0;
> +    unsigned int nrefs = (page_off + *plen + XC_PAGE_SIZE - 1) >> XC_PAGE_SHIFT;
> +    uint32_t ref = (addr - XEN_GRANT_ADDR_OFF) >> XC_PAGE_SHIFT;
> +    uint32_t *refs = NULL;
> +    unsigned int prot = PROT_READ;
> +    struct XENMappedGrantRegion *mgr = NULL;
> +
> +    if (is_write) {
> +        prot |= PROT_WRITE;
> +    }
> +
> +    qemu_mutex_lock(&xen_map_mutex);
> +
> +    QLIST_FOREACH(mgr, &xen_grant_mappings, list) {
> +        if (mgr->idx == ref &&
> +            mgr->pages == nrefs &&
> +            (mgr->prot & prot) == prot) {
> +            break;
> +        }
> +
> +        total_grants += mgr->pages;
> +    }
> +
> +    if (!mgr) {
> +        if (nrefs + total_grants >= XEN_MAX_VIRTIO_GRANTS) {
> +            return NULL;

missing qemu_mutex_unlock


> +        }
> +
> +        mgr = g_new(struct XENMappedGrantRegion, 1);
> +
> +        if (nrefs == 1) {
> +            refs = &ref;
> +        } else {
> +            refs = g_new(uint32_t, nrefs);
> +            for (i = 0; i < nrefs; i++) {
> +                refs[i] = ref + i;
> +            }
> +        }
> +        mgr->addr = xengnttab_map_domain_grant_refs(xen_region_gnttabdev, nrefs,
> +                                                    xen_domid, refs, prot);
> +        if (mgr->addr) {
> +            mgr->pages = nrefs;
> +            mgr->refs = 1;
> +            mgr->prot = prot;
> +            mgr->idx = ref;
> +
> +            QLIST_INSERT_HEAD(&xen_grant_mappings, mgr, list);
> +        } else {
> +            g_free(mgr);
> +            mgr = NULL;
> +        }
> +    } else {
> +        mgr->refs++;
> +    }
> +
> +    qemu_mutex_unlock(&xen_map_mutex);
> +
> +    if (nrefs > 1) {
> +        g_free(refs);
> +    }
> +
> +    return mgr ? mgr->addr + page_off : NULL;
> +}
> +
> +static void xen_unmap_grant_dyn(MemoryRegion *mr, void *buffer, ram_addr_t addr,
> +                                hwaddr len, bool is_write, hwaddr access_len)
> +{
> +    unsigned int page_off = (unsigned long)buffer & (XC_PAGE_SIZE - 1);
> +    unsigned int nrefs = (page_off + len + XC_PAGE_SIZE - 1) >> XC_PAGE_SHIFT;
> +    unsigned int prot = PROT_READ;
> +    struct XENMappedGrantRegion *mgr = NULL;
> +
> +    if (is_write) {
> +        prot |= PROT_WRITE;
> +    }
> +
> +    qemu_mutex_lock(&xen_map_mutex);
> +
> +    QLIST_FOREACH(mgr, &xen_grant_mappings, list) {
> +        if (mgr->addr == buffer - page_off &&
> +            mgr->pages == nrefs &&
> +            (mgr->prot & prot) == prot) {
> +            break;
> +        }
> +    }
> +    if (mgr) {
> +        mgr->refs--;
> +        if (!mgr->refs) {
> +            xengnttab_unmap(xen_region_gnttabdev, mgr->addr, nrefs);
> +
> +            QLIST_REMOVE(mgr, list);
> +            g_free(mgr);
> +        }
> +    } else {
> +        error_report("xen_unmap_grant_dyn() trying to unmap unknown buffer");
> +    }
> +
> +    qemu_mutex_unlock(&xen_map_mutex);
> +}
> +
> +static ram_addr_t xen_ram_addr_from_grant_cache(void *ptr)
> +{
> +    unsigned int page_off = (unsigned long)ptr & (XC_PAGE_SIZE - 1);
> +    struct XENMappedGrantRegion *mgr = NULL;
> +    ram_addr_t raddr = RAM_ADDR_INVALID;
> +
> +    qemu_mutex_lock(&xen_map_mutex);
> +
> +    QLIST_FOREACH(mgr, &xen_grant_mappings, list) {
> +        if (mgr->addr == ptr - page_off) {
> +            break;
> +        }
> +    }
> +
> +    if (mgr) {
> +        raddr = (mgr->idx << XC_PAGE_SHIFT) + page_off + XEN_GRANT_ADDR_OFF;
> +    }
> +
> +    qemu_mutex_unlock(&xen_map_mutex);
> +
> +    return raddr;
> +}
> +
> +ram_addr_t xen_ram_addr_from_mapcache(void *ptr)
> +{
> +    ram_addr_t raddr;
> +
> +    raddr = xen_ram_addr_from_mapcache_try(ptr);
> +    if (raddr == RAM_ADDR_INVALID) {
> +        raddr = xen_ram_addr_from_grant_cache(ptr);
> +    }
> +
> +    return raddr;
> +}
> +
> +static const struct MemoryRegionOps xen_grant_mr_ops = {
> +    .map = xen_map_grant_dyn,
> +    .unmap = xen_unmap_grant_dyn,
> +    .endianness = DEVICE_LITTLE_ENDIAN,
> +};
> +
>  MemoryRegion *xen_init_grant_ram(void)
>  {
>      RAMBlock *block;
>  
> +    qemu_mutex_init(&xen_map_mutex);
> +
> +    xen_region_gnttabdev = xengnttab_open(NULL, 0);
> +    if (xen_region_gnttabdev == NULL) {
> +        fprintf(stderr, "can't open gnttab device\n");
> +        return NULL;
> +    }
> +
>      memory_region_init(&ram_grants, NULL, "xen.grants",
>                         XEN_MAX_VIRTIO_GRANTS * XC_PAGE_SIZE);
>      block = g_malloc0(sizeof(*block));
> @@ -612,6 +784,7 @@ MemoryRegion *xen_init_grant_ram(void)
>      ram_grants.ram_block = block;
>      ram_grants.ram = true;
>      ram_grants.terminates = true;
> +    ram_grants.ops = &xen_grant_mr_ops;
>      ram_block_add_list(block);
>      memory_region_add_subregion(get_system_memory(), XEN_GRANT_ADDR_OFF,
>                                  &ram_grants);
> diff --git a/system/physmem.c b/system/physmem.c
> index 5db1b32823..155a8c05fb 100644
> --- a/system/physmem.c
> +++ b/system/physmem.c
> @@ -2233,13 +2233,16 @@ RAMBlock *qemu_ram_block_from_host(void *ptr, bool round_offset,
>  
>      if (xen_enabled()) {
>          ram_addr_t ram_addr;
> +
>          RCU_READ_LOCK_GUARD();
>          ram_addr = xen_ram_addr_from_mapcache(ptr);
> -        block = qemu_get_ram_block(ram_addr);
> -        if (block) {
> -            *offset = ram_addr - block->offset;
> +        if (ram_addr != RAM_ADDR_INVALID) {
> +            block = qemu_get_ram_block(ram_addr);
> +            if (block) {
> +                *offset = ram_addr - block->offset;
> +            }
> +            return block;
>          }
> -        return block;
>      }
>  
>      RCU_READ_LOCK_GUARD();
> -- 
> 2.17.1
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [QEMU][PATCHv2 7/8] xen: add map and unmap callbacks for grant region
  2023-10-26  1:32   ` Stefano Stabellini
@ 2023-10-26  4:35     ` Vikram Garhwal
  0 siblings, 0 replies; 28+ messages in thread
From: Vikram Garhwal @ 2023-10-26  4:35 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: qemu-devel, jgross, Anthony Perard, Paul Durrant, Paolo Bonzini,
	Peter Xu, David Hildenbrand, Philippe Mathieu-Daudé,
	open list:X86 Xen CPUs

On Wed, Oct 25, 2023 at 06:32:26PM -0700, Stefano Stabellini wrote:
> On Wed, 25 Oct 2023, Vikram Garhwal wrote:
> > From: Juergen Gross <jgross@suse.com>
> > 
> > Add the callbacks for mapping/unmapping guest memory via grants to the
> > special grant memory region.
> > 
> > Signed-off-by: Juergen Gross <jgross@suse.com>
> > Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
> > ---
> >  hw/xen/xen-mapcache.c | 175 +++++++++++++++++++++++++++++++++++++++++-
> >  system/physmem.c      |  11 ++-
> >  2 files changed, 181 insertions(+), 5 deletions(-)
> > 
> > diff --git a/hw/xen/xen-mapcache.c b/hw/xen/xen-mapcache.c
> > index 8a61c7dde6..feb4a3b886 100644
> > --- a/hw/xen/xen-mapcache.c
> > +++ b/hw/xen/xen-mapcache.c
> > @@ -9,6 +9,8 @@
> >   */
> >  
> >  #include "qemu/osdep.h"
> > +#include "qemu/queue.h"
> > +#include "qemu/thread.h"
> >  #include "qemu/units.h"
> >  #include "qemu/error-report.h"
> >  
> > @@ -23,6 +25,8 @@
> >  #include "sysemu/xen-mapcache.h"
> >  #include "trace.h"
> >  
> > +#include <xenevtchn.h>
> > +#include <xengnttab.h>
> >  
> >  //#define MAPCACHE_DEBUG
> >  
> > @@ -385,7 +389,7 @@ uint8_t *xen_map_cache(hwaddr phys_addr, hwaddr size,
> >      return p;
> >  }
> >  
> > -ram_addr_t xen_ram_addr_from_mapcache(void *ptr)
> > +static ram_addr_t xen_ram_addr_from_mapcache_try(void *ptr)
> >  {
> >      MapCacheEntry *entry = NULL;
> >      MapCacheRev *reventry;
> > @@ -594,10 +598,178 @@ uint8_t *xen_replace_cache_entry(hwaddr old_phys_addr,
> >      return p;
> >  }
> >  
> > +struct XENMappedGrantRegion {
> > +    void *addr;
> > +    unsigned int pages;
> > +    unsigned int refs;
> > +    unsigned int prot;
> > +    uint32_t idx;
> > +    QLIST_ENTRY(XENMappedGrantRegion) list;
> > +};
> > +
> > +static xengnttab_handle *xen_region_gnttabdev;
> > +static QLIST_HEAD(GrantRegionList, XENMappedGrantRegion) xen_grant_mappings =
> > +    QLIST_HEAD_INITIALIZER(xen_grant_mappings);
> > +static QemuMutex xen_map_mutex;
> > +
> > +static void *xen_map_grant_dyn(MemoryRegion **mr, hwaddr addr, hwaddr *plen,
> > +                               bool is_write, MemTxAttrs attrs)
> > +{
> > +    unsigned int page_off = addr & (XC_PAGE_SIZE - 1);
> > +    unsigned int i;
> > +    unsigned int total_grants = 0;
> > +    unsigned int nrefs = (page_off + *plen + XC_PAGE_SIZE - 1) >> XC_PAGE_SHIFT;
> > +    uint32_t ref = (addr - XEN_GRANT_ADDR_OFF) >> XC_PAGE_SHIFT;
> > +    uint32_t *refs = NULL;
> > +    unsigned int prot = PROT_READ;
> > +    struct XENMappedGrantRegion *mgr = NULL;
> > +
> > +    if (is_write) {
> > +        prot |= PROT_WRITE;
> > +    }
> > +
> > +    qemu_mutex_lock(&xen_map_mutex);
> > +
> > +    QLIST_FOREACH(mgr, &xen_grant_mappings, list) {
> > +        if (mgr->idx == ref &&
> > +            mgr->pages == nrefs &&
> > +            (mgr->prot & prot) == prot) {
> > +            break;
> > +        }
> > +
> > +        total_grants += mgr->pages;
> > +    }
> > +
> > +    if (!mgr) {
> > +        if (nrefs + total_grants >= XEN_MAX_VIRTIO_GRANTS) {
> > +            return NULL;
> 
> missing qemu_mutex_unlock
Oops, thanks for catching this! Will fix it in v3.
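
For reference, a minimal sketch of how the v3 fix could look (simply releasing
the mutex before bailing out; this is not the actual v3 hunk):

     if (!mgr) {
         if (nrefs + total_grants >= XEN_MAX_VIRTIO_GRANTS) {
+            qemu_mutex_unlock(&xen_map_mutex);
             return NULL;
         }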
> 
> 
> > +        }
> > +
> > +        mgr = g_new(struct XENMappedGrantRegion, 1);
> > +
> > +        if (nrefs == 1) {
> > +            refs = &ref;
> > +        } else {
> > +            refs = g_new(uint32_t, nrefs);
> > +            for (i = 0; i < nrefs; i++) {
> > +                refs[i] = ref + i;
> > +            }
> > +        }
> > +        mgr->addr = xengnttab_map_domain_grant_refs(xen_region_gnttabdev, nrefs,
> > +                                                    xen_domid, refs, prot);
> > +        if (mgr->addr) {
> > +            mgr->pages = nrefs;
> > +            mgr->refs = 1;
> > +            mgr->prot = prot;
> > +            mgr->idx = ref;
> > +
> > +            QLIST_INSERT_HEAD(&xen_grant_mappings, mgr, list);
> > +        } else {
> > +            g_free(mgr);
> > +            mgr = NULL;
> > +        }
> > +    } else {
> > +        mgr->refs++;
> > +    }
> > +
> > +    qemu_mutex_unlock(&xen_map_mutex);
> > +
> > +    if (nrefs > 1) {
> > +        g_free(refs);
> > +    }
> > +
> > +    return mgr ? mgr->addr + page_off : NULL;
> > +}
> > +
> > +static void xen_unmap_grant_dyn(MemoryRegion *mr, void *buffer, ram_addr_t addr,
> > +                                hwaddr len, bool is_write, hwaddr access_len)
> > +{
> > +    unsigned int page_off = (unsigned long)buffer & (XC_PAGE_SIZE - 1);
> > +    unsigned int nrefs = (page_off + len + XC_PAGE_SIZE - 1) >> XC_PAGE_SHIFT;
> > +    unsigned int prot = PROT_READ;
> > +    struct XENMappedGrantRegion *mgr = NULL;
> > +
> > +    if (is_write) {
> > +        prot |= PROT_WRITE;
> > +    }
> > +
> > +    qemu_mutex_lock(&xen_map_mutex);
> > +
> > +    QLIST_FOREACH(mgr, &xen_grant_mappings, list) {
> > +        if (mgr->addr == buffer - page_off &&
> > +            mgr->pages == nrefs &&
> > +            (mgr->prot & prot) == prot) {
> > +            break;
> > +        }
> > +    }
> > +    if (mgr) {
> > +        mgr->refs--;
> > +        if (!mgr->refs) {
> > +            xengnttab_unmap(xen_region_gnttabdev, mgr->addr, nrefs);
> > +
> > +            QLIST_REMOVE(mgr, list);
> > +            g_free(mgr);
> > +        }
> > +    } else {
> > +        error_report("xen_unmap_grant_dyn() trying to unmap unknown buffer");
> > +    }
> > +
> > +    qemu_mutex_unlock(&xen_map_mutex);
> > +}
> > +
> > +static ram_addr_t xen_ram_addr_from_grant_cache(void *ptr)
> > +{
> > +    unsigned int page_off = (unsigned long)ptr & (XC_PAGE_SIZE - 1);
> > +    struct XENMappedGrantRegion *mgr = NULL;
> > +    ram_addr_t raddr = RAM_ADDR_INVALID;
> > +
> > +    qemu_mutex_lock(&xen_map_mutex);
> > +
> > +    QLIST_FOREACH(mgr, &xen_grant_mappings, list) {
> > +        if (mgr->addr == ptr - page_off) {
> > +            break;
> > +        }
> > +    }
> > +
> > +    if (mgr) {
> > +        raddr = (mgr->idx << XC_PAGE_SHIFT) + page_off + XEN_GRANT_ADDR_OFF;
> > +    }
> > +
> > +    qemu_mutex_unlock(&xen_map_mutex);
> > +
> > +    return raddr;
> > +}
> > +
> > +ram_addr_t xen_ram_addr_from_mapcache(void *ptr)
> > +{
> > +    ram_addr_t raddr;
> > +
> > +    raddr = xen_ram_addr_from_mapcache_try(ptr);
> > +    if (raddr == RAM_ADDR_INVALID) {
> > +        raddr = xen_ram_addr_from_grant_cache(ptr);
> > +    }
> > +
> > +    return raddr;
> > +}
> > +
> > +static const struct MemoryRegionOps xen_grant_mr_ops = {
> > +    .map = xen_map_grant_dyn,
> > +    .unmap = xen_unmap_grant_dyn,
> > +    .endianness = DEVICE_LITTLE_ENDIAN,
> > +};
> > +
> >  MemoryRegion *xen_init_grant_ram(void)
> >  {
> >      RAMBlock *block;
> >  
> > +    qemu_mutex_init(&xen_map_mutex);
> > +
> > +    xen_region_gnttabdev = xengnttab_open(NULL, 0);
> > +    if (xen_region_gnttabdev == NULL) {
> > +        fprintf(stderr, "can't open gnttab device\n");
> > +        return NULL;
> > +    }
> > +
> >      memory_region_init(&ram_grants, NULL, "xen.grants",
> >                         XEN_MAX_VIRTIO_GRANTS * XC_PAGE_SIZE);
> >      block = g_malloc0(sizeof(*block));
> > @@ -612,6 +784,7 @@ MemoryRegion *xen_init_grant_ram(void)
> >      ram_grants.ram_block = block;
> >      ram_grants.ram = true;
> >      ram_grants.terminates = true;
> > +    ram_grants.ops = &xen_grant_mr_ops;
> >      ram_block_add_list(block);
> >      memory_region_add_subregion(get_system_memory(), XEN_GRANT_ADDR_OFF,
> >                                  &ram_grants);
> > diff --git a/system/physmem.c b/system/physmem.c
> > index 5db1b32823..155a8c05fb 100644
> > --- a/system/physmem.c
> > +++ b/system/physmem.c
> > @@ -2233,13 +2233,16 @@ RAMBlock *qemu_ram_block_from_host(void *ptr, bool round_offset,
> >  
> >      if (xen_enabled()) {
> >          ram_addr_t ram_addr;
> > +
> >          RCU_READ_LOCK_GUARD();
> >          ram_addr = xen_ram_addr_from_mapcache(ptr);
> > -        block = qemu_get_ram_block(ram_addr);
> > -        if (block) {
> > -            *offset = ram_addr - block->offset;
> > +        if (ram_addr != RAM_ADDR_INVALID) {
> > +            block = qemu_get_ram_block(ram_addr);
> > +            if (block) {
> > +                *offset = ram_addr - block->offset;
> > +            }
> > +            return block;
> >          }
> > -        return block;
> >      }
> >  
> >      RCU_READ_LOCK_GUARD();
> > -- 
> > 2.17.1
> > 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [QEMU][PATCHv2 1/8] xen: when unplugging emulated devices skip virtio devices
  2023-10-26  1:23     ` Stefano Stabellini
@ 2023-10-26 15:45       ` David Woodhouse
  2023-10-26 17:13         ` Vikram Garhwal
  0 siblings, 1 reply; 28+ messages in thread
From: David Woodhouse @ 2023-10-26 15:45 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Vikram Garhwal, qemu-devel, jgross, Anthony Perard, Paul Durrant,
	Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	Michael S. Tsirkin, Marcel Apfelbaum, open list:X86 Xen CPUs

[-- Attachment #1: Type: text/plain, Size: 1243 bytes --]

On Wed, 2023-10-25 at 18:23 -0700, Stefano Stabellini wrote:
> On Thu, 26 Oct 2023, David Woodhouse wrote:
> > On Wed, 2023-10-25 at 14:24 -0700, Vikram Garhwal wrote:
> > > From: Juergen Gross <jgross@suse.com>
> > > 
> > > Virtio devices should never be unplugged at boot time, as they are
> > > similar to pci passthrough devices.
> > > 
> > > Signed-off-by: Juergen Gross <jgross@suse.com>
> > > Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
> > 
> > Hm, do your virtio NICs still actually *work* after that? Or are they
> > all disconnected from their netdev peers? 
> > 
> > I suspect you're going to want a variant of
> > https://lore.kernel.org/qemu-devel/20231025145042.627381-19-dwmw2@infradead.org/T/#u
> > which also leave the peers of your virtio devices intact?
> 
> Hi David, device unplug is an x86-only thing (see the definition of
> xen_emul_unplug in Linux under arch/x86/xen/platform-pci-unplug.c) I
> suspect Vikram who is working on ARM hasn't tested it.

Ah, I had assumed there was something else coming along later which
would make it actually get used. 

> Vikram, a simple option is to drop this patch if you don't need it.

That works. Although I may revive it in that case. 


[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5965 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [QEMU][PATCHv2 0/8] Xen: support grant mappings.
  2023-10-25 21:24 [QEMU][PATCHv2 0/8] Xen: support grant mappings Vikram Garhwal
                   ` (7 preceding siblings ...)
  2023-10-25 21:24 ` [QEMU][PATCHv2 8/8] hw: arm: Add grant mapping Vikram Garhwal
@ 2023-10-26 16:12 ` David Woodhouse
  2023-10-26 18:07   ` Stefano Stabellini
  8 siblings, 1 reply; 28+ messages in thread
From: David Woodhouse @ 2023-10-26 16:12 UTC (permalink / raw)
  To: Vikram Garhwal, qemu-devel; +Cc: sstabellini, jgross

[-- Attachment #1: Type: text/plain, Size: 1325 bytes --]

On Wed, 2023-10-25 at 14:24 -0700, Vikram Garhwal wrote:
> Hi,
> This patch series add support for grant mappings as a pseudo RAM region for Xen.
> 
> Enabling grant mappings patches(first 6) are written by Juergen in 2021.
> 
> QEMU Virtio device provides an emulated backends for Virtio frontned devices
> in Xen.
> Please set "iommu_platform=on" option when invoking QEMU. As this will set
> VIRTIO_F_ACCESS_PLATFORM feature which will be used by virtio frontend in Xen
> to know whether backend supports grants or not.

I don't really understand what's going on here. The subject of the
cover letter certainly doesn't help me, because we *already* support
grant mappings under Xen, don't we?

I found
https://static.linaro.org/connect/lvc21/presentations/lvc21-314.pdf but
I think it's a bit out of date; the decision about how to handle grant
mappings for virtio devices is still 'TBD'.

Can you talk me through the process of what happens when a guest wants
a virtio device to initiate 'DMA' to one of its pages? I assume it
starts by creating a grant mapping, and then taking the gntref and...
then what?

I don't see any changes to the virtio devices themselves in this
series; are we doing something that will make it work by magic? If so,
it might be useful to explain that magic...

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5965 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [QEMU][PATCHv2 1/8] xen: when unplugging emulated devices skip virtio devices
  2023-10-26 15:45       ` David Woodhouse
@ 2023-10-26 17:13         ` Vikram Garhwal
  0 siblings, 0 replies; 28+ messages in thread
From: Vikram Garhwal @ 2023-10-26 17:13 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Stefano Stabellini, qemu-devel, jgross, Anthony Perard,
	Paul Durrant, Paolo Bonzini, Richard Henderson, Eduardo Habkost,
	Michael S. Tsirkin, Marcel Apfelbaum, open list:X86 Xen CPUs

On Thu, Oct 26, 2023 at 04:45:21PM +0100, David Woodhouse wrote:
> On Wed, 2023-10-25 at 18:23 -0700, Stefano Stabellini wrote:
> > On Thu, 26 Oct 2023, David Woodhouse wrote:
> > > On Wed, 2023-10-25 at 14:24 -0700, Vikram Garhwal wrote:
> > > > From: Juergen Gross <jgross@suse.com>
> > > > 
> > > > Virtio devices should never be unplugged at boot time, as they are
> > > > similar to pci passthrough devices.
> > > > 
> > > > Signed-off-by: Juergen Gross <jgross@suse.com>
> > > > Signed-off-by: Vikram Garhwal <vikram.garhwal@amd.com>
> > > 
> > > Hm, do your virtio NICs still actually *work* after that? Or are they
> > > all disconnected from their netdev peers? 
> > > 
> > > I suspect you're going to want a variant of
> > > https://lore.kernel.org/qemu-devel/20231025145042.627381-19-dwmw2@infradead.org/T/#u
> > > which also leave the peers of your virtio devices intact?
> > 
> > Hi David, device unplug is an x86-only thing (see the definition of
> > xen_emul_unplug in Linux under arch/x86/xen/platform-pci-unplug.c) I
> > suspect Vikram who is working on ARM hasn't tested it.
> 
> Ah, I had assumed there was something else coming along later which
> would make it actually get used. 
> 
> > Vikram, a simple option is to drop this patch if you don't need it.
> 
> That works. Although I may revive it in that case. 
> 
Hopefully, Juergen is also okay with dropping the patch. Then, I will remove it
from v3.

Thanks David & Stefano!




^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [QEMU][PATCHv2 0/8] Xen: support grant mappings.
  2023-10-26 16:12 ` [QEMU][PATCHv2 0/8] Xen: support grant mappings David Woodhouse
@ 2023-10-26 18:07   ` Stefano Stabellini
  2023-10-26 20:15     ` David Woodhouse
  0 siblings, 1 reply; 28+ messages in thread
From: Stefano Stabellini @ 2023-10-26 18:07 UTC (permalink / raw)
  To: David Woodhouse; +Cc: Vikram Garhwal, qemu-devel, sstabellini, jgross

On Thu, 26 Oct 2023, David Woodhouse wrote:
> On Wed, 2023-10-25 at 14:24 -0700, Vikram Garhwal wrote:
> > Hi,
> > This patch series add support for grant mappings as a pseudo RAM region for Xen.
> > 
> > Enabling grant mappings patches(first 6) are written by Juergen in 2021.
> > 
> > QEMU Virtio device provides an emulated backends for Virtio frontned devices
> > in Xen.
> > Please set "iommu_platform=on" option when invoking QEMU. As this will set
> > VIRTIO_F_ACCESS_PLATFORM feature which will be used by virtio frontend in Xen
> > to know whether backend supports grants or not.
> 
> I don't really understand what's going on here. The subject of the
> cover letter certainly doesn't help me, because we *already* support
> grant mappings under Xen, don't we?
> 
> I found
> https://static.linaro.org/connect/lvc21/presentations/lvc21-314.pdf but
> I think it's a bit out of date; the decision about how to handle grant
> mappings for virtio devices is still 'TBD'.

See this presentation:
https://www.youtube.com/watch?v=boRQ8UHc760

The patch series is for the guest (e.g. Linux) to use grants to share
memory with virtio devices. The plumbing was already done in Linux a
couple of years ago, but QEMU support for it is still missing.


> Can you talk me through the process of what happens when a guest wants
> to a virtio device to initiate 'DMA' to one of its pages? I assume it
> starts by creating a grant mapping, and then taking the gntref and...
> then what?

First the guest gets a grant reference for the page, then it uses it as
"dma address" to pass to the virtio device. The top bit (1ULL << 63) is
used to signal that the address in question is a grant address.

See in Linux:
drivers/xen/grant-dma-ops.c grant_to_dma, dma_to_grant,
xen_grant_dma_alloc, etc.
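
A minimal, self-contained sketch of that address format (this is not the
Linux implementation; the helper names just mirror the ones mentioned above,
and a 4KiB page size is assumed):

#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT              12
#define PAGE_SIZE               (1ULL << PAGE_SHIFT)
#define XEN_GRANT_DMA_ADDR_OFF  (1ULL << 63)   /* marker: "this is a grant ref" */

/* Encode a grant reference plus an offset within the page as a "dma address". */
static uint64_t grant_to_dma(uint32_t gref, uint64_t page_off)
{
    return XEN_GRANT_DMA_ADDR_OFF | ((uint64_t)gref << PAGE_SHIFT) | page_off;
}

/* Recover the (starting) grant reference from such an address. */
static uint32_t dma_to_grant(uint64_t dma)
{
    return (uint32_t)((dma & ~XEN_GRANT_DMA_ADDR_OFF) >> PAGE_SHIFT);
}

int main(void)
{
    uint64_t dma = grant_to_dma(0x1234, 0x80);

    printf("dma=%#llx gref=%#x in-page offset=%#llx\n",
           (unsigned long long)dma, (unsigned)dma_to_grant(dma),
           (unsigned long long)(dma & (PAGE_SIZE - 1)));
    return 0;
}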


> I don't see any changes to the virtio devices themselves in this
> series; are we doing something that will make it work by magic? If so,
> it might be useful to explain that magic...

Yes something like that :-)
https://marc.info/?l=xen-devel&m=165419780304962&w=2

Vikram, I think it would be a good idea to submit a separate patch to
xen.git to add a document under docs/ to capture this.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [QEMU][PATCHv2 0/8] Xen: support grant mappings.
  2023-10-26 18:07   ` Stefano Stabellini
@ 2023-10-26 20:15     ` David Woodhouse
  2023-10-26 20:36       ` Stefano Stabellini
  0 siblings, 1 reply; 28+ messages in thread
From: David Woodhouse @ 2023-10-26 20:15 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: Vikram Garhwal, qemu-devel, jgross

[-- Attachment #1: Type: text/plain, Size: 2847 bytes --]

On Thu, 2023-10-26 at 11:07 -0700, Stefano Stabellini wrote:
> On Thu, 26 Oct 2023, David Woodhouse wrote:
> > On Wed, 2023-10-25 at 14:24 -0700, Vikram Garhwal wrote:
> > > Hi,
> > > This patch series add support for grant mappings as a pseudo RAM region for Xen.
> > > 
> > > Enabling grant mappings patches(first 6) are written by Juergen in 2021.
> > > 
> > > QEMU Virtio device provides an emulated backends for Virtio frontned devices
> > > in Xen.
> > > Please set "iommu_platform=on" option when invoking QEMU. As this will set
> > > VIRTIO_F_ACCESS_PLATFORM feature which will be used by virtio frontend in Xen
> > > to know whether backend supports grants or not.
> > 
> > I don't really understand what's going on here. The subject of the
> > cover letter certainly doesn't help me, because we *already* support
> > grant mappings under Xen, don't we?
> > 
> > I found
> > https://static.linaro.org/connect/lvc21/presentations/lvc21-314.pdf but
> > I think it's a bit out of date; the decision about how to handle grant
> > mappings for virtio devices is still 'TBD'.
> 
> See this presentation:
> https://www.youtube.com/watch?v=boRQ8UHc760
> 
> The patch series is for the guest (e.g. Linux) to use grants to share
> memory with virtio devices. The plumbing was already done in Linux a
> couple of years ago, but QEMU support for it is still missing.

Thanks.

> > Can you talk me through the process of what happens when a guest wants
> > to a virtio device to initiate 'DMA' to one of its pages? I assume it
> > starts by creating a grant mapping, and then taking the gntref and...
> > then what?
> 
> First the guest gets a grant reference for the page, then it uses it as
> "dma address" to pass to the virtio device. The top bit (1ULL << 63) is
> used to signal that the address in question is a grant address.

OK, so the guest sets the top bit in the DMA address and that indicates
that this is no longer actually a guest physical address, but instead,
bits 62-12 are a (starting) grant ref. (And the address *within* the
page is still bits 0-11, assuming 4KiB pages).

An alternative way of implementing this on the QEMU side would just be
to teach the virtio mapping to recognise the "special" format with the
top bit set, and call qemu_xen_gnttab_map_refs() directly to get the
mapping?

This seems like a lot of code to replace that simpler option... is
there a massive performance win from doing it this way? Would we want
to use this trick for the Xen PV backends (qdisk, qnic) *too*? Might it
make sense to introduce the simple version and *then* the optimisation,
with some clear benchmarking to show the win?

If we're just going to switch to grant mappings, why *aren't* we using
Xen PV device models on Arm anyway?

Or am I missing the point completely?

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5965 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [QEMU][PATCHv2 0/8] Xen: support grant mappings.
  2023-10-26 20:15     ` David Woodhouse
@ 2023-10-26 20:36       ` Stefano Stabellini
  2023-10-26 20:44         ` David Woodhouse
  0 siblings, 1 reply; 28+ messages in thread
From: Stefano Stabellini @ 2023-10-26 20:36 UTC (permalink / raw)
  To: David Woodhouse; +Cc: Stefano Stabellini, Vikram Garhwal, qemu-devel, jgross

[-- Attachment #1: Type: text/plain, Size: 3726 bytes --]

On Thu, 26 Oct 2023, David Woodhouse wrote:
> On Thu, 2023-10-26 at 11:07 -0700, Stefano Stabellini wrote:
> > On Thu, 26 Oct 2023, David Woodhouse wrote:
> > > On Wed, 2023-10-25 at 14:24 -0700, Vikram Garhwal wrote:
> > > > Hi,
> > > > This patch series add support for grant mappings as a pseudo RAM region for Xen.
> > > > 
> > > > Enabling grant mappings patches(first 6) are written by Juergen in 2021.
> > > > 
> > > > QEMU Virtio device provides an emulated backends for Virtio frontned devices
> > > > in Xen.
> > > > Please set "iommu_platform=on" option when invoking QEMU. As this will set
> > > > VIRTIO_F_ACCESS_PLATFORM feature which will be used by virtio frontend in Xen
> > > > to know whether backend supports grants or not.
> > > 
> > > I don't really understand what's going on here. The subject of the
> > > cover letter certainly doesn't help me, because we *already* support
> > > grant mappings under Xen, don't we?
> > > 
> > > I found
> > > https://static.linaro.org/connect/lvc21/presentations/lvc21-314.pdf but
> > > I think it's a bit out of date; the decision about how to handle grant
> > > mappings for virtio devices is still 'TBD'.
> > 
> > See this presentation:
> > https://www.youtube.com/watch?v=boRQ8UHc760
> > 
> > The patch series is for the guest (e.g. Linux) to use grants to share
> > memory with virtio devices. The plumbing was already done in Linux a
> > couple of years ago, but QEMU support for it is still missing.
> 
> Thanks.
> 
> > > Can you talk me through the process of what happens when a guest wants
> > > to a virtio device to initiate 'DMA' to one of its pages? I assume it
> > > starts by creating a grant mapping, and then taking the gntref and...
> > > then what?
> > 
> > First the guest gets a grant reference for the page, then it uses it as
> > "dma address" to pass to the virtio device. The top bit (1ULL << 63) is
> > used to signal that the address in question is a grant address.
> 
> OK, so the guest sets the top bit in the DMA address and that indicates
> that this is no longer actually a guest physical address, but instead,
> bits 62-12 are a (starting) grant ref. (And the address *within* the
> page is still bits 0-11, assuming 4KiB pages).
> 
> An alternative way of implementing this on the QEMU side would just be
> to teach the virtio mapping to recognise the "special" format with the
> top bit set, and call qemu_xen_gnttab_map_refs() directly to get the
> mapping?
> 
> This seems like a lot of code to replace that simpler option... is
> there a massive performance win from doing it this way? Would we want
> to use this trick for the Xen PV backends (qdisk, qnic) *too*? Might it
> make sense to introduce the simple version and *then* the optimisation,
> with some clear benchmarking to show the win?

This is not done for performance but for safety (as in safety
certifications, ISO 26262, etc.). This is to enable unprivileged virtio
backends running in a DomU. By unprivileged I mean a virtio backend that
is unable to map arbitrary memory (the xenforeignmemory interface is
prohibited).

The goal is to run Xen on safety-critical systems such as cars,
industrial robots and more. In this configuration there is no
traditional Dom0 in the system at all. If you  would like to know more:
https://www.youtube.com/watch?v=tisljY6Bqv0&list=PLYyw7IQjL-zHtpYtMpFR3KYdRn0rcp5Xn&index=8


> If we're just going to switch to grant mappings, why *aren't* we using
> Xen PV device models on Arm anyway?

I think you mean Xen PV drivers. Yes we have been using them for a
while. Now we would also like to add the option of running virtio
devices but we can only do that if the virtio backend is unprivileged as
we have no Dom0 in the system.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [QEMU][PATCHv2 0/8] Xen: support grant mappings.
  2023-10-26 20:36       ` Stefano Stabellini
@ 2023-10-26 20:44         ` David Woodhouse
  2023-10-26 20:56           ` Stefano Stabellini
  0 siblings, 1 reply; 28+ messages in thread
From: David Woodhouse @ 2023-10-26 20:44 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: Vikram Garhwal, qemu-devel, jgross

[-- Attachment #1: Type: text/plain, Size: 1639 bytes --]

On Thu, 2023-10-26 at 13:36 -0700, Stefano Stabellini wrote:
> 
> > This seems like a lot of code to replace that simpler option... is
> > there a massive performance win from doing it this way? Would we want
> > to use this trick for the Xen PV backends (qdisk, qnic) *too*? Might it
> > make sense to introduce the simple version and *then* the optimisation,
> > with some clear benchmarking to show the win?
> 
> This is not done for performance but for safety (as in safety
> certifications, ISO 26262, etc.). This is to enable unprivileged virtio
> backends running in a DomU. By unprivileged I mean a virtio backend that
> is unable to map arbitrary memory (the xenforeignmemory interface is
> prohibited).
> 
> The goal is to run Xen on safety-critical systems such as cars,
> industrial robots and more. In this configuration there is no
> traditional Dom0 in the system at all. If you  would like to know more:
> https://www.youtube.com/watch?v=tisljY6Bqv0&list=PLYyw7IQjL-zHtpYtMpFR3KYdRn0rcp5Xn&index=8

Yeah, I understand why we're using grant mappings instead of just
directly having access via foreignmem mappings. That wasn't what I was
confused about.

What I haven't worked out is why we're implementing this through an
automatically-populated MemoryRegion in QEMU, rather than just using
grant mapping ops like we always have.

It seems like a lot of complexity just to avoid calling
qemu_xen_gnttab_map_refs() from the virtio backend.

And if we're going to use this magic region, are we going to start
using it for the Xen PV backends in QEMU too, when they want to map
grant refs? 


[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5965 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [QEMU][PATCHv2 0/8] Xen: support grant mappings.
  2023-10-26 20:44         ` David Woodhouse
@ 2023-10-26 20:56           ` Stefano Stabellini
  2023-10-27  5:27             ` Juergen Gross
  0 siblings, 1 reply; 28+ messages in thread
From: Stefano Stabellini @ 2023-10-26 20:56 UTC (permalink / raw)
  To: David Woodhouse; +Cc: Stefano Stabellini, Vikram Garhwal, qemu-devel, jgross

[-- Attachment #1: Type: text/plain, Size: 2172 bytes --]

On Thu, 26 Oct 2023, David Woodhouse wrote:
> On Thu, 2023-10-26 at 13:36 -0700, Stefano Stabellini wrote:
> > 
> > > This seems like a lot of code to replace that simpler option... is
> > > there a massive performance win from doing it this way? Would we want
> > > to use this trick for the Xen PV backends (qdisk, qnic) *too*? Might it
> > > make sense to introduce the simple version and *then* the optimisation,
> > > with some clear benchmarking to show the win?
> > 
> > This is not done for performance but for safety (as in safety
> > certifications, ISO 26262, etc.). This is to enable unprivileged virtio
> > backends running in a DomU. By unprivileged I mean a virtio backend that
> > is unable to map arbitrary memory (the xenforeignmemory interface is
> > prohibited).
> > 
> > The goal is to run Xen on safety-critical systems such as cars,
> > industrial robots and more. In this configuration there is no
> > traditional Dom0 in the system at all. If you  would like to know more:
> > https://www.youtube.com/watch?v=tisljY6Bqv0&list=PLYyw7IQjL-zHtpYtMpFR3KYdRn0rcp5Xn&index=8
> 
> Yeah, I understand why we're using grant mappings instead of just
> directly having access via foreignmem mappings. That wasn't what I was
> confused about.
> 
> What I haven't worked out is why we're implementing this through an
> automatically-populated MemoryRegion in QEMU, rather than just using
> grant mapping ops like we always have.
> 
> It seems like a lot of complexity just to avoid calling
> qemu_xen_gnttab_map_refs() from the virtio backend.

I think there are two questions here. One question is "Why do we need
all the new grant mapping code added to xen-mapcache.c in patch #7?
Can't we use qemu_xen_gnttab_map_refs() instead?"

Good question, I'll let Juergen and Vikram comment as original authors.

And the second question is where to call the grant mapping from (whether
it is the new code or the old qemu_xen_gnttab_map_refs() code doesn't
matter). It could be even simpler than calling it from the virtio
backends: we could just call it from the existing qemu_ram_ptr_length()
hook. See this discussion:
https://marc.info/?l=qemu-devel&m=169828434927778

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [QEMU][PATCHv2 0/8] Xen: support grant mappings.
  2023-10-26 20:56           ` Stefano Stabellini
@ 2023-10-27  5:27             ` Juergen Gross
  2023-11-13 20:24               ` David Woodhouse
  0 siblings, 1 reply; 28+ messages in thread
From: Juergen Gross @ 2023-10-27  5:27 UTC (permalink / raw)
  To: Stefano Stabellini, David Woodhouse; +Cc: Vikram Garhwal, qemu-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 2996 bytes --]

On 26.10.23 22:56, Stefano Stabellini wrote:
> On Thu, 26 Oct 2023, David Woodhouse wrote:
>> On Thu, 2023-10-26 at 13:36 -0700, Stefano Stabellini wrote:
>>>
>>>> This seems like a lot of code to replace that simpler option... is
>>>> there a massive performance win from doing it this way? Would we want
>>>> to use this trick for the Xen PV backends (qdisk, qnic) *too*? Might it
>>>> make sense to introduce the simple version and *then* the optimisation,
>>>> with some clear benchmarking to show the win?
>>>
>>> This is not done for performance but for safety (as in safety
>>> certifications, ISO 26262, etc.). This is to enable unprivileged virtio
>>> backends running in a DomU. By unprivileged I mean a virtio backend that
>>> is unable to map arbitrary memory (the xenforeignmemory interface is
>>> prohibited).
>>>
>>> The goal is to run Xen on safety-critical systems such as cars,
>>> industrial robots and more. In this configuration there is no
>>> traditional Dom0 in the system at all. If you  would like to know more:
>>> https://www.youtube.com/watch?v=tisljY6Bqv0&list=PLYyw7IQjL-zHtpYtMpFR3KYdRn0rcp5Xn&index=8
>>
>> Yeah, I understand why we're using grant mappings instead of just
>> directly having access via foreignmem mappings. That wasn't what I was
>> confused about.
>>
>> What I haven't worked out is why we're implementing this through an
>> automatically-populated MemoryRegion in QEMU, rather than just using
>> grant mapping ops like we always have.
>>
>> It seems like a lot of complexity just to avoid calling
>> qemu_xen_gnttab_map_refs() from the virtio backend.
> 
> I think there are two questions here. One question is "Why do we need
> all the new grant mapping code added to xen-mapcache.c in patch #7?
> Can't we use qemu_xen_gnttab_map_refs() instead?"

The main motivation was to _avoid_ having to change all the backends.

My implementation enables _all_ qemu based virtio backends to use grant
mappings. And if a new backend is added to qemu, there will be no change
required to make it work with grants.

> And the second question is where to call the grant mapping from (whether
> the new code or the old qemu_xen_gnttab_map_refs code it doesn't
> matter). It could be even simpler than calling it from the virtio
> backends: we could just call it from the existing qemu_ram_ptr_length()
> hook. See this discussion:
> https://marc.info/?l=qemu-devel&m=169828434927778

I wanted to be explicit where and when the mapping and unmapping are
happening. Adding the mapping to qemu_ram_ptr_length() would probably be
possible, but it would be quite hard to verify that no double mappings
are happening. And I had problems with that approach when qemu was setting
up the ring page access.

Adding a map() and an unmap() hook seemed to be the cleanest solution, even
if it needs more code churn. The qemu_ram_ptr_length() approach seemed to be
more like a hack than a solution.


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3149 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [QEMU][PATCHv2 0/8] Xen: support grant mappings.
  2023-10-27  5:27             ` Juergen Gross
@ 2023-11-13 20:24               ` David Woodhouse
  2023-11-14  6:19                 ` Juergen Gross
  0 siblings, 1 reply; 28+ messages in thread
From: David Woodhouse @ 2023-11-13 20:24 UTC (permalink / raw)
  To: Juergen Gross, Stefano Stabellini; +Cc: Vikram Garhwal, qemu-devel, paul

[-- Attachment #1: Type: text/plain, Size: 2609 bytes --]

On Fri, 2023-10-27 at 07:27 +0200, Juergen Gross wrote:
> On 26.10.23 22:56, Stefano Stabellini wrote:
> > On Thu, 26 Oct 2023, David Woodhouse wrote:
> > > On Thu, 2023-10-26 at 13:36 -0700, Stefano Stabellini wrote:
> > > > 
> > > > > This seems like a lot of code to replace that simpler option... is
> > > > > there a massive performance win from doing it this way? Would we want
> > > > > to use this trick for the Xen PV backends (qdisk, qnic) *too*? Might it
> > > > > make sense to introduce the simple version and *then* the optimisation,
> > > > > with some clear benchmarking to show the win?
> > > > 
> > > > This is not done for performance but for safety (as in safety
> > > > certifications, ISO 26262, etc.). This is to enable unprivileged virtio
> > > > backends running in a DomU. By unprivileged I mean a virtio backend that
> > > > is unable to map arbitrary memory (the xenforeignmemory interface is
> > > > prohibited).
> > > > 
> > > > The goal is to run Xen on safety-critical systems such as cars,
> > > > industrial robots and more. In this configuration there is no
> > > > traditional Dom0 in the system at all. If you  would like to know more:
> > > > https://www.youtube.com/watch?v=tisljY6Bqv0&list=PLYyw7IQjL-zHtpYtMpFR3KYdRn0rcp5Xn&index=8
> > > 
> > > Yeah, I understand why we're using grant mappings instead of just
> > > directly having access via foreignmem mappings. That wasn't what I was
> > > confused about.
> > > 
> > > What I haven't worked out is why we're implementing this through an
> > > automatically-populated MemoryRegion in QEMU, rather than just using
> > > grant mapping ops like we always have.
> > > 
> > > It seems like a lot of complexity just to avoid calling
> > > qemu_xen_gnttab_map_refs() from the virtio backend.
> > 
> > I think there are two questions here. One question is "Why do we need
> > all the new grant mapping code added to xen-mapcache.c in patch #7?
> > Can't we use qemu_xen_gnttab_map_refs() instead?"
> 
> The main motivation was to _avoid_ having to change all the backends.
> 
> My implementation enables _all_ qemu based virtio backends to use grant
> mappings. And if a new backend is added to qemu, there will be no change
> required to make it work with grants.

I'm not really convinced I buy that. This is a lot of complexity, and
don't backends need to call an appropriate mapping function to map via
an IOMMU if it's present anyway? Make then call a helper where you can
do this in one place directly instead of through a fake MemoryRegion,
and you're done, surely? 

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5965 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [QEMU][PATCHv2 0/8] Xen: support grant mappings.
  2023-11-13 20:24               ` David Woodhouse
@ 2023-11-14  6:19                 ` Juergen Gross
  2023-11-14 20:58                   ` Stefano Stabellini
  0 siblings, 1 reply; 28+ messages in thread
From: Juergen Gross @ 2023-11-14  6:19 UTC (permalink / raw)
  To: David Woodhouse, Stefano Stabellini; +Cc: Vikram Garhwal, qemu-devel, paul


[-- Attachment #1.1.1: Type: text/plain, Size: 2871 bytes --]

On 13.11.23 21:24, David Woodhouse wrote:
> On Fri, 2023-10-27 at 07:27 +0200, Juergen Gross wrote:
>> On 26.10.23 22:56, Stefano Stabellini wrote:
>>> On Thu, 26 Oct 2023, David Woodhouse wrote:
>>>> On Thu, 2023-10-26 at 13:36 -0700, Stefano Stabellini wrote:
>>>>>
>>>>>> This seems like a lot of code to replace that simpler option... is
>>>>>> there a massive performance win from doing it this way? Would we want
>>>>>> to use this trick for the Xen PV backends (qdisk, qnic) *too*? Might it
>>>>>> make sense to introduce the simple version and *then* the optimisation,
>>>>>> with some clear benchmarking to show the win?
>>>>>
>>>>> This is not done for performance but for safety (as in safety
>>>>> certifications, ISO 26262, etc.). This is to enable unprivileged virtio
>>>>> backends running in a DomU. By unprivileged I mean a virtio backend that
>>>>> is unable to map arbitrary memory (the xenforeignmemory interface is
>>>>> prohibited).
>>>>>
>>>>> The goal is to run Xen on safety-critical systems such as cars,
>>>>> industrial robots and more. In this configuration there is no
>>>>> traditional Dom0 in the system at all. If you  would like to know more:
>>>>> https://www.youtube.com/watch?v=tisljY6Bqv0&list=PLYyw7IQjL-zHtpYtMpFR3KYdRn0rcp5Xn&index=8
>>>>
>>>> Yeah, I understand why we're using grant mappings instead of just
>>>> directly having access via foreignmem mappings. That wasn't what I was
>>>> confused about.
>>>>
>>>> What I haven't worked out is why we're implementing this through an
>>>> automatically-populated MemoryRegion in QEMU, rather than just using
>>>> grant mapping ops like we always have.
>>>>
>>>> It seems like a lot of complexity just to avoid calling
>>>> qemu_xen_gnttab_map_refs() from the virtio backend.
>>>
>>> I think there are two questions here. One question is "Why do we need
>>> all the new grant mapping code added to xen-mapcache.c in patch #7?
>>> Can't we use qemu_xen_gnttab_map_refs() instead?"
>>
>> The main motivation was to _avoid_ having to change all the backends.
>>
>> My implementation enables _all_ qemu based virtio backends to use grant
>> mappings. And if a new backend is added to qemu, there will be no change
>> required to make it work with grants.
> 
> I'm not really convinced I buy that. This is a lot of complexity, and
> don't backends need to call an appropriate mapping function to map via
> an IOMMU if it's present anyway? Make then call a helper where you can
> do this in one place directly instead of through a fake MemoryRegion,
> and you're done, surely?

That was tested with unmodified block and net backends in qemu.

Maybe I missed something, but I think the IOMMU accesses are _not_ covering
accesses to the virtio rings from qemu. And this is something you really
want for driver domains.


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3149 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [QEMU][PATCHv2 0/8] Xen: support grant mappings.
  2023-11-14  6:19                 ` Juergen Gross
@ 2023-11-14 20:58                   ` Stefano Stabellini
  0 siblings, 0 replies; 28+ messages in thread
From: Stefano Stabellini @ 2023-11-14 20:58 UTC (permalink / raw)
  To: Juergen Gross
  Cc: David Woodhouse, Stefano Stabellini, Vikram Garhwal, qemu-devel,
	paul

[-- Attachment #1: Type: text/plain, Size: 3699 bytes --]

On Tue, 14 Nov 2023, Juergen Gross wrote:
> On 13.11.23 21:24, David Woodhouse wrote:
> > On Fri, 2023-10-27 at 07:27 +0200, Juergen Gross wrote:
> > > On 26.10.23 22:56, Stefano Stabellini wrote:
> > > > On Thu, 26 Oct 2023, David Woodhouse wrote:
> > > > > On Thu, 2023-10-26 at 13:36 -0700, Stefano Stabellini wrote:
> > > > > > 
> > > > > > > This seems like a lot of code to replace that simpler option... is
> > > > > > > there a massive performance win from doing it this way? Would we
> > > > > > > want
> > > > > > > to use this trick for the Xen PV backends (qdisk, qnic) *too*?
> > > > > > > Might it
> > > > > > > make sense to introduce the simple version and *then* the
> > > > > > > optimisation,
> > > > > > > with some clear benchmarking to show the win?
> > > > > > 
> > > > > > This is not done for performance but for safety (as in safety
> > > > > > certifications, ISO 26262, etc.). This is to enable unprivileged
> > > > > > virtio
> > > > > > backends running in a DomU. By unprivileged I mean a virtio backend
> > > > > > that
> > > > > > is unable to map arbitrary memory (the xenforeignmemory interface is
> > > > > > prohibited).
> > > > > > 
> > > > > > The goal is to run Xen on safety-critical systems such as cars,
> > > > > > industrial robots and more. In this configuration there is no
> > > > > > traditional Dom0 in the system at all. If you  would like to know
> > > > > > more:
> > > > > > https://www.youtube.com/watch?v=tisljY6Bqv0&list=PLYyw7IQjL-zHtpYtMpFR3KYdRn0rcp5Xn&index=8
> > > > > 
> > > > > Yeah, I understand why we're using grant mappings instead of just
> > > > > directly having access via foreignmem mappings. That wasn't what I was
> > > > > confused about.
> > > > > 
> > > > > What I haven't worked out is why we're implementing this through an
> > > > > automatically-populated MemoryRegion in QEMU, rather than just using
> > > > > grant mapping ops like we always have.
> > > > > 
> > > > > It seems like a lot of complexity just to avoid calling
> > > > > qemu_xen_gnttab_map_refs() from the virtio backend.
> > > > 
> > > > I think there are two questions here. One question is "Why do we need
> > > > all the new grant mapping code added to xen-mapcache.c in patch #7?
> > > > Can't we use qemu_xen_gnttab_map_refs() instead?"
> > > 
> > > The main motivation was to _avoid_ having to change all the backends.
> > > 
> > > My implementation enables _all_ qemu based virtio backends to use grant
> > > mappings. And if a new backend is added to qemu, there will be no change
> > > required to make it work with grants.
> > 
> > I'm not really convinced I buy that. This is a lot of complexity, and
> > don't backends need to call an appropriate mapping function to map via
> > an IOMMU if it's present anyway? Make then call a helper where you can
> > do this in one place directly instead of through a fake MemoryRegion,
> > and you're done, surely?
> 
> That was tested with unmodified block and net backends in qemu.
> 
> Maybe I missed something, but I think the IOMMU accesses are _not_ covering
> accesses to the virtio rings from qemu. And this is something you really
> want for driver domains.

Hi Juergen, David,

I don't think we should use the IOMMU accesses and I don't think we
should change the virtio backends.

My preference would be to either use the patch as is, or (better) to
reuse the original Xen hooks already in place (qemu_ram_ptr_length and
xen_invalidate_map_cache_entry, see
https://marc.info/?l=qemu-devel&m=169828434927778).

Juergen did a good job at adding new hooks in a clean way, but from my
point of view I would still prefer to only have the two existing hooks
instead of having 4 (2 old, 2 new).

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2023-11-14 20:59 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-25 21:24 [QEMU][PATCHv2 0/8] Xen: support grant mappings Vikram Garhwal
2023-10-25 21:24 ` [QEMU][PATCHv2 1/8] xen: when unplugging emulated devices skip virtio devices Vikram Garhwal
2023-10-25 23:22   ` David Woodhouse
2023-10-26  1:23     ` Stefano Stabellini
2023-10-26 15:45       ` David Woodhouse
2023-10-26 17:13         ` Vikram Garhwal
2023-10-25 21:24 ` [QEMU][PATCHv2 2/8] softmmu: physmem: Split ram_block_add() Vikram Garhwal
2023-10-26  1:25   ` Stefano Stabellini
2023-10-25 21:24 ` [QEMU][PATCHv2 3/8] xen: add pseudo RAM region for grant mappings Vikram Garhwal
2023-10-26  1:27   ` Stefano Stabellini
2023-10-25 21:24 ` [QEMU][PATCHv2 4/8] softmmu: let qemu_map_ram_ptr() use qemu_ram_ptr_length() Vikram Garhwal
2023-10-26  1:28   ` Stefano Stabellini
2023-10-25 21:24 ` [QEMU][PATCHv2 5/8] xen: let xen_ram_addr_from_mapcache() return -1 in case of not found entry Vikram Garhwal
2023-10-25 21:24 ` [QEMU][PATCHv2 6/8] memory: add MemoryRegion map and unmap callbacks Vikram Garhwal
2023-10-25 21:24 ` [QEMU][PATCHv2 7/8] xen: add map and unmap callbacks for grant region Vikram Garhwal
2023-10-26  1:32   ` Stefano Stabellini
2023-10-26  4:35     ` Vikram Garhwal
2023-10-25 21:24 ` [QEMU][PATCHv2 8/8] hw: arm: Add grant mapping Vikram Garhwal
2023-10-26 16:12 ` [QEMU][PATCHv2 0/8] Xen: support grant mappings David Woodhouse
2023-10-26 18:07   ` Stefano Stabellini
2023-10-26 20:15     ` David Woodhouse
2023-10-26 20:36       ` Stefano Stabellini
2023-10-26 20:44         ` David Woodhouse
2023-10-26 20:56           ` Stefano Stabellini
2023-10-27  5:27             ` Juergen Gross
2023-11-13 20:24               ` David Woodhouse
2023-11-14  6:19                 ` Juergen Gross
2023-11-14 20:58                   ` Stefano Stabellini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).