qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/3] virtio-gpu: fix blob unmapping sequence
@ 2025-04-10 12:26 Manos Pitsidianakis
  2025-04-10 12:26 ` [PATCH v2 1/3] hw/display: re-arrange memory region tracking Manos Pitsidianakis
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Manos Pitsidianakis @ 2025-04-10 12:26 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Bennée, Philippe Mathieu-Daudé, Richard Henderson

A hang was observed when running a small kernel that exercised VIRTIO 
GPU under TCG. This is an edge-case and won't happen under typical 
conditions.

When unmapping a blob object, its MemoryRegion's freeing is deferred to 
the RCU thread. The hang's cause was determined to be a busy main loop 
not allowing for the RCU thread to run because the kernel did not setup 
any timers or had any interrupts on the way. While fixing the RCU thread 
to run even if the guest CPU spins is a solution, it's easier to fix the 
reason why the MemoryRegion isn't freed from the main loop instead.

While at it, also restructure the 3 stage cleanup to immediately respond 
to the guest if the MR happened to have had no other reference.

PS: The hang can be reproduced by running this unikernel with TCG 

https://git.codelinaro.org/manos.pitsidianakis/virtio-tests/-/tree/8c0ebe9395827e24aa5711186d499bf5de87cf63/virtio-test-suite

v1 to v2:
  - Add patch by Alex to prevent double-free when FlatView is destroyed 
    from RCU thread.

Alex Bennée (1):
  hw/display: re-arrange memory region tracking

Manos Pitsidianakis (2):
  virtio-gpu: fix hang under TCG when unmapping blob
  virtio-gpu: refactor async blob unmapping

 include/exec/memory.h         |  1 +
 hw/display/virtio-gpu-virgl.c | 60 ++++++++++++++++++++---------------
 2 files changed, 35 insertions(+), 26 deletions(-)


base-commit: 56c6e249b6988c1b6edc2dd34ebb0f1e570a1365
-- 
γαῖα πυρί μιχθήτω



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2 1/3] hw/display: re-arrange memory region tracking
  2025-04-10 12:26 [PATCH v2 0/3] virtio-gpu: fix blob unmapping sequence Manos Pitsidianakis
@ 2025-04-10 12:26 ` Manos Pitsidianakis
  2025-04-15 10:35   ` Philippe Mathieu-Daudé
  2025-04-10 12:26 ` [PATCH v2 2/3] virtio-gpu: fix hang under TCG when unmapping blob Manos Pitsidianakis
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Manos Pitsidianakis @ 2025-04-10 12:26 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Bennée, Philippe Mathieu-Daudé, Richard Henderson,
	Michael S. Tsirkin, Paolo Bonzini, Peter Xu, David Hildenbrand

From: Alex Bennée <alex.bennee@linaro.org>

QOM objects can be embedded in other QOM objects and managed as part
of their lifetime but this isn't the case for
virtio_gpu_virgl_hostmem_region. However before we can split it out we
need some other way of associating the wider data structure with the
memory region.

Fortunately MemoryRegion has an opaque pointer. This is passed down to
MemoryRegionOps for device type regions but is unused in the
memory_region_init_ram_ptr() case. Use the opaque to carry the
reference and allow the final MemoryRegion object to be reaped when
its reference count is cleared.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
---
 include/exec/memory.h         |  1 +
 hw/display/virtio-gpu-virgl.c | 23 ++++++++---------------
 2 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index d09af58c97..bb735a3c7e 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -784,6 +784,7 @@ struct MemoryRegion {
     DeviceState *dev;
 
     const MemoryRegionOps *ops;
+    /* opaque data, used by backends like @ops */
     void *opaque;
     MemoryRegion *container;
     int mapped_via_alias; /* Mapped via an alias, container might be NULL */
diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index 145a0b3879..71a7500de9 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -52,17 +52,11 @@ virgl_get_egl_display(G_GNUC_UNUSED void *cookie)
 
 #if VIRGL_VERSION_MAJOR >= 1
 struct virtio_gpu_virgl_hostmem_region {
-    MemoryRegion mr;
+    MemoryRegion *mr;
     struct VirtIOGPU *g;
     bool finish_unmapping;
 };
 
-static struct virtio_gpu_virgl_hostmem_region *
-to_hostmem_region(MemoryRegion *mr)
-{
-    return container_of(mr, struct virtio_gpu_virgl_hostmem_region, mr);
-}
-
 static void virtio_gpu_virgl_resume_cmdq_bh(void *opaque)
 {
     VirtIOGPU *g = opaque;
@@ -73,14 +67,12 @@ static void virtio_gpu_virgl_resume_cmdq_bh(void *opaque)
 static void virtio_gpu_virgl_hostmem_region_free(void *obj)
 {
     MemoryRegion *mr = MEMORY_REGION(obj);
-    struct virtio_gpu_virgl_hostmem_region *vmr;
+    struct virtio_gpu_virgl_hostmem_region *vmr = mr->opaque;
     VirtIOGPUBase *b;
     VirtIOGPUGL *gl;
 
-    vmr = to_hostmem_region(mr);
-    vmr->finish_unmapping = true;
-
     b = VIRTIO_GPU_BASE(vmr->g);
+    vmr->finish_unmapping = true;
     b->renderer_blocked--;
 
     /*
@@ -118,8 +110,8 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g,
 
     vmr = g_new0(struct virtio_gpu_virgl_hostmem_region, 1);
     vmr->g = g;
+    mr = g_new0(MemoryRegion, 1);
 
-    mr = &vmr->mr;
     memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data);
     memory_region_add_subregion(&b->hostmem, offset, mr);
     memory_region_set_enabled(mr, true);
@@ -131,7 +123,9 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g,
      * command processing until MR is fully unreferenced and freed.
      */
     OBJECT(mr)->free = virtio_gpu_virgl_hostmem_region_free;
+    mr->opaque = vmr;
 
+    vmr->mr = mr;
     res->mr = mr;
 
     return 0;
@@ -142,16 +136,15 @@ virtio_gpu_virgl_unmap_resource_blob(VirtIOGPU *g,
                                      struct virtio_gpu_virgl_resource *res,
                                      bool *cmd_suspended)
 {
-    struct virtio_gpu_virgl_hostmem_region *vmr;
     VirtIOGPUBase *b = VIRTIO_GPU_BASE(g);
     MemoryRegion *mr = res->mr;
+    struct virtio_gpu_virgl_hostmem_region *vmr;
     int ret;
 
     if (!mr) {
         return 0;
     }
-
-    vmr = to_hostmem_region(res->mr);
+    vmr = mr->opaque;
 
     /*
      * Perform async unmapping in 3 steps:
-- 
γαῖα πυρί μιχθήτω



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 2/3] virtio-gpu: fix hang under TCG when unmapping blob
  2025-04-10 12:26 [PATCH v2 0/3] virtio-gpu: fix blob unmapping sequence Manos Pitsidianakis
  2025-04-10 12:26 ` [PATCH v2 1/3] hw/display: re-arrange memory region tracking Manos Pitsidianakis
@ 2025-04-10 12:26 ` Manos Pitsidianakis
  2025-04-10 14:59   ` Alex Bennée
  2025-04-10 12:26 ` [PATCH v2 3/3] virtio-gpu: refactor async blob unmapping Manos Pitsidianakis
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Manos Pitsidianakis @ 2025-04-10 12:26 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Bennée, Philippe Mathieu-Daudé, Richard Henderson,
	Michael S. Tsirkin

From: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>

This commit fixes an indefinite hang when using VIRTIO GPU blob objects
under TCG in certain conditions.

The VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB VIRTIO command creates a
MemoryRegion and attaches it to an offset on a PCI BAR of the
VirtIOGPUdevice. The VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB command unmaps
it.

Because virglrenderer commands are not thread-safe they are only
called on the main context and QEMU performs the cleanup in three steps
to prevent a use-after-free scenario where the guest can access the
region after it’s unmapped:

1. From the main context, the region’s field finish_unmapping is false
   by default, so it sets a variable cmd_suspended, increases the
   renderer_blocked variable, deletes the blob subregion, and unparents
   the blob subregion causing its reference count to decrement.

2. From an RCU context, the MemoryView gets freed, the FlatView gets
   recalculated, the free callback of the blob region
   virtio_gpu_virgl_hostmem_region_free is called which sets the
   region’s field finish_unmapping to true, allowing the main thread
   context to finish replying to the command

3. From the main context, the command is processed again, but this time
   finish_unmapping is true, so virgl_renderer_resource_unmap can be
   called and a response is sent to the guest.

It happens so that under TCG, if the guest has no timers configured (and
thus no interrupt will cause the CPU to exit), the RCU thread does not
have enough time to grab the locks and recalculate the FlatView.

That’s not a big problem in practice since most guests will assume a
response will happen later in time and go on to do different things,
potentially triggering interrupts and allowing the RCU context to run.
If the guest waits for the unmap command to complete though, it blocks
indefinitely. Attaching to the QEMU monitor and force quitting the guest
allows the cleanup to continue.

There's no reason why the FlatView recalculation can't occur right away
when we delete the blob subregion, however. It does not, because when we
create the subregion we set the object as its own parent:

    memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data);

The extra reference is what prevents freeing the memory region object in
the memory transaction of deleting the subregion.

This commit changes the owner object to the device, which removes the
extra owner reference in the memory region and causes the MR to be
freed right away in the main context.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
---
 hw/display/virtio-gpu-virgl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index 71a7500de9..8fbe4e70cc 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -112,7 +112,7 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g,
     vmr->g = g;
     mr = g_new0(MemoryRegion, 1);
 
-    memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data);
+    memory_region_init_ram_ptr(mr, OBJECT(g), "blob", size, data);
     memory_region_add_subregion(&b->hostmem, offset, mr);
     memory_region_set_enabled(mr, true);
 
-- 
γαῖα πυρί μιχθήτω



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 3/3] virtio-gpu: refactor async blob unmapping
  2025-04-10 12:26 [PATCH v2 0/3] virtio-gpu: fix blob unmapping sequence Manos Pitsidianakis
  2025-04-10 12:26 ` [PATCH v2 1/3] hw/display: re-arrange memory region tracking Manos Pitsidianakis
  2025-04-10 12:26 ` [PATCH v2 2/3] virtio-gpu: fix hang under TCG when unmapping blob Manos Pitsidianakis
@ 2025-04-10 12:26 ` Manos Pitsidianakis
  2025-04-15 10:40   ` Philippe Mathieu-Daudé
  2025-04-15 18:46 ` [PATCH v2 0/3] virtio-gpu: fix blob unmapping sequence Alex Bennée
  2025-04-17  8:08 ` Alex Bennée
  4 siblings, 1 reply; 14+ messages in thread
From: Manos Pitsidianakis @ 2025-04-10 12:26 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Bennée, Philippe Mathieu-Daudé, Richard Henderson,
	Michael S. Tsirkin

From: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>

Change the 3 part async cleanup of a blob memory mapping to check if the
unmapping has finished already after deleting the subregion; this
condition allows us to skip suspending the command and responding to the
guest right away.

Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
---
 hw/display/virtio-gpu-virgl.c | 35 +++++++++++++++++++++++++----------
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index 8fbe4e70cc..32a32879f7 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -155,7 +155,32 @@ virtio_gpu_virgl_unmap_resource_blob(VirtIOGPU *g,
      *    asynchronously by virtio_gpu_virgl_hostmem_region_free().
      * 3. Finish the unmapping with final virgl_renderer_resource_unmap().
      */
+
+    /* 1. Check if we should start unmapping now */
+    if (!vmr->finish_unmapping) {
+        /* begin async unmapping. render will be unblocked once MR is freed */
+        b->renderer_blocked++;
+
+        memory_region_set_enabled(mr, false);
+        memory_region_del_subregion(&b->hostmem, mr);
+        object_unparent(OBJECT(mr));
+        /*
+         * The unmapping might have already finished at this point if no one
+         * else held a reference to the MR; if yes, we can skip suspending the
+         * command and unmap the resource right away.
+         */
+        *cmd_suspended = !vmr->finish_unmapping;
+    }
+
+    /*
+     * 2. if virtio_gpu_virgl_hostmem_region_free hasn't been executed yet, we
+     * have marked the command to be re-processed later by setting
+     * cmd_suspended to true. The freeing callback will be called from RCU
+     * context later.
+     */
+
     if (vmr->finish_unmapping) {
+        /* 3. MemoryRegion has been freed, so finish unmapping */
         res->mr = NULL;
         g_free(vmr);
 
@@ -166,16 +191,6 @@ virtio_gpu_virgl_unmap_resource_blob(VirtIOGPU *g,
                           __func__, strerror(-ret));
             return ret;
         }
-    } else {
-        *cmd_suspended = true;
-
-        /* render will be unblocked once MR is freed */
-        b->renderer_blocked++;
-
-        /* memory region owns self res->mr object and frees it by itself */
-        memory_region_set_enabled(mr, false);
-        memory_region_del_subregion(&b->hostmem, mr);
-        object_unparent(OBJECT(mr));
     }
 
     return 0;
-- 
γαῖα πυρί μιχθήτω



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/3] virtio-gpu: fix hang under TCG when unmapping blob
  2025-04-10 12:26 ` [PATCH v2 2/3] virtio-gpu: fix hang under TCG when unmapping blob Manos Pitsidianakis
@ 2025-04-10 14:59   ` Alex Bennée
  0 siblings, 0 replies; 14+ messages in thread
From: Alex Bennée @ 2025-04-10 14:59 UTC (permalink / raw)
  To: Manos Pitsidianakis
  Cc: qemu-devel, Philippe Mathieu-Daudé, Richard Henderson,
	Michael S. Tsirkin

Manos Pitsidianakis <manos.pitsidianakis@linaro.org> writes:

> From: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
>
> This commit fixes an indefinite hang when using VIRTIO GPU blob objects
> under TCG in certain conditions.
>
> The VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB VIRTIO command creates a
> MemoryRegion and attaches it to an offset on a PCI BAR of the
> VirtIOGPUdevice. The VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB command unmaps
> it.
>
> Because virglrenderer commands are not thread-safe they are only
> called on the main context and QEMU performs the cleanup in three steps
> to prevent a use-after-free scenario where the guest can access the
> region after it’s unmapped:
>
> 1. From the main context, the region’s field finish_unmapping is false
>    by default, so it sets a variable cmd_suspended, increases the
>    renderer_blocked variable, deletes the blob subregion, and unparents
>    the blob subregion causing its reference count to decrement.
>
> 2. From an RCU context, the MemoryView gets freed, the FlatView gets
>    recalculated, the free callback of the blob region
>    virtio_gpu_virgl_hostmem_region_free is called which sets the
>    region’s field finish_unmapping to true, allowing the main thread
>    context to finish replying to the command
>
> 3. From the main context, the command is processed again, but this time
>    finish_unmapping is true, so virgl_renderer_resource_unmap can be
>    called and a response is sent to the guest.
>
> It happens so that under TCG, if the guest has no timers configured (and
> thus no interrupt will cause the CPU to exit), the RCU thread does not
> have enough time to grab the locks and recalculate the FlatView.
>
> That’s not a big problem in practice since most guests will assume a
> response will happen later in time and go on to do different things,
> potentially triggering interrupts and allowing the RCU context to run.
> If the guest waits for the unmap command to complete though, it blocks
> indefinitely. Attaching to the QEMU monitor and force quitting the guest
> allows the cleanup to continue.
>
> There's no reason why the FlatView recalculation can't occur right away
> when we delete the blob subregion, however. It does not, because when we
> create the subregion we set the object as its own parent:
>
>     memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data);
>
> The extra reference is what prevents freeing the memory region object in
> the memory transaction of deleting the subregion.
>
> This commit changes the owner object to the device, which removes the
> extra owner reference in the memory region and causes the MR to be
> freed right away in the main context.
>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 1/3] hw/display: re-arrange memory region tracking
  2025-04-10 12:26 ` [PATCH v2 1/3] hw/display: re-arrange memory region tracking Manos Pitsidianakis
@ 2025-04-15 10:35   ` Philippe Mathieu-Daudé
  2025-04-15 11:21     ` Alex Bennée
  0 siblings, 1 reply; 14+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-04-15 10:35 UTC (permalink / raw)
  To: Manos Pitsidianakis, qemu-devel
  Cc: Alex Bennée, Richard Henderson, Michael S. Tsirkin,
	Paolo Bonzini, Peter Xu, David Hildenbrand

On 10/4/25 14:26, Manos Pitsidianakis wrote:
> From: Alex Bennée <alex.bennee@linaro.org>
> 
> QOM objects can be embedded in other QOM objects and managed as part
> of their lifetime but this isn't the case for
> virtio_gpu_virgl_hostmem_region. However before we can split it out we
> need some other way of associating the wider data structure with the
> memory region.
> 
> Fortunately MemoryRegion has an opaque pointer. This is passed down to
> MemoryRegionOps for device type regions but is unused in the
> memory_region_init_ram_ptr() case.

It is unclear to me what layer is supposed to set/consume it. So far
in memory_region_init_io/ram/rom it is kind of internal to the memory
layer, but MemoryRegionOps aren't. Yeah well, OK then.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

> Use the opaque to carry the
> reference and allow the final MemoryRegion object to be reaped when
> its reference count is cleared.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
> ---
>   include/exec/memory.h         |  1 +
>   hw/display/virtio-gpu-virgl.c | 23 ++++++++---------------
>   2 files changed, 9 insertions(+), 15 deletions(-)
> 
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index d09af58c97..bb735a3c7e 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -784,6 +784,7 @@ struct MemoryRegion {
>       DeviceState *dev;
>   
>       const MemoryRegionOps *ops;
> +    /* opaque data, used by backends like @ops */
>       void *opaque;
>       MemoryRegion *container;
>       int mapped_via_alias; /* Mapped via an alias, container might be NULL */
> diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
> index 145a0b3879..71a7500de9 100644
> --- a/hw/display/virtio-gpu-virgl.c
> +++ b/hw/display/virtio-gpu-virgl.c
> @@ -52,17 +52,11 @@ virgl_get_egl_display(G_GNUC_UNUSED void *cookie)
>   
>   #if VIRGL_VERSION_MAJOR >= 1
>   struct virtio_gpu_virgl_hostmem_region {
> -    MemoryRegion mr;
> +    MemoryRegion *mr;
>       struct VirtIOGPU *g;
>       bool finish_unmapping;
>   };
>   
> -static struct virtio_gpu_virgl_hostmem_region *
> -to_hostmem_region(MemoryRegion *mr)
> -{
> -    return container_of(mr, struct virtio_gpu_virgl_hostmem_region, mr);
> -}
> -
>   static void virtio_gpu_virgl_resume_cmdq_bh(void *opaque)
>   {
>       VirtIOGPU *g = opaque;
> @@ -73,14 +67,12 @@ static void virtio_gpu_virgl_resume_cmdq_bh(void *opaque)
>   static void virtio_gpu_virgl_hostmem_region_free(void *obj)
>   {
>       MemoryRegion *mr = MEMORY_REGION(obj);
> -    struct virtio_gpu_virgl_hostmem_region *vmr;
> +    struct virtio_gpu_virgl_hostmem_region *vmr = mr->opaque;
>       VirtIOGPUBase *b;
>       VirtIOGPUGL *gl;
>   
> -    vmr = to_hostmem_region(mr);
> -    vmr->finish_unmapping = true;
> -
>       b = VIRTIO_GPU_BASE(vmr->g);
> +    vmr->finish_unmapping = true;
>       b->renderer_blocked--;
>   
>       /*
> @@ -118,8 +110,8 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g,
>   
>       vmr = g_new0(struct virtio_gpu_virgl_hostmem_region, 1);
>       vmr->g = g;
> +    mr = g_new0(MemoryRegion, 1);
>   
> -    mr = &vmr->mr;
>       memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data);
>       memory_region_add_subregion(&b->hostmem, offset, mr);
>       memory_region_set_enabled(mr, true);
> @@ -131,7 +123,9 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g,
>        * command processing until MR is fully unreferenced and freed.
>        */
>       OBJECT(mr)->free = virtio_gpu_virgl_hostmem_region_free;
> +    mr->opaque = vmr;
>   
> +    vmr->mr = mr;
>       res->mr = mr;
>   
>       return 0;
> @@ -142,16 +136,15 @@ virtio_gpu_virgl_unmap_resource_blob(VirtIOGPU *g,
>                                        struct virtio_gpu_virgl_resource *res,
>                                        bool *cmd_suspended)
>   {
> -    struct virtio_gpu_virgl_hostmem_region *vmr;
>       VirtIOGPUBase *b = VIRTIO_GPU_BASE(g);
>       MemoryRegion *mr = res->mr;
> +    struct virtio_gpu_virgl_hostmem_region *vmr;

Same same but different? ;)

>       int ret;
>   
>       if (!mr) {
>           return 0;
>       }
> -
> -    vmr = to_hostmem_region(res->mr);
> +    vmr = mr->opaque;
>   
>       /*
>        * Perform async unmapping in 3 steps:



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/3] virtio-gpu: refactor async blob unmapping
  2025-04-10 12:26 ` [PATCH v2 3/3] virtio-gpu: refactor async blob unmapping Manos Pitsidianakis
@ 2025-04-15 10:40   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 14+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-04-15 10:40 UTC (permalink / raw)
  To: Manos Pitsidianakis, qemu-devel
  Cc: Alex Bennée, Richard Henderson, Michael S. Tsirkin

On 10/4/25 14:26, Manos Pitsidianakis wrote:
> From: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
> 
> Change the 3 part async cleanup of a blob memory mapping to check if the
> unmapping has finished already after deleting the subregion; this
> condition allows us to skip suspending the command and responding to the
> guest right away.
> 
> Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
> ---
>   hw/display/virtio-gpu-virgl.c | 35 +++++++++++++++++++++++++----------
>   1 file changed, 25 insertions(+), 10 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 1/3] hw/display: re-arrange memory region tracking
  2025-04-15 10:35   ` Philippe Mathieu-Daudé
@ 2025-04-15 11:21     ` Alex Bennée
  0 siblings, 0 replies; 14+ messages in thread
From: Alex Bennée @ 2025-04-15 11:21 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé
  Cc: Manos Pitsidianakis, qemu-devel, Richard Henderson,
	Michael S. Tsirkin, Paolo Bonzini, Peter Xu, David Hildenbrand

Philippe Mathieu-Daudé <philmd@linaro.org> writes:

> On 10/4/25 14:26, Manos Pitsidianakis wrote:
>> From: Alex Bennée <alex.bennee@linaro.org>
>> QOM objects can be embedded in other QOM objects and managed as part
>> of their lifetime but this isn't the case for
>> virtio_gpu_virgl_hostmem_region. However before we can split it out we
>> need some other way of associating the wider data structure with the
>> memory region.
>> Fortunately MemoryRegion has an opaque pointer. This is passed down
>> to
>> MemoryRegionOps for device type regions but is unused in the
>> memory_region_init_ram_ptr() case.
>
> It is unclear to me what layer is supposed to set/consume it. So far
> in memory_region_init_io/ram/rom it is kind of internal to the memory
> layer, but MemoryRegionOps aren't. Yeah well, OK then.
>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
>
>> Use the opaque to carry the
>> reference and allow the final MemoryRegion object to be reaped when
>> its reference count is cleared.
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
>> ---
>>   include/exec/memory.h         |  1 +
>>   hw/display/virtio-gpu-virgl.c | 23 ++++++++---------------
>>   2 files changed, 9 insertions(+), 15 deletions(-)
>> diff --git a/include/exec/memory.h b/include/exec/memory.h
>> index d09af58c97..bb735a3c7e 100644
>> --- a/include/exec/memory.h
>> +++ b/include/exec/memory.h
>> @@ -784,6 +784,7 @@ struct MemoryRegion {
>>       DeviceState *dev;
>>         const MemoryRegionOps *ops;
>> +    /* opaque data, used by backends like @ops */
>>       void *opaque;
>>       MemoryRegion *container;
>>       int mapped_via_alias; /* Mapped via an alias, container might be NULL */
>> diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
>> index 145a0b3879..71a7500de9 100644
>> --- a/hw/display/virtio-gpu-virgl.c
>> +++ b/hw/display/virtio-gpu-virgl.c
>> @@ -52,17 +52,11 @@ virgl_get_egl_display(G_GNUC_UNUSED void *cookie)
>>     #if VIRGL_VERSION_MAJOR >= 1
>>   struct virtio_gpu_virgl_hostmem_region {
>> -    MemoryRegion mr;
>> +    MemoryRegion *mr;
>>       struct VirtIOGPU *g;
>>       bool finish_unmapping;
>>   };
>>   -static struct virtio_gpu_virgl_hostmem_region *
>> -to_hostmem_region(MemoryRegion *mr)
>> -{
>> -    return container_of(mr, struct virtio_gpu_virgl_hostmem_region, mr);
>> -}
>> -
>>   static void virtio_gpu_virgl_resume_cmdq_bh(void *opaque)
>>   {
>>       VirtIOGPU *g = opaque;
>> @@ -73,14 +67,12 @@ static void virtio_gpu_virgl_resume_cmdq_bh(void *opaque)
>>   static void virtio_gpu_virgl_hostmem_region_free(void *obj)
>>   {
>>       MemoryRegion *mr = MEMORY_REGION(obj);
>> -    struct virtio_gpu_virgl_hostmem_region *vmr;
>> +    struct virtio_gpu_virgl_hostmem_region *vmr = mr->opaque;
>>       VirtIOGPUBase *b;
>>       VirtIOGPUGL *gl;
>>   -    vmr = to_hostmem_region(mr);
>> -    vmr->finish_unmapping = true;
>> -
>>       b = VIRTIO_GPU_BASE(vmr->g);
>> +    vmr->finish_unmapping = true;
>>       b->renderer_blocked--;
>>         /*
>> @@ -118,8 +110,8 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g,
>>         vmr = g_new0(struct virtio_gpu_virgl_hostmem_region, 1);
>>       vmr->g = g;
>> +    mr = g_new0(MemoryRegion, 1);
>>   -    mr = &vmr->mr;
>>       memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data);
>>       memory_region_add_subregion(&b->hostmem, offset, mr);
>>       memory_region_set_enabled(mr, true);
>> @@ -131,7 +123,9 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g,
>>        * command processing until MR is fully unreferenced and freed.
>>        */
>>       OBJECT(mr)->free = virtio_gpu_virgl_hostmem_region_free;
>> +    mr->opaque = vmr;
>>   +    vmr->mr = mr;
>>       res->mr = mr;
>>         return 0;
>> @@ -142,16 +136,15 @@ virtio_gpu_virgl_unmap_resource_blob(VirtIOGPU *g,
>>                                        struct virtio_gpu_virgl_resource *res,
>>                                        bool *cmd_suspended)
>>   {
>> -    struct virtio_gpu_virgl_hostmem_region *vmr;
>>       VirtIOGPUBase *b = VIRTIO_GPU_BASE(g);
>>       MemoryRegion *mr = res->mr;
>> +    struct virtio_gpu_virgl_hostmem_region *vmr;
>
> Same same but different? ;)

Hmm I think I just reflexively put unassigned variables at the bottom of
the list of declarations so they stand out more.

>
>>       int ret;
>>         if (!mr) {
>>           return 0;
>>       }
>> -
>> -    vmr = to_hostmem_region(res->mr);
>> +    vmr = mr->opaque;
>>         /*
>>        * Perform async unmapping in 3 steps:

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/3] virtio-gpu: fix blob unmapping sequence
  2025-04-10 12:26 [PATCH v2 0/3] virtio-gpu: fix blob unmapping sequence Manos Pitsidianakis
                   ` (2 preceding siblings ...)
  2025-04-10 12:26 ` [PATCH v2 3/3] virtio-gpu: refactor async blob unmapping Manos Pitsidianakis
@ 2025-04-15 18:46 ` Alex Bennée
  2025-04-16 18:57   ` Michael S. Tsirkin
  2025-04-16 19:26   ` Stefan Hajnoczi
  2025-04-17  8:08 ` Alex Bennée
  4 siblings, 2 replies; 14+ messages in thread
From: Alex Bennée @ 2025-04-15 18:46 UTC (permalink / raw)
  To: Manos Pitsidianakis
  Cc: qemu-devel, Philippe Mathieu-Daudé, Richard Henderson,
	Stefan Hajnoczi, Michael S. Tsirkin

Manos Pitsidianakis <manos.pitsidianakis@linaro.org> writes:

> A hang was observed when running a small kernel that exercised VIRTIO 
> GPU under TCG. This is an edge-case and won't happen under typical 
> conditions.

Should I (or MST?) pull these into a tree for 10.0 or should they be
grabbed for when the tree opens with a Cc qemu-stable?

>
> When unmapping a blob object, its MemoryRegion's freeing is deferred to 
> the RCU thread. The hang's cause was determined to be a busy main loop 
> not allowing for the RCU thread to run because the kernel did not setup 
> any timers or had any interrupts on the way. While fixing the RCU thread 
> to run even if the guest CPU spins is a solution, it's easier to fix the 
> reason why the MemoryRegion isn't freed from the main loop instead.
>
> While at it, also restructure the 3 stage cleanup to immediately respond 
> to the guest if the MR happened to have had no other reference.
>
> PS: The hang can be reproduced by running this unikernel with TCG 
>
> https://git.codelinaro.org/manos.pitsidianakis/virtio-tests/-/tree/8c0ebe9395827e24aa5711186d499bf5de87cf63/virtio-test-suite
>
> v1 to v2:
>   - Add patch by Alex to prevent double-free when FlatView is destroyed 
>     from RCU thread.
>
> Alex Bennée (1):
>   hw/display: re-arrange memory region tracking
>
> Manos Pitsidianakis (2):
>   virtio-gpu: fix hang under TCG when unmapping blob
>   virtio-gpu: refactor async blob unmapping
>
>  include/exec/memory.h         |  1 +
>  hw/display/virtio-gpu-virgl.c | 60 ++++++++++++++++++++---------------
>  2 files changed, 35 insertions(+), 26 deletions(-)
>
>
> base-commit: 56c6e249b6988c1b6edc2dd34ebb0f1e570a1365

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/3] virtio-gpu: fix blob unmapping sequence
  2025-04-15 18:46 ` [PATCH v2 0/3] virtio-gpu: fix blob unmapping sequence Alex Bennée
@ 2025-04-16 18:57   ` Michael S. Tsirkin
  2025-04-17  8:00     ` Alex Bennée
  2025-04-16 19:26   ` Stefan Hajnoczi
  1 sibling, 1 reply; 14+ messages in thread
From: Michael S. Tsirkin @ 2025-04-16 18:57 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Manos Pitsidianakis, qemu-devel, Philippe Mathieu-Daudé,
	Richard Henderson, Stefan Hajnoczi

On Tue, Apr 15, 2025 at 07:46:14PM +0100, Alex Bennée wrote:
> Manos Pitsidianakis <manos.pitsidianakis@linaro.org> writes:
> 
> > A hang was observed when running a small kernel that exercised VIRTIO 
> > GPU under TCG. This is an edge-case and won't happen under typical 
> > conditions.
> 
> Should I (or MST?) pull these into a tree for 10.0 or should they be
> grabbed for when the tree opens with a Cc qemu-stable?


Let's just agree who takes them, then it's up to that maintainer.
You wanna merge them?

> >
> > When unmapping a blob object, its MemoryRegion's freeing is deferred to 
> > the RCU thread. The hang's cause was determined to be a busy main loop 
> > not allowing for the RCU thread to run because the kernel did not setup 
> > any timers or had any interrupts on the way. While fixing the RCU thread 
> > to run even if the guest CPU spins is a solution, it's easier to fix the 
> > reason why the MemoryRegion isn't freed from the main loop instead.
> >
> > While at it, also restructure the 3 stage cleanup to immediately respond 
> > to the guest if the MR happened to have had no other reference.
> >
> > PS: The hang can be reproduced by running this unikernel with TCG 
> >
> > https://git.codelinaro.org/manos.pitsidianakis/virtio-tests/-/tree/8c0ebe9395827e24aa5711186d499bf5de87cf63/virtio-test-suite
> >
> > v1 to v2:
> >   - Add patch by Alex to prevent double-free when FlatView is destroyed 
> >     from RCU thread.
> >
> > Alex Bennée (1):
> >   hw/display: re-arrange memory region tracking
> >
> > Manos Pitsidianakis (2):
> >   virtio-gpu: fix hang under TCG when unmapping blob
> >   virtio-gpu: refactor async blob unmapping
> >
> >  include/exec/memory.h         |  1 +
> >  hw/display/virtio-gpu-virgl.c | 60 ++++++++++++++++++++---------------
> >  2 files changed, 35 insertions(+), 26 deletions(-)
> >
> >
> > base-commit: 56c6e249b6988c1b6edc2dd34ebb0f1e570a1365
> 
> -- 
> Alex Bennée
> Virtualisation Tech Lead @ Linaro



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/3] virtio-gpu: fix blob unmapping sequence
  2025-04-15 18:46 ` [PATCH v2 0/3] virtio-gpu: fix blob unmapping sequence Alex Bennée
  2025-04-16 18:57   ` Michael S. Tsirkin
@ 2025-04-16 19:26   ` Stefan Hajnoczi
  1 sibling, 0 replies; 14+ messages in thread
From: Stefan Hajnoczi @ 2025-04-16 19:26 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Manos Pitsidianakis, qemu-devel, Philippe Mathieu-Daudé,
	Richard Henderson, Michael S. Tsirkin

[-- Attachment #1: Type: text/plain, Size: 2145 bytes --]

On Tue, Apr 15, 2025 at 07:46:14PM +0100, Alex Bennée wrote:
> Manos Pitsidianakis <manos.pitsidianakis@linaro.org> writes:
> 
> > A hang was observed when running a small kernel that exercised VIRTIO 
> > GPU under TCG. This is an edge-case and won't happen under typical 
> > conditions.
> 
> Should I (or MST?) pull these into a tree for 10.0 or should they be
> grabbed for when the tree opens with a Cc qemu-stable?

QEMU 10.0.0-rc4 has already been tagged. No further patches will be
merged unless there is a show-stopper (build failure, security issue).

Please Cc qemu-stable so this can be merged for 10.0.1. Thanks!

> 
> >
> > When unmapping a blob object, its MemoryRegion's freeing is deferred to 
> > the RCU thread. The hang's cause was determined to be a busy main loop 
> > not allowing for the RCU thread to run because the kernel did not setup 
> > any timers or had any interrupts on the way. While fixing the RCU thread 
> > to run even if the guest CPU spins is a solution, it's easier to fix the 
> > reason why the MemoryRegion isn't freed from the main loop instead.
> >
> > While at it, also restructure the 3 stage cleanup to immediately respond 
> > to the guest if the MR happened to have had no other reference.
> >
> > PS: The hang can be reproduced by running this unikernel with TCG 
> >
> > https://git.codelinaro.org/manos.pitsidianakis/virtio-tests/-/tree/8c0ebe9395827e24aa5711186d499bf5de87cf63/virtio-test-suite
> >
> > v1 to v2:
> >   - Add patch by Alex to prevent double-free when FlatView is destroyed 
> >     from RCU thread.
> >
> > Alex Bennée (1):
> >   hw/display: re-arrange memory region tracking
> >
> > Manos Pitsidianakis (2):
> >   virtio-gpu: fix hang under TCG when unmapping blob
> >   virtio-gpu: refactor async blob unmapping
> >
> >  include/exec/memory.h         |  1 +
> >  hw/display/virtio-gpu-virgl.c | 60 ++++++++++++++++++++---------------
> >  2 files changed, 35 insertions(+), 26 deletions(-)
> >
> >
> > base-commit: 56c6e249b6988c1b6edc2dd34ebb0f1e570a1365
> 
> -- 
> Alex Bennée
> Virtualisation Tech Lead @ Linaro
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/3] virtio-gpu: fix blob unmapping sequence
  2025-04-16 18:57   ` Michael S. Tsirkin
@ 2025-04-17  8:00     ` Alex Bennée
  2025-04-17 11:37       ` Michael S. Tsirkin
  0 siblings, 1 reply; 14+ messages in thread
From: Alex Bennée @ 2025-04-17  8:00 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Manos Pitsidianakis, qemu-devel, Philippe Mathieu-Daudé,
	Richard Henderson, Stefan Hajnoczi

"Michael S. Tsirkin" <mst@redhat.com> writes:

> On Tue, Apr 15, 2025 at 07:46:14PM +0100, Alex Bennée wrote:
>> Manos Pitsidianakis <manos.pitsidianakis@linaro.org> writes:
>> 
>> > A hang was observed when running a small kernel that exercised VIRTIO 
>> > GPU under TCG. This is an edge-case and won't happen under typical 
>> > conditions.
>> 
>> Should I (or MST?) pull these into a tree for 10.0 or should they be
>> grabbed for when the tree opens with a Cc qemu-stable?
>
>
> Let's just agree who takes them, then it's up to that maintainer.
> You wanna merge them?

I'm happy to take them, I'll cc stable when the tree opens up again.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/3] virtio-gpu: fix blob unmapping sequence
  2025-04-10 12:26 [PATCH v2 0/3] virtio-gpu: fix blob unmapping sequence Manos Pitsidianakis
                   ` (3 preceding siblings ...)
  2025-04-15 18:46 ` [PATCH v2 0/3] virtio-gpu: fix blob unmapping sequence Alex Bennée
@ 2025-04-17  8:08 ` Alex Bennée
  4 siblings, 0 replies; 14+ messages in thread
From: Alex Bennée @ 2025-04-17  8:08 UTC (permalink / raw)
  To: Manos Pitsidianakis
  Cc: qemu-devel, Philippe Mathieu-Daudé, Richard Henderson

Manos Pitsidianakis <manos.pitsidianakis@linaro.org> writes:

> A hang was observed when running a small kernel that exercised VIRTIO 
> GPU under TCG. This is an edge-case and won't happen under typical 
> conditions.

Queued to virtio-gpu/next, thanks.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/3] virtio-gpu: fix blob unmapping sequence
  2025-04-17  8:00     ` Alex Bennée
@ 2025-04-17 11:37       ` Michael S. Tsirkin
  0 siblings, 0 replies; 14+ messages in thread
From: Michael S. Tsirkin @ 2025-04-17 11:37 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Manos Pitsidianakis, qemu-devel, Philippe Mathieu-Daudé,
	Richard Henderson, Stefan Hajnoczi

On Thu, Apr 17, 2025 at 09:00:40AM +0100, Alex Bennée wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
> 
> > On Tue, Apr 15, 2025 at 07:46:14PM +0100, Alex Bennée wrote:
> >> Manos Pitsidianakis <manos.pitsidianakis@linaro.org> writes:
> >> 
> >> > A hang was observed when running a small kernel that exercised VIRTIO 
> >> > GPU under TCG. This is an edge-case and won't happen under typical 
> >> > conditions.
> >> 
> >> Should I (or MST?) pull these into a tree for 10.0 or should they be
> >> grabbed for when the tree opens with a Cc qemu-stable?
> >
> >
> > Let's just agree who takes them, then it's up to that maintainer.
> > You wanna merge them?
> 
> I'm happy to take them, I'll cc stable when the tree opens up again.


Acked-by: Michael S. Tsirkin <mst@redhat.com>


> -- 
> Alex Bennée
> Virtualisation Tech Lead @ Linaro



^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2025-04-17 11:38 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-10 12:26 [PATCH v2 0/3] virtio-gpu: fix blob unmapping sequence Manos Pitsidianakis
2025-04-10 12:26 ` [PATCH v2 1/3] hw/display: re-arrange memory region tracking Manos Pitsidianakis
2025-04-15 10:35   ` Philippe Mathieu-Daudé
2025-04-15 11:21     ` Alex Bennée
2025-04-10 12:26 ` [PATCH v2 2/3] virtio-gpu: fix hang under TCG when unmapping blob Manos Pitsidianakis
2025-04-10 14:59   ` Alex Bennée
2025-04-10 12:26 ` [PATCH v2 3/3] virtio-gpu: refactor async blob unmapping Manos Pitsidianakis
2025-04-15 10:40   ` Philippe Mathieu-Daudé
2025-04-15 18:46 ` [PATCH v2 0/3] virtio-gpu: fix blob unmapping sequence Alex Bennée
2025-04-16 18:57   ` Michael S. Tsirkin
2025-04-17  8:00     ` Alex Bennée
2025-04-17 11:37       ` Michael S. Tsirkin
2025-04-16 19:26   ` Stefan Hajnoczi
2025-04-17  8:08 ` Alex Bennée

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).