qemu-devel.nongnu.org archive mirror
From: Gurchetan Singh <gurchetansingh@chromium.org>
To: qemu-devel@nongnu.org
Cc: pbonzini@redhat.com, philmd@linaro.org, david@redhat.com,
	stefanha@redhat.com, kraxel@redhat.com,
	marcandre.lureau@redhat.com, akihiko.odaki@gmail.com,
	dmitry.osipenko@collabora.com, ray.huang@amd.com,
	alex.bennee@linaro.org
Subject: [RFC PATCH 12/13] HACK: use memory region API to inject memory to guest
Date: Thu, 20 Apr 2023 18:12:22 -0700	[thread overview]
Message-ID: <20230421011223.718-13-gurchetansingh@chromium.org> (raw)
In-Reply-To: <20230421011223.718-1-gurchetansingh@chromium.org>

I just copied the patches that have been floating around that do
this, but the approach doesn't seem to work robustly.  The current
implementation is probably good enough to run vkcube or other simple
apps, but as soon as a test starts to aggressively map/unmap memory,
things explode on the QEMU side.

A simple way to reproduce is to run:

./deqp-vk --deqp-case=deqp-vk --deqp-case=dEQP-VK.memory.mapping.suballocation.*

You should get stack traces that sometimes look like this:

0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737316304448) at ./nptl/pthread_kill.c:44
1  __pthread_kill_internal (signo=6, threadid=140737316304448) at ./nptl/pthread_kill.c:78
2  __GI___pthread_kill (threadid=140737316304448, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
3  0x00007ffff7042476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
4  0x00007ffff70287f3 in __GI_abort () at ./stdlib/abort.c:79
5  0x00007ffff70896f6 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff71dbb8c "%s\n") at ../sysdeps/posix/libc_fatal.c:155
6  0x00007ffff70a0d7c in malloc_printerr (str=str@entry=0x7ffff71de7b0 "double free or corruption (out)") at ./malloc/malloc.c:5664
7  0x00007ffff70a2ef0 in _int_free (av=0x7ffff7219c80 <main_arena>, p=0x555557793e00, have_lock=<optimized out>) at ./malloc/malloc.c:4588
8  0x00007ffff70a54d3 in __GI___libc_free (mem=<optimized out>) at ./malloc/malloc.c:3391
9  0x0000555555d65e7e in phys_section_destroy (mr=0x555557793e10) at ../softmmu/physmem.c:1003
10 0x0000555555d65ed0 in phys_sections_free (map=0x555556d4b410) at ../softmmu/physmem.c:1011
11 0x0000555555d69578 in address_space_dispatch_free (d=0x555556d4b400) at ../softmmu/physmem.c:2430
12 0x0000555555d58412 in flatview_destroy (view=0x5555572bb090) at ../softmmu/memory.c:292
13 0x000055555600fd23 in call_rcu_thread (opaque=0x0) at ../util/rcu.c:284
14 0x00005555560026d4 in qemu_thread_start (args=0x5555569cafa0) at ../util/qemu-thread-posix.c:541
15 0x00007ffff7094b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
16 0x00007ffff7126a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

or this:

0x0000555555e1dc80 in object_unref (objptr=0x6d656d3c6b6e696c) at ../qom/object.c:1198
1198        g_assert(obj->ref > 0);
(gdb) bt
0  0x0000555555e1dc80 in object_unref (objptr=0x6d656d3c6b6e696c) at ../qom/object.c:1198
1  0x0000555555d5cca5 in memory_region_unref (mr=0x5555572b9e20) at ../softmmu/memory.c:1799
2  0x0000555555d65e47 in phys_section_destroy (mr=0x5555572b9e20) at ../softmmu/physmem.c:998
3  0x0000555555d65ec7 in phys_sections_free (map=0x5555588365c0) at ../softmmu/physmem.c:1011
4  0x0000555555d6956f in address_space_dispatch_free (d=0x5555588365b0) at ../softmmu/physmem.c:2430
5  0x0000555555d58409 in flatview_destroy (view=0x555558836570) at ../softmmu/memory.c:292
6  0x000055555600fd1a in call_rcu_thread (opaque=0x0) at ../util/rcu.c:284
7  0x00005555560026cb in qemu_thread_start (args=0x5555569cafa0) at ../util/qemu-thread-posix.c:541
8  0x00007ffff7094b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
9  0x00007ffff7126a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

The reason seems to be that the memory regions are destroyed on a
different thread (the RCU thread, per the traces above) than the
virtio-gpu thread that creates them, and that inevitably leads to
races.  The memory region docs[a] generally discourage doing this:

"In order to do this, as a general rule do not create or destroy
 memory regions dynamically during a device’s lifetime, and only
 call object_unparent() in the memory region owner’s instance_finalize
 callback. The dynamically allocated data structure that contains
 the memory region then should obviously be freed in the
 instance_finalize callback as well."
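
For reference, my reading of the pattern the docs recommend is
roughly the sketch below (names like MyDevState/my_dev_* are made up
for illustration; this is not code from this series):

    /* Sketch: a hypothetical device owning one dynamically allocated
     * MemoryRegion for its entire lifetime. */
    #include "qemu/osdep.h"
    #include "exec/memory.h"
    #include "hw/qdev-core.h"

    typedef struct MyDevState {
        DeviceState parent_obj;
        MemoryRegion *mr;            /* lives as long as the device */
    } MyDevState;

    static void my_dev_realize(DeviceState *dev, Error **errp)
    {
        MyDevState *s = (MyDevState *)dev;

        /* create the region once, up front */
        s->mr = g_new0(MemoryRegion, 1);
        memory_region_init_ram(s->mr, OBJECT(dev), "my-dev-ram",
                               4096, errp);
        /* ... added to a container that also lives for the whole
         * device lifetime ... */
    }

    static void my_dev_instance_finalize(Object *obj)
    {
        MyDevState *s = (MyDevState *)obj;

        /* per the docs, only here do we unparent and free it */
        object_unparent(OBJECT(s->mr));
        g_free(s->mr);
    }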

However, instance_finalize isn't called until the device is
destroyed, so holding on to the memory regions until then is
unlikely to be an option.  The tests do pass when virtio-gpu never
frees the memory, but the guest progressively becomes slower and
eventually OOMs.

The API does make an exception, though:

"There is an exception to the above rule: it is okay to call
object_unparent at any time for an alias or a container region. It is
therefore also okay to create or destroy alias and container regions
dynamically during a device’s lifetime."
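
My (possibly wrong) reading of what that exception covers is roughly
the sketch below, where the thing that comes and goes is an alias of
some longer-lived backing region (map_blob/unmap_blob/backing_mr are
made-up names; hostmem is the container from "virtio-gpu: hostmem").
It's not obvious to me how to get such a backing MemoryRegion from
the raw host pointer rutabaga returns, which is part of the question:

    /* Sketch: dynamically mapping/unmapping an *alias* into a
     * long-lived container region, which the docs say is allowed. */
    static void map_blob(MemoryRegion *hostmem, MemoryRegion *alias,
                         MemoryRegion *backing_mr,
                         hwaddr map_offset, uint64_t map_size)
    {
        memory_region_transaction_begin();
        memory_region_init_alias(alias, memory_region_owner(hostmem),
                                 "blob-alias", backing_mr, 0, map_size);
        memory_region_add_subregion(hostmem, map_offset, alias);
        memory_region_transaction_commit();
    }

    static void unmap_blob(MemoryRegion *hostmem, MemoryRegion *alias)
    {
        memory_region_transaction_begin();
        memory_region_del_subregion(hostmem, alias);
        object_unparent(OBJECT(alias)); /* OK at any time for aliases */
        memory_region_transaction_commit();
    }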

I believe we are trying to create a subregion within a container
here, but that still fails?  Are we doing it right?  Can any memory
region experts here help out?  The other relevant patch in this
series is "virtio-gpu: hostmem".

[a] https://qemu.readthedocs.io/en/latest/devel/memory.html

Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
---
 hw/display/virtio-gpu-rutabaga.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/hw/display/virtio-gpu-rutabaga.c b/hw/display/virtio-gpu-rutabaga.c
index 5fd1154198..196267aac2 100644
--- a/hw/display/virtio-gpu-rutabaga.c
+++ b/hw/display/virtio-gpu-rutabaga.c
@@ -159,6 +159,12 @@ static int32_t rutabaga_handle_unmap(VirtIOGPU *g,
     GET_VIRTIO_GPU_GL(g);
     GET_RUTABAGA(virtio_gpu);
 
+    memory_region_transaction_begin();
+    memory_region_set_enabled(&res->region, false);
+    memory_region_del_subregion(&g->parent_obj.hostmem, &res->region);
+    object_unparent(OBJECT(&res->region));
+    memory_region_transaction_commit();
+
     res->mapped = NULL;
     return rutabaga_resource_unmap(rutabaga, res->resource_id);
 }
@@ -671,6 +677,14 @@ rutabaga_cmd_resource_map_blob(VirtIOGPU *g,
     result = rutabaga_resource_map(rutabaga, mblob.resource_id, &mapping);
     CHECK_RESULT(result, cmd);
 
+    memory_region_transaction_begin();
+    memory_region_init_ram_device_ptr(&res->region, OBJECT(g), NULL,
+                                      mapping.size, (void *)mapping.ptr);
+    memory_region_add_subregion(&g->parent_obj.hostmem, mblob.offset,
+                                &res->region);
+    memory_region_set_enabled(&res->region, true);
+    memory_region_transaction_commit();
+
     memset(&resp, 0, sizeof(resp));
     resp.hdr.type = VIRTIO_GPU_RESP_OK_MAP_INFO;
     result = rutabaga_resource_map_info(rutabaga, mblob.resource_id,
-- 
2.40.0.634.g4ca3ef3211-goog



Thread overview: 28+ messages
2023-04-21  1:12 [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany? Gurchetan Singh
2023-04-21  1:12 ` [RFC PATCH 01/13] virtio: Add shared memory capability Gurchetan Singh
2023-04-21  1:12 ` [RFC PATCH 02/13] virtio-gpu: hostmem Gurchetan Singh
2023-04-21  1:12 ` [RFC PATCH 03/13] virtio-gpu blob prep: improve decoding and add memory region Gurchetan Singh
2023-04-21  1:12 ` [RFC PATCH 04/13] virtio-gpu: CONTEXT_INIT feature Gurchetan Singh
2023-04-21  1:12 ` [RFC PATCH 05/13] gfxstream + rutabaga prep: virtio_gpu_gl -> virtio_gpu_virgl Gurchetan Singh
2023-04-21  9:40   ` Philippe Mathieu-Daudé
2023-04-21  1:12 ` [RFC PATCH 06/13] gfxstream + rutabaga prep: make GL device more library agnostic Gurchetan Singh
2023-04-21  1:12 ` [RFC PATCH 07/13] gfxstream + rutabaga prep: define callbacks in realize function Gurchetan Singh
2023-04-21  9:53   ` Philippe Mathieu-Daudé
2023-04-21  1:12 ` [RFC PATCH 08/13] gfxstream + rutabaga prep: added need defintions, fields, and options Gurchetan Singh
2023-04-22 14:54   ` Akihiko Odaki
2023-04-21  1:12 ` [RFC PATCH 09/13] gfxstream + rutabaga: add required meson changes Gurchetan Singh
2023-04-21  1:12 ` [RFC PATCH 10/13] gfxstream + rutabaga: add initial support for gfxstream Gurchetan Singh
2023-04-22 15:37   ` Akihiko Odaki
2023-04-21  1:12 ` [RFC PATCH 11/13] gfxstream + rutabaga: enable rutabaga Gurchetan Singh
2023-04-21  1:12 ` Gurchetan Singh [this message]
2023-04-22 15:46   ` [RFC PATCH 12/13] HACK: use memory region API to inject memory to guest Akihiko Odaki
2023-04-25  0:08     ` Gurchetan Singh
2023-04-22 19:22   ` Peter Maydell
2023-04-21  1:12 ` [RFC PATCH 13/13] HACK: schedule fence return on main AIO context Gurchetan Singh
2023-04-21 15:59   ` Stefan Hajnoczi
2023-04-21 23:20     ` Gurchetan Singh
2023-04-25 12:04       ` Stefan Hajnoczi
2023-04-21 16:02 ` [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany? Stefan Hajnoczi
2023-04-21 23:54   ` Gurchetan Singh
2023-04-22 16:41     ` Akihiko Odaki
2023-04-25  0:16       ` Gurchetan Singh
