* [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany?
@ 2023-04-21 1:12 Gurchetan Singh
2023-04-21 1:12 ` [RFC PATCH 01/13] virtio: Add shared memory capability Gurchetan Singh
` (13 more replies)
0 siblings, 14 replies; 28+ messages in thread
From: Gurchetan Singh @ 2023-04-21 1:12 UTC (permalink / raw)
To: qemu-devel
Cc: pbonzini, philmd, david, stefanha, kraxel, marcandre.lureau,
akihiko.odaki, dmitry.osipenko, ray.huang, alex.bennee
From: Gurchetan Singh <gurchetansingh@google.com>
Rationale:
- gfxstream [a] is good for the Android Emulator/upstream QEMU
alignment
- Wayland passthrough [b] via the cross-domain context type is good
for Linux on Linux display virtualization
- rutabaga_gfx [c] sits on top of gfxstream, cross-domain and even
virglrenderer
- This series ports rutabaga_gfx to QEMU
Feedback requested:
- How is everyone feeling about gfxstream/rutabaga_gfx, especially UI
maintainers? I've been assuming it is a definite win, so if there's
a divergence of opinion on that, we should resolve that quickly.
- Need help from memory region API experts on "HACK: use memory region
API to inject memory to guest"
- Need help from QEMU multi-threaded experts on "HACK: schedule fence
return on main AIO context"
----------
| Longer |
----------
Dear all,
The people have demanded it, and we have listened. Just the other
day, some kids came up to me on the street -- hat in hand, teardrops
in their eyes -- and asked "please sir, can you perchance port
gfxstream and rutabaga_gfx to upstream QEMU?". I honestly can't take
it anymore.
In a way, I can understand the fanaticism of the gfxstreamists -- the
benefits of gfxstream + rutabaga_gfx in upstream QEMU are massive for
all involved:
(i) Android Emulator aligned with QEMU
The biggest use case is no doubt the Android Emulator. Although used
by millions of developers around the world [d][e], the Android Emulator
itself currently uses a forked QEMU 2.12. The initial fork
happened in the early days of Android (circa 2006 [f]) and while the
situation has improved, a QEMU update inside the Android Emulator only
happens once every 3-5 years. Indeed, most Android Emulator developers
aren't even subscribed to qemu-devel@ given this situation. Their
task is often to get the next foldable config working or fix that UI
bug, but long term technical debt is something that is rarely
prioritized.
This is one of those years when QEMU will be upreved, though. Soon, the
emulator will be based on QEMU 7.2 and new controls will be instituted
to make QEMU modifications harder. Things that can be upstreamed
will be upstreamed.
One of the biggest downstream pieces of the Android Emulator is the
gfxstream graphics stack, and it has some nontrivial features that
aren't easy to implement elsewhere [g].
The lore of gfxstream is detailed in patch 10, but suffice to say
getting gfxstream mainlined would help move the Android Emulator out
of its downstream mud hut into the light, love and compassion of
upstream.
(ii) Wayland passthrough
For the Linux guest on Linux host use case, we've elected to port
rutabaga_gfx into QEMU rather than gfxstream. rutabaga_gfx sits on
top of gfxstream, virglrenderer, and the cross-domain context type.
With the cross-domain context type, one can avoid a guest compositor
pass to display VM windows like normal host windows. It's now
possible to run the examples found in the crosvm book [h] with this
patchset. There are a few problems [i], but fixing them is O(days).
This use case is less strong than the Android Emulator one, since
anyone who would play a game in a Linux guest via QEMU would be able
to run it natively. But it could be good for developers who need to
test code in a virtual machine.
------------------
| Issues |
------------------
The two biggest unsolved issues are the last two "HACK:" patches.
Feedback from QEMU memory management and threading experts would be
greatly appreciated.
------------------
| UI integration |
------------------
This patchset intentionally uses the simplest KMS display integration
possible: framebuffer copies to Pixman. The reason is Linux guests
are expected to use Wayland Passthrough, and the Android Emulator UI
integration is very complex. gfxstream doesn't have a "context 0"
like virglrenderer that can force synchronization between QEMU's and
gfxstream's GL code.
Initially, we just want to run the Android Emulator in headless mode,
and we have a few subsequent followup ideas in mind for UI integration
(all with the goal of being minimally invasive for QEMU). Note: even
with Android in headless mode, QEMU upstream will be used in production
and not just be a developer toy.
--------------------------
| Packaging / Versioning |
--------------------------
We have to build QEMU from sources due to compliance reasons, so we
haven't created Debian packages for either gfxstream or rutabaga_gfx
yet. QEMU is upstream of Debian/Portage anyway. Let us know the
standard on packaging and we should be able to follow it.
Versioning would be keyed on initial merge into QEMU.
--------------------------
| Testing |
--------------------------
A document on how to test the patchset is available on QEMU Gitlab [j].
[a] https://android.googlesource.com/device/generic/vulkan-cereal/
[b] https://www.youtube.com/watch?v=OZJiHMtIQ2M
[c] https://github.com/google/crosvm/blob/main/rutabaga_gfx/ffi/src/include/rutabaga_gfx_ffi.h
[d] https://www.youtube.com/watch?v=LgRRmgfrFQM
[e] https://maltewolfcastle.medium.com/how-to-setup-an-automotive-android-emulator-f287a4061b19
[f] https://groups.google.com/g/android-emulator-dev/c/dltBnUW_HzU
[g] https://lists.gnu.org/archive/html/qemu-devel/2023-03/msg04271.html
[h] https://crosvm.dev/book/devices/wayland.html
[i] https://github.com/talex5/wayland-proxy-virtwl/blob/master/virtio-spec.md#problem
[j] https://gitlab.com/qemu-project/qemu/-/issues/1611
Antonio Caggiano (2):
virtio-gpu blob prep: improve decoding and add memory region
virtio-gpu: CONTEXT_INIT feature
Dr. David Alan Gilbert (1):
virtio: Add shared memory capability
Gerd Hoffmann (1):
virtio-gpu: hostmem
Gurchetan Singh (9):
gfxstream + rutabaga prep: virtio_gpu_gl -> virtio_gpu_virgl
gfxstream + rutabaga prep: make GL device more library agnostic
gfxstream + rutabaga prep: define callbacks in realize function
gfxstream + rutabaga prep: add needed definitions, fields, and options
gfxstream + rutabaga: add required meson changes
gfxstream + rutabaga: add initial support for gfxstream
gfxstream + rutabaga: enable rutabaga
HACK: use memory region API to inject memory to guest
HACK: schedule fence return on main AIO context
hw/display/meson.build | 40 +-
hw/display/virtio-gpu-base.c | 6 +-
hw/display/virtio-gpu-gl.c | 121 +--
hw/display/virtio-gpu-pci.c | 14 +
hw/display/virtio-gpu-rutabaga-stubs.c | 8 +
hw/display/virtio-gpu-rutabaga.c | 1032 ++++++++++++++++++++++++
hw/display/virtio-gpu-virgl-stubs.c | 8 +
hw/display/virtio-gpu-virgl.c | 138 +++-
hw/display/virtio-gpu.c | 17 +-
hw/display/virtio-vga.c | 33 +-
hw/virtio/virtio-pci.c | 18 +
include/hw/virtio/virtio-gpu-bswap.h | 18 +
include/hw/virtio/virtio-gpu.h | 38 +-
include/hw/virtio/virtio-pci.h | 4 +
meson.build | 8 +
meson_options.txt | 2 +
scripts/meson-buildoptions.sh | 3 +
17 files changed, 1356 insertions(+), 152 deletions(-)
create mode 100644 hw/display/virtio-gpu-rutabaga-stubs.c
create mode 100644 hw/display/virtio-gpu-rutabaga.c
create mode 100644 hw/display/virtio-gpu-virgl-stubs.c
--
2.40.0.634.g4ca3ef3211-goog
^ permalink raw reply [flat|nested] 28+ messages in thread
* [RFC PATCH 01/13] virtio: Add shared memory capability
2023-04-21 1:12 [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany? Gurchetan Singh
@ 2023-04-21 1:12 ` Gurchetan Singh
2023-04-21 1:12 ` [RFC PATCH 02/13] virtio-gpu: hostmem Gurchetan Singh
` (12 subsequent siblings)
13 siblings, 0 replies; 28+ messages in thread
From: Gurchetan Singh @ 2023-04-21 1:12 UTC (permalink / raw)
To: qemu-devel
Cc: pbonzini, philmd, david, stefanha, kraxel, marcandre.lureau,
akihiko.odaki, dmitry.osipenko, ray.huang, alex.bennee
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Define a new capability type 'VIRTIO_PCI_CAP_SHARED_MEMORY_CFG' to allow
defining shared memory regions with sizes and offsets of 2^32 and more.
Multiple instances of the capability are allowed and distinguished
by a device-specific 'id'.
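As a standalone illustration of the split this capability performs, the
sketch below models how a 64-bit offset/length is carried as two 32-bit
halves (the real `virtio_pci_cap64` additionally stores each half
little-endian via `cpu_to_le32()`; struct and function names here are
illustrative, not the QEMU code):

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for struct virtio_pci_cap64: a 64-bit value is
 * carried as a low 32-bit half in the base cap plus a *_hi extension,
 * so shared memory regions may start at or span beyond 4 GiB. */
struct shm_cap_sketch {
    uint32_t offset;      /* low 32 bits  */
    uint32_t offset_hi;   /* high 32 bits */
    uint32_t length;
    uint32_t length_hi;
    uint8_t  id;          /* distinguishes multiple shm regions */
};

static void shm_cap_fill(struct shm_cap_sketch *cap, uint64_t offset,
                         uint64_t length, uint8_t id)
{
    cap->offset    = (uint32_t)offset;
    cap->offset_hi = (uint32_t)(offset >> 32);
    cap->length    = (uint32_t)length;
    cap->length_hi = (uint32_t)(length >> 32);
    cap->id        = id;
}

/* Reassemble the 64-bit offset the way a guest driver would. */
static uint64_t shm_cap_offset(const struct shm_cap_sketch *cap)
{
    return (uint64_t)cap->offset_hi << 32 | cap->offset;
}
```

For example, an 8 GiB offset (0x2_0000_0000) splits into offset = 0 and
offset_hi = 2, which a 32-bit-only capability could not express.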
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Antonio Caggiano <antonio.caggiano@collabora.com>
---
hw/virtio/virtio-pci.c | 18 ++++++++++++++++++
include/hw/virtio/virtio-pci.h | 4 ++++
2 files changed, 22 insertions(+)
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 247325c193..08ba76fca2 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1399,6 +1399,24 @@ static int virtio_pci_add_mem_cap(VirtIOPCIProxy *proxy,
return offset;
}
+int virtio_pci_add_shm_cap(VirtIOPCIProxy *proxy,
+ uint8_t bar, uint64_t offset, uint64_t length,
+ uint8_t id)
+{
+ struct virtio_pci_cap64 cap = {
+ .cap.cap_len = sizeof cap,
+ .cap.cfg_type = VIRTIO_PCI_CAP_SHARED_MEMORY_CFG,
+ };
+
+ cap.cap.bar = bar;
+ cap.cap.length = cpu_to_le32(length);
+ cap.length_hi = cpu_to_le32(length >> 32);
+ cap.cap.offset = cpu_to_le32(offset);
+ cap.offset_hi = cpu_to_le32(offset >> 32);
+ cap.cap.id = id;
+ return virtio_pci_add_mem_cap(proxy, &cap.cap);
+}
+
static uint64_t virtio_pci_common_read(void *opaque, hwaddr addr,
unsigned size)
{
diff --git a/include/hw/virtio/virtio-pci.h b/include/hw/virtio/virtio-pci.h
index ab2051b64b..5a3f182f99 100644
--- a/include/hw/virtio/virtio-pci.h
+++ b/include/hw/virtio/virtio-pci.h
@@ -264,4 +264,8 @@ unsigned virtio_pci_optimal_num_queues(unsigned fixed_queues);
void virtio_pci_set_guest_notifier_fd_handler(VirtIODevice *vdev, VirtQueue *vq,
int n, bool assign,
bool with_irqfd);
+
+int virtio_pci_add_shm_cap(VirtIOPCIProxy *proxy, uint8_t bar, uint64_t offset,
+ uint64_t length, uint8_t id);
+
#endif
--
2.40.0.634.g4ca3ef3211-goog
* [RFC PATCH 02/13] virtio-gpu: hostmem
2023-04-21 1:12 [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany? Gurchetan Singh
2023-04-21 1:12 ` [RFC PATCH 01/13] virtio: Add shared memory capability Gurchetan Singh
@ 2023-04-21 1:12 ` Gurchetan Singh
2023-04-21 1:12 ` [RFC PATCH 03/13] virtio-gpu blob prep: improve decoding and add memory region Gurchetan Singh
` (11 subsequent siblings)
13 siblings, 0 replies; 28+ messages in thread
From: Gurchetan Singh @ 2023-04-21 1:12 UTC (permalink / raw)
To: qemu-devel
Cc: pbonzini, philmd, david, stefanha, kraxel, marcandre.lureau,
akihiko.odaki, dmitry.osipenko, ray.huang, alex.bennee
From: Gerd Hoffmann <kraxel@redhat.com>
Use VIRTIO_GPU_SHM_ID_HOST_VISIBLE as id for virtio-gpu.
Signed-off-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
hw/display/virtio-gpu-pci.c | 14 ++++++++++++++
hw/display/virtio-gpu.c | 1 +
hw/display/virtio-vga.c | 33 ++++++++++++++++++++++++---------
include/hw/virtio/virtio-gpu.h | 5 +++++
4 files changed, 44 insertions(+), 9 deletions(-)
diff --git a/hw/display/virtio-gpu-pci.c b/hw/display/virtio-gpu-pci.c
index 93f214ff58..da6a99f038 100644
--- a/hw/display/virtio-gpu-pci.c
+++ b/hw/display/virtio-gpu-pci.c
@@ -33,6 +33,20 @@ static void virtio_gpu_pci_base_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
DeviceState *vdev = DEVICE(g);
int i;
+ if (virtio_gpu_hostmem_enabled(g->conf)) {
+ vpci_dev->msix_bar_idx = 1;
+ vpci_dev->modern_mem_bar_idx = 2;
+ memory_region_init(&g->hostmem, OBJECT(g), "virtio-gpu-hostmem",
+ g->conf.hostmem);
+ pci_register_bar(&vpci_dev->pci_dev, 4,
+ PCI_BASE_ADDRESS_SPACE_MEMORY |
+ PCI_BASE_ADDRESS_MEM_PREFETCH |
+ PCI_BASE_ADDRESS_MEM_TYPE_64,
+ &g->hostmem);
+ virtio_pci_add_shm_cap(vpci_dev, 4, 0, g->conf.hostmem,
+ VIRTIO_GPU_SHM_ID_HOST_VISIBLE);
+ }
+
virtio_pci_force_virtio_1(vpci_dev);
if (!qdev_realize(vdev, BUS(&vpci_dev->bus), errp)) {
return;
diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index 5e15c79b94..7b592f998d 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -1425,6 +1425,7 @@ static Property virtio_gpu_properties[] = {
256 * MiB),
DEFINE_PROP_BIT("blob", VirtIOGPU, parent_obj.conf.flags,
VIRTIO_GPU_FLAG_BLOB_ENABLED, false),
+ DEFINE_PROP_SIZE("hostmem", VirtIOGPU, parent_obj.conf.hostmem, 0),
DEFINE_PROP_END_OF_LIST(),
};
diff --git a/hw/display/virtio-vga.c b/hw/display/virtio-vga.c
index e6fb0aa876..c8552ff760 100644
--- a/hw/display/virtio-vga.c
+++ b/hw/display/virtio-vga.c
@@ -115,17 +115,32 @@ static void virtio_vga_base_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
pci_register_bar(&vpci_dev->pci_dev, 0,
PCI_BASE_ADDRESS_MEM_PREFETCH, &vga->vram);
- /*
- * Configure virtio bar and regions
- *
- * We use bar #2 for the mmio regions, to be compatible with stdvga.
- * virtio regions are moved to the end of bar #2, to make room for
- * the stdvga mmio registers at the start of bar #2.
- */
- vpci_dev->modern_mem_bar_idx = 2;
- vpci_dev->msix_bar_idx = 4;
vpci_dev->modern_io_bar_idx = 5;
+ if (!virtio_gpu_hostmem_enabled(g->conf)) {
+ /*
+ * Configure virtio bar and regions
+ *
+ * We use bar #2 for the mmio regions, to be compatible with stdvga.
+ * virtio regions are moved to the end of bar #2, to make room for
+ * the stdvga mmio registers at the start of bar #2.
+ */
+ vpci_dev->modern_mem_bar_idx = 2;
+ vpci_dev->msix_bar_idx = 4;
+ } else {
+ vpci_dev->msix_bar_idx = 1;
+ vpci_dev->modern_mem_bar_idx = 2;
+ memory_region_init(&g->hostmem, OBJECT(g), "virtio-gpu-hostmem",
+ g->conf.hostmem);
+ pci_register_bar(&vpci_dev->pci_dev, 4,
+ PCI_BASE_ADDRESS_SPACE_MEMORY |
+ PCI_BASE_ADDRESS_MEM_PREFETCH |
+ PCI_BASE_ADDRESS_MEM_TYPE_64,
+ &g->hostmem);
+ virtio_pci_add_shm_cap(vpci_dev, 4, 0, g->conf.hostmem,
+ VIRTIO_GPU_SHM_ID_HOST_VISIBLE);
+ }
+
if (!(vpci_dev->flags & VIRTIO_PCI_FLAG_PAGE_PER_VQ)) {
/*
* with page-per-vq=off there is no padding space we can use
diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
index 2e28507efe..eafce75b04 100644
--- a/include/hw/virtio/virtio-gpu.h
+++ b/include/hw/virtio/virtio-gpu.h
@@ -102,12 +102,15 @@ enum virtio_gpu_base_conf_flags {
(_cfg.flags & (1 << VIRTIO_GPU_FLAG_DMABUF_ENABLED))
#define virtio_gpu_blob_enabled(_cfg) \
(_cfg.flags & (1 << VIRTIO_GPU_FLAG_BLOB_ENABLED))
+#define virtio_gpu_hostmem_enabled(_cfg) \
+ (_cfg.hostmem > 0)
struct virtio_gpu_base_conf {
uint32_t max_outputs;
uint32_t flags;
uint32_t xres;
uint32_t yres;
+ uint64_t hostmem;
};
struct virtio_gpu_ctrl_command {
@@ -131,6 +134,8 @@ struct VirtIOGPUBase {
int renderer_blocked;
int enable;
+ MemoryRegion hostmem;
+
struct virtio_gpu_scanout scanout[VIRTIO_GPU_MAX_SCANOUTS];
int enabled_output_bitmask;
--
2.40.0.634.g4ca3ef3211-goog
* [RFC PATCH 03/13] virtio-gpu blob prep: improve decoding and add memory region
2023-04-21 1:12 [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany? Gurchetan Singh
2023-04-21 1:12 ` [RFC PATCH 01/13] virtio: Add shared memory capability Gurchetan Singh
2023-04-21 1:12 ` [RFC PATCH 02/13] virtio-gpu: hostmem Gurchetan Singh
@ 2023-04-21 1:12 ` Gurchetan Singh
2023-04-21 1:12 ` [RFC PATCH 04/13] virtio-gpu: CONTEXT_INIT feature Gurchetan Singh
` (10 subsequent siblings)
13 siblings, 0 replies; 28+ messages in thread
From: Gurchetan Singh @ 2023-04-21 1:12 UTC (permalink / raw)
To: qemu-devel
Cc: pbonzini, philmd, david, stefanha, kraxel, marcandre.lureau,
akihiko.odaki, dmitry.osipenko, ray.huang, alex.bennee
From: Antonio Caggiano <antonio.caggiano@collabora.com>
This adds preparatory functions needed to:
- decode blob cmds
- track memory regions associated with mappable blobs
- track iovecs
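A standalone model of what the new bswap helpers do may help: they fix up
guest-supplied little-endian fields in place, regardless of host
endianness (the helper names below are illustrative stand-ins for QEMU's
`le32_to_cpus()`/`le64_to_cpus()`, and the struct is a trimmed-down
stand-in for `virtio_gpu_resource_map_blob`):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* In-place little-endian fixup: reinterpret the stored bytes as LE,
 * which is a no-op on little-endian hosts and a swap on big-endian. */
static void le32_to_cpus_sketch(uint32_t *v)
{
    uint8_t b[4];
    memcpy(b, v, 4);
    *v = (uint32_t)b[0] | (uint32_t)b[1] << 8 |
         (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
}

static void le64_to_cpus_sketch(uint64_t *v)
{
    uint8_t b[8];
    uint64_t r = 0;
    memcpy(b, v, 8);
    for (int i = 7; i >= 0; i--) {
        r = r << 8 | b[i];
    }
    *v = r;
}

/* Trimmed-down stand-in for the MAP_BLOB command payload. */
struct map_blob_sketch {
    uint32_t resource_id;
    uint64_t offset;
};

static void map_blob_bswap_sketch(struct map_blob_sketch *m)
{
    le32_to_cpus_sketch(&m->resource_id);
    le64_to_cpus_sketch(&m->offset);
}
```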
Signed-off-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
---
hw/display/virtio-gpu.c | 11 +++--------
include/hw/virtio/virtio-gpu-bswap.h | 18 ++++++++++++++++++
include/hw/virtio/virtio-gpu.h | 8 ++++++++
3 files changed, 29 insertions(+), 8 deletions(-)
diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index 7b592f998d..938eed9181 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -33,16 +33,11 @@
#define VIRTIO_GPU_VM_VERSION 1
-static struct virtio_gpu_simple_resource*
-virtio_gpu_find_resource(VirtIOGPU *g, uint32_t resource_id);
static struct virtio_gpu_simple_resource *
virtio_gpu_find_check_resource(VirtIOGPU *g, uint32_t resource_id,
bool require_backing,
const char *caller, uint32_t *error);
-static void virtio_gpu_cleanup_mapping(VirtIOGPU *g,
- struct virtio_gpu_simple_resource *res);
-
void virtio_gpu_update_cursor_data(VirtIOGPU *g,
struct virtio_gpu_scanout *s,
uint32_t resource_id)
@@ -115,7 +110,7 @@ static void update_cursor(VirtIOGPU *g, struct virtio_gpu_update_cursor *cursor)
cursor->resource_id ? 1 : 0);
}
-static struct virtio_gpu_simple_resource *
+struct virtio_gpu_simple_resource *
virtio_gpu_find_resource(VirtIOGPU *g, uint32_t resource_id)
{
struct virtio_gpu_simple_resource *res;
@@ -872,8 +867,8 @@ void virtio_gpu_cleanup_mapping_iov(VirtIOGPU *g,
g_free(iov);
}
-static void virtio_gpu_cleanup_mapping(VirtIOGPU *g,
- struct virtio_gpu_simple_resource *res)
+void virtio_gpu_cleanup_mapping(VirtIOGPU *g,
+ struct virtio_gpu_simple_resource *res)
{
virtio_gpu_cleanup_mapping_iov(g, res->iov, res->iov_cnt);
res->iov = NULL;
diff --git a/include/hw/virtio/virtio-gpu-bswap.h b/include/hw/virtio/virtio-gpu-bswap.h
index 9124108485..dd1975e2d4 100644
--- a/include/hw/virtio/virtio-gpu-bswap.h
+++ b/include/hw/virtio/virtio-gpu-bswap.h
@@ -63,10 +63,28 @@ virtio_gpu_create_blob_bswap(struct virtio_gpu_resource_create_blob *cblob)
{
virtio_gpu_ctrl_hdr_bswap(&cblob->hdr);
le32_to_cpus(&cblob->resource_id);
+ le32_to_cpus(&cblob->blob_mem);
le32_to_cpus(&cblob->blob_flags);
+ le32_to_cpus(&cblob->nr_entries);
+ le64_to_cpus(&cblob->blob_id);
le64_to_cpus(&cblob->size);
}
+static inline void
+virtio_gpu_map_blob_bswap(struct virtio_gpu_resource_map_blob *mblob)
+{
+ virtio_gpu_ctrl_hdr_bswap(&mblob->hdr);
+ le32_to_cpus(&mblob->resource_id);
+ le64_to_cpus(&mblob->offset);
+}
+
+static inline void
+virtio_gpu_unmap_blob_bswap(struct virtio_gpu_resource_unmap_blob *ublob)
+{
+ virtio_gpu_ctrl_hdr_bswap(&ublob->hdr);
+ le32_to_cpus(&ublob->resource_id);
+}
+
static inline void
virtio_gpu_scanout_blob_bswap(struct virtio_gpu_set_scanout_blob *ssb)
{
diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
index eafce75b04..326988e4d5 100644
--- a/include/hw/virtio/virtio-gpu.h
+++ b/include/hw/virtio/virtio-gpu.h
@@ -55,6 +55,9 @@ struct virtio_gpu_simple_resource {
int dmabuf_fd;
uint8_t *remapped;
+ MemoryRegion region;
+ void *mapped;
+
QTAILQ_ENTRY(virtio_gpu_simple_resource) next;
};
@@ -245,6 +248,9 @@ void virtio_gpu_base_fill_display_info(VirtIOGPUBase *g,
struct virtio_gpu_resp_display_info *dpy_info);
/* virtio-gpu.c */
+struct virtio_gpu_simple_resource *
+virtio_gpu_find_resource(VirtIOGPU *g, uint32_t resource_id);
+
void virtio_gpu_ctrl_response(VirtIOGPU *g,
struct virtio_gpu_ctrl_command *cmd,
struct virtio_gpu_ctrl_hdr *resp,
@@ -263,6 +269,8 @@ int virtio_gpu_create_mapping_iov(VirtIOGPU *g,
uint32_t *niov);
void virtio_gpu_cleanup_mapping_iov(VirtIOGPU *g,
struct iovec *iov, uint32_t count);
+void virtio_gpu_cleanup_mapping(VirtIOGPU *g,
+ struct virtio_gpu_simple_resource *res);
void virtio_gpu_process_cmdq(VirtIOGPU *g);
void virtio_gpu_device_realize(DeviceState *qdev, Error **errp);
void virtio_gpu_reset(VirtIODevice *vdev);
--
2.40.0.634.g4ca3ef3211-goog
* [RFC PATCH 04/13] virtio-gpu: CONTEXT_INIT feature
2023-04-21 1:12 [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany? Gurchetan Singh
` (2 preceding siblings ...)
2023-04-21 1:12 ` [RFC PATCH 03/13] virtio-gpu blob prep: improve decoding and add memory region Gurchetan Singh
@ 2023-04-21 1:12 ` Gurchetan Singh
2023-04-21 1:12 ` [RFC PATCH 05/13] gfxstream + rutabaga prep: virtio_gpu_gl -> virtio_gpu_virgl Gurchetan Singh
` (9 subsequent siblings)
13 siblings, 0 replies; 28+ messages in thread
From: Gurchetan Singh @ 2023-04-21 1:12 UTC (permalink / raw)
To: qemu-devel
Cc: pbonzini, philmd, david, stefanha, kraxel, marcandre.lureau,
akihiko.odaki, dmitry.osipenko, ray.huang, alex.bennee
From: Antonio Caggiano <antonio.caggiano@collabora.com>
The feature can be enabled when a backend wants it.
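The negotiation itself is just an extra bit in the 64-bit feature word,
offered only when the backend asked for it. A minimal sketch of the
pattern the patch follows (the bit positions below are illustrative
constants, not taken from the virtio-gpu spec headers):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative feature-bit positions; the real values come from the
 * virtio-gpu specification headers. */
enum {
    SKETCH_F_RESOURCE_BLOB = 3,
    SKETCH_F_CONTEXT_INIT  = 4,
};

/* Mirror of the get_features pattern: advertise a feature bit only when
 * the corresponding backend capability is configured. */
static uint64_t get_features_sketch(uint64_t features, bool blob_enabled,
                                    bool context_init_enabled)
{
    if (blob_enabled) {
        features |= 1ULL << SKETCH_F_RESOURCE_BLOB;
    }
    if (context_init_enabled) {
        features |= 1ULL << SKETCH_F_CONTEXT_INIT;
    }
    return features;
}
```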
Signed-off-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
---
hw/display/virtio-gpu-base.c | 3 +++
include/hw/virtio/virtio-gpu.h | 3 +++
2 files changed, 6 insertions(+)
diff --git a/hw/display/virtio-gpu-base.c b/hw/display/virtio-gpu-base.c
index a29f191aa8..6c5f1f327f 100644
--- a/hw/display/virtio-gpu-base.c
+++ b/hw/display/virtio-gpu-base.c
@@ -215,6 +215,9 @@ virtio_gpu_base_get_features(VirtIODevice *vdev, uint64_t features,
if (virtio_gpu_blob_enabled(g->conf)) {
features |= (1 << VIRTIO_GPU_F_RESOURCE_BLOB);
}
+ if (virtio_gpu_context_init_enabled(g->conf)) {
+ features |= (1 << VIRTIO_GPU_F_CONTEXT_INIT);
+ }
return features;
}
diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
index 326988e4d5..adee17968d 100644
--- a/include/hw/virtio/virtio-gpu.h
+++ b/include/hw/virtio/virtio-gpu.h
@@ -93,6 +93,7 @@ enum virtio_gpu_base_conf_flags {
VIRTIO_GPU_FLAG_EDID_ENABLED,
VIRTIO_GPU_FLAG_DMABUF_ENABLED,
VIRTIO_GPU_FLAG_BLOB_ENABLED,
+ VIRTIO_GPU_FLAG_CONTEXT_INIT_ENABLED,
};
#define virtio_gpu_virgl_enabled(_cfg) \
@@ -107,6 +108,8 @@ enum virtio_gpu_base_conf_flags {
(_cfg.flags & (1 << VIRTIO_GPU_FLAG_BLOB_ENABLED))
#define virtio_gpu_hostmem_enabled(_cfg) \
(_cfg.hostmem > 0)
+#define virtio_gpu_context_init_enabled(_cfg) \
+ (_cfg.flags & (1 << VIRTIO_GPU_FLAG_CONTEXT_INIT_ENABLED))
struct virtio_gpu_base_conf {
uint32_t max_outputs;
--
2.40.0.634.g4ca3ef3211-goog
* [RFC PATCH 05/13] gfxstream + rutabaga prep: virtio_gpu_gl -> virtio_gpu_virgl
2023-04-21 1:12 [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany? Gurchetan Singh
` (3 preceding siblings ...)
2023-04-21 1:12 ` [RFC PATCH 04/13] virtio-gpu: CONTEXT_INIT feature Gurchetan Singh
@ 2023-04-21 1:12 ` Gurchetan Singh
2023-04-21 9:40 ` Philippe Mathieu-Daudé
2023-04-21 1:12 ` [RFC PATCH 06/13] gfxstream + rutabaga prep: make GL device more library agnostic Gurchetan Singh
` (8 subsequent siblings)
13 siblings, 1 reply; 28+ messages in thread
From: Gurchetan Singh @ 2023-04-21 1:12 UTC (permalink / raw)
To: qemu-devel
Cc: pbonzini, philmd, david, stefanha, kraxel, marcandre.lureau,
akihiko.odaki, dmitry.osipenko, ray.huang, alex.bennee
The virtio-gpu GL device has a heavy dependence on virgl.
Acknowledge this by naming functions accurately.
Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
---
hw/display/virtio-gpu-gl.c | 27 ++++++++++++++-------------
hw/display/virtio-gpu-virgl.c | 2 +-
include/hw/virtio/virtio-gpu.h | 2 +-
3 files changed, 16 insertions(+), 15 deletions(-)
diff --git a/hw/display/virtio-gpu-gl.c b/hw/display/virtio-gpu-gl.c
index e06be60dfb..dc648aacb2 100644
--- a/hw/display/virtio-gpu-gl.c
+++ b/hw/display/virtio-gpu-gl.c
@@ -25,9 +25,10 @@
#include <virglrenderer.h>
-static void virtio_gpu_gl_update_cursor_data(VirtIOGPU *g,
- struct virtio_gpu_scanout *s,
- uint32_t resource_id)
+static void
+virtio_gpu_virgl_update_cursor(VirtIOGPU *g,
+ struct virtio_gpu_scanout *s,
+ uint32_t resource_id)
{
uint32_t width, height;
uint32_t pixels, *data;
@@ -48,14 +49,14 @@ static void virtio_gpu_gl_update_cursor_data(VirtIOGPU *g,
free(data);
}
-static void virtio_gpu_gl_flushed(VirtIOGPUBase *b)
+static void virtio_gpu_virgl_flushed(VirtIOGPUBase *b)
{
VirtIOGPU *g = VIRTIO_GPU(b);
virtio_gpu_process_cmdq(g);
}
-static void virtio_gpu_gl_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
+static void virtio_gpu_virgl_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
{
VirtIOGPU *g = VIRTIO_GPU(vdev);
VirtIOGPUGL *gl = VIRTIO_GPU_GL(vdev);
@@ -71,7 +72,7 @@ static void virtio_gpu_gl_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
}
if (gl->renderer_reset) {
gl->renderer_reset = false;
- virtio_gpu_virgl_reset(g);
+ virtio_gpu_virglrenderer_reset(g);
}
cmd = virtqueue_pop(vq, sizeof(struct virtio_gpu_ctrl_command));
@@ -87,7 +88,7 @@ static void virtio_gpu_gl_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
virtio_gpu_virgl_fence_poll(g);
}
-static void virtio_gpu_gl_reset(VirtIODevice *vdev)
+static void virtio_gpu_virgl_reset(VirtIODevice *vdev)
{
VirtIOGPU *g = VIRTIO_GPU(vdev);
VirtIOGPUGL *gl = VIRTIO_GPU_GL(vdev);
@@ -104,7 +105,7 @@ static void virtio_gpu_gl_reset(VirtIODevice *vdev)
}
}
-static void virtio_gpu_gl_device_realize(DeviceState *qdev, Error **errp)
+static void virtio_gpu_virgl_device_realize(DeviceState *qdev, Error **errp)
{
VirtIOGPU *g = VIRTIO_GPU(qdev);
@@ -143,13 +144,13 @@ static void virtio_gpu_gl_class_init(ObjectClass *klass, void *data)
VirtIOGPUBaseClass *vbc = VIRTIO_GPU_BASE_CLASS(klass);
VirtIOGPUClass *vgc = VIRTIO_GPU_CLASS(klass);
- vbc->gl_flushed = virtio_gpu_gl_flushed;
- vgc->handle_ctrl = virtio_gpu_gl_handle_ctrl;
+ vbc->gl_flushed = virtio_gpu_virgl_flushed;
+ vgc->handle_ctrl = virtio_gpu_virgl_handle_ctrl;
vgc->process_cmd = virtio_gpu_virgl_process_cmd;
- vgc->update_cursor_data = virtio_gpu_gl_update_cursor_data;
+ vgc->update_cursor_data = virtio_gpu_virgl_update_cursor;
- vdc->realize = virtio_gpu_gl_device_realize;
- vdc->reset = virtio_gpu_gl_reset;
+ vdc->realize = virtio_gpu_virgl_device_realize;
+ vdc->reset = virtio_gpu_virgl_reset;
device_class_set_props(dc, virtio_gpu_gl_properties);
}
diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index 1c47603d40..f91d33ce13 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -599,7 +599,7 @@ void virtio_gpu_virgl_reset_scanout(VirtIOGPU *g)
}
}
-void virtio_gpu_virgl_reset(VirtIOGPU *g)
+void virtio_gpu_virglrenderer_reset(VirtIOGPU *g)
{
virgl_renderer_reset();
}
diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
index adee17968d..e256e44172 100644
--- a/include/hw/virtio/virtio-gpu.h
+++ b/include/hw/virtio/virtio-gpu.h
@@ -297,7 +297,7 @@ void virtio_gpu_virgl_process_cmd(VirtIOGPU *g,
struct virtio_gpu_ctrl_command *cmd);
void virtio_gpu_virgl_fence_poll(VirtIOGPU *g);
void virtio_gpu_virgl_reset_scanout(VirtIOGPU *g);
-void virtio_gpu_virgl_reset(VirtIOGPU *g);
+void virtio_gpu_virglrenderer_reset(VirtIOGPU *g);
int virtio_gpu_virgl_init(VirtIOGPU *g);
int virtio_gpu_virgl_get_num_capsets(VirtIOGPU *g);
--
2.40.0.634.g4ca3ef3211-goog
* [RFC PATCH 06/13] gfxstream + rutabaga prep: make GL device more library agnostic
2023-04-21 1:12 [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany? Gurchetan Singh
` (4 preceding siblings ...)
2023-04-21 1:12 ` [RFC PATCH 05/13] gfxstream + rutabaga prep: virtio_gpu_gl -> virtio_gpu_virgl Gurchetan Singh
@ 2023-04-21 1:12 ` Gurchetan Singh
2023-04-21 1:12 ` [RFC PATCH 07/13] gfxstream + rutabaga prep: define callbacks in realize function Gurchetan Singh
` (7 subsequent siblings)
13 siblings, 0 replies; 28+ messages in thread
From: Gurchetan Singh @ 2023-04-21 1:12 UTC (permalink / raw)
To: qemu-devel
Cc: pbonzini, philmd, david, stefanha, kraxel, marcandre.lureau,
akihiko.odaki, dmitry.osipenko, ray.huang, alex.bennee
Rather than create a virtio-gpu-gfxstream device and its
associated variants (vga, pci), let's just extend the GL device.
We need to:
- Move all virgl functions to their own file
- Keep only the needed class callbacks in the generic GL device
Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
---
hw/display/virtio-gpu-gl.c | 110 ------------------------------
hw/display/virtio-gpu-virgl.c | 119 +++++++++++++++++++++++++++++++--
include/hw/virtio/virtio-gpu.h | 11 +--
3 files changed, 120 insertions(+), 120 deletions(-)
diff --git a/hw/display/virtio-gpu-gl.c b/hw/display/virtio-gpu-gl.c
index dc648aacb2..2d140e8792 100644
--- a/hw/display/virtio-gpu-gl.c
+++ b/hw/display/virtio-gpu-gl.c
@@ -15,122 +15,12 @@
#include "qemu/iov.h"
#include "qemu/module.h"
#include "qemu/error-report.h"
-#include "qapi/error.h"
-#include "sysemu/sysemu.h"
#include "hw/virtio/virtio.h"
#include "hw/virtio/virtio-gpu.h"
#include "hw/virtio/virtio-gpu-bswap.h"
#include "hw/virtio/virtio-gpu-pixman.h"
#include "hw/qdev-properties.h"
-#include <virglrenderer.h>
-
-static void
-virtio_gpu_virgl_update_cursor(VirtIOGPU *g,
- struct virtio_gpu_scanout *s,
- uint32_t resource_id)
-{
- uint32_t width, height;
- uint32_t pixels, *data;
-
- data = virgl_renderer_get_cursor_data(resource_id, &width, &height);
- if (!data) {
- return;
- }
-
- if (width != s->current_cursor->width ||
- height != s->current_cursor->height) {
- free(data);
- return;
- }
-
- pixels = s->current_cursor->width * s->current_cursor->height;
- memcpy(s->current_cursor->data, data, pixels * sizeof(uint32_t));
- free(data);
-}
-
-static void virtio_gpu_virgl_flushed(VirtIOGPUBase *b)
-{
- VirtIOGPU *g = VIRTIO_GPU(b);
-
- virtio_gpu_process_cmdq(g);
-}
-
-static void virtio_gpu_virgl_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
-{
- VirtIOGPU *g = VIRTIO_GPU(vdev);
- VirtIOGPUGL *gl = VIRTIO_GPU_GL(vdev);
- struct virtio_gpu_ctrl_command *cmd;
-
- if (!virtio_queue_ready(vq)) {
- return;
- }
-
- if (!gl->renderer_inited) {
- virtio_gpu_virgl_init(g);
- gl->renderer_inited = true;
- }
- if (gl->renderer_reset) {
- gl->renderer_reset = false;
- virtio_gpu_virglrenderer_reset(g);
- }
-
- cmd = virtqueue_pop(vq, sizeof(struct virtio_gpu_ctrl_command));
- while (cmd) {
- cmd->vq = vq;
- cmd->error = 0;
- cmd->finished = false;
- QTAILQ_INSERT_TAIL(&g->cmdq, cmd, next);
- cmd = virtqueue_pop(vq, sizeof(struct virtio_gpu_ctrl_command));
- }
-
- virtio_gpu_process_cmdq(g);
- virtio_gpu_virgl_fence_poll(g);
-}
-
-static void virtio_gpu_virgl_reset(VirtIODevice *vdev)
-{
- VirtIOGPU *g = VIRTIO_GPU(vdev);
- VirtIOGPUGL *gl = VIRTIO_GPU_GL(vdev);
-
- virtio_gpu_reset(vdev);
-
- /*
- * GL functions must be called with the associated GL context in main
- * thread, and when the renderer is unblocked.
- */
- if (gl->renderer_inited && !gl->renderer_reset) {
- virtio_gpu_virgl_reset_scanout(g);
- gl->renderer_reset = true;
- }
-}
-
-static void virtio_gpu_virgl_device_realize(DeviceState *qdev, Error **errp)
-{
- VirtIOGPU *g = VIRTIO_GPU(qdev);
-
-#if HOST_BIG_ENDIAN
- error_setg(errp, "virgl is not supported on bigendian platforms");
- return;
-#endif
-
- if (!object_resolve_path_type("", TYPE_VIRTIO_GPU_GL, NULL)) {
- error_setg(errp, "at most one %s device is permitted", TYPE_VIRTIO_GPU_GL);
- return;
- }
-
- if (!display_opengl) {
- error_setg(errp, "opengl is not available");
- return;
- }
-
- g->parent_obj.conf.flags |= (1 << VIRTIO_GPU_FLAG_VIRGL_ENABLED);
- VIRTIO_GPU_BASE(g)->virtio_config.num_capsets =
- virtio_gpu_virgl_get_num_capsets(g);
-
- virtio_gpu_device_realize(qdev, errp);
-}
-
static Property virtio_gpu_gl_properties[] = {
DEFINE_PROP_BIT("stats", VirtIOGPU, parent_obj.conf.flags,
VIRTIO_GPU_FLAG_STATS_ENABLED, false),
diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index f91d33ce13..87eccb2b97 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -14,6 +14,8 @@
#include "qemu/osdep.h"
#include "qemu/error-report.h"
#include "qemu/iov.h"
+#include "qapi/error.h"
+#include "sysemu/sysemu.h"
#include "trace.h"
#include "hw/virtio/virtio.h"
#include "hw/virtio/virtio-gpu.h"
@@ -584,12 +586,12 @@ static void virtio_gpu_fence_poll(void *opaque)
}
}
-void virtio_gpu_virgl_fence_poll(VirtIOGPU *g)
+static void virtio_gpu_virgl_fence_poll(VirtIOGPU *g)
{
virtio_gpu_fence_poll(g);
}
-void virtio_gpu_virgl_reset_scanout(VirtIOGPU *g)
+static void virtio_gpu_virgl_reset_scanout(VirtIOGPU *g)
{
int i;
@@ -599,12 +601,12 @@ void virtio_gpu_virgl_reset_scanout(VirtIOGPU *g)
}
}
-void virtio_gpu_virglrenderer_reset(VirtIOGPU *g)
+static void virtio_gpu_virglrenderer_reset(VirtIOGPU *g)
{
virgl_renderer_reset();
}
-int virtio_gpu_virgl_init(VirtIOGPU *g)
+static int virtio_gpu_virgl_init(VirtIOGPU *g)
{
int ret;
@@ -625,7 +627,7 @@ int virtio_gpu_virgl_init(VirtIOGPU *g)
return 0;
}
-int virtio_gpu_virgl_get_num_capsets(VirtIOGPU *g)
+static int virtio_gpu_virgl_get_num_capsets(VirtIOGPU *g)
{
uint32_t capset2_max_ver, capset2_max_size;
virgl_renderer_get_cap_set(VIRTIO_GPU_CAPSET_VIRGL2,
@@ -634,3 +636,110 @@ int virtio_gpu_virgl_get_num_capsets(VirtIOGPU *g)
return capset2_max_ver ? 2 : 1;
}
+
+void
+virtio_gpu_virgl_update_cursor(VirtIOGPU *g,
+ struct virtio_gpu_scanout *s,
+ uint32_t resource_id)
+{
+ uint32_t width, height;
+ uint32_t pixels, *data;
+
+ data = virgl_renderer_get_cursor_data(resource_id, &width, &height);
+ if (!data) {
+ return;
+ }
+
+ if (width != s->current_cursor->width ||
+ height != s->current_cursor->height) {
+ free(data);
+ return;
+ }
+
+ pixels = s->current_cursor->width * s->current_cursor->height;
+ memcpy(s->current_cursor->data, data, pixels * sizeof(uint32_t));
+ free(data);
+}
+
+void virtio_gpu_virgl_flushed(VirtIOGPUBase *b)
+{
+ VirtIOGPU *g = VIRTIO_GPU(b);
+
+ virtio_gpu_process_cmdq(g);
+}
+
+void virtio_gpu_virgl_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
+{
+ VirtIOGPU *g = VIRTIO_GPU(vdev);
+ VirtIOGPUGL *gl = VIRTIO_GPU_GL(vdev);
+ struct virtio_gpu_ctrl_command *cmd;
+
+ if (!virtio_queue_ready(vq)) {
+ return;
+ }
+
+ if (!gl->renderer_inited) {
+ virtio_gpu_virgl_init(g);
+ gl->renderer_inited = true;
+ }
+ if (gl->renderer_reset) {
+ gl->renderer_reset = false;
+ virtio_gpu_virglrenderer_reset(g);
+ }
+
+ cmd = virtqueue_pop(vq, sizeof(struct virtio_gpu_ctrl_command));
+ while (cmd) {
+ cmd->vq = vq;
+ cmd->error = 0;
+ cmd->finished = false;
+ QTAILQ_INSERT_TAIL(&g->cmdq, cmd, next);
+ cmd = virtqueue_pop(vq, sizeof(struct virtio_gpu_ctrl_command));
+ }
+
+ virtio_gpu_process_cmdq(g);
+ virtio_gpu_virgl_fence_poll(g);
+}
+
+void virtio_gpu_virgl_reset(VirtIODevice *vdev)
+{
+ VirtIOGPU *g = VIRTIO_GPU(vdev);
+ VirtIOGPUGL *gl = VIRTIO_GPU_GL(vdev);
+
+ virtio_gpu_reset(vdev);
+
+ /*
+ * GL functions must be called with the associated GL context in main
+ * thread, and when the renderer is unblocked.
+ */
+ if (gl->renderer_inited && !gl->renderer_reset) {
+ virtio_gpu_virgl_reset_scanout(g);
+ gl->renderer_reset = true;
+ }
+}
+
+void virtio_gpu_virgl_device_realize(DeviceState *qdev, Error **errp)
+{
+ VirtIOGPU *g = VIRTIO_GPU(qdev);
+
+#if HOST_BIG_ENDIAN
+ error_setg(errp, "virgl is not supported on bigendian platforms");
+ return;
+#endif
+
+ if (!object_resolve_path_type("", TYPE_VIRTIO_GPU_GL, NULL)) {
+ error_setg(errp, "at most one %s device is permitted",
+ TYPE_VIRTIO_GPU_GL);
+ return;
+ }
+
+ if (!display_opengl) {
+ error_setg(errp, "opengl is not available");
+ return;
+ }
+
+ g->parent_obj.conf.flags |= (1 << VIRTIO_GPU_FLAG_VIRGL_ENABLED);
+ VIRTIO_GPU_BASE(g)->virtio_config.num_capsets =
+ virtio_gpu_virgl_get_num_capsets(g);
+
+ virtio_gpu_device_realize(qdev, errp);
+}
diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
index e256e44172..7317b60004 100644
--- a/include/hw/virtio/virtio-gpu.h
+++ b/include/hw/virtio/virtio-gpu.h
@@ -295,10 +295,11 @@ int virtio_gpu_update_dmabuf(VirtIOGPU *g,
/* virtio-gpu-3d.c */
void virtio_gpu_virgl_process_cmd(VirtIOGPU *g,
struct virtio_gpu_ctrl_command *cmd);
-void virtio_gpu_virgl_fence_poll(VirtIOGPU *g);
-void virtio_gpu_virgl_reset_scanout(VirtIOGPU *g);
-void virtio_gpu_virglrenderer_reset(VirtIOGPU *g);
-int virtio_gpu_virgl_init(VirtIOGPU *g);
-int virtio_gpu_virgl_get_num_capsets(VirtIOGPU *g);
+void virtio_gpu_virgl_update_cursor(VirtIOGPU *g, struct virtio_gpu_scanout *s,
+ uint32_t resource_id);
+void virtio_gpu_virgl_flushed(VirtIOGPUBase *b);
+void virtio_gpu_virgl_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq);
+void virtio_gpu_virgl_reset(VirtIODevice *vdev);
+void virtio_gpu_virgl_device_realize(DeviceState *qdev, Error **errp);
#endif
--
2.40.0.634.g4ca3ef3211-goog
* [RFC PATCH 07/13] gfxstream + rutabaga prep: define callbacks in realize function
2023-04-21 1:12 [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany? Gurchetan Singh
` (5 preceding siblings ...)
2023-04-21 1:12 ` [RFC PATCH 06/13] gfxstream + rutabaga prep: make GL device more library agnostic Gurchetan Singh
@ 2023-04-21 1:12 ` Gurchetan Singh
2023-04-21 9:53 ` Philippe Mathieu-Daudé
2023-04-21 1:12 ` [RFC PATCH 08/13] gfxstream + rutabaga prep: added needed definitions, fields, and options Gurchetan Singh
` (6 subsequent siblings)
13 siblings, 1 reply; 28+ messages in thread
From: Gurchetan Singh @ 2023-04-21 1:12 UTC (permalink / raw)
To: qemu-devel
Cc: pbonzini, philmd, david, stefanha, kraxel, marcandre.lureau,
akihiko.odaki, dmitry.osipenko, ray.huang, alex.bennee
This reduces the amount of renderer-backend-specific code that needs
to be exposed to the GL device. We only need one realize function
per renderer backend.
Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
---
hw/display/virtio-gpu-gl.c | 17 +++++++++++------
hw/display/virtio-gpu-virgl.c | 35 ++++++++++++++++++++++++----------
include/hw/virtio/virtio-gpu.h | 7 -------
3 files changed, 36 insertions(+), 23 deletions(-)
diff --git a/hw/display/virtio-gpu-gl.c b/hw/display/virtio-gpu-gl.c
index 2d140e8792..547e697333 100644
--- a/hw/display/virtio-gpu-gl.c
+++ b/hw/display/virtio-gpu-gl.c
@@ -21,6 +21,11 @@
#include "hw/virtio/virtio-gpu-pixman.h"
#include "hw/qdev-properties.h"
+static void virtio_gpu_gl_device_realize(DeviceState *qdev, Error **errp)
+{
+ virtio_gpu_virgl_device_realize(qdev, errp);
+}
+
static Property virtio_gpu_gl_properties[] = {
DEFINE_PROP_BIT("stats", VirtIOGPU, parent_obj.conf.flags,
VIRTIO_GPU_FLAG_STATS_ENABLED, false),
@@ -34,13 +39,13 @@ static void virtio_gpu_gl_class_init(ObjectClass *klass, void *data)
VirtIOGPUBaseClass *vbc = VIRTIO_GPU_BASE_CLASS(klass);
VirtIOGPUClass *vgc = VIRTIO_GPU_CLASS(klass);
- vbc->gl_flushed = virtio_gpu_virgl_flushed;
- vgc->handle_ctrl = virtio_gpu_virgl_handle_ctrl;
- vgc->process_cmd = virtio_gpu_virgl_process_cmd;
- vgc->update_cursor_data = virtio_gpu_virgl_update_cursor;
+ vbc->gl_flushed = NULL;
+ vgc->handle_ctrl = NULL;
+ vgc->process_cmd = NULL;
+ vgc->update_cursor_data = NULL;
- vdc->realize = virtio_gpu_virgl_device_realize;
- vdc->reset = virtio_gpu_virgl_reset;
+ vdc->realize = virtio_gpu_gl_device_realize;
+ vdc->reset = NULL;
device_class_set_props(dc, virtio_gpu_gl_properties);
}
diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index 87eccb2b97..5be288562d 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -401,8 +401,9 @@ static void virgl_cmd_get_capset(VirtIOGPU *g,
g_free(resp);
}
-void virtio_gpu_virgl_process_cmd(VirtIOGPU *g,
- struct virtio_gpu_ctrl_command *cmd)
+static void
+virtio_gpu_virgl_process_cmd(VirtIOGPU *g,
+ struct virtio_gpu_ctrl_command *cmd)
{
VIRTIO_GPU_FILL_CMD(cmd->cmd_hdr);
@@ -637,7 +638,7 @@ static int virtio_gpu_virgl_get_num_capsets(VirtIOGPU *g)
return capset2_max_ver ? 2 : 1;
}
-void
+static void
virtio_gpu_virgl_update_cursor(VirtIOGPU *g,
struct virtio_gpu_scanout *s,
uint32_t resource_id)
@@ -661,14 +662,14 @@ virtio_gpu_virgl_update_cursor(VirtIOGPU *g,
free(data);
}
-void virtio_gpu_virgl_flushed(VirtIOGPUBase *b)
+static void virtio_gpu_virgl_flushed(VirtIOGPUBase *b)
{
VirtIOGPU *g = VIRTIO_GPU(b);
virtio_gpu_process_cmdq(g);
}
-void virtio_gpu_virgl_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
+static void virtio_gpu_virgl_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
{
VirtIOGPU *g = VIRTIO_GPU(vdev);
VirtIOGPUGL *gl = VIRTIO_GPU_GL(vdev);
@@ -700,7 +701,7 @@ void virtio_gpu_virgl_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
virtio_gpu_virgl_fence_poll(g);
}
-void virtio_gpu_virgl_reset(VirtIODevice *vdev)
+static void virtio_gpu_virgl_reset(VirtIODevice *vdev)
{
VirtIOGPU *g = VIRTIO_GPU(vdev);
VirtIOGPUGL *gl = VIRTIO_GPU_GL(vdev);
@@ -719,7 +720,21 @@ void virtio_gpu_virgl_reset(VirtIODevice *vdev)
void virtio_gpu_virgl_device_realize(DeviceState *qdev, Error **errp)
{
- VirtIOGPU *g = VIRTIO_GPU(qdev);
+ VirtIODevice *vdev = VIRTIO_DEVICE(qdev);
+ VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
+
+ VirtIOGPUBase *bdev = VIRTIO_GPU_BASE(qdev);
+ VirtIOGPUBaseClass *vbc = VIRTIO_GPU_BASE_GET_CLASS(bdev);
+
+ VirtIOGPU *gpudev = VIRTIO_GPU(qdev);
+ VirtIOGPUClass *vgc = VIRTIO_GPU_GET_CLASS(gpudev);
+
+ vbc->gl_flushed = virtio_gpu_virgl_flushed;
+ vgc->handle_ctrl = virtio_gpu_virgl_handle_ctrl;
+ vgc->process_cmd = virtio_gpu_virgl_process_cmd;
+ vgc->update_cursor_data = virtio_gpu_virgl_update_cursor;
+
+ vdc->reset = virtio_gpu_virgl_reset;
#if HOST_BIG_ENDIAN
error_setg(errp, "virgl is not supported on bigendian platforms");
@@ -737,9 +752,9 @@ void virtio_gpu_virgl_device_realize(DeviceState *qdev, Error **errp)
return;
}
- g->parent_obj.conf.flags |= (1 << VIRTIO_GPU_FLAG_VIRGL_ENABLED);
- VIRTIO_GPU_BASE(g)->virtio_config.num_capsets =
- virtio_gpu_virgl_get_num_capsets(g);
+ gpudev->parent_obj.conf.flags |= (1 << VIRTIO_GPU_FLAG_VIRGL_ENABLED);
+ VIRTIO_GPU_BASE(gpudev)->virtio_config.num_capsets =
+ virtio_gpu_virgl_get_num_capsets(gpudev);
virtio_gpu_device_realize(qdev, errp);
}
diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
index 7317b60004..421733d751 100644
--- a/include/hw/virtio/virtio-gpu.h
+++ b/include/hw/virtio/virtio-gpu.h
@@ -293,13 +293,6 @@ int virtio_gpu_update_dmabuf(VirtIOGPU *g,
struct virtio_gpu_rect *r);
/* virtio-gpu-3d.c */
-void virtio_gpu_virgl_process_cmd(VirtIOGPU *g,
- struct virtio_gpu_ctrl_command *cmd);
-void virtio_gpu_virgl_update_cursor(VirtIOGPU *g, struct virtio_gpu_scanout *s,
- uint32_t resource_id);
-void virtio_gpu_virgl_flushed(VirtIOGPUBase *b);
-void virtio_gpu_virgl_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq);
-void virtio_gpu_virgl_reset(VirtIODevice *vdev);
void virtio_gpu_virgl_device_realize(DeviceState *qdev, Error **errp);
#endif
--
2.40.0.634.g4ca3ef3211-goog
* [RFC PATCH 08/13] gfxstream + rutabaga prep: added needed definitions, fields, and options
2023-04-21 1:12 [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany? Gurchetan Singh
` (6 preceding siblings ...)
2023-04-21 1:12 ` [RFC PATCH 07/13] gfxstream + rutabaga prep: define callbacks in realize function Gurchetan Singh
@ 2023-04-21 1:12 ` Gurchetan Singh
2023-04-22 14:54 ` Akihiko Odaki
2023-04-21 1:12 ` [RFC PATCH 09/13] gfxstream + rutabaga: add required meson changes Gurchetan Singh
` (5 subsequent siblings)
13 siblings, 1 reply; 28+ messages in thread
From: Gurchetan Singh @ 2023-04-21 1:12 UTC (permalink / raw)
To: qemu-devel
Cc: pbonzini, philmd, david, stefanha, kraxel, marcandre.lureau,
akihiko.odaki, dmitry.osipenko, ray.huang, alex.bennee
This modifies the common virtio-gpu.h file to have the fields and
definitions needed by gfxstream/rutabaga. It also modifies VirtIOGPUGL
to have the runtime options needed by rutabaga. They are:
- a colon separated list of capset names, defined in the virtio spec
- a wayland socket path to enable guest Wayland passthrough
An example invocation with these options would be:
-device virtio-vga-gl,capset_names=gfxstream:cross-domain,\
wayland_socket_path=/run/user/1000/wayland-0,hostmem=8G
Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
---
hw/display/virtio-gpu-gl.c | 2 ++
include/hw/virtio/virtio-gpu.h | 8 ++++++++
2 files changed, 10 insertions(+)
diff --git a/hw/display/virtio-gpu-gl.c b/hw/display/virtio-gpu-gl.c
index 547e697333..15270b0c8a 100644
--- a/hw/display/virtio-gpu-gl.c
+++ b/hw/display/virtio-gpu-gl.c
@@ -29,6 +29,8 @@ static void virtio_gpu_gl_device_realize(DeviceState *qdev, Error **errp)
static Property virtio_gpu_gl_properties[] = {
DEFINE_PROP_BIT("stats", VirtIOGPU, parent_obj.conf.flags,
VIRTIO_GPU_FLAG_STATS_ENABLED, false),
+ DEFINE_PROP_STRING("capset_names", VirtIOGPUGL, capset_names),
+ DEFINE_PROP_STRING("wayland_socket_path", VirtIOGPUGL, wayland_socket_path),
DEFINE_PROP_END_OF_LIST(),
};
diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
index 421733d751..a35ade3608 100644
--- a/include/hw/virtio/virtio-gpu.h
+++ b/include/hw/virtio/virtio-gpu.h
@@ -94,6 +94,7 @@ enum virtio_gpu_base_conf_flags {
VIRTIO_GPU_FLAG_DMABUF_ENABLED,
VIRTIO_GPU_FLAG_BLOB_ENABLED,
VIRTIO_GPU_FLAG_CONTEXT_INIT_ENABLED,
+ VIRTIO_GPU_FLAG_RUTABAGA_ENABLED,
};
#define virtio_gpu_virgl_enabled(_cfg) \
@@ -106,6 +107,8 @@ enum virtio_gpu_base_conf_flags {
(_cfg.flags & (1 << VIRTIO_GPU_FLAG_DMABUF_ENABLED))
#define virtio_gpu_blob_enabled(_cfg) \
(_cfg.flags & (1 << VIRTIO_GPU_FLAG_BLOB_ENABLED))
+#define virtio_gpu_rutabaga_enabled(_cfg) \
+ (_cfg.flags & (1 << VIRTIO_GPU_FLAG_RUTABAGA_ENABLED))
#define virtio_gpu_hostmem_enabled(_cfg) \
(_cfg.hostmem > 0)
#define virtio_gpu_context_init_enabled(_cfg) \
@@ -217,6 +220,11 @@ struct VirtIOGPUGL {
bool renderer_inited;
bool renderer_reset;
+
+ char *capset_names;
+ char *wayland_socket_path;
+ uint32_t num_capsets;
+ void *rutabaga;
};
struct VhostUserGPU {
--
2.40.0.634.g4ca3ef3211-goog
* [RFC PATCH 09/13] gfxstream + rutabaga: add required meson changes
2023-04-21 1:12 [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany? Gurchetan Singh
` (7 preceding siblings ...)
2023-04-21 1:12 ` [RFC PATCH 08/13] gfxstream + rutabaga prep: added needed definitions, fields, and options Gurchetan Singh
@ 2023-04-21 1:12 ` Gurchetan Singh
2023-04-21 1:12 ` [RFC PATCH 10/13] gfxstream + rutabaga: add initial support for gfxstream Gurchetan Singh
` (4 subsequent siblings)
13 siblings, 0 replies; 28+ messages in thread
From: Gurchetan Singh @ 2023-04-21 1:12 UTC (permalink / raw)
To: qemu-devel
Cc: pbonzini, philmd, david, stefanha, kraxel, marcandre.lureau,
akihiko.odaki, dmitry.osipenko, ray.huang, alex.bennee
- Add meson detection of rutabaga_gfx
- Compile stubs when rutabaga_gfx_ffi is not installed
- Compile stubs when virglrenderer is not installed
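Since the meson change below detects rutabaga via pkg-config, the detection can be checked by hand; this small sketch relies on the module name rutabaga_gfx_ffi, which is what the dependency() call queries:

```shell
# Check whether the rutabaga_gfx_ffi pkg-config module is visible.
# If it is not, meson falls back to building the stub files.
if pkg-config --exists rutabaga_gfx_ffi 2>/dev/null; then
    echo "rutabaga_gfx_ffi: found"
else
    echo "rutabaga_gfx_ffi: not found (stubs will be built instead)"
fi
```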
Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
---
hw/display/meson.build | 40 ++++++++++++++++++++------
hw/display/virtio-gpu-rutabaga-stubs.c | 8 ++++++
hw/display/virtio-gpu-virgl-stubs.c | 8 ++++++
include/hw/virtio/virtio-gpu.h | 5 +++-
meson.build | 8 ++++++
meson_options.txt | 2 ++
scripts/meson-buildoptions.sh | 3 ++
7 files changed, 64 insertions(+), 10 deletions(-)
create mode 100644 hw/display/virtio-gpu-rutabaga-stubs.c
create mode 100644 hw/display/virtio-gpu-virgl-stubs.c
diff --git a/hw/display/meson.build b/hw/display/meson.build
index 4191694380..48785cfcb6 100644
--- a/hw/display/meson.build
+++ b/hw/display/meson.build
@@ -63,6 +63,8 @@ softmmu_ss.add(when: 'CONFIG_ARTIST', if_true: files('artist.c'))
softmmu_ss.add(when: [pixman, 'CONFIG_ATI_VGA'], if_true: files('ati.c', 'ati_2d.c', 'ati_dbg.c'))
+virgl_found = virgl.found() and opengl.found()
+rutabaga_found = rutabaga.found()
if config_all_devices.has_key('CONFIG_VIRTIO_GPU')
virtio_gpu_ss = ss.source_set()
@@ -73,12 +75,27 @@ if config_all_devices.has_key('CONFIG_VIRTIO_GPU')
virtio_gpu_ss.add(when: 'CONFIG_VHOST_USER_GPU', if_true: files('vhost-user-gpu.c'))
hw_display_modules += {'virtio-gpu': virtio_gpu_ss}
- if virgl.found() and opengl.found()
- virtio_gpu_gl_ss = ss.source_set()
- virtio_gpu_gl_ss.add(when: ['CONFIG_VIRTIO_GPU', virgl, opengl],
- if_true: [files('virtio-gpu-gl.c', 'virtio-gpu-virgl.c'), pixman, virgl])
- hw_display_modules += {'virtio-gpu-gl': virtio_gpu_gl_ss}
+ virtio_gpu_gl_ss = ss.source_set()
+ if virgl_found or rutabaga_found
+ virtio_gpu_gl_ss.add(when: ['CONFIG_VIRTIO_GPU'],
+ if_true: [files('virtio-gpu-gl.c'), pixman])
endif
+
+ if virgl_found
+ virtio_gpu_gl_ss.add(when: ['CONFIG_VIRTIO_GPU'],
+ if_true: [files('virtio-gpu-virgl.c'), virgl])
+ else
+ virtio_gpu_gl_ss.add([files('virtio-gpu-virgl-stubs.c')])
+ endif
+
+ if rutabaga_found
+ virtio_gpu_gl_ss.add(when: ['CONFIG_VIRTIO_GPU'],
+ if_true: [files('virtio-gpu-rutabaga.c'), rutabaga])
+ else
+ virtio_gpu_gl_ss.add([files('virtio-gpu-rutabaga-stubs.c')])
+ endif
+
+ hw_display_modules += {'virtio-gpu-gl': virtio_gpu_gl_ss}
endif
if config_all_devices.has_key('CONFIG_VIRTIO_PCI')
@@ -89,9 +106,10 @@ if config_all_devices.has_key('CONFIG_VIRTIO_PCI')
if_true: files('vhost-user-gpu-pci.c'))
hw_display_modules += {'virtio-gpu-pci': virtio_gpu_pci_ss}
- if virgl.found() and opengl.found()
+
+ if virgl_found or rutabaga_found
virtio_gpu_pci_gl_ss = ss.source_set()
- virtio_gpu_pci_gl_ss.add(when: ['CONFIG_VIRTIO_GPU', 'CONFIG_VIRTIO_PCI', virgl, opengl],
+ virtio_gpu_pci_gl_ss.add(when: ['CONFIG_VIRTIO_GPU', 'CONFIG_VIRTIO_PCI'],
if_true: [files('virtio-gpu-pci-gl.c'), pixman])
hw_display_modules += {'virtio-gpu-pci-gl': virtio_gpu_pci_gl_ss}
endif
@@ -108,8 +126,12 @@ if config_all_devices.has_key('CONFIG_VIRTIO_VGA')
hw_display_modules += {'virtio-vga': virtio_vga_ss}
virtio_vga_gl_ss = ss.source_set()
- virtio_vga_gl_ss.add(when: ['CONFIG_VIRTIO_VGA', virgl, opengl],
- if_true: [files('virtio-vga-gl.c'), pixman])
+
+ if virgl_found or rutabaga_found
+ virtio_vga_gl_ss.add(when: ['CONFIG_VIRTIO_VGA'],
+ if_true: [files('virtio-vga-gl.c'), pixman])
+ endif
+
virtio_vga_gl_ss.add(when: 'CONFIG_ACPI', if_true: files('acpi-vga.c'),
if_false: files('acpi-vga-stub.c'))
hw_display_modules += {'virtio-vga-gl': virtio_vga_gl_ss}
diff --git a/hw/display/virtio-gpu-rutabaga-stubs.c b/hw/display/virtio-gpu-rutabaga-stubs.c
new file mode 100644
index 0000000000..26c38d3892
--- /dev/null
+++ b/hw/display/virtio-gpu-rutabaga-stubs.c
@@ -0,0 +1,8 @@
+// SPDX-License-Identifier: GPL-2.0
+#include "qemu/osdep.h"
+#include "hw/virtio/virtio-gpu.h"
+
+void virtio_gpu_rutabaga_device_realize(DeviceState *qdev, Error **errp)
+{
+ /* nothing (stub) */
+}
diff --git a/hw/display/virtio-gpu-virgl-stubs.c b/hw/display/virtio-gpu-virgl-stubs.c
new file mode 100644
index 0000000000..b29e35f990
--- /dev/null
+++ b/hw/display/virtio-gpu-virgl-stubs.c
@@ -0,0 +1,8 @@
+// SPDX-License-Identifier: GPL-2.0
+#include "qemu/osdep.h"
+#include "hw/virtio/virtio-gpu.h"
+
+void virtio_gpu_virgl_device_realize(DeviceState *qdev, Error **errp)
+{
+ /* nothing (stub) */
+}
diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
index a35ade3608..034c71e8f5 100644
--- a/include/hw/virtio/virtio-gpu.h
+++ b/include/hw/virtio/virtio-gpu.h
@@ -300,7 +300,10 @@ int virtio_gpu_update_dmabuf(VirtIOGPU *g,
struct virtio_gpu_framebuffer *fb,
struct virtio_gpu_rect *r);
-/* virtio-gpu-3d.c */
+/* virtio-gpu-virgl.c or virtio-gpu-virgl-stubs.c */
void virtio_gpu_virgl_device_realize(DeviceState *qdev, Error **errp);
+/* virtio-gpu-rutabaga.c or virtio-gpu-rutabaga-stubs.c */
+void virtio_gpu_rutabaga_device_realize(DeviceState *qdev, Error **errp);
+
#endif
diff --git a/meson.build b/meson.build
index c44d05a13f..ba1c1905fb 100644
--- a/meson.build
+++ b/meson.build
@@ -777,6 +777,13 @@ if not get_option('virglrenderer').auto() or have_system or have_vhost_user_gpu
required: get_option('virglrenderer'),
kwargs: static_kwargs)
endif
+rutabaga = not_found
+if not get_option('rutabaga_gfx').auto() or have_system or have_vhost_user_gpu
+ rutabaga = dependency('rutabaga_gfx_ffi',
+ method: 'pkg-config',
+ required: get_option('rutabaga_gfx'),
+ kwargs: static_kwargs)
+endif
blkio = not_found
if not get_option('blkio').auto() or have_block
blkio = dependency('blkio',
@@ -3963,6 +3970,7 @@ summary_info += {'PAM': pam}
summary_info += {'iconv support': iconv}
summary_info += {'curses support': curses}
summary_info += {'virgl support': virgl}
+summary_info += {'rutabaga support': rutabaga}
summary_info += {'blkio support': blkio}
summary_info += {'curl support': curl}
summary_info += {'Multipath support': mpathpersist}
diff --git a/meson_options.txt b/meson_options.txt
index fc9447d267..3b4249c2fe 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -213,6 +213,8 @@ option('vmnet', type : 'feature', value : 'auto',
description: 'vmnet.framework network backend support')
option('virglrenderer', type : 'feature', value : 'auto',
description: 'virgl rendering support')
+option('rutabaga_gfx', type : 'feature', value : 'auto',
+ description: 'rutabaga_gfx support')
option('png', type : 'feature', value : 'auto',
description: 'PNG support with libpng')
option('vnc', type : 'feature', value : 'auto',
diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh
index 009fab1515..ce35c44926 100644
--- a/scripts/meson-buildoptions.sh
+++ b/scripts/meson-buildoptions.sh
@@ -144,6 +144,7 @@ meson_options_help() {
printf "%s\n" ' rbd Ceph block device driver'
printf "%s\n" ' rdma Enable RDMA-based migration'
printf "%s\n" ' replication replication support'
+ printf "%s\n" ' rutabaga-gfx rutabaga_gfx support'
printf "%s\n" ' sdl SDL user interface'
printf "%s\n" ' sdl-image SDL Image support for icons'
printf "%s\n" ' seccomp seccomp support'
@@ -394,6 +395,8 @@ _meson_option_parse() {
--disable-replication) printf "%s" -Dreplication=disabled ;;
--enable-rng-none) printf "%s" -Drng_none=true ;;
--disable-rng-none) printf "%s" -Drng_none=false ;;
+ --enable-rutabaga-gfx) printf "%s" -Drutabaga_gfx=enabled ;;
+ --disable-rutabaga-gfx) printf "%s" -Drutabaga_gfx=disabled ;;
--enable-sdl) printf "%s" -Dsdl=enabled ;;
--disable-sdl) printf "%s" -Dsdl=disabled ;;
--enable-sdl-image) printf "%s" -Dsdl_image=enabled ;;
--
2.40.0.634.g4ca3ef3211-goog
* [RFC PATCH 10/13] gfxstream + rutabaga: add initial support for gfxstream
2023-04-21 1:12 [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany? Gurchetan Singh
` (8 preceding siblings ...)
2023-04-21 1:12 ` [RFC PATCH 09/13] gfxstream + rutabaga: add required meson changes Gurchetan Singh
@ 2023-04-21 1:12 ` Gurchetan Singh
2023-04-22 15:37 ` Akihiko Odaki
2023-04-21 1:12 ` [RFC PATCH 11/13] gfxstream + rutabaga: enable rutabaga Gurchetan Singh
` (3 subsequent siblings)
13 siblings, 1 reply; 28+ messages in thread
From: Gurchetan Singh @ 2023-04-21 1:12 UTC (permalink / raw)
To: qemu-devel
Cc: pbonzini, philmd, david, stefanha, kraxel, marcandre.lureau,
akihiko.odaki, dmitry.osipenko, ray.huang, alex.bennee
This adds initial support for gfxstream and cross-domain. Both
features rely on virtio-gpu blob resources and context types, which
are also implemented in this patch.
gfxstream has a long and illustrious history in Android graphics
paravirtualization. It has been powering graphics in the Android
Studio Emulator, the main Android developer platform, for more than a
decade.
Originally conceived by Jesse Hall, it was first known as "EmuGL" [a].
The key design characteristic was a 1:1 threading model and
auto-generation, which fit nicely with the OpenGLES spec. It also
allowed easy layering with ANGLE on the host, which provides the GLES
implementations on Windows or macOS environments.
gfxstream has traditionally been maintained by a single engineer, and
from 2015 to 2021, the iron throne passed to Frank Yang. Just to
name a few accomplishments in a reign filled with many of them: newer
versions of GLES, address space graphics, snapshot support and CTS
compliant Vulkan [b].
One major drawback was the use of out-of-tree goldfish drivers.
Android engineers didn't know much about DRM/KMS and especially TTM,
so a simple guest-to-host pipe was conceived.
Luckily, virtio-gpu 3D started to emerge in 2016 due to the work of
the Mesa/virglrenderer communities. In 2018, the initial virtio-gpu
port of gfxstream was done by Cuttlefish enthusiast Alistair Delva.
It was a symbol-compatible replacement for virglrenderer [c] and named
"AVDVirglrenderer". This implementation forms the basis of the
current gfxstream host implementation still in use today.
cross-domain support follows a similar arc. Originally conceived by
Wayland aficionado David Reveman and crosvm enjoyer Zach Reizner in
2018, it initially relied on the downstream "virtio-wl" device.
In 2020 and 2021, virtio-gpu was extended to include blob resources
and multiple timelines by yours truly, features gfxstream/cross-domain
both require to function correctly.
Right now, we stand at the precipice of a truly fantastic possibility:
the Android Emulator powered by upstream QEMU and upstream Linux
kernel. gfxstream will then be packaged properly, and app
developers can even fix gfxstream bugs on their own if they encounter
them.
It's been quite the ride, my friends. Where gfxstream will head next,
nobody really knows. I wouldn't be surprised if it's around for
another decade, maintained by a new generation of Android graphics
enthusiasts. One thing is for sure, though -- it'll be filled with
friendship and magic!
Technical details:
- Very simple initial display integration: just used Pixman
- Largely, 1:1 mapping of virtio-gpu hypercalls to rutabaga function
calls
[a] https://android-review.googlesource.com/c/platform/development/+/34470
[b] https://android-review.googlesource.com/q/topic:%22vulkan-hostconnection-start%22
[c] https://android-review.googlesource.com/c/device/generic/goldfish-opengl/+/761927
Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
---
hw/display/virtio-gpu-rutabaga.c | 995 +++++++++++++++++++++++++++++++
1 file changed, 995 insertions(+)
create mode 100644 hw/display/virtio-gpu-rutabaga.c
diff --git a/hw/display/virtio-gpu-rutabaga.c b/hw/display/virtio-gpu-rutabaga.c
new file mode 100644
index 0000000000..5fd1154198
--- /dev/null
+++ b/hw/display/virtio-gpu-rutabaga.c
@@ -0,0 +1,995 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "qemu/osdep.h"
+#include "qemu/error-report.h"
+#include "qemu/iov.h"
+#include "trace.h"
+#include "hw/virtio/virtio.h"
+#include "hw/virtio/virtio-gpu.h"
+#include "hw/virtio/virtio-gpu-pixman.h"
+#include "hw/virtio/virtio-iommu.h"
+
+#include <rutabaga_gfx/rutabaga_gfx_ffi.h>
+
+static int virtio_gpu_rutabaga_init(VirtIOGPU *g);
+
+#define GET_VIRTIO_GPU_GL(x) \
+ VirtIOGPUGL *virtio_gpu = VIRTIO_GPU_GL(x); \
+
+#define GET_RUTABAGA(x) \
+ struct rutabaga *rutabaga = (struct rutabaga *)(x->rutabaga); \
+
+#define CHECK(condition, cmd) \
+ do { \
+ if (!(condition)) { \
+ qemu_log_mask(LOG_GUEST_ERROR, "CHECK failed in %s() %s:" \
+ "%d\n", __func__, __FILE__, __LINE__); \
+ cmd->error = VIRTIO_GPU_RESP_ERR_UNSPEC; \
+ return; \
+ } \
+ } while (0)
+
+#define CHECK_RESULT(result, cmd) CHECK(result == 0, cmd)
+
+static void
+virtio_gpu_rutabaga_update_cursor(VirtIOGPU *g, struct virtio_gpu_scanout *s,
+ uint32_t resource_id)
+{
+ struct virtio_gpu_simple_resource *res;
+ struct rutabaga_transfer transfer = { 0 };
+ struct iovec transfer_iovec;
+
+ GET_VIRTIO_GPU_GL(g);
+ GET_RUTABAGA(virtio_gpu);
+
+ res = virtio_gpu_find_resource(g, resource_id);
+ if (!res) {
+ return;
+ }
+
+ if (res->width != s->current_cursor->width ||
+ res->height != s->current_cursor->height) {
+ return;
+ }
+
+ transfer.x = 0;
+ transfer.y = 0;
+ transfer.z = 0;
+ transfer.w = res->width;
+ transfer.h = res->height;
+ transfer.d = 1;
+
+ transfer_iovec.iov_base = (void *)s->current_cursor->data;
+ transfer_iovec.iov_len = res->width * res->height * 4;
+
+ rutabaga_resource_transfer_read(rutabaga, 0,
+ resource_id, &transfer,
+ &transfer_iovec);
+}
+
+static void
+virtio_gpu_rutabaga_gl_flushed(VirtIOGPUBase *b)
+{
+ VirtIOGPU *g = VIRTIO_GPU(b);
+ virtio_gpu_process_cmdq(g);
+}
+
+static void
+rutabaga_cmd_create_resource_2d(VirtIOGPU *g,
+ struct virtio_gpu_ctrl_command *cmd)
+{
+ int32_t result;
+ struct rutabaga_create_3d rc_3d = { 0 };
+ struct virtio_gpu_simple_resource *res;
+ struct virtio_gpu_resource_create_2d c2d;
+
+ GET_VIRTIO_GPU_GL(g);
+ GET_RUTABAGA(virtio_gpu);
+
+ VIRTIO_GPU_FILL_CMD(c2d);
+ trace_virtio_gpu_cmd_res_create_2d(c2d.resource_id, c2d.format,
+ c2d.width, c2d.height);
+
+ rc_3d.target = 2;
+ rc_3d.format = c2d.format;
+ rc_3d.bind = (1 << 1);
+ rc_3d.width = c2d.width;
+ rc_3d.height = c2d.height;
+ rc_3d.depth = 1;
+ rc_3d.array_size = 1;
+ rc_3d.last_level = 0;
+ rc_3d.nr_samples = 0;
+ rc_3d.flags = VIRTIO_GPU_RESOURCE_FLAG_Y_0_TOP;
+
+ result = rutabaga_resource_create_3d(rutabaga, c2d.resource_id, &rc_3d);
+ CHECK_RESULT(result, cmd);
+
+ res = g_new0(struct virtio_gpu_simple_resource, 1);
+ res->width = c2d.width;
+ res->height = c2d.height;
+ res->format = c2d.format;
+ res->resource_id = c2d.resource_id;
+
+ QTAILQ_INSERT_HEAD(&g->reslist, res, next);
+}
+
+static void
+rutabaga_cmd_create_resource_3d(VirtIOGPU *g,
+ struct virtio_gpu_ctrl_command *cmd)
+{
+ int32_t result;
+ struct rutabaga_create_3d rc_3d = { 0 };
+ struct virtio_gpu_simple_resource *res;
+ struct virtio_gpu_resource_create_3d c3d;
+
+ GET_VIRTIO_GPU_GL(g);
+ GET_RUTABAGA(virtio_gpu);
+
+ VIRTIO_GPU_FILL_CMD(c3d);
+
+ trace_virtio_gpu_cmd_res_create_3d(c3d.resource_id, c3d.format,
+ c3d.width, c3d.height, c3d.depth);
+
+ rc_3d.target = c3d.target;
+ rc_3d.format = c3d.format;
+ rc_3d.bind = c3d.bind;
+ rc_3d.width = c3d.width;
+ rc_3d.height = c3d.height;
+ rc_3d.depth = c3d.depth;
+ rc_3d.array_size = c3d.array_size;
+ rc_3d.last_level = c3d.last_level;
+ rc_3d.nr_samples = c3d.nr_samples;
+ rc_3d.flags = c3d.flags;
+
+ result = rutabaga_resource_create_3d(rutabaga, c3d.resource_id, &rc_3d);
+ CHECK_RESULT(result, cmd);
+
+ res = g_new0(struct virtio_gpu_simple_resource, 1);
+ res->width = c3d.width;
+ res->height = c3d.height;
+ res->format = c3d.format;
+ res->resource_id = c3d.resource_id;
+
+ QTAILQ_INSERT_HEAD(&g->reslist, res, next);
+}
+
+static int32_t rutabaga_handle_unmap(VirtIOGPU *g,
+ struct virtio_gpu_simple_resource *res)
+{
+ GET_VIRTIO_GPU_GL(g);
+ GET_RUTABAGA(virtio_gpu);
+
+ res->mapped = NULL;
+ return rutabaga_resource_unmap(rutabaga, res->resource_id);
+}
+
+static void
+rutabaga_cmd_resource_unref(VirtIOGPU *g,
+ struct virtio_gpu_ctrl_command *cmd)
+{
+ int32_t result;
+ struct virtio_gpu_simple_resource *res;
+ struct virtio_gpu_resource_unref unref;
+
+ GET_VIRTIO_GPU_GL(g);
+ GET_RUTABAGA(virtio_gpu);
+
+ VIRTIO_GPU_FILL_CMD(unref);
+
+ trace_virtio_gpu_cmd_res_unref(unref.resource_id);
+
+ res = virtio_gpu_find_resource(g, unref.resource_id);
+ CHECK(res, cmd);
+
+ result = rutabaga_resource_unref(rutabaga, unref.resource_id);
+ CHECK_RESULT(result, cmd);
+
+ if (res->image) {
+ pixman_image_unref(res->image);
+ }
+
+ if (res->mapped) {
+ result = rutabaga_handle_unmap(g, res);
+ CHECK_RESULT(result, cmd);
+ }
+
+ QTAILQ_REMOVE(&g->reslist, res, next);
+ g_free(res);
+}
+
+static void
+rutabaga_cmd_context_create(VirtIOGPU *g,
+ struct virtio_gpu_ctrl_command *cmd)
+{
+ int32_t result;
+ struct virtio_gpu_ctx_create cc;
+
+ GET_VIRTIO_GPU_GL(g);
+ GET_RUTABAGA(virtio_gpu);
+
+ VIRTIO_GPU_FILL_CMD(cc);
+ trace_virtio_gpu_cmd_ctx_create(cc.hdr.ctx_id,
+ cc.debug_name);
+
+ result = rutabaga_context_create(rutabaga, cc.hdr.ctx_id, cc.context_init,
+ cc.debug_name, cc.nlen);
+ CHECK_RESULT(result, cmd);
+}
+
+static void
+rutabaga_cmd_context_destroy(VirtIOGPU *g,
+ struct virtio_gpu_ctrl_command *cmd)
+{
+ int32_t result;
+ struct virtio_gpu_ctx_destroy cd;
+
+ GET_VIRTIO_GPU_GL(g);
+ GET_RUTABAGA(virtio_gpu);
+
+ VIRTIO_GPU_FILL_CMD(cd);
+ trace_virtio_gpu_cmd_ctx_destroy(cd.hdr.ctx_id);
+
+ result = rutabaga_context_destroy(rutabaga, cd.hdr.ctx_id);
+ CHECK_RESULT(result, cmd);
+}
+
+static void
+rutabaga_cmd_resource_flush(VirtIOGPU *g, struct virtio_gpu_ctrl_command *cmd)
+{
+ int32_t result, i;
+ struct virtio_gpu_scanout *scanout = NULL;
+ struct virtio_gpu_simple_resource *res;
+ struct rutabaga_transfer transfer = { 0 };
+ struct iovec transfer_iovec;
+ struct virtio_gpu_resource_flush rf;
+ bool found = false;
+
+ GET_VIRTIO_GPU_GL(g);
+ GET_RUTABAGA(virtio_gpu);
+
+ VIRTIO_GPU_FILL_CMD(rf);
+ trace_virtio_gpu_cmd_res_flush(rf.resource_id,
+ rf.r.width, rf.r.height, rf.r.x, rf.r.y);
+
+ res = virtio_gpu_find_resource(g, rf.resource_id);
+ CHECK(res, cmd);
+
+ for (i = 0; i < g->parent_obj.conf.max_outputs; i++) {
+ scanout = &g->parent_obj.scanout[i];
+ if (i == res->scanout_bitmask) {
+ found = true;
+ break;
+ }
+ }
+
+ if (!found) {
+ return;
+ }
+
+ transfer.x = 0;
+ transfer.y = 0;
+ transfer.z = 0;
+ transfer.w = res->width;
+ transfer.h = res->height;
+ transfer.d = 1;
+
+ transfer_iovec.iov_base = (void *)pixman_image_get_data(res->image);
+ transfer_iovec.iov_len = res->width * res->height * 4;
+
+ result = rutabaga_resource_transfer_read(rutabaga, 0,
+ rf.resource_id, &transfer,
+ &transfer_iovec);
+ CHECK_RESULT(result, cmd);
+ dpy_gfx_update_full(scanout->con);
+}
+
+static void
+rutabaga_cmd_set_scanout(VirtIOGPU *g, struct virtio_gpu_ctrl_command *cmd)
+{
+ struct virtio_gpu_simple_resource *res;
+ struct virtio_gpu_scanout *scanout = NULL;
+ struct virtio_gpu_set_scanout ss;
+
+ VIRTIO_GPU_FILL_CMD(ss);
+ trace_virtio_gpu_cmd_set_scanout(ss.scanout_id, ss.resource_id,
+ ss.r.width, ss.r.height, ss.r.x, ss.r.y);
+
+ scanout = &g->parent_obj.scanout[ss.scanout_id];
+ g->parent_obj.enable = 1;
+
+ res = virtio_gpu_find_resource(g, ss.resource_id);
+ CHECK(res, cmd);
+
+ if (!res->image) {
+ pixman_format_code_t pformat;
+ pformat = virtio_gpu_get_pixman_format(res->format);
+ CHECK(pformat, cmd);
+
+ res->image = pixman_image_create_bits(pformat,
+ res->width,
+ res->height,
+ NULL, 0);
+ CHECK(res->image, cmd);
+ pixman_image_ref(res->image);
+ }
+
+ /* realloc the surface ptr */
+ scanout->ds = qemu_create_displaysurface_pixman(res->image);
+ dpy_gfx_replace_surface(scanout->con, NULL);
+ dpy_gfx_replace_surface(scanout->con, scanout->ds);
+ res->scanout_bitmask = ss.scanout_id;
+}
+
+static void
+rutabaga_cmd_submit_3d(VirtIOGPU *g,
+ struct virtio_gpu_ctrl_command *cmd)
+{
+ int32_t result;
+ struct virtio_gpu_cmd_submit cs;
+ void *buf;
+ size_t s;
+
+ GET_VIRTIO_GPU_GL(g);
+ GET_RUTABAGA(virtio_gpu);
+
+ VIRTIO_GPU_FILL_CMD(cs);
+ trace_virtio_gpu_cmd_ctx_submit(cs.hdr.ctx_id, cs.size);
+
+ buf = g_malloc(cs.size);
+ s = iov_to_buf(cmd->elem.out_sg, cmd->elem.out_num,
+ sizeof(cs), buf, cs.size);
+ if (s != cs.size) {
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: size mismatch (%zd/%d)",
+ __func__, s, cs.size);
+ cmd->error = VIRTIO_GPU_RESP_ERR_INVALID_PARAMETER;
+ goto out;
+ }
+
+ result = rutabaga_submit_command(rutabaga, cs.hdr.ctx_id, buf, cs.size);
+ CHECK_RESULT(result, cmd);
+
+out:
+ g_free(buf);
+}
+
+static void
+rutabaga_cmd_transfer_to_host_2d(VirtIOGPU *g,
+ struct virtio_gpu_ctrl_command *cmd)
+{
+ int32_t result;
+ struct rutabaga_transfer transfer = { 0 };
+ struct virtio_gpu_transfer_to_host_2d t2d;
+
+ GET_VIRTIO_GPU_GL(g);
+ GET_RUTABAGA(virtio_gpu);
+
+ VIRTIO_GPU_FILL_CMD(t2d);
+ trace_virtio_gpu_cmd_res_xfer_toh_2d(t2d.resource_id);
+
+ transfer.x = t2d.r.x;
+ transfer.y = t2d.r.y;
+ transfer.z = 0;
+ transfer.w = t2d.r.width;
+ transfer.h = t2d.r.height;
+ transfer.d = 1;
+
+ result = rutabaga_resource_transfer_write(rutabaga, 0, t2d.resource_id,
+ &transfer);
+ CHECK_RESULT(result, cmd);
+}
+
+static void
+rutabaga_cmd_transfer_to_host_3d(VirtIOGPU *g,
+ struct virtio_gpu_ctrl_command *cmd)
+{
+ int32_t result;
+ struct rutabaga_transfer transfer = { 0 };
+ struct virtio_gpu_transfer_host_3d t3d;
+
+ GET_VIRTIO_GPU_GL(g);
+ GET_RUTABAGA(virtio_gpu);
+
+ VIRTIO_GPU_FILL_CMD(t3d);
+ trace_virtio_gpu_cmd_res_xfer_toh_3d(t3d.resource_id);
+
+ transfer.x = t3d.box.x;
+ transfer.y = t3d.box.y;
+ transfer.z = t3d.box.z;
+ transfer.w = t3d.box.w;
+ transfer.h = t3d.box.h;
+ transfer.d = t3d.box.d;
+ transfer.level = t3d.level;
+ transfer.stride = t3d.stride;
+ transfer.layer_stride = t3d.layer_stride;
+ transfer.offset = t3d.offset;
+
+ result = rutabaga_resource_transfer_write(rutabaga, t3d.hdr.ctx_id,
+ t3d.resource_id, &transfer);
+ CHECK_RESULT(result, cmd);
+}
+
+static void
+rutabaga_cmd_transfer_from_host_3d(VirtIOGPU *g,
+ struct virtio_gpu_ctrl_command *cmd)
+{
+ int32_t result;
+ struct rutabaga_transfer transfer = { 0 };
+ struct virtio_gpu_transfer_host_3d t3d;
+
+ GET_VIRTIO_GPU_GL(g);
+ GET_RUTABAGA(virtio_gpu);
+
+ VIRTIO_GPU_FILL_CMD(t3d);
+ trace_virtio_gpu_cmd_res_xfer_fromh_3d(t3d.resource_id);
+
+ transfer.x = t3d.box.x;
+ transfer.y = t3d.box.y;
+ transfer.z = t3d.box.z;
+ transfer.w = t3d.box.w;
+ transfer.h = t3d.box.h;
+ transfer.d = t3d.box.d;
+ transfer.level = t3d.level;
+ transfer.stride = t3d.stride;
+ transfer.layer_stride = t3d.layer_stride;
+ transfer.offset = t3d.offset;
+
+ result = rutabaga_resource_transfer_read(rutabaga, t3d.hdr.ctx_id,
+ t3d.resource_id, &transfer, NULL);
+ CHECK_RESULT(result, cmd);
+}
+
+static void
+rutabaga_cmd_attach_backing(VirtIOGPU *g, struct virtio_gpu_ctrl_command *cmd)
+{
+ struct rutabaga_iovecs vecs = { 0 };
+ struct virtio_gpu_simple_resource *res;
+ struct virtio_gpu_resource_attach_backing att_rb;
+ struct iovec *res_iovs;
+ uint32_t res_niov;
+ int ret;
+
+ GET_VIRTIO_GPU_GL(g);
+ GET_RUTABAGA(virtio_gpu);
+
+ VIRTIO_GPU_FILL_CMD(att_rb);
+ trace_virtio_gpu_cmd_res_back_attach(att_rb.resource_id);
+
+ res = virtio_gpu_find_resource(g, att_rb.resource_id);
+ if (!res) {
+ qemu_log_mask(LOG_GUEST_ERROR, "%s: illegal resource specified %d\n",
+ __func__, att_rb.resource_id);
+ cmd->error = VIRTIO_GPU_RESP_ERR_UNSPEC;
+ return;
+ }
+
+ if (res->iov) {
+ cmd->error = VIRTIO_GPU_RESP_ERR_UNSPEC;
+ return;
+ }
+
+ ret = virtio_gpu_create_mapping_iov(g, att_rb.nr_entries, sizeof(att_rb),
+ cmd, NULL, &res_iovs, &res_niov);
+ if (ret != 0) {
+ cmd->error = VIRTIO_GPU_RESP_ERR_UNSPEC;
+ return;
+ }
+
+ vecs.iovecs = res_iovs;
+ vecs.num_iovecs = res_niov;
+
+ ret = rutabaga_resource_attach_backing(rutabaga, att_rb.resource_id, &vecs);
+ if (ret != 0) {
+ virtio_gpu_cleanup_mapping_iov(g, res_iovs, res_niov);
+ }
+}
+
+static void
+rutabaga_cmd_detach_backing(VirtIOGPU *g, struct virtio_gpu_ctrl_command *cmd)
+{
+ struct virtio_gpu_simple_resource *res;
+ struct virtio_gpu_resource_detach_backing detach_rb;
+
+ GET_VIRTIO_GPU_GL(g);
+ GET_RUTABAGA(virtio_gpu);
+
+ VIRTIO_GPU_FILL_CMD(detach_rb);
+ trace_virtio_gpu_cmd_res_back_detach(detach_rb.resource_id);
+
+ res = virtio_gpu_find_resource(g, detach_rb.resource_id);
+ CHECK(res, cmd);
+
+ rutabaga_resource_detach_backing(rutabaga,
+ detach_rb.resource_id);
+
+ virtio_gpu_cleanup_mapping(g, res);
+}
+
+static void
+rutabaga_cmd_ctx_attach_resource(VirtIOGPU *g,
+ struct virtio_gpu_ctrl_command *cmd)
+{
+ int32_t result;
+ struct virtio_gpu_ctx_resource att_res;
+
+ GET_VIRTIO_GPU_GL(g);
+ GET_RUTABAGA(virtio_gpu);
+
+ VIRTIO_GPU_FILL_CMD(att_res);
+ trace_virtio_gpu_cmd_ctx_res_attach(att_res.hdr.ctx_id,
+ att_res.resource_id);
+
+ result = rutabaga_context_attach_resource(rutabaga, att_res.hdr.ctx_id,
+ att_res.resource_id);
+ CHECK_RESULT(result, cmd);
+}
+
+static void
+rutabaga_cmd_ctx_detach_resource(VirtIOGPU *g,
+ struct virtio_gpu_ctrl_command *cmd)
+{
+ int32_t result;
+ struct virtio_gpu_ctx_resource det_res;
+
+ GET_VIRTIO_GPU_GL(g);
+ GET_RUTABAGA(virtio_gpu);
+
+ VIRTIO_GPU_FILL_CMD(det_res);
+ trace_virtio_gpu_cmd_ctx_res_detach(det_res.hdr.ctx_id,
+ det_res.resource_id);
+
+ result = rutabaga_context_detach_resource(rutabaga, det_res.hdr.ctx_id,
+ det_res.resource_id);
+ CHECK_RESULT(result, cmd);
+}
+
+static void
+rutabaga_cmd_get_capset_info(VirtIOGPU *g, struct virtio_gpu_ctrl_command *cmd)
+{
+ int32_t result;
+ struct virtio_gpu_get_capset_info info;
+ struct virtio_gpu_resp_capset_info resp;
+
+ GET_VIRTIO_GPU_GL(g);
+ GET_RUTABAGA(virtio_gpu);
+
+ VIRTIO_GPU_FILL_CMD(info);
+
+ result = rutabaga_get_capset_info(rutabaga, info.capset_index,
+ &resp.capset_id, &resp.capset_max_version,
+ &resp.capset_max_size);
+ CHECK_RESULT(result, cmd);
+
+ resp.hdr.type = VIRTIO_GPU_RESP_OK_CAPSET_INFO;
+ virtio_gpu_ctrl_response(g, cmd, &resp.hdr, sizeof(resp));
+}
+
+static void
+rutabaga_cmd_get_capset(VirtIOGPU *g, struct virtio_gpu_ctrl_command *cmd)
+{
+ int32_t result;
+ struct virtio_gpu_get_capset gc;
+ struct virtio_gpu_resp_capset *resp;
+ uint32_t capset_size;
+ uint32_t current_id;
+ bool found = false;
+
+ GET_VIRTIO_GPU_GL(g);
+ GET_RUTABAGA(virtio_gpu);
+
+ VIRTIO_GPU_FILL_CMD(gc);
+ for (uint32_t i = 0; i < virtio_gpu->num_capsets; i++) {
+ result = rutabaga_get_capset_info(rutabaga, i,
+ &current_id, &capset_size,
+ &capset_size);
+ CHECK_RESULT(result, cmd);
+
+ if (current_id == gc.capset_id) {
+ found = true;
+ break;
+ }
+ }
+
+ if (!found) {
+ qemu_log_mask(LOG_GUEST_ERROR, "capset not found!\n");
+ return;
+ }
+
+ resp = g_malloc0(sizeof(*resp) + capset_size);
+ resp->hdr.type = VIRTIO_GPU_RESP_OK_CAPSET;
+ rutabaga_get_capset(rutabaga, gc.capset_id, gc.capset_version,
+ (uint8_t *)resp->capset_data, capset_size);
+
+ virtio_gpu_ctrl_response(g, cmd, &resp->hdr, sizeof(*resp) + capset_size);
+ g_free(resp);
+}
+
+static void
+rutabaga_cmd_resource_create_blob(VirtIOGPU *g,
+ struct virtio_gpu_ctrl_command *cmd)
+{
+ int result;
+ struct rutabaga_iovecs vecs = { 0 };
+ struct virtio_gpu_simple_resource *res;
+ struct virtio_gpu_resource_create_blob cblob;
+ struct rutabaga_create_blob rc_blob = { 0 };
+
+ GET_VIRTIO_GPU_GL(g);
+ GET_RUTABAGA(virtio_gpu);
+
+ VIRTIO_GPU_FILL_CMD(cblob);
+ trace_virtio_gpu_cmd_res_create_blob(cblob.resource_id, cblob.size);
+
+ CHECK(cblob.resource_id != 0, cmd);
+
+ res = g_new0(struct virtio_gpu_simple_resource, 1);
+ QTAILQ_INSERT_HEAD(&g->reslist, res, next);
+
+ res->resource_id = cblob.resource_id;
+ res->blob_size = cblob.size;
+
+ if (cblob.blob_mem != VIRTIO_GPU_BLOB_MEM_HOST3D) {
+ result = virtio_gpu_create_mapping_iov(g, cblob.nr_entries,
+ sizeof(cblob), cmd, &res->addrs,
+ &res->iov, &res->iov_cnt);
+ CHECK_RESULT(result, cmd);
+ }
+
+ rc_blob.blob_id = cblob.blob_id;
+ rc_blob.blob_mem = cblob.blob_mem;
+ rc_blob.blob_flags = cblob.blob_flags;
+ rc_blob.size = cblob.size;
+
+ vecs.iovecs = res->iov;
+ vecs.num_iovecs = res->iov_cnt;
+
+ result = rutabaga_resource_create_blob(rutabaga, cblob.hdr.ctx_id,
+ cblob.resource_id, &rc_blob, &vecs,
+ NULL);
+ CHECK_RESULT(result, cmd);
+}
+
+static void
+rutabaga_cmd_resource_map_blob(VirtIOGPU *g,
+ struct virtio_gpu_ctrl_command *cmd)
+{
+ int32_t result;
+ struct virtio_gpu_simple_resource *res;
+ struct rutabaga_mapping mapping = { 0 };
+ struct virtio_gpu_resource_map_blob mblob;
+ struct virtio_gpu_resp_map_info resp;
+
+ GET_VIRTIO_GPU_GL(g);
+ GET_RUTABAGA(virtio_gpu);
+
+ VIRTIO_GPU_FILL_CMD(mblob);
+
+ CHECK(mblob.resource_id != 0, cmd);
+
+ res = virtio_gpu_find_resource(g, mblob.resource_id);
+ CHECK(res, cmd);
+
+ result = rutabaga_resource_map(rutabaga, mblob.resource_id, &mapping);
+ CHECK_RESULT(result, cmd);
+
+ memset(&resp, 0, sizeof(resp));
+ resp.hdr.type = VIRTIO_GPU_RESP_OK_MAP_INFO;
+ result = rutabaga_resource_map_info(rutabaga, mblob.resource_id,
+ &resp.map_info);
+
+ CHECK_RESULT(result, cmd);
+ virtio_gpu_ctrl_response(g, cmd, &resp.hdr, sizeof(resp));
+}
+
+static void
+rutabaga_cmd_resource_unmap_blob(VirtIOGPU *g,
+ struct virtio_gpu_ctrl_command *cmd)
+{
+ int32_t result;
+ struct virtio_gpu_simple_resource *res;
+ struct virtio_gpu_resource_unmap_blob ublob;
+
+ VIRTIO_GPU_FILL_CMD(ublob);
+
+ CHECK(ublob.resource_id != 0, cmd);
+
+ res = virtio_gpu_find_resource(g, ublob.resource_id);
+ CHECK(res, cmd);
+
+ result = rutabaga_handle_unmap(g, res);
+ CHECK_RESULT(result, cmd);
+}
+
+static void
+virtio_gpu_rutabaga_process_cmd(VirtIOGPU *g,
+ struct virtio_gpu_ctrl_command *cmd)
+{
+ struct rutabaga_fence fence = { 0 };
+ int32_t result;
+
+ GET_VIRTIO_GPU_GL(g);
+ GET_RUTABAGA(virtio_gpu);
+
+ VIRTIO_GPU_FILL_CMD(cmd->cmd_hdr);
+
+ switch (cmd->cmd_hdr.type) {
+ case VIRTIO_GPU_CMD_CTX_CREATE:
+ rutabaga_cmd_context_create(g, cmd);
+ break;
+ case VIRTIO_GPU_CMD_CTX_DESTROY:
+ rutabaga_cmd_context_destroy(g, cmd);
+ break;
+ case VIRTIO_GPU_CMD_RESOURCE_CREATE_2D:
+ rutabaga_cmd_create_resource_2d(g, cmd);
+ break;
+ case VIRTIO_GPU_CMD_RESOURCE_CREATE_3D:
+ rutabaga_cmd_create_resource_3d(g, cmd);
+ break;
+ case VIRTIO_GPU_CMD_SUBMIT_3D:
+ rutabaga_cmd_submit_3d(g, cmd);
+ break;
+ case VIRTIO_GPU_CMD_TRANSFER_TO_HOST_2D:
+ rutabaga_cmd_transfer_to_host_2d(g, cmd);
+ break;
+ case VIRTIO_GPU_CMD_TRANSFER_TO_HOST_3D:
+ rutabaga_cmd_transfer_to_host_3d(g, cmd);
+ break;
+ case VIRTIO_GPU_CMD_TRANSFER_FROM_HOST_3D:
+ rutabaga_cmd_transfer_from_host_3d(g, cmd);
+ break;
+ case VIRTIO_GPU_CMD_RESOURCE_ATTACH_BACKING:
+ rutabaga_cmd_attach_backing(g, cmd);
+ break;
+ case VIRTIO_GPU_CMD_RESOURCE_DETACH_BACKING:
+ rutabaga_cmd_detach_backing(g, cmd);
+ break;
+ case VIRTIO_GPU_CMD_SET_SCANOUT:
+ rutabaga_cmd_set_scanout(g, cmd);
+ break;
+ case VIRTIO_GPU_CMD_RESOURCE_FLUSH:
+ rutabaga_cmd_resource_flush(g, cmd);
+ break;
+ case VIRTIO_GPU_CMD_RESOURCE_UNREF:
+ rutabaga_cmd_resource_unref(g, cmd);
+ break;
+ case VIRTIO_GPU_CMD_CTX_ATTACH_RESOURCE:
+ rutabaga_cmd_ctx_attach_resource(g, cmd);
+ break;
+ case VIRTIO_GPU_CMD_CTX_DETACH_RESOURCE:
+ rutabaga_cmd_ctx_detach_resource(g, cmd);
+ break;
+ case VIRTIO_GPU_CMD_GET_CAPSET_INFO:
+ rutabaga_cmd_get_capset_info(g, cmd);
+ break;
+ case VIRTIO_GPU_CMD_GET_CAPSET:
+ rutabaga_cmd_get_capset(g, cmd);
+ break;
+ case VIRTIO_GPU_CMD_GET_DISPLAY_INFO:
+ virtio_gpu_get_display_info(g, cmd);
+ break;
+ case VIRTIO_GPU_CMD_GET_EDID:
+ virtio_gpu_get_edid(g, cmd);
+ break;
+ case VIRTIO_GPU_CMD_RESOURCE_CREATE_BLOB:
+ rutabaga_cmd_resource_create_blob(g, cmd);
+ break;
+ case VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB:
+ rutabaga_cmd_resource_map_blob(g, cmd);
+ break;
+ case VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB:
+ rutabaga_cmd_resource_unmap_blob(g, cmd);
+ break;
+ default:
+ cmd->error = VIRTIO_GPU_RESP_ERR_UNSPEC;
+ break;
+ }
+
+ if (cmd->finished) {
+ return;
+ }
+ if (cmd->error) {
+ fprintf(stderr, "%s: ctrl 0x%x, error 0x%x\n", __func__,
+ cmd->cmd_hdr.type, cmd->error);
+ virtio_gpu_ctrl_response_nodata(g, cmd, cmd->error);
+ return;
+ }
+ if (!(cmd->cmd_hdr.flags & VIRTIO_GPU_FLAG_FENCE)) {
+ virtio_gpu_ctrl_response_nodata(g, cmd, VIRTIO_GPU_RESP_OK_NODATA);
+ return;
+ }
+
+ fence.flags = cmd->cmd_hdr.flags;
+ fence.ctx_id = cmd->cmd_hdr.ctx_id;
+ fence.fence_id = cmd->cmd_hdr.fence_id;
+ fence.ring_idx = cmd->cmd_hdr.ring_idx;
+
+ trace_virtio_gpu_fence_ctrl(cmd->cmd_hdr.fence_id, cmd->cmd_hdr.type);
+
+ result = rutabaga_create_fence(rutabaga, &fence);
+ CHECK_RESULT(result, cmd);
+}
+
+static void
+virtio_gpu_rutabaga_fence_cb(uint64_t user_data,
+ struct rutabaga_fence fence_data)
+{
+ VirtIOGPU *g = (VirtIOGPU *)(void*)(uintptr_t)user_data;
+ struct virtio_gpu_ctrl_command *cmd, *tmp;
+
+ bool signaled_ctx_specific = fence_data.flags & RUTABAGA_FLAG_INFO_RING_IDX;
+
+ QTAILQ_FOREACH_SAFE(cmd, &g->fenceq, next, tmp) {
+ /*
+ * Due to context specific timelines.
+ */
+ bool target_ctx_specific = cmd->cmd_hdr.flags &
+ RUTABAGA_FLAG_INFO_RING_IDX;
+
+ if (signaled_ctx_specific != target_ctx_specific) {
+ continue;
+ }
+
+ if (signaled_ctx_specific &&
+ (cmd->cmd_hdr.ring_idx != fence_data.ring_idx)) {
+ continue;
+ }
+
+ if (cmd->cmd_hdr.fence_id > fence_data.fence_id) {
+ continue;
+ }
+
+ trace_virtio_gpu_fence_resp(cmd->cmd_hdr.fence_id);
+ virtio_gpu_ctrl_response_nodata(g, cmd, VIRTIO_GPU_RESP_OK_NODATA);
+ QTAILQ_REMOVE(&g->fenceq, cmd, next);
+ g_free(cmd);
+ }
+}
+
+static int virtio_gpu_rutabaga_init(VirtIOGPU *g)
+{
+ int result;
+ uint64_t capset_mask;
+ struct rutabaga_channels channels = { 0 };
+ struct rutabaga_builder builder = { 0 };
+
+ GET_VIRTIO_GPU_GL(g);
+ virtio_gpu->rutabaga = NULL;
+
+ if (!virtio_gpu->capset_names) {
+ return -EINVAL;
+ }
+
+ result = rutabaga_calculate_capset_mask(virtio_gpu->capset_names,
+ &capset_mask);
+ if (result) {
+ return result;
+ }
+
+ /*
+ * rutabaga-0.1.1 is only compiled/tested with gfxstream and cross-domain
+ * support. Future versions may change this to have more context types if
+ * there is any interest.
+ */
+ if ((capset_mask & (1 << RUTABAGA_CAPSET_VIRGL)) ||
+ (capset_mask & (1 << RUTABAGA_CAPSET_VIRGL2)) ||
+ (capset_mask & (1 << RUTABAGA_CAPSET_VENUS)) ||
+ (capset_mask & (1 << RUTABAGA_CAPSET_DRM))) {
+ return -EINVAL;
+ }
+
+ builder.user_data = (uint64_t)(uintptr_t)g;
+ builder.fence_cb = virtio_gpu_rutabaga_fence_cb;
+ builder.capset_mask = capset_mask;
+
+ if (virtio_gpu->wayland_socket_path) {
+ if ((builder.capset_mask & (1 << RUTABAGA_CAPSET_CROSS_DOMAIN)) == 0) {
+ return -EINVAL;
+ }
+
+ channels.channels =
+ (struct rutabaga_channel *)calloc(1, sizeof(struct rutabaga_channel));
+ channels.num_channels = 1;
+ channels.channels[0].channel_name = virtio_gpu->wayland_socket_path;
+ channels.channels[0].channel_type = RUTABAGA_CHANNEL_TYPE_WAYLAND;
+ builder.channels = &channels;
+ }
+
+ result = rutabaga_init(&builder, (struct rutabaga **)&virtio_gpu->rutabaga);
+ if (builder.capset_mask & (1 << RUTABAGA_CAPSET_CROSS_DOMAIN)) {
+ free(channels.channels);
+ }
+
+ return result;
+}
+
+static int virtio_gpu_rutabaga_get_num_capsets(VirtIOGPU *g)
+{
+ int result = 0;
+ uint32_t num_capsets;
+ GET_VIRTIO_GPU_GL(g);
+
+ if (!virtio_gpu->renderer_inited) {
+ result = virtio_gpu_rutabaga_init(g);
+ if (result) {
+ qemu_log_mask(LOG_GUEST_ERROR, "Failed to init rutabaga\n");
+ return 0;
+ }
+
+ virtio_gpu->renderer_inited = true;
+ }
+
+ GET_RUTABAGA(virtio_gpu);
+
+ result = rutabaga_get_num_capsets(rutabaga, &num_capsets);
+ if (result) {
+ qemu_log_mask(LOG_GUEST_ERROR, "Failed to get capsets\n");
+ return 0;
+ }
+ virtio_gpu->num_capsets = num_capsets;
+ return (int)(num_capsets);
+}
+
+static void virtio_gpu_rutabaga_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
+{
+ VirtIOGPU *g = VIRTIO_GPU(vdev);
+ GET_VIRTIO_GPU_GL(g);
+ struct virtio_gpu_ctrl_command *cmd;
+
+ if (!virtio_queue_ready(vq)) {
+ return;
+ }
+
+ if (!virtio_gpu->renderer_inited) {
+ int result = virtio_gpu_rutabaga_init(g);
+ if (!result) {
+ virtio_gpu->renderer_inited = true;
+ }
+ }
+
+ if (!virtio_gpu->renderer_inited) {
+ return;
+ }
+
+ cmd = virtqueue_pop(vq, sizeof(struct virtio_gpu_ctrl_command));
+ while (cmd) {
+ cmd->vq = vq;
+ cmd->error = 0;
+ cmd->finished = false;
+ QTAILQ_INSERT_TAIL(&g->cmdq, cmd, next);
+ cmd = virtqueue_pop(vq, sizeof(struct virtio_gpu_ctrl_command));
+ }
+
+ virtio_gpu_process_cmdq(g);
+}
+
+void virtio_gpu_rutabaga_device_realize(DeviceState *qdev, Error **errp)
+{
+ int num_capsets;
+ VirtIOGPUBase *bdev = VIRTIO_GPU_BASE(qdev);
+ VirtIOGPUBaseClass *vbc = VIRTIO_GPU_BASE_GET_CLASS(bdev);
+
+ VirtIOGPU *gpudev = VIRTIO_GPU(qdev);
+ VirtIOGPUClass *vgc = VIRTIO_GPU_GET_CLASS(gpudev);
+
+ num_capsets = virtio_gpu_rutabaga_get_num_capsets(gpudev);
+ if (!num_capsets) {
+ return;
+ }
+
+ vbc->gl_flushed = virtio_gpu_rutabaga_gl_flushed;
+ vgc->handle_ctrl = virtio_gpu_rutabaga_handle_ctrl;
+ vgc->process_cmd = virtio_gpu_rutabaga_process_cmd;
+ vgc->update_cursor_data = virtio_gpu_rutabaga_update_cursor;
+
+#if HOST_BIG_ENDIAN
+ error_setg(errp, "rutabaga is not supported on bigendian platforms");
+ return;
+#endif
+
+ gpudev->parent_obj.conf.flags |= (1 << VIRTIO_GPU_FLAG_RUTABAGA_ENABLED);
+ gpudev->parent_obj.conf.flags |= (1 << VIRTIO_GPU_FLAG_BLOB_ENABLED);
+ gpudev->parent_obj.conf.flags
+ |= (1 << VIRTIO_GPU_FLAG_CONTEXT_INIT_ENABLED);
+
+ VIRTIO_GPU_BASE(gpudev)->virtio_config.num_capsets = num_capsets;
+ virtio_gpu_device_realize(qdev, errp);
+}
--
2.40.0.634.g4ca3ef3211-goog
* [RFC PATCH 11/13] gfxstream + rutabaga: enable rutabaga
2023-04-21 1:12 [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany? Gurchetan Singh
` (9 preceding siblings ...)
2023-04-21 1:12 ` [RFC PATCH 10/13] gfxstream + rutabaga: add initial support for gfxstream Gurchetan Singh
@ 2023-04-21 1:12 ` Gurchetan Singh
2023-04-21 1:12 ` [RFC PATCH 12/13] HACK: use memory region API to inject memory to guest Gurchetan Singh
` (2 subsequent siblings)
13 siblings, 0 replies; 28+ messages in thread
From: Gurchetan Singh @ 2023-04-21 1:12 UTC (permalink / raw)
To: qemu-devel
Cc: pbonzini, philmd, david, stefanha, kraxel, marcandre.lureau,
akihiko.odaki, dmitry.osipenko, ray.huang, alex.bennee
This change enables rutabaga to receive virtio-gpu-3d hypercalls
when it is active.
Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
---
hw/display/virtio-gpu-base.c | 3 ++-
hw/display/virtio-gpu-gl.c | 9 ++++++++-
hw/display/virtio-gpu.c | 5 +++--
3 files changed, 13 insertions(+), 4 deletions(-)
diff --git a/hw/display/virtio-gpu-base.c b/hw/display/virtio-gpu-base.c
index 6c5f1f327f..7913d9b82e 100644
--- a/hw/display/virtio-gpu-base.c
+++ b/hw/display/virtio-gpu-base.c
@@ -206,7 +206,8 @@ virtio_gpu_base_get_features(VirtIODevice *vdev, uint64_t features,
{
VirtIOGPUBase *g = VIRTIO_GPU_BASE(vdev);
- if (virtio_gpu_virgl_enabled(g->conf)) {
+ if (virtio_gpu_virgl_enabled(g->conf) ||
+ virtio_gpu_rutabaga_enabled(g->conf)) {
features |= (1 << VIRTIO_GPU_F_VIRGL);
}
if (virtio_gpu_edid_enabled(g->conf)) {
diff --git a/hw/display/virtio-gpu-gl.c b/hw/display/virtio-gpu-gl.c
index 15270b0c8a..fc09900ed9 100644
--- a/hw/display/virtio-gpu-gl.c
+++ b/hw/display/virtio-gpu-gl.c
@@ -23,7 +23,14 @@
static void virtio_gpu_gl_device_realize(DeviceState *qdev, Error **errp)
{
- virtio_gpu_virgl_device_realize(qdev, errp);
+ VirtIOGPUGL *virtio_gpu = VIRTIO_GPU_GL(qdev);
+ virtio_gpu->rutabaga = NULL;
+ virtio_gpu_rutabaga_device_realize(qdev, errp);
+
+ /* Fallback to virgl if rutabaga fails to initialize */
+ if (!virtio_gpu->rutabaga) {
+ virtio_gpu_virgl_device_realize(qdev, errp);
+ }
}
static Property virtio_gpu_gl_properties[] = {
diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index 938eed9181..3e92f9db6b 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -1314,8 +1314,9 @@ void virtio_gpu_device_realize(DeviceState *qdev, Error **errp)
VirtIOGPU *g = VIRTIO_GPU(qdev);
if (virtio_gpu_blob_enabled(g->parent_obj.conf)) {
- if (!virtio_gpu_have_udmabuf()) {
- error_setg(errp, "cannot enable blob resources without udmabuf");
+ if (!virtio_gpu_have_udmabuf() &&
+ !virtio_gpu_rutabaga_enabled(g->parent_obj.conf)) {
+ error_setg(errp, "need udmabuf or rutabaga for blob resources");
return;
}
--
2.40.0.634.g4ca3ef3211-goog
* [RFC PATCH 12/13] HACK: use memory region API to inject memory to guest
2023-04-21 1:12 [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany? Gurchetan Singh
` (10 preceding siblings ...)
2023-04-21 1:12 ` [RFC PATCH 11/13] gfxstream + rutabaga: enable rutabaga Gurchetan Singh
@ 2023-04-21 1:12 ` Gurchetan Singh
2023-04-22 15:46 ` Akihiko Odaki
2023-04-22 19:22 ` Peter Maydell
2023-04-21 1:12 ` [RFC PATCH 13/13] HACK: schedule fence return on main AIO context Gurchetan Singh
2023-04-21 16:02 ` [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany? Stefan Hajnoczi
13 siblings, 2 replies; 28+ messages in thread
From: Gurchetan Singh @ 2023-04-21 1:12 UTC (permalink / raw)
To: qemu-devel
Cc: pbonzini, philmd, david, stefanha, kraxel, marcandre.lureau,
akihiko.odaki, dmitry.osipenko, ray.huang, alex.bennee
I just copied the patches that have been floating around that do
this, but it doesn't seem to work robustly. This current
implementation is probably good enough to run vkcube or simple
apps, but whenever a test starts to aggressively map/unmap memory,
things do explode on the QEMU side.
A simple way to reproduce is to run:
./deqp-vk --deqp-case=deqp-vk --deqp-case=dEQP-VK.memory.mapping.suballocation.*
You should get stack traces that sometimes look like this:
0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737316304448) at ./nptl/pthread_kill.c:44
1 __pthread_kill_internal (signo=6, threadid=140737316304448) at ./nptl/pthread_kill.c:78
2 __GI___pthread_kill (threadid=140737316304448, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
3 0x00007ffff7042476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
4 0x00007ffff70287f3 in __GI_abort () at ./stdlib/abort.c:79
5 0x00007ffff70896f6 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff71dbb8c "%s\n") at ../sysdeps/posix/libc_fatal.c:155
6 0x00007ffff70a0d7c in malloc_printerr (str=str@entry=0x7ffff71de7b0 "double free or corruption (out)") at ./malloc/malloc.c:5664
7 0x00007ffff70a2ef0 in _int_free (av=0x7ffff7219c80 <main_arena>, p=0x555557793e00, have_lock=<optimized out>) at ./malloc/malloc.c:4588
8 0x00007ffff70a54d3 in __GI___libc_free (mem=<optimized out>) at ./malloc/malloc.c:3391
9 0x0000555555d65e7e in phys_section_destroy (mr=0x555557793e10) at ../softmmu/physmem.c:1003
10 0x0000555555d65ed0 in phys_sections_free (map=0x555556d4b410) at ../softmmu/physmem.c:1011
11 0x0000555555d69578 in address_space_dispatch_free (d=0x555556d4b400) at ../softmmu/physmem.c:2430
12 0x0000555555d58412 in flatview_destroy (view=0x5555572bb090) at ../softmmu/memory.c:292
13 0x000055555600fd23 in call_rcu_thread (opaque=0x0) at ../util/rcu.c:284
14 0x00005555560026d4 in qemu_thread_start (args=0x5555569cafa0) at ../util/qemu-thread-posix.c:541
15 0x00007ffff7094b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
16 0x00007ffff7126a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
or this:
0x0000555555e1dc80 in object_unref (objptr=0x6d656d3c6b6e696c) at ../qom/object.c:1198
1198 g_assert(obj->ref > 0);
(gdb) bt
0 0x0000555555e1dc80 in object_unref (objptr=0x6d656d3c6b6e696c) at ../qom/object.c:1198
1 0x0000555555d5cca5 in memory_region_unref (mr=0x5555572b9e20) at ../softmmu/memory.c:1799
2 0x0000555555d65e47 in phys_section_destroy (mr=0x5555572b9e20) at ../softmmu/physmem.c:998
3 0x0000555555d65ec7 in phys_sections_free (map=0x5555588365c0) at ../softmmu/physmem.c:1011
4 0x0000555555d6956f in address_space_dispatch_free (d=0x5555588365b0) at ../softmmu/physmem.c:2430
5 0x0000555555d58409 in flatview_destroy (view=0x555558836570) at ../softmmu/memory.c:292
6 0x000055555600fd1a in call_rcu_thread (opaque=0x0) at ../util/rcu.c:284
7 0x00005555560026cb in qemu_thread_start (args=0x5555569cafa0) at ../util/qemu-thread-posix.c:541
8 0x00007ffff7094b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
9 0x00007ffff7126a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
The reason seems to be that memory regions are handled on a different
thread than the virtio-gpu thread, which inevitably leads to
races. The memory region docs[a] generally seem to dissuade this:
"In order to do this, as a general rule do not create or destroy
memory regions dynamically during a device’s lifetime, and only
call object_unparent() in the memory region owner’s instance_finalize
callback. The dynamically allocated data structure that contains
the memory region then should obviously be freed in the
instance_finalize callback as well."
However, instance_finalize is only called upon device destruction, so
holding onto the memory until then is unlikely to be an option. The
tests do pass when virtio-gpu doesn't free the memory, but the guest
progressively becomes slower and then OOMs.
Though the API does make an exception:
"There is an exception to the above rule: it is okay to call
object_unparent at any time for an alias or a container region. It is
therefore also okay to create or destroy alias and container regions
dynamically during a device’s lifetime."
I believe we are trying to create a container subregion, but that's
still failing. Are we doing it right? Can any memory region experts
here help out? The other relevant patch in this series
is "virtio-gpu: hostmem".
[a] https://qemu.readthedocs.io/en/latest/devel/memory.html
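For reference, the container pattern the docs describe would look roughly like the following pseudocode-level sketch against QEMU's internal memory API (not compilable standalone; the realize-time container name and size field are assumptions, the rest mirrors the hunks in this patch):

```c
/* Sketch only: QEMU-internal API, assumed names, not a runnable program. */

/* realize(): create the container once; it lives for the device's whole
 * lifetime, so it never needs object_unparent() at runtime. */
memory_region_init(&g->parent_obj.hostmem, OBJECT(g), "virtio-gpu-hostmem",
                   g->parent_obj.conf.hostmem);

/* map: create only a leaf RAM-device region and parent it to the container;
 * per the docs, dynamically adding/removing subregions of a container is
 * the sanctioned case. */
memory_region_init_ram_device_ptr(&res->region, OBJECT(g), NULL,
                                  mapping.size, (void *)mapping.ptr);
memory_region_add_subregion(&g->parent_obj.hostmem, mblob.offset,
                            &res->region);

/* unmap: the leaf is removed and unparented, but its backing storage must
 * stay valid until the RCU flatview teardown (the crashing path in the
 * backtraces above) has finished with it. */
memory_region_del_subregion(&g->parent_obj.hostmem, &res->region);
object_unparent(OBJECT(&res->region));
```

The crashes look like exactly the lifetime problem the docs' rule guards against: the leaf region's destruction racing with the RCU thread that still holds a reference to it.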
Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
---
hw/display/virtio-gpu-rutabaga.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/hw/display/virtio-gpu-rutabaga.c b/hw/display/virtio-gpu-rutabaga.c
index 5fd1154198..196267aac2 100644
--- a/hw/display/virtio-gpu-rutabaga.c
+++ b/hw/display/virtio-gpu-rutabaga.c
@@ -159,6 +159,12 @@ static int32_t rutabaga_handle_unmap(VirtIOGPU *g,
GET_VIRTIO_GPU_GL(g);
GET_RUTABAGA(virtio_gpu);
+ memory_region_transaction_begin();
+ memory_region_set_enabled(&res->region, false);
+ memory_region_del_subregion(&g->parent_obj.hostmem, &res->region);
+ object_unparent(OBJECT(&res->region));
+ memory_region_transaction_commit();
+
res->mapped = NULL;
return rutabaga_resource_unmap(rutabaga, res->resource_id);
}
@@ -671,6 +677,14 @@ rutabaga_cmd_resource_map_blob(VirtIOGPU *g,
result = rutabaga_resource_map(rutabaga, mblob.resource_id, &mapping);
CHECK_RESULT(result, cmd);
+ memory_region_transaction_begin();
+ memory_region_init_ram_device_ptr(&res->region, OBJECT(g), NULL,
+ mapping.size, (void *)mapping.ptr);
+ memory_region_add_subregion(&g->parent_obj.hostmem, mblob.offset,
+ &res->region);
+ memory_region_set_enabled(&res->region, true);
+ memory_region_transaction_commit();
+
memset(&resp, 0, sizeof(resp));
resp.hdr.type = VIRTIO_GPU_RESP_OK_MAP_INFO;
result = rutabaga_resource_map_info(rutabaga, mblob.resource_id,
--
2.40.0.634.g4ca3ef3211-goog
* [RFC PATCH 13/13] HACK: schedule fence return on main AIO context
2023-04-21 1:12 [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany? Gurchetan Singh
` (11 preceding siblings ...)
2023-04-21 1:12 ` [RFC PATCH 12/13] HACK: use memory region API to inject memory to guest Gurchetan Singh
@ 2023-04-21 1:12 ` Gurchetan Singh
2023-04-21 15:59 ` Stefan Hajnoczi
2023-04-21 16:02 ` [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany? Stefan Hajnoczi
13 siblings, 1 reply; 28+ messages in thread
From: Gurchetan Singh @ 2023-04-21 1:12 UTC (permalink / raw)
To: qemu-devel
Cc: pbonzini, philmd, david, stefanha, kraxel, marcandre.lureau,
akihiko.odaki, dmitry.osipenko, ray.huang, alex.bennee
Both gfxstream and cross-domain (and even newer versions of
virglrenderer: see VIRGL_RENDERER_ASYNC_FENCE_CB) like to signal
fence completion on threads ("callback threads") that are
different from the thread that processes the command queue
(the "main thread").
This is generally possible with locking, and this is what we do
in crosvm and other virtio-gpu 1.1 implementations. However, on
QEMU a deadlock is observed if virtio_gpu_ctrl_response_nodata(..)
[used in the fence callback] is used from a thread that is not the
main thread.
The reason is the main thread takes the big QEMU lock (bql) somewhere
when processing the command queue, and virtio_gpu_ctrl_response_nodata(..)
needs that lock. If you add in the lock needed to protect &g->fenceq
from concurrent access by the main thread and the callback threads,
you can end up with deadlocks.
It's possible to work around this by scheduling the return of the fence
descriptors via aio_bh_schedule_oneshot_full(..), but that somewhat
negates the rationale for the asynchronous callbacks.
I also played around with aio_context_acquire()/aio_context_release(),
doesn't seem to help.
Is signaling the virtio_queue outside of the main thread possible? If
so, how?
Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
---
hw/display/virtio-gpu-rutabaga.c | 29 ++++++++++++++++++++++++++---
include/hw/virtio/virtio-gpu.h | 1 +
2 files changed, 27 insertions(+), 3 deletions(-)
diff --git a/hw/display/virtio-gpu-rutabaga.c b/hw/display/virtio-gpu-rutabaga.c
index 196267aac2..5c296aeef1 100644
--- a/hw/display/virtio-gpu-rutabaga.c
+++ b/hw/display/virtio-gpu-rutabaga.c
@@ -31,6 +31,11 @@ static int virtio_gpu_rutabaga_init(VirtIOGPU *g);
#define CHECK_RESULT(result, cmd) CHECK(result == 0, cmd)
+struct rutabaga_aio_data {
+ struct VirtIOGPUGL *virtio_gpu;
+ struct rutabaga_fence fence;
+};
+
static void
virtio_gpu_rutabaga_update_cursor(VirtIOGPU *g, struct virtio_gpu_scanout *s,
uint32_t resource_id)
@@ -823,10 +828,11 @@ virtio_gpu_rutabaga_process_cmd(VirtIOGPU *g,
}
static void
-virtio_gpu_rutabaga_fence_cb(uint64_t user_data,
- struct rutabaga_fence fence_data)
+virtio_gpu_rutabaga_aio_cb(void *opaque)
{
- VirtIOGPU *g = (VirtIOGPU *)(void*)(uintptr_t)user_data;
+ struct rutabaga_aio_data *data = (struct rutabaga_aio_data *)opaque;
+ VirtIOGPU *g = (VirtIOGPU *)data->virtio_gpu;
+ struct rutabaga_fence fence_data = data->fence;
struct virtio_gpu_ctrl_command *cmd, *tmp;
bool signaled_ctx_specific = fence_data.flags & RUTABAGA_FLAG_INFO_RING_IDX;
@@ -856,6 +862,22 @@ virtio_gpu_rutabaga_fence_cb(uint64_t user_data,
QTAILQ_REMOVE(&g->fenceq, cmd, next);
g_free(cmd);
}
+
+ g_free(data);
+}
+
+static void
+virtio_gpu_rutabaga_fence_cb(uint64_t user_data,
+ struct rutabaga_fence fence_data) {
+ struct rutabaga_aio_data *data;
+ VirtIOGPU *g = (VirtIOGPU *)(void*)(uintptr_t)user_data;
+ GET_VIRTIO_GPU_GL(g);
+
+ data = g_new0(struct rutabaga_aio_data, 1);
+ data->virtio_gpu = virtio_gpu;
+ data->fence = fence_data;
+ aio_bh_schedule_oneshot_full(virtio_gpu->ctx, virtio_gpu_rutabaga_aio_cb,
+ (void *)data, "aio");
}
static int virtio_gpu_rutabaga_init(VirtIOGPU *g)
@@ -912,6 +934,7 @@ static int virtio_gpu_rutabaga_init(VirtIOGPU *g)
free(channels.channels);
}
+ virtio_gpu->ctx = qemu_get_aio_context();
return result;
}
diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
index 034c71e8f5..b33ad0c68f 100644
--- a/include/hw/virtio/virtio-gpu.h
+++ b/include/hw/virtio/virtio-gpu.h
@@ -225,6 +225,7 @@ struct VirtIOGPUGL {
char *wayland_socket_path;
uint32_t num_capsets;
void *rutabaga;
+ AioContext *ctx;
};
struct VhostUserGPU {
--
2.40.0.634.g4ca3ef3211-goog
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [RFC PATCH 05/13] gfxstream + rutabaga prep: virtio_gpu_gl -> virtio_gpu_virgl
2023-04-21 1:12 ` [RFC PATCH 05/13] gfxstream + rutabaga prep: virtio_gpu_gl -> virtio_gpu_virgl Gurchetan Singh
@ 2023-04-21 9:40 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 28+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-04-21 9:40 UTC (permalink / raw)
To: Gurchetan Singh, qemu-devel
Cc: pbonzini, david, stefanha, kraxel, marcandre.lureau,
akihiko.odaki, dmitry.osipenko, ray.huang, alex.bennee
On 21/4/23 03:12, Gurchetan Singh wrote:
> The virtio-gpu GL device has a heavy dependence on virgl.
> Acknowledge this by naming functions accurately.
>
> Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
> ---
> hw/display/virtio-gpu-gl.c | 27 ++++++++++++++-------------
> hw/display/virtio-gpu-virgl.c | 2 +-
> include/hw/virtio/virtio-gpu.h | 2 +-
> 3 files changed, 16 insertions(+), 15 deletions(-)
> diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
> index adee17968d..e256e44172 100644
> --- a/include/hw/virtio/virtio-gpu.h
> +++ b/include/hw/virtio/virtio-gpu.h
> @@ -297,7 +297,7 @@ void virtio_gpu_virgl_process_cmd(VirtIOGPU *g,
> struct virtio_gpu_ctrl_command *cmd);
> void virtio_gpu_virgl_fence_poll(VirtIOGPU *g);
> void virtio_gpu_virgl_reset_scanout(VirtIOGPU *g);
> -void virtio_gpu_virgl_reset(VirtIOGPU *g);
> +void virtio_gpu_virglrenderer_reset(VirtIOGPU *g);
Or virtio_gpu_virgl_reset_renderer() similar to scanout?
Regardless,
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> int virtio_gpu_virgl_init(VirtIOGPU *g);
> int virtio_gpu_virgl_get_num_capsets(VirtIOGPU *g);
* Re: [RFC PATCH 07/13] gfxstream + rutabaga prep: define callbacks in realize function
2023-04-21 1:12 ` [RFC PATCH 07/13] gfxstream + rutabaga prep: define callbacks in realize function Gurchetan Singh
@ 2023-04-21 9:53 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 28+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-04-21 9:53 UTC (permalink / raw)
To: Gurchetan Singh, qemu-devel
Cc: pbonzini, david, stefanha, kraxel, marcandre.lureau,
akihiko.odaki, dmitry.osipenko, ray.huang, alex.bennee
On 21/4/23 03:12, Gurchetan Singh wrote:
> This reduces the amount of renderer-backend-specific code needed to
> be exposed to the GL device. We only need one realize function
> per renderer backend.
>
> Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
> ---
> hw/display/virtio-gpu-gl.c | 17 +++++++++++------
> hw/display/virtio-gpu-virgl.c | 35 ++++++++++++++++++++++++----------
> include/hw/virtio/virtio-gpu.h | 7 -------
> 3 files changed, 36 insertions(+), 23 deletions(-)
> @@ -34,13 +39,13 @@ static void virtio_gpu_gl_class_init(ObjectClass *klass, void *data)
> VirtIOGPUBaseClass *vbc = VIRTIO_GPU_BASE_CLASS(klass);
> VirtIOGPUClass *vgc = VIRTIO_GPU_CLASS(klass);
>
> - vbc->gl_flushed = virtio_gpu_virgl_flushed;
> - vgc->handle_ctrl = virtio_gpu_virgl_handle_ctrl;
> - vgc->process_cmd = virtio_gpu_virgl_process_cmd;
> - vgc->update_cursor_data = virtio_gpu_virgl_update_cursor;
> + vbc->gl_flushed = NULL;
> + vgc->handle_ctrl = NULL;
> + vgc->process_cmd = NULL;
> + vgc->update_cursor_data = NULL;
If we don't check that the pointers are unset before assigning them in
virtio_gpu_virgl_device_realize(), then NULL-initializing here
is not really helpful.
> - vdc->realize = virtio_gpu_virgl_device_realize;
> - vdc->reset = virtio_gpu_virgl_reset;
> + vdc->realize = virtio_gpu_gl_device_realize;
> + vdc->reset = NULL;
> device_class_set_props(dc, virtio_gpu_gl_properties);
> }
> @@ -737,9 +752,9 @@ void virtio_gpu_virgl_device_realize(DeviceState *qdev, Error **errp)
> return;
> }
>
> - g->parent_obj.conf.flags |= (1 << VIRTIO_GPU_FLAG_VIRGL_ENABLED);
> - VIRTIO_GPU_BASE(g)->virtio_config.num_capsets =
> - virtio_gpu_virgl_get_num_capsets(g);
> + gpudev->parent_obj.conf.flags |= (1 << VIRTIO_GPU_FLAG_VIRGL_ENABLED);
Or:
VIRTIO_DEVICE(gpudev)->conf.flags |= (1 <<
VIRTIO_GPU_FLAG_VIRGL_ENABLED);
> + VIRTIO_GPU_BASE(gpudev)->virtio_config.num_capsets =
> + virtio_gpu_virgl_get_num_capsets(gpudev);
>
> virtio_gpu_device_realize(qdev, errp);
> }
Removing the NULL-inits:
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
PD: I'd use "hw/display/virtio-gpu-virgl" patch prefix instead
of "gfxstream + rutabaga prep". You only start mentioning rutabaga
in the next patch. And the patches up to here could be applied
as virtio-gpu cleanup patches.
* Re: [RFC PATCH 13/13] HACK: schedule fence return on main AIO context
2023-04-21 1:12 ` [RFC PATCH 13/13] HACK: schedule fence return on main AIO context Gurchetan Singh
@ 2023-04-21 15:59 ` Stefan Hajnoczi
2023-04-21 23:20 ` Gurchetan Singh
0 siblings, 1 reply; 28+ messages in thread
From: Stefan Hajnoczi @ 2023-04-21 15:59 UTC (permalink / raw)
To: Gurchetan Singh
Cc: qemu-devel, pbonzini, philmd, david, stefanha, kraxel,
marcandre.lureau, akihiko.odaki, dmitry.osipenko, ray.huang,
alex.bennee
On Thu, 20 Apr 2023 at 21:13, Gurchetan Singh
<gurchetansingh@chromium.org> wrote:
>
> gfxstream and both cross-domain (and even newer versions
> virglrenderer: see VIRGL_RENDERER_ASYNC_FENCE_CB) like to signal
> fence completion on threads ("callback threads") that are
> different from the thread that processes the command queue
> ("main thread").
>
> This is generally possible with locking, and this what we do
> in crosvm and other virtio-gpu1.1 implementations. However, on
> QEMU a deadlock is observed if virtio_gpu_ctrl_response_nodata(..)
> [used in the fence callback] is used from a thread that is not the
> main thread.
>
> The reason is the main thread takes the big QEMU lock (bql) somewhere
> when processing the command queue, and virtio_gpu_ctrl_response_nodata(..)
> needs that lock. If you add in the lock needed to protect &g->fenceq
> from concurrent access by the main thread and the callback threads,
> you end can end up with deadlocks.
>
> It's possible to workaround this by scheduling the return of the fence
> descriptors via aio_bh_schedule_oneshot_full(..), but that somewhat
> negates the rationale for the asynchronous callbacks.
>
> I also played around with aio_context_acquire()/aio_context_release(),
> doesn't seem to help.
>
> Is signaling the virtio_queue outside of the main thread possible? If
> so, how?
Yes, if you want a specific thread to process a virtqueue, monitor the
virtqueue's host_notifier (aka ioeventfd) from the thread. That's how
--device virtio-blk,iothread= works. It attaches the host_notifier to
the IOThread's AioContext. Whenever the guest kicks the virtqueue, the
IOThread will respond instead of QEMU's main loop thread.
That said, I don't know the threading model of your code. Could you
explain which threads are involved? Do you want to process all buffers
from virtqueue in a specific thread or only these fence buffers?
>
> Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
> ---
> hw/display/virtio-gpu-rutabaga.c | 29 ++++++++++++++++++++++++++---
> include/hw/virtio/virtio-gpu.h | 1 +
> 2 files changed, 27 insertions(+), 3 deletions(-)
>
> diff --git a/hw/display/virtio-gpu-rutabaga.c b/hw/display/virtio-gpu-rutabaga.c
> index 196267aac2..5c296aeef1 100644
> --- a/hw/display/virtio-gpu-rutabaga.c
> +++ b/hw/display/virtio-gpu-rutabaga.c
> @@ -31,6 +31,11 @@ static int virtio_gpu_rutabaga_init(VirtIOGPU *g);
>
> #define CHECK_RESULT(result, cmd) CHECK(result == 0, cmd)
>
> +struct rutabaga_aio_data {
> + struct VirtIOGPUGL *virtio_gpu;
> + struct rutabaga_fence fence;
> +};
> +
> static void
> virtio_gpu_rutabaga_update_cursor(VirtIOGPU *g, struct virtio_gpu_scanout *s,
> uint32_t resource_id)
> @@ -823,10 +828,11 @@ virtio_gpu_rutabaga_process_cmd(VirtIOGPU *g,
> }
>
> static void
> -virtio_gpu_rutabaga_fence_cb(uint64_t user_data,
> - struct rutabaga_fence fence_data)
> +virtio_gpu_rutabaga_aio_cb(void *opaque)
> {
> - VirtIOGPU *g = (VirtIOGPU *)(void*)(uintptr_t)user_data;
> + struct rutabaga_aio_data *data = (struct rutabaga_aio_data *)opaque;
> + VirtIOGPU *g = (VirtIOGPU *)data->virtio_gpu;
> + struct rutabaga_fence fence_data = data->fence;
> struct virtio_gpu_ctrl_command *cmd, *tmp;
>
> bool signaled_ctx_specific = fence_data.flags & RUTABAGA_FLAG_INFO_RING_IDX;
> @@ -856,6 +862,22 @@ virtio_gpu_rutabaga_fence_cb(uint64_t user_data,
> QTAILQ_REMOVE(&g->fenceq, cmd, next);
> g_free(cmd);
> }
> +
> + g_free(data);
> +}
> +
> +static void
> +virtio_gpu_rutabaga_fence_cb(uint64_t user_data,
> + struct rutabaga_fence fence_data) {
> + struct rutabaga_aio_data *data;
> + VirtIOGPU *g = (VirtIOGPU *)(void*)(uintptr_t)user_data;
> + GET_VIRTIO_GPU_GL(g);
> +
> + data = g_new0(struct rutabaga_aio_data, 1);
> + data->virtio_gpu = virtio_gpu;
> + data->fence = fence_data;
> + aio_bh_schedule_oneshot_full(virtio_gpu->ctx, virtio_gpu_rutabaga_aio_cb,
> + (void *)data, "aio");
> }
>
> static int virtio_gpu_rutabaga_init(VirtIOGPU *g)
> @@ -912,6 +934,7 @@ static int virtio_gpu_rutabaga_init(VirtIOGPU *g)
> free(channels.channels);
> }
>
> + virtio_gpu->ctx = qemu_get_aio_context();
> return result;
> }
>
> diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
> index 034c71e8f5..b33ad0c68f 100644
> --- a/include/hw/virtio/virtio-gpu.h
> +++ b/include/hw/virtio/virtio-gpu.h
> @@ -225,6 +225,7 @@ struct VirtIOGPUGL {
> char *wayland_socket_path;
> uint32_t num_capsets;
> void *rutabaga;
> + AioContext *ctx;
> };
>
> struct VhostUserGPU {
> --
> 2.40.0.634.g4ca3ef3211-goog
>
>
* Re: [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany?
2023-04-21 1:12 [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany? Gurchetan Singh
` (12 preceding siblings ...)
2023-04-21 1:12 ` [RFC PATCH 13/13] HACK: schedule fence return on main AIO context Gurchetan Singh
@ 2023-04-21 16:02 ` Stefan Hajnoczi
2023-04-21 23:54 ` Gurchetan Singh
13 siblings, 1 reply; 28+ messages in thread
From: Stefan Hajnoczi @ 2023-04-21 16:02 UTC (permalink / raw)
To: Gurchetan Singh
Cc: qemu-devel, pbonzini, philmd, david, stefanha, kraxel,
marcandre.lureau, akihiko.odaki, dmitry.osipenko, ray.huang,
alex.bennee
On Thu, 20 Apr 2023 at 21:13, Gurchetan Singh
<gurchetansingh@chromium.org> wrote:
>
> From: Gurchetan Singh <gurchetansingh@google.com>
>
> Rationale:
>
> - gfxstream [a] is good for the Android Emulator/upstream QEMU
> alignment
> - Wayland passhthrough [b] via the cross-domain context type is good
> for Linux on Linux display virtualization
> - rutabaga_gfx [c] sits on top of gfxstream, cross-domain and even
> virglrenderer
> - This series ports rutabaga_gfx to QEMU
What are rutabaga_gfx and gfxstream? Can you explain where they sit in the
stack and how they build on or complement virtio-gpu and
virglrenderer?
Stefan
* Re: [RFC PATCH 13/13] HACK: schedule fence return on main AIO context
2023-04-21 15:59 ` Stefan Hajnoczi
@ 2023-04-21 23:20 ` Gurchetan Singh
2023-04-25 12:04 ` Stefan Hajnoczi
0 siblings, 1 reply; 28+ messages in thread
From: Gurchetan Singh @ 2023-04-21 23:20 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: qemu-devel, pbonzini, philmd, david, stefanha, kraxel,
marcandre.lureau, akihiko.odaki, dmitry.osipenko, ray.huang,
alex.bennee
On Fri, Apr 21, 2023 at 9:00 AM Stefan Hajnoczi <stefanha@gmail.com> wrote:
>
> On Thu, 20 Apr 2023 at 21:13, Gurchetan Singh
> <gurchetansingh@chromium.org> wrote:
> >
> > gfxstream and both cross-domain (and even newer versions
> > virglrenderer: see VIRGL_RENDERER_ASYNC_FENCE_CB) like to signal
> > fence completion on threads ("callback threads") that are
> > different from the thread that processes the command queue
> > ("main thread").
> >
> > This is generally possible with locking, and this what we do
> > in crosvm and other virtio-gpu1.1 implementations. However, on
> > QEMU a deadlock is observed if virtio_gpu_ctrl_response_nodata(..)
> > [used in the fence callback] is used from a thread that is not the
> > main thread.
> >
> > The reason is the main thread takes the big QEMU lock (bql) somewhere
> > when processing the command queue, and virtio_gpu_ctrl_response_nodata(..)
> > needs that lock. If you add in the lock needed to protect &g->fenceq
> > from concurrent access by the main thread and the callback threads,
> > you end can end up with deadlocks.
> >
> > It's possible to workaround this by scheduling the return of the fence
> > descriptors via aio_bh_schedule_oneshot_full(..), but that somewhat
> > negates the rationale for the asynchronous callbacks.
> >
> > I also played around with aio_context_acquire()/aio_context_release(),
> > doesn't seem to help.
> >
> > Is signaling the virtio_queue outside of the main thread possible? If
> > so, how?
>
> Yes, if you want a specific thread to process a virtqueue, monitor the
> virtqueue's host_notifier (aka ioeventfd) from the thread. That's how
> --device virtio-blk,iothread= works. It attaches the host_notifier to
> the IOThread's AioContext. Whenever the guest kicks the virtqueue, the
> IOThread will respond instead of QEMU's main loop thread.
>
> That said, I don't know the threading model of your code. Could you
> explain which threads are involved? Do you want to process all buffers
> from virtqueue in a specific thread or only these fence buffers?
Only the fence callback would be signalled via these callback threads.
The virtqueue would not be processed by the callback thread.
There can be multiple callback threads: for example, one for the
gfxstream context and another for the Wayland context. These threads
could be a C++ thread, a Rust thread or any other.
The strategy used by crosvm is to have a mutex around the fence state
to synchronize between multiple callback threads (which signal fences)
and main thread (which generates fences).
I tried to use aio_context_acquire(..)/aio_context_release(..) to
synchronize these threads, but that results in a deadlock. I think
those functions may assume they are called from an IOThread rather
than from an arbitrary thread. It seems aio_context_acquire(..)
succeeds for the callback thread, though.
Here's what I tried (as opposed to this patch, which works but is less
ideal than the solution below):
diff --git a/hw/display/virtio-gpu-rutabaga.c b/hw/display/virtio-gpu-rutabaga.c
index 196267aac2..0993b09090 100644
--- a/hw/display/virtio-gpu-rutabaga.c
+++ b/hw/display/virtio-gpu-rutabaga.c
@@ -798,6 +798,7 @@ virtio_gpu_rutabaga_process_cmd(VirtIOGPU *g,
}
if (cmd->finished) {
+ g_free(cmd);
return;
}
if (cmd->error) {
@@ -811,6 +812,8 @@ virtio_gpu_rutabaga_process_cmd(VirtIOGPU *g,
return;
}
+ QTAILQ_INSERT_TAIL(&g->fenceq, cmd, next);
+
fence.flags = cmd->cmd_hdr.flags;
fence.ctx_id = cmd->cmd_hdr.ctx_id;
fence.fence_id = cmd->cmd_hdr.fence_id;
@@ -827,9 +830,11 @@ virtio_gpu_rutabaga_fence_cb(uint64_t user_data,
struct rutabaga_fence fence_data)
{
VirtIOGPU *g = (VirtIOGPU *)(void*)(uintptr_t)user_data;
+ GET_VIRTIO_GPU_GL(g);
struct virtio_gpu_ctrl_command *cmd, *tmp;
bool signaled_ctx_specific = fence_data.flags &
RUTABAGA_FLAG_INFO_RING_IDX;
+ aio_context_acquire(virtio_gpu->ctx);
QTAILQ_FOREACH_SAFE(cmd, &g->fenceq, next, tmp) {
/*
@@ -856,6 +861,7 @@ virtio_gpu_rutabaga_fence_cb(uint64_t user_data,
QTAILQ_REMOVE(&g->fenceq, cmd, next);
g_free(cmd);
}
+ aio_context_release(virtio_gpu->ctx);
}
static int virtio_gpu_rutabaga_init(VirtIOGPU *g)
@@ -912,6 +918,7 @@ static int virtio_gpu_rutabaga_init(VirtIOGPU *g)
free(channels.channels);
}
+ virtio_gpu->ctx = qemu_get_aio_context();
return result;
}
@@ -942,6 +949,30 @@ static int
virtio_gpu_rutabaga_get_num_capsets(VirtIOGPU *g)
return (int)(num_capsets);
}
+static void virtio_gpu_rutabaga_process_cmdq(VirtIOGPU *g)
+{
+ struct virtio_gpu_ctrl_command *cmd;
+ VirtIOGPUClass *vgc = VIRTIO_GPU_GET_CLASS(g);
+
+ if (g->processing_cmdq) {
+ return;
+ }
+ g->processing_cmdq = true;
+ while (!QTAILQ_EMPTY(&g->cmdq)) {
+ cmd = QTAILQ_FIRST(&g->cmdq);
+
+ if (g->parent_obj.renderer_blocked) {
+ break;
+ }
+
+ QTAILQ_REMOVE(&g->cmdq, cmd, next);
+
+ /* process command */
+ vgc->process_cmd(g, cmd);
+ }
+ g->processing_cmdq = false;
+}
+
static void virtio_gpu_rutabaga_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
{
VirtIOGPU *g = VIRTIO_GPU(vdev);
@@ -972,7 +1003,7 @@ static void
virtio_gpu_rutabaga_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
cmd = virtqueue_pop(vq, sizeof(struct virtio_gpu_ctrl_command));
}
- virtio_gpu_process_cmdq(g);
+ virtio_gpu_rutabaga_process_cmdq(g);
}
void virtio_gpu_rutabaga_device_realize(DeviceState *qdev, Error **errp)
diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
index 034c71e8f5..b33ad0c68f 100644
--- a/include/hw/virtio/virtio-gpu.h
+++ b/include/hw/virtio/virtio-gpu.h
@@ -225,6 +225,7 @@ struct VirtIOGPUGL {
char *wayland_socket_path;
uint32_t num_capsets;
void *rutabaga;
+ AioContext *ctx;
};
struct VhostUserGPU {
--------------------------------------------------------------------------------------------------------
Main Thread deadlocked with above patch:
[Switching to thread 1 (Thread 0x7ffff5ced600 (LWP 8584))]
#0 futex_wait (private=0, expected=2, futex_word=0x5555569893a0
<qemu_global_mutex>) at ../sysdeps/nptl/futex-internal.h:146
146 in ../sysdeps/nptl/futex-internal.h
(gdb) bt
#0 futex_wait (private=0, expected=2, futex_word=0x5555569893a0
<qemu_global_mutex>) at ../sysdeps/nptl/futex-internal.h:146
#1 __GI___lll_lock_wait (futex=futex@entry=0x5555569893a0
<qemu_global_mutex>, private=0) at ./nptl/lowlevellock.c:49
#2 0x00007ffff7098082 in lll_mutex_lock_optimized
(mutex=0x5555569893a0 <qemu_global_mutex>) at
./nptl/pthread_mutex_lock.c:48
#3 ___pthread_mutex_lock (mutex=0x5555569893a0 <qemu_global_mutex>)
at ./nptl/pthread_mutex_lock.c:93
#4 0x00005555560019da in qemu_mutex_lock_impl (mutex=0x5555569893a0
<qemu_global_mutex>, file=0x55555626eaeb "../util/main-loop.c",
line=311) at ../util/qemu-thread-posix.c:94
#5 0x0000555555afdf2d in qemu_mutex_lock_iothread_impl
(file=0x55555626eaeb "../util/main-loop.c", line=311) at
../softmmu/cpus.c:504
#6 0x000055555601c39e in os_host_main_loop_wait (timeout=1471024) at
../util/main-loop.c:311
#7 0x000055555601c4bd in main_loop_wait (nonblocking=0) at
../util/main-loop.c:592
#8 0x0000555555b070a3 in qemu_main_loop () at ../softmmu/runstate.c:731
#9 0x0000555555e11dbd in qemu_default_main () at ../softmmu/main.c:37
#10 0x0000555555e11df7 in main (argc=23, argv=0x7fffffffde28) at
../softmmu/main.c:48
Callback Thread deadlocked with above patch:
[Switching to thread 56 (Thread 0x7ffd2c9ff640 (LWP 8649))]
#0 futex_wait (private=0, expected=2, futex_word=0x5555569893a0
<qemu_global_mutex>) at ../sysdeps/nptl/futex-internal.h:146
146 in ../sysdeps/nptl/futex-internal.h
(gdb) bt
#0 futex_wait (private=0, expected=2, futex_word=0x5555569893a0
<qemu_global_mutex>) at ../sysdeps/nptl/futex-internal.h:146
#1 __GI___lll_lock_wait (futex=futex@entry=0x5555569893a0
<qemu_global_mutex>, private=0) at ./nptl/lowlevellock.c:49
#2 0x00007ffff7098082 in lll_mutex_lock_optimized
(mutex=0x5555569893a0 <qemu_global_mutex>) at
./nptl/pthread_mutex_lock.c:48
#3 ___pthread_mutex_lock (mutex=0x5555569893a0 <qemu_global_mutex>)
at ./nptl/pthread_mutex_lock.c:93
#4 0x00005555560019da in qemu_mutex_lock_impl (mutex=0x5555569893a0
<qemu_global_mutex>, file=0x5555561df60c "../softmmu/physmem.c",
line=2581) at ../util/qemu-thread-posix.c:94
#5 0x0000555555afdf2d in qemu_mutex_lock_iothread_impl
(file=0x5555561df60c "../softmmu/physmem.c", line=2581) at
../softmmu/cpus.c:504
#6 0x0000555555d69a1a in prepare_mmio_access (mr=0x555556c10ca0) at
../softmmu/physmem.c:2581
#7 0x0000555555d6b7a8 in address_space_stl_internal
(as=0x5555578270f8, addr=4276125700, val=33, attrs=..., result=0x0,
endian=DEVICE_LITTLE_ENDIAN) at
/home/lenovo/qemu/memory_ldst.c.inc:318
#8 0x0000555555d6b91d in address_space_stl_le (as=0x5555578270f8,
addr=4276125700, val=33, attrs=..., result=0x0) at
/home/lenovo/qemu/memory_ldst.c.inc:357
#9 0x0000555555a0657c in pci_msi_trigger (dev=0x555557826ec0,
msg=...) at ../hw/pci/pci.c:356
#10 0x0000555555a02a5a in msi_send_message (dev=0x555557826ec0,
msg=...) at ../hw/pci/msi.c:379
#11 0x0000555555a045a0 in msix_notify (dev=0x555557826ec0, vector=1)
at ../hw/pci/msix.c:542
#12 0x0000555555ac8780 in virtio_pci_notify (d=0x555557826ec0,
vector=1) at ../hw/virtio/virtio-pci.c:77
#13 0x0000555555d15ddc in virtio_notify_vector (vdev=0x55555783ffd0,
vector=1) at ../hw/virtio/virtio.c:1985
#14 0x0000555555d17393 in virtio_irq (vq=0x555558716d80) at
../hw/virtio/virtio.c:2461
#15 0x0000555555d17439 in virtio_notify (vdev=0x55555783ffd0,
vq=0x555558716d80) at ../hw/virtio/virtio.c:2473
#16 0x0000555555b98732 in virtio_gpu_ctrl_response (g=0x55555783ffd0,
cmd=0x7ffdd80068b0, resp=0x7ffd2c9fe150, resp_len=24) at
../hw/display/virtio-gpu.c:177
#17 0x0000555555b9879c in virtio_gpu_ctrl_response_nodata
(g=0x55555783ffd0, cmd=0x7ffdd80068b0, type=VIRTIO_GPU_RESP_OK_NODATA)
at ../hw/display/virtio-gpu.c:189
#18 0x0000555555ba6b82 in virtio_gpu_rutabaga_fence_cb
(user_data=93825028849616, fence_data=...) at
../hw/display/virtio-gpu-rutabaga.c:860
#19 0x00007ffff75b9381 in
_ZN131_$LT$rutabaga_gfx..rutabaga_utils..RutabagaFenceClosure$LT$T$GT$$u20$as$u20$rutabaga_gfx..rutabaga_utils..RutabagaFenceCallback$GT$4call17hcbf1b4ac094a2a60E.llvm.14356420492591683714
() at /usr/local/lib/librutabaga_gfx_ffi.so
#20 0x00007ffff75d501b in
rutabaga_gfx::cross_domain::cross_domain::CrossDomainWorker::run () at
/usr/local/lib/librutabaga_gfx_ffi.so
#21 0x00007ffff75dd651 in
std::sys_common::backtrace::__rust_begin_short_backtrace () at
/usr/local/lib/librutabaga_gfx_ffi.so
#22 0x00007ffff75c62ba in
core::ops::function::FnOnce::call_once{{vtable.shim}} () at
/usr/local/lib/librutabaga_gfx_ffi.so
#23 0x00007ffff75e3b73 in alloc::boxed::{impl#44}::call_once<(), dyn
core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global>
(self=..., args=()) at library/alloc/src/boxed.rs:1940
#24 alloc::boxed::{impl#44}::call_once<(), alloc::boxed::Box<dyn
core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global>,
alloc::alloc::Global> (self=0x5555572e27d0, args=())
at library/alloc/src/boxed.rs:1940
#25 std::sys::unix::thread::{impl#2}::new::thread_start
(main=0x5555572e27d0) at library/std/src/sys/unix/thread.rs:108
#26 0x00007ffff7094b43 in start_thread (arg=<optimized out>) at
./nptl/pthread_create.c:442
#27 0x00007ffff7126a00 in clone3 () at
../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
>
>
> >
> > Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
> > ---
> > hw/display/virtio-gpu-rutabaga.c | 29 ++++++++++++++++++++++++++---
> > include/hw/virtio/virtio-gpu.h | 1 +
> > 2 files changed, 27 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/display/virtio-gpu-rutabaga.c b/hw/display/virtio-gpu-rutabaga.c
> > index 196267aac2..5c296aeef1 100644
> > --- a/hw/display/virtio-gpu-rutabaga.c
> > +++ b/hw/display/virtio-gpu-rutabaga.c
> > @@ -31,6 +31,11 @@ static int virtio_gpu_rutabaga_init(VirtIOGPU *g);
> >
> > #define CHECK_RESULT(result, cmd) CHECK(result == 0, cmd)
> >
> > +struct rutabaga_aio_data {
> > + struct VirtIOGPUGL *virtio_gpu;
> > + struct rutabaga_fence fence;
> > +};
> > +
> > static void
> > virtio_gpu_rutabaga_update_cursor(VirtIOGPU *g, struct virtio_gpu_scanout *s,
> > uint32_t resource_id)
> > @@ -823,10 +828,11 @@ virtio_gpu_rutabaga_process_cmd(VirtIOGPU *g,
> > }
> >
> > static void
> > -virtio_gpu_rutabaga_fence_cb(uint64_t user_data,
> > - struct rutabaga_fence fence_data)
> > +virtio_gpu_rutabaga_aio_cb(void *opaque)
> > {
> > - VirtIOGPU *g = (VirtIOGPU *)(void*)(uintptr_t)user_data;
> > + struct rutabaga_aio_data *data = (struct rutabaga_aio_data *)opaque;
> > + VirtIOGPU *g = (VirtIOGPU *)data->virtio_gpu;
> > + struct rutabaga_fence fence_data = data->fence;
> > struct virtio_gpu_ctrl_command *cmd, *tmp;
> >
> > bool signaled_ctx_specific = fence_data.flags & RUTABAGA_FLAG_INFO_RING_IDX;
> > @@ -856,6 +862,22 @@ virtio_gpu_rutabaga_fence_cb(uint64_t user_data,
> > QTAILQ_REMOVE(&g->fenceq, cmd, next);
> > g_free(cmd);
> > }
> > +
> > + g_free(data);
> > +}
> > +
> > +static void
> > +virtio_gpu_rutabaga_fence_cb(uint64_t user_data,
> > + struct rutabaga_fence fence_data) {
> > + struct rutabaga_aio_data *data;
> > + VirtIOGPU *g = (VirtIOGPU *)(void*)(uintptr_t)user_data;
> > + GET_VIRTIO_GPU_GL(g);
> > +
> > + data = g_new0(struct rutabaga_aio_data, 1);
> > + data->virtio_gpu = virtio_gpu;
> > + data->fence = fence_data;
> > + aio_bh_schedule_oneshot_full(virtio_gpu->ctx, virtio_gpu_rutabaga_aio_cb,
> > + (void *)data, "aio");
> > }
> >
> > static int virtio_gpu_rutabaga_init(VirtIOGPU *g)
> > @@ -912,6 +934,7 @@ static int virtio_gpu_rutabaga_init(VirtIOGPU *g)
> > free(channels.channels);
> > }
> >
> > + virtio_gpu->ctx = qemu_get_aio_context();
> > return result;
> > }
> >
> > diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
> > index 034c71e8f5..b33ad0c68f 100644
> > --- a/include/hw/virtio/virtio-gpu.h
> > +++ b/include/hw/virtio/virtio-gpu.h
> > @@ -225,6 +225,7 @@ struct VirtIOGPUGL {
> > char *wayland_socket_path;
> > uint32_t num_capsets;
> > void *rutabaga;
> > + AioContext *ctx;
> > };
> >
> > struct VhostUserGPU {
> > --
> > 2.40.0.634.g4ca3ef3211-goog
> >
> >
* Re: [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany?
2023-04-21 16:02 ` [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany? Stefan Hajnoczi
@ 2023-04-21 23:54 ` Gurchetan Singh
2023-04-22 16:41 ` Akihiko Odaki
0 siblings, 1 reply; 28+ messages in thread
From: Gurchetan Singh @ 2023-04-21 23:54 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: qemu-devel, pbonzini, philmd, david, stefanha, kraxel,
marcandre.lureau, akihiko.odaki, dmitry.osipenko, ray.huang,
alex.bennee
On Fri, Apr 21, 2023 at 9:02 AM Stefan Hajnoczi <stefanha@gmail.com> wrote:
>
> On Thu, 20 Apr 2023 at 21:13, Gurchetan Singh
> <gurchetansingh@chromium.org> wrote:
> >
> > From: Gurchetan Singh <gurchetansingh@google.com>
> >
> > Rationale:
> >
> > - gfxstream [a] is good for the Android Emulator/upstream QEMU
> > alignment
> > - Wayland passhthrough [b] via the cross-domain context type is good
> > for Linux on Linux display virtualization
> > - rutabaga_gfx [c] sits on top of gfxstream, cross-domain and even
> > virglrenderer
> > - This series ports rutabaga_gfx to QEMU
>
> What rutabaga_gfx and gfxstream? Can you explain where they sit in the
> stack and how they build on or complement virtio-gpu and
> virglrenderer?
rutabaga_gfx and gfxstream are both libraries that implement the
virtio-gpu protocol. There's a document available in the Gitlab issue
to see where they fit in the stack [a].
gfxstream grew out of the Android Emulator's need to virtualize
graphics for app developers. There's a short history of gfxstream in
patch 10. It complements virglrenderer in that it's a bit more
cross-platform and targets different use cases -- more detail here
[b]. The ultimate goal is to ditch out-of-tree kernel drivers in the
Android Emulator and adopt virtio, and porting gfxstream to QEMU would
speed up that transition.
rutabaga_gfx is a much smaller Rust library that sits on top of
gfxstream and even virglrenderer, but does a few extra things. It
implements the cross-domain context type, which provides Wayland
passthrough. This helps virtio-gpu by providing more modern display
virtualization. For example, Microsoft for WSL2 also uses a similar
technique [c], but I believe it is not virtio-based nor open-source.
With this, we can have the same open-source Wayland passthrough
solution on crosvm, QEMU and even Fuchsia [d]. Also, there might be
an additional small Rust context type for security-sensitive use cases
in the future -- rutabaga_gfx wouldn't compile its gfxstream bindings
(since it's C++ based) in such cases.
Both gfxstream and rutabaga_gfx are a part of the virtio spec [e] now too.
[a] https://gitlab.com/qemu-project/qemu/-/issues/1611
[b] https://lists.gnu.org/archive/html/qemu-devel/2023-03/msg04271.html
[c] https://www.youtube.com/watch?v=EkNBsBx501Q
[d] https://fuchsia-review.googlesource.com/c/fuchsia/+/778764
[e] https://github.com/oasis-tcs/virtio-spec/blob/master/device-types/gpu/description.tex#L533
>
> Stefan
* Re: [RFC PATCH 08/13] gfxstream + rutabaga prep: added need defintions, fields, and options
2023-04-21 1:12 ` [RFC PATCH 08/13] gfxstream + rutabaga prep: added need defintions, fields, and options Gurchetan Singh
@ 2023-04-22 14:54 ` Akihiko Odaki
0 siblings, 0 replies; 28+ messages in thread
From: Akihiko Odaki @ 2023-04-22 14:54 UTC (permalink / raw)
To: Gurchetan Singh, qemu-devel
Cc: pbonzini, philmd, david, stefanha, kraxel, marcandre.lureau,
dmitry.osipenko, ray.huang, alex.bennee
On 2023/04/21 10:12, Gurchetan Singh wrote:
> This modifies the common virtio-gpu.h file to have the fields and
> definitions needed by gfxstream/rutabaga. It also modifies VirtioGPUGL
> to have the runtime options needed by rutabaga. They are:
>
> - a colon separated list of capset names, defined in the virtio spec
> - a wayland socket path to enable guest Wayland passthrough
>
> The command to run these would be:
>
> -device virtio-vga-gl,capset_names=gfxstream:cross-domain, \
> wayland_socket_path=/run/user/1000/wayland-0,hostmem=8G \
It would be nice if it automatically determined the socket path according to:
https://wayland.freedesktop.org/docs/html/apb.html#Client-classwl__display_1af048371dfef7577bd39a3c04b78d0374
Documentation on setting up cross-domain, just like what you can find in
the crosvm book*, would also be helpful.
https://crosvm.dev/book/devices/wayland.html
>
> Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
> ---
> hw/display/virtio-gpu-gl.c | 2 ++
> include/hw/virtio/virtio-gpu.h | 8 ++++++++
> 2 files changed, 10 insertions(+)
>
> diff --git a/hw/display/virtio-gpu-gl.c b/hw/display/virtio-gpu-gl.c
> index 547e697333..15270b0c8a 100644
> --- a/hw/display/virtio-gpu-gl.c
> +++ b/hw/display/virtio-gpu-gl.c
> @@ -29,6 +29,8 @@ static void virtio_gpu_gl_device_realize(DeviceState *qdev, Error **errp)
> static Property virtio_gpu_gl_properties[] = {
> DEFINE_PROP_BIT("stats", VirtIOGPU, parent_obj.conf.flags,
> VIRTIO_GPU_FLAG_STATS_ENABLED, false),
> + DEFINE_PROP_STRING("capset_names", VirtIOGPUGL, capset_names),
> + DEFINE_PROP_STRING("wayland_socket_path", VirtIOGPUGL, wayland_socket_path),
> DEFINE_PROP_END_OF_LIST(),
> };
>
> diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
> index 421733d751..a35ade3608 100644
> --- a/include/hw/virtio/virtio-gpu.h
> +++ b/include/hw/virtio/virtio-gpu.h
> @@ -94,6 +94,7 @@ enum virtio_gpu_base_conf_flags {
> VIRTIO_GPU_FLAG_DMABUF_ENABLED,
> VIRTIO_GPU_FLAG_BLOB_ENABLED,
> VIRTIO_GPU_FLAG_CONTEXT_INIT_ENABLED,
> + VIRTIO_GPU_FLAG_RUTABAGA_ENABLED,
> };
>
> #define virtio_gpu_virgl_enabled(_cfg) \
> @@ -106,6 +107,8 @@ enum virtio_gpu_base_conf_flags {
> (_cfg.flags & (1 << VIRTIO_GPU_FLAG_DMABUF_ENABLED))
> #define virtio_gpu_blob_enabled(_cfg) \
> (_cfg.flags & (1 << VIRTIO_GPU_FLAG_BLOB_ENABLED))
> +#define virtio_gpu_rutabaga_enabled(_cfg) \
> + (_cfg.flags & (1 << VIRTIO_GPU_FLAG_RUTABAGA_ENABLED))
> #define virtio_gpu_hostmem_enabled(_cfg) \
> (_cfg.hostmem > 0)
> #define virtio_gpu_context_init_enabled(_cfg) \
> @@ -217,6 +220,11 @@ struct VirtIOGPUGL {
>
> bool renderer_inited;
> bool renderer_reset;
> +
> + char *capset_names;
> + char *wayland_socket_path;
> + uint32_t num_capsets;
> + void *rutabaga;
I prefer to have a line:
struct rutabaga;
in virtio-gpu.h and use it here. Perhaps it may be a bit weird to have
such a declaration in a renderer-independent file, but it's practically
harmless, can prevent something rogue from being assigned to the member,
and allows using the variable without annoying casts.
> };
>
> struct VhostUserGPU {
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC PATCH 10/13] gfxstream + rutabaga: add initial support for gfxstream
2023-04-21 1:12 ` [RFC PATCH 10/13] gfxstream + rutabaga: add initial support for gfxstream Gurchetan Singh
@ 2023-04-22 15:37 ` Akihiko Odaki
0 siblings, 0 replies; 28+ messages in thread
From: Akihiko Odaki @ 2023-04-22 15:37 UTC (permalink / raw)
To: Gurchetan Singh, qemu-devel
Cc: pbonzini, philmd, david, stefanha, kraxel, marcandre.lureau,
dmitry.osipenko, ray.huang, alex.bennee
On 2023/04/21 10:12, Gurchetan Singh wrote:
> This adds initial support for gfxstream and cross-domain. Both
> features rely on virtio-gpu blob resources and context types, which
> are also implemented in this patch.
>
> gfxstream has a long and illustrious history in Android graphics
> paravirtualization. It has been powering graphics in the Android
> Studio Emulator for more than a decade, which is the main developer
> platform.
>
> Originally conceived by Jesse Hall, it was first known as "EmuGL" [a].
> The key design characteristic was a 1:1 threading model and
> auto-generation, which fit nicely with the OpenGLES spec. It also
> allowed easy layering with ANGLE on the host, which provides the GLES
> implementations in Windows or macOS environments.
>
> gfxstream has traditionally been maintained by a single engineer, and
> between 2015 to 2021, the iron throne passed to Frank Yang. Just to
> name a few accomplishments in a reign filled with many of them: newer
> versions of GLES, address space graphics, snapshot support and CTS
> compliant Vulkan [b].
>
> One major drawback was the use of out-of-tree goldfish drivers.
> Android engineers didn't know much about DRM/KMS and especially TTM so
> a simple guest to host pipe was conceived.
>
> Luckily, virtio-gpu 3D started to emerge in 2016 due to the work of
> the Mesa/virglrenderer communities. In 2018, the initial virtio-gpu
> port of gfxstream was done by Cuttlefish enthusiast Alistair Delva.
> It was a symbol-compatible replacement for virglrenderer [c] and named
> "AVDVirglrenderer". This implementation forms the basis of the
> current gfxstream host implementation still in use today.
>
> cross-domain support follows a similar arc. Originally conceived by
> Wayland aficionado David Reveman and crosvm enjoyer Zach Reizner in
> 2018, it initially relied on the downstream "virtio-wl" device.
>
> In 2020 and 2021, virtio-gpu was extended to include blob resources
> and multiple timelines by yours truly, features gfxstream/cross-domain
> both require to function correctly.
>
> Right now, we stand at the precipice of a truly fantastic possibility:
> the Android Emulator powered by upstream QEMU and upstream Linux
> kernel. gfxstream will then be packaged properly, and app
> developers can even fix gfxstream bugs on their own if they encounter
> them.
>
> It's been quite the ride, my friends. Where will gfxstream head next,
> nobody really knows. I wouldn't be surprised if it's around for
> another decade, maintained by a new generation of Android graphics
> enthusiasts. One thing is for sure, though -- it'll be filled with
> friendship and magic!
>
> Technical details:
> - Very simple initial display integration: just used Pixman
> - Largely, 1:1 mapping of virtio-gpu hypercalls to rutabaga function
> calls
>
> [a] https://android-review.googlesource.com/c/platform/development/+/34470
> [b] https://android-review.googlesource.com/q/topic:%22vulkan-hostconnection-start%22
> [c] https://android-review.googlesource.com/c/device/generic/goldfish-opengl/+/761927
>
> Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
> ---
> hw/display/virtio-gpu-rutabaga.c | 995 +++++++++++++++++++++++++++++++
> 1 file changed, 995 insertions(+)
> create mode 100644 hw/display/virtio-gpu-rutabaga.c
>
> diff --git a/hw/display/virtio-gpu-rutabaga.c b/hw/display/virtio-gpu-rutabaga.c
> new file mode 100644
> index 0000000000..5fd1154198
> --- /dev/null
> +++ b/hw/display/virtio-gpu-rutabaga.c
> @@ -0,0 +1,995 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include "qemu/osdep.h"
> +#include "qemu/error-report.h"
> +#include "qemu/iov.h"
> +#include "trace.h"
> +#include "hw/virtio/virtio.h"
> +#include "hw/virtio/virtio-gpu.h"
> +#include "hw/virtio/virtio-gpu-pixman.h"
> +#include "hw/virtio/virtio-iommu.h"
> +
> +#include <rutabaga_gfx/rutabaga_gfx_ffi.h>
> +
> +static int virtio_gpu_rutabaga_init(VirtIOGPU *g);
> +
> +#define GET_VIRTIO_GPU_GL(x) \
> + VirtIOGPUGL *virtio_gpu = VIRTIO_GPU_GL(x); \
It's confusing to name a VirtIOGPUGL pointer derived from a VirtIOGPU
pointer "virtio_gpu". "gl", the name used in virtio-gpu-gl.c, is less
confusing, though it's a bit strange considering the VirtIOGPU pointer
is named "g".
I also doubt this macro (and the following GET_RUTABAGA()) makes sense.
It's confusing that they declare a variable, and it's not really saving
code either, as the content of each macro is just one line. But other
people may prefer these macros to stay.
> +
> +#define GET_RUTABAGA(x) \
> + struct rutabaga *rutabaga = (struct rutabaga *)(x->rutabaga); \
Wrap x in parentheses. Also, you don't need the cast from (void *) since
this is C.
> +
> +#define CHECK(condition, cmd) \
> + do { \
> + if (!condition) { \
> + qemu_log_mask(LOG_GUEST_ERROR, "CHECK_RESULT failed in %s() %s:" \
This macro is named CHECK, but the message says CHECK_RESULT.
> + "%d\n", __func__, __FILE__, __LINE__); \
> + cmd->error = VIRTIO_GPU_RESP_ERR_UNSPEC; \
> + return; \
> + } \
> + } while (0)
> +
> +#define CHECK_RESULT(result, cmd) CHECK(result == 0, cmd)
> +
> +static void
> +virtio_gpu_rutabaga_update_cursor(VirtIOGPU *g, struct virtio_gpu_scanout *s,
> + uint32_t resource_id)
> +{
> + struct virtio_gpu_simple_resource *res;
> + struct rutabaga_transfer transfer = { 0 };
> + struct iovec transfer_iovec;
> +
> + GET_VIRTIO_GPU_GL(g);
> + GET_RUTABAGA(virtio_gpu);
> +
> + res = virtio_gpu_find_resource(g, resource_id);
> + if (!res) {
> + return;
> + }
> +
> + if (res->width != s->current_cursor->width ||
> + res->height != s->current_cursor->height) {
> + return;
> + }
> +
> + transfer.x = 0;
> + transfer.y = 0;
> + transfer.z = 0;
> + transfer.w = res->width;
> + transfer.h = res->height;
> + transfer.d = 1;
> +
> + transfer_iovec.iov_base = (void *)s->current_cursor->data;
> + transfer_iovec.iov_len = res->width * res->height * 4;
> +
> + rutabaga_resource_transfer_read(rutabaga, 0,
> + resource_id, &transfer,
> + &transfer_iovec);
> +}
> +
> +static void
> +virtio_gpu_rutabaga_gl_flushed(VirtIOGPUBase *b)
> +{
> + VirtIOGPU *g = VIRTIO_GPU(b);
> + virtio_gpu_process_cmdq(g);
> +}
> +
> +static void
> +rutabaga_cmd_create_resource_2d(VirtIOGPU *g,
> + struct virtio_gpu_ctrl_command *cmd)
> +{
> + int32_t result;
> + struct rutabaga_create_3d rc_3d = { 0 };
> + struct virtio_gpu_simple_resource *res;
> + struct virtio_gpu_resource_create_2d c2d;
> +
> + GET_VIRTIO_GPU_GL(g);
> + GET_RUTABAGA(virtio_gpu);
> +
> + VIRTIO_GPU_FILL_CMD(c2d);
> + trace_virtio_gpu_cmd_res_create_2d(c2d.resource_id, c2d.format,
> + c2d.width, c2d.height);
> +
> + rc_3d.target = 2;
> + rc_3d.format = c2d.format;
> + rc_3d.bind = (1 << 1);
> + rc_3d.width = c2d.width;
> + rc_3d.height = c2d.height;
> + rc_3d.depth = 1;
> + rc_3d.array_size = 1;
> + rc_3d.last_level = 0;
> + rc_3d.nr_samples = 0;
> + rc_3d.flags = VIRTIO_GPU_RESOURCE_FLAG_Y_0_TOP;
> +
> + result = rutabaga_resource_create_3d(rutabaga, c2d.resource_id, &rc_3d);
> + CHECK_RESULT(result, cmd);
> +
> + res = g_new0(struct virtio_gpu_simple_resource, 1);
> + res->width = c2d.width;
> + res->height = c2d.height;
> + res->format = c2d.format;
> + res->resource_id = c2d.resource_id;
> +
> + QTAILQ_INSERT_HEAD(&g->reslist, res, next);
> +}
> +
> +static void
> +rutabaga_cmd_create_resource_3d(VirtIOGPU *g,
> + struct virtio_gpu_ctrl_command *cmd)
> +{
> + int32_t result;
> + struct rutabaga_create_3d rc_3d = { 0 };
> + struct virtio_gpu_simple_resource *res;
> + struct virtio_gpu_resource_create_3d c3d;
> +
> + GET_VIRTIO_GPU_GL(g);
> + GET_RUTABAGA(virtio_gpu);
> +
> + VIRTIO_GPU_FILL_CMD(c3d);
> +
> + trace_virtio_gpu_cmd_res_create_3d(c3d.resource_id, c3d.format,
> + c3d.width, c3d.height, c3d.depth);
> +
> + rc_3d.target = c3d.target;
> + rc_3d.format = c3d.format;
> + rc_3d.bind = c3d.bind;
> + rc_3d.width = c3d.width;
> + rc_3d.height = c3d.height;
> + rc_3d.depth = c3d.depth;
> + rc_3d.array_size = c3d.array_size;
> + rc_3d.last_level = c3d.last_level;
> + rc_3d.nr_samples = c3d.nr_samples;
> + rc_3d.flags = c3d.flags;
> +
> + result = rutabaga_resource_create_3d(rutabaga, c3d.resource_id, &rc_3d);
> + CHECK_RESULT(result, cmd);
> +
> + res = g_new0(struct virtio_gpu_simple_resource, 1);
> + res->width = c3d.width;
> + res->height = c3d.height;
> + res->format = c3d.format;
> + res->resource_id = c3d.resource_id;
> +
> + QTAILQ_INSERT_HEAD(&g->reslist, res, next);
> +}
> +
> +static int32_t rutabaga_handle_unmap(VirtIOGPU *g,
> + struct virtio_gpu_simple_resource *res)
> +{
> + GET_VIRTIO_GPU_GL(g);
> + GET_RUTABAGA(virtio_gpu);
> +
> + res->mapped = NULL;
> + return rutabaga_resource_unmap(rutabaga, res->resource_id);
> +}
> +
> +static void
> +rutabaga_cmd_resource_unref(VirtIOGPU *g,
> + struct virtio_gpu_ctrl_command *cmd)
> +{
> + int32_t result;
> + struct virtio_gpu_simple_resource *res;
> + struct virtio_gpu_resource_unref unref;
> +
> + GET_VIRTIO_GPU_GL(g);
> + GET_RUTABAGA(virtio_gpu);
> +
> + VIRTIO_GPU_FILL_CMD(unref);
> +
> + trace_virtio_gpu_cmd_res_unref(unref.resource_id);
> +
> + res = virtio_gpu_find_resource(g, unref.resource_id);
> + CHECK(res, cmd);
> +
> + result = rutabaga_resource_unref(rutabaga, unref.resource_id);
> + CHECK_RESULT(result, cmd);
> +
> + if (res->image) {
> + pixman_image_unref(res->image);
> + }
> +
> + if (res->mapped) {
> + result = rutabaga_handle_unmap(g, res);
> + CHECK(result, cmd);
If this check fails, res, which has a dangling pointer in its "image"
member, will be kept in reslist.
> + }
> +
> + QTAILQ_REMOVE(&g->reslist, res, next);
> + g_free(res);
> +}
> +
> +static void
> +rutabaga_cmd_context_create(VirtIOGPU *g,
> + struct virtio_gpu_ctrl_command *cmd)
> +{
> + int32_t result;
> + struct virtio_gpu_ctx_create cc;
> +
> + GET_VIRTIO_GPU_GL(g);
> + GET_RUTABAGA(virtio_gpu);
> +
> + VIRTIO_GPU_FILL_CMD(cc);
> + trace_virtio_gpu_cmd_ctx_create(cc.hdr.ctx_id,
> + cc.debug_name);
> +
> + result = rutabaga_context_create(rutabaga, cc.hdr.ctx_id, cc.context_init,
> + cc.debug_name, cc.nlen);
> + CHECK_RESULT(result, cmd);
> +}
> +
> +static void
> +rutabaga_cmd_context_destroy(VirtIOGPU *g,
> + struct virtio_gpu_ctrl_command *cmd)
> +{
> + int32_t result;
> + struct virtio_gpu_ctx_destroy cd;
> +
> + GET_VIRTIO_GPU_GL(g);
> + GET_RUTABAGA(virtio_gpu);
> +
> + VIRTIO_GPU_FILL_CMD(cd);
> + trace_virtio_gpu_cmd_ctx_destroy(cd.hdr.ctx_id);
> +
> + result = rutabaga_context_destroy(rutabaga, cd.hdr.ctx_id);
> + CHECK_RESULT(result, cmd);
> +}
> +
> +static void
> +rutabaga_cmd_resource_flush(VirtIOGPU *g, struct virtio_gpu_ctrl_command *cmd)
> +{
> + int32_t result, i;
> + struct virtio_gpu_scanout *scanout = NULL;
> + struct virtio_gpu_simple_resource *res;
> + struct rutabaga_transfer transfer = { 0 };
> + struct iovec transfer_iovec;
> + struct virtio_gpu_resource_flush rf;
> + bool found = false;
> +
> + GET_VIRTIO_GPU_GL(g);
> + GET_RUTABAGA(virtio_gpu);
> +
> + VIRTIO_GPU_FILL_CMD(rf);
> + trace_virtio_gpu_cmd_res_flush(rf.resource_id,
> + rf.r.width, rf.r.height, rf.r.x, rf.r.y);
> +
> + res = virtio_gpu_find_resource(g, rf.resource_id);
> + CHECK(res, cmd);
> +
> + for (i = 0; i < g->parent_obj.conf.max_outputs; i++) {
> + scanout = &g->parent_obj.scanout[i];
> + if (i == res->scanout_bitmask) {
This compares an index with a bitmask; I doubt that's what you want to do.
> + found = true;
> + break;
> + }
> + }
> +
> + if (!found) {
> + return;
> + }
> +
> + transfer.x = 0;
> + transfer.y = 0;
> + transfer.z = 0;
> + transfer.w = res->width;
> + transfer.h = res->height;
> + transfer.d = 1;
> +
> + transfer_iovec.iov_base = (void *)pixman_image_get_data(res->image);
> + transfer_iovec.iov_len = res->width * res->height * 4;
> +
> + result = rutabaga_resource_transfer_read(rutabaga, 0,
> + rf.resource_id, &transfer,
> + &transfer_iovec);
> + CHECK_RESULT(result, cmd);
> + dpy_gfx_update_full(scanout->con);
> +}
> +
> +static void
> +rutabaga_cmd_set_scanout(VirtIOGPU *g, struct virtio_gpu_ctrl_command *cmd)
> +{
> + struct virtio_gpu_simple_resource *res;
> + struct virtio_gpu_scanout *scanout = NULL;
> + struct virtio_gpu_set_scanout ss;
> +
> + VIRTIO_GPU_FILL_CMD(ss);
> + trace_virtio_gpu_cmd_set_scanout(ss.scanout_id, ss.resource_id,
> + ss.r.width, ss.r.height, ss.r.x, ss.r.y);
> +
> + scanout = &g->parent_obj.scanout[ss.scanout_id];
> + g->parent_obj.enable = 1;
> +
> + res = virtio_gpu_find_resource(g, ss.resource_id);
> + CHECK(res, cmd);
> +
> + if (!res->image) {
> + pixman_format_code_t pformat;
> + pformat = virtio_gpu_get_pixman_format(res->format);
> + CHECK(pformat, cmd);
> +
> + res->image = pixman_image_create_bits(pformat,
> + res->width,
> + res->height,
> + NULL, 0);
> + CHECK(res->image, cmd);
> + pixman_image_ref(res->image);
> + }
> +
> + /* realloc the surface ptr */
> + scanout->ds = qemu_create_displaysurface_pixman(res->image);
> + dpy_gfx_replace_surface(scanout->con, NULL);
Do you really need to assign NULL first? Maybe it was necessary in an
old version of QEMU, but I don't think it is anymore.
> + dpy_gfx_replace_surface(scanout->con, scanout->ds);
> + res->scanout_bitmask = ss.scanout_id;
> +}
> +
> +static void
> +rutabaga_cmd_submit_3d(VirtIOGPU *g,
> + struct virtio_gpu_ctrl_command *cmd)
> +{
> + int32_t result;
> + struct virtio_gpu_cmd_submit cs;
> + void *buf;
> + size_t s;
> +
> + GET_VIRTIO_GPU_GL(g);
> + GET_RUTABAGA(virtio_gpu);
> +
> + VIRTIO_GPU_FILL_CMD(cs);
> + trace_virtio_gpu_cmd_ctx_submit(cs.hdr.ctx_id, cs.size);
> +
> + buf = g_malloc(cs.size);
> + s = iov_to_buf(cmd->elem.out_sg, cmd->elem.out_num,
> + sizeof(cs), buf, cs.size);
> + if (s != cs.size) {
> + qemu_log_mask(LOG_GUEST_ERROR, "%s: size mismatch (%zd/%d)",
> + __func__, s, cs.size);
> + cmd->error = VIRTIO_GPU_RESP_ERR_INVALID_PARAMETER;
> + goto out;
> + }
> +
> + result = rutabaga_submit_command(rutabaga, cs.hdr.ctx_id, buf, cs.size);
> + CHECK_RESULT(result, cmd);
This check leaks buf if it fails. You may use g_autofree; see
docs/devel/style.rst for more information.
> +
> +out:
> + g_free(buf);
> +}
> +
> +static void
> +rutabaga_cmd_transfer_to_host_2d(VirtIOGPU *g,
> + struct virtio_gpu_ctrl_command *cmd)
> +{
> + int32_t result;
> + struct rutabaga_transfer transfer = { 0 };
> + struct virtio_gpu_transfer_to_host_2d t2d;
> +
> + GET_VIRTIO_GPU_GL(g);
> + GET_RUTABAGA(virtio_gpu);
> +
> + VIRTIO_GPU_FILL_CMD(t2d);
> + trace_virtio_gpu_cmd_res_xfer_toh_2d(t2d.resource_id);
> +
> + transfer.x = t2d.r.x;
> + transfer.y = t2d.r.y;
> + transfer.z = 0;
> + transfer.w = t2d.r.width;
> + transfer.h = t2d.r.height;
> + transfer.d = 1;
> +
> + result = rutabaga_resource_transfer_write(rutabaga, 0, t2d.resource_id,
> + &transfer);
> + CHECK_RESULT(result, cmd);
> +}
> +
> +static void
> +rutabaga_cmd_transfer_to_host_3d(VirtIOGPU *g,
> + struct virtio_gpu_ctrl_command *cmd)
> +{
> + int32_t result;
> + struct rutabaga_transfer transfer = { 0 };
> + struct virtio_gpu_transfer_host_3d t3d;
> +
> + GET_VIRTIO_GPU_GL(g);
> + GET_RUTABAGA(virtio_gpu);
> +
> + VIRTIO_GPU_FILL_CMD(t3d);
> + trace_virtio_gpu_cmd_res_xfer_toh_3d(t3d.resource_id);
> +
> + transfer.x = t3d.box.x;
> + transfer.y = t3d.box.y;
> + transfer.z = t3d.box.z;
> + transfer.w = t3d.box.w;
> + transfer.h = t3d.box.h;
> + transfer.d = t3d.box.d;
> + transfer.level = t3d.level;
> + transfer.stride = t3d.stride;
> + transfer.layer_stride = t3d.layer_stride;
> + transfer.offset = t3d.offset;
> +
> + result = rutabaga_resource_transfer_write(rutabaga, t3d.hdr.ctx_id,
> + t3d.resource_id, &transfer);
> + CHECK_RESULT(result, cmd);
> +}
> +
> +static void
> +rutabaga_cmd_transfer_from_host_3d(VirtIOGPU *g,
> + struct virtio_gpu_ctrl_command *cmd)
> +{
> + int32_t result;
> + struct rutabaga_transfer transfer = { 0 };
> + struct virtio_gpu_transfer_host_3d t3d;
> +
> + GET_VIRTIO_GPU_GL(g);
> + GET_RUTABAGA(virtio_gpu);
> +
> + VIRTIO_GPU_FILL_CMD(t3d);
> + trace_virtio_gpu_cmd_res_xfer_fromh_3d(t3d.resource_id);
> +
> + transfer.x = t3d.box.x;
> + transfer.y = t3d.box.y;
> + transfer.z = t3d.box.z;
> + transfer.w = t3d.box.w;
> + transfer.h = t3d.box.h;
> + transfer.d = t3d.box.d;
> + transfer.level = t3d.level;
> + transfer.stride = t3d.stride;
> + transfer.layer_stride = t3d.layer_stride;
> + transfer.offset = t3d.offset;
> +
> + result = rutabaga_resource_transfer_read(rutabaga, t3d.hdr.ctx_id,
> + t3d.resource_id, &transfer, NULL);
> + CHECK_RESULT(result, cmd);
> +}
> +
> +static void
> +rutabaga_cmd_attach_backing(VirtIOGPU *g, struct virtio_gpu_ctrl_command *cmd)
> +{
> + struct rutabaga_iovecs vecs = { 0 };
> + struct virtio_gpu_simple_resource *res;
> + struct virtio_gpu_resource_attach_backing att_rb;
> + struct iovec *res_iovs;
> + uint32_t res_niov;
> + int ret;
> +
> + GET_VIRTIO_GPU_GL(g);
> + GET_RUTABAGA(virtio_gpu);
> +
> + VIRTIO_GPU_FILL_CMD(att_rb);
> + trace_virtio_gpu_cmd_res_back_attach(att_rb.resource_id);
> +
> + res = virtio_gpu_find_resource(g, att_rb.resource_id);
> + if (!res) {
> + qemu_log_mask(LOG_GUEST_ERROR, "%s: illegal resource specified %d\n",
> + __func__, att_rb.resource_id);
> + cmd->error = VIRTIO_GPU_RESP_ERR_UNSPEC;
> + return;
> + }
> +
> + if (res->iov) {
> + cmd->error = VIRTIO_GPU_RESP_ERR_UNSPEC;
> + return;
> + }
> +
> + ret = virtio_gpu_create_mapping_iov(g, att_rb.nr_entries, sizeof(att_rb),
> + cmd, NULL, &res_iovs, &res_niov);
> + if (ret != 0) {
> + cmd->error = VIRTIO_GPU_RESP_ERR_UNSPEC;
> + return;
> + }
> +
> + vecs.iovecs = res_iovs;
> + vecs.num_iovecs = res_niov;
> +
> + ret = rutabaga_resource_attach_backing(rutabaga, att_rb.resource_id, &vecs);
> + if (ret != 0) {
> + virtio_gpu_cleanup_mapping_iov(g, res_iovs, res_niov);
> + }
> +}
> +
> +static void
> +rutabaga_cmd_detach_backing(VirtIOGPU *g, struct virtio_gpu_ctrl_command *cmd)
> +{
> + struct virtio_gpu_simple_resource *res;
> + struct virtio_gpu_resource_detach_backing detach_rb;
> +
> + GET_VIRTIO_GPU_GL(g);
> + GET_RUTABAGA(virtio_gpu);
> +
> + VIRTIO_GPU_FILL_CMD(detach_rb);
> + trace_virtio_gpu_cmd_res_back_detach(detach_rb.resource_id);
> +
> + res = virtio_gpu_find_resource(g, detach_rb.resource_id);
> + CHECK(res, cmd);
> +
> + rutabaga_resource_detach_backing(rutabaga,
> + detach_rb.resource_id);
> +
> + virtio_gpu_cleanup_mapping(g, res);
> +}
> +
> +static void
> +rutabaga_cmd_ctx_attach_resource(VirtIOGPU *g,
> + struct virtio_gpu_ctrl_command *cmd)
> +{
> + int32_t result;
> + struct virtio_gpu_ctx_resource att_res;
> +
> + GET_VIRTIO_GPU_GL(g);
> + GET_RUTABAGA(virtio_gpu);
> +
> + VIRTIO_GPU_FILL_CMD(att_res);
> + trace_virtio_gpu_cmd_ctx_res_attach(att_res.hdr.ctx_id,
> + att_res.resource_id);
> +
> + result = rutabaga_context_attach_resource(rutabaga, att_res.hdr.ctx_id,
> + att_res.resource_id);
> + CHECK_RESULT(result, cmd);
> +}
> +
> +static void
> +rutabaga_cmd_ctx_detach_resource(VirtIOGPU *g,
> + struct virtio_gpu_ctrl_command *cmd)
> +{
> + int32_t result;
> + struct virtio_gpu_ctx_resource det_res;
> +
> + GET_VIRTIO_GPU_GL(g);
> + GET_RUTABAGA(virtio_gpu);
> +
> + VIRTIO_GPU_FILL_CMD(det_res);
> + trace_virtio_gpu_cmd_ctx_res_detach(det_res.hdr.ctx_id,
> + det_res.resource_id);
> +
> + result = rutabaga_context_detach_resource(rutabaga, det_res.hdr.ctx_id,
> + det_res.resource_id);
> + CHECK_RESULT(result, cmd);
> +}
> +
> +static void
> +rutabaga_cmd_get_capset_info(VirtIOGPU *g, struct virtio_gpu_ctrl_command *cmd)
> +{
> + int32_t result;
> + struct virtio_gpu_get_capset_info info;
> + struct virtio_gpu_resp_capset_info resp;
> +
> + GET_VIRTIO_GPU_GL(g);
> + GET_RUTABAGA(virtio_gpu);
> +
> + VIRTIO_GPU_FILL_CMD(info);
> +
> + result = rutabaga_get_capset_info(rutabaga, info.capset_index,
> + &resp.capset_id, &resp.capset_max_version,
> + &resp.capset_max_size);
> + CHECK_RESULT(result, cmd);
> +
> + resp.hdr.type = VIRTIO_GPU_RESP_OK_CAPSET_INFO;
> + virtio_gpu_ctrl_response(g, cmd, &resp.hdr, sizeof(resp));
> +}
> +
> +static void
> +rutabaga_cmd_get_capset(VirtIOGPU *g, struct virtio_gpu_ctrl_command *cmd)
> +{
> + int32_t result;
> + struct virtio_gpu_get_capset gc;
> + struct virtio_gpu_resp_capset *resp;
> + uint32_t capset_size;
> + uint32_t current_id;
> + bool found = false;
> +
> + GET_VIRTIO_GPU_GL(g);
> + GET_RUTABAGA(virtio_gpu);
> +
> + VIRTIO_GPU_FILL_CMD(gc);
> + for (uint32_t i = 0; i < virtio_gpu->num_capsets; i++) {
> + result = rutabaga_get_capset_info(rutabaga, i,
> + &current_id, &capset_size,
> + &capset_size);
> + CHECK_RESULT(result, cmd);
> +
> + if (current_id == gc.capset_id) {
> + found = true;
> + break;
> + }
> + }
> +
> + if (!found) {
> + qemu_log_mask(LOG_GUEST_ERROR, "capset not found!");
> + return;
> + }
> +
> + resp = g_malloc0(sizeof(*resp) + capset_size);
> + resp->hdr.type = VIRTIO_GPU_RESP_OK_CAPSET;
> + rutabaga_get_capset(rutabaga, gc.capset_id, gc.capset_version,
> + (uint8_t *)resp->capset_data, capset_size);
> +
> + virtio_gpu_ctrl_response(g, cmd, &resp->hdr, sizeof(*resp) + capset_size);
> + g_free(resp);
> +}
> +
> +static void
> +rutabaga_cmd_resource_create_blob(VirtIOGPU *g,
> + struct virtio_gpu_ctrl_command *cmd)
> +{
> + int result;
> + struct rutabaga_iovecs vecs = { 0 };
> + struct virtio_gpu_simple_resource *res;
> + struct virtio_gpu_resource_create_blob cblob;
> + struct rutabaga_create_blob rc_blob = { 0 };
> +
> + GET_VIRTIO_GPU_GL(g);
> + GET_RUTABAGA(virtio_gpu);
> +
> + VIRTIO_GPU_FILL_CMD(cblob);
> + trace_virtio_gpu_cmd_res_create_blob(cblob.resource_id, cblob.size);
> +
> + CHECK(cblob.resource_id != 0, cmd);
> +
> + res = g_new0(struct virtio_gpu_simple_resource, 1);
> + QTAILQ_INSERT_HEAD(&g->reslist, res, next);
This inserts res into reslist before the creation has actually
succeeded, which doesn't seem right.
> +
> + res->resource_id = cblob.resource_id;
> + res->blob_size = cblob.size;
> +
> + if (cblob.blob_mem != VIRTIO_GPU_BLOB_MEM_HOST3D) {
> + result = virtio_gpu_create_mapping_iov(g, cblob.nr_entries,
> + sizeof(cblob), cmd, &res->addrs,
> + &res->iov, &res->iov_cnt);
> + CHECK_RESULT(result, cmd);
> + }
> +
> + rc_blob.blob_id = cblob.blob_id;
> + rc_blob.blob_mem = cblob.blob_mem;
> + rc_blob.blob_flags = cblob.blob_flags;
> + rc_blob.size = cblob.size;
> +
> + vecs.iovecs = res->iov;
> + vecs.num_iovecs = res->iov_cnt;
> +
> + result = rutabaga_resource_create_blob(rutabaga, cblob.hdr.ctx_id,
> + cblob.resource_id, &rc_blob, &vecs,
> + NULL);
> + CHECK_RESULT(result, cmd);
> +}
> +
> +static void
> +rutabaga_cmd_resource_map_blob(VirtIOGPU *g,
> + struct virtio_gpu_ctrl_command *cmd)
> +{
> + int32_t result;
> + struct virtio_gpu_simple_resource *res;
> + struct rutabaga_mapping mapping = { 0 };
> + struct virtio_gpu_resource_map_blob mblob;
> + struct virtio_gpu_resp_map_info resp;
> +
> + GET_VIRTIO_GPU_GL(g);
> + GET_RUTABAGA(virtio_gpu);
> +
> + VIRTIO_GPU_FILL_CMD(mblob);
> +
> + CHECK(mblob.resource_id != 0, cmd);
> +
> + res = virtio_gpu_find_resource(g, mblob.resource_id);
> + CHECK(res, cmd);
> +
> + result = rutabaga_resource_map(rutabaga, mblob.resource_id, &mapping);
> + CHECK_RESULT(result, cmd);
> +
> + memset(&resp, 0, sizeof(resp));
> + resp.hdr.type = VIRTIO_GPU_RESP_OK_MAP_INFO;
> + result = rutabaga_resource_map_info(rutabaga, mblob.resource_id,
> + &resp.map_info);
> +
> + CHECK_RESULT(result, cmd);
> + virtio_gpu_ctrl_response(g, cmd, &resp.hdr, sizeof(resp));
> +}
> +
> +static void
> +rutabaga_cmd_resource_unmap_blob(VirtIOGPU *g,
> + struct virtio_gpu_ctrl_command *cmd)
> +{
> + int32_t result;
> + struct virtio_gpu_simple_resource *res;
> + struct virtio_gpu_resource_unmap_blob ublob;
> +
> + VIRTIO_GPU_FILL_CMD(ublob);
> +
> + CHECK(ublob.resource_id != 0, cmd);
> +
> + res = virtio_gpu_find_resource(g, ublob.resource_id);
> + CHECK(res, cmd);
> +
> + result = rutabaga_handle_unmap(g, res);
> + CHECK_RESULT(result, cmd);
> +}
> +
> +static void
> +virtio_gpu_rutabaga_process_cmd(VirtIOGPU *g,
> + struct virtio_gpu_ctrl_command *cmd)
> +{
> + struct rutabaga_fence fence = { 0 };
> + int32_t result;
> +
> + GET_VIRTIO_GPU_GL(g);
> + GET_RUTABAGA(virtio_gpu);
> +
> + VIRTIO_GPU_FILL_CMD(cmd->cmd_hdr);
> +
> + switch (cmd->cmd_hdr.type) {
> + case VIRTIO_GPU_CMD_CTX_CREATE:
> + rutabaga_cmd_context_create(g, cmd);
> + break;
> + case VIRTIO_GPU_CMD_CTX_DESTROY:
> + rutabaga_cmd_context_destroy(g, cmd);
> + break;
> + case VIRTIO_GPU_CMD_RESOURCE_CREATE_2D:
> + rutabaga_cmd_create_resource_2d(g, cmd);
> + break;
> + case VIRTIO_GPU_CMD_RESOURCE_CREATE_3D:
> + rutabaga_cmd_create_resource_3d(g, cmd);
> + break;
> + case VIRTIO_GPU_CMD_SUBMIT_3D:
> + rutabaga_cmd_submit_3d(g, cmd);
> + break;
> + case VIRTIO_GPU_CMD_TRANSFER_TO_HOST_2D:
> + rutabaga_cmd_transfer_to_host_2d(g, cmd);
> + break;
> + case VIRTIO_GPU_CMD_TRANSFER_TO_HOST_3D:
> + rutabaga_cmd_transfer_to_host_3d(g, cmd);
> + break;
> + case VIRTIO_GPU_CMD_TRANSFER_FROM_HOST_3D:
> + rutabaga_cmd_transfer_from_host_3d(g, cmd);
> + break;
> + case VIRTIO_GPU_CMD_RESOURCE_ATTACH_BACKING:
> + rutabaga_cmd_attach_backing(g, cmd);
> + break;
> + case VIRTIO_GPU_CMD_RESOURCE_DETACH_BACKING:
> + rutabaga_cmd_detach_backing(g, cmd);
> + break;
> + case VIRTIO_GPU_CMD_SET_SCANOUT:
> + rutabaga_cmd_set_scanout(g, cmd);
> + break;
> + case VIRTIO_GPU_CMD_RESOURCE_FLUSH:
> + rutabaga_cmd_resource_flush(g, cmd);
> + break;
> + case VIRTIO_GPU_CMD_RESOURCE_UNREF:
> + rutabaga_cmd_resource_unref(g, cmd);
> + break;
> + case VIRTIO_GPU_CMD_CTX_ATTACH_RESOURCE:
> + rutabaga_cmd_ctx_attach_resource(g, cmd);
> + break;
> + case VIRTIO_GPU_CMD_CTX_DETACH_RESOURCE:
> + rutabaga_cmd_ctx_detach_resource(g, cmd);
> + break;
> + case VIRTIO_GPU_CMD_GET_CAPSET_INFO:
> + rutabaga_cmd_get_capset_info(g, cmd);
> + break;
> + case VIRTIO_GPU_CMD_GET_CAPSET:
> + rutabaga_cmd_get_capset(g, cmd);
> + break;
> + case VIRTIO_GPU_CMD_GET_DISPLAY_INFO:
> + virtio_gpu_get_display_info(g, cmd);
> + break;
> + case VIRTIO_GPU_CMD_GET_EDID:
> + virtio_gpu_get_edid(g, cmd);
> + break;
> + case VIRTIO_GPU_CMD_RESOURCE_CREATE_BLOB:
> + rutabaga_cmd_resource_create_blob(g, cmd);
> + break;
> + case VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB:
> + rutabaga_cmd_resource_map_blob(g, cmd);
> + break;
> + case VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB:
> + rutabaga_cmd_resource_unmap_blob(g, cmd);
> + break;
> + default:
> + cmd->error = VIRTIO_GPU_RESP_ERR_UNSPEC;
> + break;
> + }
> +
> + if (cmd->finished) {
> + return;
> + }
> + if (cmd->error) {
> + fprintf(stderr, "%s: ctrl 0x%x, error 0x%x\n", __func__,
> + cmd->cmd_hdr.type, cmd->error);
Use error_report() instead; it is also briefly described in style.rst.
> + virtio_gpu_ctrl_response_nodata(g, cmd, cmd->error);
> + return;
> + }
> + if (!(cmd->cmd_hdr.flags & VIRTIO_GPU_FLAG_FENCE)) {
> + virtio_gpu_ctrl_response_nodata(g, cmd, VIRTIO_GPU_RESP_OK_NODATA);
> + return;
> + }
> +
> + fence.flags = cmd->cmd_hdr.flags;
> + fence.ctx_id = cmd->cmd_hdr.ctx_id;
> + fence.fence_id = cmd->cmd_hdr.fence_id;
> + fence.ring_idx = cmd->cmd_hdr.ring_idx;
> +
> + trace_virtio_gpu_fence_ctrl(cmd->cmd_hdr.fence_id, cmd->cmd_hdr.type);
> +
> + result = rutabaga_create_fence(rutabaga, &fence);
> + CHECK_RESULT(result, cmd);
> +}
> +
> +static void
> +virtio_gpu_rutabaga_fence_cb(uint64_t user_data,
> + struct rutabaga_fence fence_data)
> +{
> + VirtIOGPU *g = (VirtIOGPU *)(void*)(uintptr_t)user_data;
I don't think (void*)(uintptr_t) is necessary.
> + struct virtio_gpu_ctrl_command *cmd, *tmp;
> +
> + bool signaled_ctx_specific = fence_data.flags & RUTABAGA_FLAG_INFO_RING_IDX;
bool may not have enough capacity. Just using uint32_t is fine here I think.
> +
> + QTAILQ_FOREACH_SAFE(cmd, &g->fenceq, next, tmp) {
> + /*
> + * Due to context specific timelines.
> + */
> + bool target_ctx_specific = cmd->cmd_hdr.flags &
> + RUTABAGA_FLAG_INFO_RING_IDX;
> +
> + if (signaled_ctx_specific != target_ctx_specific) {
> + continue;
> + }
> +
> + if (signaled_ctx_specific &&
> + (cmd->cmd_hdr.ring_idx != fence_data.ring_idx)) {
> + continue;
> + }
> +
> + if (cmd->cmd_hdr.fence_id > fence_data.fence_id) {
> + continue;
> + }
> +
> + trace_virtio_gpu_fence_resp(cmd->cmd_hdr.fence_id);
> + virtio_gpu_ctrl_response_nodata(g, cmd, VIRTIO_GPU_RESP_OK_NODATA);
> + QTAILQ_REMOVE(&g->fenceq, cmd, next);
> + g_free(cmd);
> + }
> +}
> +
> +static int virtio_gpu_rutabaga_init(VirtIOGPU *g)
> +{
> + int result;
> + uint64_t capset_mask;
> + struct rutabaga_channels channels = { 0 };
> + struct rutabaga_builder builder = { 0 };
> +
> + GET_VIRTIO_GPU_GL(g);
> + virtio_gpu->rutabaga = NULL;
> +
> + if (!virtio_gpu->capset_names) {
> + return -EINVAL;
> + }
> +
> + result = rutabaga_calculate_capset_mask(virtio_gpu->capset_names,
> + &capset_mask);
> + if (result) {
> + return result;
> + }
> +
> + /*
> + * rutabaga-0.1.1 is only compiled/tested with gfxstream and cross-domain
> + * support. Future versions may change this to have more context types if
> + * there is any interest.
> + */
> + if ((capset_mask & (1 << RUTABAGA_CAPSET_VIRGL)) ||
> + (capset_mask & (1 << RUTABAGA_CAPSET_VIRGL2)) ||
> + (capset_mask & (1 << RUTABAGA_CAPSET_VENUS)) ||
> + (capset_mask & (1 << RUTABAGA_CAPSET_DRM))) {
Simpler:
capset_mask & (BIT(RUTABAGA_CAPSET_VIRGL) |
BIT(RUTABAGA_CAPSET_VIRGL2) |
BIT(RUTABAGA_CAPSET_VENUS) |
BIT(RUTABAGA_CAPSET_DRM))
> + return -EINVAL;
> + }
> +
> + builder.user_data = (uint64_t)(uintptr_t *)(void *)g;
> + builder.fence_cb = virtio_gpu_rutabaga_fence_cb;
> + builder.capset_mask = capset_mask;
> +
> + if (virtio_gpu->wayland_socket_path) {
> + if ((builder.capset_mask & (1 << RUTABAGA_CAPSET_CROSS_DOMAIN)) == 0) {
> + return -EINVAL;
> + }
> +
> + channels.channels =
> + (struct rutabaga_channel *)calloc(1, sizeof(struct rutabaga_channel));
> + channels.num_channels = 1;
> + channels.channels[0].channel_name = virtio_gpu->wayland_socket_path;
> + channels.channels[0].channel_type = RUTABAGA_CHANNEL_TYPE_WAYLAND;
> + builder.channels = &channels;
> + }
> +
> + result = rutabaga_init(&builder, (struct rutabaga **)&virtio_gpu->rutabaga);
> + if (builder.capset_mask & (1 << RUTABAGA_CAPSET_CROSS_DOMAIN)) {
> + free(channels.channels);
> + }
> +
> + return result;
> +}
> +
> +static int virtio_gpu_rutabaga_get_num_capsets(VirtIOGPU *g)
> +{
> + int result = 0;
I don't think this needs to be initialized here.
> + uint32_t num_capsets;
> + GET_VIRTIO_GPU_GL(g);
> +
> + if (!virtio_gpu->renderer_inited) {
> + result = virtio_gpu_rutabaga_init(g);
> + if (result) {
> + qemu_log_mask(LOG_GUEST_ERROR, "Failed to init rutabaga");
Is it really a guest error? Shouldn't error_report() be used here?
> + return 0;
> + }
> +
> + virtio_gpu->renderer_inited = true;
> + }
> +
> + GET_RUTABAGA(virtio_gpu);
> +
> + result = rutabaga_get_num_capsets(rutabaga, &num_capsets);
> + if (result) {
> + qemu_log_mask(LOG_GUEST_ERROR, "Failed to get capsets");
> + return 0;
> + }
> + virtio_gpu->num_capsets = num_capsets;
> + return (int)(num_capsets);
The explicit cast is not necessary.
> +}
> +
> +static void virtio_gpu_rutabaga_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
> +{
> + VirtIOGPU *g = VIRTIO_GPU(vdev);
> + GET_VIRTIO_GPU_GL(g);
> + struct virtio_gpu_ctrl_command *cmd;
> +
> + if (!virtio_queue_ready(vq)) {
> + return;
> + }
> +
> + if (!virtio_gpu->renderer_inited) {
> + int result = virtio_gpu_rutabaga_init(g);
> + if (!result) {
> + virtio_gpu->renderer_inited = true;
> + }
> + }
> +
> + if (!virtio_gpu->renderer_inited) {
> + return;
> + }
> +
> + cmd = virtqueue_pop(vq, sizeof(struct virtio_gpu_ctrl_command));
> + while (cmd) {
> + cmd->vq = vq;
> + cmd->error = 0;
> + cmd->finished = false;
> + QTAILQ_INSERT_TAIL(&g->cmdq, cmd, next);
> + cmd = virtqueue_pop(vq, sizeof(struct virtio_gpu_ctrl_command));
> + }
> +
> + virtio_gpu_process_cmdq(g);
> +}
> +
> +void virtio_gpu_rutabaga_device_realize(DeviceState *qdev, Error **errp)
> +{
> + int num_capsets;
> + VirtIOGPUBase *bdev = VIRTIO_GPU_BASE(qdev);
> + VirtIOGPUBaseClass *vbc = VIRTIO_GPU_BASE_GET_CLASS(bdev);
> +
> + VirtIOGPU *gpudev = VIRTIO_GPU(qdev);
> + VirtIOGPUClass *vgc = VIRTIO_GPU_GET_CLASS(gpudev);
> +
> + num_capsets = virtio_gpu_rutabaga_get_num_capsets(gpudev);
> + if (!num_capsets) {
> + return;
> + }
> +
> + vbc->gl_flushed = virtio_gpu_rutabaga_gl_flushed;
> + vgc->handle_ctrl = virtio_gpu_rutabaga_handle_ctrl;
> + vgc->process_cmd = virtio_gpu_rutabaga_process_cmd;
> + vgc->update_cursor_data = virtio_gpu_rutabaga_update_cursor;
> +
> +#if HOST_BIG_ENDIAN
> + error_setg(errp, "rutabaga is not supported on bigendian platforms");
> + return;
> +#endif
> +
> + gpudev->parent_obj.conf.flags |= (1 << VIRTIO_GPU_FLAG_RUTABAGA_ENABLED);
> + gpudev->parent_obj.conf.flags |= (1 << VIRTIO_GPU_FLAG_BLOB_ENABLED);
> + gpudev->parent_obj.conf.flags
> + |= (1 << VIRTIO_GPU_FLAG_CONTEXT_INIT_ENABLED);
> +
> + VIRTIO_GPU_BASE(gpudev)->virtio_config.num_capsets = num_capsets;
> + virtio_gpu_device_realize(qdev, errp);
> +}
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [RFC PATCH 12/13] HACK: use memory region API to inject memory to guest
2023-04-21 1:12 ` [RFC PATCH 12/13] HACK: use memory region API to inject memory to guest Gurchetan Singh
@ 2023-04-22 15:46 ` Akihiko Odaki
2023-04-25 0:08 ` Gurchetan Singh
2023-04-22 19:22 ` Peter Maydell
1 sibling, 1 reply; 28+ messages in thread
From: Akihiko Odaki @ 2023-04-22 15:46 UTC (permalink / raw)
To: Gurchetan Singh, qemu-devel
Cc: pbonzini, philmd, david, stefanha, kraxel, marcandre.lureau,
dmitry.osipenko, ray.huang, alex.bennee
On 2023/04/21 10:12, Gurchetan Singh wrote:
> I just copied the patches that have been floating around that do
> this, but it doesn't seem to robustly work. This current
> implementation is probably good enough to run vkcube or simple
> apps, but whenever a test starts to aggressively map/unmap memory,
> things do explode on the QEMU side.
>
> A simple way to reproduce is to run:
>
> ./deqp-vk --deqp-case=deqp-vk --deqp-case=dEQP-VK.memory.mapping.suballocation.*
>
> You should get stack traces that sometimes look like this:
>
> 0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737316304448) at ./nptl/pthread_kill.c:44
> 1 __pthread_kill_internal (signo=6, threadid=140737316304448) at ./nptl/pthread_kill.c:78
> 2 __GI___pthread_kill (threadid=140737316304448, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
> 3 0x00007ffff7042476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
> 4 0x00007ffff70287f3 in __GI_abort () at ./stdlib/abort.c:79
> 5 0x00007ffff70896f6 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff71dbb8c "%s\n") at ../sysdeps/posix/libc_fatal.c:155
> 6 0x00007ffff70a0d7c in malloc_printerr (str=str@entry=0x7ffff71de7b0 "double free or corruption (out)") at ./malloc/malloc.c:5664
> 7 0x00007ffff70a2ef0 in _int_free (av=0x7ffff7219c80 <main_arena>, p=0x555557793e00, have_lock=<optimized out>) at ./malloc/malloc.c:4588
> 8 0x00007ffff70a54d3 in __GI___libc_free (mem=<optimized out>) at ./malloc/malloc.c:3391
> 9 0x0000555555d65e7e in phys_section_destroy (mr=0x555557793e10) at ../softmmu/physmem.c:1003
> 10 0x0000555555d65ed0 in phys_sections_free (map=0x555556d4b410) at ../softmmu/physmem.c:1011
> 11 0x0000555555d69578 in address_space_dispatch_free (d=0x555556d4b400) at ../softmmu/physmem.c:2430
> 12 0x0000555555d58412 in flatview_destroy (view=0x5555572bb090) at ../softmmu/memory.c:292
> 13 0x000055555600fd23 in call_rcu_thread (opaque=0x0) at ../util/rcu.c:284
> 14 0x00005555560026d4 in qemu_thread_start (args=0x5555569cafa0) at ../util/qemu-thread-posix.c:541
> 15 0x00007ffff7094b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
> 16 0x00007ffff7126a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
>
> or this:
>
> 0x0000555555e1dc80 in object_unref (objptr=0x6d656d3c6b6e696c) at ../qom/object.c:1198
> 1198 g_assert(obj->ref > 0);
> (gdb) bt
> 0 0x0000555555e1dc80 in object_unref (objptr=0x6d656d3c6b6e696c) at ../qom/object.c:1198
> 1 0x0000555555d5cca5 in memory_region_unref (mr=0x5555572b9e20) at ../softmmu/memory.c:1799
> 2 0x0000555555d65e47 in phys_section_destroy (mr=0x5555572b9e20) at ../softmmu/physmem.c:998
> 3 0x0000555555d65ec7 in phys_sections_free (map=0x5555588365c0) at ../softmmu/physmem.c:1011
> 4 0x0000555555d6956f in address_space_dispatch_free (d=0x5555588365b0) at ../softmmu/physmem.c:2430
> 5 0x0000555555d58409 in flatview_destroy (view=0x555558836570) at ../softmmu/memory.c:292
> 6 0x000055555600fd1a in call_rcu_thread (opaque=0x0) at ../util/rcu.c:284
> 7 0x00005555560026cb in qemu_thread_start (args=0x5555569cafa0) at ../util/qemu-thread-posix.c:541
> 8 0x00007ffff7094b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
> 9 0x00007ffff7126a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
>
> The reason seems to be that memory regions are handled on a different
> thread than the virtio-gpu thread, and that inevitably leads to
> races. The memory region docs [a] generally seem to dissuade this:
>
> "In order to do this, as a general rule do not create or destroy
> memory regions dynamically during a device’s lifetime, and only
> call object_unparent() in the memory region owner’s instance_finalize
> callback. The dynamically allocated data structure that contains
> the memory region then should obviously be freed in the
> instance_finalize callback as well."
>
> instance_finalize is only called just before device destruction,
> though, so storing the memory until then is unlikely to be an option.
> The tests do pass when virtio-gpu doesn't free the memory, but the
> guest progressively becomes slower and then OOMs.
>
> The API does make an exception, though:
>
> "There is an exception to the above rule: it is okay to call
> object_unparent at any time for an alias or a container region. It is
> therefore also okay to create or destroy alias and container regions
> dynamically during a device’s lifetime."
>
> I believe we are trying to create a container subregion, but that's
> still failing? Are we doing it right? Can any memory region experts
> here help out? The other relevant patch in this series
> is "virtio-gpu: hostmem".
Perhaps dma_memory_map() is what you want?
>
> [a] https://qemu.readthedocs.io/en/latest/devel/memory.html
>
> Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
> ---
> hw/display/virtio-gpu-rutabaga.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/hw/display/virtio-gpu-rutabaga.c b/hw/display/virtio-gpu-rutabaga.c
> index 5fd1154198..196267aac2 100644
> --- a/hw/display/virtio-gpu-rutabaga.c
> +++ b/hw/display/virtio-gpu-rutabaga.c
> @@ -159,6 +159,12 @@ static int32_t rutabaga_handle_unmap(VirtIOGPU *g,
> GET_VIRTIO_GPU_GL(g);
> GET_RUTABAGA(virtio_gpu);
>
> + memory_region_transaction_begin();
> + memory_region_set_enabled(&res->region, false);
> + memory_region_del_subregion(&g->parent_obj.hostmem, &res->region);
> + object_unparent(OBJECT(&res->region));
> + memory_region_transaction_commit();
> +
> res->mapped = NULL;
> return rutabaga_resource_unmap(rutabaga, res->resource_id);
> }
> @@ -671,6 +677,14 @@ rutabaga_cmd_resource_map_blob(VirtIOGPU *g,
> result = rutabaga_resource_map(rutabaga, mblob.resource_id, &mapping);
> CHECK_RESULT(result, cmd);
>
> + memory_region_transaction_begin();
> + memory_region_init_ram_device_ptr(&res->region, OBJECT(g), NULL,
> + mapping.size, (void *)mapping.ptr);
> + memory_region_add_subregion(&g->parent_obj.hostmem, mblob.offset,
> + &res->region);
> + memory_region_set_enabled(&res->region, true);
> + memory_region_transaction_commit();
> +
> memset(&resp, 0, sizeof(resp));
> resp.hdr.type = VIRTIO_GPU_RESP_OK_MAP_INFO;
> result = rutabaga_resource_map_info(rutabaga, mblob.resource_id,
* Re: [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany?
2023-04-21 23:54 ` Gurchetan Singh
@ 2023-04-22 16:41 ` Akihiko Odaki
2023-04-25 0:16 ` Gurchetan Singh
0 siblings, 1 reply; 28+ messages in thread
From: Akihiko Odaki @ 2023-04-22 16:41 UTC (permalink / raw)
To: Gurchetan Singh, Stefan Hajnoczi
Cc: qemu-devel, pbonzini, philmd, david, stefanha, kraxel,
marcandre.lureau, dmitry.osipenko, ray.huang, alex.bennee
On 2023/04/22 8:54, Gurchetan Singh wrote:
> On Fri, Apr 21, 2023 at 9:02 AM Stefan Hajnoczi <stefanha@gmail.com> wrote:
>>
>> On Thu, 20 Apr 2023 at 21:13, Gurchetan Singh
>> <gurchetansingh@chromium.org> wrote:
>>>
>>> From: Gurchetan Singh <gurchetansingh@google.com>
>>>
>>> Rationale:
>>>
>>> - gfxstream [a] is good for the Android Emulator/upstream QEMU
>>> alignment
> >>> - Wayland passthrough [b] via the cross-domain context type is good
>>> for Linux on Linux display virtualization
>>> - rutabaga_gfx [c] sits on top of gfxstream, cross-domain and even
>>> virglrenderer
>>> - This series ports rutabaga_gfx to QEMU
>>
> >> What are rutabaga_gfx and gfxstream? Can you explain where they sit in the
>> stack and how they build on or complement virtio-gpu and
>> virglrenderer?
>
> rutabaga_gfx and gfxstream are both libraries that implement the
> virtio-gpu protocol. There's a document available in the Gitlab issue
> to see where they fit in the stack [a].
>
> gfxstream grew out of the Android Emulator's need to virtualize
> graphics for app developers. There's a short history of gfxstream in
> patch 10. It complements virglrenderer in that it's a bit more
> cross-platform and targets different use cases -- more detail here
> [b]. The ultimate goal is to ditch out-of-tree kernel drivers in the
> Android Emulator and adopt virtio, and porting gfxstream to QEMU would
> speed up that transition.
I wonder what the motivation is for maintaining gfxstream instead of
switching to virglrenderer/venus.
>
> rutabaga_gfx is a much smaller Rust library that sits on top of
> gfxstream and even virglrenderer, but does a few extra things. It
> implements the cross-domain context type, which provides Wayland
> passthrough. This helps virtio-gpu by providing more modern display
> virtualization. For example, Microsoft's WSL2 also uses a similar
> technique [c], but I believe it is not virtio-based nor open-source.
The guest side components of WSLg are open-source, but the host side is
not: https://github.com/microsoft/wslg
It also uses DirectX for acceleration, so it's not really portable
outside Windows.
> With this, we can have the same open-source Wayland passthrough
> solution on crosvm, QEMU and even Fuchsia [d]. Also, there might be
> an additional small Rust context type for security-sensitive use cases
> in the future -- rutabaga_gfx wouldn't compile its gfxstream bindings
> (since it's C++ based) in such cases.
>
> Both gfxstream and rutabaga_gfx are a part of the virtio spec [e] now too.
>
> [a] https://gitlab.com/qemu-project/qemu/-/issues/1611
> [b] https://lists.gnu.org/archive/html/qemu-devel/2023-03/msg04271.html
> [c] https://www.youtube.com/watch?v=EkNBsBx501Q
> [d] https://fuchsia-review.googlesource.com/c/fuchsia/+/778764
> [e] https://github.com/oasis-tcs/virtio-spec/blob/master/device-types/gpu/description.tex#L533
>
>>
>> Stefan
* Re: [RFC PATCH 12/13] HACK: use memory region API to inject memory to guest
2023-04-21 1:12 ` [RFC PATCH 12/13] HACK: use memory region API to inject memory to guest Gurchetan Singh
2023-04-22 15:46 ` Akihiko Odaki
@ 2023-04-22 19:22 ` Peter Maydell
1 sibling, 0 replies; 28+ messages in thread
From: Peter Maydell @ 2023-04-22 19:22 UTC (permalink / raw)
To: Gurchetan Singh
Cc: qemu-devel, pbonzini, philmd, david, stefanha, kraxel,
marcandre.lureau, akihiko.odaki, dmitry.osipenko, ray.huang,
alex.bennee
On Fri, 21 Apr 2023 at 02:13, Gurchetan Singh
<gurchetansingh@chromium.org> wrote:
> The API does make an exception, though:
>
> "There is an exception to the above rule: it is okay to call
> object_unparent at any time for an alias or a container region. It is
> therefore also okay to create or destroy alias and container regions
> dynamically during a device’s lifetime."
>
> I believe we are trying to create a container subregion, but that's
> still failing?
> @@ -671,6 +677,14 @@ rutabaga_cmd_resource_map_blob(VirtIOGPU *g,
> result = rutabaga_resource_map(rutabaga, mblob.resource_id, &mapping);
> CHECK_RESULT(result, cmd);
>
> + memory_region_transaction_begin();
> + memory_region_init_ram_device_ptr(&res->region, OBJECT(g), NULL,
> + mapping.size, (void *)mapping.ptr);
This isn't a container MemoryRegion -- it is a RAM MR. That is,
accesses to it are backed by a lump of host memory (viz, the
one provided here via the mapping.ptr). A container MR is one
which provides no backing mechanism (neither host RAM, nor
MMIO read/write callbacks), and whose contents are purely
those of any other MemoryRegions that you add to it via
memory_region_add_subregion(). So the exception listed in the
API docs does not apply here.
-- PMM
* Re: [RFC PATCH 12/13] HACK: use memory region API to inject memory to guest
2023-04-22 15:46 ` Akihiko Odaki
@ 2023-04-25 0:08 ` Gurchetan Singh
0 siblings, 0 replies; 28+ messages in thread
From: Gurchetan Singh @ 2023-04-25 0:08 UTC (permalink / raw)
To: Akihiko Odaki
Cc: qemu-devel, pbonzini, philmd, david, stefanha, kraxel,
marcandre.lureau, dmitry.osipenko, ray.huang, alex.bennee
On Sat, Apr 22, 2023 at 8:46 AM Akihiko Odaki <akihiko.odaki@gmail.com> wrote:
>
> On 2023/04/21 10:12, Gurchetan Singh wrote:
> > I just copied the patches that have been floating around that do
> > this, but it doesn't seem to robustly work. This current
> > implementation is probably good enough to run vkcube or simple
> > apps, but whenever a test starts to aggressively map/unmap memory,
> > things do explode on the QEMU side.
> >
> > A simple way to reproduce is to run:
> >
> > ./deqp-vk --deqp-case=deqp-vk --deqp-case=dEQP-VK.memory.mapping.suballocation.*
> >
> > You should get stack traces that sometimes look like this:
> >
> > 0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737316304448) at ./nptl/pthread_kill.c:44
> > 1 __pthread_kill_internal (signo=6, threadid=140737316304448) at ./nptl/pthread_kill.c:78
> > 2 __GI___pthread_kill (threadid=140737316304448, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
> > 3 0x00007ffff7042476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
> > 4 0x00007ffff70287f3 in __GI_abort () at ./stdlib/abort.c:79
> > 5 0x00007ffff70896f6 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff71dbb8c "%s\n") at ../sysdeps/posix/libc_fatal.c:155
> > 6 0x00007ffff70a0d7c in malloc_printerr (str=str@entry=0x7ffff71de7b0 "double free or corruption (out)") at ./malloc/malloc.c:5664
> > 7 0x00007ffff70a2ef0 in _int_free (av=0x7ffff7219c80 <main_arena>, p=0x555557793e00, have_lock=<optimized out>) at ./malloc/malloc.c:4588
> > 8 0x00007ffff70a54d3 in __GI___libc_free (mem=<optimized out>) at ./malloc/malloc.c:3391
> > 9 0x0000555555d65e7e in phys_section_destroy (mr=0x555557793e10) at ../softmmu/physmem.c:1003
> > 10 0x0000555555d65ed0 in phys_sections_free (map=0x555556d4b410) at ../softmmu/physmem.c:1011
> > 11 0x0000555555d69578 in address_space_dispatch_free (d=0x555556d4b400) at ../softmmu/physmem.c:2430
> > 12 0x0000555555d58412 in flatview_destroy (view=0x5555572bb090) at ../softmmu/memory.c:292
> > 13 0x000055555600fd23 in call_rcu_thread (opaque=0x0) at ../util/rcu.c:284
> > 14 0x00005555560026d4 in qemu_thread_start (args=0x5555569cafa0) at ../util/qemu-thread-posix.c:541
> > 15 0x00007ffff7094b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
> > 16 0x00007ffff7126a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
> >
> > or this:
> >
> > 0x0000555555e1dc80 in object_unref (objptr=0x6d656d3c6b6e696c) at ../qom/object.c:1198
> > 1198 g_assert(obj->ref > 0);
> > (gdb) bt
> > 0 0x0000555555e1dc80 in object_unref (objptr=0x6d656d3c6b6e696c) at ../qom/object.c:1198
> > 1 0x0000555555d5cca5 in memory_region_unref (mr=0x5555572b9e20) at ../softmmu/memory.c:1799
> > 2 0x0000555555d65e47 in phys_section_destroy (mr=0x5555572b9e20) at ../softmmu/physmem.c:998
> > 3 0x0000555555d65ec7 in phys_sections_free (map=0x5555588365c0) at ../softmmu/physmem.c:1011
> > 4 0x0000555555d6956f in address_space_dispatch_free (d=0x5555588365b0) at ../softmmu/physmem.c:2430
> > 5 0x0000555555d58409 in flatview_destroy (view=0x555558836570) at ../softmmu/memory.c:292
> > 6 0x000055555600fd1a in call_rcu_thread (opaque=0x0) at ../util/rcu.c:284
> > 7 0x00005555560026cb in qemu_thread_start (args=0x5555569cafa0) at ../util/qemu-thread-posix.c:541
> > 8 0x00007ffff7094b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
> > 9 0x00007ffff7126a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
> >
> > The reason seems to be that memory regions are handled on a different
> > thread than the virtio-gpu thread, and that inevitably leads to
> > races. The memory region docs [a] generally seem to dissuade this:
> >
> > "In order to do this, as a general rule do not create or destroy
> > memory regions dynamically during a device’s lifetime, and only
> > call object_unparent() in the memory region owner’s instance_finalize
> > callback. The dynamically allocated data structure that contains
> > the memory region then should obviously be freed in the
> > instance_finalize callback as well."
> >
> > instance_finalize is only called just before device destruction,
> > though, so storing the memory until then is unlikely to be an option.
> > The tests do pass when virtio-gpu doesn't free the memory, but the
> > guest progressively becomes slower and then OOMs.
> >
> > The API does make an exception, though:
> >
> > "There is an exception to the above rule: it is okay to call
> > object_unparent at any time for an alias or a container region. It is
> > therefore also okay to create or destroy alias and container regions
> > dynamically during a device’s lifetime."
> >
> > I believe we are trying to create a container subregion, but that's
> > still failing? Are we doing it right? Can any memory region experts
> > here help out? The other relevant patch in this series
> > is "virtio-gpu: hostmem".
>
> Perhaps dma_memory_map() is what you want?
It looks like dma_memory_map(..) gives a host virtual address for a
guest DMA address? If so, that won't work. We need to give the guest
a pointer to the host's GPU memory.
I took a look at what the Android Emulator has been doing: it largely
bypasses the MemoryRegion API and defines generic map/unmap functions:
https://android.googlesource.com/platform/external/qemu/+/emu-master-dev/hw/display/virtio-gpu-3d.c#473
https://android.googlesource.com/platform/external/qemu/+/emu-master-dev/exec.c#2472
https://android.googlesource.com/platform/external/qemu/+/emu-master-dev/accel/kvm/kvm-all.c#1787
This is kind of what crosvm does too, and it seems like the simplest
way to do what we need. Any objections to this approach, or
alternative methods?
>
> >
> > [a] https://qemu.readthedocs.io/en/latest/devel/memory.html
> >
> > Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
> > ---
> > hw/display/virtio-gpu-rutabaga.c | 14 ++++++++++++++
> > 1 file changed, 14 insertions(+)
> >
> > diff --git a/hw/display/virtio-gpu-rutabaga.c b/hw/display/virtio-gpu-rutabaga.c
> > index 5fd1154198..196267aac2 100644
> > --- a/hw/display/virtio-gpu-rutabaga.c
> > +++ b/hw/display/virtio-gpu-rutabaga.c
> > @@ -159,6 +159,12 @@ static int32_t rutabaga_handle_unmap(VirtIOGPU *g,
> > GET_VIRTIO_GPU_GL(g);
> > GET_RUTABAGA(virtio_gpu);
> >
> > + memory_region_transaction_begin();
> > + memory_region_set_enabled(&res->region, false);
> > + memory_region_del_subregion(&g->parent_obj.hostmem, &res->region);
> > + object_unparent(OBJECT(&res->region));
> > + memory_region_transaction_commit();
> > +
> > res->mapped = NULL;
> > return rutabaga_resource_unmap(rutabaga, res->resource_id);
> > }
> > @@ -671,6 +677,14 @@ rutabaga_cmd_resource_map_blob(VirtIOGPU *g,
> > result = rutabaga_resource_map(rutabaga, mblob.resource_id, &mapping);
> > CHECK_RESULT(result, cmd);
> >
> > + memory_region_transaction_begin();
> > + memory_region_init_ram_device_ptr(&res->region, OBJECT(g), NULL,
> > + mapping.size, (void *)mapping.ptr);
> > + memory_region_add_subregion(&g->parent_obj.hostmem, mblob.offset,
> > + &res->region);
> > + memory_region_set_enabled(&res->region, true);
> > + memory_region_transaction_commit();
> > +
> > memset(&resp, 0, sizeof(resp));
> > resp.hdr.type = VIRTIO_GPU_RESP_OK_MAP_INFO;
> > result = rutabaga_resource_map_info(rutabaga, mblob.resource_id,
* Re: [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany?
2023-04-22 16:41 ` Akihiko Odaki
@ 2023-04-25 0:16 ` Gurchetan Singh
0 siblings, 0 replies; 28+ messages in thread
From: Gurchetan Singh @ 2023-04-25 0:16 UTC (permalink / raw)
To: Akihiko Odaki
Cc: Stefan Hajnoczi, qemu-devel, pbonzini, philmd, david, stefanha,
kraxel, marcandre.lureau, dmitry.osipenko, ray.huang, alex.bennee
On Sat, Apr 22, 2023 at 9:41 AM Akihiko Odaki <akihiko.odaki@gmail.com> wrote:
>
> On 2023/04/22 8:54, Gurchetan Singh wrote:
> > On Fri, Apr 21, 2023 at 9:02 AM Stefan Hajnoczi <stefanha@gmail.com> wrote:
> >>
> >> On Thu, 20 Apr 2023 at 21:13, Gurchetan Singh
> >> <gurchetansingh@chromium.org> wrote:
> >>>
> >>> From: Gurchetan Singh <gurchetansingh@google.com>
> >>>
> >>> Rationale:
> >>>
> >>> - gfxstream [a] is good for the Android Emulator/upstream QEMU
> >>> alignment
> >>> - Wayland passthrough [b] via the cross-domain context type is good
> >>> for Linux on Linux display virtualization
> >>> - rutabaga_gfx [c] sits on top of gfxstream, cross-domain and even
> >>> virglrenderer
> >>> - This series ports rutabaga_gfx to QEMU
> >>
> >> What are rutabaga_gfx and gfxstream? Can you explain where they sit in the
> >> stack and how they build on or complement virtio-gpu and
> >> virglrenderer?
> >
> > rutabaga_gfx and gfxstream are both libraries that implement the
> > virtio-gpu protocol. There's a document available in the Gitlab issue
> > to see where they fit in the stack [a].
> >
> > gfxstream grew out of the Android Emulator's need to virtualize
> > graphics for app developers. There's a short history of gfxstream in
> > patch 10. It complements virglrenderer in that it's a bit more
> > cross-platform and targets different use cases -- more detail here
> > [b]. The ultimate goal is to ditch out-of-tree kernel drivers in the
> > Android Emulator and adopt virtio, and porting gfxstream to QEMU would
> > speed up that transition.
>
> I wonder what the motivation is for maintaining gfxstream instead of
> switching to virglrenderer/venus.
gfxstream GLES has features that would require significant redesign to
implement in virgl: multi-threading, live migration, widespread CTS
conformance (virgl only works well on FOSS Linux due to TGSI issues),
and memory management, to name a few.
Re: gfxstream VK and venus, it's a question of minimizing technical
risk. Going from upstream to a shipping product that works on
MacOS/Windows/Linux means there's always going to be a long tail of
bugs.
The Android Emulator is still on QEMU 2.12 and the update won't be
easy (there are other things that need to be upstreamed besides GPU),
cross-platform API layering over Vulkan is expected to take 1+ year,
Vulkan doesn't work on many GPUs due to KVM issues [a], and no Vulkan
support at all has landed in upstream QEMU.
Probably the most pragmatic way to do this is to take it step by step,
and align over time by sharing components. There might be a few
proposals to mesa-dev on that front, but getting upstream QEMU working
is a higher priority right now. A bulk transition from one stack or
the other would be more difficult to pull off.
The great thing about context types/rutabaga_gfx is that
gfxstream/virglrenderer details are largely hidden from QEMU and
present little maintenance burden. Yes, a dependency on a new Rust
library is added, but moving towards Rust makes a ton of sense
security-wise in the long term anyway.
[a] https://lore.kernel.org/all/20230330085802.2414466-1-stevensd@google.com/
-- even if this patch lands today, users will still need 1-2 years to
update
>
> >
> > rutabaga_gfx is a much smaller Rust library that sits on top of
> > gfxstream and even virglrenderer, but does a few extra things. It
> > implements the cross-domain context type, which provides Wayland
> > passthrough. This helps virtio-gpu by providing more modern display
> > virtualization. For example, Microsoft's WSL2 also uses a similar
> > technique [c], but I believe it is not virtio-based nor open-source.
>
> The guest side components of WSLg are open-source, but the host side is
> not: https://github.com/microsoft/wslg
> It also uses DirectX for acceleration, so it's not really portable
> outside Windows.
>
> > With this, we can have the same open-source Wayland passthrough
> > solution on crosvm, QEMU and even Fuchsia [d]. Also, there might be
> > an additional small Rust context type for security-sensitive use cases
> > in the future -- rutabaga_gfx wouldn't compile its gfxstream bindings
> > (since it's C++ based) in such cases.
> >
> > Both gfxstream and rutabaga_gfx are a part of the virtio spec [e] now too.
> >
> > [a] https://gitlab.com/qemu-project/qemu/-/issues/1611
> > [b] https://lists.gnu.org/archive/html/qemu-devel/2023-03/msg04271.html
> > [c] https://www.youtube.com/watch?v=EkNBsBx501Q
> > [d] https://fuchsia-review.googlesource.com/c/fuchsia/+/778764
> > [e] https://github.com/oasis-tcs/virtio-spec/blob/master/device-types/gpu/description.tex#L533
> >
> >>
> >> Stefan
* Re: [RFC PATCH 13/13] HACK: schedule fence return on main AIO context
2023-04-21 23:20 ` Gurchetan Singh
@ 2023-04-25 12:04 ` Stefan Hajnoczi
0 siblings, 0 replies; 28+ messages in thread
From: Stefan Hajnoczi @ 2023-04-25 12:04 UTC (permalink / raw)
To: Gurchetan Singh
Cc: qemu-devel, pbonzini, philmd, david, stefanha, kraxel,
marcandre.lureau, akihiko.odaki, dmitry.osipenko, ray.huang,
alex.bennee
On Fri, 21 Apr 2023 at 19:21, Gurchetan Singh
<gurchetansingh@chromium.org> wrote:
>
> On Fri, Apr 21, 2023 at 9:00 AM Stefan Hajnoczi <stefanha@gmail.com> wrote:
> >
> > On Thu, 20 Apr 2023 at 21:13, Gurchetan Singh
> > <gurchetansingh@chromium.org> wrote:
> > >
> > > Both gfxstream and cross-domain (and even newer versions of
> > > virglrenderer: see VIRGL_RENDERER_ASYNC_FENCE_CB) like to signal
> > > fence completion on threads ("callback threads") that are
> > > different from the thread that processes the command queue
> > > ("main thread").
> > >
> > > This is generally possible with locking, and this is what we do
> > > in crosvm and other virtio-gpu1.1 implementations. However, on
> > > QEMU a deadlock is observed if virtio_gpu_ctrl_response_nodata(..)
> > > [used in the fence callback] is used from a thread that is not the
> > > main thread.
> > >
> > > The reason is the main thread takes the big QEMU lock (bql) somewhere
> > > when processing the command queue, and virtio_gpu_ctrl_response_nodata(..)
> > > needs that lock. If you add in the lock needed to protect &g->fenceq
> > > from concurrent access by the main thread and the callback threads,
> > > you can end up with deadlocks.
> > >
> > > It's possible to work around this by scheduling the return of the fence
> > > descriptors via aio_bh_schedule_oneshot_full(..), but that somewhat
> > > negates the rationale for the asynchronous callbacks.
> > >
> > > I also played around with aio_context_acquire()/aio_context_release(),
> > > doesn't seem to help.
> > >
> > > Is signaling the virtio_queue outside of the main thread possible? If
> > > so, how?
> >
> > Yes, if you want a specific thread to process a virtqueue, monitor the
> > virtqueue's host_notifier (aka ioeventfd) from the thread. That's how
> > --device virtio-blk,iothread= works. It attaches the host_notifier to
> > the IOThread's AioContext. Whenever the guest kicks the virtqueue, the
> > IOThread will respond instead of QEMU's main loop thread.
> >
> > That said, I don't know the threading model of your code. Could you
> > explain which threads are involved? Do you want to process all buffers
> > from virtqueue in a specific thread or only these fence buffers?
>
> Only the fence callback would be signalled via these callback threads.
> The virtqueue would not be processed by the callback thread.
>
> There can be multiple callback threads: for example, one for the
> gfxstream context and another for the Wayland context. These threads
> could be a C++ thread, a Rust thread or any other.
>
> The strategy used by crosvm is to have a mutex around the fence state
> to synchronize between multiple callback threads (which signal fences)
> and main thread (which generates fences).
>
> I tried to use aio_context_acquire(..)/aio_context_release(..) to
> synchronize these threads, but that results in a deadlock. I think
> those functions may assume the caller is an IOThread rather than an
> arbitrary thread. It seems aio_context_acquire(..) succeeds for the
> callback thread, though.
>
> Here's what I tried (rather than this patch, which works but is less
> ideal than the solution below):
I don't have time to study the code and understand the threading
model, but one thing stood out:
> Callback Thread deadlocked with above patch:
>
> [Switching to thread 56 (Thread 0x7ffd2c9ff640 (LWP 8649))]
> #0 futex_wait (private=0, expected=2, futex_word=0x5555569893a0
> <qemu_global_mutex>) at ../sysdeps/nptl/futex-internal.h:146
> 146 in ../sysdeps/nptl/futex-internal.h
> (gdb) bt
> #0 futex_wait (private=0, expected=2, futex_word=0x5555569893a0
> <qemu_global_mutex>) at ../sysdeps/nptl/futex-internal.h:146
> #1 __GI___lll_lock_wait (futex=futex@entry=0x5555569893a0
> <qemu_global_mutex>, private=0) at ./nptl/lowlevellock.c:49
> #2 0x00007ffff7098082 in lll_mutex_lock_optimized
> (mutex=0x5555569893a0 <qemu_global_mutex>) at
> ./nptl/pthread_mutex_lock.c:48
> #3 ___pthread_mutex_lock (mutex=0x5555569893a0 <qemu_global_mutex>)
> at ./nptl/pthread_mutex_lock.c:93
> #4 0x00005555560019da in qemu_mutex_lock_impl (mutex=0x5555569893a0
> <qemu_global_mutex>, file=0x5555561df60c "../softmmu/physmem.c",
> line=2581) at ../util/qemu-thread-posix.c:94
> #5 0x0000555555afdf2d in qemu_mutex_lock_iothread_impl
> (file=0x5555561df60c "../softmmu/physmem.c", line=2581) at
> ../softmmu/cpus.c:504
> #6 0x0000555555d69a1a in prepare_mmio_access (mr=0x555556c10ca0) at
> ../softmmu/physmem.c:2581
> #7 0x0000555555d6b7a8 in address_space_stl_internal
> (as=0x5555578270f8, addr=4276125700, val=33, attrs=..., result=0x0,
> endian=DEVICE_LITTLE_ENDIAN) at
> /home/lenovo/qemu/memory_ldst.c.inc:318
> #8 0x0000555555d6b91d in address_space_stl_le (as=0x5555578270f8,
> addr=4276125700, val=33, attrs=..., result=0x0) at
> /home/lenovo/qemu/memory_ldst.c.inc:357
> #9 0x0000555555a0657c in pci_msi_trigger (dev=0x555557826ec0,
> msg=...) at ../hw/pci/pci.c:356
> #10 0x0000555555a02a5a in msi_send_message (dev=0x555557826ec0,
> msg=...) at ../hw/pci/msi.c:379
> #11 0x0000555555a045a0 in msix_notify (dev=0x555557826ec0, vector=1)
> at ../hw/pci/msix.c:542
> #12 0x0000555555ac8780 in virtio_pci_notify (d=0x555557826ec0,
> vector=1) at ../hw/virtio/virtio-pci.c:77
> #13 0x0000555555d15ddc in virtio_notify_vector (vdev=0x55555783ffd0,
> vector=1) at ../hw/virtio/virtio.c:1985
> #14 0x0000555555d17393 in virtio_irq (vq=0x555558716d80) at
> ../hw/virtio/virtio.c:2461
> #15 0x0000555555d17439 in virtio_notify (vdev=0x55555783ffd0,
> vq=0x555558716d80) at ../hw/virtio/virtio.c:2473
> #16 0x0000555555b98732 in virtio_gpu_ctrl_response (g=0x55555783ffd0,
> cmd=0x7ffdd80068b0, resp=0x7ffd2c9fe150, resp_len=24) at
> ../hw/display/virtio-gpu.c:177
> #17 0x0000555555b9879c in virtio_gpu_ctrl_response_nodata
> (g=0x55555783ffd0, cmd=0x7ffdd80068b0, type=VIRTIO_GPU_RESP_OK_NODATA)
> at ../hw/display/virtio-gpu.c:189
> #18 0x0000555555ba6b82 in virtio_gpu_rutabaga_fence_cb
> (user_data=93825028849616, fence_data=...) at
> ../hw/display/virtio-gpu-rutabaga.c:860
> #19 0x00007ffff75b9381 in
> _ZN131_$LT$rutabaga_gfx..rutabaga_utils..RutabagaFenceClosure$LT$T$GT$$u20$as$u20$rutabaga_gfx..rutabaga_utils..RutabagaFenceCallback$GT$4call17hcbf1b4ac094a2a60E.llvm.14356420492591683714
> () at /usr/local/lib/librutabaga_gfx_ffi.so
> #20 0x00007ffff75d501b in
> rutabaga_gfx::cross_domain::cross_domain::CrossDomainWorker::run () at
> /usr/local/lib/librutabaga_gfx_ffi.so
> #21 0x00007ffff75dd651 in
> std::sys_common::backtrace::__rust_begin_short_backtrace () at
> /usr/local/lib/librutabaga_gfx_ffi.so
> #22 0x00007ffff75c62ba in
> core::ops::function::FnOnce::call_once{{vtable.shim}} () at
> /usr/local/lib/librutabaga_gfx_ffi.so
> #23 0x00007ffff75e3b73 in alloc::boxed::{impl#44}::call_once<(), dyn
> core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global>
> (self=..., args=()) at library/alloc/src/boxed.rs:1940
> #24 alloc::boxed::{impl#44}::call_once<(), alloc::boxed::Box<dyn
> core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global>,
> alloc::alloc::Global> (self=0x5555572e27d0, args=())
> at library/alloc/src/boxed.rs:1940
> #25 std::sys::unix::thread::{impl#2}::new::thread_start
> (main=0x5555572e27d0) at library/std/src/sys/unix/thread.rs:108
> #26 0x00007ffff7094b43 in start_thread (arg=<optimized out>) at
> ./nptl/pthread_create.c:442
> #27 0x00007ffff7126a00 in clone3 () at
> ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
This Rust thread is calling QEMU emulation code that is not designed
to run outside the big QEMU lock (virtio_notify()).
I think prepare_mmio_access() is deadlocking on the big QEMU lock.
Another thread must be holding it (maybe waiting for this thread?) and
therefore the hang occurs.
There is also some additional setup, like rcu_register_thread(), that
may be missing in this Rust thread.
The conservative approach is to only call QEMU emulation functions
like virtio_notify() from a QEMU thread. Some parts of QEMU are
thread-safe and don't need the big QEMU lock, but that requires
carefully calling only safe functions and at first glance I guess this
patch series isn't doing that.
Stefan
end of thread [newest: 2023-04-25 12:05 UTC]
Thread overview: 28+ messages
2023-04-21 1:12 [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany? Gurchetan Singh
2023-04-21 1:12 ` [RFC PATCH 01/13] virtio: Add shared memory capability Gurchetan Singh
2023-04-21 1:12 ` [RFC PATCH 02/13] virtio-gpu: hostmem Gurchetan Singh
2023-04-21 1:12 ` [RFC PATCH 03/13] virtio-gpu blob prep: improve decoding and add memory region Gurchetan Singh
2023-04-21 1:12 ` [RFC PATCH 04/13] virtio-gpu: CONTEXT_INIT feature Gurchetan Singh
2023-04-21 1:12 ` [RFC PATCH 05/13] gfxstream + rutabaga prep: virtio_gpu_gl -> virtio_gpu_virgl Gurchetan Singh
2023-04-21 9:40 ` Philippe Mathieu-Daudé
2023-04-21 1:12 ` [RFC PATCH 06/13] gfxstream + rutabaga prep: make GL device more library agnostic Gurchetan Singh
2023-04-21 1:12 ` [RFC PATCH 07/13] gfxstream + rutabaga prep: define callbacks in realize function Gurchetan Singh
2023-04-21 9:53 ` Philippe Mathieu-Daudé
2023-04-21 1:12 ` [RFC PATCH 08/13] gfxstream + rutabaga prep: added need defintions, fields, and options Gurchetan Singh
2023-04-22 14:54 ` Akihiko Odaki
2023-04-21 1:12 ` [RFC PATCH 09/13] gfxstream + rutabaga: add required meson changes Gurchetan Singh
2023-04-21 1:12 ` [RFC PATCH 10/13] gfxstream + rutabaga: add initial support for gfxstream Gurchetan Singh
2023-04-22 15:37 ` Akihiko Odaki
2023-04-21 1:12 ` [RFC PATCH 11/13] gfxstream + rutabaga: enable rutabaga Gurchetan Singh
2023-04-21 1:12 ` [RFC PATCH 12/13] HACK: use memory region API to inject memory to guest Gurchetan Singh
2023-04-22 15:46 ` Akihiko Odaki
2023-04-25 0:08 ` Gurchetan Singh
2023-04-22 19:22 ` Peter Maydell
2023-04-21 1:12 ` [RFC PATCH 13/13] HACK: schedule fence return on main AIO context Gurchetan Singh
2023-04-21 15:59 ` Stefan Hajnoczi
2023-04-21 23:20 ` Gurchetan Singh
2023-04-25 12:04 ` Stefan Hajnoczi
2023-04-21 16:02 ` [RFC PATCH 00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany? Stefan Hajnoczi
2023-04-21 23:54 ` Gurchetan Singh
2023-04-22 16:41 ` Akihiko Odaki
2023-04-25 0:16 ` Gurchetan Singh