From: Albert Esteve <aesteve@redhat.com>
To: Stefano Garzarella <sgarzare@redhat.com>
Cc: qemu-devel@nongnu.org, "Alex Bennée" <alex.bennee@linaro.org>,
hi@alyssa.is, stefanha@redhat.com, david@redhat.com,
jasowang@redhat.com, dbassey@redhat.com, stevensd@chromium.org,
"Laurent Vivier" <lvivier@redhat.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Fabiano Rosas" <farosas@suse.de>,
slp@redhat.com, manos.pitsidianakis@linaro.org
Subject: Re: [PATCH v10 1/7] vhost-user: Add VirtIO Shared Memory map request
Date: Fri, 17 Oct 2025 14:35:44 +0200 [thread overview]
Message-ID: <CADSE00LLz+hc1DN_x4zHXUSJ30Rf1Gy_WWQmMgMRKE6yOkY8xw@mail.gmail.com> (raw)
In-Reply-To: <5eyt46m7aausn2b26cgtjxhqwt7f6iia3wj7c2kkjaxjjic64p@dyhbrjq32djr>
On Fri, Oct 17, 2025 at 2:13 PM Stefano Garzarella <sgarzare@redhat.com> wrote:
>
> On Fri, Oct 17, 2025 at 01:24:52PM +0200, Albert Esteve wrote:
> >On Fri, Oct 17, 2025 at 11:23 AM Stefano Garzarella <sgarzare@redhat.com> wrote:
> >>
> >> On Thu, Oct 16, 2025 at 04:38:21PM +0200, Albert Esteve wrote:
> >> >Add SHMEM_MAP/UNMAP requests to vhost-user for dynamic management of
> >> >VIRTIO Shared Memory mappings.
> >> >
> >> >This implementation introduces VirtioSharedMemoryMapping as a unified
> >> >QOM object that manages both the mapping metadata and MemoryRegion
> >> >lifecycle. This object provides reference-counted lifecycle management
> >> >with automatic cleanup of file descriptors and memory regions
> >> >through QOM finalization.
> >> >
> >> >This request allows backends to dynamically map file descriptors into a
> >> >VIRTIO Shared Memory Region identified by their shmid. Maps are created
> >> >using memory_region_init_ram_from_fd() with configurable read/write
> >> >permissions, and the resulting MemoryRegions are added as subregions to
> >> >the shmem container region. The mapped memory is then advertised to the
> >> >guest VIRTIO drivers as a base address plus offset for reading and
> >> >writing according to the requested mmap flags.
> >> >
> >> >The backend can unmap memory ranges within a given VIRTIO Shared Memory
> >> >Region to free resources. Upon receiving this message, the frontend
> >> >removes the MemoryRegion as a subregion and automatically unreferences
> >> >the VirtioSharedMemoryMapping object, triggering cleanup if no other
> >> >references exist.
> >> >
> >> >Error handling has been improved to ensure consistent behavior across
> >> >handlers that manage their own vhost_user_send_resp() calls. Since
> >> >these handlers clear the VHOST_USER_NEED_REPLY_MASK flag, explicit
> >> >error checking ensures proper connection closure on failures,
> >> >maintaining the expected error flow.
> >> >
> >> >Note the memory region commit for these operations needs to be delayed
> >> >until after we reply to the backend to avoid deadlocks. Otherwise,
> >> >the MemoryListener would send a VHOST_USER_SET_MEM_TABLE message
> >> >before the reply.
> >> >
> >> >Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> >> >Signed-off-by: Albert Esteve <aesteve@redhat.com>
> >> >---
> >> > hw/virtio/vhost-user.c | 267 ++++++++++++++++++++++
> >> > hw/virtio/virtio.c | 199 ++++++++++++++++
> >> > include/hw/virtio/virtio.h | 135 +++++++++++
> >> > subprojects/libvhost-user/libvhost-user.c | 70 ++++++
> >> > subprojects/libvhost-user/libvhost-user.h | 54 +++++
> >> > 5 files changed, 725 insertions(+)
> >> >
> >> >diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> >> >index 36c9c2e04d..890be55937 100644
> >> >--- a/hw/virtio/vhost-user.c
> >> >+++ b/hw/virtio/vhost-user.c
> >> >@@ -104,6 +104,7 @@ typedef enum VhostUserRequest {
> >> > VHOST_USER_GET_SHARED_OBJECT = 41,
> >> > VHOST_USER_SET_DEVICE_STATE_FD = 42,
> >> > VHOST_USER_CHECK_DEVICE_STATE = 43,
> >> >+ VHOST_USER_GET_SHMEM_CONFIG = 44,
> >> > VHOST_USER_MAX
> >> > } VhostUserRequest;
> >> >
> >> >@@ -115,6 +116,8 @@ typedef enum VhostUserBackendRequest {
> >> > VHOST_USER_BACKEND_SHARED_OBJECT_ADD = 6,
> >> > VHOST_USER_BACKEND_SHARED_OBJECT_REMOVE = 7,
> >> > VHOST_USER_BACKEND_SHARED_OBJECT_LOOKUP = 8,
> >> >+ VHOST_USER_BACKEND_SHMEM_MAP = 9,
> >> >+ VHOST_USER_BACKEND_SHMEM_UNMAP = 10,
> >> > VHOST_USER_BACKEND_MAX
> >> > } VhostUserBackendRequest;
> >> >
> >> >@@ -136,6 +139,12 @@ typedef struct VhostUserMemRegMsg {
> >> > VhostUserMemoryRegion region;
> >> > } VhostUserMemRegMsg;
> >> >
> >> >+typedef struct VhostUserShMemConfig {
> >> >+ uint32_t nregions;
> >> >+ uint32_t padding;
> >> >+ uint64_t memory_sizes[VIRTIO_MAX_SHMEM_REGIONS];
> >> >+} VhostUserShMemConfig;
> >> >+
> >> > typedef struct VhostUserLog {
> >> > uint64_t mmap_size;
> >> > uint64_t mmap_offset;
> >> >@@ -192,6 +201,23 @@ typedef struct VhostUserShared {
> >> > unsigned char uuid[16];
> >> > } VhostUserShared;
> >> >
> >> >+/* For the flags field of VhostUserMMap */
> >> >+#define VHOST_USER_FLAG_MAP_RW (1u << 0)
> >> >+
> >> >+typedef struct {
> >> >+ /* VIRTIO Shared Memory Region ID */
> >> >+ uint8_t shmid;
> >> >+ uint8_t padding[7];
> >> >+ /* File offset */
> >> >+ uint64_t fd_offset;
> >> >+ /* Offset within the VIRTIO Shared Memory Region */
> >> >+ uint64_t shm_offset;
> >> >+ /* Size of the mapping */
> >> >+ uint64_t len;
> >> >+ /* Flags for the mmap operation, from VHOST_USER_FLAG_MAP_* */
> >> >+ uint64_t flags;
> >> >+} VhostUserMMap;
> >> >+
> >> > typedef struct {
> >> > VhostUserRequest request;
> >> >
> >> >@@ -224,6 +250,8 @@ typedef union {
> >> > VhostUserInflight inflight;
> >> > VhostUserShared object;
> >> > VhostUserTransferDeviceState transfer_state;
> >> >+ VhostUserMMap mmap;
> >> >+ VhostUserShMemConfig shmem;
> >> > } VhostUserPayload;
> >> >
> >> > typedef struct VhostUserMsg {
> >> >@@ -1768,6 +1796,196 @@ vhost_user_backend_handle_shared_object_lookup(struct vhost_user *u,
> >> > return 0;
> >> > }
> >> >
> >> >+/**
> >> >+ * vhost_user_backend_handle_shmem_map() - Handle SHMEM_MAP backend request
> >> >+ * @dev: vhost device
> >> >+ * @ioc: QIOChannel for communication
> >> >+ * @hdr: vhost-user message header
> >> >+ * @payload: message payload containing mapping details
> >> >+ * @fd: file descriptor for the shared memory region
> >> >+ *
> >> >+ * Handles VHOST_USER_BACKEND_SHMEM_MAP requests from the backend. Creates
> >> >+ * a VhostUserShmemObject to manage the shared memory mapping and adds it
> >> >+ * to the appropriate VirtIO shared memory region. The VhostUserShmemObject
> >> >+ * serves as an intermediate parent for the MemoryRegion, ensuring proper
> >> >+ * lifecycle management with reference counting.
> >> >+ *
> >> >+ * Returns: 0 on success, negative errno on failure
> >> >+ */
> >> >+static int
> >> >+vhost_user_backend_handle_shmem_map(struct vhost_dev *dev,
> >> >+ QIOChannel *ioc,
> >> >+ VhostUserHeader *hdr,
> >> >+ VhostUserPayload *payload,
> >> >+ int fd)
> >> >+{
> >> >+ VirtioSharedMemory *shmem;
> >> >+ VhostUserMMap *vu_mmap = &payload->mmap;
> >> >+ VirtioSharedMemoryMapping *existing;
> >> >+ Error *local_err = NULL;
> >> >+ int ret = 0;
> >> >+
> >> >+ if (fd < 0) {
> >> >+ error_report("Bad fd for map");
> >> >+ ret = -EBADF;
> >> >+ goto send_reply;
> >> >+ }
> >> >+
> >> >+ if (QSIMPLEQ_EMPTY(&dev->vdev->shmem_list)) {
> >> >+ error_report("Device has no VIRTIO Shared Memory Regions. "
> >> >+ "Requested ID: %d", vu_mmap->shmid);
> >> >+ ret = -EFAULT;
> >> >+ goto send_reply;
> >> >+ }
> >> >+
> >> >+ shmem = virtio_find_shmem_region(dev->vdev, vu_mmap->shmid);
> >> >+ if (!shmem) {
> >> >+ error_report("VIRTIO Shared Memory Region at "
> >> >+ "ID %d not found or uninitialized", vu_mmap->shmid);
> >> >+ ret = -EFAULT;
> >> >+ goto send_reply;
> >> >+ }
> >> >+
> >> >+ if ((vu_mmap->shm_offset + vu_mmap->len) < vu_mmap->len ||
> >> >+ (vu_mmap->shm_offset + vu_mmap->len) > shmem->mr.size) {
> >> >+ error_report("Bad offset/len for mmap %" PRIx64 "+%" PRIx64,
> >> >+ vu_mmap->shm_offset, vu_mmap->len);
> >> >+ ret = -EFAULT;
> >> >+ goto send_reply;
> >> >+ }
> >> >+
> >> >+ QTAILQ_FOREACH(existing, &shmem->mmaps, link) {
> >> >+ if (ranges_overlap(existing->offset, existing->len,
> >> >+ vu_mmap->shm_offset, vu_mmap->len)) {
> >> >+ error_report("VIRTIO Shared Memory mapping overlap");
> >> >+ ret = -EFAULT;
> >> >+ goto send_reply;
> >> >+ }
> >> >+ }
> >> >+
> >> >+ memory_region_transaction_begin();
> >> >+
> >> >+ /* Create VirtioSharedMemoryMapping object */
> >> >+ VirtioSharedMemoryMapping *mapping = virtio_shared_memory_mapping_new(
> >> >+ vu_mmap->shmid, fd, vu_mmap->fd_offset, vu_mmap->shm_offset,
> >> >+ vu_mmap->len, vu_mmap->flags & VHOST_USER_FLAG_MAP_RW);
> >> >+
> >> >+ if (!mapping) {
> >> >+ ret = -EFAULT;
> >> >+ goto send_reply_commit;
> >> >+ }
> >> >+
> >> >+ /* Add the mapping to the shared memory region */
> >> >+ if (virtio_add_shmem_map(shmem, mapping) != 0) {
> >> >+ error_report("Failed to add shared memory mapping");
> >> >+ object_unref(OBJECT(mapping));
> >> >+ ret = -EFAULT;
> >> >+ goto send_reply_commit;
> >> >+ }
> >> >+
> >> >+send_reply_commit:
> >> >+ /* Send reply and commit after transaction started */
> >> >+ if (hdr->flags & VHOST_USER_NEED_REPLY_MASK) {
> >> >+ payload->u64 = !!ret;
> >> >+ hdr->size = sizeof(payload->u64);
> >> >+ if (!vhost_user_send_resp(ioc, hdr, payload, &local_err)) {
> >> >+ error_report_err(local_err);
> >> >+ memory_region_transaction_commit();
> >> >+ return -EFAULT;
> >> >+ }
> >> >+ }
> >> >+ memory_region_transaction_commit();
> >>
> >> Sorry to be late, I did a quick review, my only doubts is here, maybe it
> >> was already discussed, but why do we commit after responding to the
> >> backend?
> >>
> >> Should we do it first to prevent the backend from “seeing” something
> >> that hasn't been committed yet?
> >
> >There is a race that leads to a deadlock. hw/virtio/vhost.c has a
> >MemoryListener that sends VHOST_USER_SET_MEM_TABLE messages in its
> >.commit() callback. If this happens before the reply, the backend will
> >not process it as it is stuck waiting for the SHMEM reply, and the
> >handler in qemu will not send it as it is waiting for the reply to the
> >SET_MEM_TABLE. So we have to delay the transaction commit to
> >immediately after the reply.
>
> Okay, I see now that you mentioned that in the commit description,
> great, I should have read it more carefully!
> IMO it would be worth adding a comment here, but I definitely won't ask
> you to send a v11 for this! (maybe a followup patch later).
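
Sure, I can add it in a follow-up. Something along these lines, right
before the commit (exact wording up for discussion):

```
    /*
     * Commit only after replying: vhost.c registers a MemoryListener
     * whose .commit() callback sends VHOST_USER_SET_MEM_TABLE. At this
     * point the backend is still blocked waiting for the SHMEM_MAP
     * reply, so it would never answer SET_MEM_TABLE and both sides
     * would deadlock.
     */
    memory_region_transaction_commit();
```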
>
> >
> >>
> >> Also, if vhost_user_send_resp() fails, should we call
> >> virtio_del_shmem_map()?
> >
> >If vhost_user_send_resp() fails, the connection with the backend is
> >closed, so the mappings will indeed never be removed unless we reset.
> >
> >Maybe better than removing the single mapping would be to loop
> >through the mappings in the shared memory region and clean them all
> >(the same as we do here):
> >
> >```
> > while (!QTAILQ_EMPTY(&shmem->mmaps)) {
> > VirtioSharedMemoryMapping *mapping = QTAILQ_FIRST(&shmem->mmaps);
> > virtio_del_shmem_map(shmem, mapping->offset, mapping->mr->size);
> > }
> >```
> >
> >But since a backend may utilize more than one shared memory region,
> >and we do not know the mapping between a given backend and its shared
> >memories, whatever we do will be incomplete (?).
>
> I don't know if this is the right place to do this kind of cleanup,
> maybe further up in the stack.
>
>
> >I think the only
> >solution after this happens is to reset (virtio_reset) to remove all
> >mappings from all the shared regions, and re-establish the backend
> >channel (is it possible?). Even if the channel cannot be re-established,
> >I wouldn't bother just removing one mapping, I would assume it needs a
> >reset.
>
> So, in conclusion, we are saying that if we can no longer communicate
> with the backend, there is no point in maintaining a consistent state,
> because we have to reset the device anyway.
I guess what I'm saying, after looking at the issue you raised (which
is reasonable and well-founded), is that there is no good option to
ensure we go back to a consistent state unless we reset.
> Are we already doing this, or should we be doing it?
I don't think we are. Whether we should is less clear to me; probably yes.
I feel we should at least set the broken flag to true in case of an error:
dev->vdev->broken = true;
Looking at virtio/virtio.h: `bool broken; /* device in invalid state,
needs reset */`
I can send a separate patch.
>
> BTW, I don't want to stop this series, I just found this error path
> strange.
On the contrary, thanks for taking a look! It is better to clear any
potential issues before merging. Even if it needs a new version.
>
> Thanks,
> Stefano
>
Thread overview: 24+ messages
2025-10-16 14:38 [PATCH v10 0/7] vhost-user: Add SHMEM_MAP/UNMAP requests Albert Esteve
2025-10-16 14:38 ` [PATCH v10 1/7] vhost-user: Add VirtIO Shared Memory map request Albert Esteve
2025-10-16 15:18 ` Albert Esteve
2025-10-16 18:31 ` Stefan Hajnoczi
2025-10-17 6:58 ` Albert Esteve
2025-10-17 9:22 ` Stefano Garzarella
2025-10-17 9:29 ` Manos Pitsidianakis
2025-10-17 11:24 ` Albert Esteve
2025-10-17 12:12 ` Stefano Garzarella
2025-10-17 12:35 ` Albert Esteve [this message]
2025-10-20 13:50 ` David Hildenbrand
2025-10-20 14:06 ` Albert Esteve
2025-10-16 14:38 ` [PATCH v10 2/7] vhost_user.rst: Align VhostUserMsg excerpt members Albert Esteve
2025-10-16 14:38 ` [PATCH v10 3/7] vhost_user.rst: Add SHMEM_MAP/_UNMAP to spec Albert Esteve
2025-10-20 13:27 ` Stefan Hajnoczi
2025-10-20 13:49 ` Albert Esteve
2025-10-16 14:38 ` [PATCH v10 4/7] vhost_user: Add frontend get_shmem_config command Albert Esteve
2025-10-16 14:38 ` [PATCH v10 5/7] vhost_user.rst: Add GET_SHMEM_CONFIG message Albert Esteve
2025-10-16 14:38 ` [PATCH v10 6/7] qmp: add shmem feature map Albert Esteve
2025-10-16 14:38 ` [PATCH v10 7/7] vhost-user-device: Add shared memory BAR Albert Esteve
2025-10-16 15:05 ` [PATCH v10 0/7] vhost-user: Add SHMEM_MAP/UNMAP requests Stefan Hajnoczi
2025-10-17 12:42 ` Stefano Garzarella
2025-10-17 13:52 ` Michael S. Tsirkin
2025-10-20 13:51 ` David Hildenbrand