public inbox for qemu-devel@nongnu.org
 help / color / mirror / Atom feed
From: "Kasireddy, Vivek" <vivek.kasireddy@intel.com>
To: "Cédric Le Goater" <clg@redhat.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"Akihiko Odaki" <odaki@rsg.ci.i.u-tokyo.ac.jp>
Cc: Alex Williamson <alex@shazbot.org>
Subject: RE: [PATCH v12 07/10] vfio/device: Add support for creating dmabuf from multiple ranges
Date: Tue, 24 Mar 2026 05:47:19 +0000	[thread overview]
Message-ID: <IA0PR11MB7185D4CE68C2FA1DA354D42DF848A@IA0PR11MB7185.namprd11.prod.outlook.com> (raw)
In-Reply-To: <98c1495e-e142-4e63-ada1-87e14b85342e@redhat.com>

Hi Cedric,

> Subject: Re: [PATCH v12 07/10] vfio/device: Add support for creating
> dmabuf from multiple ranges
> 
> On 3/19/26 06:15, Vivek Kasireddy wrote:
> > In order to create a dmabuf associated with a buffer that spans
> > multiple ranges, we first need to identify the VFIO region and
> > index the buffer (represented by iovec) belongs to and then
> > translate its addresses to offsets within that region.
> >
> > The qemu_ram_block_from_host() API gives us both the region and
> > the offset info we need to populate the dma ranges so that we can
> > invoke the VFIO_DEVICE_FEATURE_DMA_BUF feature to create the
> dmabuf.
> >
> > Also, instead of iterating over all QOM devices to find the
> > VFIODevice associated with a memory region, introduce a helper
> > to just use the vfio_device_list to lookup the VFIODevice.
> > And, introduce another helper lookup the VFIO region given an
> > address.
> >
> > Lastly, introduce an enum to return the type of error encountered
> > while creating the dmabuf fd.
> >
> > Cc: Alex Williamson <alex@shazbot.org>
> > Cc: Cédric Le Goater <clg@redhat.com>
> > Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
> > ---
> >   include/hw/vfio/vfio-device.h |  23 +++++++
> >   hw/vfio/device.c              | 114
> ++++++++++++++++++++++++++++++++++
> >   hw/vfio/dmabuf-stubs.c        |  17 +++++
> >   hw/vfio/meson.build           |   1 +
> >   4 files changed, 155 insertions(+)
> >   create mode 100644 hw/vfio/dmabuf-stubs.c
> >
> > diff --git a/include/hw/vfio/vfio-device.h b/include/hw/vfio/vfio-
> device.h
> > index 828a31c006..d46640ff53 100644
> > --- a/include/hw/vfio/vfio-device.h
> > +++ b/include/hw/vfio/vfio-device.h
> > @@ -41,6 +41,13 @@ enum {
> >       VFIO_DEVICE_TYPE_AP = 3,
> >   };
> >
> > +enum {
> > +    /* The Guest passed addresses stored in IOV are invalid */
> > +    VFIO_DMABUF_CREATE_ERR_INVALID_IOV = -1,
> > +    /* Guest or Host may be at fault for this type of error */
> > +    VFIO_DMABUF_CREATE_ERR_UNSPEC = -2,
> > +};
> 
> Sorry, I am not convinced this is useful.
Ok, I'll move these error enums elsewhere or not use them in VFIO.
The goal/idea here is to identify the error case where Guest passed
in invalid addresses that do not belong to VFIO or virtio-gpu and log
them with qemu_log_mask(). In all other cases (of errors), we want
to categorize the errors as Host related and report them using
error_report_err().

We have done it this way because there is no way to know (ahead of
time) if the Guest passed addresses belong to VFIO or virtio-gpu or
invalid. So, we always first try to create a dmabuf via udmabuf and if
that fails, try via VFIO and then bail out with a qemu_log_mask() if we
cannot create a dmabuf via VFIO as well.

> 
> > +
> >   typedef struct VFIODeviceOps VFIODeviceOps;
> >   typedef struct VFIODeviceIOOps VFIODeviceIOOps;
> >   typedef struct VFIOMigration VFIOMigration;
> > @@ -308,6 +315,22 @@ bool vfio_device_has_region_cap(VFIODevice
> *vbasedev, int region, uint16_t cap_t
> >
> >   int vfio_device_get_irq_info(VFIODevice *vbasedev, int index,
> >                                   struct vfio_irq_info *info);
> > +
> > +/**
> > + * Create a dmabuf fd by first translating the addresses in the
> > + * iovec array into VFIO region offsets and then invoking the
> > + * VFIO_DEVICE_FEATURE_DMA_BUF feature.
> > + *
> > + * @iov: array of iovec entries associated with a buffer
> > + * @iov_cnt: number of entries in the iovec array
> > + * @total_size: total size of the dmabuf
> > + * @errp: pointer to Error*, to store an error if it happens.
> > + *
> > + * Returns the created dmabuf fd or either
> VFIO_DMABUF_CREATE_ERR_UNSPEC
> > + * or VFIO_DMABUF_CREATE_ERR_INVALID_IOV on error.
> > + */
> > +int vfio_device_create_dmabuf_fd(struct iovec *iov, unsigned int
> iov_cnt,
> > +                                 size_t total_size, Error **errp);
> >   #endif
> >
> >   /* Returns 0 on success, or a negative errno. */
> > diff --git a/hw/vfio/device.c b/hw/vfio/device.c
> > index 973fc35b59..542c378913 100644
> > --- a/hw/vfio/device.c
> > +++ b/hw/vfio/device.c
> > @@ -21,7 +21,9 @@
> >   #include "qemu/osdep.h"
> >   #include <sys/ioctl.h>
> >
> > +#include "system/ramblock.h"
> >   #include "hw/vfio/vfio-device.h"
> > +#include "hw/vfio/vfio-region.h"
> >   #include "hw/vfio/pci.h"
> >   #include "hw/core/iommu.h"
> >   #include "hw/core/hw-error.h"
> > @@ -644,3 +646,115 @@ static VFIODeviceIOOps
> vfio_device_io_ops_ioctl = {
> >       .region_read = vfio_device_io_region_read,
> >       .region_write = vfio_device_io_region_write,
> >   };
> > +
> > +/*
> > + * This helper looks up the VFIODevice corresponding to the given
> iov. It
> > + * can be useful to determinine if a buffer (represented by the iov)
> belongs
> > + * to a VFIO device or not. This is mainly invoked when external
> components
> > + * such as virtio-gpu request creation of dmabuf fds for buffers that
> may
> > + * belong to a VFIO device.
> > + */
> > +static bool vfio_device_lookup(struct iovec *iov, VFIODevice
> **vbasedevp,
> > +                               RAMBlock **first_rbp, Error **errp)
> > +{
> > +    VFIODevice *vbasedev;
> > +    RAMBlock *first_rb;
> > +    ram_addr_t offset;
> > +
> > +    first_rb = qemu_ram_block_from_host(iov[0].iov_base, false,
> &offset);
> > +    if (!first_rb) {
> > +        error_setg(errp, "Could not find first ramblock\n");
> > +        return false;
> > +    }
> > +
> > +    *first_rbp = first_rb;
> > +    QLIST_FOREACH(vbasedev, &vfio_device_list, next) {
> > +        if (vbasedev->dev == first_rb->mr->dev) {
> > +            *vbasedevp = vbasedev;
> > +            return true;
> > +        }
> > +    }
> > +    error_setg(errp, "No VFIO device found to create dmabuf from\n");
> > +    return false;
> > +}
> > +
> > +/*
> > + * This helper looks up the VFIORegion corresponding to the given
> address.
> > + * It also verifies that the RAMBlock associated with the address is
> the
> > + * same as the first_rb passed in. This is to ensure that all addresses
> > + * in the iov belong to the same region.
> > + */
> > +static bool vfio_region_lookup(void *addr, VFIORegion **regionp,
> > +                               RAMBlock *first_rb, ram_addr_t *offsetp,
> > +                               Error **errp)
> > +{
> > +    VFIORegion *region;
> > +    RAMBlock *rb;
> > +
> > +    rb = qemu_ram_block_from_host(addr, false, offsetp);
> > +    if (!rb || rb != first_rb) {
> > +        error_setg(errp, "Dmabuf segments must belong to the same
> region\n");
> > +        return false;
> > +    }
> > +
> > +    region = vfio_get_region_from_mr(rb->mr);
> > +    if (region) {
> > +        *regionp = region;
> > +        return true;
> > +    }
> > +    error_setg(errp, "Could not find valid region for dmabuf
> segment\n");
> > +    return false;
> > +}
> > +
> > +int vfio_device_create_dmabuf_fd(struct iovec *iov, unsigned int
> iov_cnt,
> > +                                 size_t total_size, Error **errp)
> > +{
> > +    g_autofree struct vfio_device_feature *feature = NULL;
> > +    struct vfio_device_feature_dma_buf *dma_buf;
> > +    RAMBlock *first_rb = NULL;
> > +    VFIODevice *vbasedev;
> > +    VFIORegion *region;
> > +    ram_addr_t offset;
> > +    size_t argsz;
> > +    int i, ret;
> > +
> > +    if (iov_size(iov, iov_cnt) != total_size) {
> > +        error_setg(errp, "Total size of iov does not match dmabuf
> size\n");
> > +        return VFIO_DMABUF_CREATE_ERR_INVALID_IOV;
> 
> This returned value is redundant with errp and error-prone for an API.
> I think you should find an alternative implementation. May be these
> routines simply do not belong to VFIO.
Ok, we'll try to explore an alternative that would probably not have these
error enums in VFIO.

> 
> 
> > +    }
> > +
> > +    if (!vfio_device_lookup(iov, &vbasedev, &first_rb, errp)) {
> > +        return VFIO_DMABUF_CREATE_ERR_INVALID_IOV;
> > +    }
> > +
> > +    argsz = sizeof(*feature) + sizeof (*dma_buf) +
> > +            sizeof(struct vfio_region_dma_range) * iov_cnt;
> > +    feature = g_malloc0(argsz);
> > +    *feature = (struct vfio_device_feature) {
> > +        .argsz = argsz,
> > +        .flags = VFIO_DEVICE_FEATURE_GET |
> VFIO_DEVICE_FEATURE_DMA_BUF,
> > +    };
> > +    dma_buf = (struct vfio_device_feature_dma_buf *)feature->data;
> > +
> > +    for (i = 0; i < iov_cnt; i++) {
> > +        if (!vfio_region_lookup(iov[i].iov_base, &region,
> > +                                first_rb, &offset, errp)) {
> > +            return VFIO_DMABUF_CREATE_ERR_INVALID_IOV;
> > +        }
> > +
> > +        dma_buf->region_index = region->nr;
> > +        dma_buf->dma_ranges[i].offset = offset;
> > +        dma_buf->dma_ranges[i].length = iov[i].iov_len;
> > +    }
> > +
> > +    dma_buf->nr_ranges = iov_cnt;
> > +    dma_buf->open_flags = O_RDONLY | O_CLOEXEC;
> > +
> > +    ret = vfio_device_get_feature(vbasedev, feature);
> > +    if (ret < 0) {
> > +        error_setg_errno(errp, -ret,
> > +                         "Could not create dmabuf fd via VFIO device");
> > +        return VFIO_DMABUF_CREATE_ERR_UNSPEC;
> > +    }
> > +    return ret;
> > +}
> > diff --git a/hw/vfio/dmabuf-stubs.c b/hw/vfio/dmabuf-stubs.c
> > new file mode 100644
> > index 0000000000..374bd2a188
> > --- /dev/null
> > +++ b/hw/vfio/dmabuf-stubs.c
> > @@ -0,0 +1,17 @@
> > +/*
> > + * Copyright (c) 2026 Intel and/or its affiliates.
> > + *
> > + * SPDX-License-Identifier: GPL-2.0-or-later
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "qemu/error-report.h"
> > +#include "qemu/iov.h"
> > +#include "hw/vfio/vfio-device.h"
> > +
> > +int vfio_device_create_dmabuf_fd(struct iovec *iov, unsigned int
> iov_cnt,
> > +                                 size_t total_size, Error **errp)
> > +{
> > +    error_setg(errp, "VFIO dmabuf support is not enabled");
> > +    return VFIO_DMABUF_CREATE_HOST_ERROR;
> 
> ../hw/vfio/dmabuf-stubs.c:16:12: error:
> 'VFIO_DMABUF_CREATE_HOST_ERROR' undeclared (first use in this
> function)
>     16 |     return VFIO_DMABUF_CREATE_HOST_ERROR;
Will fix it in the next version.

Thanks,
Vivek

> 
> Thanks,
> 
> C.
> 
> 
> 
> > +}
> > diff --git a/hw/vfio/meson.build b/hw/vfio/meson.build
> > index 82f68698fb..ac899d30a8 100644
> > --- a/hw/vfio/meson.build
> > +++ b/hw/vfio/meson.build
> > @@ -34,3 +34,4 @@ system_ss.add(when: 'CONFIG_IOMMUFD',
> if_false: files('iommufd-stubs.c'))
> >   system_ss.add(when: 'CONFIG_VFIO_PCI', if_true: files(
> >     'display.c',
> >   ))
> > +system_ss.add(when: 'CONFIG_VFIO', if_false: files('dmabuf-stubs.c'))



  reply	other threads:[~2026-03-24  5:48 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-19  5:15 [PATCH v12 00/10] vfio: Create dmabuf from multiple VFIO ranges and use it in virtio-gpu Vivek Kasireddy
2026-03-19  5:15 ` [PATCH v12 01/10] virtio-gpu: Recreate the resource's dmabuf if new backing is attached Vivek Kasireddy
2026-03-19  5:15 ` [PATCH v12 02/10] virtio-gpu: Find hva for Guest's DMA addr associated with a ram device Vivek Kasireddy
2026-03-19  5:15 ` [PATCH v12 03/10] virtio-gpu: Rename udmabuf files and helpers to dmabuf Vivek Kasireddy
2026-03-19  5:15 ` [PATCH v12 04/10] virtio-gpu-dmabuf: Remove rcu_read_lock/unlock from virtio_gpu_create_udmabuf() Vivek Kasireddy
2026-03-19  5:15 ` [PATCH v12 05/10] virtio-gpu-dmabuf: Use g_autofree for the list pointer Vivek Kasireddy
2026-03-19  5:15 ` [PATCH v12 06/10] vfio/region: Add a helper to get VFIO region from memory region Vivek Kasireddy
2026-03-23 17:38   ` Cédric Le Goater
2026-03-24  5:47     ` Kasireddy, Vivek
2026-03-19  5:15 ` [PATCH v12 07/10] vfio/device: Add support for creating dmabuf from multiple ranges Vivek Kasireddy
2026-03-23 18:02   ` Cédric Le Goater
2026-03-24  5:47     ` Kasireddy, Vivek [this message]
2026-03-19  5:15 ` [PATCH v12 08/10] vfio/device: Add a helper to mmap a dmabuf Vivek Kasireddy
2026-03-19  5:15 ` [PATCH v12 09/10] virtio-gpu-dmabuf: Improve error handling with 'Error **' and err enum Vivek Kasireddy
2026-03-23 17:51   ` Cédric Le Goater
2026-03-24  5:53     ` Kasireddy, Vivek
2026-03-24  8:58       ` Akihiko Odaki
2026-03-25  5:31         ` Kasireddy, Vivek
2026-03-25  8:27           ` Akihiko Odaki
2026-03-26  5:54             ` Kasireddy, Vivek
2026-03-19  5:15 ` [PATCH v12 10/10] virtio-gpu-dmabuf: Create dmabuf for blobs associated with VFIO devices Vivek Kasireddy
2026-03-24  9:11   ` Akihiko Odaki
2026-03-25  5:29     ` Kasireddy, Vivek
2026-03-25  8:02       ` Akihiko Odaki
2026-03-26  5:52         ` Kasireddy, Vivek
2026-03-26  6:15           ` Akihiko Odaki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=IA0PR11MB7185D4CE68C2FA1DA354D42DF848A@IA0PR11MB7185.namprd11.prod.outlook.com \
    --to=vivek.kasireddy@intel.com \
    --cc=alex@shazbot.org \
    --cc=clg@redhat.com \
    --cc=odaki@rsg.ci.i.u-tokyo.ac.jp \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox