From: Alexey Kardashevskiy <aik@amd.com>
To: Xu Yilun <yilun.xu@linux.intel.com>,
kvm@vger.kernel.org, dri-devel@lists.freedesktop.org,
linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org,
sumit.semwal@linaro.org, christian.koenig@amd.com,
pbonzini@redhat.com, seanjc@google.com,
alex.williamson@redhat.com, jgg@nvidia.com,
vivek.kasireddy@intel.com, dan.j.williams@intel.com
Cc: yilun.xu@intel.com, linux-coco@lists.linux.dev,
linux-kernel@vger.kernel.org, lukas@wunner.de,
yan.y.zhao@intel.com, daniel.vetter@ffwll.ch, leon@kernel.org,
baolu.lu@linux.intel.com, zhenzhong.duan@intel.com,
tao1.su@intel.com
Subject: Re: [RFC PATCH 04/12] vfio/pci: Allow MMIO regions to be exported through dma-buf
Date: Wed, 6 May 2026 12:35:42 +1000 [thread overview]
Message-ID: <c0b160f8-2930-4158-9e50-b4cc4209e2ca@amd.com> (raw)
In-Reply-To: <20250107142719.179636-5-yilun.xu@linux.intel.com>
Hi!
Let's reignite this topic.
I've been using these patches plus QEMU-side hacks for 6+ months, and it has been fine until I got a device where the MSIX table sits in the middle of a BAR marked as TEE in the TDISP interface report. And there is no trusted MSIX yet.
Every time QEMU mmaps a BAR, I request a dmabuf fd from VFIO in QEMU. Since mapping an entire MSIX BAR is allowed by default, VFIORegion::nr_mmaps == 1 and the mmap covers the entire BAR.
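For reference, this is roughly the call QEMU makes per region, using the uAPI from the patch quoted below (a simplified sketch: device_fd is the VFIO device fd, BAR 4 as in the flatview further down, error handling omitted):

    /* needs <sys/ioctl.h>, <fcntl.h> and <linux/vfio.h> */
    struct vfio_device_feature *feat;
    struct vfio_device_feature_dma_buf *get;
    char buf[sizeof(*feat) + sizeof(*get) +
             sizeof(struct vfio_region_dma_range)]
            __attribute__((aligned(8)));

    feat = (void *)buf;
    feat->argsz = sizeof(buf);
    feat->flags = VFIO_DEVICE_FEATURE_GET | VFIO_DEVICE_FEATURE_DMA_BUF;
    get = (void *)feat->data;
    get->open_flags = O_CLOEXEC;
    get->nr_ranges = 1;
    get->dma_ranges[0].region_index = 4; /* BAR 4, as in the flatview below */
    get->dma_ranges[0].offset = 0;       /* offset == length == 0 ... */
    get->dma_ranges[0].length = 0;       /* ... selects the whole BAR */

    int dmabuf_fd = ioctl(device_fd, VFIO_DEVICE_FEATURE, feat);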
Problem: the KVM memslot size does not match the dmabuf fd size.
How: QEMU emulates the MSIX table and PBA by adding emulated MemoryRegions on top of the mapped BARs. In the QEMU memory flatview this splits the BAR into 2 or 3 sections (2 if MSIX is at the start/end, 3 if it is in the middle). QEMU then tries to register 1 or 2 KVM memory slots for the regions outside of MSIX, which fails in kvm_vfio_dmabuf_bind() as those regions are smaller than the exported dmabuf fd (which covers the entire BAR == 32KB).
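For the device in question the BAR is 32KB == 0x8000 and the split is 0x3000 + 0x4000 (msix) + 0x1000 (see the flatview below), so the non-MSIX slots are 12KB/4KB while the dmabuf fd stays 32KB. The bind then fails on what is essentially this check (paraphrasing patch 06 of this series, not the exact code):

    /* sketch of the size check that trips, paraphrased */
    if (slot->npages << PAGE_SHIFT != dmabuf->size)
            return -EINVAL;  /* 0x3000 (or 0x1000) != 0x8000 */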
Solution1: use the QEMU x-msix-relocation hack to move MSIX elsewhere: to the end of some BAR (doubles the BAR size, so problematic for huge BARs like 512GB+), or to another BAR (one may not be available, as three 64-bit BARs is the limit).
Solution2: modify the VFIO dmabuf logic to allow multiple KVM memory slots per dmabuf. Currently there is one kvm_vfio_dmabuf per dma_buf, and kvm_memory_slot::dmabuf_attach carries no offset into the dmabuf.
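A rough sketch of what I mean (the real kvm_vfio_dmabuf is in patch 06/10; the names here are illustrative only):

    /*
     * Sketch: let each memslot bind a slice of one shared
     * kvm_vfio_dmabuf instead of requiring slot size == dmabuf size.
     */
    struct kvm_dmabuf_slice {
            struct kvm_vfio_dmabuf *vdmabuf; /* one per dma_buf, refcounted */
            u64 offset;                      /* slot's offset into the dmabuf */
            u64 size;                        /* offset + size <= dmabuf size */
    };

kvm_vfio_dmabuf_bind() would then validate offset/size against the dmabuf instead of insisting on an exact size match.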
Solution3: hack QEMU to split an MSIX-containing BAR in vfio_pci_fixup_msix_region(). Here is what it does:
0000380004000000-0000380004002fff (prio 0, ramd): 0000:41:04.0 BAR 4 mmaps[0] KVM
0000380004003000-0000380004006fff (prio 0, i/o): msix-table
0000380004007000-0000380004007fff (prio 0, ramd): 0000:41:04.0 BAR 4 mmaps[1] KVM
The problem now is that the TDI report must describe the same split of the BAR: the VM is going to validate TEE (== trusted) MMIO ranges, and these have to match the QEMU view. This is harder than it sounds, as the size of the MSIX table in bytes is not specified anywhere except the report.
Solution4: the above, plus modify QEMU to check that the TDI report does not report MSIX+PBA where QEMU emulates them. The problem is that when QEMU adds the emulated MRs, there is no report yet (the report is created later on, some TDISP witchery). So when the device is accepted, QEMU could reinitialize those emulated msix and pba MRs to match the report exactly, so that the flatview generates correct KVM memory regions and KVM can then work with the dmabuf fine.
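Something like this (a sketch only: vfio_pci_get_tdi_report() and the TdiReport fields are made up, the memory API calls are the stock QEMU ones):

    /* Sketch: after the device is accepted, move/resize the emulated
     * MSIX table and PBA MRs to exactly match the TDI report so the
     * flatview - and therefore the KVM memslots - line up with the
     * dmabuf. The report accessor and its fields are hypothetical. */
    static void vfio_pci_msix_sync_with_report(VFIOPCIDevice *vdev)
    {
        TdiReport *rep = vfio_pci_get_tdi_report(vdev); /* hypothetical */

        memory_region_transaction_begin();
        memory_region_set_address(&vdev->pdev.msix_table_mmio,
                                  rep->msix_table_offset);
        memory_region_set_size(&vdev->pdev.msix_table_mmio,
                               rep->msix_table_size);
        memory_region_set_address(&vdev->pdev.msix_pba_mmio,
                                  rep->msix_pba_offset);
        memory_region_set_size(&vdev->pdev.msix_pba_mmio,
                               rep->msix_pba_size);
        memory_region_transaction_commit();
    }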
Any thoughts? What is acceptable for everybody? Thanks,
On 8/1/25 01:27, Xu Yilun wrote:
> From: Vivek Kasireddy <vivek.kasireddy@intel.com>
>
> This is a reduced version of Vivek's series [1]. Removed the
> dma_buf_ops.attach/map/unmap_dma_buf/mmap() ops as they are not necessary
> in this series, and also because of the WIP p2p dma mapping open issues [2].
> Just focus on the private MMIO get_pfn() function at this early stage.
>
> From Jason Gunthorpe:
> "dma-buf has become a way to safely acquire a handle to non-struct page
> memory that can still have lifetime controlled by the exporter. Notably
> RDMA can now import dma-buf FDs and build them into MRs which allows for
> PCI P2P operations. Extend this to allow vfio-pci to export MMIO memory
> from PCI device BARs.
>
> The patch design loosely follows the pattern in commit
> db1a8dd916aa ("habanalabs: add support for dma-buf exporter") except this
> does not support pinning.
>
> Instead, this implements what, in the past, we've called a revocable
> attachment using move. In normal situations the attachment is pinned, as a
> BAR does not change physical address. However when the VFIO device is
> closed, or a PCI reset is issued, access to the MMIO memory is revoked.
>
> Revoked means that move occurs, but an attempt to immediately re-map the
> memory will fail. In the reset case a future move will be triggered when
> MMIO access returns. As both close and reset are under userspace control
> it is expected that userspace will suspend use of the dma-buf before doing
> these operations, the revoke is purely for kernel self-defense against a
> hostile userspace."
>
> [1] https://lore.kernel.org/kvm/20240624065552.1572580-4-vivek.kasireddy@intel.com/
> [2] https://lore.kernel.org/all/IA0PR11MB7185FDD56CFDD0A2B8D21468F83B2@IA0PR11MB7185.namprd11.prod.outlook.com/
>
> Original-patch-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
> Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
> ---
> drivers/vfio/pci/Makefile | 1 +
> drivers/vfio/pci/dma_buf.c | 223 +++++++++++++++++++++++++++++
> drivers/vfio/pci/vfio_pci_config.c | 22 ++-
> drivers/vfio/pci/vfio_pci_core.c | 20 ++-
> drivers/vfio/pci/vfio_pci_priv.h | 25 ++++
> include/linux/vfio_pci_core.h | 1 +
> include/uapi/linux/vfio.h | 29 ++++
> 7 files changed, 316 insertions(+), 5 deletions(-)
> create mode 100644 drivers/vfio/pci/dma_buf.c
>
> diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
> index cf00c0a7e55c..0cfdc9ede82f 100644
> --- a/drivers/vfio/pci/Makefile
> +++ b/drivers/vfio/pci/Makefile
> @@ -2,6 +2,7 @@
>
> vfio-pci-core-y := vfio_pci_core.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
> vfio-pci-core-$(CONFIG_VFIO_PCI_ZDEV_KVM) += vfio_pci_zdev.o
> +vfio-pci-core-$(CONFIG_DMA_SHARED_BUFFER) += dma_buf.o
> obj-$(CONFIG_VFIO_PCI_CORE) += vfio-pci-core.o
>
> vfio-pci-y := vfio_pci.o
> diff --git a/drivers/vfio/pci/dma_buf.c b/drivers/vfio/pci/dma_buf.c
> new file mode 100644
> index 000000000000..1d5f46744922
> --- /dev/null
> +++ b/drivers/vfio/pci/dma_buf.c
> @@ -0,0 +1,223 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES.
> + */
> +#include <linux/dma-buf.h>
> +#include <linux/dma-resv.h>
> +
> +#include "vfio_pci_priv.h"
> +
> +MODULE_IMPORT_NS("DMA_BUF");
> +
> +struct vfio_pci_dma_buf {
> + struct dma_buf *dmabuf;
> + struct vfio_pci_core_device *vdev;
> + struct list_head dmabufs_elm;
> + unsigned int nr_ranges;
> + struct vfio_region_dma_range *dma_ranges;
> + bool revoked;
> +};
> +
> +static void vfio_pci_dma_buf_unpin(struct dma_buf_attachment *attachment)
> +{
> +}
> +
> +static int vfio_pci_dma_buf_pin(struct dma_buf_attachment *attachment)
> +{
> + /*
> + * Uses the dynamic interface but must always allow for
> + * dma_buf_move_notify() to do revoke
> + */
> + return -EINVAL;
> +}
> +
> +static int vfio_pci_dma_buf_get_pfn(struct dma_buf_attachment *attachment,
> + pgoff_t pgoff, u64 *pfn, int *max_order)
> +{
> + /* TODO */
> + return -EOPNOTSUPP;
> +}
> +
> +static void vfio_pci_dma_buf_release(struct dma_buf *dmabuf)
> +{
> + struct vfio_pci_dma_buf *priv = dmabuf->priv;
> +
> + /*
> + * Either this or vfio_pci_dma_buf_cleanup() will remove from the list.
> + * The refcount prevents both.
> + */
> + if (priv->vdev) {
> + down_write(&priv->vdev->memory_lock);
> + list_del_init(&priv->dmabufs_elm);
> + up_write(&priv->vdev->memory_lock);
> + vfio_device_put_registration(&priv->vdev->vdev);
> + }
> + kfree(priv);
> +}
> +
> +static const struct dma_buf_ops vfio_pci_dmabuf_ops = {
> + .pin = vfio_pci_dma_buf_pin,
> + .unpin = vfio_pci_dma_buf_unpin,
> + .get_pfn = vfio_pci_dma_buf_get_pfn,
> + .release = vfio_pci_dma_buf_release,
> +};
> +
> +static int check_dma_ranges(struct vfio_pci_dma_buf *priv, u64 *dmabuf_size)
> +{
> + struct vfio_region_dma_range *dma_ranges = priv->dma_ranges;
> + struct pci_dev *pdev = priv->vdev->pdev;
> + resource_size_t bar_size;
> + int i;
> +
> + for (i = 0; i < priv->nr_ranges; i++) {
> + /*
> + * For PCI the region_index is the BAR number like
> + * everything else.
> + */
> + if (dma_ranges[i].region_index >= VFIO_PCI_ROM_REGION_INDEX)
> + return -EINVAL;
> +
> + bar_size = pci_resource_len(pdev, dma_ranges[i].region_index);
> + if (!bar_size)
> + return -EINVAL;
> +
> + if (!dma_ranges[i].offset && !dma_ranges[i].length)
> + dma_ranges[i].length = bar_size;
> +
> + if (!IS_ALIGNED(dma_ranges[i].offset, PAGE_SIZE) ||
> + !IS_ALIGNED(dma_ranges[i].length, PAGE_SIZE) ||
> + dma_ranges[i].length > bar_size ||
> + dma_ranges[i].offset >= bar_size ||
> + dma_ranges[i].offset + dma_ranges[i].length > bar_size)
> + return -EINVAL;
> +
> + *dmabuf_size += dma_ranges[i].length;
> + }
> +
> + return 0;
> +}
> +
> +int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags,
> + struct vfio_device_feature_dma_buf __user *arg,
> + size_t argsz)
> +{
> + struct vfio_device_feature_dma_buf get_dma_buf;
> + struct vfio_region_dma_range *dma_ranges;
> + DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
> + struct vfio_pci_dma_buf *priv;
> + u64 dmabuf_size = 0;
> + int ret;
> +
> + ret = vfio_check_feature(flags, argsz, VFIO_DEVICE_FEATURE_GET,
> + sizeof(get_dma_buf));
> + if (ret != 1)
> + return ret;
> +
> + if (copy_from_user(&get_dma_buf, arg, sizeof(get_dma_buf)))
> + return -EFAULT;
> +
> + dma_ranges = memdup_array_user(&arg->dma_ranges,
> + get_dma_buf.nr_ranges,
> + sizeof(*dma_ranges));
> + if (IS_ERR(dma_ranges))
> + return PTR_ERR(dma_ranges);
> +
> + priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> + if (!priv) {
> + kfree(dma_ranges);
> + return -ENOMEM;
> + }
> +
> + priv->vdev = vdev;
> + priv->nr_ranges = get_dma_buf.nr_ranges;
> + priv->dma_ranges = dma_ranges;
> +
> + ret = check_dma_ranges(priv, &dmabuf_size);
> + if (ret)
> + goto err_free_priv;
> +
> + if (!vfio_device_try_get_registration(&vdev->vdev)) {
> + ret = -ENODEV;
> + goto err_free_priv;
> + }
> +
> + exp_info.ops = &vfio_pci_dmabuf_ops;
> + exp_info.size = dmabuf_size;
> + exp_info.flags = get_dma_buf.open_flags;
> + exp_info.priv = priv;
> +
> + priv->dmabuf = dma_buf_export(&exp_info);
> + if (IS_ERR(priv->dmabuf)) {
> + ret = PTR_ERR(priv->dmabuf);
> + goto err_dev_put;
> + }
> +
> + /* dma_buf_put() now frees priv */
> + INIT_LIST_HEAD(&priv->dmabufs_elm);
> + down_write(&vdev->memory_lock);
> + dma_resv_lock(priv->dmabuf->resv, NULL);
> + priv->revoked = !__vfio_pci_memory_enabled(vdev);
> + list_add_tail(&priv->dmabufs_elm, &vdev->dmabufs);
> + dma_resv_unlock(priv->dmabuf->resv);
> + up_write(&vdev->memory_lock);
> +
> + /*
> + * dma_buf_fd() consumes the reference, when the file closes the dmabuf
> + * will be released.
> + */
> + return dma_buf_fd(priv->dmabuf, get_dma_buf.open_flags);
> +
> +err_dev_put:
> + vfio_device_put_registration(&vdev->vdev);
> +err_free_priv:
> + kfree(dma_ranges);
> + kfree(priv);
> + return ret;
> +}
> +
> +void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked)
> +{
> + struct vfio_pci_dma_buf *priv;
> + struct vfio_pci_dma_buf *tmp;
> +
> + lockdep_assert_held_write(&vdev->memory_lock);
> +
> + list_for_each_entry_safe(priv, tmp, &vdev->dmabufs, dmabufs_elm) {
> + /*
> + * Returns true if a reference was successfully obtained.
> + * The caller must interlock with the dmabuf's release
> + * function in some way, such as RCU, to ensure that this
> + * is not called on freed memory.
> + */
> + if (!get_file_rcu(&priv->dmabuf->file))
> + continue;
> +
> + if (priv->revoked != revoked) {
> + dma_resv_lock(priv->dmabuf->resv, NULL);
> + priv->revoked = revoked;
> + dma_buf_move_notify(priv->dmabuf);
> + dma_resv_unlock(priv->dmabuf->resv);
> + }
> + dma_buf_put(priv->dmabuf);
> + }
> +}
> +
> +void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev)
> +{
> + struct vfio_pci_dma_buf *priv;
> + struct vfio_pci_dma_buf *tmp;
> +
> + down_write(&vdev->memory_lock);
> + list_for_each_entry_safe(priv, tmp, &vdev->dmabufs, dmabufs_elm) {
> + if (!get_file_rcu(&priv->dmabuf->file))
> + continue;
> + dma_resv_lock(priv->dmabuf->resv, NULL);
> + list_del_init(&priv->dmabufs_elm);
> + priv->vdev = NULL;
> + priv->revoked = true;
> + dma_buf_move_notify(priv->dmabuf);
> + dma_resv_unlock(priv->dmabuf->resv);
> + vfio_device_put_registration(&vdev->vdev);
> + dma_buf_put(priv->dmabuf);
> + }
> + up_write(&vdev->memory_lock);
> +}
> diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
> index ea2745c1ac5e..5cc200e15edc 100644
> --- a/drivers/vfio/pci/vfio_pci_config.c
> +++ b/drivers/vfio/pci/vfio_pci_config.c
> @@ -589,10 +589,12 @@ static int vfio_basic_config_write(struct vfio_pci_core_device *vdev, int pos,
> virt_mem = !!(le16_to_cpu(*virt_cmd) & PCI_COMMAND_MEMORY);
> new_mem = !!(new_cmd & PCI_COMMAND_MEMORY);
>
> - if (!new_mem)
> + if (!new_mem) {
> vfio_pci_zap_and_down_write_memory_lock(vdev);
> - else
> + vfio_pci_dma_buf_move(vdev, true);
> + } else {
> down_write(&vdev->memory_lock);
> + }
>
> /*
> * If the user is writing mem/io enable (new_mem/io) and we
> @@ -627,6 +629,8 @@ static int vfio_basic_config_write(struct vfio_pci_core_device *vdev, int pos,
> *virt_cmd &= cpu_to_le16(~mask);
> *virt_cmd |= cpu_to_le16(new_cmd & mask);
>
> + if (__vfio_pci_memory_enabled(vdev))
> + vfio_pci_dma_buf_move(vdev, false);
> up_write(&vdev->memory_lock);
> }
>
> @@ -707,12 +711,16 @@ static int __init init_pci_cap_basic_perm(struct perm_bits *perm)
> static void vfio_lock_and_set_power_state(struct vfio_pci_core_device *vdev,
> pci_power_t state)
> {
> - if (state >= PCI_D3hot)
> + if (state >= PCI_D3hot) {
> vfio_pci_zap_and_down_write_memory_lock(vdev);
> - else
> + vfio_pci_dma_buf_move(vdev, true);
> + } else {
> down_write(&vdev->memory_lock);
> + }
>
> vfio_pci_set_power_state(vdev, state);
> + if (__vfio_pci_memory_enabled(vdev))
> + vfio_pci_dma_buf_move(vdev, false);
> up_write(&vdev->memory_lock);
> }
>
> @@ -900,7 +908,10 @@ static int vfio_exp_config_write(struct vfio_pci_core_device *vdev, int pos,
>
> if (!ret && (cap & PCI_EXP_DEVCAP_FLR)) {
> vfio_pci_zap_and_down_write_memory_lock(vdev);
> + vfio_pci_dma_buf_move(vdev, true);
> pci_try_reset_function(vdev->pdev);
> + if (__vfio_pci_memory_enabled(vdev))
> + vfio_pci_dma_buf_move(vdev, false);
> up_write(&vdev->memory_lock);
> }
> }
> @@ -982,7 +993,10 @@ static int vfio_af_config_write(struct vfio_pci_core_device *vdev, int pos,
>
> if (!ret && (cap & PCI_AF_CAP_FLR) && (cap & PCI_AF_CAP_TP)) {
> vfio_pci_zap_and_down_write_memory_lock(vdev);
> + vfio_pci_dma_buf_move(vdev, true);
> pci_try_reset_function(vdev->pdev);
> + if (__vfio_pci_memory_enabled(vdev))
> + vfio_pci_dma_buf_move(vdev, false);
> up_write(&vdev->memory_lock);
> }
> }
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index c3269d708411..f69eda5956ad 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -287,6 +287,8 @@ static int vfio_pci_runtime_pm_entry(struct vfio_pci_core_device *vdev,
> * semaphore.
> */
> vfio_pci_zap_and_down_write_memory_lock(vdev);
> + vfio_pci_dma_buf_move(vdev, true);
> +
> if (vdev->pm_runtime_engaged) {
> up_write(&vdev->memory_lock);
> return -EINVAL;
> @@ -370,6 +372,8 @@ static void vfio_pci_runtime_pm_exit(struct vfio_pci_core_device *vdev)
> */
> down_write(&vdev->memory_lock);
> __vfio_pci_runtime_pm_exit(vdev);
> + if (__vfio_pci_memory_enabled(vdev))
> + vfio_pci_dma_buf_move(vdev, false);
> up_write(&vdev->memory_lock);
> }
>
> @@ -690,6 +694,8 @@ void vfio_pci_core_close_device(struct vfio_device *core_vdev)
> #endif
> vfio_pci_core_disable(vdev);
>
> + vfio_pci_dma_buf_cleanup(vdev);
> +
> mutex_lock(&vdev->igate);
> if (vdev->err_trigger) {
> eventfd_ctx_put(vdev->err_trigger);
> @@ -1234,7 +1240,10 @@ static int vfio_pci_ioctl_reset(struct vfio_pci_core_device *vdev,
> */
> vfio_pci_set_power_state(vdev, PCI_D0);
>
> + vfio_pci_dma_buf_move(vdev, true);
> ret = pci_try_reset_function(vdev->pdev);
> + if (__vfio_pci_memory_enabled(vdev))
> + vfio_pci_dma_buf_move(vdev, false);
> up_write(&vdev->memory_lock);
>
> return ret;
> @@ -1523,6 +1532,8 @@ int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags,
> return vfio_pci_core_pm_exit(vdev, flags, arg, argsz);
> case VFIO_DEVICE_FEATURE_PCI_VF_TOKEN:
> return vfio_pci_core_feature_token(vdev, flags, arg, argsz);
> + case VFIO_DEVICE_FEATURE_DMA_BUF:
> + return vfio_pci_core_feature_dma_buf(vdev, flags, arg, argsz);
> default:
> return -ENOTTY;
> }
> @@ -2098,6 +2109,7 @@ int vfio_pci_core_init_dev(struct vfio_device *core_vdev)
> INIT_LIST_HEAD(&vdev->dummy_resources_list);
> INIT_LIST_HEAD(&vdev->ioeventfds_list);
> INIT_LIST_HEAD(&vdev->sriov_pfs_item);
> + INIT_LIST_HEAD(&vdev->dmabufs);
> init_rwsem(&vdev->memory_lock);
> xa_init(&vdev->ctx);
>
> @@ -2480,11 +2492,17 @@ static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set,
> * cause the PCI config space reset without restoring the original
> * state (saved locally in 'vdev->pm_save').
> */
> - list_for_each_entry(vdev, &dev_set->device_list, vdev.dev_set_list)
> + list_for_each_entry(vdev, &dev_set->device_list, vdev.dev_set_list) {
> + vfio_pci_dma_buf_move(vdev, true);
> vfio_pci_set_power_state(vdev, PCI_D0);
> + }
>
> ret = pci_reset_bus(pdev);
>
> + list_for_each_entry(vdev, &dev_set->device_list, vdev.dev_set_list)
> + if (__vfio_pci_memory_enabled(vdev))
> + vfio_pci_dma_buf_move(vdev, false);
> +
> vdev = list_last_entry(&dev_set->device_list,
> struct vfio_pci_core_device, vdev.dev_set_list);
>
> diff --git a/drivers/vfio/pci/vfio_pci_priv.h b/drivers/vfio/pci/vfio_pci_priv.h
> index 5e4fa69aee16..d27f383f3931 100644
> --- a/drivers/vfio/pci/vfio_pci_priv.h
> +++ b/drivers/vfio/pci/vfio_pci_priv.h
> @@ -101,4 +101,29 @@ static inline bool vfio_pci_is_vga(struct pci_dev *pdev)
> return (pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA;
> }
>
> +#ifdef CONFIG_DMA_SHARED_BUFFER
> +int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags,
> + struct vfio_device_feature_dma_buf __user *arg,
> + size_t argsz);
> +void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev);
> +void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked);
> +#else
> +static inline int
> +vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags,
> + struct vfio_device_feature_dma_buf __user *arg,
> + size_t argsz)
> +{
> + return -ENOTTY;
> +}
> +
> +static inline void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev)
> +{
> +}
> +
> +static inline void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev,
> + bool revoked)
> +{
> +}
> +#endif
> +
> #endif
> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> index fbb472dd99b3..da5d8955ae56 100644
> --- a/include/linux/vfio_pci_core.h
> +++ b/include/linux/vfio_pci_core.h
> @@ -94,6 +94,7 @@ struct vfio_pci_core_device {
> struct vfio_pci_core_device *sriov_pf_core_dev;
> struct notifier_block nb;
> struct rw_semaphore memory_lock;
> + struct list_head dmabufs;
> };
>
> /* Will be exported for vfio pci drivers usage */
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index c8dbf8219c4f..f43dfbde7352 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -1458,6 +1458,35 @@ struct vfio_device_feature_bus_master {
> };
> #define VFIO_DEVICE_FEATURE_BUS_MASTER 10
>
> +/**
> + * Upon VFIO_DEVICE_FEATURE_GET create a dma_buf fd for the
> + * regions selected.
> + *
> + * For struct vfio_device_feature_dma_buf, open_flags are the typical
> + * flags passed to open(2), eg O_RDWR, O_CLOEXEC, etc. nr_ranges is the total
> + * number of dma_ranges that comprise the dmabuf.
> + *
> + * For struct vfio_region_dma_range, region_index/offset/length specify a slice
> + * of the region to create the dmabuf from, if both offset & length are 0 then
> + * the whole region is used.
> + *
> + * Return: The fd number on success, -1 and errno is set on failure.
> + */
> +struct vfio_region_dma_range {
> + __u32 region_index;
> + __u32 __pad;
> + __u64 offset;
> + __u64 length;
> +};
> +
> +struct vfio_device_feature_dma_buf {
> + __u32 open_flags;
> + __u32 nr_ranges;
> + struct vfio_region_dma_range dma_ranges[];
> +};
> +
> +#define VFIO_DEVICE_FEATURE_DMA_BUF 11
> +
> /* -------- API for Type1 VFIO IOMMU -------- */
>
> /**
--
Alexey