From: Jason Gunthorpe <jgg@nvidia.com>
To: Matt Evans <mattev@meta.com>
Cc: "Alex Williamson" <alex@shazbot.org>,
"Leon Romanovsky" <leon@kernel.org>,
"Alex Mastro" <amastro@fb.com>,
"Christian König" <christian.koenig@amd.com>,
"Mahmoud Adam" <mngyadam@amazon.de>,
"David Matlack" <dmatlack@google.com>,
"Björn Töpel" <bjorn@kernel.org>,
"Sumit Semwal" <sumit.semwal@linaro.org>,
"Kevin Tian" <kevin.tian@intel.com>,
"Ankit Agrawal" <ankita@nvidia.com>,
"Pranjal Shrivastava" <praan@google.com>,
"Alistair Popple" <apopple@nvidia.com>,
"Vivek Kasireddy" <vivek.kasireddy@intel.com>,
linux-kernel@vger.kernel.org, linux-media@vger.kernel.org,
dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org,
kvm@vger.kernel.org
Subject: Re: [PATCH 3/9] vfio/pci: Add a helper to create a DMABUF for a BAR-map VMA
Date: Tue, 19 May 2026 11:56:47 -0300
Message-ID: <20260519145647.GA3602937@nvidia.com>
In-Reply-To: <52162da4-e1cc-4f90-a95a-218d6089cd71@meta.com>
On Wed, May 06, 2026 at 08:03:08PM +0100, Matt Evans wrote:
> > > > > > + /*
> > > > > > + * The mmap() request's vma->vm_offs might be non-zero, but
> > > > > > + * the DMABUF is created from _offset zero_ of the BAR. The
> > > > > > + * portion between zero and the vm_offs is inaccessible
> > > > > > + * through this VMA, but this approach keeps the
> > > > > > + * /proc/<pid>/maps offset somewhat consistent with the
> > > > > > + * pre-DMABUF code. Size includes the offset portion.
> > > > >
> > > > > I'm not sure I understand this comment?
> > > > >
> > > > > For the old path vm_pgoff for byte 0 of the bar starts at some large
> > > > > offset
> > > > >
> > > > > For the new path vm_pgoff for byte 0 of the first range starts at 0
> > > >
> > > > Glad you asked. :)
> > > >
> > > > This is trying to achieve keeping /proc/<pid>/maps (or similar) somewhat
> > > > as informative as pre-DMABUF BAR mmap, in terms of keeping the VMA
> > > > vm_offs column useful. Before this patch, say you mmap() two slices A
> > > > and B of the same BAR:
> > > >
> > > > struct vfio_region_info bar_region;
> > > >
> > > > vm_a = mmap(0, 0x1000, ..., device_fd, bar_region.offset + 0);
> > > > vm_b = mmap(0, 0x1000, ..., device_fd, bar_region.offset + 0x4000);
> > > >
> > > > ...you'd see something like this in /proc/blah/maps:
> > > >
> > > > fffff4000000-fffff4001000 rw-s 10000000000 00:07 148 /dev/vfio/devices/vfio0
> > > > fffff5000000-fffff5001000 rw-s 10000004000 00:07 148 /dev/vfio/devices/vfio0
>
> Looking at this again, I/we got this backwards and I mixed up two things:
>
> The goal of this patch _is already_ to make sure the VMA's vm_pgoff (whether
> viewed in /proc/<pid>/maps or elsewhere) still matches the mmap()'s offset.
>
> (For a mo, ignore the resource index encoded into the offset. Consider just
> the offset into the BAR itself, inside the VFIO_PCI_OFFSET_MASK. I'll come
> back to the index encoded into the upper bits.)
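
As an aside on the "index encoded into the upper bits": the offsets in the /proc/<pid>/maps example quoted earlier can be split into a region index and an offset within the BAR. The sketch below is only illustrative. The helper names are hypothetical, and the shift of 40 is an assumption meant to mirror VFIO_PCI_OFFSET_SHIFT; check it against the tree you are working on.

```c
#include <stdint.h>

/* Assumption: mirrors VFIO_PCI_OFFSET_SHIFT / VFIO_PCI_OFFSET_MASK. */
#define REGION_OFFSET_SHIFT 40
#define REGION_OFFSET_MASK  ((UINT64_C(1) << REGION_OFFSET_SHIFT) - 1)

/* Hypothetical helper: region (BAR) index held in the upper bits. */
static unsigned int offset_to_region_index(uint64_t off)
{
	return (unsigned int)(off >> REGION_OFFSET_SHIFT);
}

/* Hypothetical helper: byte offset inside the region, index stripped. */
static uint64_t offset_within_region(uint64_t off)
{
	return off & REGION_OFFSET_MASK;
}
```

Under that assumption, the maps offset 10000004000 (hex) splits into region index 1 and an in-BAR offset of 0x4000, which matches the second mmap() in the quoted example.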
>
> > > > then the VMA's vm_offs would need to be thunked back down to 0 (since
> > > > the fault handler then treats vm_b + 0 as the first byte of the DMABUF).
> > > > That works/adds up, but then the vm_offs of both VMAs A & B both have
> > > > offset 0, and it's harder to differentiate in /proc/blah/maps.
> > >
> > > Yes, and that would be correct.
>
> Why? This paragraph was outlining a hypothetical alternative implementation
> that creates the DMABUF the size of the VMA and starting from an offset into
> the BAR based on vm_pgoff, and then compensates by setting vma->vm_pgoff = 0
> so that the fault doesn't re-apply the offset again. That would make byte 0
> of the VMA access correct:

I see, I misunderstood what you were suggesting.

> This patch is supporting that property by instead creating the DMABUF so
> that the VMA's vm_pgoff (which is maintained and the same* as passed from
> mmap()!) indexes the DMABUF so that byte 0 of the VMA accesses the same
> address above in [1]. The DMABUF spans from the start of the BAR so the
> fault handler maths (which indexes the DMABUF by vm_pgoffs) is common for
> all buffers.
>
>   a = mmap(0, 0x10000, ..., device_fd, 0x4000);
>
>   +0           +0x4000
>   +------------v------------------------------------------+
>   |                          BAR                          |
>   |                                                       |
>   +------------^------------------------------------------+
>   .            .
>   .            +--------------------------+
>   .            |           VMA            |
>   .            |    vma->vm_pgoff = 4     |
>   .            +--------------------------+
>   .            .                          .
>   +------------+--------------------------+
>   | invisible  |          DMABUF          |
>   |            |                          |
>   +------------+--------------------------+
>
> Same* externally-observable behaviour as the old mmap().

Sure, but it is a mess.

You should create the dma_buf as the narrow one that covers only the
requested mmap. The VMA's vm_pgoff should be exactly what is passed to mmap.

And then have a simple 'vma_pgoff_adjust' that fixes up the pgoff to
be 0 based for internal operation of the fault handler.

It is nonsense stuff like this:

+ priv->size = (vma->vm_pgoff << PAGE_SHIFT) + req_len;

That is really objectionable, the size should never have anything to
do with a pgoff.

Jason
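
For illustration, the two offset schemes discussed in this message can be modelled in user space. This is a sketch under stated assumptions, not the patch's code: `struct buf_model`, `make_wide()`, `make_narrow()` and `fault_to_bar_offset()` are hypothetical names, 4 KiB pages are assumed, and `vm_pgoff` is taken as BAR-relative (the region index in the upper offset bits is ignored).

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical model of a DMABUF that backs one BAR mmap(). */
struct buf_model {
	uint64_t bar_start;    /* BAR byte offset where the buffer begins */
	uint64_t size;         /* buffer length in bytes */
	uint64_t pgoff_adjust; /* pages subtracted from vm_pgoff on fault */
};

/* Scheme in the patch: buffer starts at BAR offset 0, size includes pgoff. */
static struct buf_model make_wide(uint64_t vm_pgoff, uint64_t req_len)
{
	struct buf_model b = { 0, (vm_pgoff << PAGE_SHIFT) + req_len, 0 };
	return b;
}

/* Scheme suggested in review: buffer covers only the requested range. */
static struct buf_model make_narrow(uint64_t vm_pgoff, uint64_t req_len)
{
	struct buf_model b = { vm_pgoff << PAGE_SHIFT, req_len, vm_pgoff };
	return b;
}

/* BAR byte offset reached by a fault 'fault_off' bytes into the VMA. */
static uint64_t fault_to_bar_offset(const struct buf_model *b,
				    uint64_t vm_pgoff, uint64_t fault_off)
{
	/* Make the index 0-based relative to the start of the buffer. */
	uint64_t buf_off = ((vm_pgoff - b->pgoff_adjust) << PAGE_SHIFT) + fault_off;

	assert(buf_off < b->size); /* the fault must land inside the buffer */
	return b->bar_start + buf_off;
}
```

In this model both schemes resolve a fault to the same BAR offset and leave vm_pgoff as passed to mmap(). They differ only in what the buffer covers: for the mmap() of 0x10000 bytes at offset 0x4000 shown in the diagram, the wide buffer is 0x14000 bytes long while the narrow one is exactly 0x10000.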