linux-coco.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: "Christian König" <christian.koenig@amd.com>,
	"Xu Yilun" <yilun.xu@linux.intel.com>,
	"Christoph Hellwig" <hch@lst.de>,
	"Leon Romanovsky" <leonro@nvidia.com>,
	kvm@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org,
	sumit.semwal@linaro.org, pbonzini@redhat.com, seanjc@google.com,
	alex.williamson@redhat.com, vivek.kasireddy@intel.com,
	dan.j.williams@intel.com, aik@amd.com, yilun.xu@intel.com,
	linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org,
	lukas@wunner.de, yan.y.zhao@intel.com, leon@kernel.org,
	baolu.lu@linux.intel.com, zhenzhong.duan@intel.com,
	tao1.su@intel.com
Subject: Re: [RFC PATCH 01/12] dma-buf: Introduce dma_buf_get_pfn_unlocked() kAPI
Date: Tue, 21 Jan 2025 13:36:33 -0400	[thread overview]
Message-ID: <20250121173633.GU5556@nvidia.com> (raw)
In-Reply-To: <Z4_HNA4QQbIsK8D9@phenom.ffwll.local>

On Tue, Jan 21, 2025 at 05:11:32PM +0100, Simona Vetter wrote:
> On Mon, Jan 20, 2025 at 03:48:04PM -0400, Jason Gunthorpe wrote:
> > On Mon, Jan 20, 2025 at 07:50:23PM +0100, Simona Vetter wrote:
> > > On Mon, Jan 20, 2025 at 01:59:01PM -0400, Jason Gunthorpe wrote:
> > > > On Mon, Jan 20, 2025 at 01:14:12PM +0100, Christian König wrote:
> > > > What is going wrong with your email? You replied to Simona, but
> > > > Simona Vetter <simona.vetter@ffwll.ch> is dropped from the To/CC
> > > > list??? I added the address back, but seems like a weird thing to
> > > > happen.
> > > 
> > > Might also be funny mailing list stuff, depending how you get these. I
> > > read mails over lore and pretty much ignore cc (unless it's not also on
> > > any list, since those tend to be security issues) because I get cc'ed on
> > > way too much stuff for that to be a useful signal.
> > 
> > Oh I see, you are sending a Mail-followup-to header that excludes your
> > address, so you don't get any emails at all.. My mutt is dropping you
> > as well.
> > 
> > > Yeah I'm not worried about cpu mmap locking semantics. drm/ttm is a pretty
> > > clear example that you can implement dma-buf mmap with the rules we have,
> > > except the unmap_mapping_range might need a bit fudging with a separate
> > > address_space.
> > 
> > From my perspective the mmap thing is a bit of a side/DRM-only thing
> > as nothing I'm interested in wants to mmap dmabuf into a VMA.
>
> I guess we could just skip mmap on these pfn exporters, but it also means
> a bit more boilerplate. 

I have been assuming that dmabuf mmap remains unchanged, that
exporters will continue to implement that mmap() callback as today.

My main interest has been what data structure is produced in the
attach APIs.

Eg today we have a struct dma_buf_attachment that returns a sg_table.

I'm expecting some kind of new data structure, lets call it "physical
list" that is some efficient coding of meta/addr/len tuples that works
well with the new DMA API. Matthew has been calling this thing phyr..

So, I imagine, struct dma_buf_attachment gaining an optional
feature negotiation and then we have in dma_buf_attachment:

        union {
              struct sg_table *sgt;
	      struct physical_list *phyr;
	};

That's basicaly it, an alternative to scatterlist that has a clean
architecture.

Now, if you are asking if the current dmabuf mmap callback can be
improved with the above? Maybe? phyr should have the neccessary
information inside it to populate a VMA - eventually even fully
correctly with all the right cachable/encrypted/forbidden/etc flags.

So, you could imagine that exporters could just have one routine to
generate the phyr list and that goes into the attachment, goes into
some common code to fill VMA PTEs, and some other common code that
will convert it into the DMABUF scatterlist. If performance is not a
concern with these data structure conversions it could be an appealing
simplification.

And yes, I could imagine the meta information being descriptive enough
to support the private interconnect cases, the common code could
detect private meta information and just cleanly fail.

> At least the device mapping / dma_buf_attachment
> side should be doable with just the pfn and the new dma-api?

Yes, that would be my first goal post. Figure out some meta
information and a container data structure that allows struct
page-less P2P mapping through the new DMA API.

> > I'm hoping we can get to something where we describe not just how the
> > pfns should be DMA mapped, but also can describe how they should be
> > CPU mapped. For instance that this PFN space is always mapped
> > uncachable, in CPU and in IOMMU.
> 
> I was pondering whether dma_mmap and friends would be a good place to
> prototype this and go for a fully generic implementation. But then even
> those have _wc/_uncached variants.

Given that the inability to correctly DMA map P2P MMIO without struct
page is a current pain point and current source of hacks in dmabuf
exporters, I wanted to make resolving that a priority.

However, if you mean what I described above for "fully generic [dmabuf
mmap] implementation", then we'd have the phyr datastructure as a
dependency to attempt that work.

phyr, and particularly the meta information, has a number of
stakeholders. I was thinking of going first with rdma's memory
registration flow because we are now pretty close to being able to do
such a big change, and it can demonstrate most of the requirements.

But that doesn't mean mmap couldn't go concurrently on the same agreed
datastructure if people are interested.

> > We also have current bugs in the iommu/vfio side where we are fudging
> > CC stuff, like assuming CPU memory is encrypted (not always true) and
> > that MMIO is non-encrypted (not always true)
> 
> tbf CC pte flags I just don't grok at all. I've once tried to understand
> what current exporters and gpu drivers do and just gave up. But that's
> also a bit why I'm worried here because it's an enigma to me.

For CC, inside the secure world, is some information if each PFN
inside the VM is 'encrypted' or not. Any VM PTE (including the IOPTEs)
pointing at the PFN must match the secure world's view of
'encrypted'. The VM can ask the secure world to change its view at
runtime.

The way CC has been bolted on to the kernel so far laregly hides this
from drivers, so it is troubled to tell in driver code if the PFN you
have is 'encrypted' or not. Right now the general rule (that is not
always true) is that struct page CPU memory is encrypted and
everything else is decrypted.

So right now, you can mostly ignore it and the above assumption
largely happens for you transparently.

However, soon we will have encrypted P2P MMIO which will stress this
hiding strategy.

> > > I thought iommuv2 (or whatever linux calls these) has full fault support
> > > and could support current move semantics. But yeah for iommu without
> > > fault support we need some kind of pin or a newly formalized revoke model.
> > 
> > No, this is HW dependent, including PCI device, and I'm aware of no HW
> > that fully implements this in a way that could be useful to implement
> > arbitary move semantics for VFIO..
> 
> Hm I thought we've had at least prototypes floating around of device fault
> repair, but I guess that only works with ATS/pasid stuff and not general
> iommu traffic from devices. Definitely needs some device cooperation since
> the timeouts of a full fault are almost endless.

Yes, exactly. What all real devices I'm aware have done is make a
subset of their traffic work with ATS and PRI, but not all their
traffic. Without *all* traffic you can't make any generic assumption
in the iommu that a transient non-present won't be fatal to the
device.

Stuff like dmabuf move semantics rely on transient non-present being
non-disruptive...

Jason

  reply	other threads:[~2025-01-21 17:36 UTC|newest]

Thread overview: 134+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-07 14:27 [RFC PATCH 00/12] Private MMIO support for private assigned dev Xu Yilun
2025-01-07 14:27 ` [RFC PATCH 01/12] dma-buf: Introduce dma_buf_get_pfn_unlocked() kAPI Xu Yilun
2025-01-08  8:01   ` Christian König
2025-01-08 13:23     ` Jason Gunthorpe
2025-01-08 13:44       ` Christian König
2025-01-08 14:58         ` Jason Gunthorpe
2025-01-08 15:25           ` Christian König
2025-01-08 16:22             ` Jason Gunthorpe
2025-01-08 17:56               ` Xu Yilun
2025-01-10 19:24                 ` Simona Vetter
2025-01-10 20:16                   ` Jason Gunthorpe
2025-01-08 18:44               ` Simona Vetter
2025-01-08 19:22                 ` Xu Yilun
     [not found]                   ` <58e97916-e6fd-41ef-84b4-bbf53ed0e8e4@amd.com>
2025-01-08 23:06                     ` Xu Yilun
2025-01-10 19:34                       ` Simona Vetter
2025-01-10 20:38                         ` Jason Gunthorpe
2025-01-12 22:10                           ` Xu Yilun
2025-01-14 14:44                           ` Simona Vetter
2025-01-14 17:31                             ` Jason Gunthorpe
2025-01-15  8:55                               ` Simona Vetter
2025-01-15  9:32                                 ` Christoph Hellwig
2025-01-15 13:34                                   ` Jason Gunthorpe
2025-01-16  5:33                                     ` Christoph Hellwig
2024-06-19 23:39                                       ` Xu Yilun
2025-01-16 13:28                                       ` Jason Gunthorpe
     [not found]                                 ` <420bd2ea-d87c-4f01-883e-a7a5cf1635fe@amd.com>
2025-01-17 14:42                                   ` Simona Vetter
2025-01-20 12:14                                     ` Christian König
2025-01-20 17:59                                       ` Jason Gunthorpe
2025-01-20 18:50                                         ` Simona Vetter
2025-01-20 19:48                                           ` Jason Gunthorpe
2025-01-21 16:11                                             ` Simona Vetter
2025-01-21 17:36                                               ` Jason Gunthorpe [this message]
2025-01-22 11:04                                                 ` Simona Vetter
2025-01-22 13:28                                                   ` Jason Gunthorpe
2025-01-22 13:29                                                   ` Christian König
2025-01-22 14:37                                                     ` Jason Gunthorpe
2025-01-22 14:59                                                       ` Christian König
2025-01-23 13:59                                                         ` Jason Gunthorpe
     [not found]                                                           ` <9a36fba5-2dee-46fd-9f51-47c5f0ffc1d4@amd.com>
2025-01-23 14:35                                                             ` Christian König
2025-01-23 15:02                                                               ` Jason Gunthorpe
     [not found]                                                                 ` <89f46c7f-a585-44e2-963d-bf00bf09b493@amd.com>
2025-01-23 16:08                                                                   ` Jason Gunthorpe
2025-01-09  8:09                     ` Christian König
2025-01-10 20:54                       ` Jason Gunthorpe
2025-01-15  9:38                         ` Christian König
2025-01-15 13:38                           ` Jason Gunthorpe
     [not found]                             ` <f6c2524f-5ef5-4c2c-a464-a7b195e0bf6c@amd.com>
2025-01-15 13:46                               ` Christian König
2025-01-15 14:14                                 ` Jason Gunthorpe
     [not found]                                   ` <c86cfee1-063a-4972-a343-ea0eff2141c9@amd.com>
2025-01-15 14:30                                     ` Christian König
2025-01-15 15:10                                       ` Jason Gunthorpe
     [not found]                                         ` <6f7a14aa-f607-45f9-9e15-759e26079dec@amd.com>
2025-01-15 17:09                                           ` Jason Gunthorpe
     [not found]                                             ` <5f588dac-d3e2-445d-9389-067b875412dc@amd.com>
2024-06-20 22:02                                               ` Xu Yilun
2025-01-20 13:44                                                 ` Christian König
2025-01-22  4:16                                                   ` Xu Yilun
2025-01-16 16:07                                               ` Jason Gunthorpe
2025-01-17 14:37                                                 ` Simona Vetter
     [not found]               ` <0e7f92bd-7da3-4328-9081-0957b3d155ca@amd.com>
2025-01-09  9:28                 ` Leon Romanovsky
2025-01-07 14:27 ` [RFC PATCH 02/12] vfio: Export vfio device get and put registration helpers Xu Yilun
2025-01-07 14:27 ` [RFC PATCH 03/12] vfio/pci: Share the core device pointer while invoking feature functions Xu Yilun
2025-01-07 14:27 ` [RFC PATCH 04/12] vfio/pci: Allow MMIO regions to be exported through dma-buf Xu Yilun
2025-01-07 14:27 ` [RFC PATCH 05/12] vfio/pci: Support get_pfn() callback for dma-buf Xu Yilun
2025-01-07 14:27 ` [RFC PATCH 06/12] KVM: Support vfio_dmabuf backed MMIO region Xu Yilun
2025-01-07 14:27 ` [RFC PATCH 07/12] KVM: x86/mmu: Handle page fault for vfio_dmabuf backed MMIO Xu Yilun
2025-01-07 14:27 ` [RFC PATCH 08/12] vfio/pci: Create host unaccessible dma-buf for private device Xu Yilun
2025-01-08 13:30   ` Jason Gunthorpe
2025-01-08 16:57     ` Xu Yilun
2025-01-09 14:40       ` Jason Gunthorpe
2025-01-09 16:40         ` Xu Yilun
2025-01-10 13:31           ` Jason Gunthorpe
2025-01-11  3:48             ` Xu Yilun
2025-01-13 16:49               ` Jason Gunthorpe
2024-06-17 23:28                 ` Xu Yilun
2025-01-14 13:35                   ` Jason Gunthorpe
2025-01-15 12:57                     ` Alexey Kardashevskiy
2025-01-15 13:01                       ` Jason Gunthorpe
2025-01-17  1:57                         ` Baolu Lu
2025-01-17 13:25                           ` Jason Gunthorpe
2024-06-23 19:59                             ` Xu Yilun
2025-01-20 13:25                               ` Jason Gunthorpe
2024-06-24 21:12                                 ` Xu Yilun
2025-01-21 17:43                                   ` Jason Gunthorpe
2025-01-22  4:32                                     ` Xu Yilun
2025-01-22 12:55                                       ` Jason Gunthorpe
2025-01-23  7:41                                         ` Xu Yilun
2025-01-23 13:08                                           ` Jason Gunthorpe
2025-01-20  4:41                             ` Baolu Lu
2025-01-20  9:45                             ` Alexey Kardashevskiy
2025-01-20 13:28                               ` Jason Gunthorpe
2025-03-12  1:37                                 ` Dan Williams
2025-03-17 16:38                                   ` Jason Gunthorpe
2025-01-07 14:27 ` [RFC PATCH 09/12] vfio/pci: Export vfio dma-buf specific info for importers Xu Yilun
2025-01-07 14:27 ` [RFC PATCH 10/12] KVM: vfio_dmabuf: Fetch VFIO specific dma-buf data for sanity check Xu Yilun
2025-01-07 14:27 ` [RFC PATCH 11/12] KVM: x86/mmu: Export kvm_is_mmio_pfn() Xu Yilun
2025-01-07 14:27 ` [RFC PATCH 12/12] KVM: TDX: Implement TDX specific private MMIO map/unmap for SEPT Xu Yilun
2025-04-29  6:48 ` [RFC PATCH 00/12] Private MMIO support for private assigned dev Alexey Kardashevskiy
2025-04-29  7:50   ` Alexey Kardashevskiy
2025-05-09  3:04     ` Alexey Kardashevskiy
2025-05-09 11:12       ` Xu Yilun
2025-05-09 16:28         ` Xu Yilun
2025-05-09 18:43           ` Jason Gunthorpe
2025-05-10  3:47             ` Xu Yilun
2025-05-12  9:30               ` Alexey Kardashevskiy
2025-05-12 14:06                 ` Jason Gunthorpe
2025-05-13 10:03                   ` Zhi Wang
2025-05-14  9:47                     ` Xu Yilun
2025-05-14 20:05                       ` Zhi Wang
2025-05-15 18:02                         ` Xu Yilun
2025-05-15 19:21                           ` Jason Gunthorpe
2025-05-16  6:19                             ` Xu Yilun
2025-05-16 12:49                               ` Jason Gunthorpe
2025-05-17  2:33                                 ` Xu Yilun
2025-05-20 10:57                           ` Alexey Kardashevskiy
2025-05-24  3:33                             ` Xu Yilun
2025-05-15 10:29                     ` Alexey Kardashevskiy
2025-05-15 16:44                       ` Zhi Wang
2025-05-15 16:53                         ` Zhi Wang
2025-05-21 10:41                           ` Alexey Kardashevskiy
2025-05-14  7:02                   ` Xu Yilun
2025-05-14 16:33                     ` Jason Gunthorpe
2025-05-15 16:04                       ` Xu Yilun
2025-05-15 17:56                         ` Jason Gunthorpe
2025-05-16  6:03                           ` Xu Yilun
2025-05-22  3:45                         ` Alexey Kardashevskiy
2025-05-24  3:13                           ` Xu Yilun
2025-05-26  7:18                             ` Alexey Kardashevskiy
2025-05-29 14:41                               ` Xu Yilun
2025-05-29 16:29                                 ` Jason Gunthorpe
2025-05-30 16:07                                   ` Xu Yilun
2025-05-30  2:29                                 ` Alexey Kardashevskiy
2025-05-30 16:23                                   ` Xu Yilun
2025-06-10  4:20                                     ` Alexey Kardashevskiy
2025-06-10  5:19                                       ` Baolu Lu
2025-06-10  6:53                                       ` Xu Yilun
2025-05-14  3:20                 ` Xu Yilun
2025-06-10  4:37                   ` Alexey Kardashevskiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250121173633.GU5556@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=aik@amd.com \
    --cc=alex.williamson@redhat.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=christian.koenig@amd.com \
    --cc=dan.j.williams@intel.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=hch@lst.de \
    --cc=kvm@vger.kernel.org \
    --cc=leon@kernel.org \
    --cc=leonro@nvidia.com \
    --cc=linaro-mm-sig@lists.linaro.org \
    --cc=linux-coco@lists.linux.dev \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=lukas@wunner.de \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=sumit.semwal@linaro.org \
    --cc=tao1.su@intel.com \
    --cc=vivek.kasireddy@intel.com \
    --cc=yan.y.zhao@intel.com \
    --cc=yilun.xu@intel.com \
    --cc=yilun.xu@linux.intel.com \
    --cc=zhenzhong.duan@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).