Intel-XE Archive on lore.kernel.org
From: Matthew Brost <matthew.brost@intel.com>
To: Leon Romanovsky <leonro@nvidia.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>, <intel-xe@lists.freedesktop.org>,
	<dri-devel@lists.freedesktop.org>, <francois.dugast@intel.com>,
	<thomas.hellstrom@linux.intel.com>,
	<himal.prasad.ghimiray@intel.com>
Subject: Re: [RFC PATCH v3 04/11] drm/pagemap: Use dma-map IOVA alloc, link, and sync API for DRM pagemap
Date: Wed, 28 Jan 2026 13:04:49 -0800	[thread overview]
Message-ID: <aXp58fqfeeDD6YI+@lstrano-desk.jf.intel.com> (raw)
In-Reply-To: <20260128194540.GA10992@unreal>

On Wed, Jan 28, 2026 at 09:45:40PM +0200, Leon Romanovsky wrote:
> On Wed, Jan 28, 2026 at 11:29:23AM -0800, Matthew Brost wrote:
> > On Wed, Jan 28, 2026 at 01:55:31PM -0400, Jason Gunthorpe wrote:
> > > On Wed, Jan 28, 2026 at 09:46:44AM -0800, Matthew Brost wrote:
> > > 
> > > > It is intended to fill holes. The input pages come from the
> > > > migrate_vma_* functions, which can return a sparsely populated array of
> > > > pages for a region (e.g., it scans a 2M range but only finds several of
> > > > the 512 pages eligible for migration). As a result, if (!page) is true
> > > > for many entries.
> > > 
> > > This is migration?? So something is DMA'ing from A -> B - why put
> > > holes in the first place? Can you tightly pack the pages in the IOVA?
> > > 
> > 
> > This could probably be made to work. I think it would take an
> > initial pass to figure out the IOVA size, then tightly pack.
> > 
> > Let me look at this. Probably better too, as installing dummy pages
> > has a non-zero cost; I assume dma_iova_link is a radix tree walk.
> > 
> > > If there is no iommu then the addresses are scattered all over anyhow
> > > so it can't be relying on some dma_addr_t relationship?
> > 
> > Scattered DMA addresses are already handled in the copy code, and so
> > are holes, so that is a non-issue.
> > 
> > > 
> > > You don't have to fully populate the allocated iova, you can link from
> > > A-B and then unlink from A-B even if B is less than the total size
> > > requested.
> > > 
> > > The hmm users have the holes because hmm is dynamically
> > > adding/removing pages as it runs and it can't do anything to pack the
> > > mapping.
> > > 
> > > > > IOVA space? If so, what necessitates those holes? You can have less mapped
> > > > > than IOVA and dma_iova_*() API can handle it.
> > > > 
> > > > I was actually going to ask you about this, so I’m glad you brought it
> > > > up here. Again, this is a hack to avoid holes — the holes are never
> > > > touched by our copy function, but rather skipped, so we just jam in a
> > > > dummy address so the entire IOVA range has valid IOMMU pages.
> > > 
> > > I would say what you are doing is trying to optimize unmap by
> > 
> > Yes, and to keep the code simple.
> > 
> > > unmapping everything in one shot instead of just the mapped areas, and
> > > the WARN_ON is telling you that it isn't allowed to unmap across a
> > > hole.
> > > 
> > > > at the moment I’m not sure whether this warning affects actual
> > > > functionality or if we could just delete it. 
> > > 
> > > It means the iommu page table stopped unmapping when it hit a hole and
> > > there is a bunch of left over maps in the page table that shouldn't be
> > > there. So yes, it is serious and cannot be deleted.
> > >
> > 
> > Cool, this explains the warning.
> >  
> > > This is a possible option to teach things to detect the holes and
> > > ignore them..
> > 
> > Another option — and IMO probably the best one — as it makes potential
> > usages with holes the simplest at the driver level. Let me look at this
> > too.
> 
> It would be ideal if we could code a more general solution. In HMM we
> release pages one by one, and it would be preferable to have a single-shot
> unmap routine instead, similar to NVMe, which releases all IOVA space
> with one call to dma_iova_destroy().
> 
> HMM chain:
> 
> ib_umem_odp_unmap_dma_pages()
>  -> for (...)
>    -> hmm_dma_unmap_pfn()
> 
> After giving more thought to my earlier suggestion to use
> hmm_pfn_to_phys(), I began to wonder why you did not use the
> hmm_dma_*() API instead?
> 

That is ill-suited for high-speed fabrics, but so is our existing
implementation; we’re just in slightly better shape (?). It also seems
ill-suited [1][2][3] for variable page sizes (which are possible with
our API), as well as for the way we currently program device PTEs in
our driver. We also receive PFNs from the migrate_vma_* layer, which
must be mapped as well.

I also believe the hmm_dma_* code predates the DRM code being merged, or
was merged around the same time.

We could work to unify the HMM helpers and make them usable, but that
won’t happen overnight. The HMM layer needs quite a bit of work to be
usable, and then we’d have to propagate everything upward through
DRM/Xe and any new users. Let me play around with this a bit, though,
to get a rough idea of what would need to be done here.

[1] https://elixir.bootlin.com/linux/v6.18.6/source/drivers/infiniband/core/umem_odp.c#L255
[2] https://elixir.bootlin.com/linux/v6.18.6/source/drivers/infiniband/core/umem_odp.c#L193
[3] https://elixir.bootlin.com/linux/v6.18.6/source/drivers/infiniband/core/umem_odp.c#L104 

Also, there is some odd stuff going on here... Why sync after every
mapping [4]? And blindly doing BIDIRECTIONAL [5]...

[4] https://elixir.bootlin.com/linux/v6.18.6/source/mm/hmm.c#L826
[5] https://elixir.bootlin.com/linux/v6.18.6/source/mm/hmm.c#L821

> > 
> > Do you think we need a flag somewhere for 'ignore holes', or can I
> > just blindly skip them?
> 
> It would be better to have something like a dma_iova_with_holes_destroy()
> function call, to make sure we don't hurt the performance of existing
> dma_iova_destroy() users.
>

Yes, I think this is the best route for the time being. Let me look at
this.
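Roughly what I have in mind for a hole-aware teardown, assuming the
caller tracks which pages were actually linked. The mapped[] bitmap and
unlink_range() here are illustrative stand-ins for the sketch, not the
proposed API: the walk unlinks each contiguous mapped run separately, so
the iommu never sees an unmap request that crosses a hole (avoiding the
WARN_ON discussed above).

```c
#include <stdbool.h>
#include <stddef.h>

/* Instrumentation for the sketch: count unlink ops issued. */
static unsigned int unlink_calls;

/* Stand-in for unlinking one contiguous linked run from the IOVA. */
static void unlink_range(size_t first, size_t npages)
{
	(void)first;
	(void)npages;
	unlink_calls++;
}

/*
 * Walk the mapping, skip holes, and unlink each contiguous mapped run
 * with its own call, never crossing a hole in a single unmap.
 */
static void destroy_with_holes(const bool *mapped, size_t npages)
{
	size_t i = 0;

	while (i < npages) {
		size_t start;

		/* Skip the hole. */
		while (i < npages && !mapped[i])
			i++;
		start = i;
		/* Find the end of this mapped run. */
		while (i < npages && mapped[i])
			i++;
		if (i > start)
			unlink_range(start, i - start);
	}
}
```

For the common dense case this degenerates to a single unlink_range()
call, which is why it shouldn't hurt existing dma_iova_destroy() users.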

Matt

> Thanks
> 
> > 
> > Matt
> > 
> > > 
> > > Jason

Thread overview: 30+ messages
2026-01-28  0:48 [RFC PATCH v3 00/11] Use new dma-map IOVA alloc, link, and sync API in GPU SVM and DRM pagemap Matthew Brost
2026-01-28  0:48 ` [RFC PATCH v3 01/11] drm/pagemap: Add helper to access zone_device_data Matthew Brost
2026-01-28 13:53   ` Leon Romanovsky
2026-01-28  0:48 ` [RFC PATCH v3 02/11] drm/gpusvm: Use dma-map IOVA alloc, link, and sync API in GPU SVM Matthew Brost
2026-01-28 14:04   ` Leon Romanovsky
2026-01-28  0:48 ` [RFC PATCH v3 03/11] drm/pagemap: Split drm_pagemap_migrate_map_pages into device / system Matthew Brost
2026-01-28  0:48 ` [RFC PATCH v3 04/11] drm/pagemap: Use dma-map IOVA alloc, link, and sync API for DRM pagemap Matthew Brost
2026-01-28 14:28   ` Leon Romanovsky
2026-01-28 17:46     ` Matthew Brost
     [not found]       ` <20260128175531.GR1641016@ziepe.ca>
2026-01-28 19:29         ` Matthew Brost
2026-01-28 19:45           ` Leon Romanovsky
2026-01-28 21:04             ` Matthew Brost [this message]
2026-01-29 10:14               ` Leon Romanovsky
2026-01-29 18:22                 ` Matthew Brost
2026-01-28  0:48 ` [RFC PATCH v3 05/11] drm/pagemap: Reduce number of IOVA link calls Matthew Brost
2026-01-28  0:48 ` [RFC PATCH v3 06/11] drm/pagemap: Add IOVA interface to DRM pagemap Matthew Brost
     [not found]   ` <20260128151458.GJ1641016@ziepe.ca>
2026-01-28 18:42     ` Matthew Brost
2026-01-28 19:41       ` Matthew Brost
     [not found]       ` <20260128193509.GU1641016@ziepe.ca>
2026-01-28 20:24         ` Matthew Brost
2026-01-29 18:57           ` Jason Gunthorpe
2026-01-29 19:28             ` Matthew Brost
2026-01-29 19:32               ` Jason Gunthorpe
2026-01-28  0:48 ` [RFC PATCH v3 07/11] drm/xe: Stub out DRM pagemap IOVA alloc implementation Matthew Brost
2026-01-28  0:48 ` [RFC PATCH v3 08/11] drm/pagemap: Use device-to-device IOVA alloc, link, and sync API for DRM pagemap Matthew Brost
2026-01-28  0:48 ` [RFC PATCH v3 09/11] drm/xe: Drop BO dma-resv lock during SVM migrate-to-device Matthew Brost
2026-01-28  0:48 ` [RFC PATCH v3 10/11] drm/xe: Implement DRM pagemap IOVA vfuncs Matthew Brost
2026-01-28  0:48 ` [RFC PATCH v3 11/11] drm/gpusvm: Use device-to-device IOVA alloc, link, and sync API in GPU SVM Matthew Brost
2026-01-28  0:59 ` ✗ CI.checkpatch: warning for Use new dma-map IOVA alloc, link, and sync API in GPU SVM and DRM pagemap (rev3) Patchwork
2026-01-28  1:01 ` ✓ CI.KUnit: success " Patchwork
2026-01-28  1:42 ` ✓ Xe.CI.BAT: " Patchwork
