From: Matthew Brost <matthew.brost@intel.com>
To: Francois Dugast <francois.dugast@intel.com>
Cc: <intel-xe@lists.freedesktop.org>,
<dri-devel@lists.freedesktop.org>, <leonro@nvidia.com>,
<jgg@ziepe.ca>, <thomas.hellstrom@linux.intel.com>,
<himal.prasad.ghimiray@intel.com>
Subject: Re: [PATCH v5 5/5] drm/pagemap: Use dma-map IOVA alloc, link, and sync API for DRM pagemap
Date: Wed, 8 Apr 2026 09:46:46 -0700 [thread overview]
Message-ID: <adaGdp9W0+OJVK3K@gsse-cloud1.jf.intel.com> (raw)
In-Reply-To: <ac6SWbCDbBvRPbUj@fdugast-desk>
On Thu, Apr 02, 2026 at 05:59:21PM +0200, Francois Dugast wrote:
> On Thu, Feb 19, 2026 at 12:10:57PM -0800, Matthew Brost wrote:
> > The dma-map IOVA alloc, link, and sync APIs perform significantly better
> > than dma-map / dma-unmap, as they avoid costly IOMMU synchronizations.
> > This difference is especially noticeable when mapping a 2MB region in
> > 4KB pages.
>
> Still a good improvement but with device THP now in drm-tip for GPU SVM,
> the speedup is less noticeable when looking at latency and throughput.
>
Yes, it is less important with THP, but 64k mappings still get a
speedup, and if memory gets fragmented and THP allocation fails we
will still see a perf win.
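
To make the win concrete, here is a minimal sketch of the batched
pattern (the helper name and the -EOPNOTSUPP fallback signal are made
up for illustration; error handling is simplified and order-0 pages
are assumed):

#include <linux/dma-mapping.h>

static int map_range_batched(struct device *dev, struct page **pages,
			     unsigned long npages, dma_addr_t *dma_addrs,
			     enum dma_data_direction dir)
{
	struct dma_iova_state state = {};
	size_t offset = 0;
	unsigned long i;
	int err;

	/* The real code falls back to per-page dma_map_page() here. */
	if (!dma_iova_try_alloc(dev, &state, 0, npages * PAGE_SIZE))
		return -EOPNOTSUPP;

	for (i = 0; i < npages; i++) {
		/* Link each page into the preallocated IOVA range. */
		err = dma_iova_link(dev, &state, page_to_phys(pages[i]),
				    offset, PAGE_SIZE, dir, 0);
		if (err)
			goto err_unlink;

		dma_addrs[i] = state.addr + offset;
		offset += PAGE_SIZE;
	}

	/* A single IOMMU sync covers the entire linked range. */
	return dma_iova_sync(dev, &state, 0, offset);

err_unlink:
	dma_iova_unlink(dev, &state, 0, offset, dir, 0);
	dma_iova_free(dev, &state);
	return err;
}

The per-page dma_map_page() path pays an IOMMU synchronization per
call, which is what dominates when mapping a 2MB region in 4KB pages.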
> >
> > Use the IOVA alloc, link, and sync APIs for DRM pagemap, which create DMA
> > mappings between the CPU and GPU for copying data.
> >
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> >
> > ---
> > v5:
> > - Remove extra newline (Thomas)
> > - Adjust alignment calculation (Thomas)
> > ---
> > drivers/gpu/drm/drm_pagemap.c | 83 +++++++++++++++++++++++++++++------
> > 1 file changed, 69 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_pagemap.c b/drivers/gpu/drm/drm_pagemap.c
> > index ef8b9c69d1d4..d9fceffce347 100644
> > --- a/drivers/gpu/drm/drm_pagemap.c
> > +++ b/drivers/gpu/drm/drm_pagemap.c
> > @@ -281,6 +281,19 @@ drm_pagemap_migrate_map_device_private_pages(struct device *dev,
> > return 0;
> > }
> >
> > +/**
> > + * struct drm_pagemap_iova_state - DRM pagemap IOVA state
> > + * @dma_state: DMA IOVA state.
> > + * @offset: Current offset in IOVA.
> > + *
> > + * This structure acts as an iterator for packing all IOVA addresses within a
> > + * contiguous range.
> > + */
> > +struct drm_pagemap_iova_state {
> > + struct dma_iova_state dma_state;
> > + unsigned long offset;
> > +};
> > +
> > /**
> > * drm_pagemap_migrate_map_system_pages() - Map system or device coherent
> > * migration pages for GPU SVM migration
> > @@ -289,6 +302,7 @@ drm_pagemap_migrate_map_device_private_pages(struct device *dev,
> > * @migrate_pfn: Array of page frame numbers of system pages or peer pages to map.
> > * @npages: Number of system or device coherent pages to map.
> > * @dir: Direction of data transfer (e.g., DMA_BIDIRECTIONAL)
> > + * @state: DMA IOVA state for mapping.
> > *
> > * This function maps pages of memory for migration usage in GPU SVM. It
> > * iterates over each page frame number provided in @migrate_pfn, maps the
>
> Not visible in this diff but we should update the doc as the return value is
> not only 0 or -EFAULT, it can be any error code returned by dma_iova_link().
>
Will fix.
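
Something along these lines for v6 (exact wording TBD):

 * Return: 0 on success, or a negative error code on failure (e.g. an
 * error returned by dma_iova_link() or dma_iova_sync()).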
> > @@ -302,9 +316,11 @@ drm_pagemap_migrate_map_system_pages(struct device *dev,
> > struct drm_pagemap_addr *pagemap_addr,
> > unsigned long *migrate_pfn,
> > unsigned long npages,
> > - enum dma_data_direction dir)
> > + enum dma_data_direction dir,
> > + struct drm_pagemap_iova_state *state)
> > {
> > unsigned long i;
> > + bool try_alloc = false;
> >
> > for (i = 0; i < npages;) {
> > struct page *page = migrate_pfn_to_page(migrate_pfn[i]);
> > @@ -319,9 +335,31 @@ drm_pagemap_migrate_map_system_pages(struct device *dev,
> > folio = page_folio(page);
> > order = folio_order(folio);
> >
> > - dma_addr = dma_map_page(dev, page, 0, page_size(page), dir);
> > - if (dma_mapping_error(dev, dma_addr))
> > - return -EFAULT;
> > + if (!try_alloc) {
> > + dma_iova_try_alloc(dev, &state->dma_state,
> > + (npages - i) * PAGE_SIZE >=
> > + HPAGE_PMD_SIZE ?
> > + HPAGE_PMD_SIZE : 0,
> > + npages * PAGE_SIZE);
> > + try_alloc = true;
> > + }
> > +
> > + if (dma_use_iova(&state->dma_state)) {
> > + int err = dma_iova_link(dev, &state->dma_state,
> > + page_to_phys(page),
> > + state->offset, page_size(page),
> > + dir, 0);
> > + if (err)
> > + return err;
> > +
> > + dma_addr = state->dma_state.addr + state->offset;
> > + state->offset += page_size(page);
> > + } else {
> > + dma_addr = dma_map_page(dev, page, 0, page_size(page),
> > + dir);
> > + if (dma_mapping_error(dev, dma_addr))
> > + return -EFAULT;
> > + }
> >
> > pagemap_addr[i] =
> > drm_pagemap_addr_encode(dma_addr,
> > @@ -332,6 +370,9 @@ drm_pagemap_migrate_map_system_pages(struct device *dev,
> > i += NR_PAGES(order);
> > }
> >
> > + if (dma_use_iova(&state->dma_state))
> > + return dma_iova_sync(dev, &state->dma_state, 0, state->offset);
> > +
> > return 0;
> > }
> >
> > @@ -343,6 +384,7 @@ drm_pagemap_migrate_map_system_pages(struct device *dev,
> > * @pagemap_addr: Array of DMA information corresponding to mapped pages
> > * @npages: Number of pages to unmap
> > * @dir: Direction of data transfer (e.g., DMA_BIDIRECTIONAL)
> > + * @state: DMA IOVA state for mapping.
> > *
> > * This function unmaps previously mapped pages of memory for GPU Shared Virtual
> > * Memory (SVM). It iterates over each DMA address provided in @dma_addr, checks
>
> While we are here: s/@dma_addr/@pagemap_addr/
>
Will fix.
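
i.e., the doc line becomes:

 * Memory (SVM). It iterates over each DMA address provided in
 * @pagemap_addr, checks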
Matt
> Francois
>
> > @@ -352,10 +394,17 @@ static void drm_pagemap_migrate_unmap_pages(struct device *dev,
> > struct drm_pagemap_addr *pagemap_addr,
> > unsigned long *migrate_pfn,
> > unsigned long npages,
> > - enum dma_data_direction dir)
> > + enum dma_data_direction dir,
> > + struct drm_pagemap_iova_state *state)
> > {
> > unsigned long i;
> >
> > + if (state && dma_use_iova(&state->dma_state)) {
> > + dma_iova_unlink(dev, &state->dma_state, 0, state->offset, dir, 0);
> > + dma_iova_free(dev, &state->dma_state);
> > + return;
> > + }
> > +
> > for (i = 0; i < npages;) {
> > struct page *page = migrate_pfn_to_page(migrate_pfn[i]);
> >
> > @@ -410,7 +459,7 @@ drm_pagemap_migrate_remote_to_local(struct drm_pagemap_devmem *devmem,
> > devmem->pre_migrate_fence);
> > out:
> > drm_pagemap_migrate_unmap_pages(remote_device, pagemap_addr, local_pfns,
> > - npages, DMA_FROM_DEVICE);
> > + npages, DMA_FROM_DEVICE, NULL);
> > return err;
> > }
> >
> > @@ -420,11 +469,13 @@ drm_pagemap_migrate_sys_to_dev(struct drm_pagemap_devmem *devmem,
> > struct page *local_pages[],
> > struct drm_pagemap_addr pagemap_addr[],
> > unsigned long npages,
> > - const struct drm_pagemap_devmem_ops *ops)
> > + const struct drm_pagemap_devmem_ops *ops,
> > + struct drm_pagemap_iova_state *state)
> > {
> > int err = drm_pagemap_migrate_map_system_pages(devmem->dev,
> > pagemap_addr, sys_pfns,
> > - npages, DMA_TO_DEVICE);
> > + npages, DMA_TO_DEVICE,
> > + state);
> >
> > if (err)
> > goto out;
> > @@ -433,7 +484,7 @@ drm_pagemap_migrate_sys_to_dev(struct drm_pagemap_devmem *devmem,
> > devmem->pre_migrate_fence);
> > out:
> > drm_pagemap_migrate_unmap_pages(devmem->dev, pagemap_addr, sys_pfns, npages,
> > - DMA_TO_DEVICE);
> > + DMA_TO_DEVICE, state);
> > return err;
> > }
> >
> > @@ -461,6 +512,7 @@ static int drm_pagemap_migrate_range(struct drm_pagemap_devmem *devmem,
> > const struct migrate_range_loc *cur,
> > const struct drm_pagemap_migrate_details *mdetails)
> > {
> > + struct drm_pagemap_iova_state state = {};
> > int ret = 0;
> >
> > if (cur->start == 0)
> > @@ -488,7 +540,7 @@ static int drm_pagemap_migrate_range(struct drm_pagemap_devmem *devmem,
> > &pages[last->start],
> > &pagemap_addr[last->start],
> > cur->start - last->start,
> > - last->ops);
> > + last->ops, &state);
> >
> > out:
> > *last = *cur;
> > @@ -993,6 +1045,7 @@ EXPORT_SYMBOL(drm_pagemap_put);
> > int drm_pagemap_evict_to_ram(struct drm_pagemap_devmem *devmem_allocation)
> > {
> > const struct drm_pagemap_devmem_ops *ops = devmem_allocation->ops;
> > + struct drm_pagemap_iova_state state = {};
> > unsigned long npages, mpages = 0;
> > struct page **pages;
> > unsigned long *src, *dst;
> > @@ -1034,7 +1087,7 @@ int drm_pagemap_evict_to_ram(struct drm_pagemap_devmem *devmem_allocation)
> > err = drm_pagemap_migrate_map_system_pages(devmem_allocation->dev,
> > pagemap_addr,
> > dst, npages,
> > - DMA_FROM_DEVICE);
> > + DMA_FROM_DEVICE, &state);
> > if (err)
> > goto err_finalize;
> >
> > @@ -1051,7 +1104,7 @@ int drm_pagemap_evict_to_ram(struct drm_pagemap_devmem *devmem_allocation)
> > migrate_device_pages(src, dst, npages);
> > migrate_device_finalize(src, dst, npages);
> > drm_pagemap_migrate_unmap_pages(devmem_allocation->dev, pagemap_addr, dst, npages,
> > - DMA_FROM_DEVICE);
> > + DMA_FROM_DEVICE, &state);
> >
> > err_free:
> > kvfree(buf);
> > @@ -1095,6 +1148,7 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
> > MIGRATE_VMA_SELECT_DEVICE_COHERENT,
> > .fault_page = page,
> > };
> > + struct drm_pagemap_iova_state state = {};
> > struct drm_pagemap_zdd *zdd;
> > const struct drm_pagemap_devmem_ops *ops;
> > struct device *dev = NULL;
> > @@ -1154,7 +1208,7 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
> >
> > err = drm_pagemap_migrate_map_system_pages(dev, pagemap_addr,
> > migrate.dst, npages,
> > - DMA_FROM_DEVICE);
> > + DMA_FROM_DEVICE, &state);
> > if (err)
> > goto err_finalize;
> >
> > @@ -1172,7 +1226,8 @@ static int __drm_pagemap_migrate_to_ram(struct vm_area_struct *vas,
> > migrate_vma_finalize(&migrate);
> > if (dev)
> > drm_pagemap_migrate_unmap_pages(dev, pagemap_addr, migrate.dst,
> > - npages, DMA_FROM_DEVICE);
> > + npages, DMA_FROM_DEVICE,
> > + &state);
> > err_free:
> > kvfree(buf);
> > err_out:
> > --
> > 2.34.1
> >