* + gpu-drm-nouveau-enable-thp-support-for-gpu-memory-migration.patch added to mm-new branch
@ 2025-10-09 3:19 Andrew Morton
From: Andrew Morton @ 2025-10-09 3:19 UTC
To: mm-commits, ziy, ying.huang, simona, ryan.roberts, rcampbell,
rakie.kim, osalvador, npache, mpenttil, matthew.brost, lyude,
lorenzo.stoakes, Liam.Howlett, joshua.hahnjy, gourry,
francois.dugast, dev.jain, david, dakr, byungchul, baolin.wang,
baohua, apopple, airlied, balbirs, akpm
The patch titled
Subject: gpu/drm/nouveau: enable THP support for GPU memory migration
has been added to the -mm mm-new branch. Its filename is
gpu-drm-nouveau-enable-thp-support-for-gpu-memory-migration.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/gpu-drm-nouveau-enable-thp-support-for-gpu-memory-migration.patch
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others to take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Balbir Singh <balbirs@nvidia.com>
Subject: gpu/drm/nouveau: enable THP support for GPU memory migration
Date: Wed, 1 Oct 2025 16:57:07 +1000
Enable MIGRATE_VMA_SELECT_COMPOUND support in nouveau driver to take
advantage of THP zone device migration capabilities.
Update migration and eviction code paths to handle compound page sizes
appropriately, improving memory bandwidth utilization and reducing
migration overhead for large GPU memory allocations.
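
As a rough illustration of what "handling compound page sizes" means on the CPU fault path, the sketch below mirrors the range computation the patch performs before calling migrate_vma_setup(): the window is aligned down to the folio size, and it falls back to a single base page when the PMD has been split. This is a userspace sketch, not driver code. The names sketch_range and sketch_fault_range, and the SKETCH_* constants (4 KiB base pages assumed), are hypothetical and do not appear in the patch.

```c
#include <assert.h>

/* Assumed for illustration: 4 KiB base pages, so a 2 MiB THP is order 9. */
#define SKETCH_PAGE_SHIFT 12u
#define SKETCH_PAGE_SIZE  (1ul << SKETCH_PAGE_SHIFT)

struct sketch_range {
	unsigned long start;	/* inclusive start of the migration window */
	unsigned long end;	/* exclusive end of the migration window */
	unsigned long nr;	/* base pages covered, i.e. src/dst array length */
};

/*
 * Hypothetical helper: compute the window to migrate for a fault at addr
 * on a folio of the given order. If the mapping was split down to PTEs
 * (pte_mapped != 0), only the single faulting page is migrated.
 */
static struct sketch_range sketch_fault_range(unsigned long addr,
					      unsigned int order,
					      int pte_mapped)
{
	struct sketch_range r;
	unsigned long size;

	/* Guard against shifting past the width of unsigned long. */
	assert(order < 8 * sizeof(unsigned long) - SKETCH_PAGE_SHIFT);

	if (pte_mapped)
		order = 0;

	size = SKETCH_PAGE_SIZE << order;
	r.start = addr & ~(size - 1);	/* same effect as ALIGN_DOWN(addr, size) */
	r.end = r.start + size;
	r.nr = 1ul << order;
	return r;
}
```

Under these assumptions, a fault at 0x201234 on an order-9 folio yields a 512-page window starting at 0x200000, while the same fault on a PTE-mapped folio yields the one page starting at 0x201000.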
Link: https://lkml.kernel.org/r/20251001065707.920170-17-balbirs@nvidia.com
Signed-off-by: Balbir Singh <balbirs@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: Ying Huang <ying.huang@linux.alibaba.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Mika Penttilä <mpenttil@redhat.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
drivers/gpu/drm/nouveau/nouveau_dmem.c | 303 ++++++++++++++++-------
drivers/gpu/drm/nouveau/nouveau_svm.c | 6
drivers/gpu/drm/nouveau/nouveau_svm.h | 3
3 files changed, 229 insertions(+), 83 deletions(-)
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c~gpu-drm-nouveau-enable-thp-support-for-gpu-memory-migration
+++ a/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -50,6 +50,7 @@
*/
#define DMEM_CHUNK_SIZE (2UL << 20)
#define DMEM_CHUNK_NPAGES (DMEM_CHUNK_SIZE >> PAGE_SHIFT)
+#define NR_CHUNKS (128)
enum nouveau_aper {
NOUVEAU_APER_VIRT,
@@ -83,9 +84,15 @@ struct nouveau_dmem {
struct list_head chunks;
struct mutex mutex;
struct page *free_pages;
+ struct folio *free_folios;
spinlock_t lock;
};
+struct nouveau_dmem_dma_info {
+ dma_addr_t dma_addr;
+ size_t size;
+};
+
static struct nouveau_dmem_chunk *nouveau_page_to_chunk(struct page *page)
{
return container_of(page_pgmap(page), struct nouveau_dmem_chunk,
@@ -115,8 +122,13 @@ static void nouveau_dmem_folio_free(stru
struct nouveau_dmem *dmem = chunk->drm->dmem;
spin_lock(&dmem->lock);
- page->zone_device_data = dmem->free_pages;
- dmem->free_pages = page;
+ if (folio_order(folio)) {
+ page->zone_device_data = dmem->free_folios;
+ dmem->free_folios = folio;
+ } else {
+ page->zone_device_data = dmem->free_pages;
+ dmem->free_pages = page;
+ }
WARN_ON(!chunk->callocated);
chunk->callocated--;
@@ -140,20 +152,28 @@ static void nouveau_dmem_fence_done(stru
}
}
-static int nouveau_dmem_copy_one(struct nouveau_drm *drm, struct page *spage,
- struct page *dpage, dma_addr_t *dma_addr)
+static int nouveau_dmem_copy_folio(struct nouveau_drm *drm,
+ struct folio *sfolio, struct folio *dfolio,
+ struct nouveau_dmem_dma_info *dma_info)
{
struct device *dev = drm->dev->dev;
+ struct page *dpage = folio_page(dfolio, 0);
+ struct page *spage = folio_page(sfolio, 0);
- lock_page(dpage);
+ folio_lock(dfolio);
- *dma_addr = dma_map_page(dev, dpage, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
- if (dma_mapping_error(dev, *dma_addr))
+ dma_info->dma_addr = dma_map_page(dev, dpage, 0, page_size(dpage),
+ DMA_BIDIRECTIONAL);
+ dma_info->size = page_size(dpage);
+ if (dma_mapping_error(dev, dma_info->dma_addr))
return -EIO;
- if (drm->dmem->migrate.copy_func(drm, 1, NOUVEAU_APER_HOST, *dma_addr,
- NOUVEAU_APER_VRAM, nouveau_dmem_page_addr(spage))) {
- dma_unmap_page(dev, *dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
+ if (drm->dmem->migrate.copy_func(drm, folio_nr_pages(sfolio),
+ NOUVEAU_APER_HOST, dma_info->dma_addr,
+ NOUVEAU_APER_VRAM,
+ nouveau_dmem_page_addr(spage))) {
+ dma_unmap_page(dev, dma_info->dma_addr, page_size(dpage),
+ DMA_BIDIRECTIONAL);
return -EIO;
}
@@ -166,22 +186,48 @@ static vm_fault_t nouveau_dmem_migrate_t
struct nouveau_dmem *dmem = drm->dmem;
struct nouveau_fence *fence;
struct nouveau_svmm *svmm;
- struct page *spage, *dpage;
- unsigned long src = 0, dst = 0;
- dma_addr_t dma_addr = 0;
+ struct page *dpage;
vm_fault_t ret = 0;
struct migrate_vma args = {
.vma = vmf->vma,
- .start = vmf->address,
- .end = vmf->address + PAGE_SIZE,
- .src = &src,
- .dst = &dst,
.pgmap_owner = drm->dev,
.fault_page = vmf->page,
- .flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE,
+ .flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE |
+ MIGRATE_VMA_SELECT_COMPOUND,
+ .src = NULL,
+ .dst = NULL,
};
+ unsigned int order, nr;
+ struct folio *sfolio, *dfolio;
+ struct nouveau_dmem_dma_info dma_info;
+
+ sfolio = page_folio(vmf->page);
+ order = folio_order(sfolio);
+ nr = 1 << order;
/*
+ * Handle partial unmap faults, where the folio is large, but
+ * the pmd is split.
+ */
+ if (vmf->pte) {
+ order = 0;
+ nr = 1;
+ }
+
+ if (order)
+ args.flags |= MIGRATE_VMA_SELECT_COMPOUND;
+
+ args.start = ALIGN_DOWN(vmf->address, (PAGE_SIZE << order));
+ args.vma = vmf->vma;
+ args.end = args.start + (PAGE_SIZE << order);
+ args.src = kcalloc(nr, sizeof(*args.src), GFP_KERNEL);
+ args.dst = kcalloc(nr, sizeof(*args.dst), GFP_KERNEL);
+
+ if (!args.src || !args.dst) {
+ ret = VM_FAULT_OOM;
+ goto err;
+ }
+ /*
* FIXME what we really want is to find some heuristic to migrate more
* than just one page on CPU fault. When such fault happens it is very
* likely that more surrounding page will CPU fault too.
@@ -191,20 +237,26 @@ static vm_fault_t nouveau_dmem_migrate_t
if (!args.cpages)
return 0;
- spage = migrate_pfn_to_page(src);
- if (!spage || !(src & MIGRATE_PFN_MIGRATE))
- goto done;
-
- dpage = alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO, vmf->vma, vmf->address);
- if (!dpage)
+ if (order)
+ dpage = folio_page(vma_alloc_folio(GFP_HIGHUSER | __GFP_ZERO,
+ order, vmf->vma, vmf->address), 0);
+ else
+ dpage = alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO, vmf->vma,
+ vmf->address);
+ if (!dpage) {
+ ret = VM_FAULT_OOM;
goto done;
+ }
- dst = migrate_pfn(page_to_pfn(dpage));
+ args.dst[0] = migrate_pfn(page_to_pfn(dpage));
+ if (order)
+ args.dst[0] |= MIGRATE_PFN_COMPOUND;
+ dfolio = page_folio(dpage);
- svmm = spage->zone_device_data;
+ svmm = folio_zone_device_data(sfolio);
mutex_lock(&svmm->mutex);
nouveau_svmm_invalidate(svmm, args.start, args.end);
- ret = nouveau_dmem_copy_one(drm, spage, dpage, &dma_addr);
+ ret = nouveau_dmem_copy_folio(drm, sfolio, dfolio, &dma_info);
mutex_unlock(&svmm->mutex);
if (ret) {
ret = VM_FAULT_SIGBUS;
@@ -214,25 +266,40 @@ static vm_fault_t nouveau_dmem_migrate_t
nouveau_fence_new(&fence, dmem->migrate.chan);
migrate_vma_pages(&args);
nouveau_dmem_fence_done(&fence);
- dma_unmap_page(drm->dev->dev, dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
+ dma_unmap_page(drm->dev->dev, dma_info.dma_addr, PAGE_SIZE,
+ DMA_BIDIRECTIONAL);
done:
migrate_vma_finalize(&args);
+err:
+ kfree(args.src);
+ kfree(args.dst);
return ret;
}
+static void nouveau_dmem_folio_split(struct folio *head, struct folio *tail)
+{
+ if (tail == NULL)
+ return;
+ tail->pgmap = head->pgmap;
+ tail->mapping = head->mapping;
+ folio_set_zone_device_data(tail, folio_zone_device_data(head));
+}
+
static const struct dev_pagemap_ops nouveau_dmem_pagemap_ops = {
.folio_free = nouveau_dmem_folio_free,
.migrate_to_ram = nouveau_dmem_migrate_to_ram,
+ .folio_split = nouveau_dmem_folio_split,
};
static int
-nouveau_dmem_chunk_alloc(struct nouveau_drm *drm, struct page **ppage)
+nouveau_dmem_chunk_alloc(struct nouveau_drm *drm, struct page **ppage,
+ bool is_large)
{
struct nouveau_dmem_chunk *chunk;
struct resource *res;
struct page *page;
void *ptr;
- unsigned long i, pfn_first;
+ unsigned long i, pfn_first, pfn;
int ret;
chunk = kzalloc(sizeof(*chunk), GFP_KERNEL);
@@ -242,7 +309,7 @@ nouveau_dmem_chunk_alloc(struct nouveau_
}
/* Allocate unused physical address space for device private pages. */
- res = request_free_mem_region(&iomem_resource, DMEM_CHUNK_SIZE,
+ res = request_free_mem_region(&iomem_resource, DMEM_CHUNK_SIZE * NR_CHUNKS,
"nouveau_dmem");
if (IS_ERR(res)) {
ret = PTR_ERR(res);
@@ -275,16 +342,40 @@ nouveau_dmem_chunk_alloc(struct nouveau_
pfn_first = chunk->pagemap.range.start >> PAGE_SHIFT;
page = pfn_to_page(pfn_first);
spin_lock(&drm->dmem->lock);
- for (i = 0; i < DMEM_CHUNK_NPAGES - 1; ++i, ++page) {
- page->zone_device_data = drm->dmem->free_pages;
- drm->dmem->free_pages = page;
+
+ pfn = pfn_first;
+ for (i = 0; i < NR_CHUNKS; i++) {
+ int j;
+
+ if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) || !is_large) {
+ for (j = 0; j < DMEM_CHUNK_NPAGES - 1; j++, pfn++) {
+ page = pfn_to_page(pfn);
+ page->zone_device_data = drm->dmem->free_pages;
+ drm->dmem->free_pages = page;
+ }
+ } else {
+ page = pfn_to_page(pfn);
+ page->zone_device_data = drm->dmem->free_folios;
+ drm->dmem->free_folios = page_folio(page);
+ pfn += DMEM_CHUNK_NPAGES;
+ }
+ }
+
+ /* Move to next page */
+ if (is_large) {
+ *ppage = &drm->dmem->free_folios->page;
+ drm->dmem->free_folios = (*ppage)->zone_device_data;
+ } else {
+ *ppage = drm->dmem->free_pages;
+ drm->dmem->free_pages = (*ppage)->zone_device_data;
}
- *ppage = page;
+
chunk->callocated++;
spin_unlock(&drm->dmem->lock);
- NV_INFO(drm, "DMEM: registered %ldMB of device memory\n",
- DMEM_CHUNK_SIZE >> 20);
+ NV_INFO(drm, "DMEM: registered %ldMB of %sdevice memory %lx %lx\n",
+ NR_CHUNKS * DMEM_CHUNK_SIZE >> 20, is_large ? "THP " : "", pfn_first,
+ nouveau_dmem_page_addr(page));
return 0;
@@ -299,27 +390,41 @@ out:
}
static struct page *
-nouveau_dmem_page_alloc_locked(struct nouveau_drm *drm)
+nouveau_dmem_page_alloc_locked(struct nouveau_drm *drm, bool is_large)
{
struct nouveau_dmem_chunk *chunk;
struct page *page = NULL;
+ struct folio *folio = NULL;
int ret;
+ unsigned int order = 0;
spin_lock(&drm->dmem->lock);
- if (drm->dmem->free_pages) {
+ if (is_large && drm->dmem->free_folios) {
+ folio = drm->dmem->free_folios;
+ page = &folio->page;
+ drm->dmem->free_folios = page->zone_device_data;
+ chunk = nouveau_page_to_chunk(&folio->page);
+ chunk->callocated++;
+ spin_unlock(&drm->dmem->lock);
+ order = ilog2(DMEM_CHUNK_NPAGES);
+ } else if (!is_large && drm->dmem->free_pages) {
page = drm->dmem->free_pages;
drm->dmem->free_pages = page->zone_device_data;
chunk = nouveau_page_to_chunk(page);
chunk->callocated++;
spin_unlock(&drm->dmem->lock);
+ folio = page_folio(page);
} else {
spin_unlock(&drm->dmem->lock);
- ret = nouveau_dmem_chunk_alloc(drm, &page);
+ ret = nouveau_dmem_chunk_alloc(drm, &page, is_large);
if (ret)
return NULL;
+ folio = page_folio(page);
+ if (is_large)
+ order = ilog2(DMEM_CHUNK_NPAGES);
}
- zone_device_page_init(page, 0);
+ zone_device_folio_init(folio, order);
return page;
}
@@ -370,12 +475,12 @@ nouveau_dmem_evict_chunk(struct nouveau_
{
unsigned long i, npages = range_len(&chunk->pagemap.range) >> PAGE_SHIFT;
unsigned long *src_pfns, *dst_pfns;
- dma_addr_t *dma_addrs;
+ struct nouveau_dmem_dma_info *dma_info;
struct nouveau_fence *fence;
src_pfns = kvcalloc(npages, sizeof(*src_pfns), GFP_KERNEL | __GFP_NOFAIL);
dst_pfns = kvcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL | __GFP_NOFAIL);
- dma_addrs = kvcalloc(npages, sizeof(*dma_addrs), GFP_KERNEL | __GFP_NOFAIL);
+ dma_info = kvcalloc(npages, sizeof(*dma_info), GFP_KERNEL | __GFP_NOFAIL);
migrate_device_range(src_pfns, chunk->pagemap.range.start >> PAGE_SHIFT,
npages);
@@ -383,17 +488,28 @@ nouveau_dmem_evict_chunk(struct nouveau_
for (i = 0; i < npages; i++) {
if (src_pfns[i] & MIGRATE_PFN_MIGRATE) {
struct page *dpage;
+ struct folio *folio = page_folio(
+ migrate_pfn_to_page(src_pfns[i]));
+ unsigned int order = folio_order(folio);
+
+ if (src_pfns[i] & MIGRATE_PFN_COMPOUND) {
+ dpage = folio_page(
+ folio_alloc(
+ GFP_HIGHUSER_MOVABLE, order), 0);
+ } else {
+ /*
+ * _GFP_NOFAIL because the GPU is going away and there
+ * is nothing sensible we can do if we can't copy the
+ * data back.
+ */
+ dpage = alloc_page(GFP_HIGHUSER | __GFP_NOFAIL);
+ }
- /*
- * _GFP_NOFAIL because the GPU is going away and there
- * is nothing sensible we can do if we can't copy the
- * data back.
- */
- dpage = alloc_page(GFP_HIGHUSER | __GFP_NOFAIL);
dst_pfns[i] = migrate_pfn(page_to_pfn(dpage));
- nouveau_dmem_copy_one(chunk->drm,
- migrate_pfn_to_page(src_pfns[i]), dpage,
- &dma_addrs[i]);
+ nouveau_dmem_copy_folio(chunk->drm,
+ page_folio(migrate_pfn_to_page(src_pfns[i])),
+ page_folio(dpage),
+ &dma_info[i]);
}
}
@@ -404,8 +520,9 @@ nouveau_dmem_evict_chunk(struct nouveau_
kvfree(src_pfns);
kvfree(dst_pfns);
for (i = 0; i < npages; i++)
- dma_unmap_page(chunk->drm->dev->dev, dma_addrs[i], PAGE_SIZE, DMA_BIDIRECTIONAL);
- kvfree(dma_addrs);
+ dma_unmap_page(chunk->drm->dev->dev, dma_info[i].dma_addr,
+ dma_info[i].size, DMA_BIDIRECTIONAL);
+ kvfree(dma_info);
}
void
@@ -608,31 +725,36 @@ nouveau_dmem_init(struct nouveau_drm *dr
static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm,
struct nouveau_svmm *svmm, unsigned long src,
- dma_addr_t *dma_addr, u64 *pfn)
+ struct nouveau_dmem_dma_info *dma_info, u64 *pfn)
{
struct device *dev = drm->dev->dev;
struct page *dpage, *spage;
unsigned long paddr;
+ bool is_large = false;
+ unsigned long mpfn;
spage = migrate_pfn_to_page(src);
if (!(src & MIGRATE_PFN_MIGRATE))
goto out;
- dpage = nouveau_dmem_page_alloc_locked(drm);
+ is_large = src & MIGRATE_PFN_COMPOUND;
+ dpage = nouveau_dmem_page_alloc_locked(drm, is_large);
if (!dpage)
goto out;
paddr = nouveau_dmem_page_addr(dpage);
if (spage) {
- *dma_addr = dma_map_page(dev, spage, 0, page_size(spage),
+ dma_info->dma_addr = dma_map_page(dev, spage, 0, page_size(spage),
DMA_BIDIRECTIONAL);
- if (dma_mapping_error(dev, *dma_addr))
+ dma_info->size = page_size(spage);
+ if (dma_mapping_error(dev, dma_info->dma_addr))
goto out_free_page;
- if (drm->dmem->migrate.copy_func(drm, 1,
- NOUVEAU_APER_VRAM, paddr, NOUVEAU_APER_HOST, *dma_addr))
+ if (drm->dmem->migrate.copy_func(drm, folio_nr_pages(page_folio(spage)),
+ NOUVEAU_APER_VRAM, paddr, NOUVEAU_APER_HOST,
+ dma_info->dma_addr))
goto out_dma_unmap;
} else {
- *dma_addr = DMA_MAPPING_ERROR;
+ dma_info->dma_addr = DMA_MAPPING_ERROR;
if (drm->dmem->migrate.clear_func(drm, page_size(dpage),
NOUVEAU_APER_VRAM, paddr))
goto out_free_page;
@@ -643,10 +765,13 @@ static unsigned long nouveau_dmem_migrat
((paddr >> PAGE_SHIFT) << NVIF_VMM_PFNMAP_V0_ADDR_SHIFT);
if (src & MIGRATE_PFN_WRITE)
*pfn |= NVIF_VMM_PFNMAP_V0_W;
- return migrate_pfn(page_to_pfn(dpage));
+ mpfn = migrate_pfn(page_to_pfn(dpage));
+ if (folio_order(page_folio(dpage)))
+ mpfn |= MIGRATE_PFN_COMPOUND;
+ return mpfn;
out_dma_unmap:
- dma_unmap_page(dev, *dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
+ dma_unmap_page(dev, dma_info->dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
out_free_page:
nouveau_dmem_page_free_locked(drm, dpage);
out:
@@ -656,27 +781,38 @@ out:
static void nouveau_dmem_migrate_chunk(struct nouveau_drm *drm,
struct nouveau_svmm *svmm, struct migrate_vma *args,
- dma_addr_t *dma_addrs, u64 *pfns)
+ struct nouveau_dmem_dma_info *dma_info, u64 *pfns)
{
struct nouveau_fence *fence;
unsigned long addr = args->start, nr_dma = 0, i;
+ unsigned long order = 0;
+
+ for (i = 0; addr < args->end; ) {
+ struct folio *folio;
- for (i = 0; addr < args->end; i++) {
args->dst[i] = nouveau_dmem_migrate_copy_one(drm, svmm,
- args->src[i], dma_addrs + nr_dma, pfns + i);
- if (!dma_mapping_error(drm->dev->dev, dma_addrs[nr_dma]))
+ args->src[i], dma_info + nr_dma, pfns + i);
+ if (!args->dst[i]) {
+ i++;
+ addr += PAGE_SIZE;
+ continue;
+ }
+ if (!dma_mapping_error(drm->dev->dev, dma_info[nr_dma].dma_addr))
nr_dma++;
- addr += PAGE_SIZE;
+ folio = page_folio(migrate_pfn_to_page(args->dst[i]));
+ order = folio_order(folio);
+ i += 1 << order;
+ addr += (1 << order) * PAGE_SIZE;
}
nouveau_fence_new(&fence, drm->dmem->migrate.chan);
migrate_vma_pages(args);
nouveau_dmem_fence_done(&fence);
- nouveau_pfns_map(svmm, args->vma->vm_mm, args->start, pfns, i);
+ nouveau_pfns_map(svmm, args->vma->vm_mm, args->start, pfns, i, order);
while (nr_dma--) {
- dma_unmap_page(drm->dev->dev, dma_addrs[nr_dma], PAGE_SIZE,
- DMA_BIDIRECTIONAL);
+ dma_unmap_page(drm->dev->dev, dma_info[nr_dma].dma_addr,
+ dma_info[nr_dma].size, DMA_BIDIRECTIONAL);
}
migrate_vma_finalize(args);
}
@@ -689,20 +825,27 @@ nouveau_dmem_migrate_vma(struct nouveau_
unsigned long end)
{
unsigned long npages = (end - start) >> PAGE_SHIFT;
- unsigned long max = min(SG_MAX_SINGLE_ALLOC, npages);
- dma_addr_t *dma_addrs;
+ unsigned long max = npages;
struct migrate_vma args = {
.vma = vma,
.start = start,
.pgmap_owner = drm->dev,
- .flags = MIGRATE_VMA_SELECT_SYSTEM,
+ .flags = MIGRATE_VMA_SELECT_SYSTEM
+ | MIGRATE_VMA_SELECT_COMPOUND,
};
unsigned long i;
u64 *pfns;
int ret = -ENOMEM;
+ struct nouveau_dmem_dma_info *dma_info;
- if (drm->dmem == NULL)
- return -ENODEV;
+ if (drm->dmem == NULL) {
+ ret = -ENODEV;
+ goto out;
+ }
+
+ if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
+ if (max > (unsigned long)HPAGE_PMD_NR)
+ max = (unsigned long)HPAGE_PMD_NR;
args.src = kcalloc(max, sizeof(*args.src), GFP_KERNEL);
if (!args.src)
@@ -711,8 +854,8 @@ nouveau_dmem_migrate_vma(struct nouveau_
if (!args.dst)
goto out_free_src;
- dma_addrs = kmalloc_array(max, sizeof(*dma_addrs), GFP_KERNEL);
- if (!dma_addrs)
+ dma_info = kmalloc_array(max, sizeof(*dma_info), GFP_KERNEL);
+ if (!dma_info)
goto out_free_dst;
pfns = nouveau_pfns_alloc(max);
@@ -730,7 +873,7 @@ nouveau_dmem_migrate_vma(struct nouveau_
goto out_free_pfns;
if (args.cpages)
- nouveau_dmem_migrate_chunk(drm, svmm, &args, dma_addrs,
+ nouveau_dmem_migrate_chunk(drm, svmm, &args, dma_info,
pfns);
args.start = args.end;
}
@@ -739,7 +882,7 @@ nouveau_dmem_migrate_vma(struct nouveau_
out_free_pfns:
nouveau_pfns_free(pfns);
out_free_dma:
- kfree(dma_addrs);
+ kfree(dma_info);
out_free_dst:
kfree(args.dst);
out_free_src:
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c~gpu-drm-nouveau-enable-thp-support-for-gpu-memory-migration
+++ a/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -921,12 +921,14 @@ nouveau_pfns_free(u64 *pfns)
void
nouveau_pfns_map(struct nouveau_svmm *svmm, struct mm_struct *mm,
- unsigned long addr, u64 *pfns, unsigned long npages)
+ unsigned long addr, u64 *pfns, unsigned long npages,
+ unsigned int page_shift)
{
struct nouveau_pfnmap_args *args = nouveau_pfns_to_args(pfns);
args->p.addr = addr;
- args->p.size = npages << PAGE_SHIFT;
+ args->p.size = npages << page_shift;
+ args->p.page = page_shift;
mutex_lock(&svmm->mutex);
--- a/drivers/gpu/drm/nouveau/nouveau_svm.h~gpu-drm-nouveau-enable-thp-support-for-gpu-memory-migration
+++ a/drivers/gpu/drm/nouveau/nouveau_svm.h
@@ -33,7 +33,8 @@ void nouveau_svmm_invalidate(struct nouv
u64 *nouveau_pfns_alloc(unsigned long npages);
void nouveau_pfns_free(u64 *pfns);
void nouveau_pfns_map(struct nouveau_svmm *svmm, struct mm_struct *mm,
- unsigned long addr, u64 *pfns, unsigned long npages);
+ unsigned long addr, u64 *pfns, unsigned long npages,
+ unsigned int page_shift);
#else /* IS_ENABLED(CONFIG_DRM_NOUVEAU_SVM) */
static inline void nouveau_svm_init(struct nouveau_drm *drm) {}
static inline void nouveau_svm_fini(struct nouveau_drm *drm) {}
_
Patches currently in -mm which might be from balbirs@nvidia.com are
mm-zone_device-support-large-zone-device-private-folios.patch
mm-zone_device-rename-page_free-callback-to-folio_free.patch
mm-huge_memory-add-device-private-thp-support-to-pmd-operations.patch
mm-rmap-extend-rmap-and-migration-support-device-private-entries.patch
mm-huge_memory-implement-device-private-thp-splitting.patch
mm-migrate_device-handle-partially-mapped-folios-during-collection.patch
mm-migrate_device-implement-thp-migration-of-zone-device-pages.patch
mm-memory-fault-add-thp-fault-handling-for-zone-device-private-pages.patch
lib-test_hmm-add-zone-device-private-thp-test-infrastructure.patch
mm-memremap-add-driver-callback-support-for-folio-splitting.patch
mm-migrate_device-add-thp-splitting-during-migration.patch
lib-test_hmm-add-large-page-allocation-failure-testing.patch
selftests-mm-hmm-tests-new-tests-for-zone-device-thp-migration.patch
selftests-mm-hmm-tests-new-throughput-tests-including-thp.patch
gpu-drm-nouveau-enable-thp-support-for-gpu-memory-migration.patch
* + gpu-drm-nouveau-enable-thp-support-for-gpu-memory-migration.patch added to mm-new branch
@ 2025-09-09 4:01 Andrew Morton
From: Andrew Morton @ 2025-09-09 4:01 UTC
To: mm-commits, ziy, ying.huang, simona, ryan.roberts, rcampbell,
rakie.kim, osalvador, npache, mpenttil, matthew.brost, lyude,
lorenzo.stoakes, Liam.Howlett, joshua.hahnjy, gourry,
francois.dugast, dev.jain, david, dakr, byungchul, baolin.wang,
baohua, apopple, airlied, balbirs, akpm
The patch titled
Subject: gpu/drm/nouveau: enable THP support for GPU memory migration
has been added to the -mm mm-new branch. Its filename is
gpu-drm-nouveau-enable-thp-support-for-gpu-memory-migration.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/gpu-drm-nouveau-enable-thp-support-for-gpu-memory-migration.patch
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others to take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Balbir Singh <balbirs@nvidia.com>
Subject: gpu/drm/nouveau: enable THP support for GPU memory migration
Date: Mon, 8 Sep 2025 10:04:48 +1000
Enable MIGRATE_VMA_SELECT_COMPOUND support in nouveau driver to take
advantage of THP zone device migration capabilities.
Update migration and eviction code paths to handle compound page sizes
appropriately, improving memory bandwidth utilization and reducing
migration overhead for large GPU memory allocations.
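
The reduced migration overhead comes largely from walking the src/dst arrays in folio-sized steps rather than one base page at a time. The userspace sketch below shows that stepping pattern, loosely modelled on the loop in nouveau_dmem_migrate_chunk(). It is a simplification under stated assumptions: the flag values SKETCH_PFN_VALID and SKETCH_PFN_COMPOUND and the function sketch_count_units are hypothetical, and the order is passed in as a parameter, whereas the patch reads it from the destination folio.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed flag bits for illustration; the kernel's MIGRATE_PFN_* values differ. */
#define SKETCH_PFN_VALID    (1ul << 0)
#define SKETCH_PFN_COMPOUND (1ul << 1)

/*
 * Hypothetical walk over a destination array of npages slots.
 * A zero entry means the page was not migrated and covers one slot.
 * A compound entry occupies the first of (1 << order) slots.
 * Returns the number of copy units found, which corresponds to the
 * number of DMA mappings that would need unmapping afterwards.
 */
static size_t sketch_count_units(const unsigned long *dst, size_t npages,
				 unsigned int order)
{
	size_t i = 0, units = 0;

	assert(dst != NULL);
	while (i < npages) {
		if (!dst[i]) {
			/* Skipped page: advance by a single slot. */
			i++;
			continue;
		}
		units++;
		/* Large folios consume 1 << order slots, small pages one. */
		i += (dst[i] & SKETCH_PFN_COMPOUND) ? ((size_t)1 << order) : 1;
	}
	return units;
}
```

With order 2, an array holding a compound entry, a small page, a skipped page and a second compound entry spans ten slots but produces only three copy units.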
Link: https://lkml.kernel.org/r/20250908000448.180088-16-balbirs@nvidia.com
Signed-off-by: Balbir Singh <balbirs@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: Ying Huang <ying.huang@linux.alibaba.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Mika Penttilä <mpenttil@redhat.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
drivers/gpu/drm/nouveau/nouveau_dmem.c | 306 ++++++++++++++++-------
drivers/gpu/drm/nouveau/nouveau_svm.c | 6
drivers/gpu/drm/nouveau/nouveau_svm.h | 3
3 files changed, 231 insertions(+), 84 deletions(-)
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c~gpu-drm-nouveau-enable-thp-support-for-gpu-memory-migration
+++ a/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -48,8 +48,9 @@
* bigger page size) at lowest level and have some shim layer on top that would
* provide the same functionality as TTM.
*/
-#define DMEM_CHUNK_SIZE (2UL << 20)
+#define DMEM_CHUNK_SIZE (HPAGE_PMD_SIZE)
#define DMEM_CHUNK_NPAGES (DMEM_CHUNK_SIZE >> PAGE_SHIFT)
+#define NR_CHUNKS (128)
enum nouveau_aper {
NOUVEAU_APER_VIRT,
@@ -83,9 +84,15 @@ struct nouveau_dmem {
struct list_head chunks;
struct mutex mutex;
struct page *free_pages;
+ struct folio *free_folios;
spinlock_t lock;
};
+struct nouveau_dmem_dma_info {
+ dma_addr_t dma_addr;
+ size_t size;
+};
+
static struct nouveau_dmem_chunk *nouveau_page_to_chunk(struct page *page)
{
return container_of(page_pgmap(page), struct nouveau_dmem_chunk,
@@ -112,10 +119,16 @@ static void nouveau_dmem_page_free(struc
{
struct nouveau_dmem_chunk *chunk = nouveau_page_to_chunk(page);
struct nouveau_dmem *dmem = chunk->drm->dmem;
+ struct folio *folio = page_folio(page);
spin_lock(&dmem->lock);
- page->zone_device_data = dmem->free_pages;
- dmem->free_pages = page;
+ if (folio_order(folio)) {
+ page->zone_device_data = dmem->free_folios;
+ dmem->free_folios = folio;
+ } else {
+ page->zone_device_data = dmem->free_pages;
+ dmem->free_pages = page;
+ }
WARN_ON(!chunk->callocated);
chunk->callocated--;
@@ -139,20 +152,28 @@ static void nouveau_dmem_fence_done(stru
}
}
-static int nouveau_dmem_copy_one(struct nouveau_drm *drm, struct page *spage,
- struct page *dpage, dma_addr_t *dma_addr)
+static int nouveau_dmem_copy_folio(struct nouveau_drm *drm,
+ struct folio *sfolio, struct folio *dfolio,
+ struct nouveau_dmem_dma_info *dma_info)
{
struct device *dev = drm->dev->dev;
+ struct page *dpage = folio_page(dfolio, 0);
+ struct page *spage = folio_page(sfolio, 0);
- lock_page(dpage);
+ folio_lock(dfolio);
- *dma_addr = dma_map_page(dev, dpage, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
- if (dma_mapping_error(dev, *dma_addr))
+ dma_info->dma_addr = dma_map_page(dev, dpage, 0, page_size(dpage),
+ DMA_BIDIRECTIONAL);
+ dma_info->size = page_size(dpage);
+ if (dma_mapping_error(dev, dma_info->dma_addr))
return -EIO;
- if (drm->dmem->migrate.copy_func(drm, 1, NOUVEAU_APER_HOST, *dma_addr,
- NOUVEAU_APER_VRAM, nouveau_dmem_page_addr(spage))) {
- dma_unmap_page(dev, *dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
+ if (drm->dmem->migrate.copy_func(drm, folio_nr_pages(sfolio),
+ NOUVEAU_APER_HOST, dma_info->dma_addr,
+ NOUVEAU_APER_VRAM,
+ nouveau_dmem_page_addr(spage))) {
+ dma_unmap_page(dev, dma_info->dma_addr, page_size(dpage),
+ DMA_BIDIRECTIONAL);
return -EIO;
}
@@ -165,22 +186,48 @@ static vm_fault_t nouveau_dmem_migrate_t
struct nouveau_dmem *dmem = drm->dmem;
struct nouveau_fence *fence;
struct nouveau_svmm *svmm;
- struct page *spage, *dpage;
- unsigned long src = 0, dst = 0;
- dma_addr_t dma_addr = 0;
+ struct page *dpage;
vm_fault_t ret = 0;
struct migrate_vma args = {
.vma = vmf->vma,
- .start = vmf->address,
- .end = vmf->address + PAGE_SIZE,
- .src = &src,
- .dst = &dst,
.pgmap_owner = drm->dev,
.fault_page = vmf->page,
- .flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE,
+ .flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE |
+ MIGRATE_VMA_SELECT_COMPOUND,
+ .src = NULL,
+ .dst = NULL,
};
+ unsigned int order, nr;
+ struct folio *sfolio, *dfolio;
+ struct nouveau_dmem_dma_info dma_info;
+
+ sfolio = page_folio(vmf->page);
+ order = folio_order(sfolio);
+ nr = 1 << order;
/*
+ * Handle partial unmap faults, where the folio is large, but
+ * the pmd is split.
+ */
+ if (vmf->pte) {
+ order = 0;
+ nr = 1;
+ }
+
+ if (order)
+ args.flags |= MIGRATE_VMA_SELECT_COMPOUND;
+
+ args.start = ALIGN_DOWN(vmf->address, (PAGE_SIZE << order));
+ args.vma = vmf->vma;
+ args.end = args.start + (PAGE_SIZE << order);
+ args.src = kcalloc(nr, sizeof(*args.src), GFP_KERNEL);
+ args.dst = kcalloc(nr, sizeof(*args.dst), GFP_KERNEL);
+
+ if (!args.src || !args.dst) {
+ ret = VM_FAULT_OOM;
+ goto err;
+ }
+ /*
* FIXME what we really want is to find some heuristic to migrate more
* than just one page on CPU fault. When such fault happens it is very
* likely that more surrounding page will CPU fault too.
@@ -190,20 +237,26 @@ static vm_fault_t nouveau_dmem_migrate_t
if (!args.cpages)
return 0;
- spage = migrate_pfn_to_page(src);
- if (!spage || !(src & MIGRATE_PFN_MIGRATE))
- goto done;
-
- dpage = alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO, vmf->vma, vmf->address);
- if (!dpage)
+ if (order)
+ dpage = folio_page(vma_alloc_folio(GFP_HIGHUSER | __GFP_ZERO,
+ order, vmf->vma, vmf->address), 0);
+ else
+ dpage = alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO, vmf->vma,
+ vmf->address);
+ if (!dpage) {
+ ret = VM_FAULT_OOM;
goto done;
+ }
- dst = migrate_pfn(page_to_pfn(dpage));
+ args.dst[0] = migrate_pfn(page_to_pfn(dpage));
+ if (order)
+ args.dst[0] |= MIGRATE_PFN_COMPOUND;
+ dfolio = page_folio(dpage);
- svmm = spage->zone_device_data;
+ svmm = folio_zone_device_data(sfolio);
mutex_lock(&svmm->mutex);
nouveau_svmm_invalidate(svmm, args.start, args.end);
- ret = nouveau_dmem_copy_one(drm, spage, dpage, &dma_addr);
+ ret = nouveau_dmem_copy_folio(drm, sfolio, dfolio, &dma_info);
mutex_unlock(&svmm->mutex);
if (ret) {
ret = VM_FAULT_SIGBUS;
@@ -213,25 +266,40 @@ static vm_fault_t nouveau_dmem_migrate_t
nouveau_fence_new(&fence, dmem->migrate.chan);
migrate_vma_pages(&args);
nouveau_dmem_fence_done(&fence);
- dma_unmap_page(drm->dev->dev, dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
+ dma_unmap_page(drm->dev->dev, dma_info.dma_addr, PAGE_SIZE,
+ DMA_BIDIRECTIONAL);
done:
migrate_vma_finalize(&args);
+err:
+ kfree(args.src);
+ kfree(args.dst);
return ret;
}
+static void nouveau_dmem_folio_split(struct folio *head, struct folio *tail)
+{
+ if (tail == NULL)
+ return;
+ tail->pgmap = head->pgmap;
+ tail->mapping = head->mapping;
+ folio_set_zone_device_data(tail, folio_zone_device_data(head));
+}
+
static const struct dev_pagemap_ops nouveau_dmem_pagemap_ops = {
.page_free = nouveau_dmem_page_free,
.migrate_to_ram = nouveau_dmem_migrate_to_ram,
+ .folio_split = nouveau_dmem_folio_split,
};
static int
-nouveau_dmem_chunk_alloc(struct nouveau_drm *drm, struct page **ppage)
+nouveau_dmem_chunk_alloc(struct nouveau_drm *drm, struct page **ppage,
+ bool is_large)
{
struct nouveau_dmem_chunk *chunk;
struct resource *res;
struct page *page;
void *ptr;
- unsigned long i, pfn_first;
+ unsigned long i, pfn_first, pfn;
int ret;
chunk = kzalloc(sizeof(*chunk), GFP_KERNEL);
@@ -241,7 +309,7 @@ nouveau_dmem_chunk_alloc(struct nouveau_
}
/* Allocate unused physical address space for device private pages. */
- res = request_free_mem_region(&iomem_resource, DMEM_CHUNK_SIZE,
+ res = request_free_mem_region(&iomem_resource, DMEM_CHUNK_SIZE * NR_CHUNKS,
"nouveau_dmem");
if (IS_ERR(res)) {
ret = PTR_ERR(res);
@@ -274,16 +342,40 @@ nouveau_dmem_chunk_alloc(struct nouveau_
pfn_first = chunk->pagemap.range.start >> PAGE_SHIFT;
page = pfn_to_page(pfn_first);
spin_lock(&drm->dmem->lock);
- for (i = 0; i < DMEM_CHUNK_NPAGES - 1; ++i, ++page) {
- page->zone_device_data = drm->dmem->free_pages;
- drm->dmem->free_pages = page;
+
+ pfn = pfn_first;
+ for (i = 0; i < NR_CHUNKS; i++) {
+ int j;
+
+ if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) || !is_large) {
+ for (j = 0; j < DMEM_CHUNK_NPAGES - 1; j++, pfn++) {
+ page = pfn_to_page(pfn);
+ page->zone_device_data = drm->dmem->free_pages;
+ drm->dmem->free_pages = page;
+ }
+ } else {
+ page = pfn_to_page(pfn);
+ page->zone_device_data = drm->dmem->free_folios;
+ drm->dmem->free_folios = page_folio(page);
+ pfn += DMEM_CHUNK_NPAGES;
+ }
+ }
+
+ /* Move to next page */
+ if (is_large) {
+ *ppage = &drm->dmem->free_folios->page;
+ drm->dmem->free_folios = (*ppage)->zone_device_data;
+ } else {
+ *ppage = drm->dmem->free_pages;
+ drm->dmem->free_pages = (*ppage)->zone_device_data;
}
- *ppage = page;
+
chunk->callocated++;
spin_unlock(&drm->dmem->lock);
- NV_INFO(drm, "DMEM: registered %ldMB of device memory\n",
- DMEM_CHUNK_SIZE >> 20);
+ NV_INFO(drm, "DMEM: registered %ldMB of %sdevice memory %lx %lx\n",
+ NR_CHUNKS * DMEM_CHUNK_SIZE >> 20, is_large ? "THP " : "", pfn_first,
+ nouveau_dmem_page_addr(page));
return 0;
@@ -298,27 +390,41 @@ out:
}
static struct page *
-nouveau_dmem_page_alloc_locked(struct nouveau_drm *drm)
+nouveau_dmem_page_alloc_locked(struct nouveau_drm *drm, bool is_large)
{
struct nouveau_dmem_chunk *chunk;
struct page *page = NULL;
+ struct folio *folio = NULL;
int ret;
+ unsigned int order = 0;
spin_lock(&drm->dmem->lock);
- if (drm->dmem->free_pages) {
+ if (is_large && drm->dmem->free_folios) {
+ folio = drm->dmem->free_folios;
+ page = &folio->page;
+ drm->dmem->free_folios = page->zone_device_data;
+ chunk = nouveau_page_to_chunk(&folio->page);
+ chunk->callocated++;
+ spin_unlock(&drm->dmem->lock);
+ order = ilog2(DMEM_CHUNK_NPAGES);
+ } else if (!is_large && drm->dmem->free_pages) {
page = drm->dmem->free_pages;
drm->dmem->free_pages = page->zone_device_data;
chunk = nouveau_page_to_chunk(page);
chunk->callocated++;
spin_unlock(&drm->dmem->lock);
+ folio = page_folio(page);
} else {
spin_unlock(&drm->dmem->lock);
- ret = nouveau_dmem_chunk_alloc(drm, &page);
+ ret = nouveau_dmem_chunk_alloc(drm, &page, is_large);
if (ret)
return NULL;
+ folio = page_folio(page);
+ if (is_large)
+ order = ilog2(DMEM_CHUNK_NPAGES);
}
- zone_device_page_init(page);
+ zone_device_folio_init(folio, order);
return page;
}
@@ -369,12 +475,12 @@ nouveau_dmem_evict_chunk(struct nouveau_
{
unsigned long i, npages = range_len(&chunk->pagemap.range) >> PAGE_SHIFT;
unsigned long *src_pfns, *dst_pfns;
- dma_addr_t *dma_addrs;
+ struct nouveau_dmem_dma_info *dma_info;
struct nouveau_fence *fence;
src_pfns = kvcalloc(npages, sizeof(*src_pfns), GFP_KERNEL | __GFP_NOFAIL);
dst_pfns = kvcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL | __GFP_NOFAIL);
- dma_addrs = kvcalloc(npages, sizeof(*dma_addrs), GFP_KERNEL | __GFP_NOFAIL);
+ dma_info = kvcalloc(npages, sizeof(*dma_info), GFP_KERNEL | __GFP_NOFAIL);
migrate_device_range(src_pfns, chunk->pagemap.range.start >> PAGE_SHIFT,
npages);
@@ -382,17 +488,28 @@ nouveau_dmem_evict_chunk(struct nouveau_
for (i = 0; i < npages; i++) {
if (src_pfns[i] & MIGRATE_PFN_MIGRATE) {
struct page *dpage;
+ struct folio *folio = page_folio(
+ migrate_pfn_to_page(src_pfns[i]));
+ unsigned int order = folio_order(folio);
+
+ if (src_pfns[i] & MIGRATE_PFN_COMPOUND) {
+ dpage = folio_page(
+ folio_alloc(
+ GFP_HIGHUSER_MOVABLE, order), 0);
+ } else {
+ /*
+ * _GFP_NOFAIL because the GPU is going away and there
+ * is nothing sensible we can do if we can't copy the
+ * data back.
+ */
+ dpage = alloc_page(GFP_HIGHUSER | __GFP_NOFAIL);
+ }
- /*
- * _GFP_NOFAIL because the GPU is going away and there
- * is nothing sensible we can do if we can't copy the
- * data back.
- */
- dpage = alloc_page(GFP_HIGHUSER | __GFP_NOFAIL);
dst_pfns[i] = migrate_pfn(page_to_pfn(dpage));
- nouveau_dmem_copy_one(chunk->drm,
- migrate_pfn_to_page(src_pfns[i]), dpage,
- &dma_addrs[i]);
+ nouveau_dmem_copy_folio(chunk->drm,
+ page_folio(migrate_pfn_to_page(src_pfns[i])),
+ page_folio(dpage),
+ &dma_info[i]);
}
}
@@ -403,8 +520,9 @@ nouveau_dmem_evict_chunk(struct nouveau_
kvfree(src_pfns);
kvfree(dst_pfns);
for (i = 0; i < npages; i++)
- dma_unmap_page(chunk->drm->dev->dev, dma_addrs[i], PAGE_SIZE, DMA_BIDIRECTIONAL);
- kvfree(dma_addrs);
+ dma_unmap_page(chunk->drm->dev->dev, dma_info[i].dma_addr,
+ dma_info[i].size, DMA_BIDIRECTIONAL);
+ kvfree(dma_info);
}
void
@@ -607,31 +725,36 @@ nouveau_dmem_init(struct nouveau_drm *dr
static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm,
struct nouveau_svmm *svmm, unsigned long src,
- dma_addr_t *dma_addr, u64 *pfn)
+ struct nouveau_dmem_dma_info *dma_info, u64 *pfn)
{
struct device *dev = drm->dev->dev;
struct page *dpage, *spage;
unsigned long paddr;
+ bool is_large = false;
+ unsigned long mpfn;
spage = migrate_pfn_to_page(src);
if (!(src & MIGRATE_PFN_MIGRATE))
goto out;
- dpage = nouveau_dmem_page_alloc_locked(drm);
+ is_large = src & MIGRATE_PFN_COMPOUND;
+ dpage = nouveau_dmem_page_alloc_locked(drm, is_large);
if (!dpage)
goto out;
paddr = nouveau_dmem_page_addr(dpage);
if (spage) {
- *dma_addr = dma_map_page(dev, spage, 0, page_size(spage),
+ dma_info->dma_addr = dma_map_page(dev, spage, 0, page_size(spage),
DMA_BIDIRECTIONAL);
- if (dma_mapping_error(dev, *dma_addr))
+ dma_info->size = page_size(spage);
+ if (dma_mapping_error(dev, dma_info->dma_addr))
goto out_free_page;
- if (drm->dmem->migrate.copy_func(drm, 1,
- NOUVEAU_APER_VRAM, paddr, NOUVEAU_APER_HOST, *dma_addr))
+ if (drm->dmem->migrate.copy_func(drm, folio_nr_pages(page_folio(spage)),
+ NOUVEAU_APER_VRAM, paddr, NOUVEAU_APER_HOST,
+ dma_info->dma_addr))
goto out_dma_unmap;
} else {
- *dma_addr = DMA_MAPPING_ERROR;
+ dma_info->dma_addr = DMA_MAPPING_ERROR;
if (drm->dmem->migrate.clear_func(drm, page_size(dpage),
NOUVEAU_APER_VRAM, paddr))
goto out_free_page;
@@ -642,10 +765,13 @@ static unsigned long nouveau_dmem_migrat
((paddr >> PAGE_SHIFT) << NVIF_VMM_PFNMAP_V0_ADDR_SHIFT);
if (src & MIGRATE_PFN_WRITE)
*pfn |= NVIF_VMM_PFNMAP_V0_W;
- return migrate_pfn(page_to_pfn(dpage));
+ mpfn = migrate_pfn(page_to_pfn(dpage));
+ if (folio_order(page_folio(dpage)))
+ mpfn |= MIGRATE_PFN_COMPOUND;
+ return mpfn;
out_dma_unmap:
- dma_unmap_page(dev, *dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
+ dma_unmap_page(dev, dma_info->dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
out_free_page:
nouveau_dmem_page_free_locked(drm, dpage);
out:
@@ -655,27 +781,38 @@ out:
static void nouveau_dmem_migrate_chunk(struct nouveau_drm *drm,
struct nouveau_svmm *svmm, struct migrate_vma *args,
- dma_addr_t *dma_addrs, u64 *pfns)
+ struct nouveau_dmem_dma_info *dma_info, u64 *pfns)
{
struct nouveau_fence *fence;
unsigned long addr = args->start, nr_dma = 0, i;
+ unsigned long order = 0;
+
+ for (i = 0; addr < args->end; ) {
+ struct folio *folio;
- for (i = 0; addr < args->end; i++) {
args->dst[i] = nouveau_dmem_migrate_copy_one(drm, svmm,
- args->src[i], dma_addrs + nr_dma, pfns + i);
- if (!dma_mapping_error(drm->dev->dev, dma_addrs[nr_dma]))
+ args->src[i], dma_info + nr_dma, pfns + i);
+ if (!args->dst[i]) {
+ i++;
+ addr += PAGE_SIZE;
+ continue;
+ }
+ if (!dma_mapping_error(drm->dev->dev, dma_info[nr_dma].dma_addr))
nr_dma++;
- addr += PAGE_SIZE;
+ folio = page_folio(migrate_pfn_to_page(args->dst[i]));
+ order = folio_order(folio);
+ i += 1 << order;
+ addr += (1 << order) * PAGE_SIZE;
}
nouveau_fence_new(&fence, drm->dmem->migrate.chan);
migrate_vma_pages(args);
nouveau_dmem_fence_done(&fence);
- nouveau_pfns_map(svmm, args->vma->vm_mm, args->start, pfns, i);
+ nouveau_pfns_map(svmm, args->vma->vm_mm, args->start, pfns, i, order);
while (nr_dma--) {
- dma_unmap_page(drm->dev->dev, dma_addrs[nr_dma], PAGE_SIZE,
- DMA_BIDIRECTIONAL);
+ dma_unmap_page(drm->dev->dev, dma_info[nr_dma].dma_addr,
+ dma_info[nr_dma].size, DMA_BIDIRECTIONAL);
}
migrate_vma_finalize(args);
}
@@ -688,20 +825,27 @@ nouveau_dmem_migrate_vma(struct nouveau_
unsigned long end)
{
unsigned long npages = (end - start) >> PAGE_SHIFT;
- unsigned long max = min(SG_MAX_SINGLE_ALLOC, npages);
- dma_addr_t *dma_addrs;
+ unsigned long max = npages;
struct migrate_vma args = {
.vma = vma,
.start = start,
.pgmap_owner = drm->dev,
- .flags = MIGRATE_VMA_SELECT_SYSTEM,
+ .flags = MIGRATE_VMA_SELECT_SYSTEM
+ | MIGRATE_VMA_SELECT_COMPOUND,
};
unsigned long i;
u64 *pfns;
int ret = -ENOMEM;
+ struct nouveau_dmem_dma_info *dma_info;
- if (drm->dmem == NULL)
- return -ENODEV;
+ if (drm->dmem == NULL) {
+ ret = -ENODEV;
+ goto out;
+ }
+
+ if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
+ if (max > (unsigned long)HPAGE_PMD_NR)
+ max = (unsigned long)HPAGE_PMD_NR;
args.src = kcalloc(max, sizeof(*args.src), GFP_KERNEL);
if (!args.src)
@@ -710,8 +854,8 @@ nouveau_dmem_migrate_vma(struct nouveau_
if (!args.dst)
goto out_free_src;
- dma_addrs = kmalloc_array(max, sizeof(*dma_addrs), GFP_KERNEL);
- if (!dma_addrs)
+ dma_info = kmalloc_array(max, sizeof(*dma_info), GFP_KERNEL);
+ if (!dma_info)
goto out_free_dst;
pfns = nouveau_pfns_alloc(max);
@@ -729,7 +873,7 @@ nouveau_dmem_migrate_vma(struct nouveau_
goto out_free_pfns;
if (args.cpages)
- nouveau_dmem_migrate_chunk(drm, svmm, &args, dma_addrs,
+ nouveau_dmem_migrate_chunk(drm, svmm, &args, dma_info,
pfns);
args.start = args.end;
}
@@ -738,7 +882,7 @@ nouveau_dmem_migrate_vma(struct nouveau_
out_free_pfns:
nouveau_pfns_free(pfns);
out_free_dma:
- kfree(dma_addrs);
+ kfree(dma_info);
out_free_dst:
kfree(args.dst);
out_free_src:
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c~gpu-drm-nouveau-enable-thp-support-for-gpu-memory-migration
+++ a/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -921,12 +921,14 @@ nouveau_pfns_free(u64 *pfns)
void
nouveau_pfns_map(struct nouveau_svmm *svmm, struct mm_struct *mm,
- unsigned long addr, u64 *pfns, unsigned long npages)
+ unsigned long addr, u64 *pfns, unsigned long npages,
+ unsigned int page_shift)
{
struct nouveau_pfnmap_args *args = nouveau_pfns_to_args(pfns);
args->p.addr = addr;
- args->p.size = npages << PAGE_SHIFT;
+ args->p.size = npages << page_shift;
+ args->p.page = page_shift;
mutex_lock(&svmm->mutex);
--- a/drivers/gpu/drm/nouveau/nouveau_svm.h~gpu-drm-nouveau-enable-thp-support-for-gpu-memory-migration
+++ a/drivers/gpu/drm/nouveau/nouveau_svm.h
@@ -33,7 +33,8 @@ void nouveau_svmm_invalidate(struct nouv
u64 *nouveau_pfns_alloc(unsigned long npages);
void nouveau_pfns_free(u64 *pfns);
void nouveau_pfns_map(struct nouveau_svmm *svmm, struct mm_struct *mm,
- unsigned long addr, u64 *pfns, unsigned long npages);
+ unsigned long addr, u64 *pfns, unsigned long npages,
+ unsigned int page_shift);
#else /* IS_ENABLED(CONFIG_DRM_NOUVEAU_SVM) */
static inline void nouveau_svm_init(struct nouveau_drm *drm) {}
static inline void nouveau_svm_fini(struct nouveau_drm *drm) {}
_
Patches currently in -mm which might be from balbirs@nvidia.com are
mm-zone_device-support-large-zone-device-private-folios.patch
mm-huge_memory-add-device-private-thp-support-to-pmd-operations.patch
mm-rmap-extend-rmap-and-migration-support-device-private-entries.patch
mm-huge_memory-implement-device-private-thp-splitting.patch
mm-migrate_device-handle-partially-mapped-folios-during-collection.patch
mm-migrate_device-implement-thp-migration-of-zone-device-pages.patch
mm-memory-fault-add-thp-fault-handling-for-zone-device-private-pages.patch
lib-test_hmm-add-zone-device-private-thp-test-infrastructure.patch
mm-memremap-add-driver-callback-support-for-folio-splitting.patch
mm-migrate_device-add-thp-splitting-during-migration.patch
lib-test_hmm-add-large-page-allocation-failure-testing.patch
selftests-mm-hmm-tests-new-tests-for-zone-device-thp-migration.patch
selftests-mm-hmm-tests-new-throughput-tests-including-thp.patch
gpu-drm-nouveau-enable-thp-support-for-gpu-memory-migration.patch