From: Matthew Brost <matthew.brost@intel.com>
To: Matthew Auld <matthew.auld@intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
himal.prasad.ghimiray@intel.com, apopple@nvidia.com,
airlied@gmail.com, simona.vetter@ffwll.ch,
felix.kuehling@amd.com, dakr@kernel.org
Subject: Re: [PATCH v5 27/32] drm/xe: Add SVM VRAM migration
Date: Fri, 21 Feb 2025 07:22:56 -0800
Message-ID: <Z7iaUA92rdgcQ/1s@lstrano-desk.jf.intel.com>
In-Reply-To: <e8753d24-a0bd-4f53-bf56-d72475cb73ee@intel.com>
On Fri, Feb 21, 2025 at 03:15:38PM +0000, Matthew Auld wrote:
> On 20/02/2025 19:55, Matthew Brost wrote:
> > On Thu, Feb 20, 2025 at 04:59:29PM +0100, Thomas Hellström wrote:
> > > On Thu, 2025-02-20 at 15:53 +0000, Matthew Auld wrote:
> > > > On 13/02/2025 02:11, Matthew Brost wrote:
> > > > > Migration is implemented with range granularity, with VRAM backing
> > > > > being
> > > > > a VM private TTM BO (i.e., shares dma-resv with VM). The lifetime
> > > > > of the
> > > > > TTM BO is limited to when the SVM range is in VRAM (i.e., when a
> > > > > VRAM
> > > > > SVM range is migrated to SRAM, the TTM BO is destroyed).
> > > > >
> > > > > The design choice for using TTM BO for VRAM backing store, as
> > > > > opposed to
> > > > > direct buddy allocation, is as follows:
> > > > >
> > > > > - DRM buddy allocations are not at page granularity, offering no
> > > > > advantage over a BO.
> > > > > - Unified eviction is required (SVM VRAM and TTM BOs need to be
> > > > > able to
> > > > > evict each other).
> > > > > - For exhaustive eviction [1], SVM VRAM allocations will almost
> > > > > certainly
> > > > > require a dma-resv.
> > > > > > - Likely allocation size is 2M, which makes the size of a BO (872)
> > > > > > acceptable per allocation (872 / 2M == .0004158).
> > > > >
> > > > > With this, using TTM BO for VRAM backing store seems to be an
> > > > > obvious
> > > > > choice as it allows leveraging of the TTM eviction code.
> > > > >
> > > > > Current migration policy is migrate any SVM range greater than or
> > > > > equal
> > > > > to 64k once.
> > > > >
> > > > > [1] https://patchwork.freedesktop.org/series/133643/
> > > > >
> > > > > v2:
> > > > > - Rebase on latest GPU SVM
> > > > > - Retry page fault on get pages returning mixed allocation
> > > > > - Use drm_gpusvm_devmem
> > > > > v3:
> > > > > - Use new BO flags
> > > > > - New range structure (Thomas)
> > > > > - Hide migration behind Kconfig
> > > > > - Kernel doc (Thomas)
> > > > > - Use check_pages_threshold
> > > > > v4:
> > > > > - Don't evict partial unmaps in garbage collector (Thomas)
> > > > > - Use %pe to print errors (Thomas)
> > > > > - Use %p to print pointers (Thomas)
> > > > > v5:
> > > > > - Use range size helper (Thomas)
> > > > > - Make BO external (Thomas)
> > > > > - Set tile to NULL for BO creation (Thomas)
> > > > > - Drop BO mirror flag (Thomas)
> > > > > - Hold BO dma-resv lock across migration (Auld, Thomas)
> > > > >
> > > > > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > > > > ---
> > > > > drivers/gpu/drm/xe/xe_svm.c | 111
> > > > > ++++++++++++++++++++++++++++++++++--
> > > > > drivers/gpu/drm/xe/xe_svm.h | 5 ++
> > > > > 2 files changed, 112 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/xe/xe_svm.c
> > > > > b/drivers/gpu/drm/xe/xe_svm.c
> > > > > index 0a78a838508c..2e1e0f31c1a8 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_svm.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_svm.c
> > > > > @@ -32,6 +32,11 @@ static unsigned long xe_svm_range_end(struct
> > > > > xe_svm_range *range)
> > > > > return drm_gpusvm_range_end(&range->base);
> > > > > }
> > > > > +static unsigned long xe_svm_range_size(struct xe_svm_range *range)
> > > > > +{
> > > > > + return drm_gpusvm_range_size(&range->base);
> > > > > +}
> > > > > +
> > > > > static void *xe_svm_devm_owner(struct xe_device *xe)
> > > > > {
> > > > > return xe;
> > > > > @@ -512,7 +517,6 @@ static int xe_svm_populate_devmem_pfn(struct
> > > > > drm_gpusvm_devmem *devmem_allocatio
> > > > > return 0;
> > > > > }
> > > > > -__maybe_unused
> > > > > static const struct drm_gpusvm_devmem_ops gpusvm_devmem_ops = {
> > > > > .devmem_release = xe_svm_devmem_release,
> > > > > .populate_devmem_pfn = xe_svm_populate_devmem_pfn,
> > > > > @@ -592,6 +596,71 @@ static bool xe_svm_range_is_valid(struct
> > > > > xe_svm_range *range,
> > > > > return (range->tile_present & ~range->tile_invalidated) &
> > > > > BIT(tile->id);
> > > > > }
> > > > > +static struct xe_vram_region *tile_to_vr(struct xe_tile *tile)
> > > > > +{
> > > > > + return &tile->mem.vram;
> > > > > +}
> > > > > +
> > > > > +static struct xe_bo *xe_svm_alloc_vram(struct xe_vm *vm, struct
> > > > > xe_tile *tile,
> > > > > + struct xe_svm_range *range,
> > > > > + const struct drm_gpusvm_ctx
> > > > > *ctx)
> > > > > +{
> > > > > + struct mm_struct *mm = vm->svm.gpusvm.mm;
> > > > > + struct xe_vram_region *vr = tile_to_vr(tile);
> > > > > + struct drm_buddy_block *block;
> > > > > + struct list_head *blocks;
> > > > > + struct xe_bo *bo;
> > > > > + ktime_t end = 0;
> > > > > + int err;
> > > > > +
> > > > > + if (!mmget_not_zero(mm))
> > > > > + return ERR_PTR(-EFAULT);
> > > > > + mmap_read_lock(mm);
> > > > > +
> > > > > +retry:
> > > > > + bo = xe_bo_create_locked(tile_to_xe(tile), NULL, NULL,
> > > > > + xe_svm_range_size(range),
> > > > > + ttm_bo_type_device,
> > > > > + XE_BO_FLAG_VRAM_IF_DGFX(tile));
> > > >
> > > > Just to confirm, there is nothing scary with the vram still
> > > > potentially
> > > > being used by the GPU at this point (like with an async eviction +
> > > > clear
> > > > op), right? At some point we have some kind of synchronisation before
> > > > the user can touch this memory?
> > >
> > > Good point. I don't think there is.
> > >
> >
> > Agree - there shouldn't be anything scary happening here. The new VRAM
> > is allocated from buddy, which doesn't have a dma-resv attached to it (and
> > thus no outstanding fences), and a new dma-resv object is created. A clear
> > is issued and then we do an immediate copy, but those operations are
> > serialized on the same queue.
>
> So there is basically always a copy operation after this? Is it not possible
> to have completely empty entries on the CPU side such that there is nothing
> to actually copy?
>
It is possible for no copy to be issued if the CPU has yet to fault in
the pages. In that case only the clear is issued, and the bind job waits
on the clear via the dma-resv fences in the KERNEL slots.

For the full picture: if a copy is issued, it is waited on directly in
the migration code, as we can't release the CPU pages until the copy
completes.

Matt
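
The ordering described in the reply above can be illustrated with a toy
model. This is only a sketch under stated assumptions, not Xe code: every
toy_* name is hypothetical, the "queue" simply retires jobs in submission
order, and a plain bool stands in for a dma-resv KERNEL-slot fence. It
shows the two cases: a clear with no copy (the bind has to wait on the
clear), and a clear followed by a copy that the migration path waits on
before returning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a fence: set once the job it tracks is done. */
struct toy_fence {
	bool signaled;
};

#define TOY_QUEUE_DEPTH 4

/* Hypothetical in-order queue: jobs retire strictly in submission order. */
struct toy_queue {
	struct toy_fence *jobs[TOY_QUEUE_DEPTH];
	size_t head;	/* next job to retire */
	size_t tail;	/* next free slot */
};

static void toy_queue_submit(struct toy_queue *q, struct toy_fence *f)
{
	assert(q->tail < TOY_QUEUE_DEPTH);
	f->signaled = false;
	q->jobs[q->tail++] = f;
}

/* Retire jobs in order until @f has signaled. */
static void toy_fence_wait(struct toy_queue *q, struct toy_fence *f)
{
	while (!f->signaled) {
		assert(q->head < q->tail);	/* @f must have been submitted */
		q->jobs[q->head++]->signaled = true;
	}
}

/* Hypothetical stand-in for a freshly allocated VRAM BO. */
struct toy_bo {
	struct toy_fence clear;	/* models the fence kept in a KERNEL slot */
	struct toy_fence copy;	/* models the optional SRAM -> VRAM copy */
	bool copy_issued;
};

/*
 * Migration: the clear is always issued. A copy is issued only if the CPU
 * has faulted in pages, and in that case it is waited on here, because the
 * CPU pages cannot be released before the copy completes.
 */
static void toy_migrate_to_vram(struct toy_queue *q, struct toy_bo *bo,
				bool cpu_pages_present)
{
	toy_queue_submit(q, &bo->clear);
	bo->copy.signaled = false;
	bo->copy_issued = cpu_pages_present;

	if (cpu_pages_present) {
		toy_queue_submit(q, &bo->copy);
		toy_fence_wait(q, &bo->copy);
		/* Same queue, so the clear retired before the copy. */
		assert(bo->clear.signaled);
	}
}

/* Bind: must not run before the clear fence has signaled. */
static void toy_bind(struct toy_queue *q, struct toy_bo *bo)
{
	toy_fence_wait(q, &bo->clear);
}
```

In the no-copy case, toy_migrate_to_vram() returns with the clear still
pending and toy_bind() is what waits for it. In the copy case, both jobs
have already retired by the time the bind runs.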
> >
> > Matt
> >
> > > >
> > > > > + if (IS_ERR(bo)) {
> > > > > + err = PTR_ERR(bo);
> > > > > + if (xe_vm_validate_should_retry(NULL, err, &end))
> > > > > + goto retry;
> > > > > + goto unlock;
> > > > > + }
> > > > > +
> > > > > + drm_gpusvm_devmem_init(&bo->devmem_allocation,
> > > > > + vm->xe->drm.dev, mm,
> > > > > + &gpusvm_devmem_ops,
> > > > > + &tile->mem.vram.dpagemap,
> > > > > + xe_svm_range_size(range));
> > > > > +
> > > > > + blocks = &to_xe_ttm_vram_mgr_resource(bo->ttm.resource)-
> > > > > > blocks;
> > > > > + list_for_each_entry(block, blocks, link)
> > > > > + block->private = vr;
> > > > > +
> > > > > + /*
> > > > > + * Take ref because as soon as
> > > > > drm_gpusvm_migrate_to_devmem succeeds the
> > > > > + * creation ref can be dropped upon CPU fault or unmap.
> > > > > + */
> > > > > + xe_bo_get(bo);
> > > > > +
> > > > > + err = drm_gpusvm_migrate_to_devmem(&vm->svm.gpusvm,
> > > > > &range->base,
> > > > > + &bo->devmem_allocation,
> > > > > ctx);
> > > > > + xe_bo_unlock(bo);
> > > > > + if (err) {
> > > > > + xe_bo_put(bo); /* Local ref */
> > > > > + xe_bo_put(bo); /* Creation ref */
> > > > > + bo = ERR_PTR(err);
> > > > > + }
> > > > > +
> > > > > +unlock:
> > > > > + mmap_read_unlock(mm);
> > > > > + mmput(mm);
> > > > > +
> > > > > + return bo;
> > > > > +}
> > > > > +
> > > > > /**
> > > > > * xe_svm_handle_pagefault() - SVM handle page fault
> > > > > * @vm: The VM.
> > > > > @@ -600,7 +669,8 @@ static bool xe_svm_range_is_valid(struct
> > > > > xe_svm_range *range,
> > > > > * @fault_addr: The GPU fault address.
> > > > > * @atomic: The fault atomic access bit.
> > > > > *
> > > > > - * Create GPU bindings for a SVM page fault.
> > > > > + * Create GPU bindings for a SVM page fault. Optionally migrate to
> > > > > device
> > > > > + * memory.
> > > > > *
> > > > > * Return: 0 on success, negative error code on error.
> > > > > */
> > > > > @@ -608,11 +678,18 @@ int xe_svm_handle_pagefault(struct xe_vm *vm,
> > > > > struct xe_vma *vma,
> > > > > struct xe_tile *tile, u64 fault_addr,
> > > > > bool atomic)
> > > > > {
> > > > > - struct drm_gpusvm_ctx ctx = { .read_only =
> > > > > xe_vma_read_only(vma), };
> > > > > + struct drm_gpusvm_ctx ctx = {
> > > > > + .read_only = xe_vma_read_only(vma),
> > > > > + .devmem_possible = IS_DGFX(vm->xe) &&
> > > > > + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR),
> > > > > + .check_pages_threshold = IS_DGFX(vm->xe) &&
> > > > > + IS_ENABLED(CONFIG_DRM_XE_DEVMEM_MIRROR) ?
> > > > > SZ_64K : 0,
> > > > > + };
> > > > > struct xe_svm_range *range;
> > > > > struct drm_gpusvm_range *r;
> > > > > struct drm_exec exec;
> > > > > struct dma_fence *fence;
> > > > > + struct xe_bo *bo = NULL;
> > > > > ktime_t end = 0;
> > > > > int err;
> > > > > @@ -620,6 +697,9 @@ int xe_svm_handle_pagefault(struct xe_vm *vm,
> > > > > struct xe_vma *vma,
> > > > > xe_assert(vm->xe, xe_vma_is_cpu_addr_mirror(vma));
> > > > > retry:
> > > > > + xe_bo_put(bo);
> > > > > + bo = NULL;
> > > > > +
> > > > > /* Always process UNMAPs first so view SVM ranges is
> > > > > current */
> > > > > err = xe_svm_garbage_collector(vm);
> > > > > if (err)
> > > > > @@ -635,9 +715,31 @@ int xe_svm_handle_pagefault(struct xe_vm *vm,
> > > > > struct xe_vma *vma,
> > > > > if (xe_svm_range_is_valid(range, tile))
> > > > > return 0;
> > > > > + /* XXX: Add migration policy, for now migrate range once
> > > > > */
> > > > > + if (!range->migrated && range->base.flags.migrate_devmem
> > > > > &&
> > > > > + xe_svm_range_size(range) >= SZ_64K) {
> > > > > + range->migrated = true;
> > > > > +
> > > > > + bo = xe_svm_alloc_vram(vm, tile, range, &ctx);
> > > > > + if (IS_ERR(bo)) {
> > > > > + drm_info(&vm->xe->drm,
> > > > > + "VRAM allocation failed, falling
> > > > > back to retrying, asid=%u, errno %pe\n",
> > > > > + vm->usm.asid, bo);
> > > > > + bo = NULL;
> > > > > + goto retry;
> > > > > + }
> > > > > + }
> > > > > +
> > > > > err = drm_gpusvm_range_get_pages(&vm->svm.gpusvm, r,
> > > > > &ctx);
> > > > > - if (err == -EFAULT || err == -EPERM) /* Corner where
> > > > > CPU mappings have changed */
> > > > > + /* Corner where CPU mappings have changed */
> > > > > + if (err == -EOPNOTSUPP || err == -EFAULT || err == -EPERM)
> > > > > {
> > > > > + if (err == -EOPNOTSUPP)
> > > > > + drm_gpusvm_range_evict(&vm->svm.gpusvm,
> > > > > &range->base);
> > > > > + drm_info(&vm->xe->drm,
> > > > > + "Get pages failed, falling back to
> > > > > retrying, asid=%u, gpusvm=%p, errno %pe\n",
> > > > > + vm->usm.asid, &vm->svm.gpusvm,
> > > > > ERR_PTR(err));
> > > > > goto retry;
> > > > > + }
> > > > > if (err)
> > > > > goto err_out;
> > > > > @@ -668,6 +770,7 @@ int xe_svm_handle_pagefault(struct xe_vm *vm,
> > > > > struct xe_vma *vma,
> > > > > dma_fence_put(fence);
> > > > > err_out:
> > > > > + xe_bo_put(bo);
> > > > > return err;
> > > > > }
> > > > > diff --git a/drivers/gpu/drm/xe/xe_svm.h
> > > > > b/drivers/gpu/drm/xe/xe_svm.h
> > > > > index 0fa525d34987..984a61651d9e 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_svm.h
> > > > > +++ b/drivers/gpu/drm/xe/xe_svm.h
> > > > > @@ -35,6 +35,11 @@ struct xe_svm_range {
> > > > > * range. Protected by GPU SVM notifier lock.
> > > > > */
> > > > > u8 tile_invalidated;
> > > > > + /**
> > > > > + * @migrated: Range has been migrated to device memory,
> > > > > protected by
> > > > > + * GPU fault handler locking.
> > > > > + */
> > > > > + u8 migrated :1;
> > > > > };
> > > > > int xe_devm_add(struct xe_tile *tile, struct xe_vram_region *vr);
> > > >
> > >
>
Thread overview: 75+ messages
2025-02-13 2:10 [PATCH v5 00/32] Introduce GPU SVM and Xe SVM implementation Matthew Brost
2025-02-13 2:10 ` [PATCH v5 01/32] drm/xe: Retry BO allocation Matthew Brost
2025-02-13 2:10 ` [PATCH v5 02/32] mm/migrate: Add migrate_device_pfns Matthew Brost
2025-02-13 2:10 ` [PATCH v5 03/32] mm/migrate: Trylock device page in do_swap_page Matthew Brost
2025-02-19 5:36 ` Alistair Popple
2025-02-19 6:08 ` Matthew Brost
2025-02-19 6:25 ` Alistair Popple
2025-02-20 13:28 ` Gwan-gyeong Mun
2025-02-20 20:03 ` Matthew Brost
2025-02-13 2:10 ` [PATCH v5 04/32] drm/pagemap: Add DRM pagemap Matthew Brost
2025-02-20 13:53 ` Gwan-gyeong Mun
2025-02-13 2:10 ` [PATCH v5 05/32] drm/xe/bo: Introduce xe_bo_put_async Matthew Brost
2025-02-14 9:52 ` Ghimiray, Himal Prasad
2025-02-20 14:33 ` Gwan-gyeong Mun
2025-02-13 2:10 ` [PATCH v5 06/32] drm/gpusvm: Add support for GPU Shared Virtual Memory Matthew Brost
2025-02-19 8:59 ` Thomas Hellström
2025-02-13 2:10 ` [PATCH v5 07/32] drm/xe: Select DRM_GPUSVM Kconfig Matthew Brost
2025-02-13 2:10 ` [PATCH v5 08/32] drm/xe/uapi: Add DRM_XE_VM_BIND_FLAG_CPU_ADDR_MIRROR Matthew Brost
2025-02-13 2:10 ` [PATCH v5 09/32] drm/xe: Add SVM init / close / fini to faulting VMs Matthew Brost
2025-02-13 2:10 ` [PATCH v5 10/32] drm/xe: Add dma_addr res cursor Matthew Brost
2025-02-13 2:10 ` [PATCH v5 11/32] drm/xe: Nuke VM's mapping upon close Matthew Brost
2025-02-13 2:10 ` [PATCH v5 12/32] drm/xe: Add SVM range invalidation and page fault Matthew Brost
2025-02-13 10:05 ` Ghimiray, Himal Prasad
2025-02-13 2:10 ` [PATCH v5 13/32] drm/gpuvm: Add DRM_GPUVA_OP_DRIVER Matthew Brost
2025-02-13 2:10 ` [PATCH v5 14/32] drm/xe: Add (re)bind to SVM page fault handler Matthew Brost
2025-02-13 2:10 ` [PATCH v5 15/32] drm/xe: Add SVM garbage collector Matthew Brost
2025-02-13 10:07 ` Ghimiray, Himal Prasad
2025-02-13 2:10 ` [PATCH v5 16/32] drm/xe: Add unbind to " Matthew Brost
2025-02-19 15:05 ` Thomas Hellström
2025-02-13 2:10 ` [PATCH v5 17/32] drm/xe: Do not allow CPU address mirror VMA unbind if the GPU has bindings Matthew Brost
2025-02-13 11:28 ` Ghimiray, Himal Prasad
2025-02-13 2:10 ` [PATCH v5 18/32] drm/xe: Enable CPU address mirror uAPI Matthew Brost
2025-02-13 11:26 ` Ghimiray, Himal Prasad
2025-02-13 2:10 ` [PATCH v5 19/32] drm/xe/uapi: Add DRM_XE_QUERY_CONFIG_FLAG_HAS_CPU_ADDR_MIRROR Matthew Brost
2025-02-13 2:11 ` [PATCH v5 20/32] drm/xe: Add migrate layer functions for SVM support Matthew Brost
2025-02-13 2:11 ` [PATCH v5 21/32] drm/xe: Add SVM device memory mirroring Matthew Brost
2025-02-13 11:28 ` Ghimiray, Himal Prasad
2025-02-13 2:11 ` [PATCH v5 22/32] drm/xe: Add drm_gpusvm_devmem to xe_bo Matthew Brost
2025-02-13 11:29 ` Ghimiray, Himal Prasad
2025-02-13 2:11 ` [PATCH v5 23/32] drm/xe: Add drm_pagemap ops to SVM Matthew Brost
2025-02-13 2:11 ` [PATCH v5 24/32] drm/xe: Add GPUSVM device memory copy vfunc functions Matthew Brost
2025-02-13 2:11 ` [PATCH v5 25/32] drm/xe: Add Xe SVM populate_devmem_pfn GPU SVM vfunc Matthew Brost
2025-02-13 2:11 ` [PATCH v5 26/32] drm/xe: Add Xe SVM devmem_release " Matthew Brost
2025-02-13 18:29 ` Ghimiray, Himal Prasad
2025-02-13 2:11 ` [PATCH v5 27/32] drm/xe: Add SVM VRAM migration Matthew Brost
2025-02-13 18:28 ` Ghimiray, Himal Prasad
2025-02-18 21:54 ` Matthew Brost
2025-02-19 2:59 ` Ghimiray, Himal Prasad
2025-02-19 3:05 ` Matthew Brost
2025-02-19 3:40 ` Ghimiray, Himal Prasad
2025-02-19 10:30 ` Thomas Hellström
2025-02-19 17:38 ` Matthew Brost
2025-02-20 15:53 ` Matthew Auld
2025-02-20 15:59 ` Thomas Hellström
2025-02-20 19:55 ` Matthew Brost
2025-02-21 15:15 ` Matthew Auld
2025-02-21 15:22 ` Matthew Brost [this message]
2025-02-13 2:11 ` [PATCH v5 28/32] drm/xe: Basic SVM BO eviction Matthew Brost
2025-02-13 2:11 ` [PATCH v5 29/32] drm/xe: Add SVM debug Matthew Brost
2025-02-13 11:30 ` Ghimiray, Himal Prasad
2025-02-13 2:11 ` [PATCH v5 30/32] drm/xe: Add modparam for SVM notifier size Matthew Brost
2025-02-13 11:31 ` Ghimiray, Himal Prasad
2025-02-13 2:11 ` [PATCH v5 31/32] drm/xe: Add always_migrate_to_vram modparam Matthew Brost
2025-02-13 11:31 ` Ghimiray, Himal Prasad
2025-02-13 2:11 ` [PATCH v5 32/32] drm/doc: gpusvm: Add GPU SVM documentation Matthew Brost
2025-02-13 3:35 ` ✓ CI.Patch_applied: success for Introduce GPU SVM and Xe SVM implementation (rev5) Patchwork
2025-02-13 3:36 ` ✗ CI.checkpatch: warning " Patchwork
2025-02-13 3:37 ` ✗ CI.KUnit: failure " Patchwork
2025-02-13 21:23 ` [PATCH v5 00/32] Introduce GPU SVM and Xe SVM implementation Demi Marie Obenour
2025-02-14 8:47 ` Thomas Hellström
2025-02-14 9:07 ` Ghimiray, Himal Prasad
2025-02-14 9:10 ` Ghimiray, Himal Prasad
2025-02-14 16:14 ` Demi Marie Obenour
2025-02-14 16:26 ` Thomas Hellström
2025-02-14 18:36 ` Demi Marie Obenour