Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Brost <matthew.brost@intel.com>
To: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: <intel-xe@lists.freedesktop.org>,
	<dri-devel@lists.freedesktop.org>, <airlied@gmail.com>,
	<christian.koenig@amd.com>, <thomas.hellstrom@linux.intel.com>,
	<matthew.auld@intel.com>, <daniel@ffwll.ch>
Subject: Re: [RFC PATCH 05/28] drm/gpusvm: Add support for GPU Shared Virtual Memory
Date: Thu, 29 Aug 2024 16:49:15 +0000	[thread overview]
Message-ID: <ZtCmi3Hjk7K+Sk09@DUT025-TGLU.fm.intel.com> (raw)
In-Reply-To: <Zs9xWiECsszs6nPR@phenom.ffwll.local>

On Wed, Aug 28, 2024 at 08:50:02PM +0200, Daniel Vetter wrote:
> On Tue, Aug 27, 2024 at 07:48:38PM -0700, Matthew Brost wrote:
> > +int drm_gpusvm_migrate_to_sram(struct drm_gpusvm *gpusvm,
> > +			       struct drm_gpusvm_range *range,
> > +			       const struct drm_gpusvm_ctx *ctx)
> > +{
> > +	u64 start = range->va.start, end = range->va.end;
> > +	struct mm_struct *mm = gpusvm->mm;
> > +	struct vm_area_struct *vas;
> > +	int err;
> > +	bool retry = false;
> > +
> > +	if (!ctx->mmap_locked) {
> > +		if (!mmget_not_zero(mm)) {
> > +			err = -EFAULT;
> > +			goto err_out;
> > +		}
> > +		if (ctx->trylock_mmap) {
> > +			if (!mmap_read_trylock(mm))  {
> > +				err = drm_gpusvm_evict_to_sram(gpusvm, range);
> > +				goto err_mmput;
> > +			}
> > +		} else {
> > +			mmap_read_lock(mm);
> > +		}
> > +	}
> > +
> > +	mmap_assert_locked(mm);
> > +
> > +	/*
> > +	 * Loop required to find all VMA area structs for the corner case when
> > +	 * VRAM backing has been partially unmapped from MM's address space.
> > +	 */
> > +again:
> > +	vas = find_vma(mm, start);
> 
> So a hiliarous case that amdkfd gets a bit better but still not entirely
> is that the original vma might entirely gone. Even when you can still get
> at the mm of that process. This happens with cow (or shared too I think)
> mappings in forked child processes, or also if you play fun mremap games.
> 
> I think that outside of the ->migrate_to_ram callback migration/eviction
> to sram cannot assume there's any reasonable vma around and has to
> unconditionally go with the drm_gpusvm_evict_to_sram path.
> 

See my response here [1]. Let me drop the whole trylock thing and
convert to an 'evict' flag which calls drm_gpusvm_evict_to_sram in
places where Xe needs to evict VRAM. Or maybe just export that function
and call it directly. That way the only place the VMA is looked up for
SRAM -> VRAM is upon CPU page fault.

[1] https://patchwork.freedesktop.org/patch/610955/?series=137870&rev=1#comment_1111164

> Also in the migrate_to_ram case the vma is essentially nothing else that
> informational about which ranges we might need if we prefault a bit (in
> case the child changed the vma compared to the original one). So it's good
> to as parameter for migrate_vma_setup, but absolutely nothing else.
> 
> amdkfd almost gets this right by being entirely based on their svm_range
> structures, except they still have the lingering check that the orignal mm
> is still alive. Of course you cannot ever use that memory on the gpu
> anymore, but the child process could get very pissed if their memory is
> suddenly gone. Also the eviction code has the same issue as yours and
> limits itself to vma that still exist in the original mm, leaving anything
> that's orphaned in children or remaps stuck in vram. At least that's my
> understanding, I might very well be wrong.
> 
> So probably want a bunch of these testcases too to make sure that all
> works, and we're not stuck with memory allocations in vram that we can't
> move out.

When writing some additional test cases, let me add hooks in my IGTs to
be able to verify we are not orphaning VRAM too.

Matt

> -Sima
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

  reply	other threads:[~2024-08-29 16:50 UTC|newest]

Thread overview: 100+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-28  2:48 [RFC PATCH 00/28] Introduce GPU SVM and Xe SVM implementation Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 01/28] dma-buf: Split out dma fence array create into alloc and arm functions Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 02/28] drm/xe: Invalidate media_gt TLBs in PT code Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 03/28] drm/xe: Retry BO allocation Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 04/28] mm/migrate: Add migrate_device_vma_range Matthew Brost
2024-08-29  9:03   ` Daniel Vetter
2024-08-29 15:58     ` Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 05/28] drm/gpusvm: Add support for GPU Shared Virtual Memory Matthew Brost
2024-08-28 14:31   ` Daniel Vetter
2024-08-28 14:46     ` Christian König
2024-08-28 15:43       ` Matthew Brost
2024-08-28 16:06         ` Alex Deucher
2024-08-28 16:25         ` Daniel Vetter
2024-08-29 16:40           ` Matthew Brost
2024-09-02 11:29             ` Daniel Vetter
2024-08-30  5:00     ` Matthew Brost
2024-09-02 11:36       ` Daniel Vetter
2024-08-28 18:50   ` Daniel Vetter
2024-08-29 16:49     ` Matthew Brost [this message]
2024-09-02 11:40       ` Daniel Vetter
2024-08-29  9:16   ` Thomas Hellström
2024-08-29 17:45     ` Matthew Brost
2024-08-29 18:13       ` Matthew Brost
2024-08-29 19:18       ` Thomas Hellström
2024-08-29 20:56         ` Matthew Brost
2024-08-30  8:18           ` Thomas Hellström
2024-08-30 13:58             ` Matthew Brost
2024-09-02  9:57               ` Thomas Hellström
2024-08-30  9:57           ` Thomas Hellström
2024-08-30 13:47             ` Matthew Brost
2024-09-02  9:45               ` Thomas Hellström
2024-09-02 12:33           ` Daniel Vetter
2024-09-04 12:27             ` Thomas Hellström
2024-09-24  8:41               ` Simona Vetter
2024-08-30  1:35     ` Matthew Brost
2024-08-29  9:45   ` Daniel Vetter
2024-08-29 17:27     ` Matthew Brost
2024-09-02 11:53       ` Daniel Vetter
2024-09-02 17:03         ` Matthew Brost
2024-09-11 16:06           ` Matthew Brost
2024-08-30  9:16   ` Thomas Hellström
2024-09-02 12:20     ` Daniel Vetter
2024-09-06 18:41   ` Zeng, Oak
2024-09-24  9:25     ` Simona Vetter
2024-09-25 16:34       ` Zeng, Oak
2024-09-24 10:42   ` Thomas Hellström
2024-09-24 16:30     ` Matthew Brost
2024-09-25 21:12       ` Matthew Brost
2024-10-09 10:50   ` Thomas Hellström
2024-10-16  3:18     ` Matthew Brost
2024-10-16  6:27       ` Thomas Hellström
2024-10-16  8:24         ` Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 06/28] drm/xe/uapi: Add DRM_XE_VM_BIND_FLAG_SYSTEM_ALLOCATON flag Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 07/28] drm/xe: Add SVM init / fini to faulting VMs Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 08/28] drm/xe: Add dma_addr res cursor Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 09/28] drm/xe: Add SVM range invalidation Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 10/28] drm/gpuvm: Add DRM_GPUVA_OP_USER Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 11/28] drm/xe: Add (re)bind to SVM page fault handler Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 12/28] drm/xe: Add SVM garbage collector Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 13/28] drm/xe: Add unbind to " Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 14/28] drm/xe: Do not allow system allocator VMA unbind if the GPU has bindings Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 15/28] drm/xe: Enable system allocator uAPI Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 16/28] drm/xe: Add migrate layer functions for SVM support Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 17/28] drm/xe: Add SVM device memory mirroring Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 18/28] drm/xe: Add GPUSVM copy SRAM / VRAM vfunc functions Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 19/28] drm/xe: Update PT layer to understand ranges in VRAM Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 20/28] drm/xe: Add Xe SVM populate_vram_pfn vfunc Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 21/28] drm/xe: Add Xe SVM vram_release vfunc Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 22/28] drm/xe: Add BO flags required for SVM Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 23/28] drm/xe: Add SVM VRAM migration Matthew Brost
2024-08-28 16:06   ` Daniel Vetter
2024-08-28 18:22     ` Daniel Vetter
2024-08-29  9:24     ` Christian König
2024-08-29  9:53       ` Thomas Hellström
2024-08-29 11:02         ` Daniel Vetter
2024-08-29 22:12           ` Matthew Brost
2024-08-29 22:23             ` Matthew Brost
2024-09-02 11:01             ` Christian König
2024-09-02 12:50               ` Daniel Vetter
2024-09-02 12:48             ` Daniel Vetter
2024-09-02 22:20               ` Matthew Brost
2024-09-03  8:07                 ` Simona Vetter
2024-08-29 14:30         ` Christian König
2024-08-29 21:53           ` Matthew Brost
2024-08-29 21:48       ` Matthew Brost
2024-09-02 13:02         ` Daniel Vetter
2024-08-28  2:48 ` [RFC PATCH 24/28] drm/xe: Basic SVM BO eviction Matthew Brost
2024-08-29 10:14   ` Daniel Vetter
2024-08-29 15:55     ` Matthew Brost
2024-09-02 13:05       ` Daniel Vetter
2024-08-28  2:48 ` [RFC PATCH 25/28] drm/xe: Add SVM debug Matthew Brost
2024-08-28  2:48 ` [RFC PATCH 26/28] drm/xe: Add modparam for SVM notifier size Matthew Brost
2024-08-28  2:49 ` [RFC PATCH 27/28] drm/xe: Add modparam for SVM prefault Matthew Brost
2024-08-28  2:49 ` [RFC PATCH 28/28] drm/gpusvm: Ensure all pages migrated upon eviction Matthew Brost
2024-08-28  2:55 ` ✓ CI.Patch_applied: success for Introduce GPU SVM and Xe SVM implementation Patchwork
2024-08-28  2:55 ` ✗ CI.checkpatch: warning " Patchwork
2024-08-28  2:56 ` ✗ CI.KUnit: failure " Patchwork
2024-09-24  9:16 ` [RFC PATCH 00/28] " Simona Vetter
2024-09-24 19:36   ` Matthew Brost
2024-09-25 11:41     ` Simona Vetter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZtCmi3Hjk7K+Sk09@DUT025-TGLU.fm.intel.com \
    --to=matthew.brost@intel.com \
    --cc=airlied@gmail.com \
    --cc=christian.koenig@amd.com \
    --cc=daniel.vetter@ffwll.ch \
    --cc=daniel@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=matthew.auld@intel.com \
    --cc=thomas.hellstrom@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox