From: "Huang, Honglei1" <honghuan@amd.com>
To: Matthew Brost <matthew.brost@intel.com>
Cc: "Christian König" <christian.koenig@amd.com>,
amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
Alexander.Deucher@amd.com, Felix.Kuehling@amd.com,
"Honglei Huang" <honglei1.huang@amd.com>,
Oak.Zeng@amd.com, Jenny-Jing.Liu@amd.com, Philip.Yang@amd.com,
Xiaogang.Chen@amd.com, Ray.Huang@amd.com, Lingshan.Zhu@amd.com,
Junhua.Shen@amd.com,
"Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
"Rodrigo Vivi" <rodrigo.vivi@intel.com>,
"Danilo Krummrich" <dakr@kernel.org>,
"Alice Ryhl" <aliceryhl@google.com>
Subject: Re: [RFC/POC PATCH 00/12] POC SVM implementation in AMDGPU based on drm_gpusvm
Date: Thu, 23 Apr 2026 19:03:52 +0800
Message-ID: <812102ac-906b-4fbf-bc8c-1881493a3eca@amd.com>
In-Reply-To: <aenH0REel7/enRi+@gsse-cloud1.jf.intel.com>
On 4/23/2026 3:18 PM, Matthew Brost wrote:
...
>>>>>>> This clarifies a lot. This is what we’d call in Xe “preemption fence”
>>>>>>> mode for a VM. Anytime memory is moved, we trigger a GPU preemption and
>>>>>>> resume. We don’t actually support SVM in this case; instead, we use
>>>>>>> “userptr binds,” which are built on gpusvm for page
>>>>>>> collection. However,
>>>>>>> we don’t support migrating memory to the device—though we could.
>>>>>>>
>>>>>>> I’d look at how we converted 'userptr' to be based on GPU SVM [2]. In
>>>>>>> this case, don’t maintain a range tree, as those—as you
>>>>>>> suggest—are more
>>>>>>> of an on-demand fault driver concern. Instead, just embed 'struct
>>>>>>> drm_gpusvm_pages' in the VMA struct defined by the IOCTLs.
>>>>>>>
>>>>>>> We could extend this to support migrating 'userptr', but we
>>>>>>> just haven’t
>>>>>>> done that yet—this may be what you want to do in “XNACK off.”
>>>>>>>
>>>>>>> [2] https://patchwork.freedesktop.org/series/146553/
>>>>>>>
>>>>>>
>>>>>> Actually we need to switch the xnack mode between on and off, so in
>>>>>> xnack off mode the driver operates in "implicit prefetch mode". This is
>>>>>> mainly for compatibility with older hardware and the needs of the UMD
>>>>>> runtime. We will further discuss the handling method under xnack off
>>>>>> internally.
>>>>>>
>>>
>>> Hi Matt,
>>>
>>> I studied the xe_userptr code and the conversion series [2] you
>>> pointed to.
>>>
>>> I have a question:
>>> Would it be possible to reuse drm_gpusvm_range to handle hardware
>>> without the GPU fault feature (xnack off mode)?
>>
>> That’s not how we’ve done it. We embedded drm_gpusvm_pages into our VMA
>> structure and then attached a notifier. The notifier attachment is
>> open-coded on the Xe side, and this could be normalized and opened up
>> for common driver use cases.
The xe_userptr approach looks similar to the implementation in kfd_svm:
embed the physical pages into a structure and attach a notifier of the
same size.
But kfd_svm is an implementation of SVM semantics: it supports partial
unmap, does not need an explicit delete-userptr ioctl call on removal,
and does not need an explicit userptr flag at creation time.
There is also already an existing implementation of userptr semantics in
amdgpu kfd: KFD_IOC_ALLOC_MEM_FLAGS_USERPTR.
If the no-GPU-fault mode cannot use the drm_gpusvm framework and has to
follow the same approach as xe_userptr, it seems like duplicating work.
I think the core gap is that we are trying to use drm_gpusvm to implement
a driver with SVM semantics for no-GPU-fault hardware, not with userptr
semantics.
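For reference, a rough sketch of what the "embed drm_gpusvm_pages and
attach a notifier" pattern looks like on the driver side (the struct and
field names below are only illustrative placeholders, loosely based on
series [2], not the exact Xe or amdgpu structures):

    /* Illustrative only -- names are placeholders, not a real API. */
    struct my_userptr_vma {
            u64 start, end;                        /* userptr interval */
            struct drm_gpusvm_pages pages;         /* page collection state */
            struct mmu_interval_notifier notifier; /* open-coded, VMA-sized */
    };

Because the notifier is created per VMA at userptr-create time, this model
needs the explicit create/delete ioctls and the userptr flag, which is
exactly the semantic difference from kfd_svm described above.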
>>
>> The problem with reusing drm_gpusvm_range directly is that a VMA may
>> span multiple gpusvm notifiers—i.e., it can be larger than the notifier
>> size. Of course, we could rework this as well.
Regarding the "VMA spans multiple gpusvm notifiers" concern: I'd like to
clarify that this is not actually a blocker for amdgpu's XNACK-off path,
because amdgpu does not try to represent one user ioctl virtual address
interval as a single drm_gpusvm_range.
We walk the attr interval and call drm_gpusvm_range_find_or_insert()
repeatedly, letting gpusvm pick chunk-aligned ranges bounded by
notifier_size. One ioctl interval will create N chunk-sized ranges.
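To make that concrete, the walk is roughly the following (the helper name
is made up, the drm_gpusvm signatures are from my reading of the code in
series [2], and error unwinding is omitted, so please treat this as a
sketch rather than exact code):

    static int svm_populate_interval(struct drm_gpusvm *gpusvm,
                                     unsigned long start, unsigned long end,
                                     const struct drm_gpusvm_ctx *ctx)
    {
            unsigned long addr = start;

            while (addr < end) {
                    struct drm_gpusvm_range *range;

                    /* gpusvm picks a chunk-aligned range bounded by
                     * notifier_size, creating the notifier if needed */
                    range = drm_gpusvm_range_find_or_insert(gpusvm, addr,
                                                            start, end, ctx);
                    if (IS_ERR(range))
                            return PTR_ERR(range);

                    /* get pages / map to the GPU for the no-fault path */

                    addr = drm_gpusvm_range_end(range);
            }

            return 0;
    }

So one ioctl interval simply turns into N iterations of this loop, without
the driver ever needing a single range larger than notifier_size.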
>>
>
> Sorry for the double reply—I just glanced at the latest series. I don’t
> think creating a range per page of the userptr is desirable. While it
> would work, from a time-complexity point of view I don’t think this is
> ideal.
>
> The issue with spans across multiple notifiers is real, though.
>
> My rough idea would be:
>
> - Give drivers an interface to create larger ranges.
So maybe we do not need to create larger ranges if we call
drm_gpusvm_range_find_or_insert() repeatedly.
>
> - If the range fits inside a single notifier’s size → done.
>
> - If the range spans multiple notifier sizes → round up to a power of
> two and create a larger notifier. This may overlap with existing
> notifiers, which is likely fine given that interval trees support
> overlaps (?). We’d need to double-check and test this. If overlapping
> notifiers are not acceptable, we’d need some heavy-handed notifier merge
> logic—it will be complicated, but isolated, so once we get it right
> everyone can use it.
If we call drm_gpusvm_range_find_or_insert() repeatedly, drm_gpusvm will
create the corresponding notifiers correctly, as far as I can see.
Regards,
Honglei
>
> - Finally, make sure that individual userptr pages can reside at any
> location.
>
> Or, conversely:
>
> - Normalize embedding of drm_gpusvm_pages in VMA structs + notifier
> creation
>
> - Make sure that individual userptr pages can reside at any location.
>
> Both options actually sound really similar after typing this out.
>
> Matt
>
>> So either way, the Xe userptr + gpusvm implementation should be refined
>> further for common driver use.
>>
>>>
>>> Reusing drm_gpusvm_range for the XNACK-off case would simplify our
>>> implementation considerably: it already provides large-page chunk
>>> optimization and can reuse the existing migration infrastructure.
>>>
>>> Building this on top of a standalone drm_gpusvm_pages would mean
>>> reimplementing much of what the range layer already offers. Reusing the
>>> range layer would also let us keep a single code path for both XNACK
>>> modes, which reduces maintenance burden and avoids behavioral
>>> differences.
>>>
>>> Would this direction be acceptable, or do you see concerns with reusing
>>> the range infrastructure for the no-fault case?
>>>
>>
>> If you prefer something like "insert a range exactly here" + create range
>> + notifier, I think that's a completely reasonable direction, and Xe would
>> likely switch over to using this.
>>
>> I guess my only concern is sub-userptr migration. We are trending
>> towards allowing userptrs to be migrated either via prefetch IOCTLs or
>> access counters on the GPU side; with access counters we'd likely migrate
>> a single 2M page at a time within the userptr. get_pages() supports mixed
>> mappings between VRAM and system memory, but likely needs some more work
>> to really make this complete.
>>
>> Matt
>>
>>> Regards,
>>> Honglei
...
Thread overview: 36+ messages
2026-03-17 11:29 [RFC/POC PATCH 00/12] POC SVM implementation in AMDGPU based on drm_gpusvm Honglei Huang
2026-03-17 11:29 ` [RFC/POC PATCH 01/12] drm/amdgpu: add SVM UAPI definitions Honglei Huang
2026-03-17 11:29 ` [RFC/POC PATCH 02/12] drm/amdgpu: add SVM data structures and header Honglei Huang
2026-03-17 11:29 ` [RFC/POC PATCH 03/12] drm/amdgpu: add SVM attribute data structures Honglei Huang
2026-03-17 11:29 ` [RFC/POC PATCH 04/12] drm/amdgpu: implement SVM attribute tree operations Honglei Huang
2026-03-17 11:29 ` [RFC/POC PATCH 05/12] drm/amdgpu: implement SVM attribute set Honglei Huang
2026-03-17 11:29 ` [RFC/POC PATCH 06/12] drm/amdgpu: add SVM range data structures Honglei Huang
2026-03-17 11:29 ` [RFC/POC PATCH 07/12] drm/amdgpu: implement SVM range PTE flags and GPU mapping Honglei Huang
2026-03-17 11:29 ` [RFC/POC PATCH 08/12] drm/amdgpu: implement SVM range notifier and invalidation Honglei Huang
2026-03-17 11:29 ` [RFC/POC PATCH 09/12] drm/amdgpu: implement SVM range workers Honglei Huang
2026-03-17 11:29 ` [RFC/POC PATCH 10/12] drm/amdgpu: implement SVM core initialization and fini Honglei Huang
2026-03-17 11:29 ` [RFC/POC PATCH 11/12] drm/amdgpu: implement SVM ioctl and fault handler Honglei Huang
2026-03-17 11:29 ` [RFC/POC PATCH 12/12] drm/amdgpu: wire up SVM build system " Honglei Huang
2026-03-17 11:48 ` [RFC/POC PATCH 00/12] POC SVM implementation in AMDGPU based on drm_gpusvm Christian König
2026-03-18 8:59 ` Honglei Huang
2026-03-19 5:08 ` Matthew Brost
2026-03-19 14:17 ` Honglei Huang
2026-03-23 6:31 ` Matthew Brost
2026-03-24 7:24 ` Honglei Huang
2026-03-25 22:24 ` Matthew Brost
2026-03-26 12:16 ` Honglei Huang
2026-04-15 10:04 ` Huang, Honglei1
2026-04-23 6:40 ` Matthew Brost
2026-04-23 7:18 ` Matthew Brost
2026-04-23 11:03 ` Huang, Honglei1 [this message]
2026-04-23 20:21 ` Matthew Brost
2026-04-24 10:43 ` Huang, Honglei1
2026-04-27 20:00 ` Felix Kuehling
2026-04-28 2:23 ` Huang, Honglei1
2026-04-30 3:04 ` Matthew Brost
2026-04-23 6:09 ` Huang, Honglei1
2026-04-23 6:52 ` Matthew Brost
2026-04-23 8:22 ` Huang, Honglei1
2026-04-29 9:56 ` Huang, Honglei1
2026-04-30 2:56 ` Huang, Honglei1
2026-04-30 3:12 ` Matthew Brost