From: Huang Rui <ray.huang@amd.com>
To: Honglei Huang <honglei1.huang@amd.com>
Cc: Alexander.Deucher@amd.com, Felix.Kuehling@amd.com,
Christian.Koenig@amd.com, Oak.Zeng@amd.com,
Jenny-Jing.Liu@amd.com, Philip.Yang@amd.com,
Xiaogang.Chen@amd.com, Lingshan.Zhu@amd.com, Junhua.Shen@amd.com,
matthew.brost@intel.com, rodrigo.vivi@intel.com,
thomas.hellstrom@linux.intel.com, dakr@kernel.org,
aliceryhl@google.com, amd-gfx@lists.freedesktop.org,
dri-devel@lists.freedesktop.org, honghuan@amd.com
Subject: Re: [RFC V3 00/12] drm/amdgpu: SVM implementation based on drm_gpusvm
Date: Tue, 21 Apr 2026 10:31:09 +0800 [thread overview]
Message-ID: <aebhbe9gMAqz1tpM@amd.com> (raw)
In-Reply-To: <20260420131307.1816671-1-honglei1.huang@amd.com>
On Mon, Apr 20, 2026 at 09:12:55PM +0800, Honglei Huang wrote:
> From: Honglei Huang <honghuan@amd.com>
>
> V3 of the SVM patch series for amdgpu based on the drm_gpusvm framework.
> This revision incorporates feedback from V1, adds XNACK on GPU fault handling,
> improves code organization, and removes the XNACK off (no GPU fault) implementation
> to focus on the fault driven model that aligns with drm_gpusvm's design.
> The implementation draws extensively on xe_svm.
>
> This patch series implements SVM support with the following design:
>
> 1. Attributes separated from physical page management:
>
> - Attribute layer (amdgpu_svm_attr_tree): a driver-side interval
> tree storing per-range SVM attributes. Managed through SET_ATTR
> ioctl and preserved across range lifecycle events.
>
> - Physical page layer (drm_gpusvm ranges): managed by the
> drm_gpusvm framework, representing HMM-backed DMA mappings
> and GPU page table entries.
>
> This separation ensures attributes survive when GPU ranges are
> destroyed (partial munmap, attribute split, GC). The fault
> handler recreates GPU ranges from the attribute tree on demand.
>
> 2. GPU fault driven mapping (XNACK on):
>
> The core mapping path is driven by GPU page faults instead of ioctls.
> amdgpu_svm_handle_fault() looks up SVM by PASID, runs GC,
> resolves attributes, then maps via find_or_insert -> get_pages
> -> GPU PTE update. For unregistered addresses, default
> attributes are derived from VMA properties automatically.
>
> 3. MMU notifier invalidation:
>
> Two-phase callback: event_begin() zaps GPU PTEs and flushes
> TLB, event_end() unmaps DMA pages. UNMAP events queue ranges
> to GC for deferred cleanup. Non-UNMAP events (eviction) rely
> on GPU fault to remap.
>
> 4. Garbage collector:
>
> GC workqueue processes unmapped ranges: removes them
> from drm_gpusvm and clears the corresponding attributes. No
> rebuild or restore logic; the GPU fault handler recreates ranges.
>
> Changes since V2:
> - Added the version title to the commit messages.
> - Fixed some content mistakes.
>
> Changes since V1:
> - Added GPU fault handler: amdgpu_svm_handle_fault with PASID-based
> SVM lookup, following the standard flow: garbage collector ->
> find or insert range -> check valid -> migrate (TODO) / get_pages
> -> GPU bind/map.
>
> - Removed the restore worker queue entirely. V1 had separate GC
> and restore workers: the restore workers performed synchronous
> restores on queue stop/start because there was no GPU fault
> support. With the XNACK on fault driven model, synchronous restore
> is unnecessary; the GPU fault handler recreates ranges on demand.
> The GC worker in V2 is simplified to only discard ranges and clear
> their attributes, with no rebuild or restore logic.
> AMDGPU_SVM_FLAG_GPU_ALWAYS_MAPPED support is removed since there
> is no restore worker.
>
> - Reworked MMU notifier callback (amdgpu_svm_range_invalidate):
> V1 had a monolithic dispatcher with flag combinations and
> queue ops (CLEAR_PTE/QUEUE_INTERVAL, UNMAP/RESTORE) plus
> begin_restore() to quiesce KFD queues. V2 uses a two-phase
> model: event_begin() zaps GPU PTEs and flushes TLB,
> event_end() unmaps DMA pages and queues UNMAP ranges to GC.
> Non-UNMAP events (eviction) just zap PTEs and let GPU fault
> remap. Removed begin_restore/end_restore callbacks,
> has_always_mapped_range() check, and NOTIFIER flag dispatch.
> Added checkpoint timestamp capture on UNMAP for fault dedup.
>
> - Added amdgpu_svm_range_invalidate_interval(): when userspace
> sets new attributes on a sub region of an existing attribute
> range, the attribute tree splits the old range and the new
> sub region gets different attributes. However, existing
> drm_gpusvm ranges may span the new attribute boundary
> (e.g., a 2M GPU range covers both the old and new attribute
> regions). This function walks all gpusvm ranges in the
> affected interval, zaps their GPU PTEs, and flushes the TLB. Ranges
> that cross the new or old boundary are removed
> entirely so the GPU fault handler can recreate them with
> boundaries aligned to the updated attribute layout.
>
> - On MMU_NOTIFY_UNMAP events, discard all affected gpusvm ranges
> entirely, without the synchronous rebuild V1 performed. The unmap may destroy
> more ranges than strictly necessary (e.g., a partial munmap
> hits a 2M range that extends beyond the unmapped region), but
> the attribute layer preserves the still valid attributes for
> the remaining address space. When the GPU next accesses those
> addresses, the fault handler automatically recreates the
> ranges with correct boundaries from the surviving attributes.
> This avoids the synchronous rebuild logic that V1 required
> (unmap -> rebuild in GC/restore worker).
>
> - Add attribute creation for unregistered addresses:
> amdgpu_svm_range_get_unregistered_attrs() derives default
> SVM attributes from VMA properties and GPU IP capabilities
> when the faulting address has no user attributes registered.
> This feature is needed to pass the ROCm user mode runtime tests:
> kfd/rocr/hip. ROCm already supports access to unregistered virtual
> addresses with default SVM attributes, so amdgpu SVM needs to
> support this as well.
>
> - Explicitly returns -EOPNOTSUPP in amdgpu_svm_init when XNACK
> is disabled. V1 attempted mixed XNACK on/off support with
> complex KFD queue quiesce/resume callbacks and ioctl driven
> mapping paths, which added substantial complexity. V2 drops
> these implementations to focus on the fault driven model.
>
> - Removed kgd2kfd_quiesce_mm()/resume_mm() dependency that V1
> used for XNACK off queue control. For XNACK on, the GPU fault
> handler is the entry point for SVM range mapping, so no quiesce/resume
> is needed for this version.
>
> - Added new change triggers, TRIGGER_RANGE_SPLIT and TRIGGER_PREFETCH,
> for sub-range attribute set and prefetch support.
>
> - Added helper functions: find_locked, get_bounds_locked,
> set_default for GPU fault handling.
>
> - Design questions section removed.
>
> TODO:
> - Add multi GPU support.
> - Add XNACK off mode.
> - Add migration and prefetch. This work is ongoing in:
> https://lore.kernel.org/amd-gfx/20260410113146.146212-1-Junhua.Shen@amd.com/
>
> Test results:
> Tested on gfx943 (MI300X) and gfx906 (MI60) with XNACK on:
> - KFD test: 95%+ passed.
> - ROCR test: all passed.
> - HIP catch test: gfx943 (MI300X): 96% passed.
> gfx906 (MI60): 99% passed.
It would be best to also include the ROCm runtime merge request in the
cover letter, and clarify that the above test results are based on V3 +
user-space ROCR.
https://github.com/ROCm/rocm-systems/pull/4364
Thanks,
Ray
>
> Patch overview:
>
> 01/12 UAPI: DRM_AMDGPU_GEM_SVM ioctl, SVM flags, SET_ATTR/GET_ATTR
> operations, attribute types in amdgpu_drm.h.
>
> 02/12 Core header: amdgpu_svm wrapping drm_gpusvm with refcount,
> attr_tree, GC struct, locks, and VM integration hooks.
>
> 03/12 Attribute types: amdgpu_svm_attrs, attr_range (interval tree
> node), attr_tree, access enum, flag masks, change triggers.
>
> 04/12 Attribute tree ops: interval tree lookup, insert, remove,
> find_locked, get_bounds_locked, set_default, and lifecycle.
>
> 05/12 Attribute set/get/clear: validate UAPI attributes, apply to
> tree with head/tail splitting, change propagation, and query.
>
> 06/12 Range types: amdgpu_svm_range extending drm_gpusvm_range
> with gpu_mapped state, pending ops, work queue linkage,
> and op_ctx for batch processing.
>
> 07/12 Range GPU mapping: PTE flags computation with read_only
> support, GPU page table update, range mapping loop.
>
> 08/12 Notifier and GC helpers: two-phase notifier events, range
> removal, GC enqueue/add with dedicated workqueue.
>
> 09/12 Attribute change and invalidation: apply attribute triggers
> to GPU ranges, invalidate_interval for boundary realignment,
> work queue dequeue helpers, checkpoint timestamp.
>
> 10/12 Initialization and lifecycle: kmem_cache, drm_gpusvm_init
> with chunk sizes (2M/64K/4K), XNACK detection, GC init,
> PASID lookup, TLB flush, and init/close/fini lifecycle.
>
> 11/12 Ioctl, GC, and fault handler: ioctl dispatcher, GC worker,
> and amdgpu_svm_fault.c/h with full fault path including
> unregistered attribute derivation and retry logic.
>
> 12/12 Build integration: Kconfig (CONFIG_DRM_AMDGPU_SVM), Makefile
> rules, ioctl registration, and amdgpu_vm fault dispatch.
>
> Honglei Huang (12):
> drm/amdgpu: define SVM UAPI for GPU shared virtual memory
> drm/amdgpu: introduce SVM core header and VM integration
> drm/amdgpu: define SVM attribute subsystem types
> drm/amdgpu: implement SVM attribute tree and helper functions
> drm/amdgpu: implement SVM attribute set, get, and clear
> drm/amdgpu: define SVM range types and work queue interface
> drm/amdgpu: implement SVM range GPU mapping core
> drm/amdgpu: implement SVM range notifier and GC helpers
> drm/amdgpu: implement SVM attribute change and invalidation callback
> drm/amdgpu: implement SVM initialization and lifecycle
> drm/amdgpu: add SVM ioctl, garbage collector, and fault handler
> drm/amdgpu: integrate SVM into build system and VM fault path
>
> drivers/gpu/drm/amd/amdgpu/Kconfig | 11 +
> drivers/gpu/drm/amd/amdgpu/Makefile | 13 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_svm.c | 467 +++++++++
> drivers/gpu/drm/amd/amdgpu/amdgpu_svm.h | 162 +++
> drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.c | 952 ++++++++++++++++++
> drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.h | 144 +++
> drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.c | 368 +++++++
> drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.h | 39 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.c | 863 ++++++++++++++++
> drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.h | 148 +++
> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 20 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 4 +
> include/uapi/drm/amdgpu_drm.h | 39 +
> 14 files changed, 3231 insertions(+), 1 deletion(-)
> create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm.c
> create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm.h
> create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.c
> create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.h
> create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.c
> create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.h
> create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.c
> create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.h
>
> --
> 2.34.1
>
Thread overview: 17+ messages
2026-04-20 13:12 [RFC V3 00/12] drm/amdgpu: SVM implementation based on drm_gpusvm Honglei Huang
2026-04-20 13:12 ` [RFC V3 01/12] drm/amdgpu: define SVM UAPI for GPU shared virtual memory Honglei Huang
2026-04-20 13:12 ` [RFC V3 02/12] drm/amdgpu: introduce SVM core header and VM integration Honglei Huang
2026-04-20 13:12 ` [RFC V3 03/12] drm/amdgpu: define SVM attribute subsystem types Honglei Huang
2026-04-20 13:12 ` [RFC V3 04/12] drm/amdgpu: implement SVM attribute tree and helper functions Honglei Huang
2026-04-20 13:13 ` [RFC V3 05/12] drm/amdgpu: implement SVM attribute set, get, and clear Honglei Huang
2026-04-20 13:13 ` [RFC V3 06/12] drm/amdgpu: define SVM range types and work queue interface Honglei Huang
2026-04-20 13:13 ` [RFC V3 07/12] drm/amdgpu: implement SVM range GPU mapping core Honglei Huang
2026-04-20 13:13 ` [RFC V3 08/12] drm/amdgpu: implement SVM range notifier and GC helpers Honglei Huang
2026-04-20 13:13 ` [RFC V3 09/12] drm/amdgpu: implement SVM attribute change and invalidation callback Honglei Huang
2026-04-20 13:13 ` [RFC V3 10/12] drm/amdgpu: implement SVM initialization and lifecycle Honglei Huang
2026-04-20 13:13 ` [RFC V3 11/12] drm/amdgpu: add SVM ioctl, garbage collector, and fault handler Honglei Huang
2026-04-20 16:24 ` Matthew Brost
2026-04-21 10:07 ` Huang, Honglei1
2026-04-20 13:13 ` [RFC V3 12/12] drm/amdgpu: integrate SVM into build system and VM fault path Honglei Huang
2026-04-21 2:31 ` Huang Rui [this message]
2026-04-21 9:54 ` [RFC V3 00/12] drm/amdgpu: SVM implementation based on drm_gpusvm Huang, Honglei1