public inbox for dri-devel@lists.freedesktop.org
From: Huang Rui <ray.huang@amd.com>
To: Honglei Huang <honglei1.huang@amd.com>
Cc: Alexander.Deucher@amd.com, Felix.Kuehling@amd.com,
	Christian.Koenig@amd.com, Oak.Zeng@amd.com,
	Jenny-Jing.Liu@amd.com, Philip.Yang@amd.com,
	Xiaogang.Chen@amd.com, Lingshan.Zhu@amd.com, Junhua.Shen@amd.com,
	matthew.brost@intel.com, rodrigo.vivi@intel.com,
	thomas.hellstrom@linux.intel.com, dakr@kernel.org,
	aliceryhl@google.com, amd-gfx@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org, honghuan@amd.com
Subject: Re: [RFC V3 00/12] drm/amdgpu: SVM implementation based on drm_gpusvm
Date: Tue, 21 Apr 2026 10:31:09 +0800	[thread overview]
Message-ID: <aebhbe9gMAqz1tpM@amd.com> (raw)
In-Reply-To: <20260420131307.1816671-1-honglei1.huang@amd.com>

On Mon, Apr 20, 2026 at 09:12:55PM +0800, Honglei Huang wrote:
> From: Honglei Huang <honghuan@amd.com>
> 
> V3 of the SVM patch series for amdgpu, based on the drm_gpusvm framework.
> This revision incorporates feedback from V1, adds XNACK-on GPU fault
> handling, improves code organization, and removes the XNACK-off (no GPU
> fault) implementation to focus on the fault-driven model that aligns with
> drm_gpusvm's design. The implementation references xe_svm extensively.
> 
> This patch series implements SVM support with the following design:
> 
>   1. Attributes separated from physical page management:
> 
>     - Attribute layer (amdgpu_svm_attr_tree): a driver-side interval
>       tree storing per-range SVM attributes. Managed through SET_ATTR
>       ioctl and preserved across range lifecycle events.
> 
>     - Physical page layer (drm_gpusvm ranges): managed by the
>       drm_gpusvm framework, representing HMM-backed DMA mappings
>       and GPU page table entries.
> 
>     This separation ensures attributes survive when GPU ranges are
>     destroyed (partial munmap, attribute split, GC). The fault
>     handler recreates GPU ranges from the attribute tree on demand.
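The two-layer split described here can be sketched in plain userspace C. A linear scan stands in for the driver's interval tree, and `svm_attr_range`, `svm_attr_find`, and `demo_attrs` are hypothetical names for illustration, not the amdgpu API:

```c
#include <stddef.h>
#include <stdint.h>

/* Two layers, as in the series: attribute ranges persist across range
 * lifecycle events; GPU ranges are transient and recreated on fault. */
struct svm_attr_range {
	uint64_t start, end;	/* [start, end) virtual address interval */
	int preferred_loc;	/* stand-in for the per-range attributes */
};

/* Linear scan standing in for the driver's interval tree lookup. */
static struct svm_attr_range *
svm_attr_find(struct svm_attr_range *tbl, size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= tbl[i].start && addr < tbl[i].end)
			return &tbl[i];
	return NULL;	/* unregistered: defaults derived from the VMA */
}

static struct svm_attr_range demo_attrs[] = {
	{ 0x1000, 0x3000, 0 },
	{ 0x3000, 0x5000, 1 },
};
```

A lookup miss here models the "unregistered address" case the fault handler covers by deriving defaults from the VMA.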
> 
>   2. GPU fault driven mapping (XNACK on):
> 
>     The core mapping path is driven by GPU page faults instead of ioctls.
>     amdgpu_svm_handle_fault() looks up the SVM context by PASID, runs the
>     GC, resolves attributes, then maps via find_or_insert -> get_pages
>     -> GPU PTE update. For unregistered addresses, default attributes
>     are derived automatically from VMA properties.
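The fault path above can be mirrored as a small userspace sketch; every name here is a hypothetical stand-in (`demo_*`), not the amdgpu interface:

```c
#include <stdint.h>

/* Illustrative XNACK-on fault path following the steps described above. */
enum demo_fault_status { DEMO_FAULT_MAPPED, DEMO_FAULT_RETRY };

struct demo_svm {
	int gc_ran;
	int mapped_ranges;
};

static void demo_run_gc(struct demo_svm *svm)
{
	svm->gc_ran = 1;		/* drain deferred range cleanup first */
}

static int demo_map_range(struct demo_svm *svm, uint64_t addr)
{
	(void)addr;
	svm->mapped_ranges++;		/* find_or_insert -> get_pages -> PTEs */
	return 0;
}

static enum demo_fault_status
demo_handle_fault(struct demo_svm *svm, uint64_t addr)
{
	demo_run_gc(svm);		/* 1. run the garbage collector */
	/* 2. resolve attributes (defaults from the VMA if unregistered) */
	/* 3. create and map the range; retry on transient failure */
	if (demo_map_range(svm, addr))
		return DEMO_FAULT_RETRY;
	return DEMO_FAULT_MAPPED;
}
```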
> 
>   3. MMU notifier invalidation:
> 
>     Two-phase callback: event_begin() zaps GPU PTEs and flushes
>     TLB, event_end() unmaps DMA pages. UNMAP events queue ranges
>     to GC for deferred cleanup. Non-UNMAP events (eviction) rely
>     on GPU fault to remap.
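The two-phase callback described above can be sketched as follows; the names are hypothetical, not the actual drm_gpusvm callbacks:

```c
/* Two-phase invalidation sketch: begin zaps PTEs and flushes the TLB,
 * end unmaps DMA pages and queues only UNMAP events to the GC. */
enum demo_event { DEMO_EVT_UNMAP, DEMO_EVT_EVICT };

struct demo_inval_range {
	int ptes_valid;
	int dma_mapped;
	int queued_to_gc;
};

static void demo_event_begin(struct demo_inval_range *r)
{
	r->ptes_valid = 0;		/* zap GPU PTEs + TLB flush */
}

static void demo_event_end(struct demo_inval_range *r, enum demo_event ev)
{
	r->dma_mapped = 0;		/* unmap DMA pages */
	if (ev == DEMO_EVT_UNMAP)
		r->queued_to_gc = 1;	/* deferred cleanup in the GC worker */
	/* eviction: keep the range; the next GPU fault remaps it */
}
```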
> 
>   4. Garbage collector:
> 
>     GC workqueue processes unmapped ranges: removes them
>     from drm_gpusvm and clears corresponding attributes. No
>     rebuild or restore logic, GPU fault handles recreation.
> 
> Changes since V2:
>   - Added the version title to the commit messages.
>   - Fixed some content mistakes.
> 
> Changes since V1:
>   - Added GPU fault handler: amdgpu_svm_handle_fault with PASID-based
>     SVM lookup, following the standard flow: garbage collector ->
>     find or insert range -> check valid -> migrate (TODO) / get_pages
>     -> GPU bind/map.
> 
>   - Removed the restore worker queue entirely. V1 had separate GC
>     and restore workers: the restore workers synchronously restored
>     ranges on queue stop/start because there was no GPU fault
>     support. With the XNACK-on fault-driven model, synchronous
>     restore is unnecessary; the GPU fault handler recreates ranges
>     on demand. The GC worker in V2 is simplified to only discard
>     ranges and clear their attributes, with no rebuild or restore
>     logic. AMDGPU_SVM_FLAG_GPU_ALWAYS_MAPPED support is removed
>     along with the restore worker.
> 
>   - Reworked MMU notifier callback (amdgpu_svm_range_invalidate):
>     V1 had a monolithic dispatcher with flag combinations and
>     queue ops (CLEAR_PTE/QUEUE_INTERVAL, UNMAP/RESTORE) plus
>     begin_restore() to quiesce KFD queues. V2 uses a two-phase
>     model: event_begin() zaps GPU PTEs and flushes TLB,
>     event_end() unmaps DMA pages and queues UNMAP ranges to GC.
>     Non-UNMAP events (eviction) just zap PTEs and let GPU fault
>     remap. Removed begin_restore/end_restore callbacks,
>     has_always_mapped_range() check, and NOTIFIER flag dispatch.
>     Added checkpoint timestamp capture on UNMAP for fault dedup.
> 
>   - Added amdgpu_svm_range_invalidate_interval(): when userspace
>     sets new attributes on a sub-region of an existing attribute
>     range, the attribute tree splits the old range and the new
>     sub-region gets different attributes. However, existing
>     drm_gpusvm ranges may cross the new attribute boundary
>     (e.g., a 2M GPU range covers both the old and new attribute
>     regions). This function walks all gpusvm ranges in the
>     affected interval, zaps their GPU PTEs, and flushes the TLB.
>     Ranges that cross the new or old boundary are removed
>     entirely so the GPU fault handler can recreate them with
>     boundaries aligned to the updated attribute layout.
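The removal decision here boils down to an overlap-but-not-contained test: a GPU range that overlaps the newly attributed interval without lying entirely inside it straddles a boundary. A sketch, with a hypothetical helper name:

```c
#include <stdbool.h>
#include <stdint.h>

/* True when [r_start, r_end) overlaps the new attribute interval but
 * is not contained in it, i.e. it straddles a boundary and must be
 * removed so the fault handler can recreate it with aligned bounds. */
static bool demo_range_crosses_boundary(uint64_t r_start, uint64_t r_end,
					uint64_t attr_start, uint64_t attr_end)
{
	bool overlaps  = r_start < attr_end && r_end > attr_start;
	bool contained = r_start >= attr_start && r_end <= attr_end;

	return overlaps && !contained;
}
```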
> 
>   - On MMU_NOTIFY_UNMAP events, discard all affected gpusvm ranges
>     entirely, without the synchronous rebuild that V1 performed. The
>     unmap may destroy more ranges than strictly necessary (e.g., a
>     partial munmap hits a 2M range that extends beyond the unmapped
>     region), but the attribute layer preserves the still-valid
>     attributes for the remaining address space. When the GPU next
>     accesses those addresses, the fault handler automatically
>     recreates the ranges with correct boundaries from the surviving
>     attributes. This avoids the synchronous rebuild logic that V1
>     required (unmap -> rebuild in the GC/restore worker).
> 
>   - Added attribute creation for unregistered addresses:
>     amdgpu_svm_range_get_unregistered_attrs() derives default
>     SVM attributes from VMA properties and GPU IP capabilities
>     when the faulting address has no user attributes registered.
>     This feature is needed to pass the ROCm user-mode runtime tests
>     (kfd/rocr/hip): ROCm has so far allowed access to unregistered
>     virtual addresses with default SVM attributes, so amdgpu SVM
>     needs to support it as well.
> 
>   - Explicitly return -EOPNOTSUPP from amdgpu_svm_init when XNACK
>     is disabled. V1 attempted mixed XNACK on/off support with
>     complex KFD queue quiesce/resume callbacks and ioctl-driven
>     mapping paths, which added substantial complexity. V2 drops
>     these implementations to focus on the fault-driven model.
> 
>   - Removed the kgd2kfd_quiesce_mm()/resume_mm() dependency that V1
>     used for XNACK-off queue control. With XNACK on, the GPU fault
>     handler is the entry point for SVM range mapping, so no
>     quiesce/resume is needed in this version.
> 
>   - Added new change triggers, TRIGGER_RANGE_SPLIT and
>     TRIGGER_PREFETCH, for sub-attribute set and prefetch trigger
>     support.
> 
>   - Added helper functions find_locked, get_bounds_locked, and
>     set_default for GPU fault handling.
> 
>   - Design questions section removed.
> 
> TODO:
>   - Add multi GPU support.
>   - Add XNACK off mode.
>   - Add migration or prefetch. Work on this part is ongoing in:
>     https://lore.kernel.org/amd-gfx/20260410113146.146212-1-Junhua.Shen@amd.com/
> 
> Test results:
>   Tested on gfx943 (MI300X) and gfx906 (MI60) with XNACK on:
>   - KFD test: 95%+ passed.
>   - ROCR test: all passed.
>   - HIP catch test: gfx943 (MI300X): 96% passed.
>                     gfx906 (MI60): 99% passed.

It would be best to also include the ROCm runtime merge request in the
cover letter, and clarify that the above test results are based on V3 +
user-space ROCR.

https://github.com/ROCm/rocm-systems/pull/4364

Thanks,
Ray

> 
> Patch overview:
> 
>   01/12 UAPI: DRM_AMDGPU_GEM_SVM ioctl, SVM flags, SET_ATTR/GET_ATTR
>         operations, attribute types in amdgpu_drm.h.
> 
>   02/12 Core header: amdgpu_svm wrapping drm_gpusvm with refcount,
>         attr_tree, GC struct, locks, and VM integration hooks.
> 
>   03/12 Attribute types: amdgpu_svm_attrs, attr_range (interval tree
>         node), attr_tree, access enum, flag masks, change triggers.
> 
>   04/12 Attribute tree ops: interval tree lookup, insert, remove,
>         find_locked, get_bounds_locked, set_default, and lifecycle.
> 
>   05/12 Attribute set/get/clear: validate UAPI attributes, apply to
>         tree with head/tail splitting, change propagation, and query.
> 
>   06/12 Range types: amdgpu_svm_range extending drm_gpusvm_range
>         with gpu_mapped state, pending ops, work queue linkage,
>         and op_ctx for batch processing.
> 
>   07/12 Range GPU mapping: PTE flags computation with read_only
>         support, GPU page table update, range mapping loop.
> 
>   08/12 Notifier and GC helpers: two-phase notifier events, range
>         removal, GC enqueue/add with dedicated workqueue.
> 
>   09/12 Attribute change and invalidation: apply attribute triggers
>         to GPU ranges, invalidate_interval for boundary realignment,
>         work queue dequeue helpers, checkpoint timestamp.
> 
>   10/12 Initialization and lifecycle: kmem_cache, drm_gpusvm_init
>         with chunk sizes (2M/64K/4K), XNACK detection, GC init,
>         PASID lookup, TLB flush, and init/close/fini lifecycle.
> 
>   11/12 Ioctl, GC, and fault handler: ioctl dispatcher, GC worker,
>         and amdgpu_svm_fault.c/h with full fault path including
>         unregistered attribute derivation and retry logic.
> 
>   12/12 Build integration: Kconfig (CONFIG_DRM_AMDGPU_SVM), Makefile
>         rules, ioctl registration, and amdgpu_vm fault dispatch.
> 
> Honglei Huang (12):
>   drm/amdgpu: define SVM UAPI for GPU shared virtual memory
>   drm/amdgpu: introduce SVM core header and VM integration
>   drm/amdgpu: define SVM attribute subsystem types
>   drm/amdgpu: implement SVM attribute tree and helper functions
>   drm/amdgpu: implement SVM attribute set, get, and clear
>   drm/amdgpu: define SVM range types and work queue interface
>   drm/amdgpu: implement SVM range GPU mapping core
>   drm/amdgpu: implement SVM range notifier and GC helpers
>   drm/amdgpu: implement SVM attribute change and invalidation callback
>   drm/amdgpu: implement SVM initialization and lifecycle
>   drm/amdgpu: add SVM ioctl, garbage collector, and fault handler
>   drm/amdgpu: integrate SVM into build system and VM fault path
> 
>  drivers/gpu/drm/amd/amdgpu/Kconfig            |  11 +
>  drivers/gpu/drm/amd/amdgpu/Makefile           |  13 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c       |   2 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_svm.c       | 467 +++++++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_svm.h       | 162 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.c  | 952 ++++++++++++++++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.h  | 144 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.c | 368 +++++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.h |  39 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.c | 863 ++++++++++++++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.h | 148 +++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |  20 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h        |   4 +
>  include/uapi/drm/amdgpu_drm.h                 |  39 +
>  14 files changed, 3231 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm.c
>  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm.h
>  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.c
>  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_attr.h
>  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.c
>  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_fault.h
>  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.c
>  create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_svm_range.h
> 
> -- 
> 2.34.1
> 

Thread overview: 17+ messages
2026-04-20 13:12 [RFC V3 00/12] drm/amdgpu: SVM implementation based on drm_gpusvm Honglei Huang
2026-04-20 13:12 ` [RFC V3 01/12] drm/amdgpu: define SVM UAPI for GPU shared virtual memory Honglei Huang
2026-04-20 13:12 ` [RFC V3 02/12] drm/amdgpu: introduce SVM core header and VM integration Honglei Huang
2026-04-20 13:12 ` [RFC V3 03/12] drm/amdgpu: define SVM attribute subsystem types Honglei Huang
2026-04-20 13:12 ` [RFC V3 04/12] drm/amdgpu: implement SVM attribute tree and helper functions Honglei Huang
2026-04-20 13:13 ` [RFC V3 05/12] drm/amdgpu: implement SVM attribute set, get, and clear Honglei Huang
2026-04-20 13:13 ` [RFC V3 06/12] drm/amdgpu: define SVM range types and work queue interface Honglei Huang
2026-04-20 13:13 ` [RFC V3 07/12] drm/amdgpu: implement SVM range GPU mapping core Honglei Huang
2026-04-20 13:13 ` [RFC V3 08/12] drm/amdgpu: implement SVM range notifier and GC helpers Honglei Huang
2026-04-20 13:13 ` [RFC V3 09/12] drm/amdgpu: implement SVM attribute change and invalidation callback Honglei Huang
2026-04-20 13:13 ` [RFC V3 10/12] drm/amdgpu: implement SVM initialization and lifecycle Honglei Huang
2026-04-20 13:13 ` [RFC V3 11/12] drm/amdgpu: add SVM ioctl, garbage collector, and fault handler Honglei Huang
2026-04-20 16:24   ` Matthew Brost
2026-04-21 10:07     ` Huang, Honglei1
2026-04-20 13:13 ` [RFC V3 12/12] drm/amdgpu: integrate SVM into build system and VM fault path Honglei Huang
2026-04-21  2:31 ` Huang Rui [this message]
2026-04-21  9:54   ` [RFC V3 00/12] drm/amdgpu: SVM implementation based on drm_gpusvm Huang, Honglei1
