public inbox for dri-devel@lists.freedesktop.org
From: Matthew Brost <matthew.brost@intel.com>
To: "Christian König" <christian.koenig@amd.com>
Cc: "Huang, Honglei1" <honghuan@amd.com>, <Alexander.Deucher@amd.com>,
	<Felix.Kuehling@amd.com>, <Oak.Zeng@amd.com>,
	<Jenny-Jing.Liu@amd.com>, <Philip.Yang@amd.com>,
	<Xiaogang.Chen@amd.com>, <Ray.Huang@amd.com>,
	<Lingshan.Zhu@amd.com>, <Junhua.Shen@amd.com>,
	<rodrigo.vivi@intel.com>, <thomas.hellstrom@linux.intel.com>,
	<dakr@kernel.org>, <aliceryhl@google.com>,
	<amd-gfx@lists.freedesktop.org>,
	<dri-devel@lists.freedesktop.org>
Subject: Re: [RFC/POC PATCH 01/12] drm/amdgpu: add SVM UAPI definitions
Date: Mon, 20 Apr 2026 22:08:05 -0700
Message-ID: <aecGNdso+EB/Aip5@gsse-cloud1.jf.intel.com>
In-Reply-To: <e494689d-f126-4be8-8337-f04be1abd12a@amd.com>

On Mon, Apr 20, 2026 at 08:07:38PM +0200, Christian König wrote:
> On 4/20/26 18:06, Matthew Brost wrote:
> > On Mon, Apr 20, 2026 at 05:37:43PM +0200, Christian König wrote:
> >> On 4/20/26 15:30, Huang, Honglei1 wrote:
> >>> On 4/20/2026 8:15 PM, Christian König wrote:
> >>>>
> >>>>
> >>>> On 4/20/26 14:07, Honglei Huang wrote:
> >>>>> From: Honglei Huang <honghuan@amd.com>
> >>>>>
> >>>>> Add amdgpu drm SVM API definitions built on the
> >>>>> DRM GPUSVM framework.
> >>>>>
> >>>>> This includes:
> >>>>> - DRM_AMDGPU_GEM_SVM ioctl
> >>>>> - AMDGPU_SVM_FLAG_* flags
> >>>>> - AMDGPU_SVM_OP_SET_ATTR / AMDGPU_SVM_OP_GET_ATTR operations
> >>>>> - AMDGPU_SVM_ATTR_* attribute types
> >>>>> - AMDGPU_SVM_LOCATION_SYSMEM / AMDGPU_SVM_LOCATION_UNDEFINED
> >>>>> - struct drm_amdgpu_svm_attribute and struct drm_amdgpu_gem_svm
> >>>>>
> >>>>> Signed-off-by: Honglei Huang <honghuan@amd.com>
> >>>>> ---
> >>>>>   include/uapi/drm/amdgpu_drm.h | 39 +++++++++++++++++++++++++++++++++++
> >>>>>   1 file changed, 39 insertions(+)
> >>>>>
> >>>>> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> >>>>> index 406a42be4..bed71ed9b 100644
> >>>>> --- a/include/uapi/drm/amdgpu_drm.h
> >>>>> +++ b/include/uapi/drm/amdgpu_drm.h
> >>>>> @@ -58,6 +58,7 @@ extern "C" {
> >>>>>   #define DRM_AMDGPU_USERQ_SIGNAL        0x17
> >>>>>   #define DRM_AMDGPU_USERQ_WAIT        0x18
> >>>>>   #define DRM_AMDGPU_GEM_LIST_HANDLES    0x19
> >>>>> +#define DRM_AMDGPU_GEM_SVM        0x1a
> >>>>>     #define DRM_IOCTL_AMDGPU_GEM_CREATE    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
> >>>>>   #define DRM_IOCTL_AMDGPU_GEM_MMAP    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
> >>>>> @@ -79,6 +80,7 @@ extern "C" {
> >>>>>   #define DRM_IOCTL_AMDGPU_USERQ_SIGNAL    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_USERQ_SIGNAL, struct drm_amdgpu_userq_signal)
> >>>>>   #define DRM_IOCTL_AMDGPU_USERQ_WAIT    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_USERQ_WAIT, struct drm_amdgpu_userq_wait)
> >>>>>   #define DRM_IOCTL_AMDGPU_GEM_LIST_HANDLES DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_LIST_HANDLES, struct drm_amdgpu_gem_list_handles)
> >>>>> +#define DRM_IOCTL_AMDGPU_GEM_SVM    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_SVM, struct drm_amdgpu_gem_svm)
> >>>>>     /**
> >>>>>    * DOC: memory domains
> >>>>> @@ -1665,6 +1667,43 @@ struct drm_color_ctm_3x4 {
> >>>>>       __u64 matrix[12];
> >>>>>   };
> >>>>>   +#define AMDGPU_SVM_FLAG_HOST_ACCESS        0x00000001
> >>>>> +#define AMDGPU_SVM_FLAG_COHERENT        0x00000002
> >>>>> +#define AMDGPU_SVM_FLAG_HIVE_LOCAL        0x00000004
> >>>>> +#define AMDGPU_SVM_FLAG_GPU_RO            0x00000008
> >>>>> +#define AMDGPU_SVM_FLAG_GPU_EXEC        0x00000010
> >>>>> +#define AMDGPU_SVM_FLAG_GPU_READ_MOSTLY        0x00000020
> >>>>> +#define AMDGPU_SVM_FLAG_GPU_ALWAYS_MAPPED    0x00000040
> >>>>> +#define AMDGPU_SVM_FLAG_EXT_COHERENT        0x00000080
> >>>>> +
> >>>>> +#define AMDGPU_SVM_OP_SET_ATTR        0
> >>>>> +#define AMDGPU_SVM_OP_GET_ATTR        1
> >>>>> +
> >>>>> +#define AMDGPU_SVM_ATTR_PREFERRED_LOC        0
> >>>>> +#define AMDGPU_SVM_ATTR_PREFETCH_LOC        1
> >>>>
> >>>> Up till here the interface makes perfect sense, but then it becomes a bit fuzzy.
> >>>>
> >>>>> +#define AMDGPU_SVM_ATTR_ACCESS            2
> >>>>> +#define AMDGPU_SVM_ATTR_ACCESS_IN_PLACE        3
> >>>>> +#define AMDGPU_SVM_ATTR_NO_ACCESS        4
> >>>>
> >>>> Why are those separate attributes? What is the difference between those?
> >>>
> >>> Thanks for the comments. Some content in V2 was mistaken, so I posted V3
> >>> to fix that; the header is the same in both, but please review V3 for the
> >>> other content, sorry about that. I will address the concern you raised in
> >>> the next version.
> >>>
> >>> The meaning of AMDGPU_SVM_ATTR_ACCESS and AMDGPU_SVM_ATTR_NO_ACCESS is
> >>> clear: whether the GPU can access the range or not. SVM can also set the
> >>> preferred location, which can be VRAM or system memory. A range with
> >>> AMDGPU_SVM_ATTR_ACCESS can be migrated between RAM and VRAM; a range with
> >>> AMDGPU_SVM_ATTR_ACCESS_IN_PLACE cannot migrate, and the GPU can only
> >>> access it in its initial placement.
> >>
> >> Yeah, but then the interface doesn't seem to make sense, since such
> >> states are mutually exclusive.
> >>
> >> It would make sense when you have some attribute which is named (for example) AMDGPU_SVM_ATTR_ACCESS which can have the values INACCESSIBLE, IN_PLACE, MIGRATE.
> >>
> >>>>> +#define AMDGPU_SVM_ATTR_SET_FLAGS        5
> >>>>> +#define AMDGPU_SVM_ATTR_CLR_FLAGS        6
> >>>>
> >>>> Why is that separated into set and clear flags?
> >>>
> >>> This method is inherited from KFD and is also designed to be compatible
> >>> with upper-layer applications such as ROCR.
> >>
> >> That is *not* sufficient as justification. We need to document why that is necessary and *not* just say ROCR works that way.
> >>
> >> As far as I can see just a SET_FLAGS should be sufficient.
> >>
> >>>>> +#define AMDGPU_SVM_ATTR_GRANULARITY        7
> >>>>> +
> >>>>> +#define AMDGPU_SVM_LOCATION_SYSMEM        0
> >>>>> +#define AMDGPU_SVM_LOCATION_UNDEFINED        0xffffffff
> >>>>
> >>>> No location for device local memory?
> >>>
> >>> A value > 0 means device memory; xe_svm seems to use an fd for
> >>> device-local memory.
> > 
> > I have no stake in AMD’s uAPI, but I can at least explain how Xe’s uAPI
> > works here—and admittedly, it’s somewhat goofy.
> > 
> > 0 == device-local memory, with first-touch placement on whichever
> > device/tile touches the memory first
> > 
> > -1 == system memory
> > 
> > > 0 == a render-node FD (which could refer to a local or remote device),
> > paired with a region instance to extract the pgmap for the desired
> > placement
> > 
> > I believe the reason this isn’t fully FD-based is that the compute UMD
> > team wasn’t keen on exporting every pgmap as an FD, though that was
> > something that had been considered.
> 
> That absolutely doesn't make sense to me at all.
> 
> > 
> >>
> >> Absolute clear NAK for that approach. This interface is per FD!
> >>
> >> We need some value AMDGPU_SVM_LOCATION_DEVICE which means that the memory should be migrated to the current device.
> >>
> >> We also need to make sure that setting attributes for different devices doesn't affect each other.
> > 
> > We landed on the conclusion that it is undefined behavior if different
> > render FDs—or more specifically, VMs across devices within the same SVM
> > address space—set different madvise attributes. I believe this was at
> > Sima's suggestion.
> > 
> > From the UMD point of view, every madvise call therefore becomes:
> > 
> > for_each_fd_vm
> > 	set_madvise_attributes
> > 
> > This choice was made to keep madvise attributes local to the per-device
> > VM structure, rather than introducing some form of cross-device shared
> > storage.
> > 
> > A misbehaving user can absolutely shoot themselves in the foot, but at
> > worst this only ends up corrupting behavior within their own process
> > shared across devices.
> 
> Yeah, that makes total sense.
> 
> As far as I can see, the two interfaces contradict each other.
> 

No.

> Either you set the information per device, and then each device only gets the information when it needs to migrate the page to its own local memory, or you have global information.
> 
> So why does a device fd need to know about a remote pgmap?
> 

Simplest example:

Devices A and B. The user sets the preferred placement to Device A.
Device B faults first and moves the memory to Device A via a remote
pull, and can then access it via P2P, scale-up, etc. This avoids a
double bounce once Device A faults.

Matt

> Thanks,
> Christian.
> 
> > 
> > Matt
> > 
> >>
> >> Regards,
> >> Christian.
> >>
> >>>
> >>>>
> >>>>> +
> >>>>> +struct drm_amdgpu_svm_attribute {
> >>>>> +    __u32 type;
> >>>>> +    __u32 value;
> >>>>> +};
> >>>>> +
> >>>>> +struct drm_amdgpu_gem_svm {
> >>>>> +    __u64 start_addr;
> >>>>> +    __u64 size;
> >>>>> +    __u32 operation;
> >>>>> +    __u32 nattr;
> >>>>> +    __u64 attrs_ptr;
> >>>>> +};
> >>>>
> >>>> Those struct make perfect sense but clearly need documentation. Preferable as kerneldoc.
> >>>>
> >>>> And we usually use unions in this header to separate the input from the output parameters.
> >>>
> >>> Got it will add documentation for it and will use unions in next version. Really thanks for the comments.
> >>>
> >>> Regards,
> >>> Honglei
> >>>
> >>>>
> >>>> Regards,
> >>>> Christian.
> >>>>
> >>>>> +
> >>>>>   #if defined(__cplusplus)
> >>>>>   }
> >>>>>   #endif
> >>>>
> >>>
> >>
> 
