* [PATCH 0/2] Make device links between KFD and GPU device
From: Mario Limonciello (AMD) @ 2025-12-07 14:04 UTC
To: amd-gfx; +Cc: Mario Limonciello (AMD)
Discovering from userspace which KFD device is associated with a GPU is
currently awkward.
This series creates sysfs links between the two devices to simplify the
lookup.
Mario Limonciello (AMD) (2):
  amdkfd: Only ignore -ENOENT for KFD init failures
  amdkfd: Add device links between kfd device and amdgpu device
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 8 +++++
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +++
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 6 ++--
drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 36 +++++++++++++++++++
.../gpu/drm/amd/include/kgd_kfd_interface.h | 2 ++
6 files changed, 55 insertions(+), 2 deletions(-)
--
2.43.0
* [PATCH 1/2] amdkfd: Only ignore -ENOENT for KFD init failures
From: Mario Limonciello (AMD) @ 2025-12-07 14:04 UTC
To: amd-gfx; +Cc: Mario Limonciello (AMD)
When the kernel is compiled without CONFIG_HSA_AMD, KFD init returns
-ENOENT. Any other error will cause KFD functionality issues, so
-ENOENT is the only error code that should be ignored at init.
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
---
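The check added below can be read as a small predicate on the return code.
A minimal standalone sketch, where the helper name `kfd_init_error_is_fatal`
is hypothetical and does not appear in the patch:

```c
#include <errno.h>

/*
 * Hypothetical helper (not in the patch): returns nonzero when a KFD init
 * return code should abort module init. Success (0) and -ENOENT, which is
 * what KFD returns when CONFIG_HSA_AMD is not set, are both tolerated.
 */
static int kfd_init_error_is_fatal(int r)
{
	return r != 0 && r != -ENOENT;
}
```
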
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 16adeba4d7e68..e804461e5f272 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -3169,8 +3169,10 @@ static int __init amdgpu_init(void)
 	amdgpu_register_atpx_handler();
 	amdgpu_acpi_detect();
 
-	/* Ignore KFD init failures. Normal when CONFIG_HSA_AMD is not set. */
-	amdgpu_amdkfd_init();
+	/* Ignore KFD init failures when CONFIG_HSA_AMD is not set. */
+	r = amdgpu_amdkfd_init();
+	if (r && r != -ENOENT)
+		goto error_fence;
 
 	if (amdgpu_pp_feature_mask & PP_OVERDRIVE_MASK) {
 		add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_STILL_OK);
--
2.43.0
* [PATCH 2/2] amdkfd: Add device links between kfd device and amdgpu device
From: Mario Limonciello (AMD) @ 2025-12-07 14:04 UTC
To: amd-gfx; +Cc: Mario Limonciello (AMD)
Mapping a KFD device to a GPU can be done manually by looking at the
domain and location properties. To make it easier to discover which
KFD device goes with which GPU, add bidirectional sysfs links.
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
---
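For comparison, the manual mapping mentioned above can be sketched in
userspace as follows. This assumes the KFD node's location property packs the
PCI bus number and devfn as (bus << 8) | devfn, which is not stated in this
thread and may not hold for multi-node devices; the helper name
`kfd_node_to_bdf` is hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of this patch): format a PCI address from a
 * KFD node's domain and location values.
 * Assumption: location_id is (bus << 8) | devfn, with devfn = slot << 3 | func.
 * Returns 0 on success, -1 if out is too small.
 */
static int kfd_node_to_bdf(uint32_t domain, uint32_t location_id,
			   char *out, size_t out_len)
{
	unsigned int bus = (location_id >> 8) & 0xff;
	unsigned int slot = (location_id >> 3) & 0x1f;
	unsigned int func = location_id & 0x7;
	int n;

	n = snprintf(out, out_len, "%04x:%02x:%02x.%x",
		     (unsigned int)domain, bus, slot, func);
	if (n < 0 || (size_t)n >= out_len)
		return -1;
	return 0;
}
```

The resulting string then has to be matched against the PCI device names
under sysfs, which is the step the new links make unnecessary.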
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 8 +++++
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +++
drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 36 +++++++++++++++++++
.../gpu/drm/amd/include/kgd_kfd_interface.h | 2 ++
5 files changed, 51 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index a2879d2b7c8ec..5d6cf3adfa7b8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -910,3 +910,11 @@ int amdgpu_amdkfd_config_sq_perfmon(struct amdgpu_device *adev, uint32_t xcp_id,
 
 	return r;
 }
+
+int amdgpu_amdkfd_create_sysfs_links(struct amdgpu_device *adev)
+{
+	if (!adev->kfd.init_complete || !adev->kfd.dev)
+		return 0;
+
+	return kgd2kfd_create_sysfs_links(adev->kfd.dev);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 2fa5f1925f5a3..542f5bc2dd189 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -270,6 +270,7 @@ int amdgpu_amdkfd_stop_sched(struct amdgpu_device *adev, uint32_t node_id);
int amdgpu_amdkfd_config_sq_perfmon(struct amdgpu_device *adev, uint32_t xcp_id,
bool core_override_enable, bool reg_override_enable, bool perfmon_override_enable);
bool amdgpu_amdkfd_compute_active(struct amdgpu_device *adev, uint32_t node_id);
+int amdgpu_amdkfd_create_sysfs_links(struct amdgpu_device *adev);
/* Read user wptr from a specified user address space with page fault
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 7a0213a07023d..44c9320d72a56 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4947,6 +4947,10 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	 */
 	r = amdgpu_device_sys_interface_init(adev);
 
+	r = amdgpu_amdkfd_create_sysfs_links(adev);
+	if (r)
+		dev_err(adev->dev, "Failed to create KFD sysfs link: %d\n", r);
+
 	if (IS_ENABLED(CONFIG_PERF_EVENTS))
 		r = amdgpu_pmu_init(adev);
 	if (r)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index 9c3e8f946a3d5..be673e35978eb 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -79,6 +79,37 @@ struct kfd_topology_device *kfd_topology_device_by_proximity_domain(
 	return device;
 }
 
+int kgd2kfd_create_sysfs_links(struct kfd_dev *kfd)
+{
+	struct kfd_topology_device *top_dev;
+	int ret = -ENODEV;
+
+	if (!kfd)
+		return -EINVAL;
+
+	down_read(&topology_lock);
+
+	list_for_each_entry(top_dev, &topology_device_list, list) {
+		struct kobject *amdgpu_kobj;
+
+		if (!top_dev->gpu || top_dev->gpu->kfd != kfd || !top_dev->kobj_node)
+			continue;
+
+		amdgpu_kobj = &top_dev->gpu->adev->dev->kobj;
+		ret = sysfs_create_link(top_dev->kobj_node, amdgpu_kobj, "device");
+		if (ret)
+			break;
+
+		ret = sysfs_create_link(amdgpu_kobj, top_dev->kobj_node, "kfd");
+		if (ret)
+			sysfs_remove_link(top_dev->kobj_node, "device");
+		break;
+	}
+
+	up_read(&topology_lock);
+	return ret;
+}
+
 struct kfd_topology_device *kfd_topology_device_by_id(uint32_t gpu_id)
 {
 	struct kfd_topology_device *top_dev = NULL;
@@ -567,6 +598,11 @@ static void kfd_remove_sysfs_node_entry(struct kfd_topology_device *dev)
 	struct kfd_mem_properties *mem;
 	struct kfd_perf_properties *perf;
 
+	if (dev->gpu) {
+		sysfs_remove_link(dev->kobj_node, "device");
+		sysfs_remove_link(&dev->gpu->adev->dev->kobj, "kfd");
+	}
+
 	if (dev->kobj_iolink) {
 		list_for_each_entry(iolink, &dev->io_link_props, list)
 			if (iolink->kobj) {
diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
index 9aba8596faa7e..f6db1dc634399 100644
--- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
+++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
@@ -335,4 +335,6 @@ struct kfd2kgd_calls {
int engine, int queue);
};
+int kgd2kfd_create_sysfs_links(struct kfd_dev *kfd);
+
#endif /* KGD_KFD_INTERFACE_H_INCLUDED */
--
2.43.0
* RE: [PATCH 1/2] amdkfd: Only ignore -ENOENT for KFD init failures
From: Russell, Kent @ 2025-12-10 17:06 UTC
To: Mario Limonciello (AMD), amd-gfx@lists.freedesktop.org
Reviewed-by: Kent Russell <kent.russell@amd.com>