From: Felix Kuehling <felix.kuehling@amd.com>
To: Zhu Lingshan <lingshan.zhu@amd.com>, alexander.deucher@amd.com
Cc: ray.huang@amd.com, amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH V5 18/18] amdkfd: introduce new ioctl AMDKFD_IOC_CREATE_PROCESS
Date: Fri, 17 Oct 2025 19:55:29 -0400 [thread overview]
Message-ID: <b545b31e-26ff-4b7a-be86-312d9aa96e97@amd.com> (raw)
In-Reply-To: <20251017084222.54721-19-lingshan.zhu@amd.com>
[-- Attachment #1: Type: text/plain, Size: 6407 bytes --]
On 2025-10-17 04:42, Zhu Lingshan wrote:
> This commit implemetns a new ioctl AMDKFD_IOC_CREATE_PROCESS
> that creates a new secondary kfd_progress on the FD.
>
> To keep backward compatibility, userspace programs need to invoke
> this ioctl explicitly on a FD to create a secondary
> kfd_process which replacing its primary kfd_process.
>
> This commit bumps ioctl minor version.
>
> Signed-off-by: Zhu Lingshan<lingshan.zhu@amd.com>
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 41 ++++++++++++++++++++++++
> drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 1 +
> drivers/gpu/drm/amd/amdkfd/kfd_process.c | 3 +-
> include/uapi/linux/kfd_ioctl.h | 8 +++--
> 4 files changed, 49 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> index 7c02e8473622..8fe58526bc3a 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
> @@ -3140,6 +3140,44 @@ static int kfd_ioctl_set_debug_trap(struct file *filep, struct kfd_process *p, v
> return r;
> }
>
> +/* userspace programs need to invoke this ioctl explicitly on a FD to
> + * create a secondary kfd_process which replacing its primary kfd_process
> + */
> +static int kfd_ioctl_create_process(struct file *filep, struct kfd_process *p, void *data)
> +{
> + struct kfd_process *process;
> + int ret;
> +
> + if (!filep->private_data || !p)
> + return -EINVAL;
> +
> + if (p != filep->private_data)
> + return -EINVAL;
> +
> + /* Each FD owns only one kfd_process */
> + if (p->context_id != KFD_CONTEXT_ID_PRIMARY)
> + return -EINVAL;
> +
> + mutex_lock(&kfd_processes_mutex);
> + process = create_process(current, false);
> + mutex_unlock(&kfd_processes_mutex);
There could be race conditions between multiple threads calling this
ioctl on the same primary fd concurrently. We could use the
kfd_processes_mutex to protect against this. You'd need to do some of
the above checks and the assignment to filep->private_data under the
lock to be safe, to allow only one thread to successfully create the
secondary context. I think this would work:
/* Atomically create the secondary context and assign it to filep->private_data */
mutex_lock(&kfd_processes_mutex);
if (p != filep->private_data) {
/* Another thread already created a secondary context */
mutex_unlock(&kfd_processes_mutex);
return -EINVAL;
}
secondary = create_process(current, false);
if (!IS_ERR(process))
filep->private_data = secondary;
mutex_unlock(&kfd_processes_mutex);
if (IS_ERR(secondary))
return PTR_ERR(secondary);
/* Release the process reference that was held by filep->private_data */
kfd_unref_process(p);
...
Regards,
Felix
> +
> + if (IS_ERR(process))
> + return PTR_ERR(process);
> +
> + /* Each open() increases kref of the primary kfd_process,
> + * so we need to reduce it here before we create a new secondary process replacing it
> + */
> + kfd_unref_process(p);
> +
> + filep->private_data = process;
> + ret = kfd_create_process_sysfs(process);
> + if (ret)
> + pr_warn("Failed to create sysfs entry for the kfd_process");
> +
> + return 0;
> +}
> +
> #define AMDKFD_IOCTL_DEF(ioctl, _func, _flags) \
> [_IOC_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, \
> .cmd_drv = 0, .name = #ioctl}
> @@ -3258,6 +3296,9 @@ static const struct amdkfd_ioctl_desc amdkfd_ioctls[] = {
>
> AMDKFD_IOCTL_DEF(AMDKFD_IOC_DBG_TRAP,
> kfd_ioctl_set_debug_trap, 0),
> +
> + AMDKFD_IOCTL_DEF(AMDKFD_IOC_CREATE_PROCESS,
> + kfd_ioctl_create_process, 0),
> };
>
> #define AMDKFD_CORE_IOCTL_COUNT ARRAY_SIZE(amdkfd_ioctls)
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> index 4237c859050d..578f9f217e16 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> @@ -1055,6 +1055,7 @@ struct amdkfd_ioctl_desc {
> };
> bool kfd_dev_is_large_bar(struct kfd_node *dev);
>
> +struct kfd_process *create_process(const struct task_struct *thread, bool primary);
> int kfd_process_create_wq(void);
> void kfd_process_destroy_wq(void);
> void kfd_cleanup_processes(void);
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index 629a706e2a13..bbb72a77bed5 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -68,7 +68,6 @@ static struct workqueue_struct *kfd_restore_wq;
> static struct kfd_process *find_process(const struct task_struct *thread,
> bool ref);
> static void kfd_process_ref_release(struct kref *ref);
> -static struct kfd_process *create_process(const struct task_struct *thread, bool primary);
>
> static void evict_process_worker(struct work_struct *work);
> static void restore_process_worker(struct work_struct *work);
> @@ -1578,7 +1577,7 @@ void kfd_process_set_trap_debug_flag(struct qcm_process_device *qpd,
> * On return the kfd_process is fully operational and will be freed when the
> * mm is released
> */
> -static struct kfd_process *create_process(const struct task_struct *thread, bool primary)
> +struct kfd_process *create_process(const struct task_struct *thread, bool primary)
> {
> struct kfd_process *process;
> struct mmu_notifier *mn;
> diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
> index 5d1727a6d040..84aa24c02715 100644
> --- a/include/uapi/linux/kfd_ioctl.h
> +++ b/include/uapi/linux/kfd_ioctl.h
> @@ -44,9 +44,10 @@
> * - 1.16 - Add contiguous VRAM allocation flag
> * - 1.17 - Add SDMA queue creation with target SDMA engine ID
> * - 1.18 - Rename pad in set_memory_policy_args to misc_process_flag
> + * - 1.19 - Add a new ioctl to craete secondary kfd processes
> */
> #define KFD_IOCTL_MAJOR_VERSION 1
> -#define KFD_IOCTL_MINOR_VERSION 18
> +#define KFD_IOCTL_MINOR_VERSION 19
>
> struct kfd_ioctl_get_version_args {
> __u32 major_version; /* from KFD */
> @@ -1671,7 +1672,10 @@ struct kfd_ioctl_dbg_trap_args {
> #define AMDKFD_IOC_DBG_TRAP \
> AMDKFD_IOWR(0x26, struct kfd_ioctl_dbg_trap_args)
>
> +#define AMDKFD_IOC_CREATE_PROCESS \
> + AMDKFD_IO(0x27)
> +
> #define AMDKFD_COMMAND_START 0x01
> -#define AMDKFD_COMMAND_END 0x27
> +#define AMDKFD_COMMAND_END 0x28
>
> #endif
[-- Attachment #2: Type: text/html, Size: 6813 bytes --]
next prev parent reply other threads:[~2025-10-17 23:55 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-17 8:42 [PATCH V5 00/18] [PATCH V4 00/18] amdkfd: Implement kfd multiple contexts Zhu Lingshan
2025-10-17 8:42 ` [PATCH V5 01/18] amdkfd: enlarge the hashtable of kfd_process Zhu Lingshan
2025-10-17 8:42 ` [PATCH V5 02/18] amdkfd: mark the first kfd_process as the primary one Zhu Lingshan
2025-10-17 8:42 ` [PATCH V5 03/18] amdkfd: find_process_by_mm always return the primary context Zhu Lingshan
2025-10-17 8:42 ` [PATCH V5 04/18] amdkfd: Introduce kfd_create_process_sysfs as a separate function Zhu Lingshan
2025-10-17 8:42 ` [PATCH V5 05/18] amdkfd: destroy kfd secondary contexts through fd close Zhu Lingshan
2025-10-17 8:42 ` [PATCH V5 06/18] amdkfd: process svm ioctl only on the primary kfd process Zhu Lingshan
2025-10-17 8:42 ` [PATCH V5 07/18] amdkfd: process USERPTR allocation " Zhu Lingshan
2025-10-17 8:42 ` [PATCH V5 08/18] amdkfd: identify a secondary kfd process by its id Zhu Lingshan
2025-10-17 23:16 ` Felix Kuehling
2025-10-22 6:56 ` Zhu, Lingshan
2025-10-17 8:42 ` [PATCH V5 09/18] amdkfd: find kfd_process by filep->private_data in kfd_mmap Zhu Lingshan
2025-10-17 8:42 ` [PATCH V5 10/18] amdkfd: remove DIQ support Zhu Lingshan
2025-10-17 8:42 ` [PATCH V5 11/18] amdkfd: process pointer of a HIQ should be NULL Zhu Lingshan
2025-10-17 8:42 ` [PATCH V5 12/18] amdkfd: remove test_kq Zhu Lingshan
2025-10-17 8:42 ` [PATCH V5 13/18] amdkfd: introduce new helper kfd_lookup_process_by_id Zhu Lingshan
2025-10-17 8:42 ` [PATCH V5 14/18] amdkfd: record kfd context id into kfd process_info Zhu Lingshan
2025-10-17 8:42 ` [PATCH V5 15/18] amdkfd: record kfd context id in amdkfd_fence Zhu Lingshan
2025-10-17 8:42 ` [PATCH V5 16/18] amdkfd: fence handler evict and restore a kfd process by its context id Zhu Lingshan
2025-10-17 8:42 ` [PATCH V5 17/18] amdkfd: process debug trap ioctl only on a primary context Zhu Lingshan
2025-10-17 23:32 ` Felix Kuehling
2025-10-22 6:55 ` Zhu, Lingshan
2025-10-17 8:42 ` [PATCH V5 18/18] amdkfd: introduce new ioctl AMDKFD_IOC_CREATE_PROCESS Zhu Lingshan
2025-10-17 23:55 ` Felix Kuehling [this message]
2025-10-22 6:59 ` Zhu, Lingshan
2025-10-17 23:58 ` [PATCH V5 00/18] [PATCH V4 00/18] amdkfd: Implement kfd multiple contexts Felix Kuehling
2025-10-22 7:01 ` Zhu, Lingshan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=b545b31e-26ff-4b7a-be86-312d9aa96e97@amd.com \
--to=felix.kuehling@amd.com \
--cc=alexander.deucher@amd.com \
--cc=amd-gfx@lists.freedesktop.org \
--cc=lingshan.zhu@amd.com \
--cc=ray.huang@amd.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox