From: "Christian König" <christian.koenig@amd.com>
To: Alex Deucher <alexander.deucher@amd.com>, amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 03/10] drm/amdgpu/job: use GFP_ATOMIC while in gpu reset
Date: Mon, 19 Jan 2026 13:27:02 +0100
Message-ID: <9fc31b01-2d1d-4915-9c10-c140a754fd59@amd.com>
In-Reply-To: <20260116162027.21550-4-alexander.deucher@amd.com>

On 1/16/26 17:20, Alex Deucher wrote:
> If we need to allocate a job during GPU reset, use
> GFP_ATOMIC rather than GFP_KERNEL.
>
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 9 ++++++---
> drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 3 ++-
> drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c | 6 ++++--
> 4 files changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> index 72ec455fa932c..136e50de712a0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
> @@ -68,7 +68,7 @@ int amdgpu_ib_get(struct amdgpu_device *adev, struct amdgpu_vm *vm,
> int r;
>
> if (size) {
> - r = amdgpu_sa_bo_new(&adev->ib_pools[pool_type],
> + r = amdgpu_sa_bo_new(adev, &adev->ib_pools[pool_type],
> &ib->sa_bo, size);
> if (r) {
> dev_err(adev->dev, "failed to get a new IB (%d)\n", r);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index 1daa9145b217e..c7e4d79b9f61d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -192,18 +192,21 @@ int amdgpu_job_alloc(struct amdgpu_device *adev, struct amdgpu_vm *vm,
> if (num_ibs == 0)
> return -EINVAL;
>
> - *job = kzalloc(struct_size(*job, ibs, num_ibs), GFP_KERNEL);
> + *job = kzalloc(struct_size(*job, ibs, num_ibs),
> + amdgpu_in_reset(adev) ? GFP_ATOMIC : GFP_KERNEL);

That's an extremely bad idea: amdgpu_in_reset() returns true even outside of the reset thread.

We really need to look at the pool type instead.

Regards,
Christian.

> if (!*job)
> return -ENOMEM;
>
> - af = kzalloc(sizeof(struct amdgpu_fence), GFP_KERNEL);
> + af = kzalloc(sizeof(struct amdgpu_fence),
> + amdgpu_in_reset(adev) ? GFP_ATOMIC : GFP_KERNEL);
> if (!af) {
> r = -ENOMEM;
> goto err_job;
> }
> (*job)->hw_fence = af;
>
> - af = kzalloc(sizeof(struct amdgpu_fence), GFP_KERNEL);
> + af = kzalloc(sizeof(struct amdgpu_fence),
> + amdgpu_in_reset(adev) ? GFP_ATOMIC : GFP_KERNEL);
> if (!af) {
> r = -ENOMEM;
> goto err_fence;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> index 912c9afaf9e11..7ee0cc46b4608 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> @@ -339,7 +339,8 @@ void amdgpu_sa_bo_manager_fini(struct amdgpu_device *adev,
> struct amdgpu_sa_manager *sa_manager);
> int amdgpu_sa_bo_manager_start(struct amdgpu_device *adev,
> struct amdgpu_sa_manager *sa_manager);
> -int amdgpu_sa_bo_new(struct amdgpu_sa_manager *sa_manager,
> +int amdgpu_sa_bo_new(struct amdgpu_device *adev,
> + struct amdgpu_sa_manager *sa_manager,
> struct drm_suballoc **sa_bo,
> unsigned int size);
> void amdgpu_sa_bo_free(struct drm_suballoc **sa_bo,
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c
> index 39070b2a4c04f..fc13969f8ef49 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c
> @@ -76,12 +76,14 @@ void amdgpu_sa_bo_manager_fini(struct amdgpu_device *adev,
> amdgpu_bo_free_kernel(&sa_manager->bo, &sa_manager->gpu_addr, &sa_manager->cpu_ptr);
> }
>
> -int amdgpu_sa_bo_new(struct amdgpu_sa_manager *sa_manager,
> +int amdgpu_sa_bo_new(struct amdgpu_device *adev,
> + struct amdgpu_sa_manager *sa_manager,
> struct drm_suballoc **sa_bo,
> unsigned int size)
> {
> struct drm_suballoc *sa = drm_suballoc_new(&sa_manager->base, size,
> - GFP_KERNEL, false, 0);
> + amdgpu_in_reset(adev) ? GFP_ATOMIC : GFP_KERNEL,
> + false, 0);
>
> if (IS_ERR(sa)) {
> *sa_bo = NULL;
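
As an illustration of the review comment above, the allocation flags could be derived from the IB pool type instead of from amdgpu_in_reset(). The following is only a userspace sketch under stated assumptions: example_ib_pool_gfp() is a hypothetical helper that is not part of the patch or the driver, the gfp_t values are stand-ins so the code builds outside the kernel, and treating AMDGPU_IB_POOL_DIRECT as the only pool that needs GFP_ATOMIC is an assumption rather than something stated in this thread.

```c
#include <assert.h>

/* Stand-ins for the kernel's gfp_t and GFP flags so the sketch builds in
 * userspace. The numeric values are arbitrary, not the kernel's. */
typedef unsigned int gfp_t;
#define GFP_KERNEL 0x1u
#define GFP_ATOMIC 0x2u

/* Mirrors the names of amdgpu's IB pool types. */
enum amdgpu_ib_pool_type {
	AMDGPU_IB_POOL_DELAYED,   /* normal submissions */
	AMDGPU_IB_POOL_IMMEDIATE, /* immediate submissions */
	AMDGPU_IB_POOL_DIRECT,    /* direct ring submission (init and reset) */
	AMDGPU_IB_POOL_MAX
};

/*
 * Hypothetical helper: pick the allocation flags from the pool the caller
 * asked for. Unlike a device-wide "in reset" flag, the pool type is a
 * property of this particular request, so a thread that merely runs while
 * a reset is in progress still gets GFP_KERNEL.
 *
 * Assumption: only AMDGPU_IB_POOL_DIRECT requests must not sleep.
 */
static gfp_t example_ib_pool_gfp(enum amdgpu_ib_pool_type pool_type)
{
	assert(pool_type < AMDGPU_IB_POOL_MAX);

	return pool_type == AMDGPU_IB_POOL_DIRECT ? GFP_ATOMIC : GFP_KERNEL;
}
```

A caller that already knows its pool type, such as amdgpu_ib_get(), could pass example_ib_pool_gfp(pool_type) down to the allocator, so that the decision no longer depends on whether some other thread happens to be resetting the GPU.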