From: "Christian König" <christian.koenig@amd.com>
To: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
intel-xe@lists.freedesktop.org
Cc: Felix Kuehling <Felix.Kuehling@amd.com>,
Alex Deucher <alexander.deucher@amd.com>,
David Airlie <airlied@gmail.com>, Simona Vetter <simona@ffwll.ch>,
Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
Maxime Ripard <mripard@kernel.org>,
Thomas Zimmermann <tzimmermann@suse.de>,
Danilo Krummrich <dakr@kernel.org>,
Matthew Brost <matthew.brost@intel.com>,
Alice Ryhl <aliceryhl@google.com>,
Rob Clark <robin.clark@oss.qualcomm.com>,
Dmitry Baryshkov <lumag@kernel.org>,
Abhinav Kumar <abhinav.kumar@linux.dev>,
Jessica Zhang <jesszhan0024@gmail.com>,
Sean Paul <sean@poorly.run>,
Marijn Suijten <marijn.suijten@somainline.org>,
amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
linux-arm-msm@vger.kernel.org, freedreno@lists.freedesktop.org
Subject: Re: [PATCH 4/5] drm/exec, drm/xe: Avoid abusing the drm_exec retry pointer
Date: Tue, 31 Mar 2026 13:59:40 +0200 [thread overview]
Message-ID: <41747ef2-54d6-45e6-8d19-d682817c23a3@amd.com> (raw)
In-Reply-To: <e4855d379990345e47e1175ff4b20a757888ff42.camel@linux.intel.com>
On 3/31/26 12:13, Thomas Hellström wrote:
> On Tue, 2026-03-31 at 11:44 +0200, Christian König wrote:
>> On 3/31/26 11:20, Thomas Hellström wrote:
>>> The xe driver was using the drm_exec retry pointer directly to
>>> restart the locking loop after out-of-memory errors. This is
>>> relying on documented behaviour.
>>>
>>> Instead add a drm_exec_retry() macro that can be used in this
>>> situation, and that also asserts that the struct drm_exec is
>>> in a state that is compatible with retrying:
>>> Either newly initialized or in a contended state with all locks
>>> dropped.
>>>
>>> Use that macro in xe.
>>>
>>> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
>>> ---
>>> drivers/gpu/drm/xe/xe_validation.h | 2 +-
>>> include/drm/drm_exec.h | 13 +++++++++++++
>>> 2 files changed, 14 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_validation.h
>>> b/drivers/gpu/drm/xe/xe_validation.h
>>> index a30e732c4d51..4cd955ce6cd2 100644
>>> --- a/drivers/gpu/drm/xe/xe_validation.h
>>> +++ b/drivers/gpu/drm/xe/xe_validation.h
>>> @@ -146,7 +146,7 @@ bool xe_validation_should_retry(struct
>>> xe_validation_ctx *ctx, int *ret);
>>> #define xe_validation_retry_on_oom(_ctx,
>>> _ret) \
>>> do
>>> { \
>>> if (xe_validation_should_retry(_ctx,
>>> _ret)) \
>>> - goto
>>> *__drm_exec_retry_ptr; \
>>> + drm_exec_retry((_ctx)-
>>>> exec); \
>>
>> Oh, that goto is extremely questionable to begin with.
>>
>>> } while (0)
>>>
>>> /**
>>> diff --git a/include/drm/drm_exec.h b/include/drm/drm_exec.h
>>> index fc95a979e253..5ed5be1f8244 100644
>>> --- a/include/drm/drm_exec.h
>>> +++ b/include/drm/drm_exec.h
>>> @@ -138,6 +138,19 @@ static inline bool
>>> drm_exec_is_contended(struct drm_exec *exec)
>>> return !!exec->contended;
>>> }
>>>
>>> +/**
>>> + * drm_exec_retry() - Unconditionally restart the loop to grab all
>>> locks.
>>> + * @exec: drm_exec object
>>> + *
>>> + * Unconditionally retry the loop to lock all objects. For
>>> consistency,
>>> + * the exec object needs to be newly initialized or contended.
>>> + */
>>> +#define drm_exec_retry(_exec) \
>>> + do { \
>>> + WARN_ON(!drm_exec_is_contended(_exec)); \
>>
>> This warning would trigger!
>>
>> See the code in xe_bo_notifier_prepare_pinned() for example:
>>
>> drm_exec_retry_on_contention(&exec);
>> ret = PTR_ERR(backup);
>> xe_validation_retry_on_oom(&ctx, &ret);
>>
>> Without contention we would just skip the loop and never lock
>> anything.
>>
>> What XE does here just doesn't work as far as I can see.
>
> So if the xe_validation_retry_on_oom() is actually retrying it
> internally call drm_exec_fini() and drm_exec_init() first, which means
> that the warning doesn't trigger, due to the dummy value of contended.
Ah! Yeah that information was missing.
I'm really wondering if the calls to drm_exec_fini()/drm_exec_init() should be part of the drm_exec_retry() handling.
Otherwise that is kind of easy to mess up.
Regards,
Christian.
>
> So the warning does its job, and xe is safe.
>
> Thanks,
> Thomas
>
>
>
>>
>> Regards,
>> Christian.
>>
>>> + goto *__drm_exec_retry_ptr; \
>>> + } while (0)
>>> +
>>> void drm_exec_init(struct drm_exec *exec, u32 flags, unsigned nr);
>>> void drm_exec_fini(struct drm_exec *exec);
>>> bool drm_exec_cleanup(struct drm_exec *exec);
next prev parent reply other threads:[~2026-03-31 11:59 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-31 9:20 [PATCH 0/5] drm/exec: drm_exec polishing Thomas Hellström
2026-03-31 9:20 ` [PATCH 1/5] drm/exec: Remove the index parameter from drm_exec_for_each_locked_obj[_reverse] Thomas Hellström
2026-03-31 9:29 ` Christian König
2026-03-31 9:20 ` [PATCH 2/5] drm/msm: Remove abuse of drm_exec internals Thomas Hellström
2026-03-31 9:30 ` Christian König
2026-03-31 9:36 ` Christian König
2026-03-31 19:08 ` Rob Clark
2026-03-31 19:52 ` Thomas Hellström
2026-03-31 20:39 ` Rob Clark
2026-03-31 9:20 ` [PATCH 3/5] drm/exec: Make the drm_exec_until_all_locked() macro more readable Thomas Hellström
2026-03-31 9:39 ` Christian König
2026-03-31 11:03 ` Thomas Hellström
2026-03-31 9:20 ` [PATCH 4/5] drm/exec, drm/xe: Avoid abusing the drm_exec retry pointer Thomas Hellström
2026-03-31 9:44 ` Christian König
2026-03-31 10:13 ` Thomas Hellström
2026-03-31 11:09 ` Thomas Hellström
2026-03-31 11:59 ` Christian König [this message]
2026-03-31 9:20 ` [PATCH 5/5] drm/exec, drm/xe, drm/amdgpu: Add an accessor for struct drm_exec::ticket Thomas Hellström
2026-03-31 9:46 ` Christian König
2026-03-31 10:18 ` Thomas Hellström
2026-03-31 21:46 ` kernel test robot
2026-03-31 22:07 ` kernel test robot
2026-04-01 0:38 ` kernel test robot
2026-03-31 9:49 ` ✗ CI.checkpatch: warning for drm/exec: drm_exec polishing Patchwork
2026-03-31 9:51 ` ✓ CI.KUnit: success " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=41747ef2-54d6-45e6-8d19-d682817c23a3@amd.com \
--to=christian.koenig@amd.com \
--cc=Felix.Kuehling@amd.com \
--cc=abhinav.kumar@linux.dev \
--cc=airlied@gmail.com \
--cc=alexander.deucher@amd.com \
--cc=aliceryhl@google.com \
--cc=amd-gfx@lists.freedesktop.org \
--cc=dakr@kernel.org \
--cc=dri-devel@lists.freedesktop.org \
--cc=freedreno@lists.freedesktop.org \
--cc=intel-xe@lists.freedesktop.org \
--cc=jesszhan0024@gmail.com \
--cc=linux-arm-msm@vger.kernel.org \
--cc=lumag@kernel.org \
--cc=maarten.lankhorst@linux.intel.com \
--cc=marijn.suijten@somainline.org \
--cc=matthew.brost@intel.com \
--cc=mripard@kernel.org \
--cc=robin.clark@oss.qualcomm.com \
--cc=sean@poorly.run \
--cc=simona@ffwll.ch \
--cc=thomas.hellstrom@linux.intel.com \
--cc=tzimmermann@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox