From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: "Christian König" <christian.koenig@amd.com>,
intel-xe@lists.freedesktop.org
Cc: Felix Kuehling <Felix.Kuehling@amd.com>,
Alex Deucher <alexander.deucher@amd.com>,
David Airlie <airlied@gmail.com>,
Simona Vetter <simona@ffwll.ch>,
Maarten Lankhorst <maarten.lankhorst@linux.intel.com>,
Maxime Ripard <mripard@kernel.org>,
Thomas Zimmermann <tzimmermann@suse.de>,
Danilo Krummrich <dakr@kernel.org>,
Matthew Brost <matthew.brost@intel.com>,
Alice Ryhl <aliceryhl@google.com>,
Rob Clark <robin.clark@oss.qualcomm.com>,
Dmitry Baryshkov <lumag@kernel.org>,
Abhinav Kumar <abhinav.kumar@linux.dev>,
Jessica Zhang <jesszhan0024@gmail.com>,
Sean Paul <sean@poorly.run>,
Marijn Suijten <marijn.suijten@somainline.org>,
amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
linux-arm-msm@vger.kernel.org, freedreno@lists.freedesktop.org
Subject: Re: [PATCH 4/5] drm/exec, drm/xe: Avoid abusing the drm_exec retry pointer
Date: Tue, 31 Mar 2026 13:09:56 +0200 [thread overview]
Message-ID: <0e73b7e6dbcec756332f672c443ff127c1707fa4.camel@linux.intel.com> (raw)
In-Reply-To: <e4855d379990345e47e1175ff4b20a757888ff42.camel@linux.intel.com>
On Tue, 2026-03-31 at 12:13 +0200, Thomas Hellström wrote:
> On Tue, 2026-03-31 at 11:44 +0200, Christian König wrote:
> > On 3/31/26 11:20, Thomas Hellström wrote:
> > > The xe driver was using the drm_exec retry pointer directly to
> > > restart the locking loop after out-of-memory errors. This is
> > > relying on documented behaviour.
> > >
> > > Instead add a drm_exec_retry() macro that can be used in this
> > > situation, and that also asserts that the struct drm_exec is
> > > in a state that is compatible with retrying:
> > > Either newly initialized or in a contended state with all locks
> > > dropped.
> > >
> > > Use that macro in xe.
> > >
> > > Signed-off-by: Thomas Hellström
> > > <thomas.hellstrom@linux.intel.com>
> > > ---
> > > drivers/gpu/drm/xe/xe_validation.h | 2 +-
> > > include/drm/drm_exec.h | 13 +++++++++++++
> > > 2 files changed, 14 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_validation.h
> > > b/drivers/gpu/drm/xe/xe_validation.h
> > > index a30e732c4d51..4cd955ce6cd2 100644
> > > --- a/drivers/gpu/drm/xe/xe_validation.h
> > > +++ b/drivers/gpu/drm/xe/xe_validation.h
> > > @@ -146,7 +146,7 @@ bool xe_validation_should_retry(struct
> > > xe_validation_ctx *ctx, int *ret);
> > > #define xe_validation_retry_on_oom(_ctx,
> > > _ret) \
> > > do
> > > { \
> > > if (xe_validation_should_retry(_ctx,
> > > _ret)) \
> > > - goto
> > > *__drm_exec_retry_ptr; \
> > > + drm_exec_retry((_ctx)-
> > > > exec); \
> >
> > Oh, that goto is extremely questionable to begin with.
> >
> > > } while (0)
> > >
> > > /**
> > > diff --git a/include/drm/drm_exec.h b/include/drm/drm_exec.h
> > > index fc95a979e253..5ed5be1f8244 100644
> > > --- a/include/drm/drm_exec.h
> > > +++ b/include/drm/drm_exec.h
> > > @@ -138,6 +138,19 @@ static inline bool
> > > drm_exec_is_contended(struct drm_exec *exec)
> > > return !!exec->contended;
> > > }
> > >
> > > +/**
> > > + * drm_exec_retry() - Unconditionally restart the loop to grab
> > > all
> > > locks.
> > > + * @exec: drm_exec object
> > > + *
> > > + * Unconditionally retry the loop to lock all objects. For
> > > consistency,
> > > + * the exec object needs to be newly initialized or contended.
> > > + */
> > > +#define drm_exec_retry(_exec) \
> > > + do { \
> > > + WARN_ON(!drm_exec_is_contended(_exec)); \
> >
> > This warning would trigger!
> >
> > See the code in xe_bo_notifier_prepare_pinned() for example:
> >
> > drm_exec_retry_on_contention(&exec);
> > ret = PTR_ERR(backup);
> > xe_validation_retry_on_oom(&ctx, &ret);
> >
> > Without contention we would just skip the loop and never lock
> > anything.
> >
> > What XE does here just doesn't work as far as I can see.
>
> So if the xe_validation_retry_on_oom() is actually retrying it
> internally call drm_exec_fini() and drm_exec_init() first, which
> means
> that the warning doesn't trigger, due to the dummy value of
> contended.
>
> So the warning does its job, and xe is safe.
So the xe stuff is actually basically an outer loop to
drm_exec_until_all_locked().
We could ofc explicitly code that implementing an
xe_validation_until_all_valid() and have a separate goto ptr, but I'm
not sure that is cleaner, really. They'd point to the same address
anyway.
In the end, the WARN_ON in drm_exec_retry() would ensure drm_exec is
not in an awkward state anyway.
Thanks,
Thomas
>
> Thanks,
> Thomas
>
>
>
> >
> > Regards,
> > Christian.
> >
> > > + goto *__drm_exec_retry_ptr; \
> > > + } while (0)
> > > +
> > > void drm_exec_init(struct drm_exec *exec, u32 flags, unsigned
> > > nr);
> > > void drm_exec_fini(struct drm_exec *exec);
> > > bool drm_exec_cleanup(struct drm_exec *exec);
next prev parent reply other threads:[~2026-04-01 18:19 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-31 9:20 [PATCH 0/5] drm/exec: drm_exec polishing Thomas Hellström
2026-03-31 9:20 ` [PATCH 1/5] drm/exec: Remove the index parameter from drm_exec_for_each_locked_obj[_reverse] Thomas Hellström
2026-03-31 9:29 ` Christian König
2026-03-31 9:20 ` [PATCH 2/5] drm/msm: Remove abuse of drm_exec internals Thomas Hellström
2026-03-31 9:30 ` Christian König
2026-03-31 9:36 ` Christian König
2026-03-31 19:08 ` Rob Clark
2026-03-31 19:52 ` Thomas Hellström
2026-03-31 20:39 ` Rob Clark
2026-03-31 9:20 ` [PATCH 3/5] drm/exec: Make the drm_exec_until_all_locked() macro more readable Thomas Hellström
2026-03-31 9:39 ` Christian König
2026-03-31 11:03 ` Thomas Hellström
2026-03-31 9:20 ` [PATCH 4/5] drm/exec, drm/xe: Avoid abusing the drm_exec retry pointer Thomas Hellström
2026-03-31 9:44 ` Christian König
2026-03-31 10:13 ` Thomas Hellström
2026-03-31 11:09 ` Thomas Hellström [this message]
2026-03-31 11:59 ` Christian König
2026-03-31 9:20 ` [PATCH 5/5] drm/exec, drm/xe, drm/amdgpu: Add an accessor for struct drm_exec::ticket Thomas Hellström
2026-03-31 9:46 ` Christian König
2026-03-31 10:18 ` Thomas Hellström
2026-03-31 21:46 ` kernel test robot
2026-03-31 22:07 ` kernel test robot
2026-04-01 0:38 ` kernel test robot
2026-03-31 9:49 ` ✗ CI.checkpatch: warning for drm/exec: drm_exec polishing Patchwork
2026-03-31 9:51 ` ✓ CI.KUnit: success " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0e73b7e6dbcec756332f672c443ff127c1707fa4.camel@linux.intel.com \
--to=thomas.hellstrom@linux.intel.com \
--cc=Felix.Kuehling@amd.com \
--cc=abhinav.kumar@linux.dev \
--cc=airlied@gmail.com \
--cc=alexander.deucher@amd.com \
--cc=aliceryhl@google.com \
--cc=amd-gfx@lists.freedesktop.org \
--cc=christian.koenig@amd.com \
--cc=dakr@kernel.org \
--cc=dri-devel@lists.freedesktop.org \
--cc=freedreno@lists.freedesktop.org \
--cc=intel-xe@lists.freedesktop.org \
--cc=jesszhan0024@gmail.com \
--cc=linux-arm-msm@vger.kernel.org \
--cc=lumag@kernel.org \
--cc=maarten.lankhorst@linux.intel.com \
--cc=marijn.suijten@somainline.org \
--cc=matthew.brost@intel.com \
--cc=mripard@kernel.org \
--cc=robin.clark@oss.qualcomm.com \
--cc=sean@poorly.run \
--cc=simona@ffwll.ch \
--cc=tzimmermann@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.