Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Brost <matthew.brost@intel.com>
To: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Gajendra Uttamchand <gajendra.uttamchand@intel.com>,
	<intel-xe@lists.freedesktop.org>,
	<Uttamchand@soc-5cg1426vcc.clients.intel.com>
Subject: Re: [PATCH 1/2] [PATCH] drm/xe/lrc: document sentinel and make CTX_TIMESTAMP read TOCTOU-safe
Date: Fri, 12 Jun 2026 10:33:54 -0700	[thread overview]
Message-ID: <aixDAgen0XZCImj0@gsse-cloud1.jf.intel.com> (raw)
In-Reply-To: <aixBttaQIeDTFyaR@soc-5CG1426VCC.clients.intel.com>

On Fri, Jun 12, 2026 at 10:28:22AM -0700, Umesh Nerlige Ramappa wrote:
> On Fri, Jun 12, 2026 at 03:27:47AM +0000, Gajendra Uttamchand wrote:
> > From: "Uttamchand, Gajendra" <gajendra.uttamchand@intel.com>
> > 
> > Problem: CTX_TIMESTAMP MMIO reads could be stale if a context
> > switched out between check and read; LRC stores a sentinel while
> > a context starts that must not be treated as a real timestamp.
> > 
> > Fix: Check the LRC-stored sentinel before and after the MMIO read;
> > return the LRC value if the context switched out to avoid TOCTOU.
> > 
> > Note: Keep XE_LRC_CTX_TIMESTAMP_ACTIVE in xe_lrc.h as the
> > canonical sentinel.
> > 
> > Signed-off-by: Uttamchand, Gajendra <gajendra.uttamchand@intel.com>
> 
> Hi Gajendra,
> 
> Thanks for addressing this.
> 
> The patch looks correct. I think this regression was introduced when refactoring for multi-queue, so please add a Fixes tag to this.
> 
> Fixes: d243ef6a39c6 ("drm/xe/lrc: Refactor xe_lrc_timestamp to simplify logic")
> 
> With that, this is
> 
> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> 

Sashiko [1] also flagged some other valid existing issue. There are very
hard to hit race, but look valid. I don't think it worth blocking this
patch and this fix itself LGTM but I think we should open a Jira to
track the issues raised by Sashiko and get those fixed sooner or later.

With the fixes tag Umesh suggests:
Acked-by: Matthew Brost <matthew.brost@intel.com>

[1] https://sashiko.dev/#/patchset/20260612032745.179673-4-gajendra.uttamchand%40intel.com

> Umesh
> 
> > ---
> > drivers/gpu/drm/xe/xe_lrc.c | 29 +++++++++++++++++++++--------
> > drivers/gpu/drm/xe/xe_lrc.h |  7 +++++++
> > 2 files changed, 28 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
> > index a4292a11391d..e284c008c126 100644
> > --- a/drivers/gpu/drm/xe/xe_lrc.c
> > +++ b/drivers/gpu/drm/xe/xe_lrc.c
> > @@ -1096,7 +1096,7 @@ static void xe_lrc_finish(struct xe_lrc *lrc)
> >  * on until it is scheduled, we also read the ENGINE_ID MMIO in the WA BB and
> >  * store it in the PPHSWP.
> >  */
> > -#define CONTEXT_ACTIVE 1ULL
> > +#define CONTEXT_ACTIVE XE_LRC_CTX_TIMESTAMP_ACTIVE
> > static ssize_t setup_utilization_wa(struct xe_lrc *lrc,
> > 				    struct xe_hw_engine *hwe,
> > 				    u32 *batch,
> > @@ -2720,21 +2720,34 @@ static u64 xe_lrc_update_multi_queue_timestamp(struct xe_lrc *lrc, u64 *old_ts)
> > static u64 xe_lrc_context_timestamp(struct xe_lrc *lrc)
> > {
> > 	u64 reg_ts, new_ts = lrc->ctx_timestamp;
> > +	u64 stored;
> > 
> > 	/* CTX_TIMESTAMP mmio read is invalid on VF, so return the LRC value */
> > 	if (IS_SRIOV_VF(lrc_to_xe(lrc)))
> > 		return xe_lrc_ctx_timestamp(lrc);
> > 
> > -	if (context_active(lrc) &&
> > -	    !get_ctx_timestamp(lrc, xe_lrc_engine_id(lrc), &reg_ts))
> > +	/*
> > +	 * Safely read CTX_TIMESTAMP: check the LRC-stored value before and
> > +	 * after the MMIO read to avoid a TOCTOU where a context switch makes the
> > +	 * MMIO value stale. If the LRC value is not `CONTEXT_ACTIVE` return it;
> > +	 * otherwise accept the MMIO value only if the context remained active.
> > +	 */
> > +
> > +	stored = xe_lrc_ctx_timestamp(lrc);
> > +	if (stored != CONTEXT_ACTIVE)
> > +		return stored;
> > +
> > +	/* Context is active: read the live timestamp from the engine's MMIO register. */
> > +	if (!get_ctx_timestamp(lrc, xe_lrc_engine_id(lrc), &reg_ts))
> > 		new_ts = reg_ts;
> > 
> > -	/*
> > -	 * If context swicthed out while we were here, just return the latest
> > -	 * LRC CTX TIMESTAMP value.
> > +	/* Re-check the LRC-stored timestamp: if the context switched out while
> > +	 * reading MMIO the hardware saved the canonical timestamp into the LRC
> > +	 * during context-save, so return that value instead of the MMIO read.
> > 	 */
> > -	if (!context_active(lrc))
> > -		return xe_lrc_ctx_timestamp(lrc);
> > +	stored = xe_lrc_ctx_timestamp(lrc);
> > +	if (stored != CONTEXT_ACTIVE)
> > +		return stored;
> > 
> > 	return new_ts;
> > }
> > diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
> > index 0a3a611391ee..7be5e3da8bc8 100644
> > --- a/drivers/gpu/drm/xe/xe_lrc.h
> > +++ b/drivers/gpu/drm/xe/xe_lrc.h
> > @@ -9,6 +9,13 @@
> > 
> > #include "xe_lrc_types.h"
> > 
> > +/*
> > + * Sentinel value stored in lrc->ctx_timestamp while a context is starting.
> > + * The hardware hasn't yet written the real CTX_TIMESTAMP, so this is not a
> > + * valid elapsed-time sample and must not be used as one.
> > + */
> > +#define XE_LRC_CTX_TIMESTAMP_ACTIVE 1ULL
> > +
> > struct drm_printer;
> > struct xe_bb;
> > struct xe_device;
> > --
> > 2.43.0
> > 

  reply	other threads:[~2026-06-12 17:34 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-12  3:27 [PATCH 0/2] drm/xe: make CTX_TIMESTAMP TOCTOU-safe and handle sentinels Gajendra Uttamchand
2026-06-12  3:27 ` [PATCH 1/2] [PATCH] drm/xe/lrc: document sentinel and make CTX_TIMESTAMP read TOCTOU-safe Gajendra Uttamchand
2026-06-12 17:28   ` Umesh Nerlige Ramappa
2026-06-12 17:33     ` Matthew Brost [this message]
2026-06-12  3:27 ` [PATCH 2/2] [PATCH] drm/xe/guc: treat sentinel timestamps as no elapsed time to avoid false TDR Gajendra Uttamchand
2026-06-12  5:20   ` Matthew Brost
2026-06-12  3:33 ` ✗ CI.checkpatch: warning for drm/xe: make CTX_TIMESTAMP TOCTOU-safe and handle sentinels Patchwork
2026-06-12  3:35 ` ✓ CI.KUnit: success " Patchwork
2026-06-12  4:13 ` ✓ Xe.CI.BAT: " Patchwork
2026-06-12 19:31 ` ✓ Xe.CI.FULL: " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aixDAgen0XZCImj0@gsse-cloud1.jf.intel.com \
    --to=matthew.brost@intel.com \
    --cc=Uttamchand@soc-5cg1426vcc.clients.intel.com \
    --cc=gajendra.uttamchand@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=umesh.nerlige.ramappa@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox