All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Brost <matthew.brost@intel.com>
To: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Gajendra Uttamchand <gajendra.uttamchand@intel.com>,
	<intel-xe@lists.freedesktop.org>,
	<Uttamchand@soc-5cg1426vcc.clients.intel.com>
Subject: Re: [PATCH 1/2] [PATCH] drm/xe/lrc: document sentinel and make CTX_TIMESTAMP read TOCTOU-safe
Date: Fri, 12 Jun 2026 10:33:54 -0700	[thread overview]
Message-ID: <aixDAgen0XZCImj0@gsse-cloud1.jf.intel.com> (raw)
In-Reply-To: <aixBttaQIeDTFyaR@soc-5CG1426VCC.clients.intel.com>

On Fri, Jun 12, 2026 at 10:28:22AM -0700, Umesh Nerlige Ramappa wrote:
> On Fri, Jun 12, 2026 at 03:27:47AM +0000, Gajendra Uttamchand wrote:
> > From: "Uttamchand, Gajendra" <gajendra.uttamchand@intel.com>
> > 
> > Problem: CTX_TIMESTAMP MMIO reads could be stale if a context
> > switched out between check and read; LRC stores a sentinel while
> > a context starts that must not be treated as a real timestamp.
> > 
> > Fix: Check the LRC-stored sentinel before and after the MMIO read;
> > return the LRC value if the context switched out to avoid TOCTOU.
> > 
> > Note: Keep XE_LRC_CTX_TIMESTAMP_ACTIVE in xe_lrc.h as the
> > canonical sentinel.
> > 
> > Signed-off-by: Uttamchand, Gajendra <gajendra.uttamchand@intel.com>
> 
> Hi Gajendra,
> 
> Thanks for addressing this.
> 
> The patch looks correct. I think this regression was introduced when refactoring for multi-queue, so please add a Fixes tag to this.
> 
> Fixes: d243ef6a39c6 ("drm/xe/lrc: Refactor xe_lrc_timestamp to simplify logic")
> 
> With that, this is
> 
> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> 

Sashiko [1] also flagged some other valid existing issue. There are very
hard to hit race, but look valid. I don't think it worth blocking this
patch and this fix itself LGTM but I think we should open a Jira to
track the issues raised by Sashiko and get those fixed sooner or later.

With the fixes tag Umesh suggests:
Acked-by: Matthew Brost <matthew.brost@intel.com>

[1] https://sashiko.dev/#/patchset/20260612032745.179673-4-gajendra.uttamchand%40intel.com

> Umesh
> 
> > ---
> > drivers/gpu/drm/xe/xe_lrc.c | 29 +++++++++++++++++++++--------
> > drivers/gpu/drm/xe/xe_lrc.h |  7 +++++++
> > 2 files changed, 28 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
> > index a4292a11391d..e284c008c126 100644
> > --- a/drivers/gpu/drm/xe/xe_lrc.c
> > +++ b/drivers/gpu/drm/xe/xe_lrc.c
> > @@ -1096,7 +1096,7 @@ static void xe_lrc_finish(struct xe_lrc *lrc)
> >  * on until it is scheduled, we also read the ENGINE_ID MMIO in the WA BB and
> >  * store it in the PPHSWP.
> >  */
> > -#define CONTEXT_ACTIVE 1ULL
> > +#define CONTEXT_ACTIVE XE_LRC_CTX_TIMESTAMP_ACTIVE
> > static ssize_t setup_utilization_wa(struct xe_lrc *lrc,
> > 				    struct xe_hw_engine *hwe,
> > 				    u32 *batch,
> > @@ -2720,21 +2720,34 @@ static u64 xe_lrc_update_multi_queue_timestamp(struct xe_lrc *lrc, u64 *old_ts)
> > static u64 xe_lrc_context_timestamp(struct xe_lrc *lrc)
> > {
> > 	u64 reg_ts, new_ts = lrc->ctx_timestamp;
> > +	u64 stored;
> > 
> > 	/* CTX_TIMESTAMP mmio read is invalid on VF, so return the LRC value */
> > 	if (IS_SRIOV_VF(lrc_to_xe(lrc)))
> > 		return xe_lrc_ctx_timestamp(lrc);
> > 
> > -	if (context_active(lrc) &&
> > -	    !get_ctx_timestamp(lrc, xe_lrc_engine_id(lrc), &reg_ts))
> > +	/*
> > +	 * Safely read CTX_TIMESTAMP: check the LRC-stored value before and
> > +	 * after the MMIO read to avoid a TOCTOU where a context switch makes the
> > +	 * MMIO value stale. If the LRC value is not `CONTEXT_ACTIVE` return it;
> > +	 * otherwise accept the MMIO value only if the context remained active.
> > +	 */
> > +
> > +	stored = xe_lrc_ctx_timestamp(lrc);
> > +	if (stored != CONTEXT_ACTIVE)
> > +		return stored;
> > +
> > +	/* Context is active: read the live timestamp from the engine's MMIO register. */
> > +	if (!get_ctx_timestamp(lrc, xe_lrc_engine_id(lrc), &reg_ts))
> > 		new_ts = reg_ts;
> > 
> > -	/*
> > -	 * If context swicthed out while we were here, just return the latest
> > -	 * LRC CTX TIMESTAMP value.
> > +	/* Re-check the LRC-stored timestamp: if the context switched out while
> > +	 * reading MMIO the hardware saved the canonical timestamp into the LRC
> > +	 * during context-save, so return that value instead of the MMIO read.
> > 	 */
> > -	if (!context_active(lrc))
> > -		return xe_lrc_ctx_timestamp(lrc);
> > +	stored = xe_lrc_ctx_timestamp(lrc);
> > +	if (stored != CONTEXT_ACTIVE)
> > +		return stored;
> > 
> > 	return new_ts;
> > }
> > diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
> > index 0a3a611391ee..7be5e3da8bc8 100644
> > --- a/drivers/gpu/drm/xe/xe_lrc.h
> > +++ b/drivers/gpu/drm/xe/xe_lrc.h
> > @@ -9,6 +9,13 @@
> > 
> > #include "xe_lrc_types.h"
> > 
> > +/*
> > + * Sentinel value stored in lrc->ctx_timestamp while a context is starting.
> > + * The hardware hasn't yet written the real CTX_TIMESTAMP, so this is not a
> > + * valid elapsed-time sample and must not be used as one.
> > + */
> > +#define XE_LRC_CTX_TIMESTAMP_ACTIVE 1ULL
> > +
> > struct drm_printer;
> > struct xe_bb;
> > struct xe_device;
> > --
> > 2.43.0
> > 

  reply	other threads:[~2026-06-12 17:34 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-12  3:27 [PATCH 0/2] drm/xe: make CTX_TIMESTAMP TOCTOU-safe and handle sentinels Gajendra Uttamchand
2026-06-12  3:27 ` [PATCH 1/2] [PATCH] drm/xe/lrc: document sentinel and make CTX_TIMESTAMP read TOCTOU-safe Gajendra Uttamchand
2026-06-12 17:28   ` Umesh Nerlige Ramappa
2026-06-12 17:33     ` Matthew Brost [this message]
2026-06-12  3:27 ` [PATCH 2/2] [PATCH] drm/xe/guc: treat sentinel timestamps as no elapsed time to avoid false TDR Gajendra Uttamchand
2026-06-12  5:20   ` Matthew Brost
2026-06-12  3:33 ` ✗ CI.checkpatch: warning for drm/xe: make CTX_TIMESTAMP TOCTOU-safe and handle sentinels Patchwork
2026-06-12  3:35 ` ✓ CI.KUnit: success " Patchwork
2026-06-12  4:13 ` ✓ Xe.CI.BAT: " Patchwork
2026-06-12 19:31 ` ✓ Xe.CI.FULL: " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aixDAgen0XZCImj0@gsse-cloud1.jf.intel.com \
    --to=matthew.brost@intel.com \
    --cc=Uttamchand@soc-5cg1426vcc.clients.intel.com \
    --cc=gajendra.uttamchand@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=umesh.nerlige.ramappa@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.