* [PATCH v2 0/9] Support run ticks for multi-queue use case
@ 2026-05-02 0:53 Umesh Nerlige Ramappa
2026-05-02 0:53 ` [PATCH v2 1/9] drm/xe/lrc: Use 64 bit ctx timestamp in the LRC snapshot Umesh Nerlige Ramappa
` (8 more replies)
0 siblings, 9 replies; 27+ messages in thread
From: Umesh Nerlige Ramappa @ 2026-05-02 0:53 UTC (permalink / raw)
To: intel-xe, niranjana.vishwanathapura; +Cc: matthew.brost, stuart.summers
In single-queue use cases, CTX_TIMESTAMP can be used to track context
run ticks. In multi-queue scenarios, CTX_TIMESTAMP represents the
combined run ticks of all queues in the group. To determine individual
queue run ticks, we need to use QUEUE_TIMESTAMP.
This series adds support for reading out QUEUE_TIMESTAMP in multi-queue
use cases.
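The relationship between the two counters can be modeled in a few lines of plain C (a toy model for illustration only; the struct and function names are invented and are not driver code):

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: in a multi-queue group the context timestamp accumulates
 * run ticks for every queue, while each queue timestamp counts only
 * its own ticks. */
struct toy_group {
	uint64_t ctx_ticks;		/* CTX_TIMESTAMP: whole group */
	uint64_t queue_ticks[4];	/* QUEUE_TIMESTAMP: per queue */
};

static void toy_run(struct toy_group *g, int queue, uint64_t ticks)
{
	g->ctx_ticks += ticks;		/* shared counter advances */
	g->queue_ticks[queue] += ticks;	/* only this queue's counter */
}
```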
v2: Review comments incorporated
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Matthew Brost (1):
drm/xe: Add timestamp_ms to LRC snapshot
Umesh Nerlige Ramappa (8):
drm/xe/lrc: Use 64 bit ctx timestamp in the LRC snapshot
drm/xe/multi_queue: Store primary LRC and position info in LRC
drm/xe/multi_queue: Add helpers to access CS QUEUE TIMESTAMP from lrc
drm/xe/lrc: Refactor out engine id to hwe conversion
drm/xe/multi_queue: Capture queue run times for active queues
drm/xe/multi_queue: Add trace event for the multi queue timestamp
drm/xe/multi_queue: Use QUEUE_TIMESTAMP as job timestamp for
multi-queue
drm/xe/multi_queue: Whitelist QUEUE_TIMESTAMP register
drivers/gpu/drm/xe/regs/xe_engine_regs.h | 4 +
drivers/gpu/drm/xe/regs/xe_lrc_layout.h | 3 +
drivers/gpu/drm/xe/xe_exec_queue.c | 23 ++-
drivers/gpu/drm/xe/xe_lrc.c | 192 +++++++++++++++++++----
drivers/gpu/drm/xe/xe_lrc.h | 10 +-
drivers/gpu/drm/xe/xe_lrc_types.h | 11 ++
drivers/gpu/drm/xe/xe_reg_whitelist.c | 14 ++
drivers/gpu/drm/xe/xe_ring_ops.c | 8 +-
drivers/gpu/drm/xe/xe_trace_lrc.h | 27 ++++
9 files changed, 259 insertions(+), 33 deletions(-)
--
2.43.0
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH v2 1/9] drm/xe/lrc: Use 64 bit ctx timestamp in the LRC snapshot
2026-05-02 0:53 [PATCH v2 0/9] Support run ticks for multi-queue use case Umesh Nerlige Ramappa
@ 2026-05-02 0:53 ` Umesh Nerlige Ramappa
2026-05-04 23:51 ` Niranjana Vishwanathapura
2026-05-02 0:53 ` [PATCH v2 2/9] drm/xe: Add timestamp_ms to " Umesh Nerlige Ramappa
` (7 subsequent siblings)
8 siblings, 1 reply; 27+ messages in thread
From: Umesh Nerlige Ramappa @ 2026-05-02 0:53 UTC (permalink / raw)
To: intel-xe, niranjana.vishwanathapura; +Cc: matthew.brost, stuart.summers
Use the 64-bit context timestamp value in the LRC snapshot when the
hardware supports it.
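On hardware with a 64-bit timestamp, the driver reads the lower and upper dwords separately and combines them. A minimal standalone sketch of that combine, mirroring the `(u64)udw << 32 | ldw` expression used in xe_lrc.c:

```c
#include <assert.h>
#include <stdint.h>

/* Combine two 32-bit dword reads into the full 64-bit timestamp. */
static uint64_t combine_ts(uint32_t ldw, uint32_t udw)
{
	return (uint64_t)udw << 32 | ldw;
}
```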
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
drivers/gpu/drm/xe/xe_lrc.c | 4 ++--
drivers/gpu/drm/xe/xe_lrc.h | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
index 9d12a0d2f0b5..98dc4d0eb61b 100644
--- a/drivers/gpu/drm/xe/xe_lrc.c
+++ b/drivers/gpu/drm/xe/xe_lrc.c
@@ -2475,7 +2475,7 @@ struct xe_lrc_snapshot *xe_lrc_snapshot_capture(struct xe_lrc *lrc)
snapshot->replay_offset = 0;
snapshot->replay_size = lrc->replay_size;
snapshot->lrc_snapshot = NULL;
- snapshot->ctx_timestamp = lower_32_bits(xe_lrc_ctx_timestamp(lrc));
+ snapshot->ctx_timestamp = xe_lrc_ctx_timestamp(lrc);
snapshot->ctx_job_timestamp = xe_lrc_ctx_job_timestamp(lrc);
return snapshot;
}
@@ -2528,7 +2528,7 @@ void xe_lrc_snapshot_print(struct xe_lrc_snapshot *snapshot, struct drm_printer
drm_printf(p, "\tRing start: (memory) 0x%08x\n", snapshot->start);
drm_printf(p, "\tStart seqno: (memory) %d\n", snapshot->start_seqno);
drm_printf(p, "\tSeqno: (memory) %d\n", snapshot->seqno);
- drm_printf(p, "\tTimestamp: 0x%08x\n", snapshot->ctx_timestamp);
+ drm_printf(p, "\tTimestamp: 0x%016llx\n", snapshot->ctx_timestamp);
drm_printf(p, "\tJob Timestamp: 0x%08x\n", snapshot->ctx_job_timestamp);
if (!snapshot->lrc_snapshot)
diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
index e7c975f9e2d9..62beaffba0af 100644
--- a/drivers/gpu/drm/xe/xe_lrc.h
+++ b/drivers/gpu/drm/xe/xe_lrc.h
@@ -37,7 +37,7 @@ struct xe_lrc_snapshot {
} tail;
u32 start_seqno;
u32 seqno;
- u32 ctx_timestamp;
+ u64 ctx_timestamp;
u32 ctx_job_timestamp;
};
--
2.43.0
* [PATCH v2 2/9] drm/xe: Add timestamp_ms to LRC snapshot
2026-05-02 0:53 [PATCH v2 0/9] Support run ticks for multi-queue use case Umesh Nerlige Ramappa
2026-05-02 0:53 ` [PATCH v2 1/9] drm/xe/lrc: Use 64 bit ctx timestamp in the LRC snapshot Umesh Nerlige Ramappa
@ 2026-05-02 0:53 ` Umesh Nerlige Ramappa
2026-05-04 23:59 ` Niranjana Vishwanathapura
2026-05-02 0:53 ` [PATCH v2 3/9] drm/xe/multi_queue: Store primary LRC and position info in LRC Umesh Nerlige Ramappa
` (6 subsequent siblings)
8 siblings, 1 reply; 27+ messages in thread
From: Umesh Nerlige Ramappa @ 2026-05-02 0:53 UTC (permalink / raw)
To: intel-xe, niranjana.vishwanathapura; +Cc: matthew.brost, stuart.summers
From: Matthew Brost <matthew.brost@intel.com>
Add a timestamp in milliseconds to the LRC snapshot to make it easier to
reason about how long the LRC has been running and the average duration
of each job.
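The millisecond value comes from converting GT clock ticks via xe_gt_clock_interval_to_ms(). A standalone sketch of that conversion, assuming a simple ticks * 1000 / frequency formula and a 19.2 MHz reference clock purely for illustration (the real helper also guards against intermediate overflow):

```c
#include <assert.h>
#include <stdint.h>

/* Convert GT clock ticks to milliseconds (truncating division). */
static uint64_t ticks_to_ms(uint64_t ticks, uint64_t freq_hz)
{
	return ticks * 1000 / freq_hz;
}
```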
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
---
drivers/gpu/drm/xe/xe_lrc.c | 4 ++++
drivers/gpu/drm/xe/xe_lrc.h | 1 +
2 files changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
index 98dc4d0eb61b..d85c712d106b 100644
--- a/drivers/gpu/drm/xe/xe_lrc.c
+++ b/drivers/gpu/drm/xe/xe_lrc.c
@@ -23,6 +23,7 @@
#include "xe_drm_client.h"
#include "xe_exec_queue_types.h"
#include "xe_gt.h"
+#include "xe_gt_clock.h"
#include "xe_gt_printk.h"
#include "xe_hw_fence.h"
#include "xe_map.h"
@@ -2476,6 +2477,8 @@ struct xe_lrc_snapshot *xe_lrc_snapshot_capture(struct xe_lrc *lrc)
snapshot->replay_size = lrc->replay_size;
snapshot->lrc_snapshot = NULL;
snapshot->ctx_timestamp = xe_lrc_ctx_timestamp(lrc);
+ snapshot->ctx_timestamp_ms =
+ xe_gt_clock_interval_to_ms(lrc->gt, xe_lrc_ctx_timestamp(lrc));
snapshot->ctx_job_timestamp = xe_lrc_ctx_job_timestamp(lrc);
return snapshot;
}
@@ -2529,6 +2532,7 @@ void xe_lrc_snapshot_print(struct xe_lrc_snapshot *snapshot, struct drm_printer
drm_printf(p, "\tStart seqno: (memory) %d\n", snapshot->start_seqno);
drm_printf(p, "\tSeqno: (memory) %d\n", snapshot->seqno);
drm_printf(p, "\tTimestamp: 0x%016llx\n", snapshot->ctx_timestamp);
+ drm_printf(p, "\tTimestamp ms: %llu\n", snapshot->ctx_timestamp_ms);
drm_printf(p, "\tJob Timestamp: 0x%08x\n", snapshot->ctx_job_timestamp);
if (!snapshot->lrc_snapshot)
diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
index 62beaffba0af..97aef0327fc8 100644
--- a/drivers/gpu/drm/xe/xe_lrc.h
+++ b/drivers/gpu/drm/xe/xe_lrc.h
@@ -39,6 +39,7 @@ struct xe_lrc_snapshot {
u32 seqno;
u64 ctx_timestamp;
u32 ctx_job_timestamp;
+ u64 ctx_timestamp_ms;
};
#define LRC_PPHWSP_FLUSH_INVAL_SCRATCH_ADDR (0x34 * 4)
--
2.43.0
* [PATCH v2 3/9] drm/xe/multi_queue: Store primary LRC and position info in LRC
2026-05-02 0:53 [PATCH v2 0/9] Support run ticks for multi-queue use case Umesh Nerlige Ramappa
2026-05-02 0:53 ` [PATCH v2 1/9] drm/xe/lrc: Use 64 bit ctx timestamp in the LRC snapshot Umesh Nerlige Ramappa
2026-05-02 0:53 ` [PATCH v2 2/9] drm/xe: Add timestamp_ms to " Umesh Nerlige Ramappa
@ 2026-05-02 0:53 ` Umesh Nerlige Ramappa
2026-05-05 3:46 ` Niranjana Vishwanathapura
2026-05-02 0:53 ` [PATCH v2 4/9] drm/xe/multi_queue: Add helpers to access CS QUEUE TIMESTAMP from lrc Umesh Nerlige Ramappa
` (5 subsequent siblings)
8 siblings, 1 reply; 27+ messages in thread
From: Umesh Nerlige Ramappa @ 2026-05-02 0:53 UTC (permalink / raw)
To: intel-xe, niranjana.vishwanathapura; +Cc: matthew.brost, stuart.summers
Given an LRC belonging to a secondary queue, checking whether its
context group is active requires looking at the primary queue's LRC. In
addition, we want to compare the secondary queue's position against the
CSMQDEBUG register to check whether the queue itself is active.
To do so, store the primary LRC pointer and position information in
each LRC, and take a reference to the primary LRC from each LRC in the
queue group.
A note on references involved:
- In general the Queue takes a ref on its LRC.
- In addition, for multi-queue,
a. Primary Queue takes a ref for each Secondary LRC.
b. Each Secondary Queue takes a ref to the Primary Queue
In the current patch, each LRC in the queue group stores a pointer to
the primary LRC. There is a small window of time in the primary queue
free path where the primary LRC may be freed before the secondary LRC:
__xe_exec_queue_fini(q); // frees|puts primary q LRCs
...
window where secondary Q LRC is pointing to invalid primary LRC
...
__xe_exec_queue_free(q); // frees|puts secondary q LRCs in multi-Q case
In this window the pointer held by the secondary LRC is invalid. While
nothing may actually be accessing the secondary LRC's pointer, to be
safe, this patch takes a reference on the primary LRC from each
secondary LRC.
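The ordering above can be illustrated with a toy reference counter (invented names, not the kernel's kref API): as long as the secondary LRC holds its own reference, the primary LRC outlives the free-path window described above.

```c
#include <assert.h>

/* Toy refcount: the object is freed only when the last ref drops. */
struct toy_lrc {
	int refs;
	int freed;
};

static void toy_get(struct toy_lrc *l)
{
	l->refs++;
}

static void toy_put(struct toy_lrc *l)
{
	if (--l->refs == 0)
		l->freed = 1;
}
```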
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
v2:
- Store primary LRC instead of primary queue (Niranjana)
- Drop the valid flag and check if primary_lrc is NULL (Niranjana)
- Document/Revisit references (Matt/Umesh)
---
drivers/gpu/drm/xe/xe_exec_queue.c | 23 ++++++++++++++++++++---
drivers/gpu/drm/xe/xe_lrc.h | 5 +++++
drivers/gpu/drm/xe/xe_lrc_types.h | 8 ++++++++
3 files changed, 33 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
index b287d0e0e60a..e34601d28520 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -129,8 +129,14 @@ static void xe_exec_queue_group_cleanup(struct xe_exec_queue *q)
return;
/* Primary queue cleanup */
- xa_for_each(&group->xa, idx, lrc)
+ xa_for_each(&group->xa, idx, lrc) {
+ /* drop secondary lrc ref to primary lrc */
+ xe_lrc_put(lrc->multi_queue.primary_lrc);
+ /* drop primary queue ref to secondary lrc */
xe_lrc_put(lrc);
+ }
+ /* drop primary lrc ref to itself */
+ xe_lrc_put(q->lrc[0]);
xa_destroy(&group->xa);
mutex_destroy(&group->list_lock);
@@ -275,8 +281,15 @@ static void xe_exec_queue_set_lrc(struct xe_exec_queue *q, struct xe_lrc *lrc, u
{
xe_assert(gt_to_xe(q->gt), idx < q->width);
- scoped_guard(spinlock, &q->lrc_lookup_lock)
+ scoped_guard(spinlock, &q->lrc_lookup_lock) {
q->lrc[idx] = lrc;
+ if (xe_exec_queue_is_multi_queue(q)) {
+ struct xe_lrc *primary_lrc = q->multi_queue.group->primary->lrc[0];
+
+ lrc->multi_queue.pos = q->multi_queue.pos;
+ lrc->multi_queue.primary_lrc = xe_lrc_get(primary_lrc);
+ }
+ }
}
/**
@@ -388,8 +401,12 @@ static int __xe_exec_queue_init(struct xe_exec_queue *q, u32 exec_queue_flags)
xe_exec_queue_set_lrc(q, lrc, i);
- if (__lrc)
+ if (__lrc) {
+ if (xe_exec_queue_is_multi_queue(q))
+ xe_lrc_put(__lrc->multi_queue.primary_lrc);
+
xe_lrc_put(__lrc);
+ }
__lrc = lrc;
} while (marker != xe_vf_migration_fixups_complete_count(q->gt));
diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
index 97aef0327fc8..3d0bf4a7bfa0 100644
--- a/drivers/gpu/drm/xe/xe_lrc.h
+++ b/drivers/gpu/drm/xe/xe_lrc.h
@@ -91,6 +91,11 @@ static inline size_t xe_lrc_ring_size(void)
return SZ_16K;
}
+static inline bool xe_lrc_is_multi_queue(struct xe_lrc *lrc)
+{
+ return lrc->multi_queue.primary_lrc;
+}
+
size_t xe_gt_lrc_hang_replay_size(struct xe_gt *gt, enum xe_engine_class class);
size_t xe_gt_lrc_size(struct xe_gt *gt, enum xe_engine_class class);
u32 xe_lrc_pphwsp_offset(struct xe_lrc *lrc);
diff --git a/drivers/gpu/drm/xe/xe_lrc_types.h b/drivers/gpu/drm/xe/xe_lrc_types.h
index 5a718f759ed6..0a5c13ec2ad7 100644
--- a/drivers/gpu/drm/xe/xe_lrc_types.h
+++ b/drivers/gpu/drm/xe/xe_lrc_types.h
@@ -63,6 +63,14 @@ struct xe_lrc {
/** @ctx_timestamp: readout value of CTX_TIMESTAMP on last update */
u64 ctx_timestamp;
+
+ /** @multi_queue: Multi queue LRC related information */
+ struct {
+ /** @multi_queue.primary_lrc: Primary lrc of this multi-queue group*/
+ struct xe_lrc *primary_lrc;
+ /** @multi_queue.pos: Position of LRC within the multi-queue group */
+ u8 pos;
+ } multi_queue;
};
struct xe_lrc_snapshot;
--
2.43.0
* [PATCH v2 4/9] drm/xe/multi_queue: Add helpers to access CS QUEUE TIMESTAMP from lrc
2026-05-02 0:53 [PATCH v2 0/9] Support run ticks for multi-queue use case Umesh Nerlige Ramappa
` (2 preceding siblings ...)
2026-05-02 0:53 ` [PATCH v2 3/9] drm/xe/multi_queue: Store primary LRC and position info in LRC Umesh Nerlige Ramappa
@ 2026-05-02 0:53 ` Umesh Nerlige Ramappa
2026-05-05 4:00 ` Niranjana Vishwanathapura
2026-05-02 0:53 ` [PATCH v2 5/9] drm/xe/lrc: Refactor out engine id to hwe conversion Umesh Nerlige Ramappa
` (4 subsequent siblings)
8 siblings, 1 reply; 27+ messages in thread
From: Umesh Nerlige Ramappa @ 2026-05-02 0:53 UTC (permalink / raw)
To: intel-xe, niranjana.vishwanathapura; +Cc: matthew.brost, stuart.summers
In secondary queue LRCs, the QUEUE_TIMESTAMP register is saved and
restored, allowing us to view individual queue run times. Add helpers
to read this value from the LRC.
BSpec: 73988
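The new offset helpers follow the existing pattern of translating a register-state dword index into a byte offset within the LRC BO. A standalone sketch (the dword indices match the patch's layout defines; `regs_offset` stands in for the value returned by __xe_lrc_regs_offset()):

```c
#include <assert.h>
#include <stdint.h>

/* Dword indices into the LRC register state, from xe_lrc_layout.h. */
#define CTX_QUEUE_TIMESTAMP	(0xd0 + 1)
#define CTX_QUEUE_TIMESTAMP_UDW	(0xd2 + 1)

/* Translate a register-state dword index into a byte offset. */
static uint32_t dword_offset(uint32_t regs_offset, uint32_t dw_index)
{
	return regs_offset + dw_index * sizeof(uint32_t);
}
```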
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
v2: (Matt)
- Add BSpec reference
- Make queue_timestamp snapshot 64 bit
- Add a snapshot of queue timestamp in ms
---
drivers/gpu/drm/xe/regs/xe_lrc_layout.h | 3 ++
drivers/gpu/drm/xe/xe_lrc.c | 47 +++++++++++++++++++++++++
drivers/gpu/drm/xe/xe_lrc.h | 2 ++
drivers/gpu/drm/xe/xe_lrc_types.h | 3 ++
4 files changed, 55 insertions(+)
diff --git a/drivers/gpu/drm/xe/regs/xe_lrc_layout.h b/drivers/gpu/drm/xe/regs/xe_lrc_layout.h
index b5eff383902c..4ab86fc369fd 100644
--- a/drivers/gpu/drm/xe/regs/xe_lrc_layout.h
+++ b/drivers/gpu/drm/xe/regs/xe_lrc_layout.h
@@ -34,6 +34,9 @@
#define CTX_CS_INT_VEC_REG 0x5a
#define CTX_CS_INT_VEC_DATA (CTX_CS_INT_VEC_REG + 1)
+#define CTX_QUEUE_TIMESTAMP (0xd0 + 1)
+#define CTX_QUEUE_TIMESTAMP_UDW (0xd2 + 1)
+
#define INDIRECT_CTX_RING_HEAD (0x02 + 1)
#define INDIRECT_CTX_RING_TAIL (0x04 + 1)
#define INDIRECT_CTX_RING_START (0x06 + 1)
diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
index d85c712d106b..2ee52efb9219 100644
--- a/drivers/gpu/drm/xe/xe_lrc.c
+++ b/drivers/gpu/drm/xe/xe_lrc.c
@@ -789,6 +789,16 @@ static u32 __xe_lrc_ctx_timestamp_udw_offset(struct xe_lrc *lrc)
return __xe_lrc_regs_offset(lrc) + CTX_TIMESTAMP_UDW * sizeof(u32);
}
+static u32 __xe_lrc_queue_timestamp_offset(struct xe_lrc *lrc)
+{
+ return __xe_lrc_regs_offset(lrc) + CTX_QUEUE_TIMESTAMP * sizeof(u32);
+}
+
+static u32 __xe_lrc_queue_timestamp_udw_offset(struct xe_lrc *lrc)
+{
+ return __xe_lrc_regs_offset(lrc) + CTX_QUEUE_TIMESTAMP_UDW * sizeof(u32);
+}
+
static inline u32 __xe_lrc_indirect_ring_offset(struct xe_lrc *lrc)
{
u32 offset = xe_bo_size(lrc->bo) - LRC_WA_BB_SIZE -
@@ -838,6 +848,8 @@ DECL_MAP_ADDR_HELPERS(ctx_timestamp_udw, lrc->bo)
DECL_MAP_ADDR_HELPERS(parallel, lrc->bo)
DECL_MAP_ADDR_HELPERS(indirect_ring, lrc->bo)
DECL_MAP_ADDR_HELPERS(engine_id, lrc->bo)
+DECL_MAP_ADDR_HELPERS(queue_timestamp, lrc->bo)
+DECL_MAP_ADDR_HELPERS(queue_timestamp_udw, lrc->bo)
#undef DECL_MAP_ADDR_HELPERS
@@ -886,6 +898,30 @@ static u64 xe_lrc_ctx_timestamp(struct xe_lrc *lrc)
return (u64)udw << 32 | ldw;
}
+/**
+ * xe_lrc_queue_timestamp() - Read queue timestamp value
+ * @lrc: Pointer to the lrc.
+ *
+ * Returns: queue timestamp value
+ */
+static u64 xe_lrc_queue_timestamp(struct xe_lrc *lrc)
+{
+ struct xe_device *xe = lrc_to_xe(lrc);
+ struct iosys_map map;
+ u32 ldw, udw = 0;
+
+ if (!xe_lrc_is_multi_queue(lrc))
+ return 0;
+
+ map = __xe_lrc_queue_timestamp_map(lrc);
+ ldw = xe_map_read32(xe, &map);
+
+ map = __xe_lrc_queue_timestamp_udw_map(lrc);
+ udw = xe_map_read32(xe, &map);
+
+ return (u64)udw << 32 | ldw;
+}
+
/**
* xe_lrc_ctx_job_timestamp_ggtt_addr() - Get ctx job timestamp GGTT address
* @lrc: Pointer to the lrc.
@@ -1551,6 +1587,12 @@ static int xe_lrc_ctx_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe, struct
if (lrc_to_xe(lrc)->info.has_64bit_timestamp)
xe_lrc_write_ctx_reg(lrc, CTX_TIMESTAMP_UDW, 0);
+ if (xe_lrc_is_multi_queue(lrc)) {
+ lrc->queue_timestamp = 0;
+ xe_lrc_write_ctx_reg(lrc, CTX_QUEUE_TIMESTAMP, 0);
+ xe_lrc_write_ctx_reg(lrc, CTX_QUEUE_TIMESTAMP_UDW, 0);
+ }
+
if (xe->info.has_asid && vm)
xe_lrc_write_ctx_reg(lrc, CTX_ASID, vm->usm.asid);
@@ -2479,6 +2521,9 @@ struct xe_lrc_snapshot *xe_lrc_snapshot_capture(struct xe_lrc *lrc)
snapshot->ctx_timestamp = xe_lrc_ctx_timestamp(lrc);
snapshot->ctx_timestamp_ms =
xe_gt_clock_interval_to_ms(lrc->gt, xe_lrc_ctx_timestamp(lrc));
+ snapshot->queue_timestamp = xe_lrc_queue_timestamp(lrc);
+ snapshot->queue_timestamp_ms =
+ xe_gt_clock_interval_to_ms(lrc->gt, xe_lrc_queue_timestamp(lrc));
snapshot->ctx_job_timestamp = xe_lrc_ctx_job_timestamp(lrc);
return snapshot;
}
@@ -2533,6 +2578,8 @@ void xe_lrc_snapshot_print(struct xe_lrc_snapshot *snapshot, struct drm_printer
drm_printf(p, "\tSeqno: (memory) %d\n", snapshot->seqno);
drm_printf(p, "\tTimestamp: 0x%016llx\n", snapshot->ctx_timestamp);
drm_printf(p, "\tTimestamp ms: %llu\n", snapshot->ctx_timestamp_ms);
+ drm_printf(p, "\tQueue Timestamp: 0x%016llx\n", snapshot->queue_timestamp);
+ drm_printf(p, "\tQueue Timestamp ms: %llu\n", snapshot->queue_timestamp_ms);
drm_printf(p, "\tJob Timestamp: 0x%08x\n", snapshot->ctx_job_timestamp);
if (!snapshot->lrc_snapshot)
diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
index 3d0bf4a7bfa0..12d08808ac75 100644
--- a/drivers/gpu/drm/xe/xe_lrc.h
+++ b/drivers/gpu/drm/xe/xe_lrc.h
@@ -38,8 +38,10 @@ struct xe_lrc_snapshot {
u32 start_seqno;
u32 seqno;
u64 ctx_timestamp;
+ u64 queue_timestamp;
u32 ctx_job_timestamp;
u64 ctx_timestamp_ms;
+ u64 queue_timestamp_ms;
};
#define LRC_PPHWSP_FLUSH_INVAL_SCRATCH_ADDR (0x34 * 4)
diff --git a/drivers/gpu/drm/xe/xe_lrc_types.h b/drivers/gpu/drm/xe/xe_lrc_types.h
index 0a5c13ec2ad7..53ef48feebfc 100644
--- a/drivers/gpu/drm/xe/xe_lrc_types.h
+++ b/drivers/gpu/drm/xe/xe_lrc_types.h
@@ -64,6 +64,9 @@ struct xe_lrc {
/** @ctx_timestamp: readout value of CTX_TIMESTAMP on last update */
u64 ctx_timestamp;
+ /** @queue_timestamp: value of QUEUE_TIMESTAMP on last update */
+ u64 queue_timestamp;
+
/** @multi_queue: Multi queue LRC related information */
struct {
/** @multi_queue.primary_lrc: Primary lrc of this multi-queue group*/
--
2.43.0
* [PATCH v2 5/9] drm/xe/lrc: Refactor out engine id to hwe conversion
2026-05-02 0:53 [PATCH v2 0/9] Support run ticks for multi-queue use case Umesh Nerlige Ramappa
` (3 preceding siblings ...)
2026-05-02 0:53 ` [PATCH v2 4/9] drm/xe/multi_queue: Add helpers to access CS QUEUE TIMESTAMP from lrc Umesh Nerlige Ramappa
@ 2026-05-02 0:53 ` Umesh Nerlige Ramappa
2026-05-05 4:16 ` Niranjana Vishwanathapura
2026-05-02 0:53 ` [PATCH v2 6/9] drm/xe/multi_queue: Capture queue run times for active queues Umesh Nerlige Ramappa
` (3 subsequent siblings)
8 siblings, 1 reply; 27+ messages in thread
From: Umesh Nerlige Ramappa @ 2026-05-02 0:53 UTC (permalink / raw)
To: intel-xe, niranjana.vishwanathapura; +Cc: matthew.brost, stuart.summers
We need to define more helpers that read engine-id-specific registers,
so move that logic out of get_ctx_timestamp().
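The extracted helper decodes an engine class and instance out of the packed engine id before looking up the hw engine. A standalone sketch of that field decode, with invented field positions for illustration only (the real layout comes from the ENGINE_CLASS_ID/ENGINE_INSTANCE_ID masks):

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field positions, for illustration only. */
#define TOY_CLASS_SHIFT		16
#define TOY_CLASS_MASK		(0xffu << TOY_CLASS_SHIFT)
#define TOY_INSTANCE_MASK	0xffu

static uint16_t toy_engine_class(uint32_t engine_id)
{
	return (engine_id & TOY_CLASS_MASK) >> TOY_CLASS_SHIFT;
}

static uint16_t toy_engine_instance(uint32_t engine_id)
{
	return engine_id & TOY_INSTANCE_MASK;
}
```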
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
drivers/gpu/drm/xe/xe_lrc.c | 20 +++++++++++++++-----
1 file changed, 15 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
index 2ee52efb9219..92419e5058fd 100644
--- a/drivers/gpu/drm/xe/xe_lrc.c
+++ b/drivers/gpu/drm/xe/xe_lrc.c
@@ -2620,17 +2620,27 @@ void xe_lrc_snapshot_free(struct xe_lrc_snapshot *snapshot)
kfree(snapshot);
}
-static int get_ctx_timestamp(struct xe_lrc *lrc, u32 engine_id, u64 *reg_ctx_ts)
+static struct xe_hw_engine *engine_id_to_hwe(struct xe_gt *gt, u32 engine_id)
{
u16 class = REG_FIELD_GET(ENGINE_CLASS_ID, engine_id);
u16 instance = REG_FIELD_GET(ENGINE_INSTANCE_ID, engine_id);
+ struct xe_hw_engine *hwe = xe_gt_hw_engine(gt, class, instance, false);
+
+ if (xe_gt_WARN_ONCE(gt, !hwe || xe_hw_engine_is_reserved(hwe),
+ "Unexpected engine class:instance %d:%d for utilization\n",
+ class, instance))
+ return NULL;
+
+ return hwe;
+}
+
+static int get_ctx_timestamp(struct xe_lrc *lrc, u32 engine_id, u64 *reg_ctx_ts)
+{
struct xe_hw_engine *hwe;
u64 val;
- hwe = xe_gt_hw_engine(lrc->gt, class, instance, false);
- if (xe_gt_WARN_ONCE(lrc->gt, !hwe || xe_hw_engine_is_reserved(hwe),
- "Unexpected engine class:instance %d:%d for context utilization\n",
- class, instance))
+ hwe = engine_id_to_hwe(lrc->gt, engine_id);
+ if (!hwe)
return -1;
if (lrc_to_xe(lrc)->info.has_64bit_timestamp)
--
2.43.0
* [PATCH v2 6/9] drm/xe/multi_queue: Capture queue run times for active queues
2026-05-02 0:53 [PATCH v2 0/9] Support run ticks for multi-queue use case Umesh Nerlige Ramappa
` (4 preceding siblings ...)
2026-05-02 0:53 ` [PATCH v2 5/9] drm/xe/lrc: Refactor out engine id to hwe conversion Umesh Nerlige Ramappa
@ 2026-05-02 0:53 ` Umesh Nerlige Ramappa
2026-05-05 4:12 ` Niranjana Vishwanathapura
2026-05-02 0:53 ` [PATCH v2 7/9] drm/xe/multi_queue: Add trace event for the multi queue timestamp Umesh Nerlige Ramappa
` (2 subsequent siblings)
8 siblings, 1 reply; 27+ messages in thread
From: Umesh Nerlige Ramappa @ 2026-05-02 0:53 UTC (permalink / raw)
To: intel-xe, niranjana.vishwanathapura; +Cc: matthew.brost, stuart.summers
If a queue is currently active on the CS, query the QUEUE_TIMESTAMP
register to get an up-to-date runtime value. To do so, ensure that the
primary queue is active and then check whether the secondary queue is
executing on the CS.
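The read-again logic in the patch can be shown on plain variables (a toy model; in the driver the "active queue" check reads CSMQDEBUG and the live value reads RING_QUEUE_TIMESTAMP):

```c
#include <assert.h>
#include <stdint.h>

struct toy_state {
	int active_queue;	/* mimics the CSMQDEBUG queue id */
	uint64_t live_ts;	/* mimics RING_QUEUE_TIMESTAMP */
	uint64_t saved_ts;	/* mimics the value saved in the LRC */
};

/* Sample the live register only if the queue is active both before and
 * after the read; otherwise fall back to the saved LRC value. */
static uint64_t toy_queue_timestamp(const struct toy_state *s, int pos)
{
	uint64_t ts;

	if (s->active_queue != pos)
		return s->saved_ts;

	ts = s->live_ts;

	/* re-check: the queue may have switched out during the read */
	if (s->active_queue != pos)
		return s->saved_ts;

	return ts;
}
```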
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
v2:
- Move trace to a separate patch (Stuart)
- Refactor multi queue timestamp logic (Matt/Niranjana)
---
drivers/gpu/drm/xe/regs/xe_engine_regs.h | 4 +
drivers/gpu/drm/xe/xe_lrc.c | 115 +++++++++++++++++++----
2 files changed, 99 insertions(+), 20 deletions(-)
diff --git a/drivers/gpu/drm/xe/regs/xe_engine_regs.h b/drivers/gpu/drm/xe/regs/xe_engine_regs.h
index 1b4a7e9a703d..af6af6f3f5e8 100644
--- a/drivers/gpu/drm/xe/regs/xe_engine_regs.h
+++ b/drivers/gpu/drm/xe/regs/xe_engine_regs.h
@@ -170,6 +170,10 @@
#define GFX_MSIX_INTERRUPT_ENABLE REG_BIT(13)
#define RING_CSMQDEBUG(base) XE_REG((base) + 0x2b0)
+#define CURRENT_ACTIVE_QUEUE_ID_MASK REG_GENMASK(7, 0)
+
+#define RING_QUEUE_TIMESTAMP(base) XE_REG((base) + 0x4c0)
+#define RING_QUEUE_TIMESTAMP_UDW(base) XE_REG((base) + 0x4c0 + 4)
#define RING_TIMESTAMP(base) XE_REG((base) + 0x358)
diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
index 92419e5058fd..023202be5d52 100644
--- a/drivers/gpu/drm/xe/xe_lrc.c
+++ b/drivers/gpu/drm/xe/xe_lrc.c
@@ -21,6 +21,7 @@
#include "xe_configfs.h"
#include "xe_device.h"
#include "xe_drm_client.h"
+#include "xe_exec_queue.h"
#include "xe_exec_queue_types.h"
#include "xe_gt.h"
#include "xe_gt_clock.h"
@@ -2655,17 +2656,65 @@ static int get_ctx_timestamp(struct xe_lrc *lrc, u32 engine_id, u64 *reg_ctx_ts)
return 0;
}
-/**
- * xe_lrc_timestamp() - Current ctx timestamp
- * @lrc: Pointer to the lrc.
- *
- * Return latest ctx timestamp. With support for active contexts, the
- * calculation may be slightly racy, so follow a read-again logic to ensure that
- * the context is still active before returning the right timestamp.
- *
- * Returns: New ctx timestamp value
- */
-u64 xe_lrc_timestamp(struct xe_lrc *lrc)
+static u64 get_queue_timestamp(struct xe_hw_engine *hwe)
+{
+ return xe_mmio_read64_2x32(&hwe->gt->mmio,
+ RING_QUEUE_TIMESTAMP(hwe->mmio_base));
+}
+
+static u32 get_queue_id(struct xe_hw_engine *hwe)
+{
+ u32 val = xe_mmio_read32(&hwe->gt->mmio,
+ RING_CSMQDEBUG(hwe->mmio_base));
+
+ return REG_FIELD_GET(CURRENT_ACTIVE_QUEUE_ID_MASK, val);
+}
+
+static bool context_active(struct xe_lrc *lrc)
+{
+ return xe_lrc_ctx_timestamp(lrc) == CONTEXT_ACTIVE;
+}
+
+static u64 xe_lrc_multi_queue_timestamp(struct xe_lrc *lrc)
+{
+ struct xe_lrc *primary_lrc = lrc->multi_queue.primary_lrc;
+ struct xe_hw_engine *hwe;
+ u64 reg_queue_ts = lrc->queue_timestamp;
+
+ if (IS_SRIOV_VF(lrc_to_xe(lrc)))
+ return xe_lrc_queue_timestamp(lrc);
+
+ if (!primary_lrc || !context_active(primary_lrc))
+ return xe_lrc_queue_timestamp(lrc);
+
+ /* WA BB populates engine id in PPHWSP of primary context only */
+ hwe = engine_id_to_hwe(primary_lrc->gt, xe_lrc_engine_id(primary_lrc));
+ if (!hwe)
+ return xe_lrc_queue_timestamp(lrc);
+
+ if (get_queue_id(hwe) != lrc->multi_queue.pos)
+ return xe_lrc_queue_timestamp(lrc);
+
+ /* queue is active, so store the queue timestamp register */
+ reg_queue_ts = get_queue_timestamp(hwe);
+
+ /* double check queue and primary queue are both still active */
+ if (get_queue_id(hwe) != lrc->multi_queue.pos ||
+ !context_active(primary_lrc))
+ return xe_lrc_queue_timestamp(lrc);
+
+ return reg_queue_ts;
+}
+
+static u64 xe_lrc_update_multi_queue_timestamp(struct xe_lrc *lrc, u64 *old_ts)
+{
+ *old_ts = lrc->queue_timestamp;
+ lrc->queue_timestamp = xe_lrc_multi_queue_timestamp(lrc);
+
+ return lrc->queue_timestamp;
+}
+
+static u64 xe_lrc_single_queue_timestamp(struct xe_lrc *lrc)
{
u64 lrc_ts, reg_ts, new_ts = lrc->ctx_timestamp;
u32 engine_id;
@@ -2697,24 +2746,50 @@ u64 xe_lrc_timestamp(struct xe_lrc *lrc)
return new_ts;
}
+static u64 xe_lrc_update_ctx_timestamp(struct xe_lrc *lrc, u64 *old_ts)
+{
+ *old_ts = lrc->ctx_timestamp;
+ lrc->ctx_timestamp = xe_lrc_single_queue_timestamp(lrc);
+
+ trace_xe_lrc_update_timestamp(lrc, *old_ts);
+
+ return lrc->ctx_timestamp;
+}
+
/**
- * xe_lrc_update_timestamp() - Update ctx timestamp
+ * xe_lrc_timestamp() - Current lrc timestamp
+ * @lrc: Pointer to the lrc.
+ *
+ * Return latest lrc timestamp. With support for active contexts/queues, the
+ * calculation may be slightly racy, so follow a read-again logic to ensure that
+ * the context/queue is still active before returning the right timestamp.
+ *
+ * Returns: New lrc timestamp value
+ */
+u64 xe_lrc_timestamp(struct xe_lrc *lrc)
+{
+ if (xe_lrc_is_multi_queue(lrc))
+ return xe_lrc_multi_queue_timestamp(lrc);
+ else
+ return xe_lrc_single_queue_timestamp(lrc);
+}
+
+/**
+ * xe_lrc_update_timestamp() - Update lrc timestamp
* @lrc: Pointer to the lrc.
* @old_ts: Old timestamp value
*
- * Populate @old_ts current saved ctx timestamp, read new ctx timestamp and
+ * Populate @old_ts with current saved lrc timestamp, read new lrc timestamp and
* update saved value.
*
- * Returns: New ctx timestamp value
+ * Returns: New lrc timestamp value
*/
u64 xe_lrc_update_timestamp(struct xe_lrc *lrc, u64 *old_ts)
{
- *old_ts = lrc->ctx_timestamp;
- lrc->ctx_timestamp = xe_lrc_timestamp(lrc);
-
- trace_xe_lrc_update_timestamp(lrc, *old_ts);
-
- return lrc->ctx_timestamp;
+ if (xe_lrc_is_multi_queue(lrc))
+ return xe_lrc_update_multi_queue_timestamp(lrc, old_ts);
+ else
+ return xe_lrc_update_ctx_timestamp(lrc, old_ts);
}
/**
--
2.43.0
* [PATCH v2 7/9] drm/xe/multi_queue: Add trace event for the multi queue timestamp
2026-05-02 0:53 [PATCH v2 0/9] Support run ticks for multi-queue use case Umesh Nerlige Ramappa
` (5 preceding siblings ...)
2026-05-02 0:53 ` [PATCH v2 6/9] drm/xe/multi_queue: Capture queue run times for active queues Umesh Nerlige Ramappa
@ 2026-05-02 0:53 ` Umesh Nerlige Ramappa
2026-05-05 4:19 ` Niranjana Vishwanathapura
2026-05-02 0:53 ` [PATCH v2 8/9] drm/xe/multi_queue: Use QUEUE_TIMESTAMP as job timestamp for multi-queue Umesh Nerlige Ramappa
2026-05-02 0:53 ` [PATCH v2 9/9] drm/xe/multi_queue: Whitelist QUEUE_TIMESTAMP register Umesh Nerlige Ramappa
8 siblings, 1 reply; 27+ messages in thread
From: Umesh Nerlige Ramappa @ 2026-05-02 0:53 UTC (permalink / raw)
To: intel-xe, niranjana.vishwanathapura; +Cc: matthew.brost, stuart.summers
Add a trace event for multi-queue timestamp capture.
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
v2:
- Split traces from original patch (Stuart)
- Print primary lrc in the trace (Niranjana)
---
drivers/gpu/drm/xe/xe_lrc.c | 2 ++
drivers/gpu/drm/xe/xe_trace_lrc.h | 27 +++++++++++++++++++++++++++
2 files changed, 29 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
index 023202be5d52..6bd93803cb7f 100644
--- a/drivers/gpu/drm/xe/xe_lrc.c
+++ b/drivers/gpu/drm/xe/xe_lrc.c
@@ -2711,6 +2711,8 @@ static u64 xe_lrc_update_multi_queue_timestamp(struct xe_lrc *lrc, u64 *old_ts)
*old_ts = lrc->queue_timestamp;
lrc->queue_timestamp = xe_lrc_multi_queue_timestamp(lrc);
+ trace_xe_lrc_update_queue_timestamp(lrc, *old_ts);
+
return lrc->queue_timestamp;
}
diff --git a/drivers/gpu/drm/xe/xe_trace_lrc.h b/drivers/gpu/drm/xe/xe_trace_lrc.h
index d525cbee1e34..fdc77102fa77 100644
--- a/drivers/gpu/drm/xe/xe_trace_lrc.h
+++ b/drivers/gpu/drm/xe/xe_trace_lrc.h
@@ -12,6 +12,7 @@
#include <linux/tracepoint.h>
#include <linux/types.h>
+#include "xe_exec_queue_types.h"
#include "xe_gt_types.h"
#include "xe_lrc.h"
#include "xe_lrc_types.h"
@@ -42,6 +43,32 @@ TRACE_EVENT(xe_lrc_update_timestamp,
__get_str(device_id))
);
+TRACE_EVENT(xe_lrc_update_queue_timestamp,
+ TP_PROTO(struct xe_lrc *lrc, uint64_t old),
+ TP_ARGS(lrc, old),
+ TP_STRUCT__entry(
+ __field(struct xe_lrc *, lrc)
+ __field(u8, pos)
+ __field(u64, old)
+ __field(u64, new)
+ __string(name, lrc->fence_ctx.name)
+ __string(device_id, __dev_name_lrc(lrc))
+ ),
+
+ TP_fast_assign(
+ __entry->lrc = lrc->multi_queue.primary_lrc;
+ __entry->pos = lrc->multi_queue.pos;
+ __entry->old = old;
+ __entry->new = lrc->queue_timestamp;
+ __assign_str(name);
+ __assign_str(device_id);
+ ),
+ TP_printk("lrc=%p pos=%d lrc->name=%s old=%llu new=%llu device_id:%s",
+ __entry->lrc, __entry->pos, __get_str(name),
+ __entry->old, __entry->new,
+ __get_str(device_id))
+);
+
#endif
/* This part must be outside protection */
--
2.43.0
* [PATCH v2 8/9] drm/xe/multi_queue: Use QUEUE_TIMESTAMP as job timestamp for multi-queue
2026-05-02 0:53 [PATCH v2 0/9] Support run ticks for multi-queue use case Umesh Nerlige Ramappa
` (6 preceding siblings ...)
2026-05-02 0:53 ` [PATCH v2 7/9] drm/xe/multi_queue: Add trace event for the multi queue timestamp Umesh Nerlige Ramappa
@ 2026-05-02 0:53 ` Umesh Nerlige Ramappa
2026-05-05 4:20 ` Niranjana Vishwanathapura
2026-05-02 0:53 ` [PATCH v2 9/9] drm/xe/multi_queue: Whitelist QUEUE_TIMESTAMP register Umesh Nerlige Ramappa
8 siblings, 1 reply; 27+ messages in thread
From: Umesh Nerlige Ramappa @ 2026-05-02 0:53 UTC (permalink / raw)
To: intel-xe, niranjana.vishwanathapura; +Cc: matthew.brost, stuart.summers
Each queue in a multi queue group has a dedicated timestamp counter. Use
this QUEUE TIMESTAMP register to capture the start timestamp for the
job.
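The change boils down to picking a different source register when emitting the store command. A simplified sketch of that selection (the command encoding is a placeholder, and the 0x3a8 CTX timestamp offset is assumed for illustration; only the 0x4c0 queue timestamp offset comes from this series):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_QUEUE_TS_OFFSET	0x4c0	/* per this series */
#define TOY_CTX_TS_OFFSET	0x3a8	/* assumed, illustration only */
#define TOY_MI_SRM		0x0	/* placeholder for MI_STORE_REGISTER_MEM */

/* Emit a 4-dword "store timestamp register to memory" command, picking
 * the queue timestamp for multi-queue LRCs. */
static int toy_emit_copy_ts(uint32_t *dw, int i, bool multi_queue, uint32_t dst)
{
	uint32_t reg = multi_queue ? TOY_QUEUE_TS_OFFSET : TOY_CTX_TS_OFFSET;

	dw[i++] = TOY_MI_SRM;
	dw[i++] = reg;
	dw[i++] = dst;
	dw[i++] = 0;

	return i;
}
```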
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
v2: Use xe_lrc_is_multi_queue for check (Niranjana)
---
drivers/gpu/drm/xe/xe_ring_ops.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c b/drivers/gpu/drm/xe/xe_ring_ops.c
index cfeb4fc7d217..39a670e91ba7 100644
--- a/drivers/gpu/drm/xe/xe_ring_ops.c
+++ b/drivers/gpu/drm/xe/xe_ring_ops.c
@@ -269,8 +269,12 @@ static u32 get_ppgtt_flag(struct xe_sched_job *job)
static int emit_copy_timestamp(struct xe_device *xe, struct xe_lrc *lrc,
u32 *dw, int i)
{
+ const struct xe_reg reg = xe_lrc_is_multi_queue(lrc) ?
+ RING_QUEUE_TIMESTAMP(0) :
+ RING_CTX_TIMESTAMP(0);
+
dw[i++] = MI_STORE_REGISTER_MEM | MI_SRM_USE_GGTT | MI_SRM_ADD_CS_OFFSET;
- dw[i++] = RING_CTX_TIMESTAMP(0).addr;
+ dw[i++] = reg.addr;
dw[i++] = xe_lrc_ctx_job_timestamp_ggtt_addr(lrc);
dw[i++] = 0;
@@ -281,7 +285,7 @@ static int emit_copy_timestamp(struct xe_device *xe, struct xe_lrc *lrc,
if (IS_SRIOV_VF(xe)) {
dw[i++] = MI_STORE_REGISTER_MEM | MI_SRM_USE_GGTT |
MI_SRM_ADD_CS_OFFSET;
- dw[i++] = RING_CTX_TIMESTAMP(0).addr;
+ dw[i++] = reg.addr;
dw[i++] = xe_lrc_ctx_timestamp_ggtt_addr(lrc);
dw[i++] = 0;
}
--
2.43.0
* [PATCH v2 9/9] drm/xe/multi_queue: Whitelist QUEUE_TIMESTAMP register
2026-05-02 0:53 [PATCH v2 0/9] Support run ticks for multi-queue use case Umesh Nerlige Ramappa
` (7 preceding siblings ...)
2026-05-02 0:53 ` [PATCH v2 8/9] drm/xe/multi_queue: Use QUEUE_TIMESTAMP as job timestamp for multi-queue Umesh Nerlige Ramappa
@ 2026-05-02 0:53 ` Umesh Nerlige Ramappa
2026-05-05 4:25 ` Niranjana Vishwanathapura
8 siblings, 1 reply; 27+ messages in thread
From: Umesh Nerlige Ramappa @ 2026-05-02 0:53 UTC (permalink / raw)
To: intel-xe, niranjana.vishwanathapura; +Cc: matthew.brost, stuart.summers
In a multi-queue use case, when a job is running on a secondary queue,
the CTX_TIMESTAMP does not reflect the queue's run ticks. Instead, we use
the QUEUE TIMESTAMP to check how long the job ran. For user space to see
the run ticks for a secondary queue, whitelist the QUEUE_TIMESTAMP
register.
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
v2: Whitelist QUEUE_TIMESTAMP only for copy and compute engines (Niranjana)
---
drivers/gpu/drm/xe/xe_reg_whitelist.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_reg_whitelist.c b/drivers/gpu/drm/xe/xe_reg_whitelist.c
index 80577e4b7437..37d6ac720d5c 100644
--- a/drivers/gpu/drm/xe/xe_reg_whitelist.c
+++ b/drivers/gpu/drm/xe/xe_reg_whitelist.c
@@ -33,6 +33,14 @@ static bool match_has_mert(const struct xe_device *xe,
return xe_device_has_mert((struct xe_device *)xe);
}
+static bool match_multiq_class(const struct xe_device *xe,
+ const struct xe_gt *gt,
+ const struct xe_hw_engine *hwe)
+{
+ return hwe->class == XE_ENGINE_CLASS_COMPUTE ||
+ hwe->class == XE_ENGINE_CLASS_COPY;
+}
+
static const struct xe_rtp_entry_sr register_whitelist[] = {
{ XE_RTP_NAME("WaAllowPMDepthAndInvocationCountAccessFromUMD, 1408556865"),
XE_RTP_RULES(GRAPHICS_VERSION_RANGE(1200, 1210), ENGINE_CLASS(RENDER)),
@@ -54,6 +62,12 @@ static const struct xe_rtp_entry_sr register_whitelist[] = {
RING_FORCE_TO_NONPRIV_ACCESS_RD,
XE_RTP_ACTION_FLAG(ENGINE_BASE)))
},
+ { XE_RTP_NAME("allow_read_queue_timestamp"),
+ XE_RTP_RULES(GRAPHICS_VERSION_RANGE(3500, 3511), FUNC(match_multiq_class)),
+ XE_RTP_ACTIONS(WHITELIST(RING_QUEUE_TIMESTAMP(0),
+ RING_FORCE_TO_NONPRIV_ACCESS_RD,
+ XE_RTP_ACTION_FLAG(ENGINE_BASE)))
+ },
{ XE_RTP_NAME("16014440446"),
XE_RTP_RULES(PLATFORM(PVC)),
XE_RTP_ACTIONS(WHITELIST(XE_REG(0x4400),
--
2.43.0
* Re: [PATCH v2 1/9] drm/xe/lrc: Use 64 bit ctx timestamp in the LRC snapshot
2026-05-02 0:53 ` [PATCH v2 1/9] drm/xe/lrc: Use 64 bit ctx timestamp in the LRC snapshot Umesh Nerlige Ramappa
@ 2026-05-04 23:51 ` Niranjana Vishwanathapura
0 siblings, 0 replies; 27+ messages in thread
From: Niranjana Vishwanathapura @ 2026-05-04 23:51 UTC (permalink / raw)
To: Umesh Nerlige Ramappa; +Cc: intel-xe, matthew.brost, stuart.summers
On Fri, May 01, 2026 at 05:53:34PM -0700, Umesh Nerlige Ramappa wrote:
>Use the 64 bit value when available for the context timestamp in the LRC
>snapshot.
>
>Suggested-by: Matthew Brost <matthew.brost@intel.com>
>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
LGTM
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>---
> drivers/gpu/drm/xe/xe_lrc.c | 4 ++--
> drivers/gpu/drm/xe/xe_lrc.h | 2 +-
> 2 files changed, 3 insertions(+), 3 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
>index 9d12a0d2f0b5..98dc4d0eb61b 100644
>--- a/drivers/gpu/drm/xe/xe_lrc.c
>+++ b/drivers/gpu/drm/xe/xe_lrc.c
>@@ -2475,7 +2475,7 @@ struct xe_lrc_snapshot *xe_lrc_snapshot_capture(struct xe_lrc *lrc)
> snapshot->replay_offset = 0;
> snapshot->replay_size = lrc->replay_size;
> snapshot->lrc_snapshot = NULL;
>- snapshot->ctx_timestamp = lower_32_bits(xe_lrc_ctx_timestamp(lrc));
>+ snapshot->ctx_timestamp = xe_lrc_ctx_timestamp(lrc);
> snapshot->ctx_job_timestamp = xe_lrc_ctx_job_timestamp(lrc);
> return snapshot;
> }
>@@ -2528,7 +2528,7 @@ void xe_lrc_snapshot_print(struct xe_lrc_snapshot *snapshot, struct drm_printer
> drm_printf(p, "\tRing start: (memory) 0x%08x\n", snapshot->start);
> drm_printf(p, "\tStart seqno: (memory) %d\n", snapshot->start_seqno);
> drm_printf(p, "\tSeqno: (memory) %d\n", snapshot->seqno);
>- drm_printf(p, "\tTimestamp: 0x%08x\n", snapshot->ctx_timestamp);
>+ drm_printf(p, "\tTimestamp: 0x%016llx\n", snapshot->ctx_timestamp);
> drm_printf(p, "\tJob Timestamp: 0x%08x\n", snapshot->ctx_job_timestamp);
>
> if (!snapshot->lrc_snapshot)
>diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
>index e7c975f9e2d9..62beaffba0af 100644
>--- a/drivers/gpu/drm/xe/xe_lrc.h
>+++ b/drivers/gpu/drm/xe/xe_lrc.h
>@@ -37,7 +37,7 @@ struct xe_lrc_snapshot {
> } tail;
> u32 start_seqno;
> u32 seqno;
>- u32 ctx_timestamp;
>+ u64 ctx_timestamp;
> u32 ctx_job_timestamp;
> };
>
>--
>2.43.0
>
* Re: [PATCH v2 2/9] drm/xe: Add timestamp_ms to LRC snapshot
2026-05-02 0:53 ` [PATCH v2 2/9] drm/xe: Add timestamp_ms to " Umesh Nerlige Ramappa
@ 2026-05-04 23:59 ` Niranjana Vishwanathapura
2026-05-05 18:03 ` Umesh Nerlige Ramappa
0 siblings, 1 reply; 27+ messages in thread
From: Niranjana Vishwanathapura @ 2026-05-04 23:59 UTC (permalink / raw)
To: Umesh Nerlige Ramappa; +Cc: intel-xe, matthew.brost, stuart.summers
On Fri, May 01, 2026 at 05:53:35PM -0700, Umesh Nerlige Ramappa wrote:
>From: Matthew Brost <matthew.brost@intel.com>
>
>Add a timestamp in milliseconds to the LRC snapshot to make it easier to
>reason about how long the LRC has been running and the average duration
>of each job.
>
>Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>---
> drivers/gpu/drm/xe/xe_lrc.c | 4 ++++
> drivers/gpu/drm/xe/xe_lrc.h | 1 +
> 2 files changed, 5 insertions(+)
>
>diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
>index 98dc4d0eb61b..d85c712d106b 100644
>--- a/drivers/gpu/drm/xe/xe_lrc.c
>+++ b/drivers/gpu/drm/xe/xe_lrc.c
>@@ -23,6 +23,7 @@
> #include "xe_drm_client.h"
> #include "xe_exec_queue_types.h"
> #include "xe_gt.h"
>+#include "xe_gt_clock.h"
> #include "xe_gt_printk.h"
> #include "xe_hw_fence.h"
> #include "xe_map.h"
>@@ -2476,6 +2477,8 @@ struct xe_lrc_snapshot *xe_lrc_snapshot_capture(struct xe_lrc *lrc)
> snapshot->replay_size = lrc->replay_size;
> snapshot->lrc_snapshot = NULL;
> snapshot->ctx_timestamp = xe_lrc_ctx_timestamp(lrc);
>+ snapshot->ctx_timestamp_ms =
>+ xe_gt_clock_interval_to_ms(lrc->gt, xe_lrc_ctx_timestamp(lrc));
> snapshot->ctx_job_timestamp = xe_lrc_ctx_job_timestamp(lrc);
> return snapshot;
> }
>@@ -2529,6 +2532,7 @@ void xe_lrc_snapshot_print(struct xe_lrc_snapshot *snapshot, struct drm_printer
> drm_printf(p, "\tStart seqno: (memory) %d\n", snapshot->start_seqno);
> drm_printf(p, "\tSeqno: (memory) %d\n", snapshot->seqno);
> drm_printf(p, "\tTimestamp: 0x%016llx\n", snapshot->ctx_timestamp);
>+ drm_printf(p, "\tTimestamp ms: %llu\n", snapshot->ctx_timestamp_ms);
Do we need a separate field for this? Maybe add it to a single line?
drm_printf(p, "\tTimestamp: 0x%016llx (%llums)\n", snapshot->ctx_timestamp, snapshot->ctx_timestamp_ms);
I am hoping no script parses these capture dumps in a way this change would break.
> drm_printf(p, "\tJob Timestamp: 0x%08x\n", snapshot->ctx_job_timestamp);
>
> if (!snapshot->lrc_snapshot)
>diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
>index 62beaffba0af..97aef0327fc8 100644
>--- a/drivers/gpu/drm/xe/xe_lrc.h
>+++ b/drivers/gpu/drm/xe/xe_lrc.h
>@@ -39,6 +39,7 @@ struct xe_lrc_snapshot {
> u32 seqno;
> u64 ctx_timestamp;
> u32 ctx_job_timestamp;
>+ u64 ctx_timestamp_ms;
NIT...maybe put ctx_timestamp_ms right after ctx_timestamp?
That way, we won't be adding a u32 between two u64s.
Niranjana
> };
>
> #define LRC_PPHWSP_FLUSH_INVAL_SCRATCH_ADDR (0x34 * 4)
>--
>2.43.0
>
* Re: [PATCH v2 3/9] drm/xe/multi_queue: Store primary LRC and position info in LRC
2026-05-02 0:53 ` [PATCH v2 3/9] drm/xe/multi_queue: Store primary LRC and position info in LRC Umesh Nerlige Ramappa
@ 2026-05-05 3:46 ` Niranjana Vishwanathapura
2026-05-05 18:35 ` Umesh Nerlige Ramappa
0 siblings, 1 reply; 27+ messages in thread
From: Niranjana Vishwanathapura @ 2026-05-05 3:46 UTC (permalink / raw)
To: Umesh Nerlige Ramappa; +Cc: intel-xe, matthew.brost, stuart.summers
On Fri, May 01, 2026 at 05:53:36PM -0700, Umesh Nerlige Ramappa wrote:
>Given an LRC belonging to the secondary queue, in order to check if its
>context group is active, we need to check the LRC of the primary queue.
>In addition to that we want to compare the secondary queue position to the
>CSMQDEBUG register to check if the queue itself is active.
>
>To do so, store primary LRC and position information in the LRC as well
>as take a reference to the primary LRC from each LRC in the queue group.
>
>A note on references involved:
>
>- In general the Queue takes a ref on its LRC.
>- In addition, for multi-queue,
>a. Primary Queue takes a ref for each Secondary LRC.
>b. Each Secondary Queue takes a ref to the Primary Queue
>
>In the current patch, each LRC in the queue group is storing a pointer
>to Primary LRC. There is a small window of time in the primary queue
>free path where the primary LRC may be freed before the secondary LRC.
>
>__xe_exec_queue_fini(q); // frees|puts primary q LRCs
>...
>window where secondary Q LRC is pointing to invalid primary LRC
>...
>__xe_exec_queue_free(q); // frees|puts secondary q LRCs in multi-Q case
>
>In this window the reference in the Secondary LRC is invalid. While there
>may be nothing accessing the secondary LRC's reference, to be safe, this
>patch takes a reference to the Primary LRC from the secondary LRC.
>
>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>---
>v2:
>- Store primary LRC instead of primary queue (Niranjana)
>- Drop the valid flag and check if primary_lrc is NULL (Niranjana)
>- Document/Revisit references (Matt/Umesh)
>---
> drivers/gpu/drm/xe/xe_exec_queue.c | 23 ++++++++++++++++++++---
> drivers/gpu/drm/xe/xe_lrc.h | 5 +++++
> drivers/gpu/drm/xe/xe_lrc_types.h | 8 ++++++++
> 3 files changed, 33 insertions(+), 3 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
>index b287d0e0e60a..e34601d28520 100644
>--- a/drivers/gpu/drm/xe/xe_exec_queue.c
>+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
>@@ -129,8 +129,14 @@ static void xe_exec_queue_group_cleanup(struct xe_exec_queue *q)
> return;
>
> /* Primary queue cleanup */
>- xa_for_each(&group->xa, idx, lrc)
>+ xa_for_each(&group->xa, idx, lrc) {
>+ /* drop secondary lrc ref to primary lrc */
>+ xe_lrc_put(lrc->multi_queue.primary_lrc);
>+ /* drop primary queue ref to secondary lrc */
> xe_lrc_put(lrc);
>+ }
>+ /* drop primary lrc ref to itself */
>+ xe_lrc_put(q->lrc[0]);
>
> xa_destroy(&group->xa);
> mutex_destroy(&group->list_lock);
>@@ -275,8 +281,15 @@ static void xe_exec_queue_set_lrc(struct xe_exec_queue *q, struct xe_lrc *lrc, u
> {
> xe_assert(gt_to_xe(q->gt), idx < q->width);
>
>- scoped_guard(spinlock, &q->lrc_lookup_lock)
>+ scoped_guard(spinlock, &q->lrc_lookup_lock) {
> q->lrc[idx] = lrc;
>+ if (xe_exec_queue_is_multi_queue(q)) {
>+ struct xe_lrc *primary_lrc = q->multi_queue.group->primary->lrc[0];
>+
>+ lrc->multi_queue.pos = q->multi_queue.pos;
I think q->multi_queue.pos is not yet set for secondary queues at this point.
It is set later in the xe_exec_queue_group_add() call.
>+ lrc->multi_queue.primary_lrc = xe_lrc_get(primary_lrc);
I think we don't need to get/put the primary_lrc reference.
Each queue holds a reference to its LRC. The secondary queues hold a reference
to the primary queue. So, essentially, the secondary LRCs already hold a reference
to the primary LRC, and we don't need to take another one.
Niranjana
>+ }
>+ }
> }
>
> /**
>@@ -388,8 +401,12 @@ static int __xe_exec_queue_init(struct xe_exec_queue *q, u32 exec_queue_flags)
>
> xe_exec_queue_set_lrc(q, lrc, i);
>
>- if (__lrc)
>+ if (__lrc) {
>+ if (xe_exec_queue_is_multi_queue(q))
>+ xe_lrc_put(__lrc->multi_queue.primary_lrc);
>+
> xe_lrc_put(__lrc);
>+ }
> __lrc = lrc;
>
> } while (marker != xe_vf_migration_fixups_complete_count(q->gt));
>diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
>index 97aef0327fc8..3d0bf4a7bfa0 100644
>--- a/drivers/gpu/drm/xe/xe_lrc.h
>+++ b/drivers/gpu/drm/xe/xe_lrc.h
>@@ -91,6 +91,11 @@ static inline size_t xe_lrc_ring_size(void)
> return SZ_16K;
> }
>
>+static inline bool xe_lrc_is_multi_queue(struct xe_lrc *lrc)
>+{
>+ return lrc->multi_queue.primary_lrc;
>+}
>+
> size_t xe_gt_lrc_hang_replay_size(struct xe_gt *gt, enum xe_engine_class class);
> size_t xe_gt_lrc_size(struct xe_gt *gt, enum xe_engine_class class);
> u32 xe_lrc_pphwsp_offset(struct xe_lrc *lrc);
>diff --git a/drivers/gpu/drm/xe/xe_lrc_types.h b/drivers/gpu/drm/xe/xe_lrc_types.h
>index 5a718f759ed6..0a5c13ec2ad7 100644
>--- a/drivers/gpu/drm/xe/xe_lrc_types.h
>+++ b/drivers/gpu/drm/xe/xe_lrc_types.h
>@@ -63,6 +63,14 @@ struct xe_lrc {
>
> /** @ctx_timestamp: readout value of CTX_TIMESTAMP on last update */
> u64 ctx_timestamp;
>+
>+ /** @multi_queue: Multi queue LRC related information */
>+ struct {
>+ /** @multi_queue.primary_lrc: Primary lrc of this multi-queue group*/
>+ struct xe_lrc *primary_lrc;
>+ /** @multi_queue.pos: Position of LRC within the multi-queue group */
>+ u8 pos;
>+ } multi_queue;
> };
>
> struct xe_lrc_snapshot;
>--
>2.43.0
>
* Re: [PATCH v2 4/9] drm/xe/multi_queue: Add helpers to access CS QUEUE TIMESTAMP from lrc
2026-05-02 0:53 ` [PATCH v2 4/9] drm/xe/multi_queue: Add helpers to access CS QUEUE TIMESTAMP from lrc Umesh Nerlige Ramappa
@ 2026-05-05 4:00 ` Niranjana Vishwanathapura
0 siblings, 0 replies; 27+ messages in thread
From: Niranjana Vishwanathapura @ 2026-05-05 4:00 UTC (permalink / raw)
To: Umesh Nerlige Ramappa; +Cc: intel-xe, matthew.brost, stuart.summers
On Fri, May 01, 2026 at 05:53:37PM -0700, Umesh Nerlige Ramappa wrote:
>In secondary queue LRCs, the QUEUE TIMESTAMP register is saved and
>restored allowing us to view the individual queue run times. Add helpers
>to read this value from the LRC.
>
>BSpec: 73988
>
>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>---
>v2: (Matt)
>- Add BSpec reference
>- Make queue_timestamp snapshot 64 bit
>- Add a snapshot of queue timestamp in ms
>---
> drivers/gpu/drm/xe/regs/xe_lrc_layout.h | 3 ++
> drivers/gpu/drm/xe/xe_lrc.c | 47 +++++++++++++++++++++++++
> drivers/gpu/drm/xe/xe_lrc.h | 2 ++
> drivers/gpu/drm/xe/xe_lrc_types.h | 3 ++
> 4 files changed, 55 insertions(+)
>
>diff --git a/drivers/gpu/drm/xe/regs/xe_lrc_layout.h b/drivers/gpu/drm/xe/regs/xe_lrc_layout.h
>index b5eff383902c..4ab86fc369fd 100644
>--- a/drivers/gpu/drm/xe/regs/xe_lrc_layout.h
>+++ b/drivers/gpu/drm/xe/regs/xe_lrc_layout.h
>@@ -34,6 +34,9 @@
> #define CTX_CS_INT_VEC_REG 0x5a
> #define CTX_CS_INT_VEC_DATA (CTX_CS_INT_VEC_REG + 1)
>
>+#define CTX_QUEUE_TIMESTAMP (0xd0 + 1)
>+#define CTX_QUEUE_TIMESTAMP_UDW (0xd2 + 1)
>+
> #define INDIRECT_CTX_RING_HEAD (0x02 + 1)
> #define INDIRECT_CTX_RING_TAIL (0x04 + 1)
> #define INDIRECT_CTX_RING_START (0x06 + 1)
>diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
>index d85c712d106b..2ee52efb9219 100644
>--- a/drivers/gpu/drm/xe/xe_lrc.c
>+++ b/drivers/gpu/drm/xe/xe_lrc.c
>@@ -789,6 +789,16 @@ static u32 __xe_lrc_ctx_timestamp_udw_offset(struct xe_lrc *lrc)
> return __xe_lrc_regs_offset(lrc) + CTX_TIMESTAMP_UDW * sizeof(u32);
> }
>
>+static u32 __xe_lrc_queue_timestamp_offset(struct xe_lrc *lrc)
>+{
>+ return __xe_lrc_regs_offset(lrc) + CTX_QUEUE_TIMESTAMP * sizeof(u32);
>+}
>+
>+static u32 __xe_lrc_queue_timestamp_udw_offset(struct xe_lrc *lrc)
>+{
>+ return __xe_lrc_regs_offset(lrc) + CTX_QUEUE_TIMESTAMP_UDW * sizeof(u32);
>+}
>+
> static inline u32 __xe_lrc_indirect_ring_offset(struct xe_lrc *lrc)
> {
> u32 offset = xe_bo_size(lrc->bo) - LRC_WA_BB_SIZE -
>@@ -838,6 +848,8 @@ DECL_MAP_ADDR_HELPERS(ctx_timestamp_udw, lrc->bo)
> DECL_MAP_ADDR_HELPERS(parallel, lrc->bo)
> DECL_MAP_ADDR_HELPERS(indirect_ring, lrc->bo)
> DECL_MAP_ADDR_HELPERS(engine_id, lrc->bo)
>+DECL_MAP_ADDR_HELPERS(queue_timestamp, lrc->bo)
>+DECL_MAP_ADDR_HELPERS(queue_timestamp_udw, lrc->bo)
>
> #undef DECL_MAP_ADDR_HELPERS
>
>@@ -886,6 +898,30 @@ static u64 xe_lrc_ctx_timestamp(struct xe_lrc *lrc)
> return (u64)udw << 32 | ldw;
> }
>
>+/**
>+ * xe_lrc_queue_timestamp() - Read queue timestamp value
>+ * @lrc: Pointer to the lrc.
>+ *
>+ * Returns: queue timestamp value
>+ */
>+static u64 xe_lrc_queue_timestamp(struct xe_lrc *lrc)
>+{
>+ struct xe_device *xe = lrc_to_xe(lrc);
>+ struct iosys_map map;
>+ u32 ldw, udw = 0;
>+
>+ if (!xe_lrc_is_multi_queue(lrc))
>+ return 0;
Maybe print an error instead of silently returning 0?
>+
>+ map = __xe_lrc_queue_timestamp_map(lrc);
>+ ldw = xe_map_read32(xe, &map);
>+
>+ map = __xe_lrc_queue_timestamp_udw_map(lrc);
>+ udw = xe_map_read32(xe, &map);
>+
>+ return (u64)udw << 32 | ldw;
>+}
>+
> /**
> * xe_lrc_ctx_job_timestamp_ggtt_addr() - Get ctx job timestamp GGTT address
> * @lrc: Pointer to the lrc.
>@@ -1551,6 +1587,12 @@ static int xe_lrc_ctx_init(struct xe_lrc *lrc, struct xe_hw_engine *hwe, struct
> if (lrc_to_xe(lrc)->info.has_64bit_timestamp)
> xe_lrc_write_ctx_reg(lrc, CTX_TIMESTAMP_UDW, 0);
>
>+ if (xe_lrc_is_multi_queue(lrc)) {
>+ lrc->queue_timestamp = 0;
>+ xe_lrc_write_ctx_reg(lrc, CTX_QUEUE_TIMESTAMP, 0);
>+ xe_lrc_write_ctx_reg(lrc, CTX_QUEUE_TIMESTAMP_UDW, 0);
>+ }
lrc->multi_queue.primary_lrc is not set yet at this point.
So, xe_lrc_is_multi_queue(lrc) will always return false here.
>+
> if (xe->info.has_asid && vm)
> xe_lrc_write_ctx_reg(lrc, CTX_ASID, vm->usm.asid);
>
>@@ -2479,6 +2521,9 @@ struct xe_lrc_snapshot *xe_lrc_snapshot_capture(struct xe_lrc *lrc)
> snapshot->ctx_timestamp = xe_lrc_ctx_timestamp(lrc);
> snapshot->ctx_timestamp_ms =
> xe_gt_clock_interval_to_ms(lrc->gt, xe_lrc_ctx_timestamp(lrc));
>+ snapshot->queue_timestamp = xe_lrc_queue_timestamp(lrc);
>+ snapshot->queue_timestamp_ms =
>+ xe_gt_clock_interval_to_ms(lrc->gt, xe_lrc_queue_timestamp(lrc));
> snapshot->ctx_job_timestamp = xe_lrc_ctx_job_timestamp(lrc);
> return snapshot;
> }
>@@ -2533,6 +2578,8 @@ void xe_lrc_snapshot_print(struct xe_lrc_snapshot *snapshot, struct drm_printer
> drm_printf(p, "\tSeqno: (memory) %d\n", snapshot->seqno);
> drm_printf(p, "\tTimestamp: 0x%016llx\n", snapshot->ctx_timestamp);
> drm_printf(p, "\tTimestamp ms: %llu\n", snapshot->ctx_timestamp_ms);
>+ drm_printf(p, "\tQueue Timestamp: 0x%016llx\n", snapshot->queue_timestamp);
>+ drm_printf(p, "\tQueue Timestamp ms: %llu\n", snapshot->queue_timestamp_ms);
Same comment as in patch #2 (Timestamp ms).
Niranjana
> drm_printf(p, "\tJob Timestamp: 0x%08x\n", snapshot->ctx_job_timestamp);
>
> if (!snapshot->lrc_snapshot)
>diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
>index 3d0bf4a7bfa0..12d08808ac75 100644
>--- a/drivers/gpu/drm/xe/xe_lrc.h
>+++ b/drivers/gpu/drm/xe/xe_lrc.h
>@@ -38,8 +38,10 @@ struct xe_lrc_snapshot {
> u32 start_seqno;
> u32 seqno;
> u64 ctx_timestamp;
>+ u64 queue_timestamp;
> u32 ctx_job_timestamp;
> u64 ctx_timestamp_ms;
>+ u64 queue_timestamp_ms;
> };
>
> #define LRC_PPHWSP_FLUSH_INVAL_SCRATCH_ADDR (0x34 * 4)
>diff --git a/drivers/gpu/drm/xe/xe_lrc_types.h b/drivers/gpu/drm/xe/xe_lrc_types.h
>index 0a5c13ec2ad7..53ef48feebfc 100644
>--- a/drivers/gpu/drm/xe/xe_lrc_types.h
>+++ b/drivers/gpu/drm/xe/xe_lrc_types.h
>@@ -64,6 +64,9 @@ struct xe_lrc {
> /** @ctx_timestamp: readout value of CTX_TIMESTAMP on last update */
> u64 ctx_timestamp;
>
>+ /** @queue_timestamp: value of QUEUE_TIMESTAMP on last update */
>+ u64 queue_timestamp;
>+
> /** @multi_queue: Multi queue LRC related information */
> struct {
> /** @multi_queue.primary_lrc: Primary lrc of this multi-queue group*/
>--
>2.43.0
>
* Re: [PATCH v2 6/9] drm/xe/multi_queue: Capture queue run times for active queues
2026-05-02 0:53 ` [PATCH v2 6/9] drm/xe/multi_queue: Capture queue run times for active queues Umesh Nerlige Ramappa
@ 2026-05-05 4:12 ` Niranjana Vishwanathapura
2026-05-05 19:02 ` Umesh Nerlige Ramappa
0 siblings, 1 reply; 27+ messages in thread
From: Niranjana Vishwanathapura @ 2026-05-05 4:12 UTC (permalink / raw)
To: Umesh Nerlige Ramappa; +Cc: intel-xe, matthew.brost, stuart.summers
On Fri, May 01, 2026 at 05:53:39PM -0700, Umesh Nerlige Ramappa wrote:
>If a queue is currently active on the CS, query the QUEUE TIMESTAMP
>register to get an up-to-date value of the runtime. To do so, ensure
>that the primary queue is active and then check if the secondary queue
>is executing on the CS.
>
>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>---
>v2:
>- Move trace to a separate patch (Stuart)
>- Refactor multi queue timestamp logic (Matt/Niranjana)
>---
> drivers/gpu/drm/xe/regs/xe_engine_regs.h | 4 +
> drivers/gpu/drm/xe/xe_lrc.c | 115 +++++++++++++++++++----
> 2 files changed, 99 insertions(+), 20 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/regs/xe_engine_regs.h b/drivers/gpu/drm/xe/regs/xe_engine_regs.h
>index 1b4a7e9a703d..af6af6f3f5e8 100644
>--- a/drivers/gpu/drm/xe/regs/xe_engine_regs.h
>+++ b/drivers/gpu/drm/xe/regs/xe_engine_regs.h
>@@ -170,6 +170,10 @@
> #define GFX_MSIX_INTERRUPT_ENABLE REG_BIT(13)
>
> #define RING_CSMQDEBUG(base) XE_REG((base) + 0x2b0)
>+#define CURRENT_ACTIVE_QUEUE_ID_MASK REG_GENMASK(7, 0)
>+
>+#define RING_QUEUE_TIMESTAMP(base) XE_REG((base) + 0x4c0)
>+#define RING_QUEUE_TIMESTAMP_UDW(base) XE_REG((base) + 0x4c0 + 4)
>
> #define RING_TIMESTAMP(base) XE_REG((base) + 0x358)
>
>diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
>index 92419e5058fd..023202be5d52 100644
>--- a/drivers/gpu/drm/xe/xe_lrc.c
>+++ b/drivers/gpu/drm/xe/xe_lrc.c
>@@ -21,6 +21,7 @@
> #include "xe_configfs.h"
> #include "xe_device.h"
> #include "xe_drm_client.h"
>+#include "xe_exec_queue.h"
> #include "xe_exec_queue_types.h"
> #include "xe_gt.h"
> #include "xe_gt_clock.h"
>@@ -2655,17 +2656,65 @@ static int get_ctx_timestamp(struct xe_lrc *lrc, u32 engine_id, u64 *reg_ctx_ts)
> return 0;
> }
>
>-/**
>- * xe_lrc_timestamp() - Current ctx timestamp
>- * @lrc: Pointer to the lrc.
>- *
>- * Return latest ctx timestamp. With support for active contexts, the
>- * calculation may be slightly racy, so follow a read-again logic to ensure that
>- * the context is still active before returning the right timestamp.
>- *
>- * Returns: New ctx timestamp value
>- */
>-u64 xe_lrc_timestamp(struct xe_lrc *lrc)
>+static u64 get_queue_timestamp(struct xe_hw_engine *hwe)
>+{
>+ return xe_mmio_read64_2x32(&hwe->gt->mmio,
>+ RING_QUEUE_TIMESTAMP(hwe->mmio_base));
>+}
>+
>+static u32 get_queue_id(struct xe_hw_engine *hwe)
>+{
NIT...maybe rename it to get_multi_queue_active_queue()?
>+ u32 val = xe_mmio_read32(&hwe->gt->mmio,
>+ RING_CSMQDEBUG(hwe->mmio_base));
>+
>+ return REG_FIELD_GET(CURRENT_ACTIVE_QUEUE_ID_MASK, val);
>+}
>+
>+static bool context_active(struct xe_lrc *lrc)
>+{
>+ return xe_lrc_ctx_timestamp(lrc) == CONTEXT_ACTIVE;
>+}
>+
>+static u64 xe_lrc_multi_queue_timestamp(struct xe_lrc *lrc)
>+{
>+ struct xe_lrc *primary_lrc = lrc->multi_queue.primary_lrc;
>+ struct xe_hw_engine *hwe;
>+ u64 reg_queue_ts = lrc->queue_timestamp;
>+
>+ if (IS_SRIOV_VF(lrc_to_xe(lrc)))
>+ return xe_lrc_queue_timestamp(lrc);
>+
>+ if (!primary_lrc || !context_active(primary_lrc))
>+ return xe_lrc_queue_timestamp(lrc);
Maybe print a warning if primary_lrc is not set?
>+
>+ /* WA BB populates engine id in PPHWSP of primary context only */
>+ hwe = engine_id_to_hwe(primary_lrc->gt, xe_lrc_engine_id(primary_lrc));
>+ if (!hwe)
>+ return xe_lrc_queue_timestamp(lrc);
>+
>+ if (get_queue_id(hwe) != lrc->multi_queue.pos)
>+ return xe_lrc_queue_timestamp(lrc);
>+
>+ /* queue is active, so store the queue timestamp register */
>+ reg_queue_ts = get_queue_timestamp(hwe);
>+
>+ /* double check queue and primary queue are both still active */
>+ if (get_queue_id(hwe) != lrc->multi_queue.pos ||
>+ !context_active(primary_lrc))
>+ return xe_lrc_queue_timestamp(lrc);
>+
>+ return reg_queue_ts;
>+}
>+
>+static u64 xe_lrc_update_multi_queue_timestamp(struct xe_lrc *lrc, u64 *old_ts)
>+{
>+ *old_ts = lrc->queue_timestamp;
>+ lrc->queue_timestamp = xe_lrc_multi_queue_timestamp(lrc);
Same here, add a warning message if lrc is not multi-queue?
>+
>+ return lrc->queue_timestamp;
>+}
>+
>+static u64 xe_lrc_single_queue_timestamp(struct xe_lrc *lrc)
> {
Hmm...NIT...we never used the term 'single_queue' so far to refer to the
non-multi-queue case. Maybe rename it to xe_lrc_ctx_timestamp()
or something?
Also, maybe update this function to make use of the newly added
context_active() function?
Niranjana
> u64 lrc_ts, reg_ts, new_ts = lrc->ctx_timestamp;
> u32 engine_id;
>@@ -2697,24 +2746,50 @@ u64 xe_lrc_timestamp(struct xe_lrc *lrc)
> return new_ts;
> }
>
>+static u64 xe_lrc_update_ctx_timestamp(struct xe_lrc *lrc, u64 *old_ts)
>+{
>+ *old_ts = lrc->ctx_timestamp;
>+ lrc->ctx_timestamp = xe_lrc_single_queue_timestamp(lrc);
>+
>+ trace_xe_lrc_update_timestamp(lrc, *old_ts);
>+
>+ return lrc->ctx_timestamp;
>+}
>+
> /**
>- * xe_lrc_update_timestamp() - Update ctx timestamp
>+ * xe_lrc_timestamp() - Current lrc timestamp
>+ * @lrc: Pointer to the lrc.
>+ *
>+ * Return latest lrc timestamp. With support for active contexts/queues, the
>+ * calculation may be slightly racy, so follow a read-again logic to ensure that
>+ * the context/queue is still active before returning the right timestamp.
>+ *
>+ * Returns: New lrc timestamp value
>+ */
>+u64 xe_lrc_timestamp(struct xe_lrc *lrc)
>+{
>+ if (xe_lrc_is_multi_queue(lrc))
>+ return xe_lrc_multi_queue_timestamp(lrc);
>+ else
>+ return xe_lrc_single_queue_timestamp(lrc);
>+}
>+
>+/**
>+ * xe_lrc_update_timestamp() - Update lrc timestamp
> * @lrc: Pointer to the lrc.
> * @old_ts: Old timestamp value
> *
>- * Populate @old_ts current saved ctx timestamp, read new ctx timestamp and
>+ * Populate @old_ts with current saved lrc timestamp, read new lrc timestamp and
> * update saved value.
> *
>- * Returns: New ctx timestamp value
>+ * Returns: New lrc timestamp value
> */
> u64 xe_lrc_update_timestamp(struct xe_lrc *lrc, u64 *old_ts)
> {
>- *old_ts = lrc->ctx_timestamp;
>- lrc->ctx_timestamp = xe_lrc_timestamp(lrc);
>-
>- trace_xe_lrc_update_timestamp(lrc, *old_ts);
>-
>- return lrc->ctx_timestamp;
>+ if (xe_lrc_is_multi_queue(lrc))
>+ return xe_lrc_update_multi_queue_timestamp(lrc, old_ts);
>+ else
>+ return xe_lrc_update_ctx_timestamp(lrc, old_ts);
> }
>
> /**
>--
>2.43.0
>
* Re: [PATCH v2 5/9] drm/xe/lrc: Refactor out engine id to hwe conversion
2026-05-02 0:53 ` [PATCH v2 5/9] drm/xe/lrc: Refactor out engine id to hwe conversion Umesh Nerlige Ramappa
@ 2026-05-05 4:16 ` Niranjana Vishwanathapura
0 siblings, 0 replies; 27+ messages in thread
From: Niranjana Vishwanathapura @ 2026-05-05 4:16 UTC (permalink / raw)
To: Umesh Nerlige Ramappa; +Cc: intel-xe, matthew.brost, stuart.summers
On Fri, May 01, 2026 at 05:53:38PM -0700, Umesh Nerlige Ramappa wrote:
>We need to define more helpers that read engine ID specific register, so
>move that logic outside of get_ctx_timestamp().
>
>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>---
> drivers/gpu/drm/xe/xe_lrc.c | 20 +++++++++++++++-----
> 1 file changed, 15 insertions(+), 5 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
>index 2ee52efb9219..92419e5058fd 100644
>--- a/drivers/gpu/drm/xe/xe_lrc.c
>+++ b/drivers/gpu/drm/xe/xe_lrc.c
>@@ -2620,17 +2620,27 @@ void xe_lrc_snapshot_free(struct xe_lrc_snapshot *snapshot)
> kfree(snapshot);
> }
>
>-static int get_ctx_timestamp(struct xe_lrc *lrc, u32 engine_id, u64 *reg_ctx_ts)
>+static struct xe_hw_engine *engine_id_to_hwe(struct xe_gt *gt, u32 engine_id)
> {
> u16 class = REG_FIELD_GET(ENGINE_CLASS_ID, engine_id);
> u16 instance = REG_FIELD_GET(ENGINE_INSTANCE_ID, engine_id);
>+ struct xe_hw_engine *hwe = xe_gt_hw_engine(gt, class, instance, false);
>+
>+ if (xe_gt_WARN_ONCE(gt, !hwe || xe_hw_engine_is_reserved(hwe),
>+ "Unexpected engine class:instance %d:%d for utilization\n",
>+ class, instance))
>+ return NULL;
>+
>+ return hwe;
>+}
Looks good. But the error message doesn't say which function is causing this.
Also, it will only print once even though there are multiple paths (scenarios)
that lead here. That is probably fine, as things will be broken anyway if this
happens.
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>+
>+static int get_ctx_timestamp(struct xe_lrc *lrc, u32 engine_id, u64 *reg_ctx_ts)
>+{
> struct xe_hw_engine *hwe;
> u64 val;
>
>- hwe = xe_gt_hw_engine(lrc->gt, class, instance, false);
>- if (xe_gt_WARN_ONCE(lrc->gt, !hwe || xe_hw_engine_is_reserved(hwe),
>- "Unexpected engine class:instance %d:%d for context utilization\n",
>- class, instance))
>+ hwe = engine_id_to_hwe(lrc->gt, engine_id);
>+ if (!hwe)
> return -1;
>
> if (lrc_to_xe(lrc)->info.has_64bit_timestamp)
>--
>2.43.0
>
* Re: [PATCH v2 7/9] drm/xe/multi_queue: Add trace event for the multi queue timestamp
2026-05-02 0:53 ` [PATCH v2 7/9] drm/xe/multi_queue: Add trace event for the multi queue timestamp Umesh Nerlige Ramappa
@ 2026-05-05 4:19 ` Niranjana Vishwanathapura
0 siblings, 0 replies; 27+ messages in thread
From: Niranjana Vishwanathapura @ 2026-05-05 4:19 UTC (permalink / raw)
To: Umesh Nerlige Ramappa; +Cc: intel-xe, matthew.brost, stuart.summers
On Fri, May 01, 2026 at 05:53:40PM -0700, Umesh Nerlige Ramappa wrote:
>Add a trace event for multi queue timestamp capture.
>
>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>---
>v2:
>- Split traces from original patch (Stuart)
>- Print primary lrc in the trace (Niranjana)
>---
> drivers/gpu/drm/xe/xe_lrc.c | 2 ++
> drivers/gpu/drm/xe/xe_trace_lrc.h | 27 +++++++++++++++++++++++++++
> 2 files changed, 29 insertions(+)
>
>diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
>index 023202be5d52..6bd93803cb7f 100644
>--- a/drivers/gpu/drm/xe/xe_lrc.c
>+++ b/drivers/gpu/drm/xe/xe_lrc.c
>@@ -2711,6 +2711,8 @@ static u64 xe_lrc_update_multi_queue_timestamp(struct xe_lrc *lrc, u64 *old_ts)
> *old_ts = lrc->queue_timestamp;
> lrc->queue_timestamp = xe_lrc_multi_queue_timestamp(lrc);
>
>+ trace_xe_lrc_update_queue_timestamp(lrc, *old_ts);
>+
> return lrc->queue_timestamp;
> }
>
>diff --git a/drivers/gpu/drm/xe/xe_trace_lrc.h b/drivers/gpu/drm/xe/xe_trace_lrc.h
>index d525cbee1e34..fdc77102fa77 100644
>--- a/drivers/gpu/drm/xe/xe_trace_lrc.h
>+++ b/drivers/gpu/drm/xe/xe_trace_lrc.h
>@@ -12,6 +12,7 @@
> #include <linux/tracepoint.h>
> #include <linux/types.h>
>
>+#include "xe_exec_queue_types.h"
> #include "xe_gt_types.h"
> #include "xe_lrc.h"
> #include "xe_lrc_types.h"
>@@ -42,6 +43,32 @@ TRACE_EVENT(xe_lrc_update_timestamp,
> __get_str(device_id))
> );
>
>+TRACE_EVENT(xe_lrc_update_queue_timestamp,
>+ TP_PROTO(struct xe_lrc *lrc, uint64_t old),
>+ TP_ARGS(lrc, old),
>+ TP_STRUCT__entry(
>+ __field(struct xe_lrc *, lrc)
>+ __field(u8, pos)
>+ __field(u64, old)
>+ __field(u64, new)
>+ __string(name, lrc->fence_ctx.name)
>+ __string(device_id, __dev_name_lrc(lrc))
>+ ),
>+
>+ TP_fast_assign(
>+ __entry->lrc = lrc->multi_queue.primary_lrc;
>+ __entry->pos = lrc->multi_queue.pos;
>+ __entry->old = old;
>+ __entry->new = lrc->queue_timestamp;
>+ __assign_str(name);
>+ __assign_str(device_id);
>+ ),
>+ TP_printk("lrc=:%p pos=%d lrc->name=%s old=%llu new=%llu device_id:%s",
>+ __entry->lrc, __entry->pos, __get_str(name),
>+ __entry->old, __entry->new,
>+ __get_str(device_id))
I meant printing both the current LRC and the primary LRC (otherwise, it won't help).
As I mentioned before, printing pos may not be of much value here.
Niranjana
>+);
>+
> #endif
>
> /* This part must be outside protection */
>--
>2.43.0
>
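For illustration, recording and printing both pointers as requested might look roughly like the fragment below. This is a sketch only: the primary_lrc entry field is an assumed addition, not code from the patch.

```c
/* Sketch: record both the current and the primary LRC in the event. */
__field(struct xe_lrc *, lrc)
__field(struct xe_lrc *, primary_lrc)
/* ... */
TP_fast_assign(
	__entry->lrc = lrc;
	__entry->primary_lrc = lrc->multi_queue.primary_lrc;
	/* ... */
),
TP_printk("lrc=%p primary_lrc=%p lrc->name=%s old=%llu new=%llu device_id:%s",
	  __entry->lrc, __entry->primary_lrc, __get_str(name),
	  __entry->old, __entry->new, __get_str(device_id))
```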
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 8/9] drm/xe/multi_queue: Use QUEUE_TIMESTAMP as job timestamp for multi-queue
2026-05-02 0:53 ` [PATCH v2 8/9] drm/xe/multi_queue: Use QUEUE_TIMESTAMP as job timestamp for multi-queue Umesh Nerlige Ramappa
@ 2026-05-05 4:20 ` Niranjana Vishwanathapura
0 siblings, 0 replies; 27+ messages in thread
From: Niranjana Vishwanathapura @ 2026-05-05 4:20 UTC (permalink / raw)
To: Umesh Nerlige Ramappa; +Cc: intel-xe, matthew.brost, stuart.summers
On Fri, May 01, 2026 at 05:53:41PM -0700, Umesh Nerlige Ramappa wrote:
>Each queue in a multi queue group has a dedicated timestamp counter. Use
>this QUEUE TIMESTAMP register to capture the start timestamp for the
>job.
>
>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
LGTM
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
>---
>v2: Use xe_lrc_is_multi_queue for check (Niranjana)
>---
> drivers/gpu/drm/xe/xe_ring_ops.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c b/drivers/gpu/drm/xe/xe_ring_ops.c
>index cfeb4fc7d217..39a670e91ba7 100644
>--- a/drivers/gpu/drm/xe/xe_ring_ops.c
>+++ b/drivers/gpu/drm/xe/xe_ring_ops.c
>@@ -269,8 +269,12 @@ static u32 get_ppgtt_flag(struct xe_sched_job *job)
> static int emit_copy_timestamp(struct xe_device *xe, struct xe_lrc *lrc,
> u32 *dw, int i)
> {
>+ const struct xe_reg reg = xe_lrc_is_multi_queue(lrc) ?
>+ RING_QUEUE_TIMESTAMP(0) :
>+ RING_CTX_TIMESTAMP(0);
>+
> dw[i++] = MI_STORE_REGISTER_MEM | MI_SRM_USE_GGTT | MI_SRM_ADD_CS_OFFSET;
>- dw[i++] = RING_CTX_TIMESTAMP(0).addr;
>+ dw[i++] = reg.addr;
> dw[i++] = xe_lrc_ctx_job_timestamp_ggtt_addr(lrc);
> dw[i++] = 0;
>
>@@ -281,7 +285,7 @@ static int emit_copy_timestamp(struct xe_device *xe, struct xe_lrc *lrc,
> if (IS_SRIOV_VF(xe)) {
> dw[i++] = MI_STORE_REGISTER_MEM | MI_SRM_USE_GGTT |
> MI_SRM_ADD_CS_OFFSET;
>- dw[i++] = RING_CTX_TIMESTAMP(0).addr;
>+ dw[i++] = reg.addr;
> dw[i++] = xe_lrc_ctx_timestamp_ggtt_addr(lrc);
> dw[i++] = 0;
> }
>--
>2.43.0
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 9/9] drm/xe/multi_queue: Whitelist QUEUE_TIMESTAMP register
2026-05-02 0:53 ` [PATCH v2 9/9] drm/xe/multi_queue: Whitelist QUEUE_TIMESTAMP register Umesh Nerlige Ramappa
@ 2026-05-05 4:25 ` Niranjana Vishwanathapura
2026-05-05 17:58 ` Umesh Nerlige Ramappa
0 siblings, 1 reply; 27+ messages in thread
From: Niranjana Vishwanathapura @ 2026-05-05 4:25 UTC (permalink / raw)
To: Umesh Nerlige Ramappa; +Cc: intel-xe, matthew.brost, stuart.summers
On Fri, May 01, 2026 at 05:53:42PM -0700, Umesh Nerlige Ramappa wrote:
>In a multi-queue use case, when a job is running on the secondary queue,
>the CTX_TIMESTAMP does not reflect the queue's run ticks. Instead, we use
>the QUEUE TIMESTAMP to check how long the job ran. For user space to see
>the run ticks for a secondary queue, whitelist the QUEUE_TIMESTAMP
>register.
>
>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>---
>v2: Whitelist QUEUE_TIMESTAMP only for copy and compute engines (Niranjana)
>---
> drivers/gpu/drm/xe/xe_reg_whitelist.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
>diff --git a/drivers/gpu/drm/xe/xe_reg_whitelist.c b/drivers/gpu/drm/xe/xe_reg_whitelist.c
>index 80577e4b7437..37d6ac720d5c 100644
>--- a/drivers/gpu/drm/xe/xe_reg_whitelist.c
>+++ b/drivers/gpu/drm/xe/xe_reg_whitelist.c
>@@ -33,6 +33,14 @@ static bool match_has_mert(const struct xe_device *xe,
> return xe_device_has_mert((struct xe_device *)xe);
> }
>
>+static bool match_multiq_class(const struct xe_device *xe,
>+ const struct xe_gt *gt,
>+ const struct xe_hw_engine *hwe)
To be consistent, we have used 'multi_queue' as the naming convention
and avoided 'multiq'. It would be good to keep that consistency
everywhere.
>+{
>+ return hwe->class == XE_ENGINE_CLASS_COMPUTE ||
>+ hwe->class == XE_ENGINE_CLASS_COPY;
We already have the xe_exec_queue_supports_multi_queue() function, which
does a similar check. I think we need to abstract it out into a single
function and use that in the multiple places in the code that determine
whether a class supports multi-queue or not.
Niranjana
>+}
>+
> static const struct xe_rtp_entry_sr register_whitelist[] = {
> { XE_RTP_NAME("WaAllowPMDepthAndInvocationCountAccessFromUMD, 1408556865"),
> XE_RTP_RULES(GRAPHICS_VERSION_RANGE(1200, 1210), ENGINE_CLASS(RENDER)),
>@@ -54,6 +62,12 @@ static const struct xe_rtp_entry_sr register_whitelist[] = {
> RING_FORCE_TO_NONPRIV_ACCESS_RD,
> XE_RTP_ACTION_FLAG(ENGINE_BASE)))
> },
>+ { XE_RTP_NAME("allow_read_queue_timestamp"),
>+ XE_RTP_RULES(GRAPHICS_VERSION_RANGE(3500, 3511), FUNC(match_multiq_class)),
>+ XE_RTP_ACTIONS(WHITELIST(RING_QUEUE_TIMESTAMP(0),
>+ RING_FORCE_TO_NONPRIV_ACCESS_RD,
>+ XE_RTP_ACTION_FLAG(ENGINE_BASE)))
>+ },
> { XE_RTP_NAME("16014440446"),
> XE_RTP_RULES(PLATFORM(PVC)),
> XE_RTP_ACTIONS(WHITELIST(XE_REG(0x4400),
>--
>2.43.0
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 9/9] drm/xe/multi_queue: Whitelist QUEUE_TIMESTAMP register
2026-05-05 4:25 ` Niranjana Vishwanathapura
@ 2026-05-05 17:58 ` Umesh Nerlige Ramappa
2026-05-05 18:34 ` Niranjana Vishwanathapura
0 siblings, 1 reply; 27+ messages in thread
From: Umesh Nerlige Ramappa @ 2026-05-05 17:58 UTC (permalink / raw)
To: Niranjana Vishwanathapura; +Cc: intel-xe, matthew.brost, stuart.summers
On Mon, May 04, 2026 at 09:25:29PM -0700, Niranjana Vishwanathapura wrote:
>On Fri, May 01, 2026 at 05:53:42PM -0700, Umesh Nerlige Ramappa wrote:
>>In a multi-queue use case, when a job is running on the secondary queue,
>>the CTX_TIMESTAMP does not reflect the queues run ticks. Instead, we use
>>the QUEUE TIMESTAMP to check how long the job ran. For user space to see
>>the run ticks for a secondary queue, whitelist the QUEUE_TIMESTAMP
>>register.
>>
>>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>>---
>>v2: Whitelist QUEUE_TIMESTAMP only for copy and compute engines (Niranjana)
>>---
>>drivers/gpu/drm/xe/xe_reg_whitelist.c | 14 ++++++++++++++
>>1 file changed, 14 insertions(+)
>>
>>diff --git a/drivers/gpu/drm/xe/xe_reg_whitelist.c b/drivers/gpu/drm/xe/xe_reg_whitelist.c
>>index 80577e4b7437..37d6ac720d5c 100644
>>--- a/drivers/gpu/drm/xe/xe_reg_whitelist.c
>>+++ b/drivers/gpu/drm/xe/xe_reg_whitelist.c
>>@@ -33,6 +33,14 @@ static bool match_has_mert(const struct xe_device *xe,
>> return xe_device_has_mert((struct xe_device *)xe);
>>}
>>
>>+static bool match_multiq_class(const struct xe_device *xe,
>>+ const struct xe_gt *gt,
>>+ const struct xe_hw_engine *hwe)
>
>To be consistent, we have used 'multi_queue' as the naming convention
>and avoided 'multiq'. It would be good to keep that consistency
>everywhere.
sure, will change that
>
>>+{
>>+ return hwe->class == XE_ENGINE_CLASS_COMPUTE ||
>>+ hwe->class == XE_ENGINE_CLASS_COPY;
>
>We already have the xe_exec_queue_supports_multi_queue() function, which
>does a similar check. I think we need to abstract it out into a single
>function and use that in the multiple places in the code that determine
>whether a class supports multi-queue or not.
The whitelist is applied during engine init at driver load, so we don't
have any exec queues yet. Instead, should we derive supported engines
from xe_graphics_desc.multi_queue_engine_class_mask and check?
Umesh
>
>Niranjana
>
>>+}
>>+
>>static const struct xe_rtp_entry_sr register_whitelist[] = {
>> { XE_RTP_NAME("WaAllowPMDepthAndInvocationCountAccessFromUMD, 1408556865"),
>> XE_RTP_RULES(GRAPHICS_VERSION_RANGE(1200, 1210), ENGINE_CLASS(RENDER)),
>>@@ -54,6 +62,12 @@ static const struct xe_rtp_entry_sr register_whitelist[] = {
>> RING_FORCE_TO_NONPRIV_ACCESS_RD,
>> XE_RTP_ACTION_FLAG(ENGINE_BASE)))
>> },
>>+ { XE_RTP_NAME("allow_read_queue_timestamp"),
>>+ XE_RTP_RULES(GRAPHICS_VERSION_RANGE(3500, 3511), FUNC(match_multiq_class)),
>>+ XE_RTP_ACTIONS(WHITELIST(RING_QUEUE_TIMESTAMP(0),
>>+ RING_FORCE_TO_NONPRIV_ACCESS_RD,
>>+ XE_RTP_ACTION_FLAG(ENGINE_BASE)))
>>+ },
>> { XE_RTP_NAME("16014440446"),
>> XE_RTP_RULES(PLATFORM(PVC)),
>> XE_RTP_ACTIONS(WHITELIST(XE_REG(0x4400),
>>--
>>2.43.0
>>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 2/9] drm/xe: Add timestamp_ms to LRC snapshot
2026-05-04 23:59 ` Niranjana Vishwanathapura
@ 2026-05-05 18:03 ` Umesh Nerlige Ramappa
0 siblings, 0 replies; 27+ messages in thread
From: Umesh Nerlige Ramappa @ 2026-05-05 18:03 UTC (permalink / raw)
To: Niranjana Vishwanathapura; +Cc: intel-xe, matthew.brost, stuart.summers
On Mon, May 04, 2026 at 04:59:26PM -0700, Niranjana Vishwanathapura wrote:
>On Fri, May 01, 2026 at 05:53:35PM -0700, Umesh Nerlige Ramappa wrote:
>>From: Matthew Brost <matthew.brost@intel.com>
>>
>>Add a timestamp in milliseconds to the LRC snapshot to make it easier to
>>reason about how long the LRC has been running and the average duration
>>of each job.
>>
>>Signed-off-by: Matthew Brost <matthew.brost@intel.com>
>>---
>>drivers/gpu/drm/xe/xe_lrc.c | 4 ++++
>>drivers/gpu/drm/xe/xe_lrc.h | 1 +
>>2 files changed, 5 insertions(+)
>>
>>diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
>>index 98dc4d0eb61b..d85c712d106b 100644
>>--- a/drivers/gpu/drm/xe/xe_lrc.c
>>+++ b/drivers/gpu/drm/xe/xe_lrc.c
>>@@ -23,6 +23,7 @@
>>#include "xe_drm_client.h"
>>#include "xe_exec_queue_types.h"
>>#include "xe_gt.h"
>>+#include "xe_gt_clock.h"
>>#include "xe_gt_printk.h"
>>#include "xe_hw_fence.h"
>>#include "xe_map.h"
>>@@ -2476,6 +2477,8 @@ struct xe_lrc_snapshot *xe_lrc_snapshot_capture(struct xe_lrc *lrc)
>> snapshot->replay_size = lrc->replay_size;
>> snapshot->lrc_snapshot = NULL;
>> snapshot->ctx_timestamp = xe_lrc_ctx_timestamp(lrc);
>>+ snapshot->ctx_timestamp_ms =
>>+ xe_gt_clock_interval_to_ms(lrc->gt, xe_lrc_ctx_timestamp(lrc));
>> snapshot->ctx_job_timestamp = xe_lrc_ctx_job_timestamp(lrc);
>> return snapshot;
>>}
>>@@ -2529,6 +2532,7 @@ void xe_lrc_snapshot_print(struct xe_lrc_snapshot *snapshot, struct drm_printer
>> drm_printf(p, "\tStart seqno: (memory) %d\n", snapshot->start_seqno);
>> drm_printf(p, "\tSeqno: (memory) %d\n", snapshot->seqno);
>> drm_printf(p, "\tTimestamp: 0x%016llx\n", snapshot->ctx_timestamp);
>>+ drm_printf(p, "\tTimestamp ms: %llu\n", snapshot->ctx_timestamp_ms);
>
>Do we need a separate field for this? Maybe add it on a single line?
>drm_printf(p, "\tTimestamp: 0x%016llx (%llums)\n", snapshot->ctx_timestamp, snapshot->ctx_timestamp_ms);
>
>I am hoping we don't have any scripts using these capture dumps that we might be breaking here.
I don't know, but I think it's easier for scripts if the new prints are
on a separate line. Also, the intention was to keep Matt's patch separate
here.
>
>> drm_printf(p, "\tJob Timestamp: 0x%08x\n", snapshot->ctx_job_timestamp);
>>
>> if (!snapshot->lrc_snapshot)
>>diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
>>index 62beaffba0af..97aef0327fc8 100644
>>--- a/drivers/gpu/drm/xe/xe_lrc.h
>>+++ b/drivers/gpu/drm/xe/xe_lrc.h
>>@@ -39,6 +39,7 @@ struct xe_lrc_snapshot {
>> u32 seqno;
>> u64 ctx_timestamp;
>> u32 ctx_job_timestamp;
>>+ u64 ctx_timestamp_ms;
>
>NIT... maybe put ctx_timestamp_ms right after ctx_timestamp?
>That way, we won't be adding a u32 in between two u64s.
will change,
Umesh
>
>Niranjana
>
>>};
>>
>>#define LRC_PPHWSP_FLUSH_INVAL_SCRATCH_ADDR (0x34 * 4)
>>--
>>2.43.0
>>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 9/9] drm/xe/multi_queue: Whitelist QUEUE_TIMESTAMP register
2026-05-05 17:58 ` Umesh Nerlige Ramappa
@ 2026-05-05 18:34 ` Niranjana Vishwanathapura
2026-05-05 19:06 ` Umesh Nerlige Ramappa
0 siblings, 1 reply; 27+ messages in thread
From: Niranjana Vishwanathapura @ 2026-05-05 18:34 UTC (permalink / raw)
To: Umesh Nerlige Ramappa; +Cc: intel-xe, matthew.brost, stuart.summers
On Tue, May 05, 2026 at 10:58:42AM -0700, Umesh Nerlige Ramappa wrote:
>On Mon, May 04, 2026 at 09:25:29PM -0700, Niranjana Vishwanathapura wrote:
>>On Fri, May 01, 2026 at 05:53:42PM -0700, Umesh Nerlige Ramappa wrote:
>>>In a multi-queue use case, when a job is running on the secondary queue,
>>>the CTX_TIMESTAMP does not reflect the queues run ticks. Instead, we use
>>>the QUEUE TIMESTAMP to check how long the job ran. For user space to see
>>>the run ticks for a secondary queue, whitelist the QUEUE_TIMESTAMP
>>>register.
>>>
>>>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>>>---
>>>v2: Whitelist QUEUE_TIMESTAMP only for copy and compute engines (Niranjana)
>>>---
>>>drivers/gpu/drm/xe/xe_reg_whitelist.c | 14 ++++++++++++++
>>>1 file changed, 14 insertions(+)
>>>
>>>diff --git a/drivers/gpu/drm/xe/xe_reg_whitelist.c b/drivers/gpu/drm/xe/xe_reg_whitelist.c
>>>index 80577e4b7437..37d6ac720d5c 100644
>>>--- a/drivers/gpu/drm/xe/xe_reg_whitelist.c
>>>+++ b/drivers/gpu/drm/xe/xe_reg_whitelist.c
>>>@@ -33,6 +33,14 @@ static bool match_has_mert(const struct xe_device *xe,
>>> return xe_device_has_mert((struct xe_device *)xe);
>>>}
>>>
>>>+static bool match_multiq_class(const struct xe_device *xe,
>>>+ const struct xe_gt *gt,
>>>+ const struct xe_hw_engine *hwe)
>>
>>To be consistent, we have used 'multi_queue' as the naming convention
>>and avoided 'multiq'. It would be good to keep that consistency
>>everywhere.
>
>sure, will change that
>>
>>>+{
>>>+ return hwe->class == XE_ENGINE_CLASS_COMPUTE ||
>>>+ hwe->class == XE_ENGINE_CLASS_COPY;
>>
>>We already have the xe_exec_queue_supports_multi_queue() function, which
>>does a similar check. I think we need to abstract it out into a single
>>function and use that in the multiple places in the code that determine
>>whether a class supports multi-queue or not.
>
>The whitelist is applied during engine init at driver load, so we
>don't have any exec queues yet. Instead, should we derive supported
>engines from xe_graphics_desc.multi_queue_engine_class_mask and check?
>
Yah, xe_exec_queue_supports_multi_queue() also uses it. So, maybe add a
xe_gt_supports_multi_queue(struct xe_gt *gt, enum xe_engine_class class)
in xe_gt.h and use that both here and in xe_exec_queue_supports_multi_queue().
Niranjana
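The proposed helper could be sketched roughly as below, deriving the answer from a per-GT engine-class bitmask as suggested. This is an illustrative stand-alone sketch, not driver code: struct xe_gt and the enum here are stand-ins, and in the real driver the mask would come from xe_graphics_desc.multi_queue_engine_class_mask.

```c
#include <stdbool.h>

/* Stand-in types for illustration; the real driver uses the xe headers. */
enum xe_engine_class {
	XE_ENGINE_CLASS_RENDER,
	XE_ENGINE_CLASS_COPY,
	XE_ENGINE_CLASS_COMPUTE,
};

struct xe_gt {
	/* bit N set => engine class N supports multi-queue */
	unsigned int multi_queue_engine_class_mask;
};

/*
 * Single point of truth for "does this engine class support multi-queue
 * on this GT?", usable both at whitelist time (no exec queues exist yet)
 * and from xe_exec_queue_supports_multi_queue().
 */
static bool xe_gt_supports_multi_queue(const struct xe_gt *gt,
				       enum xe_engine_class class)
{
	return gt->multi_queue_engine_class_mask & (1u << class);
}
```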
>Umesh
>
>>
>>Niranjana
>>
>>>+}
>>>+
>>>static const struct xe_rtp_entry_sr register_whitelist[] = {
>>> { XE_RTP_NAME("WaAllowPMDepthAndInvocationCountAccessFromUMD, 1408556865"),
>>> XE_RTP_RULES(GRAPHICS_VERSION_RANGE(1200, 1210), ENGINE_CLASS(RENDER)),
>>>@@ -54,6 +62,12 @@ static const struct xe_rtp_entry_sr register_whitelist[] = {
>>> RING_FORCE_TO_NONPRIV_ACCESS_RD,
>>> XE_RTP_ACTION_FLAG(ENGINE_BASE)))
>>> },
>>>+ { XE_RTP_NAME("allow_read_queue_timestamp"),
>>>+ XE_RTP_RULES(GRAPHICS_VERSION_RANGE(3500, 3511), FUNC(match_multiq_class)),
>>>+ XE_RTP_ACTIONS(WHITELIST(RING_QUEUE_TIMESTAMP(0),
>>>+ RING_FORCE_TO_NONPRIV_ACCESS_RD,
>>>+ XE_RTP_ACTION_FLAG(ENGINE_BASE)))
>>>+ },
>>> { XE_RTP_NAME("16014440446"),
>>> XE_RTP_RULES(PLATFORM(PVC)),
>>> XE_RTP_ACTIONS(WHITELIST(XE_REG(0x4400),
>>>--
>>>2.43.0
>>>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 3/9] drm/xe/multi_queue: Store primary LRC and position info in LRC
2026-05-05 3:46 ` Niranjana Vishwanathapura
@ 2026-05-05 18:35 ` Umesh Nerlige Ramappa
2026-05-05 18:45 ` Niranjana Vishwanathapura
0 siblings, 1 reply; 27+ messages in thread
From: Umesh Nerlige Ramappa @ 2026-05-05 18:35 UTC (permalink / raw)
To: Niranjana Vishwanathapura; +Cc: intel-xe, matthew.brost, stuart.summers
On Mon, May 04, 2026 at 08:46:10PM -0700, Niranjana Vishwanathapura wrote:
>On Fri, May 01, 2026 at 05:53:36PM -0700, Umesh Nerlige Ramappa wrote:
>>Given an LRC belonging to the secondary queue, in order to check if its
>>context group is active, we need to check the LRC of the primary queue.
>>In addition to that we want to compare the secondary queue position to
>>CSMQDEBUG register to check if the queue itself is active.
>>
>>To do so, store primary LRC and position information in the LRC as well
>>as take a reference to the primary LRC from each LRC in the queue group.
>>
>>A note on references involved:
>>
>>- In general the Queue takes a ref on its LRC.
>>- In addition, for multi-queue,
>>a. Primary Queue takes a ref for each Secondary LRC.
>>b. Each Secondary Queue takes a ref to the Primary Queue
>>
>>In the current patch, each LRC in the queue group is storing a pointer
>>to Primary LRC. There is a small window of time in the primary queue
>>free path where the primary LRC may be freed before the secondary LRC.
>>
>>__xe_exec_queue_fini(q); // frees|puts primary q LRCs
>>...
>>window where secondary Q LRC is pointing to invalid primary LRC
>>...
>>__xe_exec_queue_free(q); // frees|puts secondary q LRCs in multi-Q case
>>
>>In this window the reference in Secondary LRC is invalid. While there
>>may be nothing accessing the secondary LRCs reference, to be safe, this
>>patch is taking a reference to Primary LRC from the secondary LRC.
>>
>>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>>---
>>v2:
>>- Store primary LRC instead of primary queue (Niranjana)
>>- Drop the valid flag and check if primary_lrc is NULL (Niranjana)
>>- Document/Revisit references (Matt/Umesh)
>>---
>>drivers/gpu/drm/xe/xe_exec_queue.c | 23 ++++++++++++++++++++---
>>drivers/gpu/drm/xe/xe_lrc.h | 5 +++++
>>drivers/gpu/drm/xe/xe_lrc_types.h | 8 ++++++++
>>3 files changed, 33 insertions(+), 3 deletions(-)
>>
>>diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
>>index b287d0e0e60a..e34601d28520 100644
>>--- a/drivers/gpu/drm/xe/xe_exec_queue.c
>>+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
>>@@ -129,8 +129,14 @@ static void xe_exec_queue_group_cleanup(struct xe_exec_queue *q)
>> return;
>>
>> /* Primary queue cleanup */
>>- xa_for_each(&group->xa, idx, lrc)
>>+ xa_for_each(&group->xa, idx, lrc) {
>>+ /* drop secondary lrc ref to primary lrc */
>>+ xe_lrc_put(lrc->multi_queue.primary_lrc);
>>+ /* drop primary queue ref to secondary lrc */
>> xe_lrc_put(lrc);
>>+ }
>>+ /* drop primary lrc ref to itself */
>>+ xe_lrc_put(q->lrc[0]);
>>
>> xa_destroy(&group->xa);
>> mutex_destroy(&group->list_lock);
>>@@ -275,8 +281,15 @@ static void xe_exec_queue_set_lrc(struct xe_exec_queue *q, struct xe_lrc *lrc, u
>>{
>> xe_assert(gt_to_xe(q->gt), idx < q->width);
>>
>>- scoped_guard(spinlock, &q->lrc_lookup_lock)
>>+ scoped_guard(spinlock, &q->lrc_lookup_lock) {
>> q->lrc[idx] = lrc;
>>+ if (xe_exec_queue_is_multi_queue(q)) {
>>+ struct xe_lrc *primary_lrc = q->multi_queue.group->primary->lrc[0];
>>+
>>+ lrc->multi_queue.pos = q->multi_queue.pos;
>
>I think q->multi_queue.pos is not yet set for secondary queues at this point.
>It is set later in the xe_exec_queue_group_add() call.
Hmm, I recall seeing that a while ago, and that's why I leaned toward
having an lrc->q (a back-reference to the queue that the LRC belongs to),
but that's prohibited. Anyway, I will try to move this logic out to a
later point then.
>
>>+ lrc->multi_queue.primary_lrc = xe_lrc_get(primary_lrc);
>
>I think we don't need to get/put the primary_lrc reference.
>Each queue holds a reference to its LRC. The secondary queues hold a reference
>to the primary queue. So, essentially, the secondary LRCs are holding a reference
>to the primary LRC. So, I think, we don't need to hold a reference again.
I would look at the queue and LRC as different objects in memory.
In taking a reference here, I am addressing only one specific corner
case:
- Consider all secondary queues are freed
- Since primary queue holds a reference to all secondary LRCs, all
secondary LRCs are still in memory.
- When all references to primary queue are released, we end up in
xe_exec_queue_fini(primary_q).
In this function the LRC for primary_q is freed first, and then the
secondary_q lrcs are freed, so there is a small window of time where the
reference from secondary LRC to primary LRC is invalid. IMO, that small
window of time is enough for strange issues.
Umesh
>
>Niranjana
>
>>+ }
>>+ }
>>}
>>
>>/**
>>@@ -388,8 +401,12 @@ static int __xe_exec_queue_init(struct xe_exec_queue *q, u32 exec_queue_flags)
>>
>> xe_exec_queue_set_lrc(q, lrc, i);
>>
>>- if (__lrc)
>>+ if (__lrc) {
>>+ if (xe_exec_queue_is_multi_queue(q))
>>+ xe_lrc_put(__lrc->multi_queue.primary_lrc);
>>+
>> xe_lrc_put(__lrc);
>>+ }
>> __lrc = lrc;
>>
>> } while (marker != xe_vf_migration_fixups_complete_count(q->gt));
>>diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
>>index 97aef0327fc8..3d0bf4a7bfa0 100644
>>--- a/drivers/gpu/drm/xe/xe_lrc.h
>>+++ b/drivers/gpu/drm/xe/xe_lrc.h
>>@@ -91,6 +91,11 @@ static inline size_t xe_lrc_ring_size(void)
>> return SZ_16K;
>>}
>>
>>+static inline bool xe_lrc_is_multi_queue(struct xe_lrc *lrc)
>>+{
>>+ return lrc->multi_queue.primary_lrc;
>>+}
>>+
>>size_t xe_gt_lrc_hang_replay_size(struct xe_gt *gt, enum xe_engine_class class);
>>size_t xe_gt_lrc_size(struct xe_gt *gt, enum xe_engine_class class);
>>u32 xe_lrc_pphwsp_offset(struct xe_lrc *lrc);
>>diff --git a/drivers/gpu/drm/xe/xe_lrc_types.h b/drivers/gpu/drm/xe/xe_lrc_types.h
>>index 5a718f759ed6..0a5c13ec2ad7 100644
>>--- a/drivers/gpu/drm/xe/xe_lrc_types.h
>>+++ b/drivers/gpu/drm/xe/xe_lrc_types.h
>>@@ -63,6 +63,14 @@ struct xe_lrc {
>>
>> /** @ctx_timestamp: readout value of CTX_TIMESTAMP on last update */
>> u64 ctx_timestamp;
>>+
>>+ /** @multi_queue: Multi queue LRC related information */
>>+ struct {
>>+ /** @multi_queue.primary_lrc: Primary lrc of this multi-queue group*/
>>+ struct xe_lrc *primary_lrc;
>>+ /** @multi_queue.pos: Position of LRC within the multi-queue group */
>>+ u8 pos;
>>+ } multi_queue;
>>};
>>
>>struct xe_lrc_snapshot;
>>--
>>2.43.0
>>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 3/9] drm/xe/multi_queue: Store primary LRC and position info in LRC
2026-05-05 18:35 ` Umesh Nerlige Ramappa
@ 2026-05-05 18:45 ` Niranjana Vishwanathapura
2026-05-05 18:51 ` Umesh Nerlige Ramappa
0 siblings, 1 reply; 27+ messages in thread
From: Niranjana Vishwanathapura @ 2026-05-05 18:45 UTC (permalink / raw)
To: Umesh Nerlige Ramappa; +Cc: intel-xe, matthew.brost, stuart.summers
On Tue, May 05, 2026 at 11:35:23AM -0700, Umesh Nerlige Ramappa wrote:
>On Mon, May 04, 2026 at 08:46:10PM -0700, Niranjana Vishwanathapura wrote:
>>On Fri, May 01, 2026 at 05:53:36PM -0700, Umesh Nerlige Ramappa wrote:
>>>Given an LRC belonging to the secondary queue, in order to check if its
>>>context group is active, we need to check the LRC of the primary queue.
>>>In addition to that we want to compare the secondary queue position to
>>>CSMQDEBUG register to check if the queue itself is active.
>>>
>>>To do so, store primary LRC and position information in the LRC as well
>>>as take a reference to the primary LRC from each LRC in the queue group.
>>>
>>>A note on references involved:
>>>
>>>- In general the Queue takes a ref on its LRC.
>>>- In addition, for multi-queue,
>>>a. Primary Queue takes a ref for each Secondary LRC.
>>>b. Each Secondary Queue takes a ref to the Primary Queue
>>>
>>>In the current patch, each LRC in the queue group is storing a pointer
>>>to Primary LRC. There is a small window of time in the primary queue
>>>free path where the primary LRC may be freed before the secondary LRC.
>>>
>>>__xe_exec_queue_fini(q); // frees|puts primary q LRCs
>>>...
>>>window where secondary Q LRC is pointing to invalid primary LRC
>>>...
>>>__xe_exec_queue_free(q); // frees|puts secondary q LRCs in multi-Q case
>>>
>>>In this window the reference in Secondary LRC is invalid. While there
>>>may be nothing accessing the secondary LRCs reference, to be safe, this
>>>patch is taking a reference to Primary LRC from the secondary LRC.
>>>
>>>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>>>---
>>>v2:
>>>- Store primary LRC instead of primary queue (Niranjana)
>>>- Drop the valid flag and check if primary_lrc is NULL (Niranjana)
>>>- Document/Revisit references (Matt/Umesh)
>>>---
>>>drivers/gpu/drm/xe/xe_exec_queue.c | 23 ++++++++++++++++++++---
>>>drivers/gpu/drm/xe/xe_lrc.h | 5 +++++
>>>drivers/gpu/drm/xe/xe_lrc_types.h | 8 ++++++++
>>>3 files changed, 33 insertions(+), 3 deletions(-)
>>>
>>>diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
>>>index b287d0e0e60a..e34601d28520 100644
>>>--- a/drivers/gpu/drm/xe/xe_exec_queue.c
>>>+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
>>>@@ -129,8 +129,14 @@ static void xe_exec_queue_group_cleanup(struct xe_exec_queue *q)
>>> return;
>>>
>>> /* Primary queue cleanup */
>>>- xa_for_each(&group->xa, idx, lrc)
>>>+ xa_for_each(&group->xa, idx, lrc) {
>>>+ /* drop secondary lrc ref to primary lrc */
>>>+ xe_lrc_put(lrc->multi_queue.primary_lrc);
>>>+ /* drop primary queue ref to secondary lrc */
>>> xe_lrc_put(lrc);
>>>+ }
>>>+ /* drop primary lrc ref to itself */
>>>+ xe_lrc_put(q->lrc[0]);
>>>
>>> xa_destroy(&group->xa);
>>> mutex_destroy(&group->list_lock);
>>>@@ -275,8 +281,15 @@ static void xe_exec_queue_set_lrc(struct xe_exec_queue *q, struct xe_lrc *lrc, u
>>>{
>>> xe_assert(gt_to_xe(q->gt), idx < q->width);
>>>
>>>- scoped_guard(spinlock, &q->lrc_lookup_lock)
>>>+ scoped_guard(spinlock, &q->lrc_lookup_lock) {
>>> q->lrc[idx] = lrc;
>>>+ if (xe_exec_queue_is_multi_queue(q)) {
>>>+ struct xe_lrc *primary_lrc = q->multi_queue.group->primary->lrc[0];
>>>+
>>>+ lrc->multi_queue.pos = q->multi_queue.pos;
>>
>>I think q->multi_queue.pos is not yet set for secondary queues at this point.
>>It is set later in the xe_exec_queue_group_add() call.
>
>Hmm, I recall seeing that a while ago, and that's why I leaned toward
>having an lrc->q (a back-reference to the queue that the LRC belongs to),
>but that's prohibited. Anyway, I will try to move this logic out to a
>later point then.
>
>>
>>>+ lrc->multi_queue.primary_lrc = xe_lrc_get(primary_lrc);
>>
>>I think we don't need to get/put the primary_lrc reference.
>>Each queue holds a reference to its LRC. The secondary queues hold a reference
>>to the primary queue. So, essentially, the secondary LRCs are holding a reference
>>to the primary LRC. So, I think, we don't need to hold a reference again.
>
>I would look at the queue and LRC as different objects in memory.
>
>In taking a reference here, I am addressing only one specific corner
>case:
>
>- Consider all secondary queues are freed
>- Since primary queue holds a reference to all secondary LRCs, all
>secondary LRCs are still in memory.
>- When all references to primary queue are released, we end up in
>xe_exec_queue_fini(primary_q).
>
>In this function the LRC for primary_q is freed first, and then the
>secondary_q lrcs are freed, so there is a small window of time where
>the reference from secondary LRC to primary LRC is invalid. IMO, that
>small window of time is enough for strange issues.
>
By that time, all secondary queues have been deregistered and destroyed.
So, there shouldn't be any secondary LRC trying to access the primary
LRC. Maybe we can set primary_lrc to NULL while releasing the LRC
reference, which should be sane enough, and then we really don't need to
take a primary_lrc reference.
Niranjana
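As a rough illustration of the alternative being suggested here, clearing the back-pointer while releasing means no window can exist in which a secondary LRC points at a freed primary. This is a stand-alone sketch with toy types: struct lrc, lrc_put() and secondary_release() are stand-ins, not the xe driver's API (which uses kref-based xe_lrc_get()/xe_lrc_put()).

```c
#include <stdlib.h>

/* Toy refcounted LRC; the real driver uses a kref inside struct xe_lrc. */
struct lrc {
	int ref;
	struct lrc *primary_lrc;	/* back-pointer held by a secondary LRC */
};

static void lrc_put(struct lrc *l)
{
	if (l && --l->ref == 0)
		free(l);
}

/*
 * Clear the back-pointer before dropping the reference, so the
 * secondary LRC never holds a pointer to a possibly-freed primary.
 */
static void secondary_release(struct lrc *secondary)
{
	secondary->primary_lrc = NULL;	/* invalidate before any put */
	lrc_put(secondary);
}
```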
>Umesh
>>
>>Niranjana
>>
>>>+ }
>>>+ }
>>>}
>>>
>>>/**
>>>@@ -388,8 +401,12 @@ static int __xe_exec_queue_init(struct xe_exec_queue *q, u32 exec_queue_flags)
>>>
>>> xe_exec_queue_set_lrc(q, lrc, i);
>>>
>>>- if (__lrc)
>>>+ if (__lrc) {
>>>+ if (xe_exec_queue_is_multi_queue(q))
>>>+ xe_lrc_put(__lrc->multi_queue.primary_lrc);
>>>+
>>> xe_lrc_put(__lrc);
>>>+ }
>>> __lrc = lrc;
>>>
>>> } while (marker != xe_vf_migration_fixups_complete_count(q->gt));
>>>diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
>>>index 97aef0327fc8..3d0bf4a7bfa0 100644
>>>--- a/drivers/gpu/drm/xe/xe_lrc.h
>>>+++ b/drivers/gpu/drm/xe/xe_lrc.h
>>>@@ -91,6 +91,11 @@ static inline size_t xe_lrc_ring_size(void)
>>> return SZ_16K;
>>>}
>>>
>>>+static inline bool xe_lrc_is_multi_queue(struct xe_lrc *lrc)
>>>+{
>>>+ return lrc->multi_queue.primary_lrc;
>>>+}
>>>+
>>>size_t xe_gt_lrc_hang_replay_size(struct xe_gt *gt, enum xe_engine_class class);
>>>size_t xe_gt_lrc_size(struct xe_gt *gt, enum xe_engine_class class);
>>>u32 xe_lrc_pphwsp_offset(struct xe_lrc *lrc);
>>>diff --git a/drivers/gpu/drm/xe/xe_lrc_types.h b/drivers/gpu/drm/xe/xe_lrc_types.h
>>>index 5a718f759ed6..0a5c13ec2ad7 100644
>>>--- a/drivers/gpu/drm/xe/xe_lrc_types.h
>>>+++ b/drivers/gpu/drm/xe/xe_lrc_types.h
>>>@@ -63,6 +63,14 @@ struct xe_lrc {
>>>
>>> /** @ctx_timestamp: readout value of CTX_TIMESTAMP on last update */
>>> u64 ctx_timestamp;
>>>+
>>>+ /** @multi_queue: Multi queue LRC related information */
>>>+ struct {
>>>+ /** @multi_queue.primary_lrc: Primary lrc of this multi-queue group*/
>>>+ struct xe_lrc *primary_lrc;
>>>+ /** @multi_queue.pos: Position of LRC within the multi-queue group */
>>>+ u8 pos;
>>>+ } multi_queue;
>>>};
>>>
>>>struct xe_lrc_snapshot;
>>>--
>>>2.43.0
>>>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 3/9] drm/xe/multi_queue: Store primary LRC and position info in LRC
2026-05-05 18:45 ` Niranjana Vishwanathapura
@ 2026-05-05 18:51 ` Umesh Nerlige Ramappa
0 siblings, 0 replies; 27+ messages in thread
From: Umesh Nerlige Ramappa @ 2026-05-05 18:51 UTC (permalink / raw)
To: Niranjana Vishwanathapura; +Cc: intel-xe, matthew.brost, stuart.summers
On Tue, May 05, 2026 at 11:45:02AM -0700, Niranjana Vishwanathapura wrote:
>On Tue, May 05, 2026 at 11:35:23AM -0700, Umesh Nerlige Ramappa wrote:
>>On Mon, May 04, 2026 at 08:46:10PM -0700, Niranjana Vishwanathapura wrote:
>>>On Fri, May 01, 2026 at 05:53:36PM -0700, Umesh Nerlige Ramappa wrote:
>>>>Given an LRC belonging to the secondary queue, in order to check if its
>>>>context group is active, we need to check the LRC of the primary queue.
>>>>In addition to that we want to compare the secondary queue position to
>>>>CSMQDEBUG register to check if the queue itself is active.
>>>>
>>>>To do so, store primary LRC and position information in the LRC as well
>>>>as take a reference to the primary LRC from each LRC in the queue group.
>>>>
>>>>A note on references involved:
>>>>
>>>>- In general the Queue takes a ref on its LRC.
>>>>- In addition, for multi-queue,
>>>>a. Primary Queue takes a ref for each Secondary LRC.
>>>>b. Each Secondary Queue takes a ref to the Primary Queue
>>>>
>>>>In the current patch, each LRC in the queue group is storing a pointer
>>>>to Primary LRC. There is a small window of time in the primary queue
>>>>free path where the primary LRC may be freed before the secondary LRC.
>>>>
>>>>__xe_exec_queue_fini(q); // frees|puts primary q LRCs
>>>>...
>>>>window where secondary Q LRC is pointing to invalid primary LRC
>>>>...
>>>>__xe_exec_queue_free(q); // frees|puts secondary q LRCs in multi-Q case
>>>>
>>>>In this window the reference in Secondary LRC is invalid. While there
>>>>may be nothing accessing the secondary LRCs reference, to be safe, this
>>>>patch is taking a reference to Primary LRC from the secondary LRC.
>>>>
>>>>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>>>>---
>>>>v2:
>>>>- Store primary LRC instead of primary queue (Niranjana)
>>>>- Drop the valid flag and check if primary_lrc is NULL (Niranjana)
>>>>- Document/Revisit references (Matt/Umesh)
>>>>---
>>>>drivers/gpu/drm/xe/xe_exec_queue.c | 23 ++++++++++++++++++++---
>>>>drivers/gpu/drm/xe/xe_lrc.h | 5 +++++
>>>>drivers/gpu/drm/xe/xe_lrc_types.h | 8 ++++++++
>>>>3 files changed, 33 insertions(+), 3 deletions(-)
>>>>
>>>>diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
>>>>index b287d0e0e60a..e34601d28520 100644
>>>>--- a/drivers/gpu/drm/xe/xe_exec_queue.c
>>>>+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
>>>>@@ -129,8 +129,14 @@ static void xe_exec_queue_group_cleanup(struct xe_exec_queue *q)
>>>> return;
>>>>
>>>> /* Primary queue cleanup */
>>>>- xa_for_each(&group->xa, idx, lrc)
>>>>+ xa_for_each(&group->xa, idx, lrc) {
>>>>+ /* drop secondary lrc ref to primary lrc */
>>>>+ xe_lrc_put(lrc->multi_queue.primary_lrc);
>>>>+ /* drop primary queue ref to secondary lrc */
>>>> xe_lrc_put(lrc);
>>>>+ }
>>>>+ /* drop primary lrc ref to itself */
>>>>+ xe_lrc_put(q->lrc[0]);
>>>>
>>>> xa_destroy(&group->xa);
>>>> mutex_destroy(&group->list_lock);
>>>>@@ -275,8 +281,15 @@ static void xe_exec_queue_set_lrc(struct xe_exec_queue *q, struct xe_lrc *lrc, u
>>>>{
>>>> xe_assert(gt_to_xe(q->gt), idx < q->width);
>>>>
>>>>- scoped_guard(spinlock, &q->lrc_lookup_lock)
>>>>+ scoped_guard(spinlock, &q->lrc_lookup_lock) {
>>>> q->lrc[idx] = lrc;
>>>>+ if (xe_exec_queue_is_multi_queue(q)) {
>>>>+ struct xe_lrc *primary_lrc = q->multi_queue.group->primary->lrc[0];
>>>>+
>>>>+ lrc->multi_queue.pos = q->multi_queue.pos;
>>>
>>>I think q->multi_queue.pos is not yet set for secondary queues at this point.
>>>It is set later in the xe_exec_queue_group_add() call.
>>
>>Hmm, I recall seeing that a while ago and that's why I leaned toward
>>having an lrc->q (back reference to the queue that the LRC belongs
>>to), but that's prohibited. Anyways, I will try to move this logic
>>out to a later point then.
>>
>>>
>>>>+ lrc->multi_queue.primary_lrc = xe_lrc_get(primary_lrc);
>>>
>>>I think we don't need to get/put the primary_lrc reference.
>>>Each queue holds a reference to its LRC. The secondary queues hold a reference
>>>to the primary queue. So, essentially, the secondary LRCs are holding a reference
>>>to the primary LRC. So, I think, we don't need to hold the reference again.
>>
>>I would look at the queue and LRC as different objects in memory.
>>
>>In taking a reference here, I am addressing only one specific corner
>>case:
>>
>>- Consider all secondary queues are freed
>>- Since primary queue holds a reference to all secondary LRCs, all
>>secondary LRCs are still in memory.
>>- When all references to primary queue are released, we end up in
>>xe_exec_queue_fini(primary_q).
>>
>>In this function the LRC for primary_q is freed first, and then the
>>secondary_q lrcs are freed, so there is a small window of time where
>>the reference from secondary LRC to primary LRC is invalid. IMO,
>>that small window of time is enough to cause strange issues.
>>
>
>By that time, all secondary queues have been deregistered and destroyed.
>So, there shouldn't be any secondary LRC trying to access the primary
>LRC. Maybe we can set primary_lrc to NULL while releasing the LRC
>reference, which should be sane enough, and then we really don't need
>to take the primary_lrc reference.
ok, that's right, I will drop this reference and update the commit
message.
Thanks,
Umesh
>
>Niranjana
>
>>Umesh
>>>
>>>Niranjana
>>>
>>>>+ }
>>>>+ }
>>>>}
>>>>
>>>>/**
>>>>@@ -388,8 +401,12 @@ static int __xe_exec_queue_init(struct xe_exec_queue *q, u32 exec_queue_flags)
>>>>
>>>> xe_exec_queue_set_lrc(q, lrc, i);
>>>>
>>>>- if (__lrc)
>>>>+ if (__lrc) {
>>>>+ if (xe_exec_queue_is_multi_queue(q))
>>>>+ xe_lrc_put(__lrc->multi_queue.primary_lrc);
>>>>+
>>>> xe_lrc_put(__lrc);
>>>>+ }
>>>> __lrc = lrc;
>>>>
>>>> } while (marker != xe_vf_migration_fixups_complete_count(q->gt));
>>>>diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
>>>>index 97aef0327fc8..3d0bf4a7bfa0 100644
>>>>--- a/drivers/gpu/drm/xe/xe_lrc.h
>>>>+++ b/drivers/gpu/drm/xe/xe_lrc.h
>>>>@@ -91,6 +91,11 @@ static inline size_t xe_lrc_ring_size(void)
>>>> return SZ_16K;
>>>>}
>>>>
>>>>+static inline bool xe_lrc_is_multi_queue(struct xe_lrc *lrc)
>>>>+{
>>>>+ return lrc->multi_queue.primary_lrc;
>>>>+}
>>>>+
>>>>size_t xe_gt_lrc_hang_replay_size(struct xe_gt *gt, enum xe_engine_class class);
>>>>size_t xe_gt_lrc_size(struct xe_gt *gt, enum xe_engine_class class);
>>>>u32 xe_lrc_pphwsp_offset(struct xe_lrc *lrc);
>>>>diff --git a/drivers/gpu/drm/xe/xe_lrc_types.h b/drivers/gpu/drm/xe/xe_lrc_types.h
>>>>index 5a718f759ed6..0a5c13ec2ad7 100644
>>>>--- a/drivers/gpu/drm/xe/xe_lrc_types.h
>>>>+++ b/drivers/gpu/drm/xe/xe_lrc_types.h
>>>>@@ -63,6 +63,14 @@ struct xe_lrc {
>>>>
>>>> /** @ctx_timestamp: readout value of CTX_TIMESTAMP on last update */
>>>> u64 ctx_timestamp;
>>>>+
>>>>+ /** @multi_queue: Multi queue LRC related information */
>>>>+ struct {
>>>>+ /** @multi_queue.primary_lrc: Primary lrc of this multi-queue group*/
>>>>+ struct xe_lrc *primary_lrc;
>>>>+ /** @multi_queue.pos: Position of LRC within the multi-queue group */
>>>>+ u8 pos;
>>>>+ } multi_queue;
>>>>};
>>>>
>>>>struct xe_lrc_snapshot;
>>>>--
>>>>2.43.0
>>>>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v2 6/9] drm/xe/multi_queue: Capture queue run times for active queues
2026-05-05 4:12 ` Niranjana Vishwanathapura
@ 2026-05-05 19:02 ` Umesh Nerlige Ramappa
0 siblings, 0 replies; 27+ messages in thread
From: Umesh Nerlige Ramappa @ 2026-05-05 19:02 UTC (permalink / raw)
To: Niranjana Vishwanathapura; +Cc: intel-xe, matthew.brost, stuart.summers
On Mon, May 04, 2026 at 09:12:12PM -0700, Niranjana Vishwanathapura wrote:
>On Fri, May 01, 2026 at 05:53:39PM -0700, Umesh Nerlige Ramappa wrote:
>>If a queue is currently active on the CS, query the QUEUE TIMESTAMP
>>register to get an up-to-date value of the runtime. To do so, ensure
>>that the primary queue is active and then check if the secondary queue
>>is executing on the CS.
>>
>>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>>---
>>v2:
>>- Move trace to a separate patch (Stuart)
>>- Refactor multi queue timestamp logic (Matt/Niranjana)
>>---
>>drivers/gpu/drm/xe/regs/xe_engine_regs.h | 4 +
>>drivers/gpu/drm/xe/xe_lrc.c | 115 +++++++++++++++++++----
>>2 files changed, 99 insertions(+), 20 deletions(-)
>>
>>diff --git a/drivers/gpu/drm/xe/regs/xe_engine_regs.h b/drivers/gpu/drm/xe/regs/xe_engine_regs.h
>>index 1b4a7e9a703d..af6af6f3f5e8 100644
>>--- a/drivers/gpu/drm/xe/regs/xe_engine_regs.h
>>+++ b/drivers/gpu/drm/xe/regs/xe_engine_regs.h
>>@@ -170,6 +170,10 @@
>>#define GFX_MSIX_INTERRUPT_ENABLE REG_BIT(13)
>>
>>#define RING_CSMQDEBUG(base) XE_REG((base) + 0x2b0)
>>+#define CURRENT_ACTIVE_QUEUE_ID_MASK REG_GENMASK(7, 0)
>>+
>>+#define RING_QUEUE_TIMESTAMP(base) XE_REG((base) + 0x4c0)
>>+#define RING_QUEUE_TIMESTAMP_UDW(base) XE_REG((base) + 0x4c0 + 4)
>>
>>#define RING_TIMESTAMP(base) XE_REG((base) + 0x358)
>>
>>diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
>>index 92419e5058fd..023202be5d52 100644
>>--- a/drivers/gpu/drm/xe/xe_lrc.c
>>+++ b/drivers/gpu/drm/xe/xe_lrc.c
>>@@ -21,6 +21,7 @@
>>#include "xe_configfs.h"
>>#include "xe_device.h"
>>#include "xe_drm_client.h"
>>+#include "xe_exec_queue.h"
>>#include "xe_exec_queue_types.h"
>>#include "xe_gt.h"
>>#include "xe_gt_clock.h"
>>@@ -2655,17 +2656,65 @@ static int get_ctx_timestamp(struct xe_lrc *lrc, u32 engine_id, u64 *reg_ctx_ts)
>> return 0;
>>}
>>
>>-/**
>>- * xe_lrc_timestamp() - Current ctx timestamp
>>- * @lrc: Pointer to the lrc.
>>- *
>>- * Return latest ctx timestamp. With support for active contexts, the
>>- * calculation may be slightly racy, so follow a read-again logic to ensure that
>>- * the context is still active before returning the right timestamp.
>>- *
>>- * Returns: New ctx timestamp value
>>- */
>>-u64 xe_lrc_timestamp(struct xe_lrc *lrc)
>>+static u64 get_queue_timestamp(struct xe_hw_engine *hwe)
>>+{
>>+ return xe_mmio_read64_2x32(&hwe->gt->mmio,
>>+ RING_QUEUE_TIMESTAMP(hwe->mmio_base));
>>+}
>>+
>>+static u32 get_queue_id(struct xe_hw_engine *hwe)
>>+{
>
>NIT...maybe rename it to get_multi_queue_active_queue()?
>
>>+ u32 val = xe_mmio_read32(&hwe->gt->mmio,
>>+ RING_CSMQDEBUG(hwe->mmio_base));
>>+
>>+ return REG_FIELD_GET(CURRENT_ACTIVE_QUEUE_ID_MASK, val);
>>+}
>>+
>>+static bool context_active(struct xe_lrc *lrc)
>>+{
>>+ return xe_lrc_ctx_timestamp(lrc) == CONTEXT_ACTIVE;
>>+}
>>+
>>+static u64 xe_lrc_multi_queue_timestamp(struct xe_lrc *lrc)
>>+{
>>+ struct xe_lrc *primary_lrc = lrc->multi_queue.primary_lrc;
>>+ struct xe_hw_engine *hwe;
>>+ u64 reg_queue_ts = lrc->queue_timestamp;
>>+
>>+ if (IS_SRIOV_VF(lrc_to_xe(lrc)))
>>+ return xe_lrc_queue_timestamp(lrc);
>>+
>>+ if (!primary_lrc || !context_active(primary_lrc))
>>+ return xe_lrc_queue_timestamp(lrc);
>
>Maybe print a warning if primary_lrc is not set?
I can replace it with an assert(primary_lrc), because we only get here
after checking multi_queue. Otherwise the check is there only to make
sure we don't hit an NPD in context_active(primary_lrc).
>
>>+
>>+ /* WA BB populates engine id in PPHWSP of primary context only */
>>+ hwe = engine_id_to_hwe(primary_lrc->gt, xe_lrc_engine_id(primary_lrc));
>>+ if (!hwe)
>>+ return xe_lrc_queue_timestamp(lrc);
>>+
>>+ if (get_queue_id(hwe) != lrc->multi_queue.pos)
>>+ return xe_lrc_queue_timestamp(lrc);
>>+
>>+ /* queue is active, so store the queue timestamp register */
>>+ reg_queue_ts = get_queue_timestamp(hwe);
>>+
>>+ /* double check queue and primary queue are both still active */
>>+ if (get_queue_id(hwe) != lrc->multi_queue.pos ||
>>+ !context_active(primary_lrc))
>>+ return xe_lrc_queue_timestamp(lrc);
>>+
>>+ return reg_queue_ts;
>>+}
>>+
>>+static u64 xe_lrc_update_multi_queue_timestamp(struct xe_lrc *lrc, u64 *old_ts)
>>+{
>>+ *old_ts = lrc->queue_timestamp;
>>+ lrc->queue_timestamp = xe_lrc_multi_queue_timestamp(lrc);
>
>Same here, add a warning message if lrc is not multi-queue?
The function is only called if the LRC is multi-queue.
>
>>+
>>+ return lrc->queue_timestamp;
>>+}
>>+
>>+static u64 xe_lrc_single_queue_timestamp(struct xe_lrc *lrc)
>>{
>
>Hmm...NIT...we never used the word 'single_queue' so far to refer to
>the non-multi-queue case. Maybe rename it to xe_lrc_ctx_timestamp()
>or something?
Will try to rename
>
>Also, maybe update this function to make use of the newly added
>context_active() function?
I had given it a shot earlier and it became slightly messy. Will try
again.
Umesh
>
>Niranjana
>
>> u64 lrc_ts, reg_ts, new_ts = lrc->ctx_timestamp;
>> u32 engine_id;
>>@@ -2697,24 +2746,50 @@ u64 xe_lrc_timestamp(struct xe_lrc *lrc)
>> return new_ts;
>>}
>>
>>+static u64 xe_lrc_update_ctx_timestamp(struct xe_lrc *lrc, u64 *old_ts)
>>+{
>>+ *old_ts = lrc->ctx_timestamp;
>>+ lrc->ctx_timestamp = xe_lrc_single_queue_timestamp(lrc);
>>+
>>+ trace_xe_lrc_update_timestamp(lrc, *old_ts);
>>+
>>+ return lrc->ctx_timestamp;
>>+}
>>+
>>/**
>>- * xe_lrc_update_timestamp() - Update ctx timestamp
>>+ * xe_lrc_timestamp() - Current lrc timestamp
>>+ * @lrc: Pointer to the lrc.
>>+ *
>>+ * Return latest lrc timestamp. With support for active contexts/queues, the
>>+ * calculation may be slightly racy, so follow a read-again logic to ensure that
>>+ * the context/queue is still active before returning the right timestamp.
>>+ *
>>+ * Returns: New lrc timestamp value
>>+ */
>>+u64 xe_lrc_timestamp(struct xe_lrc *lrc)
>>+{
>>+ if (xe_lrc_is_multi_queue(lrc))
>>+ return xe_lrc_multi_queue_timestamp(lrc);
>>+ else
>>+ return xe_lrc_single_queue_timestamp(lrc);
>>+}
>>+
>>+/**
>>+ * xe_lrc_update_timestamp() - Update lrc timestamp
>> * @lrc: Pointer to the lrc.
>> * @old_ts: Old timestamp value
>> *
>>- * Populate @old_ts current saved ctx timestamp, read new ctx timestamp and
>>+ * Populate @old_ts with current saved lrc timestamp, read new lrc timestamp and
>> * update saved value.
>> *
>>- * Returns: New ctx timestamp value
>>+ * Returns: New lrc timestamp value
>> */
>>u64 xe_lrc_update_timestamp(struct xe_lrc *lrc, u64 *old_ts)
>>{
>>- *old_ts = lrc->ctx_timestamp;
>>- lrc->ctx_timestamp = xe_lrc_timestamp(lrc);
>>-
>>- trace_xe_lrc_update_timestamp(lrc, *old_ts);
>>-
>>- return lrc->ctx_timestamp;
>>+ if (xe_lrc_is_multi_queue(lrc))
>>+ return xe_lrc_update_multi_queue_timestamp(lrc, old_ts);
>>+ else
>>+ return xe_lrc_update_ctx_timestamp(lrc, old_ts);
>>}
>>
>>/**
>>--
>>2.43.0
>>
* Re: [PATCH v2 9/9] drm/xe/multi_queue: Whitelist QUEUE_TIMESTAMP register
2026-05-05 18:34 ` Niranjana Vishwanathapura
@ 2026-05-05 19:06 ` Umesh Nerlige Ramappa
0 siblings, 0 replies; 27+ messages in thread
From: Umesh Nerlige Ramappa @ 2026-05-05 19:06 UTC (permalink / raw)
To: Niranjana Vishwanathapura; +Cc: intel-xe, matthew.brost, stuart.summers
On Tue, May 05, 2026 at 11:34:38AM -0700, Niranjana Vishwanathapura wrote:
>On Tue, May 05, 2026 at 10:58:42AM -0700, Umesh Nerlige Ramappa wrote:
>>On Mon, May 04, 2026 at 09:25:29PM -0700, Niranjana Vishwanathapura wrote:
>>>On Fri, May 01, 2026 at 05:53:42PM -0700, Umesh Nerlige Ramappa wrote:
>>>>In a multi-queue use case, when a job is running on the secondary queue,
>>>>the CTX_TIMESTAMP does not reflect the queue's run ticks. Instead, we use
>>>>the QUEUE TIMESTAMP to check how long the job ran. For user space to see
>>>>the run ticks for a secondary queue, whitelist the QUEUE_TIMESTAMP
>>>>register.
>>>>
>>>>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>>>>---
>>>>v2: Whitelist QUEUE_TIMESTAMP only for copy and compute engines (Niranjana)
>>>>---
>>>>drivers/gpu/drm/xe/xe_reg_whitelist.c | 14 ++++++++++++++
>>>>1 file changed, 14 insertions(+)
>>>>
>>>>diff --git a/drivers/gpu/drm/xe/xe_reg_whitelist.c b/drivers/gpu/drm/xe/xe_reg_whitelist.c
>>>>index 80577e4b7437..37d6ac720d5c 100644
>>>>--- a/drivers/gpu/drm/xe/xe_reg_whitelist.c
>>>>+++ b/drivers/gpu/drm/xe/xe_reg_whitelist.c
>>>>@@ -33,6 +33,14 @@ static bool match_has_mert(const struct xe_device *xe,
>>>> return xe_device_has_mert((struct xe_device *)xe);
>>>>}
>>>>
>>>>+static bool match_multiq_class(const struct xe_device *xe,
>>>>+ const struct xe_gt *gt,
>>>>+ const struct xe_hw_engine *hwe)
>>>
>>>To be consistent, we have used 'multi_queue' as the naming convention
>>>and avoided 'multiq'. It would be good to keep that consistency
>>>everywhere.
>>
>>sure, will change that
>>>
>>>>+{
>>>>+ return hwe->class == XE_ENGINE_CLASS_COMPUTE ||
>>>>+ hwe->class == XE_ENGINE_CLASS_COPY;
>>>
>>>We already have the xe_exec_queue_supports_multi_queue() function
>>>which does a similar check. I think we need to abstract it out into
>>>a single function and use that, instead of having multiple places in
>>>the code that determine whether a class supports multi-queue or not.
>>
>>The whitelist is applied during engine init at driver load, so we
>>don't have any exec queues yet. Instead, should we derive supported
>>engines from xe_graphics_desc.multi_queue_engine_class_mask and
>>check?
>>
>
>Yah, xe_exec_queue_supports_multi_queue() also uses it. So, maybe add a
>xe_gt_supports_multi_queue(struct xe_gt *gt, enum xe_engine_class class)
>in xe_gt.h and use that both here and in xe_exec_queue_supports_multi_queue().
Oh, I didn't read your comment right earlier. That's what you were
already suggesting. Will do.
Umesh
>
>Niranjana
>
>>Umesh
>>
>>>
>>>Niranjana
>>>
>>>>+}
>>>>+
>>>>static const struct xe_rtp_entry_sr register_whitelist[] = {
>>>> { XE_RTP_NAME("WaAllowPMDepthAndInvocationCountAccessFromUMD, 1408556865"),
>>>> XE_RTP_RULES(GRAPHICS_VERSION_RANGE(1200, 1210), ENGINE_CLASS(RENDER)),
>>>>@@ -54,6 +62,12 @@ static const struct xe_rtp_entry_sr register_whitelist[] = {
>>>> RING_FORCE_TO_NONPRIV_ACCESS_RD,
>>>> XE_RTP_ACTION_FLAG(ENGINE_BASE)))
>>>> },
>>>>+ { XE_RTP_NAME("allow_read_queue_timestamp"),
>>>>+ XE_RTP_RULES(GRAPHICS_VERSION_RANGE(3500, 3511), FUNC(match_multiq_class)),
>>>>+ XE_RTP_ACTIONS(WHITELIST(RING_QUEUE_TIMESTAMP(0),
>>>>+ RING_FORCE_TO_NONPRIV_ACCESS_RD,
>>>>+ XE_RTP_ACTION_FLAG(ENGINE_BASE)))
>>>>+ },
>>>> { XE_RTP_NAME("16014440446"),
>>>> XE_RTP_RULES(PLATFORM(PVC)),
>>>> XE_RTP_ACTIONS(WHITELIST(XE_REG(0x4400),
>>>>--
>>>>2.43.0
>>>>
end of thread, other threads:[~2026-05-05 19:06 UTC | newest]
Thread overview: 27+ messages
-- links below jump to the message on this page --
2026-05-02 0:53 [PATCH v2 0/9] Support run ticks for multi-queue use case Umesh Nerlige Ramappa
2026-05-02 0:53 ` [PATCH v2 1/9] drm/xe/lrc: Use 64 bit ctx timestamp in the LRC snapshot Umesh Nerlige Ramappa
2026-05-04 23:51 ` Niranjana Vishwanathapura
2026-05-02 0:53 ` [PATCH v2 2/9] drm/xe: Add timestamp_ms to " Umesh Nerlige Ramappa
2026-05-04 23:59 ` Niranjana Vishwanathapura
2026-05-05 18:03 ` Umesh Nerlige Ramappa
2026-05-02 0:53 ` [PATCH v2 3/9] drm/xe/multi_queue: Store primary LRC and position info in LRC Umesh Nerlige Ramappa
2026-05-05 3:46 ` Niranjana Vishwanathapura
2026-05-05 18:35 ` Umesh Nerlige Ramappa
2026-05-05 18:45 ` Niranjana Vishwanathapura
2026-05-05 18:51 ` Umesh Nerlige Ramappa
2026-05-02 0:53 ` [PATCH v2 4/9] drm/xe/multi_queue: Add helpers to access CS QUEUE TIMESTAMP from lrc Umesh Nerlige Ramappa
2026-05-05 4:00 ` Niranjana Vishwanathapura
2026-05-02 0:53 ` [PATCH v2 5/9] drm/xe/lrc: Refactor out engine id to hwe conversion Umesh Nerlige Ramappa
2026-05-05 4:16 ` Niranjana Vishwanathapura
2026-05-02 0:53 ` [PATCH v2 6/9] drm/xe/multi_queue: Capture queue run times for active queues Umesh Nerlige Ramappa
2026-05-05 4:12 ` Niranjana Vishwanathapura
2026-05-05 19:02 ` Umesh Nerlige Ramappa
2026-05-02 0:53 ` [PATCH v2 7/9] drm/xe/multi_queue: Add trace event for the multi queue timestamp Umesh Nerlige Ramappa
2026-05-05 4:19 ` Niranjana Vishwanathapura
2026-05-02 0:53 ` [PATCH v2 8/9] drm/xe/multi_queue: Use QUEUE_TIMESTAMP as job timestamp for multi-queue Umesh Nerlige Ramappa
2026-05-05 4:20 ` Niranjana Vishwanathapura
2026-05-02 0:53 ` [PATCH v2 9/9] drm/xe/multi_queue: Whitelist QUEUE_TIMESTAMP register Umesh Nerlige Ramappa
2026-05-05 4:25 ` Niranjana Vishwanathapura
2026-05-05 17:58 ` Umesh Nerlige Ramappa
2026-05-05 18:34 ` Niranjana Vishwanathapura
2026-05-05 19:06 ` Umesh Nerlige Ramappa