public inbox for intel-xe@lists.freedesktop.org
 help / color / mirror / Atom feed
From: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
To: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: <intel-xe@lists.freedesktop.org>, <matthew.brost@intel.com>,
	<stuart.summers@intel.com>
Subject: Re: [PATCH v2 3/9] drm/xe/multi_queue: Store primary LRC and position info in LRC
Date: Tue, 5 May 2026 11:45:02 -0700	[thread overview]
Message-ID: <afo6ru4u83dDRJFC@nvishwa1-desk> (raw)
In-Reply-To: <afo4axPANAfvgzW7@soc-5CG1426VCC.clients.intel.com>

On Tue, May 05, 2026 at 11:35:23AM -0700, Umesh Nerlige Ramappa wrote:
>On Mon, May 04, 2026 at 08:46:10PM -0700, Niranjana Vishwanathapura wrote:
>>On Fri, May 01, 2026 at 05:53:36PM -0700, Umesh Nerlige Ramappa wrote:
>>>Given an LRC belonging to the secondary queue, in order to check if its
>>>context group is active, we need to check the LRC of the primary queue.
>>>In addition to that we want to compare the secondary queue position to
>>>CSMQDEBUG register to check if the queue itself is active.
>>>
>>>To do so, store primary LRC and position information in the LRC as well
>>>as take a reference to the primary LRC from each LRC in the queue group.
>>>
>>>A note on references involved:
>>>
>>>- In general the Queue takes a ref on its LRC.
>>>- In addition, for multi-queue,
>>>a. Primary Queue takes a ref for each Secondary LRC.
>>>b. Each Secondary Queue takes a ref to the Primary Queue
>>>
>>>In the current patch, each LRC in the queue group is storing a pointer
>>>to Primary LRC. There is a small window of time in the primary queue
>>>free path where the primary LRC may be freed before the secondary LRC.
>>>
>>>__xe_exec_queue_fini(q); // frees|puts primary q LRCs
>>>...
>>>window where secondary Q LRC is pointing to invalid primary LRC
>>>...
>>>__xe_exec_queue_free(q); // frees|puts secondary q LRCs in multi-Q case
>>>
>>>In this window the reference in Secondary LRC is invalid. While there
>>>may be nothing accessing the secondary LRCs reference, to be safe, this
>>>patch is taking a reference to Primary LRC from the secondary LRC.
>>>
>>>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>>>---
>>>v2:
>>>- Store primary LRC instead of primary queue (Niranjana)
>>>- Drop the valid flag and check if primary_lrc is NULL (Niranjana)
>>>- Document/Revisit references (Matt/Umesh)
>>>---
>>>drivers/gpu/drm/xe/xe_exec_queue.c | 23 ++++++++++++++++++++---
>>>drivers/gpu/drm/xe/xe_lrc.h        |  5 +++++
>>>drivers/gpu/drm/xe/xe_lrc_types.h  |  8 ++++++++
>>>3 files changed, 33 insertions(+), 3 deletions(-)
>>>
>>>diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
>>>index b287d0e0e60a..e34601d28520 100644
>>>--- a/drivers/gpu/drm/xe/xe_exec_queue.c
>>>+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
>>>@@ -129,8 +129,14 @@ static void xe_exec_queue_group_cleanup(struct xe_exec_queue *q)
>>>		return;
>>>
>>>	/* Primary queue cleanup */
>>>-	xa_for_each(&group->xa, idx, lrc)
>>>+	xa_for_each(&group->xa, idx, lrc) {
>>>+		/* drop secondary lrc ref to primary lrc */
>>>+		xe_lrc_put(lrc->multi_queue.primary_lrc);
>>>+		/* drop primary queue ref to secondary lrc */
>>>		xe_lrc_put(lrc);
>>>+	}
>>>+	/* drop primary lrc ref to itself */
>>>+	xe_lrc_put(q->lrc[0]);
>>>
>>>	xa_destroy(&group->xa);
>>>	mutex_destroy(&group->list_lock);
>>>@@ -275,8 +281,15 @@ static void xe_exec_queue_set_lrc(struct xe_exec_queue *q, struct xe_lrc *lrc, u
>>>{
>>>	xe_assert(gt_to_xe(q->gt), idx < q->width);
>>>
>>>-	scoped_guard(spinlock, &q->lrc_lookup_lock)
>>>+	scoped_guard(spinlock, &q->lrc_lookup_lock) {
>>>		q->lrc[idx] = lrc;
>>>+		if (xe_exec_queue_is_multi_queue(q)) {
>>>+			struct xe_lrc *primary_lrc = q->multi_queue.group->primary->lrc[0];
>>>+
>>>+			lrc->multi_queue.pos = q->multi_queue.pos;
>>
>>I think q->multi_queue.pos is not yet set for secondary queues at this point.
>>It is set later in the xe_exec_queue_group_add() call.
>
>Hmm, I recall seeing that a while ago and that's why I leaned toward 
>having a lrc->q (back reference to the queue that the LRC belongs to, 
>but that's prohibited). Anyways, I will try to move this logic outsize 
>to a late point then.
>
>>
>>>+			lrc->multi_queue.primary_lrc = xe_lrc_get(primary_lrc);
>>
>>I think we don't need to get/put the primary_lrc reference.
>>Each queue holds a reference to its LRC. The secondary queues holds a reference
>>to the primary queue. So, essentially, the secondary LRCs are holding a reference
>>to primary lrc. So, I think, we don't need to hold reference again.
>
>I would look at the queue and LRC as different objects in memory.
>
>In taking a reference here, I am addressing only one specific corner 
>case:
>
>- Consider all secondary queues are freed
>- Since primary queue holds a reference to all secondary LRCs, all   
>secondary LRCs are still in memory.
>- When all references to primary queue are released, we end up in   
>xe_exec_queue_fini(primary_q).
>
>In this function the LRC for primary_q is freed first, and then the 
>secondary_q lrcs are freed, so there is a small window of time where 
>the reference from secondary LRC to primary LRC is invalid. IMO, that 
>small windows of time is enough for strange issues.
>

By that time, all secondary queues have been deregistered and destroyed.
So, there shouldn't be any secondary LRC trying to access the primary
LRC. May be we can set primary_lrc to NULL while releaseing LRC reference
which should be sane enough and we really don't need primary_lrc reference
taking.

Niranjana

>Umesh
>>
>>Niranjana
>>
>>>+		}
>>>+	}
>>>}
>>>
>>>/**
>>>@@ -388,8 +401,12 @@ static int __xe_exec_queue_init(struct xe_exec_queue *q, u32 exec_queue_flags)
>>>
>>>			xe_exec_queue_set_lrc(q, lrc, i);
>>>
>>>-			if (__lrc)
>>>+			if (__lrc) {
>>>+				if (xe_exec_queue_is_multi_queue(q))
>>>+					xe_lrc_put(__lrc->multi_queue.primary_lrc);
>>>+
>>>				xe_lrc_put(__lrc);
>>>+			}
>>>			__lrc = lrc;
>>>
>>>		} while (marker != xe_vf_migration_fixups_complete_count(q->gt));
>>>diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
>>>index 97aef0327fc8..3d0bf4a7bfa0 100644
>>>--- a/drivers/gpu/drm/xe/xe_lrc.h
>>>+++ b/drivers/gpu/drm/xe/xe_lrc.h
>>>@@ -91,6 +91,11 @@ static inline size_t xe_lrc_ring_size(void)
>>>	return SZ_16K;
>>>}
>>>
>>>+static inline bool xe_lrc_is_multi_queue(struct xe_lrc *lrc)
>>>+{
>>>+	return lrc->multi_queue.primary_lrc;
>>>+}
>>>+
>>>size_t xe_gt_lrc_hang_replay_size(struct xe_gt *gt, enum xe_engine_class class);
>>>size_t xe_gt_lrc_size(struct xe_gt *gt, enum xe_engine_class class);
>>>u32 xe_lrc_pphwsp_offset(struct xe_lrc *lrc);
>>>diff --git a/drivers/gpu/drm/xe/xe_lrc_types.h b/drivers/gpu/drm/xe/xe_lrc_types.h
>>>index 5a718f759ed6..0a5c13ec2ad7 100644
>>>--- a/drivers/gpu/drm/xe/xe_lrc_types.h
>>>+++ b/drivers/gpu/drm/xe/xe_lrc_types.h
>>>@@ -63,6 +63,14 @@ struct xe_lrc {
>>>
>>>	/** @ctx_timestamp: readout value of CTX_TIMESTAMP on last update */
>>>	u64 ctx_timestamp;
>>>+
>>>+	/** @multi_queue: Multi queue LRC related information */
>>>+	struct {
>>>+		/** @multi_queue.primary_lrc: Primary lrc of this multi-queue group*/
>>>+		struct xe_lrc *primary_lrc;
>>>+		/** @multi_queue.pos: Position of LRC within the multi-queue group */
>>>+		u8 pos;
>>>+	} multi_queue;
>>>};
>>>
>>>struct xe_lrc_snapshot;
>>>-- 
>>>2.43.0
>>>

  reply	other threads:[~2026-05-05 18:45 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-02  0:53 [PATCH v2 0/9] Support run ticks for multi-queue use case Umesh Nerlige Ramappa
2026-05-02  0:53 ` [PATCH v2 1/9] drm/xe/lrc: Use 64 bit ctx timestamp in the LRC snapshot Umesh Nerlige Ramappa
2026-05-04 23:51   ` Niranjana Vishwanathapura
2026-05-02  0:53 ` [PATCH v2 2/9] drm/xe: Add timestamp_ms to " Umesh Nerlige Ramappa
2026-05-04 23:59   ` Niranjana Vishwanathapura
2026-05-05 18:03     ` Umesh Nerlige Ramappa
2026-05-02  0:53 ` [PATCH v2 3/9] drm/xe/multi_queue: Store primary LRC and position info in LRC Umesh Nerlige Ramappa
2026-05-05  3:46   ` Niranjana Vishwanathapura
2026-05-05 18:35     ` Umesh Nerlige Ramappa
2026-05-05 18:45       ` Niranjana Vishwanathapura [this message]
2026-05-05 18:51         ` Umesh Nerlige Ramappa
2026-05-02  0:53 ` [PATCH v2 4/9] drm/xe/multi_queue: Add helpers to access CS QUEUE TIMESTAMP from lrc Umesh Nerlige Ramappa
2026-05-05  4:00   ` Niranjana Vishwanathapura
2026-05-02  0:53 ` [PATCH v2 5/9] drm/xe/lrc: Refactor out engine id to hwe conversion Umesh Nerlige Ramappa
2026-05-05  4:16   ` Niranjana Vishwanathapura
2026-05-02  0:53 ` [PATCH v2 6/9] drm/xe/multi_queue: Capture queue run times for active queues Umesh Nerlige Ramappa
2026-05-05  4:12   ` Niranjana Vishwanathapura
2026-05-05 19:02     ` Umesh Nerlige Ramappa
2026-05-02  0:53 ` [PATCH v2 7/9] drm/xe/multi_queue: Add trace event for the multi queue timestamp Umesh Nerlige Ramappa
2026-05-05  4:19   ` Niranjana Vishwanathapura
2026-05-02  0:53 ` [PATCH v2 8/9] drm/xe/multi_queue: Use QUEUE_TIMESTAMP as job timestamp for multi-queue Umesh Nerlige Ramappa
2026-05-05  4:20   ` Niranjana Vishwanathapura
2026-05-02  0:53 ` [PATCH v2 9/9] drm/xe/multi_queue: Whitelist QUEUE_TIMESTAMP register Umesh Nerlige Ramappa
2026-05-05  4:25   ` Niranjana Vishwanathapura
2026-05-05 17:58     ` Umesh Nerlige Ramappa
2026-05-05 18:34       ` Niranjana Vishwanathapura
2026-05-05 19:06         ` Umesh Nerlige Ramappa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=afo6ru4u83dDRJFC@nvishwa1-desk \
    --to=niranjana.vishwanathapura@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=matthew.brost@intel.com \
    --cc=stuart.summers@intel.com \
    --cc=umesh.nerlige.ramappa@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox