From: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
To: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: <intel-xe@lists.freedesktop.org>, <matthew.brost@intel.com>,
<stuart.summers@intel.com>
Subject: Re: [PATCH v2 3/9] drm/xe/multi_queue: Store primary LRC and position info in LRC
Date: Tue, 5 May 2026 11:45:02 -0700 [thread overview]
Message-ID: <afo6ru4u83dDRJFC@nvishwa1-desk> (raw)
In-Reply-To: <afo4axPANAfvgzW7@soc-5CG1426VCC.clients.intel.com>
On Tue, May 05, 2026 at 11:35:23AM -0700, Umesh Nerlige Ramappa wrote:
>On Mon, May 04, 2026 at 08:46:10PM -0700, Niranjana Vishwanathapura wrote:
>>On Fri, May 01, 2026 at 05:53:36PM -0700, Umesh Nerlige Ramappa wrote:
>>>Given an LRC belonging to the secondary queue, in order to check if its
>>>context group is active, we need to check the LRC of the primary queue.
>>>In addition to that we want to compare the secondary queue position to
>>>CSMQDEBUG register to check if the queue itself is active.
>>>
>>>To do so, store primary LRC and position information in the LRC as well
>>>as take a reference to the primary LRC from each LRC in the queue group.
>>>
>>>A note on references involved:
>>>
>>>- In general the Queue takes a ref on its LRC.
>>>- In addition, for multi-queue,
>>>a. Primary Queue takes a ref for each Secondary LRC.
>>>b. Each Secondary Queue takes a ref to the Primary Queue
>>>
>>>In the current patch, each LRC in the queue group is storing a pointer
>>>to Primary LRC. There is a small window of time in the primary queue
>>>free path where the primary LRC may be freed before the secondary LRC.
>>>
>>>__xe_exec_queue_fini(q); // frees|puts primary q LRCs
>>>...
>>>window where secondary Q LRC is pointing to invalid primary LRC
>>>...
>>>__xe_exec_queue_free(q); // frees|puts secondary q LRCs in multi-Q case
>>>
>>>In this window the reference in Secondary LRC is invalid. While there
>>>may be nothing accessing the secondary LRCs reference, to be safe, this
>>>patch is taking a reference to Primary LRC from the secondary LRC.
>>>
>>>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>>>---
>>>v2:
>>>- Store primary LRC instead of primary queue (Niranjana)
>>>- Drop the valid flag and check if primary_lrc is NULL (Niranjana)
>>>- Document/Revisit references (Matt/Umesh)
>>>---
>>>drivers/gpu/drm/xe/xe_exec_queue.c | 23 ++++++++++++++++++++---
>>>drivers/gpu/drm/xe/xe_lrc.h | 5 +++++
>>>drivers/gpu/drm/xe/xe_lrc_types.h | 8 ++++++++
>>>3 files changed, 33 insertions(+), 3 deletions(-)
>>>
>>>diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
>>>index b287d0e0e60a..e34601d28520 100644
>>>--- a/drivers/gpu/drm/xe/xe_exec_queue.c
>>>+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
>>>@@ -129,8 +129,14 @@ static void xe_exec_queue_group_cleanup(struct xe_exec_queue *q)
>>> return;
>>>
>>> /* Primary queue cleanup */
>>>- xa_for_each(&group->xa, idx, lrc)
>>>+ xa_for_each(&group->xa, idx, lrc) {
>>>+ /* drop secondary lrc ref to primary lrc */
>>>+ xe_lrc_put(lrc->multi_queue.primary_lrc);
>>>+ /* drop primary queue ref to secondary lrc */
>>> xe_lrc_put(lrc);
>>>+ }
>>>+ /* drop primary lrc ref to itself */
>>>+ xe_lrc_put(q->lrc[0]);
>>>
>>> xa_destroy(&group->xa);
>>> mutex_destroy(&group->list_lock);
>>>@@ -275,8 +281,15 @@ static void xe_exec_queue_set_lrc(struct xe_exec_queue *q, struct xe_lrc *lrc, u
>>>{
>>> xe_assert(gt_to_xe(q->gt), idx < q->width);
>>>
>>>- scoped_guard(spinlock, &q->lrc_lookup_lock)
>>>+ scoped_guard(spinlock, &q->lrc_lookup_lock) {
>>> q->lrc[idx] = lrc;
>>>+ if (xe_exec_queue_is_multi_queue(q)) {
>>>+ struct xe_lrc *primary_lrc = q->multi_queue.group->primary->lrc[0];
>>>+
>>>+ lrc->multi_queue.pos = q->multi_queue.pos;
>>
>>I think q->multi_queue.pos is not yet set for secondary queues at this point.
>>It is set later in the xe_exec_queue_group_add() call.
>
>Hmm, I recall seeing that a while ago and that's why I leaned toward
>having a lrc->q (back reference to the queue that the LRC belongs to,
>but that's prohibited). Anyways, I will try to move this logic outsize
>to a late point then.
>
>>
>>>+ lrc->multi_queue.primary_lrc = xe_lrc_get(primary_lrc);
>>
>>I think we don't need to get/put the primary_lrc reference.
>>Each queue holds a reference to its LRC. The secondary queues holds a reference
>>to the primary queue. So, essentially, the secondary LRCs are holding a reference
>>to primary lrc. So, I think, we don't need to hold reference again.
>
>I would look at the queue and LRC as different objects in memory.
>
>In taking a reference here, I am addressing only one specific corner
>case:
>
>- Consider all secondary queues are freed
>- Since primary queue holds a reference to all secondary LRCs, all
>secondary LRCs are still in memory.
>- When all references to primary queue are released, we end up in
>xe_exec_queue_fini(primary_q).
>
>In this function the LRC for primary_q is freed first, and then the
>secondary_q lrcs are freed, so there is a small window of time where
>the reference from secondary LRC to primary LRC is invalid. IMO, that
>small windows of time is enough for strange issues.
>
By that time, all secondary queues have been deregistered and destroyed.
So, there shouldn't be any secondary LRC trying to access the primary
LRC. May be we can set primary_lrc to NULL while releaseing LRC reference
which should be sane enough and we really don't need primary_lrc reference
taking.
Niranjana
>Umesh
>>
>>Niranjana
>>
>>>+ }
>>>+ }
>>>}
>>>
>>>/**
>>>@@ -388,8 +401,12 @@ static int __xe_exec_queue_init(struct xe_exec_queue *q, u32 exec_queue_flags)
>>>
>>> xe_exec_queue_set_lrc(q, lrc, i);
>>>
>>>- if (__lrc)
>>>+ if (__lrc) {
>>>+ if (xe_exec_queue_is_multi_queue(q))
>>>+ xe_lrc_put(__lrc->multi_queue.primary_lrc);
>>>+
>>> xe_lrc_put(__lrc);
>>>+ }
>>> __lrc = lrc;
>>>
>>> } while (marker != xe_vf_migration_fixups_complete_count(q->gt));
>>>diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
>>>index 97aef0327fc8..3d0bf4a7bfa0 100644
>>>--- a/drivers/gpu/drm/xe/xe_lrc.h
>>>+++ b/drivers/gpu/drm/xe/xe_lrc.h
>>>@@ -91,6 +91,11 @@ static inline size_t xe_lrc_ring_size(void)
>>> return SZ_16K;
>>>}
>>>
>>>+static inline bool xe_lrc_is_multi_queue(struct xe_lrc *lrc)
>>>+{
>>>+ return lrc->multi_queue.primary_lrc;
>>>+}
>>>+
>>>size_t xe_gt_lrc_hang_replay_size(struct xe_gt *gt, enum xe_engine_class class);
>>>size_t xe_gt_lrc_size(struct xe_gt *gt, enum xe_engine_class class);
>>>u32 xe_lrc_pphwsp_offset(struct xe_lrc *lrc);
>>>diff --git a/drivers/gpu/drm/xe/xe_lrc_types.h b/drivers/gpu/drm/xe/xe_lrc_types.h
>>>index 5a718f759ed6..0a5c13ec2ad7 100644
>>>--- a/drivers/gpu/drm/xe/xe_lrc_types.h
>>>+++ b/drivers/gpu/drm/xe/xe_lrc_types.h
>>>@@ -63,6 +63,14 @@ struct xe_lrc {
>>>
>>> /** @ctx_timestamp: readout value of CTX_TIMESTAMP on last update */
>>> u64 ctx_timestamp;
>>>+
>>>+ /** @multi_queue: Multi queue LRC related information */
>>>+ struct {
>>>+ /** @multi_queue.primary_lrc: Primary lrc of this multi-queue group*/
>>>+ struct xe_lrc *primary_lrc;
>>>+ /** @multi_queue.pos: Position of LRC within the multi-queue group */
>>>+ u8 pos;
>>>+ } multi_queue;
>>>};
>>>
>>>struct xe_lrc_snapshot;
>>>--
>>>2.43.0
>>>
next prev parent reply other threads:[~2026-05-05 18:45 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-02 0:53 [PATCH v2 0/9] Support run ticks for multi-queue use case Umesh Nerlige Ramappa
2026-05-02 0:53 ` [PATCH v2 1/9] drm/xe/lrc: Use 64 bit ctx timestamp in the LRC snapshot Umesh Nerlige Ramappa
2026-05-04 23:51 ` Niranjana Vishwanathapura
2026-05-02 0:53 ` [PATCH v2 2/9] drm/xe: Add timestamp_ms to " Umesh Nerlige Ramappa
2026-05-04 23:59 ` Niranjana Vishwanathapura
2026-05-05 18:03 ` Umesh Nerlige Ramappa
2026-05-02 0:53 ` [PATCH v2 3/9] drm/xe/multi_queue: Store primary LRC and position info in LRC Umesh Nerlige Ramappa
2026-05-05 3:46 ` Niranjana Vishwanathapura
2026-05-05 18:35 ` Umesh Nerlige Ramappa
2026-05-05 18:45 ` Niranjana Vishwanathapura [this message]
2026-05-05 18:51 ` Umesh Nerlige Ramappa
2026-05-02 0:53 ` [PATCH v2 4/9] drm/xe/multi_queue: Add helpers to access CS QUEUE TIMESTAMP from lrc Umesh Nerlige Ramappa
2026-05-05 4:00 ` Niranjana Vishwanathapura
2026-05-02 0:53 ` [PATCH v2 5/9] drm/xe/lrc: Refactor out engine id to hwe conversion Umesh Nerlige Ramappa
2026-05-05 4:16 ` Niranjana Vishwanathapura
2026-05-02 0:53 ` [PATCH v2 6/9] drm/xe/multi_queue: Capture queue run times for active queues Umesh Nerlige Ramappa
2026-05-05 4:12 ` Niranjana Vishwanathapura
2026-05-05 19:02 ` Umesh Nerlige Ramappa
2026-05-02 0:53 ` [PATCH v2 7/9] drm/xe/multi_queue: Add trace event for the multi queue timestamp Umesh Nerlige Ramappa
2026-05-05 4:19 ` Niranjana Vishwanathapura
2026-05-02 0:53 ` [PATCH v2 8/9] drm/xe/multi_queue: Use QUEUE_TIMESTAMP as job timestamp for multi-queue Umesh Nerlige Ramappa
2026-05-05 4:20 ` Niranjana Vishwanathapura
2026-05-02 0:53 ` [PATCH v2 9/9] drm/xe/multi_queue: Whitelist QUEUE_TIMESTAMP register Umesh Nerlige Ramappa
2026-05-05 4:25 ` Niranjana Vishwanathapura
2026-05-05 17:58 ` Umesh Nerlige Ramappa
2026-05-05 18:34 ` Niranjana Vishwanathapura
2026-05-05 19:06 ` Umesh Nerlige Ramappa
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=afo6ru4u83dDRJFC@nvishwa1-desk \
--to=niranjana.vishwanathapura@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=matthew.brost@intel.com \
--cc=stuart.summers@intel.com \
--cc=umesh.nerlige.ramappa@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox