From: "Lis, Tomasz" <tomasz.lis@intel.com>
To: Matthew Brost <matthew.brost@intel.com>
Cc: intel-xe@lists.freedesktop.org,
	"Michał Winiarski" <michal.winiarski@intel.com>,
	"Michał Wajdeczko" <michal.wajdeczko@intel.com>,
	"Piotr Piórkowski" <piotr.piorkowski@intel.com>
Subject: Re: [PATCH v3 2/4] drm/xe/queue: Wrappers for setting and getting LRC references
Date: Thu, 26 Feb 2026 18:26:29 +0100
Message-ID: <906b9ee5-2bc8-4b38-a9ec-3e8993a6f015@intel.com>
In-Reply-To: <aZ+ocKctwftPBcbn@lstrano-desk.jf.intel.com>

Ack to almost all requests, but a comment on the one remaining below.

On 2/26/2026 2:57 AM, Matthew Brost wrote:
> On Wed, Feb 25, 2026 at 05:42:20PM -0800, Matthew Brost wrote:
>
> Oops, one mistake in my review, corrected below.
>
>> On Thu, Feb 26, 2026 at 12:54:45AM +0100, Tomasz Lis wrote:
>>
>> s/'drm/xe/queue'/'drm/xe' in the patch prefix I think.
>>
>>> There is a small but non-zero chance that VF post-migration fixups
>>> are running on an exec queue during teardown. The chance is reduced
>>> by starting the teardown with the release of guc_id, but it remains
>>> non-zero. On the other hand, the sync between fixups and EQ creation
>>> (wait_valid_ggtt) drastically increases the chance of such a parallel
>>> teardown if the queue creation error path is entered (err_lrc label).
>>>
>>> The exec queue itself is not going to cause an issue, but LRCs have
>>> a small chance of getting freed during the fixups.
>>>
>>> Creating a setter and a getter makes it easier to protect the fixup
>>> operations with a lock. For other driver activities, the original
>>> access method (without any protection) can still be used.
>>>
>>> Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
>>> ---
>>>   drivers/gpu/drm/xe/xe_exec_queue.c | 71 +++++++++++++++++++++---------
>>>   drivers/gpu/drm/xe/xe_exec_queue.h |  1 +
>>>   2 files changed, 52 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
>>> index b4ef725a682d..2cb37af42021 100644
>>> --- a/drivers/gpu/drm/xe/xe_exec_queue.c
>>> +++ b/drivers/gpu/drm/xe/xe_exec_queue.c
>>> @@ -270,6 +270,54 @@ static struct xe_exec_queue *__xe_exec_queue_alloc(struct xe_device *xe,
>>>   	return q;
>>>   }
>>>   
>>> +static void xe_exec_queue_set_lrc(struct xe_exec_queue *q, struct xe_lrc *lrc, u16 idx)
>>> +{
>>> +	xe_assert(gt_to_xe(q->gt), idx < q->width);
>>> +
>>> +	scoped_guard(spinlock, &q->multi_queue.lock)
>>> +		q->lrc[idx] = lrc;
>>> +}
>>> +
>>> +/**
>>> + * xe_exec_queue_get_lrc() - Get the LRC from exec queue.
>>> + * @q: The exec_queue.
>>> + * @idx: Index within multi-LRC array.
>>> + *
>>> + * Retrieves LRC of given index for the exec queue
>> under lock and takes reference.
>>
>>
>>> + *
>>> + * Return: Pointer to LRC on success, error on failure
>> NULL on lookup failure.
>>
>>> + */
>>> +struct xe_lrc *xe_exec_queue_get_lrc(struct xe_exec_queue *q, u16 idx)
>>> +{
>>> +	struct xe_lrc *lrc;
>> struct xe_lrc *lrc = NULL;
>>
> Actually not needed as lrc is always assigned and will either be present
> and valid or NULL.
>
> Matt
>
>>> +
>>> +	xe_assert(gt_to_xe(q->gt), idx < q->width);
>>> +
>>> +	scoped_guard(spinlock, &q->multi_queue.lock) {
>> Hmm, this isn't what 'q->multi_queue.lock' was designed for.
>>
>> Can we get a dedicated spinlock? Maybe 'q->lrc_lookup_lock'?
>>
>> Open to better naming.
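
For illustration, a dedicated lock might look roughly like this (a
sketch only; the 'lrc_lookup_lock' name and its init site are
assumptions following the suggestion above, not part of the patch):

	/* In struct xe_exec_queue (hypothetical field): */
	spinlock_t lrc_lookup_lock;	/* protects q->lrc[] lookups */

	/* Initialized once at queue creation, e.g. in __xe_exec_queue_alloc(): */
	spin_lock_init(&q->lrc_lookup_lock);

	/* The getter would then guard against a parallel teardown: */
	scoped_guard(spinlock, &q->lrc_lookup_lock) {
		lrc = q->lrc[idx];
		if (lrc)
			xe_lrc_get(lrc);
	}
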
>>
>>> +		lrc = q->lrc[idx];
>>> +		if (lrc)
>>> +			xe_lrc_get(lrc);
>>> +	}
>>> +
>>> +	return lrc;
>>> +}
>>> +
>>> +/**
>>> + * xe_exec_queue_lrc() - Get the LRC from exec queue.
>>> + * @q: The exec_queue.
>>> + *
>>> + * Retrieves the primary LRC for the exec queue. Note that this function
>>> + * returns only the first LRC instance, even when multiple parallel LRCs
>>> + * are configured. This function does not increment the reference
>>> + * count, so the caller does not need to put the reference after use.
>>> + *
>>> + * Return: Pointer to LRC on success, error on failure
>>> + */
>>> +struct xe_lrc *xe_exec_queue_lrc(struct xe_exec_queue *q)
>>> +{
>>> +	return q->lrc[0];
>>> +}
>> Why move this code?

We've added another getter and a setter. It would be very unstructured
if we had one getter, then a lot of unrelated code, and then the second
getter.

Where no other restrictions apply, we should keep functions of a
similar level together in one place within a source file.

When someone looks at code which contains a setter and a getter, I
believe they should find any other getters nearby. They definitely
should not have to expect another getter for the same resource much
further down in the file.

As for why I moved the existing one up instead of inserting the new
ones further down: it avoids an unnecessary prototype (I believe
checkpatch would otherwise complain).

-Tomasz

>>
>>> +
>>>   static void __xe_exec_queue_fini(struct xe_exec_queue *q)
>>>   {
>>>   	int i;
>>> @@ -327,8 +375,7 @@ static int __xe_exec_queue_init(struct xe_exec_queue *q, u32 exec_queue_flags)
>>>   			goto err_lrc;
>>>   		}
>>>   
>>> -		/* Pairs with READ_ONCE to xe_exec_queue_contexts_hwsp_rebase */
>>> -		WRITE_ONCE(q->lrc[i], lrc);
>>> +		xe_exec_queue_set_lrc(q, lrc, i);
>>>   	}
>>>   
>>>   	return 0;
>>> @@ -1288,21 +1335,6 @@ int xe_exec_queue_get_property_ioctl(struct drm_device *dev, void *data,
>>>   	return ret;
>>>   }
>>>   
>>> -/**
>>> - * xe_exec_queue_lrc() - Get the LRC from exec queue.
>>> - * @q: The exec_queue.
>>> - *
>>> - * Retrieves the primary LRC for the exec queue. Note that this function
>>> - * returns only the first LRC instance, even when multiple parallel LRCs
>>> - * are configured.
>>> - *
>>> - * Return: Pointer to LRC on success, error on failure
>>> - */
>>> -struct xe_lrc *xe_exec_queue_lrc(struct xe_exec_queue *q)
>>> -{
>>> -	return q->lrc[0];
>>> -}
>>> -
>>>   /**
>>>    * xe_exec_queue_is_lr() - Whether an exec_queue is long-running
>>>    * @q: The exec_queue
>>> @@ -1662,14 +1694,13 @@ int xe_exec_queue_contexts_hwsp_rebase(struct xe_exec_queue *q, void *scratch)
>>>   	for (i = 0; i < q->width; ++i) {
>>>   		struct xe_lrc *lrc;
>>>   
>>> -		/* Pairs with WRITE_ONCE in __xe_exec_queue_init  */
>>> -		lrc = READ_ONCE(q->lrc[i]);
>>> +		lrc = xe_exec_queue_get_lrc(q, i);
>>>   		if (!lrc)
>>>   			continue;
>>> -
>> Unrelated.
>>
>> Matt
>>
>>>   		xe_lrc_update_memirq_regs_with_address(lrc, q->hwe, scratch);
>>>   		xe_lrc_update_hwctx_regs_with_address(lrc);
>>>   		err = xe_lrc_setup_wa_bb_with_scratch(lrc, q->hwe, scratch);
>>> +		xe_lrc_put(lrc);
>>>   		if (err)
>>>   			break;
>>>   	}
>>> diff --git a/drivers/gpu/drm/xe/xe_exec_queue.h b/drivers/gpu/drm/xe/xe_exec_queue.h
>>> index c9e3a7c2d249..a82d99bd77bc 100644
>>> --- a/drivers/gpu/drm/xe/xe_exec_queue.h
>>> +++ b/drivers/gpu/drm/xe/xe_exec_queue.h
>>> @@ -160,6 +160,7 @@ void xe_exec_queue_update_run_ticks(struct xe_exec_queue *q);
>>>   int xe_exec_queue_contexts_hwsp_rebase(struct xe_exec_queue *q, void *scratch);
>>>   
>>>   struct xe_lrc *xe_exec_queue_lrc(struct xe_exec_queue *q);
>>> +struct xe_lrc *xe_exec_queue_get_lrc(struct xe_exec_queue *q, u16 idx);
>>>   
>>>   /**
>>>    * xe_exec_queue_idle_skip_suspend() - Can exec queue skip suspend
>>> -- 
>>> 2.25.1
>>>

