From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH 06/16] drm/i915/gt: Decouple completed requests on unwind
Date: Wed, 25 Nov 2020 16:21:27 +0000
Message-ID: <7157b2c0-0fde-8adb-95bd-84ae4573d5b9@linux.intel.com>
In-Reply-To: <160629970681.25068.9984672839751167059@build.alporthouse.com>


On 25/11/2020 10:21, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-11-25 09:15:25)
>>
>> On 24/11/2020 17:31, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2020-11-24 17:13:02)
>>>>
>>>> On 24/11/2020 11:42, Chris Wilson wrote:
>>>>> Since the introduction of preempt-to-busy, requests can complete in the
>>>>> background, even while they are not on the engine->active.requests list.
>>>>> As such, the engine->active.requests list itself is not in strict
>>>>> retirement order, and we have to scan the entire list while unwinding to
>>>>> not miss any. However, if the request is completed we currently leave it
>>>>> on the list [until retirement], but we could just as simply remove it
>>>>> and stop treating it as active. We would only have to then traverse it
>>>>> once while unwinding in quick succession.
>>>>>
>>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> ---
>>>>>     drivers/gpu/drm/i915/gt/intel_lrc.c | 6 ++++--
>>>>>     drivers/gpu/drm/i915/i915_request.c | 3 ++-
>>>>>     2 files changed, 6 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>> index 30aa59fb7271..cf11cbac241b 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>> @@ -1116,8 +1116,10 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
>>>>>         list_for_each_entry_safe_reverse(rq, rn,
>>>>>                                          &engine->active.requests,
>>>>>                                          sched.link) {
>>>>> -             if (i915_request_completed(rq))
>>>>> -                     continue; /* XXX */
>>>>> +             if (i915_request_completed(rq)) {
>>>>> +                     list_del_init(&rq->sched.link);
>>>>> +                     continue;
>>>>> +             }
>>>>>     
>>>>>                 __i915_request_unsubmit(rq);
>>>>>     
>>>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>>>>> index 8d7d29c9e375..a9db1376b996 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>>>> @@ -321,7 +321,8 @@ bool i915_request_retire(struct i915_request *rq)
>>>>>          * after removing the breadcrumb and signaling it, so that we do not
>>>>>          * inadvertently attach the breadcrumb to a completed request.
>>>>>          */
>>>>> -     remove_from_engine(rq);
>>>>> +     if (!list_empty(&rq->sched.link))
>>>>> +             remove_from_engine(rq);
>>>>
>>>> The list_empty check is unlocked so is list_del_init in
>>>> remove_from_engine safe on potentially already unlinked request or it
>>>> needs to re-check under the lock?
>>>
>>> It's safe. The unwind is under the lock, and remove_from_engine takes
>>> the lock, and both do list_del_init() which is a no-op if already
>>> removed. And the end state of the flag bits is the same on each path. We
>>> can skip the __notify_execute_cb_imm() since we know in unwind it is
>>> executing and there should be no cb.
>>>
>>> The test before we take the lock is only allowed to skip the active.lock
>>> if it sees the list is already decoupled, in which case we can leave it
>>> to the unwind to remove it from the engine (and we know that the request
>>> can only have been inflight prior to completion). Since the test is not
>>> locked, we don't serialise with the removal, but the list_del_init is
>>> the last action on the request so there is no window where the unwind is
>>> accessing the request after it may have been retired.
>>>
>>> list_move() will not confuse list_empty(), as although it does a
>>> list_del_entry, it is not temporarily re-initialised to an empty list.
>>
>> List_del_init is indeed safe. List_move.. which one you think can race
>> with retire? Preempt-to-busy unwinding an almost completed request yet
>> again? Or even preempt timeout racing with completion?
> 
> Here in unwind. We pass the completion check, but the request may still
> be running and complete at any time (until we submit & ack the new ELSP).
> So an unlocked list_empty check during retire can race with any of the
> list_move during unwind and resubmit. (On resubmit, we check completed
> under the lock and drop the request in __i915_request_submit which
> should also leave it in a consistent state as if we had called
> remove_from_engine.)

Right, yes, that seems safe as well. The only new problem could have been
a false negative, meaning remove_from_engine _not_ being called by mistake
because of a transient false list_empty condition.
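
To make the two list properties discussed above concrete, here is a
minimal userspace sketch. It assumes a hand-written stand-in for the
kernel's struct list_head helpers (same pointer manipulation, but no
READ_ONCE/WRITE_ONCE and no locking), and list_move_observed_empty() and
demo() are hypothetical names added only for illustration. It is
single-threaded, so it shows only the pointer states an observer could
see, not the memory-ordering side of the race:

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct list_head (illustrative only). */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Unlink e from its neighbours; e's own pointers are left untouched. */
static void __list_del_entry(struct list_head *e)
{
	e->next->prev = e->prev;
	e->prev->next = e->next;
}

static void list_add(struct list_head *e, struct list_head *head)
{
	e->next = head->next;
	e->prev = head;
	head->next->prev = e;
	head->next = e;
}

/* Unlink and re-initialise; on a self-linked entry this changes nothing. */
static void list_del_init(struct list_head *e)
{
	__list_del_entry(e);
	INIT_LIST_HEAD(e);
}

/*
 * Hypothetical helper: performs the two steps of list_move() and reports
 * what list_empty(e) would return if sampled between them.
 */
static bool list_move_observed_empty(struct list_head *e,
				     struct list_head *head)
{
	bool seen;

	__list_del_entry(e);
	seen = list_empty(e); /* e->next still points at its old neighbour */
	list_add(e, head);
	return seen;
}

/* Hypothetical walk-through: move, decouple in "unwind", remove again in "retire". */
static void demo(void)
{
	struct list_head active, hold, link;

	INIT_LIST_HEAD(&active);
	INIT_LIST_HEAD(&hold);
	INIT_LIST_HEAD(&link);
	list_add(&link, &active);

	/* A move between lists never makes the link look decoupled. */
	assert(!list_move_observed_empty(&link, &hold));
	assert(!list_empty(&link));

	list_del_init(&link);	/* unwind decouples the completed request */
	assert(list_empty(&link));

	list_del_init(&link);	/* a second removal is a no-op */
	assert(list_empty(&link) && list_empty(&active) && list_empty(&hold));
}
```

Calling demo() trips no assertion. So within this simplified model, the
unlocked list_empty test can only see "decoupled" after a list_del_init,
never in the middle of a list_move.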

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

