Re: [PATCH 4/5] drm/i915/gt: Adopt engine_park synchronisation rules for engine_retire

From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 4/5] drm/i915/gt: Adopt engine_park synchronisation rules for engine_retire
Date: Thu, 21 Nov 2019 16:31:43 +0000	[thread overview]
Message-ID: <e6849483-a86a-c8da-7806-fca05f8fa01f@linux.intel.com> (raw)
In-Reply-To: <157435346163.2524.17000806966099512047@skylake-alporthouse-com>

On 21/11/2019 16:24, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-11-21 16:17:59)
>>
>> On 21/11/2019 14:53, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2019-11-21 14:42:56)
>>>>
>>>> On 21/11/2019 13:51, Chris Wilson wrote:
>>>>> In the next patch, we will introduce a new asynchronous retirement
>>>>> worker, fed by execlists CS events. Here we may queue a retirement as
>>>>> soon as a request is submitted to HW (and completes instantly), and we
>>>>> also want to process that retirement as early as possible and cannot
>>>>> afford to postpone (as there may not be another opportunity to retire it
>>>>> for a few seconds). To allow the new async retirer to run in parallel
>>>>> with our submission, pull the __i915_request_queue (that passes the
>>>>> request to HW) inside the timelines spinlock so that the retirement
>>>>> cannot release the timeline before we have completed the submission.
>>>>>
>>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>> ---
>>>>>     drivers/gpu/drm/i915/gt/intel_engine_pm.c | 29 ++++++++++++++++-------
>>>>>     1 file changed, 21 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>> index 373a4b9f159c..bd0af02bea16 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
>>>>> @@ -74,18 +74,33 @@ static inline void __timeline_mark_unlock(struct intel_context *ce,
>>>>>     #endif /* !IS_ENABLED(CONFIG_LOCKDEP) */
>>>>>     
>>>>>     static void
>>>>> -__intel_timeline_enter_and_release_pm(struct intel_timeline *tl,
>>>>> -                                   struct intel_engine_cs *engine)
>>>>> +__queue_and_release_pm(struct i915_request *rq,
>>>>> +                    struct intel_timeline *tl,
>>>>> +                    struct intel_engine_cs *engine)
>>>>>     {
>>>>>         struct intel_gt_timelines *timelines = &engine->gt->timelines;
>>>>>     
>>>>> +     /*
>>>>> +      * We have to serialise all potential retirement paths with our
>>>>> +      * submission, as we don't want to underflow either the
>>>>> +      * engine->wakeref.counter or our timeline->active_count.
>>>>> +      *
>>>>> +      * Equally, we cannot allow a new submission to start until
>>>>> +      * after we finish queueing, nor could we allow that submitter
>>>>> +      * to retire us before we are ready!
>>>>> +      */
>>>>>         spin_lock(&timelines->lock);
>>>>>     
>>>>> -     if (!atomic_fetch_inc(&tl->active_count))
>>>>> -             list_add_tail(&tl->link, &timelines->active_list);
>>>>> +     /* Hand the request over to HW and so engine_retire() */
>>>>> +     __i915_request_queue(rq, NULL);
>>>>>     
>>>>> +     /* Let new submissions commence (and maybe retire this timeline) */
>>>>>         __intel_wakeref_defer_park(&engine->wakeref);
>>>>>     
>>>>> +     /* Let intel_gt_retire_requests() retire us */
>>>>> +     if (!atomic_fetch_inc(&tl->active_count))
>>>>> +             list_add_tail(&tl->link, &timelines->active_list);
>>>>> +
>>>>>         spin_unlock(&timelines->lock);
>>>>
>>>> Now that everything is under the lock the order of operation is not
>>>> important, or it still is?
>>>
>>> queue before unpark that is required.
>>>
>>> unpark and add_to_timeline, the order is flexible as the lock governors
>>> the interactions between those and retirers. So I chose to allow the
>>> next newcomer start a few instructions earlier.
>>
>> Yes, because of different locks. So the comment above
>> __intel_wakeref_defer_park is not correct since timeline cannot be
>> retired until the lock is dropped.
> 
> The goal was to indicate that the wakeref.count will allow new
> submissions to bypass the engine-pm; while also tying back to the
> retirement theme and reminding the reader that request submission also
> implies some retiring of old requests on the timeline.
> 
> So I was trying to point out the connection between all steps and the
> act of retiring, since that was most pressing on my mind.
> 
>> It's only preservation of timeline ordering which mandates defer_park
>> after request_queue. As far as I am able to summon my own understanding
>> from yesterday.
> 
> Correct. That's the important bit from yesterday.

Phew.. thanks for re-confirming.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx