From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH] drm/i915: Mark the removal of the i915_request from the sched.link
Date: Tue, 21 Jan 2020 11:11:27 +0000 [thread overview]
Message-ID: <7c1c1334-7033-71f5-c437-a6313ea32d6f@linux.intel.com> (raw)
In-Reply-To: <157955236639.2218.7304647621155277813@skylake-alporthouse-com>
On 20/01/2020 20:32, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-01-20 19:47:08)
>>
>> On 20/01/2020 17:57, Chris Wilson wrote:
>>> Keep the rq->fence.flags consistent with the status of the
>>> rq->sched.link, and clear the associated bits when decoupling the link
>>> on retirement (as we may wish to inspect those flags independent of
>>> other state).
>>>
>>> Fixes: 32ff621fd744 ("drm/i915/gt: Allow temporary suspension of inflight requests")
>>> References: https://gitlab.freedesktop.org/drm/intel/issues/997
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> ---
>>> drivers/gpu/drm/i915/i915_request.c | 2 ++
>>> 1 file changed, 2 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>>> index 9ed0d3bc7249..78a5f5d3c070 100644
>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>> @@ -221,6 +221,8 @@ static void remove_from_engine(struct i915_request *rq)
>>> locked = engine;
>>> }
>>> list_del_init(&rq->sched.link);
>>> + clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
>>
>> This one I think can not be set in retirement. Or there is a path?
>>
>> [comes back after writing the comment below]
>>
>> Race between completion to hold puts the request on hold, then request
>> completes just as it is un-held? It needs retire to happen at the right
>> time, driven by ...? Is this it?
>>
>>> + clear_bit(I915_FENCE_FLAG_HOLD, &rq->fence.flags);
>>
>> This one I think indeed can race with completion.
>
> Fwiw, this appears to be the active part of the trace
>
> <0>[ 56.132334] <idle>-0 1.Ns1 52042407us : execlists_reset_finish: 0000:00:02.0 vcs1: depth->1
> <0>[ 56.132415] rs:main -428 0..s. 52042429us : process_csb: 0000:00:02.0 vcs1: cs-irq head=5, tail=1
> <0>[ 56.132501] rs:main -428 0..s. 52042432us : process_csb: 0000:00:02.0 vcs1: csb[0]: status=0x00000001:0x00000000
> <0>[ 56.132591] rs:main -428 0..s. 52042437us : trace_ports: 0000:00:02.0 vcs1: promote { b:6!, 0:0 }
> <0>[ 56.132676] rs:main -428 0..s. 52042442us : process_csb: 0000:00:02.0 vcs1: csb[1]: status=0x00000818:0x00000020
> <0>[ 56.132766] rs:main -428 0..s. 52042445us : trace_ports: 0000:00:02.0 vcs1: completed { b:6!, 0:0 }
> <0>[ 56.132860] kworker/-12 0.... 52042771us : i915_request_retire: 0000:00:02.0 vcs1: fence b:6, current 6
> <0>[ 56.132949] kworker/-65 1d..1 52044886us : execlists_unhold: 0000:00:02.0 vcs0: fence 27:2, current 2 hold release
> <0>[ 56.133041] ksoftirq-16 1..s. 52044912us : process_csb: 0000:00:02.0 vcs0: cs-irq head=2, tail=2
> <0>[ 56.133118] ksoftirq-16 1d.s1 52044916us : __i915_request_submit: 0000:00:02.0 vcs0: fence 27:2, current 2
> <0>[ 56.133216] kworker/-65 1.... 52044946us : i915_request_retire: 0000:00:02.0 vcs0: fence 9:14, current 14
> <0>[ 56.133314] kworker/-65 1d... 52044986us : __i915_request_commit: 0000:00:02.0 vcs0: fence 9:16, current 14
> <0>[ 56.133402] kworker/-65 1d... 52044993us : __engine_park: 0000:00:02.0 vcs0:
> <0>[ 56.133490] kworker/-65 1d..2 52045032us : __i915_request_submit: 0000:00:02.0 vcs0: fence 9:16, current 14
> <0>[ 56.133581] kworker/-65 1d..2 52045892us : trace_ports: 0000:00:02.0 vcs0: submit { 9:16, 0:0 }
> <0>[ 56.133664] <idle>-0 0dNH2 52045932us : __intel_context_retire: 0000:00:02.0 vcs0: context:27 retire
> <0>[ 56.133751] <idle>-0 0.Ns1 52045949us : process_csb: 0000:00:02.0 vcs0: cs-irq head=2, tail=4
> <0>[ 56.133838] <idle>-0 0.Ns1 52045950us : process_csb: 0000:00:02.0 vcs0: csb[3]: status=0x00000001:0x00000000
> <0>[ 56.133927] <idle>-0 0.Ns1 52045951us : trace_ports: 0000:00:02.0 vcs0: promote { 9:16!, 0:0 }
> <0>[ 56.134013] <idle>-0 0.Ns1 52045951us : process_csb: 0000:00:02.0 vcs0: csb[4]: status=0x00000818:0x00000060
> <0>[ 56.134102] <idle>-0 0.Ns1 52045952us : trace_ports: 0000:00:02.0 vcs0: completed { 9:16!, 0:0 }
> <0>[ 56.134198] kworker/-12 0.... 52045960us : i915_request_retire: 0000:00:02.0 vcs0: fence 9:16, current 16
> <0>[ 56.134285] kworker/-12 0.... 52045962us : __engine_park: 0000:00:02.0 vcs0:
> <0>[ 56.134361] kworker/-65 1d..1 52046503us : execlists_unhold: 0000:00:02.0 vcs1: fence 2a:2, current 1 hold release
> <0>[ 56.134427] ksoftirq-16 1..s. 52046510us : process_csb: 0000:00:02.0 vcs1: cs-irq head=1, tail=1
>
> It doesn't look like there's overlap between the hold and retire. :|
Which bit is the race? I coudln't spot the same request being mentioned
on two separate paths.
I mean I have no issues with the patch. Just trying to understand the
possible races and whether or not there should be an assert for PQUEUE
instead of clearing it on retire.
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
next prev parent reply other threads:[~2020-01-21 11:11 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-01-20 17:57 [Intel-gfx] [PATCH] drm/i915: Mark the removal of the i915_request from the sched.link Chris Wilson
2020-01-20 19:47 ` Tvrtko Ursulin
2020-01-20 20:27 ` Chris Wilson
2020-01-21 17:33 ` Tvrtko Ursulin
2020-01-20 20:32 ` Chris Wilson
2020-01-21 11:11 ` Tvrtko Ursulin [this message]
2020-01-21 11:15 ` Chris Wilson
2020-01-21 2:12 ` [Intel-gfx] ✓ Fi.CI.BAT: success for " Patchwork
2020-01-21 14:53 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7c1c1334-7033-71f5-c437-a6313ea32d6f@linux.intel.com \
--to=tvrtko.ursulin@linux.intel.com \
--cc=chris@chris-wilson.co.uk \
--cc=intel-gfx@lists.freedesktop.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox