* [PATCH] drm/i915/execlists: Move the assertion we have the rpm wakeref down
@ 2018-07-19 7:50 Chris Wilson
2018-07-19 8:33 ` ✗ Fi.CI.BAT: failure for " Patchwork
` (3 more replies)
0 siblings, 4 replies; 10+ messages in thread
From: Chris Wilson @ 2018-07-19 7:50 UTC (permalink / raw)
To: intel-gfx
There's a race between idling the engine and finishing off the last
tasklet (as we may kick the tasklets after declaring an individual
engine idle). However, since we do not need to access the device until
we try to submit to the ELSP register (processing the CSB just requires
normal CPU access to the HWSP, and when idle we should not need to
submit!) we can defer the assertion until that point. The assertion is
still useful as it does verify that we do hold the longterm GT wakeref
taken from request allocation until request completion.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107274
Fixes: 9512f985c32d ("drm/i915/execlists: Direct submission of new requests (avoid tasklet/ksoftirqd)")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
drivers/gpu/drm/i915/intel_lrc.c | 25 +++++++++++--------------
1 file changed, 11 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index db5351e6a3a5..9d693e61536c 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -452,6 +452,16 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
 	struct execlist_port *port = execlists->port;
 	unsigned int n;
 
+	/*
+	 * We can skip acquiring intel_runtime_pm_get() here as it was taken
+	 * on our behalf by the request (see i915_gem_mark_busy()) and it will
+	 * not be relinquished until the device is idle (see
+	 * i915_gem_idle_work_handler()). As a precaution, we make sure
+	 * that all ELSP are drained i.e. we have processed the CSB,
+	 * before allowing ourselves to idle and calling intel_runtime_pm_put().
+	 */
+	GEM_BUG_ON(!engine->i915->gt.awake);
+
 	/*
 	 * ELSQ note: the submit queue is not cleared after being submitted
 	 * to the HW so we need to make sure we always clean it up. This is
@@ -1043,16 +1053,6 @@ static void __execlists_submission_tasklet(struct intel_engine_cs *const engine)
 {
 	lockdep_assert_held(&engine->timeline.lock);
 
-	/*
-	 * We can skip acquiring intel_runtime_pm_get() here as it was taken
-	 * on our behalf by the request (see i915_gem_mark_busy()) and it will
-	 * not be relinquished until the device is idle (see
-	 * i915_gem_idle_work_handler()). As a precaution, we make sure
-	 * that all ELSP are drained i.e. we have processed the CSB,
-	 * before allowing ourselves to idle and calling intel_runtime_pm_put().
-	 */
-	GEM_BUG_ON(!engine->i915->gt.awake);
-
 	process_csb(engine);
 	if (!execlists_is_active(&engine->execlists, EXECLISTS_ACTIVE_PREEMPT))
 		execlists_dequeue(engine);
@@ -1073,10 +1073,7 @@ static void execlists_submission_tasklet(unsigned long data)
 		  engine->execlists.active);
 
 	spin_lock_irqsave(&engine->timeline.lock, flags);
-
-	if (engine->i915->gt.awake) /* we may be delayed until after we idle! */
-		__execlists_submission_tasklet(engine);
-
+	__execlists_submission_tasklet(engine);
 	spin_unlock_irqrestore(&engine->timeline.lock, flags);
 }
 
--
2.18.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 10+ messages in thread

* ✗ Fi.CI.BAT: failure for drm/i915/execlists: Move the assertion we have the rpm wakeref down
2018-07-19 7:50 [PATCH] drm/i915/execlists: Move the assertion we have the rpm wakeref down Chris Wilson
@ 2018-07-19 8:33 ` Patchwork
2018-07-19 11:42 ` Tvrtko Ursulin
2018-07-19 11:39 ` ✓ Fi.CI.BAT: success " Patchwork
` (2 subsequent siblings)
3 siblings, 1 reply; 10+ messages in thread
From: Patchwork @ 2018-07-19 8:33 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/execlists: Move the assertion we have the rpm wakeref down
URL   : https://patchwork.freedesktop.org/series/46837/
State : failure

== Summary ==

= CI Bug Log - changes from CI_DRM_4507 -> Patchwork_9713 =

== Summary - FAILURE ==

  Serious unknown changes coming with Patchwork_9713 absolutely need to be
  verified manually.

  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_9713, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/46837/revisions/1/mbox/

== Possible new issues ==

  Here are the unknown changes that may have been introduced in Patchwork_9713:

  === IGT changes ===

    ==== Possible regressions ====

    igt@drv_selftest@live_evict:
      fi-cnl-psr:         NOTRUN -> DMESG-WARN +12

    igt@drv_selftest@live_workarounds:
      {fi-cfl-8109u}:     NOTRUN -> DMESG-FAIL
      fi-cnl-psr:         NOTRUN -> DMESG-FAIL

== Known issues ==

  Here are the changes found in Patchwork_9713 that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@drv_module_reload@basic-no-display:
      fi-cnl-psr:         NOTRUN -> DMESG-WARN (fdo#105395) +2

    igt@gem_exec_suspend@basic-s4-devices:
      fi-kbl-7500u:       PASS -> DMESG-WARN (fdo#105128, fdo#107139)

    igt@kms_pipe_crc_basic@suspend-read-crc-pipe-c:
      fi-bxt-dsi:         PASS -> INCOMPLETE (fdo#103927)

    ==== Possible fixes ====

    igt@drv_selftest@live_hangcheck:
      fi-skl-6700k2:      DMESG-FAIL (fdo#106560, fdo#107174) -> PASS

    igt@gem_exec_suspend@basic-s3:
      {fi-cfl-8109u}:     INCOMPLETE (fdo#107187) -> PASS

  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  fdo#103927 https://bugs.freedesktop.org/show_bug.cgi?id=103927
  fdo#105128 https://bugs.freedesktop.org/show_bug.cgi?id=105128
  fdo#105395 https://bugs.freedesktop.org/show_bug.cgi?id=105395
  fdo#106560 https://bugs.freedesktop.org/show_bug.cgi?id=106560
  fdo#107139 https://bugs.freedesktop.org/show_bug.cgi?id=107139
  fdo#107174 https://bugs.freedesktop.org/show_bug.cgi?id=107174
  fdo#107187 https://bugs.freedesktop.org/show_bug.cgi?id=107187

== Participating hosts (46 -> 43) ==

  Additional (2): fi-bdw-gvtdvm fi-cnl-psr
  Missing    (5): fi-ctg-p8600 fi-ilk-m540 fi-byt-squawks fi-bsw-cyan fi-hsw-4200u

== Build changes ==

    * Linux: CI_DRM_4507 -> Patchwork_9713

  CI_DRM_4507: 3bbfaebaf3ba21d866c7823d9e4febf47b4b7b39 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4566: 7270e39a0e6238804827ea7f8db433ad743ed745 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_9713: d24e890e5fc98f0c91baea1732320544f615bc2d @ git://anongit.freedesktop.org/gfx-ci/linux

== Linux commits ==

d24e890e5fc9 drm/i915/execlists: Move the assertion we have the rpm wakeref down

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9713/issues.html

^ permalink raw reply [flat|nested] 10+ messages in thread

* Re: ✗ Fi.CI.BAT: failure for drm/i915/execlists: Move the assertion we have the rpm wakeref down
2018-07-19 8:33 ` ✗ Fi.CI.BAT: failure for " Patchwork
@ 2018-07-19 11:42 ` Tvrtko Ursulin
2018-07-19 11:46 ` Chris Wilson
0 siblings, 1 reply; 10+ messages in thread
From: Tvrtko Ursulin @ 2018-07-19 11:42 UTC (permalink / raw)
To: intel-gfx, Patchwork, Chris Wilson

On 19/07/2018 09:33, Patchwork wrote:
> [...]
> ==== Possible regressions ====
>
> igt@drv_selftest@live_evict:
> fi-cnl-psr: NOTRUN -> DMESG-WARN +12
>
> igt@drv_selftest@live_workarounds:
> {fi-cfl-8109u}: NOTRUN -> DMESG-FAIL
> fi-cnl-psr: NOTRUN -> DMESG-FAIL

How come these failures are not listed in the page linked at the end of
the email (copy & paste):

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9713/issues.html

?

Regards,

Tvrtko

> [...]

^ permalink raw reply [flat|nested] 10+ messages in thread

* Re: ✗ Fi.CI.BAT: failure for drm/i915/execlists: Move the assertion we have the rpm wakeref down
2018-07-19 11:42 ` Tvrtko Ursulin
@ 2018-07-19 11:46 ` Chris Wilson
0 siblings, 0 replies; 10+ messages in thread
From: Chris Wilson @ 2018-07-19 11:46 UTC (permalink / raw)
To: Patchwork, Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2018-07-19 12:42:06)
>
> On 19/07/2018 09:33, Patchwork wrote:
> > [...]
> > ==== Possible regressions ====
> >
> > igt@drv_selftest@live_evict:
> > fi-cnl-psr: NOTRUN -> DMESG-WARN +12
> >
> > igt@drv_selftest@live_workarounds:
> > {fi-cfl-8109u}: NOTRUN -> DMESG-FAIL
> > fi-cnl-psr: NOTRUN -> DMESG-FAIL
>
> How come these failures are not listed in the page linked at the end of
> the email (copy & paste):

Postulating it's because fi-cnl-psr spontaneously reappeared and wasn't
in the baseline.

Fwiw, it's all
[drm:intel_print_wm_latency [i915]] ERROR Gen9 Plane WM1 latency not provided
above and beyond the expected (reset recovery fails).
-Chris

^ permalink raw reply [flat|nested] 10+ messages in thread

* ✓ Fi.CI.BAT: success for drm/i915/execlists: Move the assertion we have the rpm wakeref down
2018-07-19 7:50 [PATCH] drm/i915/execlists: Move the assertion we have the rpm wakeref down Chris Wilson
2018-07-19 8:33 ` ✗ Fi.CI.BAT: failure for " Patchwork
@ 2018-07-19 11:39 ` Patchwork
2018-07-19 11:49 ` [PATCH] " Tvrtko Ursulin
2018-07-19 14:48 ` ✓ Fi.CI.IGT: success for " Patchwork
3 siblings, 0 replies; 10+ messages in thread
From: Patchwork @ 2018-07-19 11:39 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/execlists: Move the assertion we have the rpm wakeref down
URL   : https://patchwork.freedesktop.org/series/46837/
State : success

== Summary ==

= CI Bug Log - changes from CI_DRM_4509 -> Patchwork_9716 =

== Summary - WARNING ==

  Minor unknown changes coming with Patchwork_9716 need to be verified
  manually.

  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_9716, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/46837/revisions/1/mbox/

== Possible new issues ==

  Here are the unknown changes that may have been introduced in Patchwork_9716:

  === IGT changes ===

    ==== Warnings ====

    igt@gem_exec_gttfill@basic:
      fi-pnv-d510:        PASS -> SKIP

== Known issues ==

  Here are the changes found in Patchwork_9716 that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@gem_exec_suspend@basic-s3:
      {fi-cfl-8109u}:     PASS -> INCOMPLETE (fdo#107187)

  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  fdo#107187 https://bugs.freedesktop.org/show_bug.cgi?id=107187

== Participating hosts (46 -> 42) ==

  Additional (1): fi-kbl-7560u
  Missing    (5): fi-ctg-p8600 fi-ilk-m540 fi-byt-squawks fi-bsw-cyan fi-hsw-4200u

== Build changes ==

    * Linux: CI_DRM_4509 -> Patchwork_9716

  CI_DRM_4509: e84aa0b47beed78a5a12db93e76fb00eab5db160 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4567: 7f85adc4050182f490c7a5c48db3d57cdb00af4e @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_9716: 6347bf8a6dc3e4205b7cf39cb93f3c5590270c11 @ git://anongit.freedesktop.org/gfx-ci/linux

== Linux commits ==

6347bf8a6dc3 drm/i915/execlists: Move the assertion we have the rpm wakeref down

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9716/issues.html

^ permalink raw reply [flat|nested] 10+ messages in thread

* Re: [PATCH] drm/i915/execlists: Move the assertion we have the rpm wakeref down
2018-07-19 7:50 [PATCH] drm/i915/execlists: Move the assertion we have the rpm wakeref down Chris Wilson
2018-07-19 8:33 ` ✗ Fi.CI.BAT: failure for " Patchwork
2018-07-19 11:39 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2018-07-19 11:49 ` Tvrtko Ursulin
2018-07-19 11:59 ` Chris Wilson
2018-07-19 14:48 ` ✓ Fi.CI.IGT: success for " Patchwork
3 siblings, 1 reply; 10+ messages in thread
From: Tvrtko Ursulin @ 2018-07-19 11:49 UTC (permalink / raw)
To: Chris Wilson, intel-gfx

On 19/07/2018 08:50, Chris Wilson wrote:
> [...]
> @@ -1073,10 +1073,7 @@ static void execlists_submission_tasklet(unsigned long data)
>  		  engine->execlists.active);
>
>  	spin_lock_irqsave(&engine->timeline.lock, flags);
> -
> -	if (engine->i915->gt.awake) /* we may be delayed until after we idle! */
> -		__execlists_submission_tasklet(engine);
> -
> +	__execlists_submission_tasklet(engine);
>  	spin_unlock_irqrestore(&engine->timeline.lock, flags);
>  }
>

Why won't we hit the assert on the elsp submit side now? AFAIR the
discussion about this particular line concluded that direct tasklet call
can race with busy->idle transition. So even if process_csb doesn't need
the assert, that part I get, the part about the race now again bothers
me. Perhaps I just forgot what I thought I understood back then.. :(

Should this call process_csb only if !gt.awake? But sounds terribly
dodgy.. Why would execlists.active be set if we are not awake..

Regards,

Tvrtko

^ permalink raw reply [flat|nested] 10+ messages in thread

* Re: [PATCH] drm/i915/execlists: Move the assertion we have the rpm wakeref down
2018-07-19 11:49 ` [PATCH] " Tvrtko Ursulin
@ 2018-07-19 11:59 ` Chris Wilson
2018-07-19 12:14 ` Tvrtko Ursulin
0 siblings, 1 reply; 10+ messages in thread
From: Chris Wilson @ 2018-07-19 11:59 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2018-07-19 12:49:13)
>
> On 19/07/2018 08:50, Chris Wilson wrote:
> > [...]
>
> Why won't we hit the assert on the elsp submit side now? AFAIR the
> discussion about this particular line concluded that direct tasklet call
> can race with busy->idle transition. So even if process_csb doesn't need
> the assert, that part I get, the part about the race now again bothers
> me. Perhaps I just forgot what I thought I understood back then.. :(

Same race, I just didn't think it through that it could change within
the space of a couple of lines. :|

> Should this call process_csb only if !gt.awake? But sounds terribly
> dodgy.. Why would execlists.active be set if we are not awake..

Have to remember it's i915->gt.awake not execlists->active (that's what I
briefly hoped for...) I looked at ways we might decouple the tasklet (we
can't just use tasklet_disable) but that looks overkill, and I can't see
any way we can guarantee that we won't randomly kick it later.

I did consider reordering the final wait_for(engine_is_idle()) and
tasklet_flush, but I couldn't convince myself that was enough to
guarantee we wouldn't get an even later interrupt to kick the tasklet.

So left it at just using the CSB to filter out the spurious call, and
relying on the fact that we are idle to avoid the submit. The assertion
still plays the same role as it always has done, making sure the actual
register access is covered by our wakeref.
-Chris

^ permalink raw reply [flat|nested] 10+ messages in thread
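
The argument in the message above (a late tasklet run only drains the CSB, and the wakeref assertion matters only where the ELSP register is actually written) can be illustrated with a small standalone C model. This is only a sketch under stated assumptions, not i915 code: `toy_engine`, `toy_submit_ports()` and `toy_submission_tasklet()` are hypothetical names, the integer counters are invented stand-ins for the CSB, the request queue and the ELSP writes, and `assert()` plays the role of GEM_BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; field names are illustrative, not the driver's. */
struct toy_engine {
	bool gt_awake;       /* stands in for i915->gt.awake */
	int pending_csb;     /* unprocessed context-switch events (plain memory) */
	int queued_requests; /* requests waiting to be written to the ELSP */
	int elsp_writes;     /* count of simulated register writes */
};

/* The only step that touches the "device", so the wakeref check lives here. */
void toy_submit_ports(struct toy_engine *e)
{
	assert(e->gt_awake); /* plays the role of GEM_BUG_ON(!gt.awake) */
	e->elsp_writes++;
	e->queued_requests = 0;
}

/* Tasklet body: draining the CSB is ordinary CPU access, so no check here. */
void toy_submission_tasklet(struct toy_engine *e)
{
	e->pending_csb = 0;         /* models process_csb() */
	if (e->queued_requests > 0) /* an idle engine has nothing to dequeue */
		toy_submit_ports(e);
}
```

In this model a tasklet that runs after the GT has been marked idle clears the leftover CSB events and returns without reaching the assertion, while a tasklet that does have work to submit still verifies the wakeref immediately before the simulated register write.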

* Re: [PATCH] drm/i915/execlists: Move the assertion we have the rpm wakeref down
2018-07-19 11:59 ` Chris Wilson
@ 2018-07-19 12:14 ` Tvrtko Ursulin
2018-07-19 12:23 ` Chris Wilson
0 siblings, 1 reply; 10+ messages in thread
From: Tvrtko Ursulin @ 2018-07-19 12:14 UTC (permalink / raw)
To: Chris Wilson, intel-gfx

On 19/07/2018 12:59, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-07-19 12:49:13)
>> [...]
>> Should this call process_csb only if !gt.awake? But sounds terribly
>> dodgy.. Why would execlists.active be set if we are not awake..
>
> Have to remember it's i915->gt.awake not execlists->active (that's what I
> briefly hoped for...) I looked at ways we might decouple the tasklet (we
> can't just use tasklet_disable) but that looks overkill, and I can't see
> any way we can guarantee that we won't randomly kick it later.
>
> I did consider reordering the final wait_for(engine_is_idle()) and
> tasklet_flush, but I couldn't convince myself that was enough to
> guarantee we wouldn't get an even later interrupt to kick the tasklet.
>
> So left it at just using the CSB to filter out the spurious call, and
> relying on that we are idle to avoid the submit. The assertion still
> plays the same role as it always has done, making sure the actual
> register access is covered by our wakeref.

So the thing I am thinking of, and you tell me if it is not the one:

1. Idle work handler runs
2. Goes to idle the engines
3. New request comes in from a separate thread
4. intel_engine_is_idle sees execlists.active is set
5. Calls the tasklet

But gt.awake is true at this point, so the assert does not fire.

What if the status of execlists.active changes from true to false between
the check in 4 and the tasklet kick in 5? Okay, but in this case how did we
pass the gt.awake and gt.active_requests checks at the top of the idle
work handler?

The whole state change has to happen in the middle of the idle work handler
I guess. Transition from !awake to awake before the intel_engines_are_idle
loop, and actually process the csb in the middle of the
intel_engine_is_idle checks.

Okay, think I answered my own question. My mindset was too serialized in
this instance. :)

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] drm/i915/execlists: Move the assertion we have the rpm wakeref down
2018-07-19 12:14 ` Tvrtko Ursulin
@ 2018-07-19 12:23 ` Chris Wilson
0 siblings, 0 replies; 10+ messages in thread
From: Chris Wilson @ 2018-07-19 12:23 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx
Quoting Tvrtko Ursulin (2018-07-19 13:14:38)
>
> On 19/07/2018 12:59, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-07-19 12:49:13)
> >>
> >> On 19/07/2018 08:50, Chris Wilson wrote:
> >>> There's a race between idling the engine and finishing off the last
> >>> tasklet (as we may kick the tasklets after declaring an individual
> >>> engine idle). However, since we do not need to access the device until
> >>> we try to submit to the ELSP register (processing the CSB just requires
> >>> normal CPU access to the HWSP, and when idle we should not need to
> >>> submit!) we can defer the assertion unto that point. The assertion is
> >>> still useful as it does verify that we do hold the longterm GT wakeref
> >>> taken from request allocation until request completion.
> >>>
> >>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107274
> >>> Fixes: 9512f985c32d ("drm/i915/execlists: Direct submission of new requests (avoid tasklet/ksoftirqd)")
> >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>> ---
> >>> drivers/gpu/drm/i915/intel_lrc.c | 25 +++++++++++--------------
> >>> 1 file changed, 11 insertions(+), 14 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> >>> index db5351e6a3a5..9d693e61536c 100644
> >>> --- a/drivers/gpu/drm/i915/intel_lrc.c
> >>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> >>> @@ -452,6 +452,16 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
> >>> struct execlist_port *port = execlists->port;
> >>> unsigned int n;
> >>>
> >>> + /*
> >>> + * We can skip acquiring intel_runtime_pm_get() here as it was taken
> >>> + * on our behalf by the request (see i915_gem_mark_busy()) and it will
> >>> + * not be relinquished until the device is idle (see
> >>> + * i915_gem_idle_work_handler()). As a precaution, we make sure
> >>> + * that all ELSP are drained i.e. we have processed the CSB,
> >>> + * before allowing ourselves to idle and calling intel_runtime_pm_put().
> >>> + */
> >>> + GEM_BUG_ON(!engine->i915->gt.awake);
> >>> +
> >>> /*
> >>> * ELSQ note: the submit queue is not cleared after being submitted
> >>> * to the HW so we need to make sure we always clean it up. This is
> >>> @@ -1043,16 +1053,6 @@ static void __execlists_submission_tasklet(struct intel_engine_cs *const engine)
> >>> {
> >>> lockdep_assert_held(&engine->timeline.lock);
> >>>
> >>> - /*
> >>> - * We can skip acquiring intel_runtime_pm_get() here as it was taken
> >>> - * on our behalf by the request (see i915_gem_mark_busy()) and it will
> >>> - * not be relinquished until the device is idle (see
> >>> - * i915_gem_idle_work_handler()). As a precaution, we make sure
> >>> - * that all ELSP are drained i.e. we have processed the CSB,
> >>> - * before allowing ourselves to idle and calling intel_runtime_pm_put().
> >>> - */
> >>> - GEM_BUG_ON(!engine->i915->gt.awake);
> >>> -
> >>> process_csb(engine);
> >>> if (!execlists_is_active(&engine->execlists, EXECLISTS_ACTIVE_PREEMPT))
> >>> execlists_dequeue(engine);
> >>> @@ -1073,10 +1073,7 @@ static void execlists_submission_tasklet(unsigned long data)
> >>> engine->execlists.active);
> >>>
> >>> spin_lock_irqsave(&engine->timeline.lock, flags);
> >>> -
> >>> - if (engine->i915->gt.awake) /* we may be delayed until after we idle! */
> >>> - __execlists_submission_tasklet(engine);
> >>> -
> >>> + __execlists_submission_tasklet(engine);
> >>> spin_unlock_irqrestore(&engine->timeline.lock, flags);
> >>> }
> >>>
> >>>
> >>
> >> Why we won't hit the assert on the elsp submit side now? AFAIR the
> >> discussion about this particular line concluded that direct tasklet call
> >> can race with busy->idle transition. So even if process_csb doesn't need
> >> the assert, that part I get, the part about the race now again bothers
> >> me. Perhaps I just forgot what I thought I understood back then.. :(
> >
> > Same race, I just didn't think it through that it could change within
> > the space of a couple of lines. :|
> >
> >> Should this call process_csb only if !gt.awake? But sounds terribly
> >> dodgy.. Why would execlists.active be set if we are not awake..
> >
> > Have to remember it's i915->gt.awake not execlists->active (that's what I
> > briefly hoped for...) I looked at ways we might decouple the tasklet (we
> > can't just use tasklet_disable) but that looks overkill, and I can't see
> > any way we can guarantee that we won't randomly kick it later.
> >
> > I did consider reordering the final wait_for(engine_is_idle()) and
> > tasklet_flush, but I couldn't convince myself that was enough to
> > guarantee we wouldn't get an even later interrupt to kick the tasklet.
> >
> > So left it at just using the CSB to filter out the spurious call, and
> > relying on that we are idle to avoid the submit. The assertion still
> > plays the same role as it always has done, making sure the actual
> > register access is covered by our wakeref.
>
> So the thing I am thinking of, and you tell me if it is not the one:
>
> 1. Idle work handler runs
> 2. Goes to idle the engines
> 3. New request comes in from a separate thread
> 4. intel_engine_is_idle sees execlists.active is set
> 5. Calls the tasklet
>
> But gt.awake is true at this point, so the assert does not fire.
>
> If the status of execlists.active changes from true to false between the
> check in 4 and the tasklet kick in 5. Okay, but in this case how did we pass
> the gt.awake and gt.active_requests checks at the top of the idle
> work handler?

active_requests is protected by struct_mutex, so it is stable while the
idle_worker decides to set gt.awake=false. The assert we have is all about
gt.awake, and the theory was that after parking the engine we wouldn't get
another tasklet. That's the theory, full of holes.

> The whole state change has to happen in the middle of the idle work handler,
> I guess. Transition from !awake to awake before the
> intel_engines_are_idle_loop, and actually process the csb in the middle
> of the intel_engine_is_idle checks.
>
> Okay, think I answered my own question. My mindset was too serialized in
> this instance. :)

Same answer, so yup. The challenge being getting that serialisation with
the interrupt handler solid. Or just be prepared that we get the
occasional late tasklet_schedule and let it nop away.
-Chris
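The conclusion of the exchange above (let a late tasklet run, process the CSB with plain CPU access, and assert the wakeref only where the ELSP would actually be written) can be sketched as a small userspace model. This is a minimal illustration under stated assumptions, not i915 code: every `model_*` name and field is hypothetical, `assert()` stands in for `GEM_BUG_ON()`, and locking, the preemption check and the real port handling are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the i915->gt.awake flag. */
struct model_gt {
	bool awake;
};

/* Hypothetical, heavily simplified engine state. */
struct model_engine {
	struct model_gt *gt;
	unsigned int csb_pending; /* unread context-switch events (plain memory) */
	unsigned int queued;      /* requests waiting to be submitted */
	unsigned int elsp_writes; /* number of modelled register writes */
};

/* Reading the CSB is an ordinary memory access, so no wakeref is asserted. */
static void model_process_csb(struct model_engine *e)
{
	e->csb_pending = 0;
}

/* The only function that touches the "register", so the assertion lives here. */
static void model_submit_ports(struct model_engine *e)
{
	assert(e->gt->awake); /* stands in for GEM_BUG_ON(!engine->i915->gt.awake) */
	e->elsp_writes++;
	e->queued--;
}

/* Submit only while there is work; an idle engine never reaches the assertion. */
static void model_dequeue(struct model_engine *e)
{
	while (e->queued)
		model_submit_ports(e);
}

/* The tasklet body runs unconditionally, with no gt.awake check around it. */
static void model_submission_tasklet(struct model_engine *e)
{
	model_process_csb(e);
	model_dequeue(e);
}
```

Calling `model_submission_tasklet()` on an engine whose `gt->awake` is false and whose `queued` is zero drains the pending CSB events and returns without reaching the assertion, which is the "let it nop away" case. The assertion can only trip if something is queued while the model is not awake, which is the situation the wakeref check is meant to catch.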
* ✓ Fi.CI.IGT: success for drm/i915/execlists: Move the assertion we have the rpm wakeref down
2018-07-19 7:50 [PATCH] drm/i915/execlists: Move the assertion we have the rpm wakeref down Chris Wilson
` (2 preceding siblings ...)
2018-07-19 11:49 ` [PATCH] " Tvrtko Ursulin
@ 2018-07-19 14:48 ` Patchwork
3 siblings, 0 replies; 10+ messages in thread
From: Patchwork @ 2018-07-19 14:48 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/execlists: Move the assertion we have the rpm wakeref down
URL : https://patchwork.freedesktop.org/series/46837/
State : success

== Summary ==

= CI Bug Log - changes from CI_DRM_4509_full -> Patchwork_9716_full =

== Summary - WARNING ==

Minor unknown changes coming with Patchwork_9716_full need to be verified
manually.

If you think the reported changes have nothing to do with the changes
introduced in Patchwork_9716_full, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.

== Possible new issues ==

Here are the unknown changes that may have been introduced in Patchwork_9716_full:

=== IGT changes ===

==== Warnings ====

igt@gem_exec_schedule@deep-bsd2:
  shard-kbl: PASS -> SKIP +2

igt@gem_exec_schedule@deep-vebox:
  shard-kbl: SKIP -> PASS

== Known issues ==

Here are the changes found in Patchwork_9716_full that come from known issues:

=== IGT changes ===

==== Issues hit ====

igt@drv_selftest@live_hangcheck:
  shard-kbl: PASS -> DMESG-FAIL (fdo#106947, fdo#106560)

igt@kms_flip@2x-flip-vs-expired-vblank-interruptible:
  shard-glk: PASS -> FAIL (fdo#105363)

==== Possible fixes ====

igt@kms_setmode@basic:
  shard-kbl: FAIL (fdo#99912) -> PASS

fdo#105363 https://bugs.freedesktop.org/show_bug.cgi?id=105363
fdo#106560 https://bugs.freedesktop.org/show_bug.cgi?id=106560
fdo#106947 https://bugs.freedesktop.org/show_bug.cgi?id=106947
fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912

== Participating hosts (5 -> 5) ==

No changes in participating hosts

== Build changes ==

* Linux: CI_DRM_4509 -> Patchwork_9716

CI_DRM_4509: e84aa0b47beed78a5a12db93e76fb00eab5db160 @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_4567: 7f85adc4050182f490c7a5c48db3d57cdb00af4e @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_9716: 6347bf8a6dc3e4205b7cf39cb93f3c5590270c11 @ git://anongit.freedesktop.org/gfx-ci/linux
piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9716/shards.html