* [PATCH] drm/i915: Always double check for a missed interrupt for new bottom halves
From: Chris Wilson @ 2016-07-05 11:29 UTC (permalink / raw)
To: intel-gfx
After assigning ourselves as the new bottom-half, we must perform a
cursory check to prevent a missed interrupt. Either we miss the interrupt
whilst programming the hardware, or if there was a previous waiter (for
a later seqno) they may be woken instead of us (due to the inherent race
in the unlocked read of b->tasklet in the irq handler) and so we miss the
wake up.
Spotted-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_breadcrumbs.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index 009d6e1..6fcbb52 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -65,7 +65,7 @@ static void irq_disable(struct intel_engine_cs *engine)
 	engine->irq_posted = false;
 }
 
-static bool __intel_breadcrumbs_enable_irq(struct intel_breadcrumbs *b)
+static void __intel_breadcrumbs_enable_irq(struct intel_breadcrumbs *b)
 {
 	struct intel_engine_cs *engine =
 		container_of(b, struct intel_engine_cs, breadcrumbs);
@@ -73,7 +73,7 @@ static bool __intel_breadcrumbs_enable_irq(struct intel_breadcrumbs *b)
 
 	assert_spin_locked(&b->lock);
 	if (b->rpm_wakelock)
-		return false;
+		return;
 
 	/* Since we are waiting on a request, the GPU should be busy
 	 * and should have its own rpm reference. For completeness,
@@ -93,8 +93,6 @@ static bool __intel_breadcrumbs_enable_irq(struct intel_breadcrumbs *b)
 	if (!b->irq_enabled ||
 	    test_bit(engine->id, &i915->gpu_error.missed_irq_rings))
 		mod_timer(&b->fake_irq, jiffies + 1);
-
-	return engine->irq_posted;
 }
 
 static void __intel_breadcrumbs_disable_irq(struct intel_breadcrumbs *b)
@@ -233,7 +231,15 @@ static bool __intel_engine_add_wait(struct intel_engine_cs *engine,
 		GEM_BUG_ON(rb_first(&b->waiters) != &wait->node);
 		b->first_wait = wait;
 		smp_store_mb(b->tasklet, wait->tsk);
-		first = __intel_breadcrumbs_enable_irq(b);
+		/* After assigning ourselves as the new bottom-half, we must
+		 * perform a cursory check to prevent a missed interrupt.
+		 * Either we miss the interrupt whilst programming the hardware,
+		 * or if there was a previous waiter (for a later seqno) they
+		 * may be woken instead of us (due to the inherent race
+		 * in the unlocked read of b->tasklet in the irq handler) and
+		 * so we miss the wake up.
+		 */
+		__intel_breadcrumbs_enable_irq(b);
 	}
 	GEM_BUG_ON(!b->tasklet);
 	GEM_BUG_ON(!b->first_wait);
--
2.8.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* Re: [PATCH] drm/i915: Always double check for a missed interrupt for new bottom halves
From: Chris Wilson @ 2016-07-05 11:55 UTC (permalink / raw)
To: intel-gfx
On Tue, Jul 05, 2016 at 12:29:05PM +0100, Chris Wilson wrote:
> After assigning ourselves as the new bottom-half, we must perform a
> cursory check to prevent a missed interrupt. Either we miss the interrupt
> whilst programming the hardware, or if there was a previous waiter (for
> a later seqno) they may be woken instead of us (due to the inherent race
> in the unlocked read of b->tasklet in the irq handler) and so we miss the
> wake up.
>
> Spotted-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 688e6c725816 ("drm/i915: Slaughter the thundering... herd")
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* ✗ Ro.CI.BAT: warning for drm/i915: Always double check for a missed interrupt for new bottom halves
From: Patchwork @ 2016-07-05 12:02 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Always double check for a missed interrupt for new bottom halves
URL : https://patchwork.freedesktop.org/series/9514/
State : warning
== Summary ==
Series 9514v1 drm/i915: Always double check for a missed interrupt for new bottom halves
http://patchwork.freedesktop.org/api/1.0/series/9514/revisions/1/mbox
Test drv_module_reload_basic:
                dmesg-warn -> PASS       (ro-bdw-i7-5600u)
Test gem_exec_flush:
        Subgroup basic-batch-kernel-default-uc:
                dmesg-fail -> PASS       (fi-skl-i5-6260u)
                dmesg-fail -> PASS       (fi-skl-i7-6700k)
        Subgroup basic-batch-kernel-default-wb:
                dmesg-fail -> PASS       (ro-skl3-i5-6260u)
Test kms_pipe_crc_basic:
        Subgroup nonblocking-crc-pipe-b-frame-sequence:
                skip       -> PASS       (fi-skl-i5-6260u)
        Subgroup read-crc-pipe-b:
                pass       -> SKIP       (fi-skl-i5-6260u)
fi-kbl-qkkr total:234 pass:162 dwarn:27 dfail:2 fail:2 skip:41
fi-skl-i5-6260u total:234 pass:205 dwarn:0 dfail:0 fail:2 skip:27
fi-skl-i7-6700k total:234 pass:192 dwarn:0 dfail:0 fail:2 skip:40
fi-snb-i7-2600 total:234 pass:178 dwarn:0 dfail:0 fail:2 skip:54
ro-bdw-i5-5250u total:229 pass:204 dwarn:1 dfail:1 fail:0 skip:23
ro-bdw-i7-5557U total:229 pass:204 dwarn:1 dfail:1 fail:0 skip:23
ro-bdw-i7-5600u total:229 pass:190 dwarn:0 dfail:1 fail:0 skip:38
ro-bsw-n3050 total:229 pass:177 dwarn:0 dfail:1 fail:2 skip:49
ro-byt-n2820 total:229 pass:180 dwarn:0 dfail:1 fail:3 skip:45
ro-hsw-i3-4010u total:229 pass:197 dwarn:0 dfail:1 fail:0 skip:31
ro-hsw-i7-4770r total:229 pass:197 dwarn:0 dfail:1 fail:0 skip:31
ro-ilk-i7-620lm total:229 pass:157 dwarn:0 dfail:1 fail:1 skip:70
ro-ilk1-i5-650 total:224 pass:157 dwarn:0 dfail:1 fail:1 skip:65
ro-ivb-i7-3770 total:229 pass:188 dwarn:0 dfail:1 fail:0 skip:40
ro-skl3-i5-6260u total:229 pass:208 dwarn:1 dfail:1 fail:0 skip:19
ro-snb-i7-2620M total:229 pass:179 dwarn:0 dfail:1 fail:1 skip:48
Results at /archive/results/CI_IGT_test/RO_Patchwork_1415/
4b6e8fb drm-intel-nightly: 2016y-07m-05d-10h-28m-45s UTC integration manifest
a919aeb drm/i915: Always double check for a missed interrupt for new bottom halves
* Re: [PATCH] drm/i915: Always double check for a missed interrupt for new bottom halves
From: Tvrtko Ursulin @ 2016-07-05 13:38 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 05/07/16 12:29, Chris Wilson wrote:
> After assigning ourselves as the new bottom-half, we must perform a
> cursory check to prevent a missed interrupt. Either we miss the interrupt
> whilst programming the hardware, or if there was a previous waiter (for
> a later seqno) they may be woken instead of us (due to the inherent race
> in the unlocked read of b->tasklet in the irq handler) and so we miss the
> wake up.
>
> Spotted-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_breadcrumbs.c | 16 +++++++++++-----
>  1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> index 009d6e1..6fcbb52 100644
> --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> @@ -65,7 +65,7 @@ static void irq_disable(struct intel_engine_cs *engine)
>  	engine->irq_posted = false;
>  }
>
> -static bool __intel_breadcrumbs_enable_irq(struct intel_breadcrumbs *b)
> +static void __intel_breadcrumbs_enable_irq(struct intel_breadcrumbs *b)
>  {
>  	struct intel_engine_cs *engine =
>  		container_of(b, struct intel_engine_cs, breadcrumbs);
> @@ -73,7 +73,7 @@ static bool __intel_breadcrumbs_enable_irq(struct intel_breadcrumbs *b)
>
>  	assert_spin_locked(&b->lock);
>  	if (b->rpm_wakelock)
> -		return false;
> +		return;
>
>  	/* Since we are waiting on a request, the GPU should be busy
>  	 * and should have its own rpm reference. For completeness,
> @@ -93,8 +93,6 @@ static bool __intel_breadcrumbs_enable_irq(struct intel_breadcrumbs *b)
>  	if (!b->irq_enabled ||
>  	    test_bit(engine->id, &i915->gpu_error.missed_irq_rings))
>  		mod_timer(&b->fake_irq, jiffies + 1);
> -
> -	return engine->irq_posted;
>  }
>
>  static void __intel_breadcrumbs_disable_irq(struct intel_breadcrumbs *b)
> @@ -233,7 +231,15 @@ static bool __intel_engine_add_wait(struct intel_engine_cs *engine,
>  		GEM_BUG_ON(rb_first(&b->waiters) != &wait->node);
>  		b->first_wait = wait;
>  		smp_store_mb(b->tasklet, wait->tsk);
> -		first = __intel_breadcrumbs_enable_irq(b);
> +		/* After assigning ourselves as the new bottom-half, we must
> +		 * perform a cursory check to prevent a missed interrupt.
> +		 * Either we miss the interrupt whilst programming the hardware,
> +		 * or if there was a previous waiter (for a later seqno) they
> +		 * may be woken instead of us (due to the inherent race
> +		 * in the unlocked read of b->tasklet in the irq handler) and
> +		 * so we miss the wake up.
> +		 */
> +		__intel_breadcrumbs_enable_irq(b);
>  	}
>  	GEM_BUG_ON(!b->tasklet);
>  	GEM_BUG_ON(!b->first_wait);
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Regards,
Tvrtko
* Re: [PATCH] drm/i915: Always double check for a missed interrupt for new bottom halves
From: Chris Wilson @ 2016-07-05 15:04 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: intel-gfx
On Tue, Jul 05, 2016 at 02:38:55PM +0100, Tvrtko Ursulin wrote:
> On 05/07/16 12:29, Chris Wilson wrote:
> >@@ -233,7 +231,15 @@ static bool __intel_engine_add_wait(struct intel_engine_cs *engine,
> > 		GEM_BUG_ON(rb_first(&b->waiters) != &wait->node);
> > 		b->first_wait = wait;
> > 		smp_store_mb(b->tasklet, wait->tsk);
> >-		first = __intel_breadcrumbs_enable_irq(b);
> >+		/* After assigning ourselves as the new bottom-half, we must
> >+		 * perform a cursory check to prevent a missed interrupt.
> >+		 * Either we miss the interrupt whilst programming the hardware,
> >+		 * or if there was a previous waiter (for a later seqno) they
> >+		 * may be woken instead of us (due to the inherent race
> >+		 * in the unlocked read of b->tasklet in the irq handler) and
> >+		 * so we miss the wake up.
> >+		 */
> >+		__intel_breadcrumbs_enable_irq(b);
> > 	}
> > 	GEM_BUG_ON(!b->tasklet);
> > 	GEM_BUG_ON(!b->first_wait);
>
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Even knowing the nature of the bug, I've found it very hard to hit. I've
written a test case that at the very least should exercise the multiple
waiters case, but making it hit the window where we swap the bottom
halves is a nigh-on impossible task (yet CI managed to hit it
almost consistently!).
I'm just thankful we do have some GEM tests in CI that did manage to hit
it.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre