* [PATCH] drm/i915: Always double check for a missed interrupt for new bottom halves
From: Chris Wilson @ 2016-07-05 11:29 UTC
  To: intel-gfx

After assigning ourselves as the new bottom-half, we must perform a
cursory check to prevent a missed interrupt.  Either we miss the interrupt
whilst programming the hardware, or if there was a previous waiter (for
a later seqno) they may be woken instead of us (due to the inherent race
in the unlocked read of b->tasklet in the irq handler) and so we miss the
wake up.

Spotted-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_breadcrumbs.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index 009d6e1..6fcbb52 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -65,7 +65,7 @@ static void irq_disable(struct intel_engine_cs *engine)
 	engine->irq_posted = false;
 }
 
-static bool __intel_breadcrumbs_enable_irq(struct intel_breadcrumbs *b)
+static void __intel_breadcrumbs_enable_irq(struct intel_breadcrumbs *b)
 {
 	struct intel_engine_cs *engine =
 		container_of(b, struct intel_engine_cs, breadcrumbs);
@@ -73,7 +73,7 @@ static bool __intel_breadcrumbs_enable_irq(struct intel_breadcrumbs *b)
 
 	assert_spin_locked(&b->lock);
 	if (b->rpm_wakelock)
-		return false;
+		return;
 
 	/* Since we are waiting on a request, the GPU should be busy
 	 * and should have its own rpm reference. For completeness,
@@ -93,8 +93,6 @@ static bool __intel_breadcrumbs_enable_irq(struct intel_breadcrumbs *b)
 	if (!b->irq_enabled ||
 	    test_bit(engine->id, &i915->gpu_error.missed_irq_rings))
 		mod_timer(&b->fake_irq, jiffies + 1);
-
-	return engine->irq_posted;
 }
 
 static void __intel_breadcrumbs_disable_irq(struct intel_breadcrumbs *b)
@@ -233,7 +231,15 @@ static bool __intel_engine_add_wait(struct intel_engine_cs *engine,
 		GEM_BUG_ON(rb_first(&b->waiters) != &wait->node);
 		b->first_wait = wait;
 		smp_store_mb(b->tasklet, wait->tsk);
-		first = __intel_breadcrumbs_enable_irq(b);
+		/* After assigning ourselves as the new bottom-half, we must
+		 * perform a cursory check to prevent a missed interrupt.
+		 * Either we miss the interrupt whilst programming the hardware,
+		 * or if there was a previous waiter (for a later seqno) they
+		 * may be woken instead of us (due to the inherent race
+		 * in the unlocked read of b->tasklet in the irq handler) and
+		 * so we miss the wake up.
+		 */
+		__intel_breadcrumbs_enable_irq(b);
 	}
 	GEM_BUG_ON(!b->tasklet);
 	GEM_BUG_ON(!b->first_wait);
-- 
2.8.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH] drm/i915: Always double check for a missed interrupt for new bottom halves
From: Chris Wilson @ 2016-07-05 11:55 UTC
  To: intel-gfx

On Tue, Jul 05, 2016 at 12:29:05PM +0100, Chris Wilson wrote:
> After assigning ourselves as the new bottom-half, we must perform a
> cursory check to prevent a missed interrupt.  Either we miss the interrupt
> whilst programming the hardware, or if there was a previous waiter (for
> a later seqno) they may be woken instead of us (due to the inherent race
> in the unlocked read of b->tasklet in the irq handler) and so we miss the
> wake up.
> 
> Spotted-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 688e6c725816 ("drm/i915: Slaughter the thundering... herd")
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* ✗ Ro.CI.BAT: warning for drm/i915: Always double check for a missed interrupt for new bottom halves
From: Patchwork @ 2016-07-05 12:02 UTC
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Always double check for a missed interrupt for new bottom halves
URL   : https://patchwork.freedesktop.org/series/9514/
State : warning

== Summary ==

Series 9514v1 drm/i915: Always double check for a missed interrupt for new bottom halves
http://patchwork.freedesktop.org/api/1.0/series/9514/revisions/1/mbox

Test drv_module_reload_basic:
                dmesg-warn -> PASS       (ro-bdw-i7-5600u)
Test gem_exec_flush:
        Subgroup basic-batch-kernel-default-uc:
                dmesg-fail -> PASS       (fi-skl-i5-6260u)
                dmesg-fail -> PASS       (fi-skl-i7-6700k)
        Subgroup basic-batch-kernel-default-wb:
                dmesg-fail -> PASS       (ro-skl3-i5-6260u)
Test kms_pipe_crc_basic:
        Subgroup nonblocking-crc-pipe-b-frame-sequence:
                skip       -> PASS       (fi-skl-i5-6260u)
        Subgroup read-crc-pipe-b:
                pass       -> SKIP       (fi-skl-i5-6260u)

fi-kbl-qkkr      total:234  pass:162  dwarn:27  dfail:2   fail:2   skip:41 
fi-skl-i5-6260u  total:234  pass:205  dwarn:0   dfail:0   fail:2   skip:27 
fi-skl-i7-6700k  total:234  pass:192  dwarn:0   dfail:0   fail:2   skip:40 
fi-snb-i7-2600   total:234  pass:178  dwarn:0   dfail:0   fail:2   skip:54 
ro-bdw-i5-5250u  total:229  pass:204  dwarn:1   dfail:1   fail:0   skip:23 
ro-bdw-i7-5557U  total:229  pass:204  dwarn:1   dfail:1   fail:0   skip:23 
ro-bdw-i7-5600u  total:229  pass:190  dwarn:0   dfail:1   fail:0   skip:38 
ro-bsw-n3050     total:229  pass:177  dwarn:0   dfail:1   fail:2   skip:49 
ro-byt-n2820     total:229  pass:180  dwarn:0   dfail:1   fail:3   skip:45 
ro-hsw-i3-4010u  total:229  pass:197  dwarn:0   dfail:1   fail:0   skip:31 
ro-hsw-i7-4770r  total:229  pass:197  dwarn:0   dfail:1   fail:0   skip:31 
ro-ilk-i7-620lm  total:229  pass:157  dwarn:0   dfail:1   fail:1   skip:70 
ro-ilk1-i5-650   total:224  pass:157  dwarn:0   dfail:1   fail:1   skip:65 
ro-ivb-i7-3770   total:229  pass:188  dwarn:0   dfail:1   fail:0   skip:40 
ro-skl3-i5-6260u total:229  pass:208  dwarn:1   dfail:1   fail:0   skip:19 
ro-snb-i7-2620M  total:229  pass:179  dwarn:0   dfail:1   fail:1   skip:48 

Results at /archive/results/CI_IGT_test/RO_Patchwork_1415/

4b6e8fb drm-intel-nightly: 2016y-07m-05d-10h-28m-45s UTC integration manifest
a919aeb drm/i915: Always double check for a missed interrupt for new bottom halves


* Re: [PATCH] drm/i915: Always double check for a missed interrupt for new bottom halves
From: Tvrtko Ursulin @ 2016-07-05 13:38 UTC
  To: Chris Wilson, intel-gfx


On 05/07/16 12:29, Chris Wilson wrote:
> After assigning ourselves as the new bottom-half, we must perform a
> cursory check to prevent a missed interrupt.  Either we miss the interrupt
> whilst programming the hardware, or if there was a previous waiter (for
> a later seqno) they may be woken instead of us (due to the inherent race
> in the unlocked read of b->tasklet in the irq handler) and so we miss the
> wake up.
>
> Spotted-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/intel_breadcrumbs.c | 16 +++++++++++-----
>   1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> index 009d6e1..6fcbb52 100644
> --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> @@ -65,7 +65,7 @@ static void irq_disable(struct intel_engine_cs *engine)
>   	engine->irq_posted = false;
>   }
>
> -static bool __intel_breadcrumbs_enable_irq(struct intel_breadcrumbs *b)
> +static void __intel_breadcrumbs_enable_irq(struct intel_breadcrumbs *b)
>   {
>   	struct intel_engine_cs *engine =
>   		container_of(b, struct intel_engine_cs, breadcrumbs);
> @@ -73,7 +73,7 @@ static bool __intel_breadcrumbs_enable_irq(struct intel_breadcrumbs *b)
>
>   	assert_spin_locked(&b->lock);
>   	if (b->rpm_wakelock)
> -		return false;
> +		return;
>
>   	/* Since we are waiting on a request, the GPU should be busy
>   	 * and should have its own rpm reference. For completeness,
> @@ -93,8 +93,6 @@ static bool __intel_breadcrumbs_enable_irq(struct intel_breadcrumbs *b)
>   	if (!b->irq_enabled ||
>   	    test_bit(engine->id, &i915->gpu_error.missed_irq_rings))
>   		mod_timer(&b->fake_irq, jiffies + 1);
> -
> -	return engine->irq_posted;
>   }
>
>   static void __intel_breadcrumbs_disable_irq(struct intel_breadcrumbs *b)
> @@ -233,7 +231,15 @@ static bool __intel_engine_add_wait(struct intel_engine_cs *engine,
>   		GEM_BUG_ON(rb_first(&b->waiters) != &wait->node);
>   		b->first_wait = wait;
>   		smp_store_mb(b->tasklet, wait->tsk);
> -		first = __intel_breadcrumbs_enable_irq(b);
> +		/* After assigning ourselves as the new bottom-half, we must
> +		 * perform a cursory check to prevent a missed interrupt.
> +		 * Either we miss the interrupt whilst programming the hardware,
> +		 * or if there was a previous waiter (for a later seqno) they
> +		 * may be woken instead of us (due to the inherent race
> +		 * in the unlocked read of b->tasklet in the irq handler) and
> +		 * so we miss the wake up.
> +		 */
> +		__intel_breadcrumbs_enable_irq(b);
>   	}
>   	GEM_BUG_ON(!b->tasklet);
>   	GEM_BUG_ON(!b->first_wait);

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko


* Re: [PATCH] drm/i915: Always double check for a missed interrupt for new bottom halves
From: Chris Wilson @ 2016-07-05 15:04 UTC
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Tue, Jul 05, 2016 at 02:38:55PM +0100, Tvrtko Ursulin wrote:
> On 05/07/16 12:29, Chris Wilson wrote:
> >@@ -233,7 +231,15 @@ static bool __intel_engine_add_wait(struct intel_engine_cs *engine,
> >  		GEM_BUG_ON(rb_first(&b->waiters) != &wait->node);
> >  		b->first_wait = wait;
> >  		smp_store_mb(b->tasklet, wait->tsk);
> >-		first = __intel_breadcrumbs_enable_irq(b);
> >+		/* After assigning ourselves as the new bottom-half, we must
> >+		 * perform a cursory check to prevent a missed interrupt.
> >+		 * Either we miss the interrupt whilst programming the hardware,
> >+		 * or if there was a previous waiter (for a later seqno) they
> >+		 * may be woken instead of us (due to the inherent race
> >+		 * in the unlocked read of b->tasklet in the irq handler) and
> >+		 * so we miss the wake up.
> >+		 */
> >+		__intel_breadcrumbs_enable_irq(b);
> >  	}
> >  	GEM_BUG_ON(!b->tasklet);
> >  	GEM_BUG_ON(!b->first_wait);
> 
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Even knowing the nature of the bug, I've found it very hard to hit. I've
written a test case that should, at the very least, exercise the
multiple-waiters case, but making it hit the window where we swap the bottom
halves is a nigh-on impossible task (yet CI managed to hit it
almost consistently!).
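
For illustration only, the window can be modelled deterministically in
userspace by splitting the interrupt handler into its two steps (the
unlocked read of the bottom-half, then the wake up) and slotting the new
waiter in between. This is a rough sketch and not the i915 code: struct
waiter, struct breadcrumbs, irq_sample(), irq_wake(), gpu_complete(),
add_wait() and the double_check flag are all hypothetical names, and a
C11 seq_cst atomic store stands in for smp_store_mb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are the real i915 types. */
struct waiter {
	uint32_t seqno;	/* seqno this waiter wants to see completed */
	bool woken;	/* set once the "irq handler" wakes this waiter */
};

struct breadcrumbs {
	_Atomic(struct waiter *) bottom_half;	/* plays the role of b->tasklet */
	_Atomic uint32_t hw_seqno;		/* last seqno the "GPU" completed */
};

static void breadcrumbs_init(struct breadcrumbs *b)
{
	atomic_init(&b->bottom_half, (struct waiter *)NULL);
	atomic_init(&b->hw_seqno, 0);
}

/* Wrap-safe test for "seqno a has reached or passed seqno b". */
static bool seqno_passed(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/* The "GPU" advances its breadcrumb. */
static void gpu_complete(struct breadcrumbs *b, uint32_t seqno)
{
	atomic_store(&b->hw_seqno, seqno);
}

/* Step 1 of the irq handler: the unlocked read of the bottom-half. */
static struct waiter *irq_sample(struct breadcrumbs *b)
{
	return atomic_load(&b->bottom_half);
}

/* Step 2 of the irq handler: wake whoever step 1 happened to see. */
static void irq_wake(struct waiter *w)
{
	if (w)
		w->woken = true;
}

/*
 * Install w as the new bottom-half. Returns true if the caller must go
 * to sleep, false if its seqno is already complete. With double_check
 * false, this models a waiter that trusts the wake up alone.
 */
static bool add_wait(struct breadcrumbs *b, struct waiter *w, bool double_check)
{
	assert(w != NULL);

	/* seq_cst store: publish ourselves before re-reading the seqno. */
	atomic_store(&b->bottom_half, w);

	/* The cursory check: did the event land before we were visible? */
	if (double_check && seqno_passed(atomic_load(&b->hw_seqno), w->seqno))
		return false;

	return true;
}
```

Driving it in the order "old waiter (later seqno) installed, GPU
completes the earlier seqno, irq_sample(), add_wait() for the new
waiter, irq_wake()" wakes the old waiter rather than the new one. In
this model the new waiter only notices completion because add_wait()
re-reads the seqno after publishing itself; with double_check false it
is told to sleep despite its seqno having already passed.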

I'm just thankful we do have some GEM tests in CI that did manage to hit
it.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre