From: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
To: "Chris Wilson" <chris@chris-wilson.co.uk>,
	intel-gfx@lists.freedesktop.org,
	"Antti Koskipää" <antti.koskipaa@linux.intel.com>,
	"Tvrtko Ursulin" <tvrtko.ursulin@intel.com>,
	stable@vger.kernel.org
Subject: Re: [PATCH] drm/i915: Exit cherryview_irq_handler() after one pass
Date: Thu, 10 Mar 2016 14:24:39 +0200
Message-ID: <20160310122439.GF10446@intel.com>
In-Reply-To: <20160310121046.GN1405@nuc-i3427.alporthouse.com>

On Thu, Mar 10, 2016 at 12:10:46PM +0000, Chris Wilson wrote:
> On Thu, Mar 10, 2016 at 02:01:27PM +0200, Ville Syrjälä wrote:
> > On Thu, Mar 10, 2016 at 11:44:28AM +0000, Chris Wilson wrote:
> > > This effectively reverts
> > > 
> > > commit 8e5fd599eb219f1054e39b40d18b217af669eea9
> > > Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > Date:   Wed Apr 9 13:28:50 2014 +0300
> > > 
> > >     drm/i915/chv: Make CHV irq handler loop until all interrupts are consumed
> > > 
> > > as under continuous execlists load we can saturate the IRQ handler,
> > > destabilising the tsc clock and triggering the NMI watchdog to declare a hung
> > > CPU.
> > > 
> > > [  552.756051] clocksource: timekeeping watchdog on CPU0: Marking clocksource 'tsc' as unstable because the skew is too large:
> > > [  552.756080] clocksource:                       'refined-jiffies' wd_now: 10003b480 wd_last: 10003b28c mask: ffffffff
> > > [  552.756091] clocksource:                       'tsc' cs_now: d55d31aa50 cs_last: d17446166c mask: ffffffffffffffff
> > > [  552.756210] clocksource: Switched to clocksource refined-jiffies
> > > [  575.217870] NMI watchdog: Watchdog detected hard LOCKUP on cpu 1
> > > [  575.217893] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.5.0-rc7+ #18
> > > [  575.217905] Hardware name:                  /NUC5CPYB, BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
> > > [  575.217915]  0000000000000000 ffff88027fd05bc0 ffffffff81288c6d 0000000000000000
> > > [  575.217935]  0000000000000001 ffff88027fd05be0 ffffffff810e72d1 0000000000000000
> > > [  575.217951]  ffff88027fd05c80 ffff88027fd05c20 ffffffff81114b60 0000000181015f1e
> > > [  575.217967] Call Trace:
> > > [  575.217973]  <NMI>  [<ffffffff81288c6d>] dump_stack+0x4f/0x72
> > > [  575.217994]  [<ffffffff810e72d1>] watchdog_overflow_callback+0x151/0x160
> > > [  575.218003]  [<ffffffff81114b60>] __perf_event_overflow+0xa0/0x1e0
> > > [  575.218016]  [<ffffffff811154c4>] perf_event_overflow+0x14/0x20
> > > [  575.218028]  [<ffffffff8101d2ca>] intel_pmu_handle_irq+0x1da/0x460
> > > [  575.218042]  [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
> > > [  575.218052]  [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
> > > [  575.218064]  [<ffffffff81014ae8>] perf_event_nmi_handler+0x28/0x50
> > > [  575.218075]  [<ffffffff81007540>] nmi_handle+0x60/0x130
> > > [  575.218086]  [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
> > > [  575.218096]  [<ffffffff810079c0>] do_nmi+0x140/0x470
> > > [  575.218108]  [<ffffffff81559ec7>] end_repeat_nmi+0x1a/0x1e
> > > [  575.218119]  [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
> > > [  575.218129]  [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
> > > [  575.218139]  [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
> > > [  575.218148]  <<EOE>>  [<ffffffff814a8353>] cpuidle_enter_state+0xf3/0x2f0
> > > [  575.218164]  [<ffffffff814a8587>] cpuidle_enter+0x17/0x20
> > > [  575.218175]  [<ffffffff810aaa3a>] call_cpuidle+0x2a/0x40
> > > [  575.218185]  [<ffffffff810aade3>] cpu_startup_entry+0x273/0x330
> > > [  575.218196]  [<ffffffff81033a1e>] start_secondary+0x10e/0x130
> > > 
> > > However, not servicing all available IIR within the handler does hurt the
> > > throughput of pathological nop execbuf by about 20%, with a similar effect
> > > upon the dispatch latency of a series of execbuf.
> > > 
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93467
> > > Testcase: igt/gem_exec_nop/basic # requires NMI watchdog
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > Cc: Antti Koskipää <antti.koskipaa@linux.intel.com>
> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > > Cc: stable@vger.kernel.org
> > > ---
> > >  drivers/gpu/drm/i915/i915_irq.c | 40 +++++++++++++++++++---------------------
> > >  1 file changed, 19 insertions(+), 21 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > > index 53e5104964b3..8a3230427884 100644
> > > --- a/drivers/gpu/drm/i915/i915_irq.c
> > > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > > @@ -1829,35 +1829,33 @@ static irqreturn_t cherryview_irq_handler(int irq, void *arg)
> > >  	/* IRQs are synced during runtime_suspend, we don't require a wakeref */
> > >  	disable_rpm_wakeref_asserts(dev_priv);
> > >  
> > > -	for (;;) {
> > > -		master_ctl = I915_READ(GEN8_MASTER_IRQ) & ~GEN8_MASTER_IRQ_CONTROL;
> > > -		iir = I915_READ(VLV_IIR);
> > > +	master_ctl = I915_READ(GEN8_MASTER_IRQ) & ~GEN8_MASTER_IRQ_CONTROL;
> > > +	iir = I915_READ(VLV_IIR);
> > >  
> > > -		if (master_ctl == 0 && iir == 0)
> > > -			break;
> > > +	if (master_ctl == 0 && iir == 0)
> > > +		break;
> > 
> > goto something?
> 
> Sigh. The problem of rewriting the "obvious" patch against -nightly. I
> just changed the for(;;) into do {} while(0) for testing. Perhaps I
> should stick with that in case we need to flip flop again.
> 
> > Apart from that I have no objections if it doesn't cause problems
> > with interrupts getting lost and whatnot. That was the original reason
> > for it I think, but at least I myself never really looked into it. IIRC
> > Rafael just told me they needed to do it to get the thing working, so
> > I just put the patch in. And that was before I had even seen any silicon.
> 
> My testing only looks at the GT side, and we do stress that pretty hard
> because of execlists and have reasonable methods of detection if we stop
> processing execbuf. I'm more worried about the display and pipe interrupts.

IIRC GT was where the problem was originally.

And just as a side note, I do have a branch somewhere that rewrites all
the gmch irq handlers to not loop. Just never actually found the time to
really run it on anything :) So I like moving towards that direction in
any case.

-- 
Ville Syrjälä
Intel OTC
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
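
The loop-versus-single-pass question discussed above can be sketched in
plain C. This is only an illustrative model, not the i915 code: struct
fake_dev, read_and_ack(), handler_loop() and handler_once() are
hypothetical names, and the "device" is a counter that posts a new
pending bit a fixed number of times while the handler runs. It shows why
a for (;;) handler can stay in the handler for as long as work keeps
arriving, and why wrapping the body in do { } while (0) keeps the
existing break statements valid after the loop is removed.

```c
#include <stdint.h>

/* Hypothetical stand-in for the hardware: one pending-interrupt register
 * that the "device" refills a fixed number of times while being serviced. */
struct fake_dev {
	uint32_t iir;      /* pending interrupt bits */
	int refills_left;  /* how many more times new work will arrive */
	int handled;       /* events serviced so far */
};

/* Read and acknowledge the pending bits. If something was pending and the
 * device still has work queued, it immediately posts a new pending bit. */
static uint32_t read_and_ack(struct fake_dev *dev)
{
	uint32_t iir = dev->iir;

	dev->iir = 0;
	if (iir && dev->refills_left > 0) {
		dev->refills_left--;
		dev->iir = 1;
	}
	return iir;
}

/* Looping variant: does not return until nothing is pending. */
static int handler_loop(struct fake_dev *dev)
{
	int passes = 0;

	for (;;) {
		uint32_t iir = read_and_ack(dev);

		if (iir == 0)
			break;
		dev->handled++;
		passes++;
	}
	return passes;
}

/* Single-pass variant: do { } while (0) runs the body once and keeps
 * "break" legal, so the body needs no other changes. */
static int handler_once(struct fake_dev *dev)
{
	int passes = 0;

	do {
		uint32_t iir = read_and_ack(dev);

		if (iir == 0)
			break;
		dev->handled++;
		passes++;
	} while (0);
	return passes;
}
```

In this model, with five refills queued, the looping handler makes six
passes before returning. The single-pass handler makes one and leaves
the next bit pending in dev->iir, relying on the interrupt being raised
again to service it.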

