public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
* [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits
@ 2015-03-23 17:11 Imre Deak
  2015-03-23 17:11 ` [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time Imre Deak
  2015-03-23 21:10 ` [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits Chris Wilson
  0 siblings, 2 replies; 8+ messages in thread
From: Imre Deak @ 2015-03-23 17:11 UTC (permalink / raw)
  To: intel-gfx

When disabling RPS interrupts there is a race where we disable RPS
inerrupts while the interrupt handler is running and the handler has
already latched the pending RPS interrupt from the master IIR register.
Afterwards the disabling path clears the PM IIR bits, making the state
of pending interrupts inconsistent from the interrupt handler's point of
view. This triggers the following warning: "The master control interrupt
lied (PM)!".

To fix this make sure that any running interrupt handler (which may
have already latched the master IIR) finishes before clearing the IIR
bits.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87347
Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/i915_irq.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 6d8340d..7afbde4 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -330,6 +330,13 @@ void gen6_disable_rps_interrupts(struct drm_device *dev)
 	__gen6_disable_pm_irq(dev_priv, dev_priv->pm_rps_events);
 	I915_WRITE(gen6_pm_ier(dev_priv), I915_READ(gen6_pm_ier(dev_priv)) &
 				~dev_priv->pm_rps_events);
+
+	spin_unlock_irq(&dev_priv->irq_lock);
+
+	synchronize_irq(dev->irq);
+
+	spin_lock_irq(&dev_priv->irq_lock);
+
 	I915_WRITE(gen6_pm_iir(dev_priv), dev_priv->pm_rps_events);
 	I915_WRITE(gen6_pm_iir(dev_priv), dev_priv->pm_rps_events);
 
-- 
2.1.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time
  2015-03-23 17:11 [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits Imre Deak
@ 2015-03-23 17:11 ` Imre Deak
  2015-03-24  9:14   ` Daniel Vetter
  2015-03-24 20:39   ` shuang.he
  2015-03-23 21:10 ` [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits Chris Wilson
  1 sibling, 2 replies; 8+ messages in thread
From: Imre Deak @ 2015-03-23 17:11 UTC (permalink / raw)
  To: intel-gfx

The logical place for clearing the RPS latched interrupt bits is when
resetting the RPS interrupts, so move the corresponding part from the RPS
disable function to the reset function. During resetting we already
cleared the IIR bits, so the only thing missing there was clearing pm_iir.

Note that we call gen6_disable_rps_interrupts() also during driver load
and resume time via intel_uncore_sanitize() when i915 interrupts are
still not installed. If there are any pending RPS bits at this point
(which after this patch wouldn't be cleared) they will be cleared by the
reset code via the interrupt preinstall hooks.

Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/i915_irq.c | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 7afbde4..14ecb4d 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -277,6 +277,7 @@ void gen6_reset_rps_interrupts(struct drm_device *dev)
 	I915_WRITE(reg, dev_priv->pm_rps_events);
 	I915_WRITE(reg, dev_priv->pm_rps_events);
 	POSTING_READ(reg);
+	dev_priv->rps.pm_iir = 0;
 	spin_unlock_irq(&dev_priv->irq_lock);
 }
 
@@ -334,15 +335,6 @@ void gen6_disable_rps_interrupts(struct drm_device *dev)
 	spin_unlock_irq(&dev_priv->irq_lock);
 
 	synchronize_irq(dev->irq);
-
-	spin_lock_irq(&dev_priv->irq_lock);
-
-	I915_WRITE(gen6_pm_iir(dev_priv), dev_priv->pm_rps_events);
-	I915_WRITE(gen6_pm_iir(dev_priv), dev_priv->pm_rps_events);
-
-	dev_priv->rps.pm_iir = 0;
-
-	spin_unlock_irq(&dev_priv->irq_lock);
 }
 
 /**
-- 
2.1.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits
  2015-03-23 17:11 [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits Imre Deak
  2015-03-23 17:11 ` [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time Imre Deak
@ 2015-03-23 21:10 ` Chris Wilson
  2015-03-24  9:14   ` Daniel Vetter
  1 sibling, 1 reply; 8+ messages in thread
From: Chris Wilson @ 2015-03-23 21:10 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

On Mon, Mar 23, 2015 at 07:11:34PM +0200, Imre Deak wrote:
> When disabling RPS interrupts there is a race where we disable RPS
> inerrupts while the interrupt handler is running and the handler has
> already latched the pending RPS interrupt from the master IIR register.
> Afterwards the disabling path clears the PM IIR bits, making the state
> of pending interrupts inconsistent from the interrupt handler's point of
> view. This triggers the following warning: "The master control interrupt
> lied (PM)!".
> 
> To fix this make sure that any running interrupt handler (which may
> have already latched the master IIR) finishes before clearing the IIR
> bits.
> 

Isn't this overkill for what is just a bogus WARN? If the WARN is a
logical consequence of the code, let's just remove the WARN.

Or iow can you not find a cheaper way to fix this?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits
  2015-03-23 21:10 ` [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits Chris Wilson
@ 2015-03-24  9:14   ` Daniel Vetter
  2015-03-24  9:24     ` Chris Wilson
  0 siblings, 1 reply; 8+ messages in thread
From: Daniel Vetter @ 2015-03-24  9:14 UTC (permalink / raw)
  To: Chris Wilson, Imre Deak, intel-gfx

On Mon, Mar 23, 2015 at 09:10:15PM +0000, Chris Wilson wrote:
> On Mon, Mar 23, 2015 at 07:11:34PM +0200, Imre Deak wrote:
> > When disabling RPS interrupts there is a race where we disable RPS
> > inerrupts while the interrupt handler is running and the handler has
> > already latched the pending RPS interrupt from the master IIR register.
> > Afterwards the disabling path clears the PM IIR bits, making the state
> > of pending interrupts inconsistent from the interrupt handler's point of
> > view. This triggers the following warning: "The master control interrupt
> > lied (PM)!".
> > 
> > To fix this make sure that any running interrupt handler (which may
> > have already latched the master IIR) finishes before clearing the IIR
> > bits.
> > 
> 
> Isn't this overkill for what is just a bogus WARN? If the WARN is a
> logical consequence of the code, let's just remove the WARN.
> 
> Or iow can you not find a cheaper way to fix this?

We only run this on suspend/resume afaik, overhead should be acceptable.
And we've had that much overhead before we've done all the runtime pm
unification, it's still less synchronization than disabling interrupts
completely.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time
  2015-03-23 17:11 ` [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time Imre Deak
@ 2015-03-24  9:14   ` Daniel Vetter
  2015-03-24 20:39   ` shuang.he
  1 sibling, 0 replies; 8+ messages in thread
From: Daniel Vetter @ 2015-03-24  9:14 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

On Mon, Mar 23, 2015 at 07:11:35PM +0200, Imre Deak wrote:
> The logical place for clearing the RPS latched interrupt bits is when
> resetting the RPS interrupts, so move the corresponding part from the RPS
> disable function to the reset function. During resetting we already
> cleared the IIR bits, so the only thing missing there was clearing pm_iir.
> 
> Note that we call gen6_disable_rps_interrupts() also during driver load
> and resume time via intel_uncore_sanitize() when i915 interrupts are
> still not installed. If there are any pending RPS bits at this point
> (which after this patch wouldn't be cleared) they will be cleared by the
> reset code via the interrupt preinstall hooks.
> 
> Signed-off-by: Imre Deak <imre.deak@intel.com>

Both merged to dinq, thanks.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits
  2015-03-24  9:14   ` Daniel Vetter
@ 2015-03-24  9:24     ` Chris Wilson
  2015-03-24 20:52       ` Imre Deak
  0 siblings, 1 reply; 8+ messages in thread
From: Chris Wilson @ 2015-03-24  9:24 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Tue, Mar 24, 2015 at 10:14:03AM +0100, Daniel Vetter wrote:
> On Mon, Mar 23, 2015 at 09:10:15PM +0000, Chris Wilson wrote:
> > On Mon, Mar 23, 2015 at 07:11:34PM +0200, Imre Deak wrote:
> > > When disabling RPS interrupts there is a race where we disable RPS
> > > inerrupts while the interrupt handler is running and the handler has
> > > already latched the pending RPS interrupt from the master IIR register.
> > > Afterwards the disabling path clears the PM IIR bits, making the state
> > > of pending interrupts inconsistent from the interrupt handler's point of
> > > view. This triggers the following warning: "The master control interrupt
> > > lied (PM)!".
> > > 
> > > To fix this make sure that any running interrupt handler (which may
> > > have already latched the master IIR) finishes before clearing the IIR
> > > bits.
> > > 
> > 
> > Isn't this overkill for what is just a bogus WARN? If the WARN is a
> > logical consequence of the code, let's just remove the WARN.
> > 
> > Or iow can you not find a cheaper way to fix this?
> 
> We only run this on suspend/resume afaik, overhead should be acceptable.
> And we've had that much overhead before we've done all the runtime pm
> unification, it's still less synchronization than disabling interrupts
> completely.

Hmm, I thought this was in conjunction with RPS pm masking (i.e. fired
everytime we no longer expect to receive RPS interrupts). If it is only
the infrequent, then yeah I can't complain too much - I still think it
is slightly fishy, but I can accept that it is just a quirk of the
buffering the interrupt does.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time
  2015-03-23 17:11 ` [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time Imre Deak
  2015-03-24  9:14   ` Daniel Vetter
@ 2015-03-24 20:39   ` shuang.he
  1 sibling, 0 replies; 8+ messages in thread
From: shuang.he @ 2015-03-24 20:39 UTC (permalink / raw)
  To: shuang.he, ethan.gao, intel-gfx, imre.deak

Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 6030
-------------------------------------Summary-------------------------------------
Platform          Delta          drm-intel-nightly          Series Applied
PNV                 -1              275/275              274/275
ILK                                  303/303              303/303
SNB                                  304/304              304/304
IVB                                  339/339              339/339
BYT                                  287/287              287/287
HSW                                  361/361              361/361
BDW                                  310/310              310/310
-------------------------------------Detailed-------------------------------------
Platform  Test                                drm-intel-nightly          Series Applied
*PNV  igt@gem_userptr_blits@minor-sync-interruptible      PASS(2)      DMESG_WARN(1)PASS(1)
Note: You need to pay more attention to line start with '*'
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits
  2015-03-24  9:24     ` Chris Wilson
@ 2015-03-24 20:52       ` Imre Deak
  0 siblings, 0 replies; 8+ messages in thread
From: Imre Deak @ 2015-03-24 20:52 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Tue, 2015-03-24 at 09:24 +0000, Chris Wilson wrote:
> On Tue, Mar 24, 2015 at 10:14:03AM +0100, Daniel Vetter wrote:
> > On Mon, Mar 23, 2015 at 09:10:15PM +0000, Chris Wilson wrote:
> > > On Mon, Mar 23, 2015 at 07:11:34PM +0200, Imre Deak wrote:
> > > > When disabling RPS interrupts there is a race where we disable RPS
> > > > inerrupts while the interrupt handler is running and the handler has
> > > > already latched the pending RPS interrupt from the master IIR register.
> > > > Afterwards the disabling path clears the PM IIR bits, making the state
> > > > of pending interrupts inconsistent from the interrupt handler's point of
> > > > view. This triggers the following warning: "The master control interrupt
> > > > lied (PM)!".
> > > > 
> > > > To fix this make sure that any running interrupt handler (which may
> > > > have already latched the master IIR) finishes before clearing the IIR
> > > > bits.
> > > > 
> > > 
> > > Isn't this overkill for what is just a bogus WARN? If the WARN is a
> > > logical consequence of the code, let's just remove the WARN.
> > > 
> > > Or iow can you not find a cheaper way to fix this?
> > 
> > We only run this on suspend/resume afaik, overhead should be acceptable.
> > And we've had that much overhead before we've done all the runtime pm
> > unification, it's still less synchronization than disabling interrupts
> > completely.
> 
> Hmm, I thought this was in conjunction with RPS pm masking (i.e. fired
> everytime we no longer expect to receive RPS interrupts).

Yes, it affects only the driver loading (where it's a nop) and the
runtime/system suspend path.

> If it is only the infrequent, then yeah I can't complain too much - I
> still think it is slightly fishy, but I can accept that it is just a
> quirk of the buffering the interrupt does.

The reason for sync_irq is to prevent any subsequent
gen6_enable_rps_interrupts() call to trigger spurious interrupts.
Without it we could have:

CPU 0                               CPU 1
gen8_gt_irq_handler()
tmp = I915_READ(GEN8_GT_IIR(2))
<tmp has pending RPS bits>
                                    gen6_disable_rps_interrupts()
                                    gen6_reset_rps_interrupts()
                                    <clear GEN8_GT_IIR(2)>
                                    gen6_enable_rps_interrupts()
                                 
gen6_rps_irq_handler(tmp)
<handle the now stale RPS events>

Now I admit the above is quite unlikely, or even impossible due to what
happens between gen6_disable_rps_interrupts() and
gen6_enable_rps_interrupts() in our code atm. I still thought it's safer
not to rely on those.

One alternative would have been to extend the irq_lock in the interrupt
handler, so that we read GEN8_GT_IIR(2) and handle any events in it in
one critical section, but that would add the overhead where it actually
matters.

--Imre


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2015-03-24 20:52 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-03-23 17:11 [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits Imre Deak
2015-03-23 17:11 ` [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time Imre Deak
2015-03-24  9:14   ` Daniel Vetter
2015-03-24 20:39   ` shuang.he
2015-03-23 21:10 ` [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits Chris Wilson
2015-03-24  9:14   ` Daniel Vetter
2015-03-24  9:24     ` Chris Wilson
2015-03-24 20:52       ` Imre Deak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox