* [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits
@ 2015-03-23 17:11 Imre Deak
2015-03-23 17:11 ` [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time Imre Deak
2015-03-23 21:10 ` [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits Chris Wilson
0 siblings, 2 replies; 8+ messages in thread
From: Imre Deak @ 2015-03-23 17:11 UTC (permalink / raw)
To: intel-gfx
When disabling RPS interrupts there is a race where we disable RPS
inerrupts while the interrupt handler is running and the handler has
already latched the pending RPS interrupt from the master IIR register.
Afterwards the disabling path clears the PM IIR bits, making the state
of pending interrupts inconsistent from the interrupt handler's point of
view. This triggers the following warning: "The master control interrupt
lied (PM)!".
To fix this make sure that any running interrupt handler (which may
have already latched the master IIR) finishes before clearing the IIR
bits.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87347
Signed-off-by: Imre Deak <imre.deak@intel.com>
---
drivers/gpu/drm/i915/i915_irq.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 6d8340d..7afbde4 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -330,6 +330,13 @@ void gen6_disable_rps_interrupts(struct drm_device *dev)
__gen6_disable_pm_irq(dev_priv, dev_priv->pm_rps_events);
I915_WRITE(gen6_pm_ier(dev_priv), I915_READ(gen6_pm_ier(dev_priv)) &
~dev_priv->pm_rps_events);
+
+ spin_unlock_irq(&dev_priv->irq_lock);
+
+ synchronize_irq(dev->irq);
+
+ spin_lock_irq(&dev_priv->irq_lock);
+
I915_WRITE(gen6_pm_iir(dev_priv), dev_priv->pm_rps_events);
I915_WRITE(gen6_pm_iir(dev_priv), dev_priv->pm_rps_events);
--
2.1.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time
2015-03-23 17:11 [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits Imre Deak
@ 2015-03-23 17:11 ` Imre Deak
2015-03-24 9:14 ` Daniel Vetter
2015-03-24 20:39 ` shuang.he
2015-03-23 21:10 ` [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits Chris Wilson
1 sibling, 2 replies; 8+ messages in thread
From: Imre Deak @ 2015-03-23 17:11 UTC (permalink / raw)
To: intel-gfx
The logical place for clearing the RPS latched interrupt bits is when
resetting the RPS interrupts, so move the corresponding part from the RPS
disable function to the reset function. During resetting we already
cleared the IIR bits, so the only thing missing there was clearing pm_iir.
Note that we call gen6_disable_rps_interrupts() also during driver load
and resume time via intel_uncore_sanitize() when i915 interrupts are
still not installed. If there are any pending RPS bits at this point
(which after this patch wouldn't be cleared) they will be cleared by the
reset code via the interrupt preinstall hooks.
Signed-off-by: Imre Deak <imre.deak@intel.com>
---
drivers/gpu/drm/i915/i915_irq.c | 10 +---------
1 file changed, 1 insertion(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 7afbde4..14ecb4d 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -277,6 +277,7 @@ void gen6_reset_rps_interrupts(struct drm_device *dev)
I915_WRITE(reg, dev_priv->pm_rps_events);
I915_WRITE(reg, dev_priv->pm_rps_events);
POSTING_READ(reg);
+ dev_priv->rps.pm_iir = 0;
spin_unlock_irq(&dev_priv->irq_lock);
}
@@ -334,15 +335,6 @@ void gen6_disable_rps_interrupts(struct drm_device *dev)
spin_unlock_irq(&dev_priv->irq_lock);
synchronize_irq(dev->irq);
-
- spin_lock_irq(&dev_priv->irq_lock);
-
- I915_WRITE(gen6_pm_iir(dev_priv), dev_priv->pm_rps_events);
- I915_WRITE(gen6_pm_iir(dev_priv), dev_priv->pm_rps_events);
-
- dev_priv->rps.pm_iir = 0;
-
- spin_unlock_irq(&dev_priv->irq_lock);
}
/**
--
2.1.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits
2015-03-23 17:11 [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits Imre Deak
2015-03-23 17:11 ` [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time Imre Deak
@ 2015-03-23 21:10 ` Chris Wilson
2015-03-24 9:14 ` Daniel Vetter
1 sibling, 1 reply; 8+ messages in thread
From: Chris Wilson @ 2015-03-23 21:10 UTC (permalink / raw)
To: Imre Deak; +Cc: intel-gfx
On Mon, Mar 23, 2015 at 07:11:34PM +0200, Imre Deak wrote:
> When disabling RPS interrupts there is a race where we disable RPS
> inerrupts while the interrupt handler is running and the handler has
> already latched the pending RPS interrupt from the master IIR register.
> Afterwards the disabling path clears the PM IIR bits, making the state
> of pending interrupts inconsistent from the interrupt handler's point of
> view. This triggers the following warning: "The master control interrupt
> lied (PM)!".
>
> To fix this make sure that any running interrupt handler (which may
> have already latched the master IIR) finishes before clearing the IIR
> bits.
>
Isn't this overkill for what is just a bogus WARN? If the WARN is a
logical consequence of the code, let's just remove the WARN.
Or iow can you not find a cheaper way to fix this?
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits
2015-03-23 21:10 ` [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits Chris Wilson
@ 2015-03-24 9:14 ` Daniel Vetter
2015-03-24 9:24 ` Chris Wilson
0 siblings, 1 reply; 8+ messages in thread
From: Daniel Vetter @ 2015-03-24 9:14 UTC (permalink / raw)
To: Chris Wilson, Imre Deak, intel-gfx
On Mon, Mar 23, 2015 at 09:10:15PM +0000, Chris Wilson wrote:
> On Mon, Mar 23, 2015 at 07:11:34PM +0200, Imre Deak wrote:
> > When disabling RPS interrupts there is a race where we disable RPS
> > inerrupts while the interrupt handler is running and the handler has
> > already latched the pending RPS interrupt from the master IIR register.
> > Afterwards the disabling path clears the PM IIR bits, making the state
> > of pending interrupts inconsistent from the interrupt handler's point of
> > view. This triggers the following warning: "The master control interrupt
> > lied (PM)!".
> >
> > To fix this make sure that any running interrupt handler (which may
> > have already latched the master IIR) finishes before clearing the IIR
> > bits.
> >
>
> Isn't this overkill for what is just a bogus WARN? If the WARN is a
> logical consequence of the code, let's just remove the WARN.
>
> Or iow can you not find a cheaper way to fix this?
We only run this on suspend/resume afaik, overhead should be acceptable.
And we've had that much overhead before we've done all the runtime pm
unification, it's still less synchronization than disabling interrupts
completely.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time
2015-03-23 17:11 ` [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time Imre Deak
@ 2015-03-24 9:14 ` Daniel Vetter
2015-03-24 20:39 ` shuang.he
1 sibling, 0 replies; 8+ messages in thread
From: Daniel Vetter @ 2015-03-24 9:14 UTC (permalink / raw)
To: Imre Deak; +Cc: intel-gfx
On Mon, Mar 23, 2015 at 07:11:35PM +0200, Imre Deak wrote:
> The logical place for clearing the RPS latched interrupt bits is when
> resetting the RPS interrupts, so move the corresponding part from the RPS
> disable function to the reset function. During resetting we already
> cleared the IIR bits, so the only thing missing there was clearing pm_iir.
>
> Note that we call gen6_disable_rps_interrupts() also during driver load
> and resume time via intel_uncore_sanitize() when i915 interrupts are
> still not installed. If there are any pending RPS bits at this point
> (which after this patch wouldn't be cleared) they will be cleared by the
> reset code via the interrupt preinstall hooks.
>
> Signed-off-by: Imre Deak <imre.deak@intel.com>
Both merged to dinq, thanks.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits
2015-03-24 9:14 ` Daniel Vetter
@ 2015-03-24 9:24 ` Chris Wilson
2015-03-24 20:52 ` Imre Deak
0 siblings, 1 reply; 8+ messages in thread
From: Chris Wilson @ 2015-03-24 9:24 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx
On Tue, Mar 24, 2015 at 10:14:03AM +0100, Daniel Vetter wrote:
> On Mon, Mar 23, 2015 at 09:10:15PM +0000, Chris Wilson wrote:
> > On Mon, Mar 23, 2015 at 07:11:34PM +0200, Imre Deak wrote:
> > > When disabling RPS interrupts there is a race where we disable RPS
> > > inerrupts while the interrupt handler is running and the handler has
> > > already latched the pending RPS interrupt from the master IIR register.
> > > Afterwards the disabling path clears the PM IIR bits, making the state
> > > of pending interrupts inconsistent from the interrupt handler's point of
> > > view. This triggers the following warning: "The master control interrupt
> > > lied (PM)!".
> > >
> > > To fix this make sure that any running interrupt handler (which may
> > > have already latched the master IIR) finishes before clearing the IIR
> > > bits.
> > >
> >
> > Isn't this overkill for what is just a bogus WARN? If the WARN is a
> > logical consequence of the code, let's just remove the WARN.
> >
> > Or iow can you not find a cheaper way to fix this?
>
> We only run this on suspend/resume afaik, overhead should be acceptable.
> And we've had that much overhead before we've done all the runtime pm
> unification, it's still less synchronization than disabling interrupts
> completely.
Hmm, I thought this was in conjunction with RPS pm masking (i.e. fired
everytime we no longer expect to receive RPS interrupts). If it is only
the infrequent, then yeah I can't complain too much - I still think it
is slightly fishy, but I can accept that it is just a quirk of the
buffering the interrupt does.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time
2015-03-23 17:11 ` [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time Imre Deak
2015-03-24 9:14 ` Daniel Vetter
@ 2015-03-24 20:39 ` shuang.he
1 sibling, 0 replies; 8+ messages in thread
From: shuang.he @ 2015-03-24 20:39 UTC (permalink / raw)
To: shuang.he, ethan.gao, intel-gfx, imre.deak
Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 6030
-------------------------------------Summary-------------------------------------
Platform Delta drm-intel-nightly Series Applied
PNV -1 275/275 274/275
ILK 303/303 303/303
SNB 304/304 304/304
IVB 339/339 339/339
BYT 287/287 287/287
HSW 361/361 361/361
BDW 310/310 310/310
-------------------------------------Detailed-------------------------------------
Platform Test drm-intel-nightly Series Applied
*PNV igt@gem_userptr_blits@minor-sync-interruptible PASS(2) DMESG_WARN(1)PASS(1)
Note: You need to pay more attention to line start with '*'
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits
2015-03-24 9:24 ` Chris Wilson
@ 2015-03-24 20:52 ` Imre Deak
0 siblings, 0 replies; 8+ messages in thread
From: Imre Deak @ 2015-03-24 20:52 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
On Tue, 2015-03-24 at 09:24 +0000, Chris Wilson wrote:
> On Tue, Mar 24, 2015 at 10:14:03AM +0100, Daniel Vetter wrote:
> > On Mon, Mar 23, 2015 at 09:10:15PM +0000, Chris Wilson wrote:
> > > On Mon, Mar 23, 2015 at 07:11:34PM +0200, Imre Deak wrote:
> > > > When disabling RPS interrupts there is a race where we disable RPS
> > > > inerrupts while the interrupt handler is running and the handler has
> > > > already latched the pending RPS interrupt from the master IIR register.
> > > > Afterwards the disabling path clears the PM IIR bits, making the state
> > > > of pending interrupts inconsistent from the interrupt handler's point of
> > > > view. This triggers the following warning: "The master control interrupt
> > > > lied (PM)!".
> > > >
> > > > To fix this make sure that any running interrupt handler (which may
> > > > have already latched the master IIR) finishes before clearing the IIR
> > > > bits.
> > > >
> > >
> > > Isn't this overkill for what is just a bogus WARN? If the WARN is a
> > > logical consequence of the code, let's just remove the WARN.
> > >
> > > Or iow can you not find a cheaper way to fix this?
> >
> > We only run this on suspend/resume afaik, overhead should be acceptable.
> > And we've had that much overhead before we've done all the runtime pm
> > unification, it's still less synchronization than disabling interrupts
> > completely.
>
> Hmm, I thought this was in conjunction with RPS pm masking (i.e. fired
> everytime we no longer expect to receive RPS interrupts).
Yes, it affects only the driver loading (where it's a nop) and the
runtime/system suspend path.
> If it is only the infrequent, then yeah I can't complain too much - I
> still think it is slightly fishy, but I can accept that it is just a
> quirk of the buffering the interrupt does.
The reason for sync_irq is to prevent any subsequent
gen6_enable_rps_interrupts() call to trigger spurious interrupts.
Without it we could have:
CPU 0 CPU 1
gen8_gt_irq_handler()
tmp = I915_READ(GEN8_GT_IIR(2))
<tmp has pending RPS bits>
gen6_disable_rps_interrupts()
gen6_reset_rps_interrupts()
<clear GEN8_GT_IIR(2)>
gen6_enable_rps_interrupts()
gen6_rps_irq_handler(tmp)
<handle the now stale RPS events>
Now I admit the above is quite unlikely, or even impossible due to what
happens between gen6_disable_rps_interrupts() and
gen6_enable_rps_interrupts() in our code atm. I still thought it's safer
not to rely on those.
One alternative would have been to extend the irq_lock in the interrupt
handler, so that we read GEN8_GT_IIR(2) and handle any events in it in
one critical section, but that would add the overhead where it actually
matters.
--Imre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2015-03-24 20:52 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-03-23 17:11 [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits Imre Deak
2015-03-23 17:11 ` [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time Imre Deak
2015-03-24 9:14 ` Daniel Vetter
2015-03-24 20:39 ` shuang.he
2015-03-23 21:10 ` [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits Chris Wilson
2015-03-24 9:14 ` Daniel Vetter
2015-03-24 9:24 ` Chris Wilson
2015-03-24 20:52 ` Imre Deak
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox