* [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits @ 2015-03-23 17:11 Imre Deak 2015-03-23 17:11 ` [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time Imre Deak 2015-03-23 21:10 ` [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits Chris Wilson 0 siblings, 2 replies; 8+ messages in thread From: Imre Deak @ 2015-03-23 17:11 UTC (permalink / raw) To: intel-gfx When disabling RPS interrupts there is a race where we disable RPS inerrupts while the interrupt handler is running and the handler has already latched the pending RPS interrupt from the master IIR register. Afterwards the disabling path clears the PM IIR bits, making the state of pending interrupts inconsistent from the interrupt handler's point of view. This triggers the following warning: "The master control interrupt lied (PM)!". To fix this make sure that any running interrupt handler (which may have already latched the master IIR) finishes before clearing the IIR bits. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87347 Signed-off-by: Imre Deak <imre.deak@intel.com> --- drivers/gpu/drm/i915/i915_irq.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 6d8340d..7afbde4 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -330,6 +330,13 @@ void gen6_disable_rps_interrupts(struct drm_device *dev) __gen6_disable_pm_irq(dev_priv, dev_priv->pm_rps_events); I915_WRITE(gen6_pm_ier(dev_priv), I915_READ(gen6_pm_ier(dev_priv)) & ~dev_priv->pm_rps_events); + + spin_unlock_irq(&dev_priv->irq_lock); + + synchronize_irq(dev->irq); + + spin_lock_irq(&dev_priv->irq_lock); + I915_WRITE(gen6_pm_iir(dev_priv), dev_priv->pm_rps_events); I915_WRITE(gen6_pm_iir(dev_priv), dev_priv->pm_rps_events); -- 2.1.0 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time 2015-03-23 17:11 [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits Imre Deak @ 2015-03-23 17:11 ` Imre Deak 2015-03-24 9:14 ` Daniel Vetter 2015-03-24 20:39 ` shuang.he 2015-03-23 21:10 ` [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits Chris Wilson 1 sibling, 2 replies; 8+ messages in thread From: Imre Deak @ 2015-03-23 17:11 UTC (permalink / raw) To: intel-gfx The logical place for clearing the RPS latched interrupt bits is when resetting the RPS interrupts, so move the corresponding part from the RPS disable function to the reset function. During resetting we already cleared the IIR bits, so the only thing missing there was clearing pm_iir. Note that we call gen6_disable_rps_interrupts() also during driver load and resume time via intel_uncore_sanitize() when i915 interrupts are still not installed. If there are any pending RPS bits at this point (which after this patch wouldn't be cleared) they will be cleared by the reset code via the interrupt preinstall hooks. Signed-off-by: Imre Deak <imre.deak@intel.com> --- drivers/gpu/drm/i915/i915_irq.c | 10 +--------- 1 file changed, 1 insertion(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 7afbde4..14ecb4d 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -277,6 +277,7 @@ void gen6_reset_rps_interrupts(struct drm_device *dev) I915_WRITE(reg, dev_priv->pm_rps_events); I915_WRITE(reg, dev_priv->pm_rps_events); POSTING_READ(reg); + dev_priv->rps.pm_iir = 0; spin_unlock_irq(&dev_priv->irq_lock); } @@ -334,15 +335,6 @@ void gen6_disable_rps_interrupts(struct drm_device *dev) spin_unlock_irq(&dev_priv->irq_lock); synchronize_irq(dev->irq); - - spin_lock_irq(&dev_priv->irq_lock); - - I915_WRITE(gen6_pm_iir(dev_priv), dev_priv->pm_rps_events); - I915_WRITE(gen6_pm_iir(dev_priv), dev_priv->pm_rps_events); - - dev_priv->rps.pm_iir = 0; - - spin_unlock_irq(&dev_priv->irq_lock); } /** -- 2.1.0 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time 2015-03-23 17:11 ` [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time Imre Deak @ 2015-03-24 9:14 ` Daniel Vetter 2015-03-24 20:39 ` shuang.he 1 sibling, 0 replies; 8+ messages in thread From: Daniel Vetter @ 2015-03-24 9:14 UTC (permalink / raw) To: Imre Deak; +Cc: intel-gfx On Mon, Mar 23, 2015 at 07:11:35PM +0200, Imre Deak wrote: > The logical place for clearing the RPS latched interrupt bits is when > resetting the RPS interrupts, so move the corresponding part from the RPS > disable function to the reset function. During resetting we already > cleared the IIR bits, so the only thing missing there was clearing pm_iir. > > Note that we call gen6_disable_rps_interrupts() also during driver load > and resume time via intel_uncore_sanitize() when i915 interrupts are > still not installed. If there are any pending RPS bits at this point > (which after this patch wouldn't be cleared) they will be cleared by the > reset code via the interrupt preinstall hooks. > > Signed-off-by: Imre Deak <imre.deak@intel.com> Both merged to dinq, thanks. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time 2015-03-23 17:11 ` [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time Imre Deak 2015-03-24 9:14 ` Daniel Vetter @ 2015-03-24 20:39 ` shuang.he 1 sibling, 0 replies; 8+ messages in thread From: shuang.he @ 2015-03-24 20:39 UTC (permalink / raw) To: shuang.he, ethan.gao, intel-gfx, imre.deak Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com) Task id: 6030 -------------------------------------Summary------------------------------------- Platform Delta drm-intel-nightly Series Applied PNV -1 275/275 274/275 ILK 303/303 303/303 SNB 304/304 304/304 IVB 339/339 339/339 BYT 287/287 287/287 HSW 361/361 361/361 BDW 310/310 310/310 -------------------------------------Detailed------------------------------------- Platform Test drm-intel-nightly Series Applied *PNV igt@gem_userptr_blits@minor-sync-interruptible PASS(2) DMESG_WARN(1)PASS(1) Note: You need to pay more attention to line start with '*' _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits 2015-03-23 17:11 [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits Imre Deak 2015-03-23 17:11 ` [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time Imre Deak @ 2015-03-23 21:10 ` Chris Wilson 2015-03-24 9:14 ` Daniel Vetter 1 sibling, 1 reply; 8+ messages in thread From: Chris Wilson @ 2015-03-23 21:10 UTC (permalink / raw) To: Imre Deak; +Cc: intel-gfx On Mon, Mar 23, 2015 at 07:11:34PM +0200, Imre Deak wrote: > When disabling RPS interrupts there is a race where we disable RPS > inerrupts while the interrupt handler is running and the handler has > already latched the pending RPS interrupt from the master IIR register. > Afterwards the disabling path clears the PM IIR bits, making the state > of pending interrupts inconsistent from the interrupt handler's point of > view. This triggers the following warning: "The master control interrupt > lied (PM)!". > > To fix this make sure that any running interrupt handler (which may > have already latched the master IIR) finishes before clearing the IIR > bits. > Isn't this overkill for what is just a bogus WARN? If the WARN is a logical consequence of the code, let's just remove the WARN. Or iow can you not find a cheaper way to fix this? -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits 2015-03-23 21:10 ` [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits Chris Wilson @ 2015-03-24 9:14 ` Daniel Vetter 2015-03-24 9:24 ` Chris Wilson 0 siblings, 1 reply; 8+ messages in thread From: Daniel Vetter @ 2015-03-24 9:14 UTC (permalink / raw) To: Chris Wilson, Imre Deak, intel-gfx On Mon, Mar 23, 2015 at 09:10:15PM +0000, Chris Wilson wrote: > On Mon, Mar 23, 2015 at 07:11:34PM +0200, Imre Deak wrote: > > When disabling RPS interrupts there is a race where we disable RPS > > inerrupts while the interrupt handler is running and the handler has > > already latched the pending RPS interrupt from the master IIR register. > > Afterwards the disabling path clears the PM IIR bits, making the state > > of pending interrupts inconsistent from the interrupt handler's point of > > view. This triggers the following warning: "The master control interrupt > > lied (PM)!". > > > > To fix this make sure that any running interrupt handler (which may > > have already latched the master IIR) finishes before clearing the IIR > > bits. > > > > Isn't this overkill for what is just a bogus WARN? If the WARN is a > logical consequence of the code, let's just remove the WARN. > > Or iow can you not find a cheaper way to fix this? We only run this on suspend/resume afaik, overhead should be acceptable. And we've had that much overhead before we've done all the runtime pm unification, it's still less synchronization than disabling interrupts completely. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits 2015-03-24 9:14 ` Daniel Vetter @ 2015-03-24 9:24 ` Chris Wilson 2015-03-24 20:52 ` Imre Deak 0 siblings, 1 reply; 8+ messages in thread From: Chris Wilson @ 2015-03-24 9:24 UTC (permalink / raw) To: Daniel Vetter; +Cc: intel-gfx On Tue, Mar 24, 2015 at 10:14:03AM +0100, Daniel Vetter wrote: > On Mon, Mar 23, 2015 at 09:10:15PM +0000, Chris Wilson wrote: > > On Mon, Mar 23, 2015 at 07:11:34PM +0200, Imre Deak wrote: > > > When disabling RPS interrupts there is a race where we disable RPS > > > inerrupts while the interrupt handler is running and the handler has > > > already latched the pending RPS interrupt from the master IIR register. > > > Afterwards the disabling path clears the PM IIR bits, making the state > > > of pending interrupts inconsistent from the interrupt handler's point of > > > view. This triggers the following warning: "The master control interrupt > > > lied (PM)!". > > > > > > To fix this make sure that any running interrupt handler (which may > > > have already latched the master IIR) finishes before clearing the IIR > > > bits. > > > > > > > Isn't this overkill for what is just a bogus WARN? If the WARN is a > > logical consequence of the code, let's just remove the WARN. > > > > Or iow can you not find a cheaper way to fix this? > > We only run this on suspend/resume afaik, overhead should be acceptable. > And we've had that much overhead before we've done all the runtime pm > unification, it's still less synchronization than disabling interrupts > completely. Hmm, I thought this was in conjunction with RPS pm masking (i.e. fired everytime we no longer expect to receive RPS interrupts). If it is only the infrequent, then yeah I can't complain too much - I still think it is slightly fishy, but I can accept that it is just a quirk of the buffering the interrupt does. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits 2015-03-24 9:24 ` Chris Wilson @ 2015-03-24 20:52 ` Imre Deak 0 siblings, 0 replies; 8+ messages in thread From: Imre Deak @ 2015-03-24 20:52 UTC (permalink / raw) To: Chris Wilson; +Cc: intel-gfx On Tue, 2015-03-24 at 09:24 +0000, Chris Wilson wrote: > On Tue, Mar 24, 2015 at 10:14:03AM +0100, Daniel Vetter wrote: > > On Mon, Mar 23, 2015 at 09:10:15PM +0000, Chris Wilson wrote: > > > On Mon, Mar 23, 2015 at 07:11:34PM +0200, Imre Deak wrote: > > > > When disabling RPS interrupts there is a race where we disable RPS > > > > inerrupts while the interrupt handler is running and the handler has > > > > already latched the pending RPS interrupt from the master IIR register. > > > > Afterwards the disabling path clears the PM IIR bits, making the state > > > > of pending interrupts inconsistent from the interrupt handler's point of > > > > view. This triggers the following warning: "The master control interrupt > > > > lied (PM)!". > > > > > > > > To fix this make sure that any running interrupt handler (which may > > > > have already latched the master IIR) finishes before clearing the IIR > > > > bits. > > > > > > > > > > Isn't this overkill for what is just a bogus WARN? If the WARN is a > > > logical consequence of the code, let's just remove the WARN. > > > > > > Or iow can you not find a cheaper way to fix this? > > > > We only run this on suspend/resume afaik, overhead should be acceptable. > > And we've had that much overhead before we've done all the runtime pm > > unification, it's still less synchronization than disabling interrupts > > completely. > > Hmm, I thought this was in conjunction with RPS pm masking (i.e. fired > everytime we no longer expect to receive RPS interrupts). Yes, it affects only the driver loading (where it's a nop) and the runtime/system suspend path. > If it is only the infrequent, then yeah I can't complain too much - I > still think it is slightly fishy, but I can accept that it is just a > quirk of the buffering the interrupt does. The reason for sync_irq is to prevent any subsequent gen6_enable_rps_interrupts() call to trigger spurious interrupts. Without it we could have: CPU 0 CPU 1 gen8_gt_irq_handler() tmp = I915_READ(GEN8_GT_IIR(2)) <tmp has pending RPS bits> gen6_disable_rps_interrupts() gen6_reset_rps_interrupts() <clear GEN8_GT_IIR(2)> gen6_enable_rps_interrupts() gen6_rps_irq_handler(tmp) <handle the now stale RPS events> Now I admit the above is quite unlikely, or even impossible due to what happens between gen6_disable_rps_interrupts() and gen6_enable_rps_interrupts() in our code atm. I still thought it's safer not to rely on those. One alternative would have been to extend the irq_lock in the interrupt handler, so that we read GEN8_GT_IIR(2) and handle any events in it in one critical section, but that would add the overhead where it actually matters. --Imre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2015-03-24 20:52 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-03-23 17:11 [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits Imre Deak 2015-03-23 17:11 ` [PATCH 2/2] drm/i915: move clearing of RPS interrupt bits from disable to reset time Imre Deak 2015-03-24 9:14 ` Daniel Vetter 2015-03-24 20:39 ` shuang.he 2015-03-23 21:10 ` [PATCH 1/2] drm/i915: fix race when clearing RPS IIR bits Chris Wilson 2015-03-24 9:14 ` Daniel Vetter 2015-03-24 9:24 ` Chris Wilson 2015-03-24 20:52 ` Imre Deak
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox