* [PATCH 1/3] drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB
@ 2016-05-23 14:09 ville.syrjala
2016-05-23 14:09 ` [PATCH 2/3] drm/i915: Don't frob with RPS around GPU reset ville.syrjala
` (4 more replies)
0 siblings, 5 replies; 14+ messages in thread
From: ville.syrjala @ 2016-05-23 14:09 UTC (permalink / raw)
To: intel-gfx
From: Ville Syrjälä <ville.syrjala@linux.intel.com>
SNB (and IVB too I suppose) starts to misbehave if the GPU gets stuck
in an infinite batch buffer loop. The GPU apparently hogs something
critical and CPUs start to lose interrupts and whatnot. We can keep
the system limping along by unmasking some interrupts in
GEN6_PMINTRMSK. The EI up interrupt has been previously chosen for
that task, so let's never mask it.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
drivers/gpu/drm/i915/intel_pm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 29bdd79d9039..576e98744a2d 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4719,7 +4719,7 @@ void gen6_rps_idle(struct drm_i915_private *dev_priv)
else
gen6_set_rps(dev_priv, dev_priv->rps.idle_freq);
dev_priv->rps.last_adj = 0;
- I915_WRITE(GEN6_PMINTRMSK, 0xffffffff);
+ I915_WRITE(GEN6_PMINTRMSK, gen6_rps_pm_mask(dev_priv, ~0));
}
mutex_unlock(&dev_priv->rps.hw_lock);
--
2.7.4
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 2/3] drm/i915: Don't frob with RPS around GPU reset
2016-05-23 14:09 [PATCH 1/3] drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB ville.syrjala
@ 2016-05-23 14:09 ` ville.syrjala
2016-05-23 14:32 ` Chris Wilson
2016-05-23 14:09 ` [RFC][PATCH 3/3] drm/i915: Make RPS enable immediate ville.syrjala
` (3 subsequent siblings)
4 siblings, 1 reply; 14+ messages in thread
From: ville.syrjala @ 2016-05-23 14:09 UTC (permalink / raw)
To: intel-gfx
From: Ville Syrjälä <ville.syrjala@linux.intel.com>
Based on my observations GPU reset doesn't clobber the RPS state, so
frobbing with RPS around GPU reset seems rather pointless. Just get
rid of it.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
drivers/gpu/drm/i915/i915_drv.c | 11 -----------
drivers/gpu/drm/i915/intel_drv.h | 1 -
drivers/gpu/drm/i915/intel_pm.c | 9 ---------
3 files changed, 21 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index dba03c026151..3794dc67caa4 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -930,8 +930,6 @@ int i915_reset(struct drm_i915_private *dev_priv)
unsigned reset_counter;
int ret;
- intel_reset_gt_powersave(dev_priv);
-
mutex_lock(&dev->struct_mutex);
/* Clear any previous failed attempts at recovery. Time to try again. */
@@ -994,15 +992,6 @@ int i915_reset(struct drm_i915_private *dev_priv)
mutex_unlock(&dev->struct_mutex);
- /*
- * rps/rc6 re-init is necessary to restore state lost after the
- * reset and the re-install of gt irqs. Skip for ironlake per
- * previous concerns that it doesn't respond well to some forms
- * of re-init after reset.
- */
- if (INTEL_INFO(dev)->gen > 5)
- intel_enable_gt_powersave(dev_priv);
-
return 0;
error:
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 0741b2d3aa65..ca3b69b61f06 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1652,7 +1652,6 @@ void intel_cleanup_gt_powersave(struct drm_i915_private *dev_priv);
void intel_enable_gt_powersave(struct drm_i915_private *dev_priv);
void intel_disable_gt_powersave(struct drm_i915_private *dev_priv);
void intel_suspend_gt_powersave(struct drm_i915_private *dev_priv);
-void intel_reset_gt_powersave(struct drm_i915_private *dev_priv);
void gen6_update_ring_freq(struct drm_i915_private *dev_priv);
void gen6_rps_busy(struct drm_i915_private *dev_priv);
void gen6_rps_reset_ei(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 576e98744a2d..817c84c22782 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -6508,15 +6508,6 @@ void intel_enable_gt_powersave(struct drm_i915_private *dev_priv)
}
}
-void intel_reset_gt_powersave(struct drm_i915_private *dev_priv)
-{
- if (INTEL_INFO(dev_priv)->gen < 6)
- return;
-
- gen6_suspend_rps(dev_priv);
- dev_priv->rps.enabled = false;
-}
-
static void ibx_init_clock_gating(struct drm_device *dev)
{
struct drm_i915_private *dev_priv = dev->dev_private;
--
2.7.4
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [RFC][PATCH 3/3] drm/i915: Make RPS enable immediate
2016-05-23 14:09 [PATCH 1/3] drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB ville.syrjala
2016-05-23 14:09 ` [PATCH 2/3] drm/i915: Don't frob with RPS around GPU reset ville.syrjala
@ 2016-05-23 14:09 ` ville.syrjala
2016-05-23 14:33 ` Chris Wilson
2016-05-24 8:27 ` Daniel Vetter
2016-05-23 14:30 ` [PATCH 1/3] drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB Chris Wilson
` (2 subsequent siblings)
4 siblings, 2 replies; 14+ messages in thread
From: ville.syrjala @ 2016-05-23 14:09 UTC (permalink / raw)
To: intel-gfx
From: Ville Syrjälä <ville.syrjala@linux.intel.com>
On SNB (at least) it's dangeruopus to hang the GPU with an infinite
batch buffer loop while RPS is disabled. The only thing that keeps
the system going in such circumstances are the internal RPS timers,
so we should never feed the GPU with RPS disabled unless we want to
risk a total system hang.
Previously we didn't fully disable RPS, but that changes in
commit 2030d684f7b3 ("drm/i915/bxt: Explicitly clear the Turbo control register")
so we probably didn't see the problem so often previously. But
even before that we were at the mercy of the BIOS for the initial
RPS state, so if the BIOS didn't enable RPS a GPU hang immediately
upon boot could have been fatal.
To avoid the problems let's just make the RPS enable immediate.
This renders the delayed_resume_work useless actually. We could perhaps
just move the ring freq table initialization to the work and do the
other stull synchronously?
Fixes: 2030d684f7b3 ("drm/i915/bxt: Explicitly clear the Turbo control register")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
This is more and RFC at this point. Perhaps we might want to keep the delayed
work but just for the ring freq table update (which is the main reson this whole
thing exists in the first place). Another factor is that wait_for() is not
exactly optiomal currently, so it makes the operation slower than it needs to
be. Would need some hard numbers to see if it's worth keeping the delayed work
or not I suppose.
drivers/gpu/drm/i915/intel_pm.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 817c84c22782..e65c5c2b9b4e 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -6505,6 +6505,7 @@ void intel_enable_gt_powersave(struct drm_i915_private *dev_priv)
if (schedule_delayed_work(&dev_priv->rps.delayed_resume_work,
round_jiffies_up_relative(HZ)))
intel_runtime_pm_get_noresume(dev_priv);
+ flush_delayed_work(&dev_priv->rps.delayed_resume_work);
}
}
--
2.7.4
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 1/3] drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB
2016-05-23 14:09 [PATCH 1/3] drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB ville.syrjala
2016-05-23 14:09 ` [PATCH 2/3] drm/i915: Don't frob with RPS around GPU reset ville.syrjala
2016-05-23 14:09 ` [RFC][PATCH 3/3] drm/i915: Make RPS enable immediate ville.syrjala
@ 2016-05-23 14:30 ` Chris Wilson
2016-05-23 14:34 ` Ville Syrjälä
2016-05-23 14:42 ` [PATCH v2 " ville.syrjala
2016-05-23 16:29 ` ✗ Ro.CI.BAT: warning for series starting with [v2,1/3] drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB (rev2) Patchwork
4 siblings, 1 reply; 14+ messages in thread
From: Chris Wilson @ 2016-05-23 14:30 UTC (permalink / raw)
To: ville.syrjala; +Cc: intel-gfx
On Mon, May 23, 2016 at 05:09:41PM +0300, ville.syrjala@linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> SNB (and IVB too I suppose) starts to misbehave if the GPU gets stuck
> in an infinite batch buffer loop. The GPU apparently hogs something
> critical and CPUs start to lose interrupts and whatnot. We can keep
> the system limping along by unmasking some interrupts in
> GEN6_PMINTRMSK. The EI up interrupt has been previously chosen for
> that task, so let's never mask it.
>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/intel_pm.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 29bdd79d9039..576e98744a2d 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -4719,7 +4719,7 @@ void gen6_rps_idle(struct drm_i915_private *dev_priv)
> else
> gen6_set_rps(dev_priv, dev_priv->rps.idle_freq);
> dev_priv->rps.last_adj = 0;
> - I915_WRITE(GEN6_PMINTRMSK, 0xffffffff);
> + I915_WRITE(GEN6_PMINTRMSK, gen6_rps_pm_mask(dev_priv, ~0));
gen6_sanitize_rps_pm_mask()
With that,
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/3] drm/i915: Don't frob with RPS around GPU reset
2016-05-23 14:09 ` [PATCH 2/3] drm/i915: Don't frob with RPS around GPU reset ville.syrjala
@ 2016-05-23 14:32 ` Chris Wilson
2016-05-24 8:24 ` Daniel Vetter
0 siblings, 1 reply; 14+ messages in thread
From: Chris Wilson @ 2016-05-23 14:32 UTC (permalink / raw)
To: ville.syrjala; +Cc: intel-gfx
On Mon, May 23, 2016 at 05:09:42PM +0300, ville.syrjala@linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> Based on my observations GPU reset doesn't clobber the RPS state, so
> frobbing with RPS around GPU reset seems rather pointless. Just get
> rid of it.
>
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Testcase: igt/pm_rpm/reset ?
Looks like that is the desired test.
Maybe make it basic and see what CI says :)
Actually it probably should be a basic test considering how often we
hang.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC][PATCH 3/3] drm/i915: Make RPS enable immediate
2016-05-23 14:09 ` [RFC][PATCH 3/3] drm/i915: Make RPS enable immediate ville.syrjala
@ 2016-05-23 14:33 ` Chris Wilson
2016-05-24 8:27 ` Daniel Vetter
1 sibling, 0 replies; 14+ messages in thread
From: Chris Wilson @ 2016-05-23 14:33 UTC (permalink / raw)
To: ville.syrjala; +Cc: intel-gfx
On Mon, May 23, 2016 at 05:09:43PM +0300, ville.syrjala@linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> On SNB (at least) it's dangeruopus to hang the GPU with an infinite
> batch buffer loop while RPS is disabled. The only thing that keeps
> the system going in such circumstances are the internal RPS timers,
> so we should never feed the GPU with RPS disabled unless we want to
> risk a total system hang.
>
> Previously we didn't fully disable RPS, but that changes in
> commit 2030d684f7b3 ("drm/i915/bxt: Explicitly clear the Turbo control register")
> so we probably didn't see the problem so often previously. But
> even before that we were at the mercy of the BIOS for the initial
> RPS state, so if the BIOS didn't enable RPS a GPU hang immediately
> upon boot could have been fatal.
>
> To avoid the problems let's just make the RPS enable immediate.
> This renders the delayed_resume_work useless actually. We could perhaps
> just move the ring freq table initialization to the work and do the
> other stull synchronously?
>
> Fixes: 2030d684f7b3 ("drm/i915/bxt: Explicitly clear the Turbo control register")
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
> This is more and RFC at this point. Perhaps we might want to keep the delayed
> work but just for the ring freq table update (which is the main reson this whole
> thing exists in the first place). Another factor is that wait_for() is not
> exactly optiomal currently, so it makes the operation slower than it needs to
> be. Would need some hard numbers to see if it's worth keeping the delayed work
> or not I suppose.
I am in favour - the entire chain should be async, not just this little
step to setup the ring freq.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/3] drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB
2016-05-23 14:30 ` [PATCH 1/3] drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB Chris Wilson
@ 2016-05-23 14:34 ` Ville Syrjälä
0 siblings, 0 replies; 14+ messages in thread
From: Ville Syrjälä @ 2016-05-23 14:34 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On Mon, May 23, 2016 at 03:30:04PM +0100, Chris Wilson wrote:
> On Mon, May 23, 2016 at 05:09:41PM +0300, ville.syrjala@linux.intel.com wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> >
> > SNB (and IVB too I suppose) starts to misbehave if the GPU gets stuck
> > in an infinite batch buffer loop. The GPU apparently hogs something
> > critical and CPUs start to lose interrupts and whatnot. We can keep
> > the system limping along by unmasking some interrupts in
> > GEN6_PMINTRMSK. The EI up interrupt has been previously chosen for
> > that task, so let's never mask it.
> >
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> > drivers/gpu/drm/i915/intel_pm.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > index 29bdd79d9039..576e98744a2d 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -4719,7 +4719,7 @@ void gen6_rps_idle(struct drm_i915_private *dev_priv)
> > else
> > gen6_set_rps(dev_priv, dev_priv->rps.idle_freq);
> > dev_priv->rps.last_adj = 0;
> > - I915_WRITE(GEN6_PMINTRMSK, 0xffffffff);
> > + I915_WRITE(GEN6_PMINTRMSK, gen6_rps_pm_mask(dev_priv, ~0));
>
> gen6_sanitize_rps_pm_mask()
Doh. That's what I had on the other machine. Shouldn't copy patches
by retyping them by hand.
>
> With that,
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre
--
Ville Syrjälä
Intel OTC
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v2 1/3] drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB
2016-05-23 14:09 [PATCH 1/3] drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB ville.syrjala
` (2 preceding siblings ...)
2016-05-23 14:30 ` [PATCH 1/3] drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB Chris Wilson
@ 2016-05-23 14:42 ` ville.syrjala
2016-05-23 16:29 ` ✗ Ro.CI.BAT: warning for series starting with [v2,1/3] drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB (rev2) Patchwork
4 siblings, 0 replies; 14+ messages in thread
From: ville.syrjala @ 2016-05-23 14:42 UTC (permalink / raw)
To: intel-gfx
From: Ville Syrjälä <ville.syrjala@linux.intel.com>
SNB (and IVB too I suppose) starts to misbehave if the GPU gets stuck
in an infinite batch buffer loop. The GPU apparently hogs something
critical and CPUs start to lose interrupts and whatnot. We can keep
the system limping along by unmasking some interrupts in
GEN6_PMINTRMSK. The EI up interrupt has been previously chosen for
that task, so let's never mask it.
v2: s/gen6_rps_pm_mask/gen6_sanitize_rps_pm_mask/ (Chris)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/intel_pm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 29bdd79d9039..96be0b6f5712 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4719,7 +4719,7 @@ void gen6_rps_idle(struct drm_i915_private *dev_priv)
else
gen6_set_rps(dev_priv, dev_priv->rps.idle_freq);
dev_priv->rps.last_adj = 0;
- I915_WRITE(GEN6_PMINTRMSK, 0xffffffff);
+ I915_WRITE(GEN6_PMINTRMSK, gen6_sanitize_rps_pm_mask(dev_priv, ~0));
}
mutex_unlock(&dev_priv->rps.hw_lock);
--
2.7.4
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 14+ messages in thread
* ✗ Ro.CI.BAT: warning for series starting with [v2,1/3] drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB (rev2)
2016-05-23 14:09 [PATCH 1/3] drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB ville.syrjala
` (3 preceding siblings ...)
2016-05-23 14:42 ` [PATCH v2 " ville.syrjala
@ 2016-05-23 16:29 ` Patchwork
4 siblings, 0 replies; 14+ messages in thread
From: Patchwork @ 2016-05-23 16:29 UTC (permalink / raw)
To: ville.syrjala; +Cc: intel-gfx
== Series Details ==
Series: series starting with [v2,1/3] drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB (rev2)
URL : https://patchwork.freedesktop.org/series/7572/
State : warning
== Summary ==
Series 7572v2 Series without cover letter
http://patchwork.freedesktop.org/api/1.0/series/7572/revisions/2/mbox
Test gem_exec_flush:
Subgroup basic-batch-kernel-default-cmd:
fail -> PASS (ro-byt-n2820)
Test kms_pipe_crc_basic:
Subgroup bad-nb-words-1:
pass -> DMESG-WARN (ro-skl-i7-6700hq)
fi-bdw-i7-5557u total:209 pass:197 dwarn:0 dfail:0 fail:0 skip:12
fi-bsw-n3050 total:209 pass:167 dwarn:0 dfail:0 fail:2 skip:40
fi-hsw-i7-4770r total:209 pass:186 dwarn:0 dfail:0 fail:0 skip:23
fi-skl-i5-6260u total:209 pass:196 dwarn:2 dfail:0 fail:0 skip:11
fi-skl-i7-6700k total:209 pass:182 dwarn:2 dfail:0 fail:0 skip:25
fi-snb-i7-2600 total:209 pass:170 dwarn:0 dfail:0 fail:0 skip:39
ro-bdw-i5-5250u total:209 pass:172 dwarn:0 dfail:0 fail:0 skip:37
ro-bdw-i7-5557U total:209 pass:197 dwarn:0 dfail:0 fail:0 skip:12
ro-bdw-i7-5600u total:209 pass:180 dwarn:0 dfail:0 fail:1 skip:28
ro-bsw-n3050 total:209 pass:168 dwarn:0 dfail:0 fail:2 skip:39
ro-byt-n2820 total:209 pass:170 dwarn:0 dfail:0 fail:2 skip:37
ro-hsw-i3-4010u total:209 pass:186 dwarn:0 dfail:0 fail:0 skip:23
ro-hsw-i7-4770r total:209 pass:186 dwarn:0 dfail:0 fail:0 skip:23
ro-ilk-i7-620lm total:209 pass:146 dwarn:0 dfail:0 fail:1 skip:62
ro-ilk1-i5-650 total:204 pass:146 dwarn:0 dfail:0 fail:1 skip:57
ro-ivb2-i7-3770 total:209 pass:181 dwarn:0 dfail:0 fail:0 skip:28
ro-skl-i7-6700hq total:204 pass:180 dwarn:3 dfail:0 fail:0 skip:21
ro-snb-i7-2620M total:209 pass:170 dwarn:0 dfail:0 fail:1 skip:38
fi-byt-n2820 failed to connect after reboot
fi-hsw-i7-4770k failed to connect after reboot
ro-ivb-i7-3770 failed to connect after reboot
Results at /archive/results/CI_IGT_test/RO_Patchwork_978/
facf329 drm-intel-nightly: 2016y-05m-23d-15h-31m-43s UTC integration manifest
c2b85aa drm/i915: Make RPS enable immediate
ec7c1f6 drm/i915: Don't frob with RPS around GPU reset
7311f7d drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/3] drm/i915: Don't frob with RPS around GPU reset
2016-05-23 14:32 ` Chris Wilson
@ 2016-05-24 8:24 ` Daniel Vetter
2016-05-24 8:36 ` Chris Wilson
0 siblings, 1 reply; 14+ messages in thread
From: Daniel Vetter @ 2016-05-24 8:24 UTC (permalink / raw)
To: Chris Wilson, ville.syrjala, intel-gfx
On Mon, May 23, 2016 at 03:32:40PM +0100, Chris Wilson wrote:
> On Mon, May 23, 2016 at 05:09:42PM +0300, ville.syrjala@linux.intel.com wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> >
> > Based on my observations GPU reset doesn't clobber the RPS state, so
> > frobbing with RPS around GPU reset seems rather pointless. Just get
> > rid of it.
> >
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> Testcase: igt/pm_rpm/reset ?
pm_rps&rc6_residency seem more appropriate. pm_rpm just outright yanks the
power cable for the GT. Unfortunately those 2 tests aren't particularly
nasty ...
-Daniel
>
> Looks like that is the desired test.
>
> Maybe make it basic and see what CI says :)
>
> Actually it probably should be a basic test considering how often we
> hang.
> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC][PATCH 3/3] drm/i915: Make RPS enable immediate
2016-05-23 14:09 ` [RFC][PATCH 3/3] drm/i915: Make RPS enable immediate ville.syrjala
2016-05-23 14:33 ` Chris Wilson
@ 2016-05-24 8:27 ` Daniel Vetter
2016-05-24 10:14 ` Ville Syrjälä
1 sibling, 1 reply; 14+ messages in thread
From: Daniel Vetter @ 2016-05-24 8:27 UTC (permalink / raw)
To: ville.syrjala; +Cc: intel-gfx
On Mon, May 23, 2016 at 05:09:43PM +0300, ville.syrjala@linux.intel.com wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> On SNB (at least) it's dangeruopus to hang the GPU with an infinite
> batch buffer loop while RPS is disabled. The only thing that keeps
> the system going in such circumstances are the internal RPS timers,
> so we should never feed the GPU with RPS disabled unless we want to
> risk a total system hang.
>
> Previously we didn't fully disable RPS, but that changes in
> commit 2030d684f7b3 ("drm/i915/bxt: Explicitly clear the Turbo control register")
> so we probably didn't see the problem so often previously. But
> even before that we were at the mercy of the BIOS for the initial
> RPS state, so if the BIOS didn't enable RPS a GPU hang immediately
> upon boot could have been fatal.
>
> To avoid the problems let's just make the RPS enable immediate.
> This renders the delayed_resume_work useless actually. We could perhaps
> just move the ring freq table initialization to the work and do the
> other stull synchronously?
>
> Fixes: 2030d684f7b3 ("drm/i915/bxt: Explicitly clear the Turbo control register")
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
> This is more and RFC at this point. Perhaps we might want to keep the delayed
> work but just for the ring freq table update (which is the main reson this whole
> thing exists in the first place). Another factor is that wait_for() is not
> exactly optiomal currently, so it makes the operation slower than it needs to
> be. Would need some hard numbers to see if it's worth keeping the delayed work
> or not I suppose.
Loading the ring freq tables takes forever, so definitely want to keep
that. I'd just split rps setup in two parts and push the schedule_work
down a bit. Long term we can go overboard with async (and maybey only
stall for all the GT setup in the very first batchbuffer).
-Daniel
>
> drivers/gpu/drm/i915/intel_pm.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 817c84c22782..e65c5c2b9b4e 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -6505,6 +6505,7 @@ void intel_enable_gt_powersave(struct drm_i915_private *dev_priv)
> if (schedule_delayed_work(&dev_priv->rps.delayed_resume_work,
> round_jiffies_up_relative(HZ)))
> intel_runtime_pm_get_noresume(dev_priv);
> + flush_delayed_work(&dev_priv->rps.delayed_resume_work);
> }
> }
>
> --
> 2.7.4
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/3] drm/i915: Don't frob with RPS around GPU reset
2016-05-24 8:24 ` Daniel Vetter
@ 2016-05-24 8:36 ` Chris Wilson
0 siblings, 0 replies; 14+ messages in thread
From: Chris Wilson @ 2016-05-24 8:36 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx
On Tue, May 24, 2016 at 10:24:58AM +0200, Daniel Vetter wrote:
> On Mon, May 23, 2016 at 03:32:40PM +0100, Chris Wilson wrote:
> > On Mon, May 23, 2016 at 05:09:42PM +0300, ville.syrjala@linux.intel.com wrote:
> > > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > >
> > > Based on my observations GPU reset doesn't clobber the RPS state, so
> > > frobbing with RPS around GPU reset seems rather pointless. Just get
> > > rid of it.
> > >
> > > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> >
> > Testcase: igt/pm_rpm/reset ?
>
> pm_rps&rc6_residency seem more appropriate. pm_rpm just outright yanks the
> power cable for the GT. Unfortunately those 2 tests aren't particularly
> nasty ...
Typo, I was looking at pm_rps since it has the reset test that should be
checking whether RPS works after hang/reset.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC][PATCH 3/3] drm/i915: Make RPS enable immediate
2016-05-24 8:27 ` Daniel Vetter
@ 2016-05-24 10:14 ` Ville Syrjälä
2016-05-24 10:43 ` Chris Wilson
0 siblings, 1 reply; 14+ messages in thread
From: Ville Syrjälä @ 2016-05-24 10:14 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx
On Tue, May 24, 2016 at 10:27:19AM +0200, Daniel Vetter wrote:
> On Mon, May 23, 2016 at 05:09:43PM +0300, ville.syrjala@linux.intel.com wrote:
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> >
> > On SNB (at least) it's dangeruopus to hang the GPU with an infinite
> > batch buffer loop while RPS is disabled. The only thing that keeps
> > the system going in such circumstances are the internal RPS timers,
> > so we should never feed the GPU with RPS disabled unless we want to
> > risk a total system hang.
> >
> > Previously we didn't fully disable RPS, but that changes in
> > commit 2030d684f7b3 ("drm/i915/bxt: Explicitly clear the Turbo control register")
> > so we probably didn't see the problem so often previously. But
> > even before that we were at the mercy of the BIOS for the initial
> > RPS state, so if the BIOS didn't enable RPS a GPU hang immediately
> > upon boot could have been fatal.
> >
> > To avoid the problems let's just make the RPS enable immediate.
> > This renders the delayed_resume_work useless actually. We could perhaps
> > just move the ring freq table initialization to the work and do the
> > other stull synchronously?
> >
> > Fixes: 2030d684f7b3 ("drm/i915/bxt: Explicitly clear the Turbo control register")
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> > This is more and RFC at this point. Perhaps we might want to keep the delayed
> > work but just for the ring freq table update (which is the main reson this whole
> > thing exists in the first place). Another factor is that wait_for() is not
> > exactly optiomal currently, so it makes the operation slower than it needs to
> > be. Would need some hard numbers to see if it's worth keeping the delayed work
> > or not I suppose.
>
> Loading the ring freq tables takes forever, so definitely want to keep
> that.
It only takes forever becasue wait_for() sucks.
with current sleep duration of 1000-2000 us:
[ 308.231364] rps init took 5533 us
[ 308.266322] ring freq init took 34952 us
sleep duration reduced to 100-200 us:
[ 155.367588] rps init took 679 us
[ 155.371100] ring freq init took 3509 us
So looks like someone just failed to root cause the slowness, and then
went on to optimize the wrong thing.
> I'd just split rps setup in two parts and push the schedule_work
> down a bit. Long term we can go overboard with async (and maybey only
> stall for all the GT setup in the very first batchbuffer).
> -Daniel
>
> >
> > drivers/gpu/drm/i915/intel_pm.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > index 817c84c22782..e65c5c2b9b4e 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -6505,6 +6505,7 @@ void intel_enable_gt_powersave(struct drm_i915_private *dev_priv)
> > if (schedule_delayed_work(&dev_priv->rps.delayed_resume_work,
> > round_jiffies_up_relative(HZ)))
> > intel_runtime_pm_get_noresume(dev_priv);
> > + flush_delayed_work(&dev_priv->rps.delayed_resume_work);
> > }
> > }
> >
> > --
> > 2.7.4
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
--
Ville Syrjälä
Intel OTC
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC][PATCH 3/3] drm/i915: Make RPS enable immediate
2016-05-24 10:14 ` Ville Syrjälä
@ 2016-05-24 10:43 ` Chris Wilson
0 siblings, 0 replies; 14+ messages in thread
From: Chris Wilson @ 2016-05-24 10:43 UTC (permalink / raw)
To: Ville Syrjälä; +Cc: intel-gfx
On Tue, May 24, 2016 at 01:14:53PM +0300, Ville Syrjälä wrote:
> On Tue, May 24, 2016 at 10:27:19AM +0200, Daniel Vetter wrote:
> > On Mon, May 23, 2016 at 05:09:43PM +0300, ville.syrjala@linux.intel.com wrote:
> > > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > >
> > > On SNB (at least) it's dangeruopus to hang the GPU with an infinite
> > > batch buffer loop while RPS is disabled. The only thing that keeps
> > > the system going in such circumstances are the internal RPS timers,
> > > so we should never feed the GPU with RPS disabled unless we want to
> > > risk a total system hang.
> > >
> > > Previously we didn't fully disable RPS, but that changes in
> > > commit 2030d684f7b3 ("drm/i915/bxt: Explicitly clear the Turbo control register")
> > > so we probably didn't see the problem so often previously. But
> > > even before that we were at the mercy of the BIOS for the initial
> > > RPS state, so if the BIOS didn't enable RPS a GPU hang immediately
> > > upon boot could have been fatal.
> > >
> > > To avoid the problems let's just make the RPS enable immediate.
> > > This renders the delayed_resume_work useless actually. We could perhaps
> > > just move the ring freq table initialization to the work and do the
> > > other stull synchronously?
> > >
> > > Fixes: 2030d684f7b3 ("drm/i915/bxt: Explicitly clear the Turbo control register")
> > > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > > ---
> > > This is more and RFC at this point. Perhaps we might want to keep the delayed
> > > work but just for the ring freq table update (which is the main reson this whole
> > > thing exists in the first place). Another factor is that wait_for() is not
> > > exactly optiomal currently, so it makes the operation slower than it needs to
> > > be. Would need some hard numbers to see if it's worth keeping the delayed work
> > > or not I suppose.
> >
> > Loading the ring freq tables takes forever, so definitely want to keep
> > that.
>
> It only takes forever becasue wait_for() sucks.
>
> with current sleep duration of 1000-2000 us:
> [ 308.231364] rps init took 5533 us
> [ 308.266322] ring freq init took 34952 us
>
> sleep duration reduced to 100-200 us:
> [ 155.367588] rps init took 679 us
> [ 155.371100] ring freq init took 3509 us
>
> So looks like someone just failed to root cause the slowness, and then
> went on to optimize the wrong thing.
It's still 4ms that can be done in parallel to userspace starting :)
(And looks can be reduced further with an smarter wait_for).
So we encourage Mika to send his updated patches...
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2016-05-24 10:43 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-05-23 14:09 [PATCH 1/3] drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB ville.syrjala
2016-05-23 14:09 ` [PATCH 2/3] drm/i915: Don't frob with RPS around GPU reset ville.syrjala
2016-05-23 14:32 ` Chris Wilson
2016-05-24 8:24 ` Daniel Vetter
2016-05-24 8:36 ` Chris Wilson
2016-05-23 14:09 ` [RFC][PATCH 3/3] drm/i915: Make RPS enable immediate ville.syrjala
2016-05-23 14:33 ` Chris Wilson
2016-05-24 8:27 ` Daniel Vetter
2016-05-24 10:14 ` Ville Syrjälä
2016-05-24 10:43 ` Chris Wilson
2016-05-23 14:30 ` [PATCH 1/3] drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB Chris Wilson
2016-05-23 14:34 ` Ville Syrjälä
2016-05-23 14:42 ` [PATCH v2 " ville.syrjala
2016-05-23 16:29 ` ✗ Ro.CI.BAT: warning for series starting with [v2,1/3] drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB (rev2) Patchwork
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox