* [PATCH] drm/i915: Skip HW reinitialisation on resume if still wedged
@ 2017-10-15 14:30 Chris Wilson
2017-10-15 14:37 ` [PATCH v2] " Chris Wilson
` (4 more replies)
0 siblings, 5 replies; 11+ messages in thread
From: Chris Wilson @ 2017-10-15 14:30 UTC
To: intel-gfx
If we fail to recover the HW state upon resume (i.e. our attempt to
clear the wedged bit and reset during i915_gem_sanitize() fails), then
skip the HW restart inside i915_gem_init_hw(). We will ultimate do the
the HW restart when sucessfully unwedgeding and reseting the HW later,
but attempting to restore a wedged device upon resume is risky as the HW
is in an unknown state.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
drivers/gpu/drm/i915/i915_gem.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d9d39b309ce8..5993222c81ae 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4835,6 +4835,10 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
 	init_unused_rings(dev_priv);
 
 	BUG_ON(!dev_priv->kernel_context);
+	if (i915_terminally_wedged(&dev_priv->gpu_error)) {
+		ret = -EIO;
+		goto out;
+	}
 
 	ret = i915_ppgtt_init_hw(dev_priv);
 	if (ret) {
--
2.15.0.rc0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* [PATCH v2] drm/i915: Skip HW reinitialisation on resume if still wedged
2017-10-15 14:30 [PATCH] drm/i915: Skip HW reinitialisation on resume if still wedged Chris Wilson
@ 2017-10-15 14:37 ` Chris Wilson
2017-10-15 20:31 ` Chris Wilson
` (2 more replies)
2017-10-15 14:53 ` ✓ Fi.CI.BAT: success for " Patchwork
` (3 subsequent siblings)
4 siblings, 3 replies; 11+ messages in thread
From: Chris Wilson @ 2017-10-15 14:37 UTC
To: intel-gfx
If we fail to recover the HW state upon resume (i.e. our attempt to
clear the wedged bit and reset during i915_gem_sanitize() fails), then
skip the HW restart inside i915_gem_init_hw(). We will ultimately do the
HW restart when successfully unwedging and resetting the HW later,
but attempting to restore a wedged device upon resume is risky as the HW
is in an unknown state.
v2: Suppress the error message when detecting the already wedged HW.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
drivers/gpu/drm/i915/i915_gem.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d9d39b309ce8..449f8c3788b1 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4835,6 +4835,10 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
 	init_unused_rings(dev_priv);
 
 	BUG_ON(!dev_priv->kernel_context);
+	if (i915_terminally_wedged(&dev_priv->gpu_error)) {
+		ret = -EIO;
+		goto out;
+	}
 
 	ret = i915_ppgtt_init_hw(dev_priv);
 	if (ret) {
@@ -4933,8 +4937,10 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 		 * wedged. But we only want to do this where the GPU is angry,
 		 * for all other failure, such as an allocation failure, bail.
 		 */
-		DRM_ERROR("Failed to initialize GPU, declaring it wedged\n");
-		i915_gem_set_wedged(dev_priv);
+		if (!i915_terminally_wedged(&dev_priv->gpu_error)) {
+			DRM_ERROR("Failed to initialize GPU, declaring it wedged\n");
+			i915_gem_set_wedged(dev_priv);
+		}
 		ret = 0;
 	}
 
--
2.15.0.rc0
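
The combined effect of the two hunks in v2 can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct mock_gpu`, `mock_init_hw()` and `mock_init()` are hypothetical stand-ins invented for illustration, not the real i915 structures or functions, and the `restart_fails` flag is an assumed way of simulating an "angry" GPU. It models an init_hw step that returns -EIO early when the device is already wedged, and a caller that only logs and sets the wedged state when that state is new.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for driver state; not the real drm_i915_private. */
struct mock_gpu {
	bool wedged;         /* models the terminally-wedged state */
	bool restart_fails;  /* assumption: simulates a HW restart that fails */
	int engine_restarts; /* number of HW restart attempts */
	int error_messages;  /* number of times an error would be logged */
};

/* Models the early exit added to i915_gem_init_hw() by the patch. */
int mock_init_hw(struct mock_gpu *gpu)
{
	assert(gpu != NULL);

	/* Already wedged: do not touch the engines at all. */
	if (gpu->wedged)
		return -EIO;

	gpu->engine_restarts++;
	return gpu->restart_fails ? -EIO : 0;
}

/* Models the -EIO handling in i915_gem_init() after the v2 change. */
int mock_init(struct mock_gpu *gpu)
{
	int ret = mock_init_hw(gpu);

	if (ret == -EIO) {
		/* Only log and declare the device wedged if it was not already. */
		if (!gpu->wedged) {
			gpu->error_messages++;
			gpu->wedged = true;
		}
		ret = 0;
	}
	return ret;
}
```

In this model, calling `mock_init()` on a device that starts with `wedged` set returns 0 with no restart attempt and no logged error, while a device whose restart fails is marked wedged and logs exactly once.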
* ✓ Fi.CI.BAT: success for drm/i915: Skip HW reinitialisation on resume if still wedged
2017-10-15 14:30 [PATCH] drm/i915: Skip HW reinitialisation on resume if still wedged Chris Wilson
2017-10-15 14:37 ` [PATCH v2] " Chris Wilson
@ 2017-10-15 14:53 ` Patchwork
2017-10-15 15:12 ` ✓ Fi.CI.BAT: success for drm/i915: Skip HW reinitialisation on resume if still wedged (rev2) Patchwork
` (2 subsequent siblings)
4 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2017-10-15 14:53 UTC
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Skip HW reinitialisation on resume if still wedged
URL : https://patchwork.freedesktop.org/series/31987/
State : success
== Summary ==
Series 31987v1 drm/i915: Skip HW reinitialisation on resume if still wedged
https://patchwork.freedesktop.org/api/1.0/series/31987/revisions/1/mbox/
Test chamelium:
Subgroup dp-edid-read:
pass -> FAIL (fi-kbl-7500u) fdo#102672
fdo#102672 https://bugs.freedesktop.org/show_bug.cgi?id=102672
fi-bdw-5557u total:289 pass:268 dwarn:0 dfail:0 fail:0 skip:21 time:460s
fi-bdw-gvtdvm total:289 pass:265 dwarn:0 dfail:0 fail:0 skip:24 time:472s
fi-blb-e6850 total:289 pass:223 dwarn:1 dfail:0 fail:0 skip:65 time:389s
fi-bsw-n3050 total:289 pass:243 dwarn:0 dfail:0 fail:0 skip:46 time:582s
fi-bwr-2160 total:289 pass:183 dwarn:0 dfail:0 fail:0 skip:106 time:288s
fi-bxt-dsi total:289 pass:259 dwarn:0 dfail:0 fail:0 skip:30 time:530s
fi-bxt-j4205 total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:516s
fi-byt-j1900 total:289 pass:253 dwarn:1 dfail:0 fail:0 skip:35 time:542s
fi-byt-n2820 total:289 pass:249 dwarn:1 dfail:0 fail:0 skip:39 time:525s
fi-cfl-s total:289 pass:253 dwarn:4 dfail:0 fail:0 skip:32 time:567s
fi-elk-e7500 total:289 pass:229 dwarn:0 dfail:0 fail:0 skip:60 time:440s
fi-gdg-551 total:289 pass:178 dwarn:1 dfail:0 fail:1 skip:109 time:275s
fi-glk-1 total:289 pass:261 dwarn:0 dfail:0 fail:0 skip:28 time:607s
fi-hsw-4770r total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:445s
fi-ilk-650 total:289 pass:228 dwarn:0 dfail:0 fail:0 skip:61 time:465s
fi-ivb-3520m total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:512s
fi-ivb-3770 total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:481s
fi-kbl-7500u total:289 pass:263 dwarn:1 dfail:0 fail:1 skip:24 time:506s
fi-kbl-7567u total:289 pass:269 dwarn:0 dfail:0 fail:0 skip:20 time:490s
fi-kbl-r total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:600s
fi-pnv-d510 total:289 pass:222 dwarn:1 dfail:0 fail:0 skip:66 time:653s
fi-skl-6260u total:289 pass:269 dwarn:0 dfail:0 fail:0 skip:20 time:471s
fi-skl-6700hq total:289 pass:263 dwarn:0 dfail:0 fail:0 skip:26 time:663s
fi-skl-6700k total:289 pass:265 dwarn:0 dfail:0 fail:0 skip:24 time:536s
fi-skl-6770hq total:289 pass:269 dwarn:0 dfail:0 fail:0 skip:20 time:511s
fi-skl-gvtdvm total:289 pass:266 dwarn:0 dfail:0 fail:0 skip:23 time:477s
fi-snb-2520m total:289 pass:250 dwarn:0 dfail:0 fail:0 skip:39 time:591s
fi-snb-2600 total:289 pass:249 dwarn:0 dfail:0 fail:0 skip:40 time:430s
3d7ee91be487380ef6cad329fafbe424f6885372 drm-tip: 2017y-10m-14d-00h-14m-47s UTC integration manifest
bac6521dc7aa drm/i915: Skip HW reinitialisation on resume if still wedged
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6041/
* ✓ Fi.CI.BAT: success for drm/i915: Skip HW reinitialisation on resume if still wedged (rev2)
2017-10-15 14:30 [PATCH] drm/i915: Skip HW reinitialisation on resume if still wedged Chris Wilson
2017-10-15 14:37 ` [PATCH v2] " Chris Wilson
2017-10-15 14:53 ` ✓ Fi.CI.BAT: success for " Patchwork
@ 2017-10-15 15:12 ` Patchwork
2017-10-15 16:10 ` ✓ Fi.CI.IGT: " Patchwork
2017-10-16 15:29 ` [PATCH] drm/i915: Skip HW reinitialisation on resume if still wedged Mika Kuoppala
4 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2017-10-15 15:12 UTC
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Skip HW reinitialisation on resume if still wedged (rev2)
URL : https://patchwork.freedesktop.org/series/31987/
State : success
== Summary ==
Series 31987v2 drm/i915: Skip HW reinitialisation on resume if still wedged
https://patchwork.freedesktop.org/api/1.0/series/31987/revisions/2/mbox/
Test gem_exec_suspend:
Subgroup basic-s3:
dmesg-warn -> PASS (fi-cfl-s) fdo#103186
Test drv_module_reload:
Subgroup basic-reload-inject:
dmesg-warn -> INCOMPLETE (fi-cfl-s) fdo#103206
fdo#103186 https://bugs.freedesktop.org/show_bug.cgi?id=103186
fdo#103206 https://bugs.freedesktop.org/show_bug.cgi?id=103206
fi-bdw-5557u total:289 pass:268 dwarn:0 dfail:0 fail:0 skip:21 time:463s
fi-bdw-gvtdvm total:289 pass:265 dwarn:0 dfail:0 fail:0 skip:24 time:480s
fi-blb-e6850 total:289 pass:223 dwarn:1 dfail:0 fail:0 skip:65 time:387s
fi-bsw-n3050 total:289 pass:243 dwarn:0 dfail:0 fail:0 skip:46 time:570s
fi-bwr-2160 total:289 pass:183 dwarn:0 dfail:0 fail:0 skip:106 time:287s
fi-bxt-dsi total:289 pass:259 dwarn:0 dfail:0 fail:0 skip:30 time:520s
fi-bxt-j4205 total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:521s
fi-byt-j1900 total:289 pass:253 dwarn:1 dfail:0 fail:0 skip:35 time:531s
fi-byt-n2820 total:289 pass:249 dwarn:1 dfail:0 fail:0 skip:39 time:528s
fi-cfl-s total:288 pass:254 dwarn:2 dfail:0 fail:0 skip:31
fi-elk-e7500 total:289 pass:229 dwarn:0 dfail:0 fail:0 skip:60 time:442s
fi-gdg-551 total:289 pass:178 dwarn:1 dfail:0 fail:1 skip:109 time:273s
fi-glk-1 total:289 pass:261 dwarn:0 dfail:0 fail:0 skip:28 time:602s
fi-hsw-4770r total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:438s
fi-ilk-650 total:289 pass:228 dwarn:0 dfail:0 fail:0 skip:61 time:456s
fi-ivb-3520m total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:501s
fi-ivb-3770 total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:471s
fi-kbl-7500u total:289 pass:264 dwarn:1 dfail:0 fail:0 skip:24 time:502s
fi-kbl-7567u total:289 pass:269 dwarn:0 dfail:0 fail:0 skip:20 time:485s
fi-kbl-r total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:596s
fi-pnv-d510 total:289 pass:222 dwarn:1 dfail:0 fail:0 skip:66 time:651s
fi-skl-6260u total:289 pass:269 dwarn:0 dfail:0 fail:0 skip:20 time:469s
fi-skl-6700hq total:289 pass:263 dwarn:0 dfail:0 fail:0 skip:26 time:656s
fi-skl-6700k total:289 pass:265 dwarn:0 dfail:0 fail:0 skip:24 time:531s
fi-skl-6770hq total:289 pass:269 dwarn:0 dfail:0 fail:0 skip:20 time:572s
fi-skl-gvtdvm total:289 pass:266 dwarn:0 dfail:0 fail:0 skip:23 time:471s
fi-snb-2520m total:289 pass:250 dwarn:0 dfail:0 fail:0 skip:39 time:582s
fi-snb-2600 total:289 pass:249 dwarn:0 dfail:0 fail:0 skip:40 time:429s
3d7ee91be487380ef6cad329fafbe424f6885372 drm-tip: 2017y-10m-14d-00h-14m-47s UTC integration manifest
fd6af56a8f26 drm/i915: Skip HW reinitialisation on resume if still wedged
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6042/
* ✓ Fi.CI.IGT: success for drm/i915: Skip HW reinitialisation on resume if still wedged (rev2)
2017-10-15 14:30 [PATCH] drm/i915: Skip HW reinitialisation on resume if still wedged Chris Wilson
` (2 preceding siblings ...)
2017-10-15 15:12 ` ✓ Fi.CI.BAT: success for drm/i915: Skip HW reinitialisation on resume if still wedged (rev2) Patchwork
@ 2017-10-15 16:10 ` Patchwork
2017-10-16 15:29 ` [PATCH] drm/i915: Skip HW reinitialisation on resume if still wedged Mika Kuoppala
4 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2017-10-15 16:10 UTC
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Skip HW reinitialisation on resume if still wedged (rev2)
URL : https://patchwork.freedesktop.org/series/31987/
State : success
== Summary ==
Test kms_plane:
Subgroup plane-panning-bottom-right-suspend-pipe-C-planes:
skip -> PASS (shard-hsw)
Test kms_frontbuffer_tracking:
Subgroup fbc-rgb101010-draw-mmap-gtt:
skip -> PASS (shard-hsw)
shard-hsw total:2553 pass:1441 dwarn:0 dfail:0 fail:9 skip:1103 time:9681s
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6042/shards.html
* Re: [PATCH v2] drm/i915: Skip HW reinitialisation on resume if still wedged
2017-10-15 14:37 ` [PATCH v2] " Chris Wilson
@ 2017-10-15 20:31 ` Chris Wilson
2017-10-16 14:24 ` Mika Kuoppala
2017-10-16 15:30 ` Mika Kuoppala
2 siblings, 0 replies; 11+ messages in thread
From: Chris Wilson @ 2017-10-15 20:31 UTC
To: intel-gfx
Quoting Chris Wilson (2017-10-15 15:37:25)
> If we fail to recover the HW state upon resume (i.e. our attempt to
> clear the wedged bit and reset during i915_gem_sanitize() fails), then
> skip the HW restart inside i915_gem_init_hw(). We will ultimately do the
> HW restart when successfully unwedging and resetting the HW later,
> but attempting to restore a wedged device upon resume is risky as the HW
> is in an unknown state.
>
> v2: Suppress the error message when detecting the already wedged HW.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103240
Testcase: igt/gem_eio/in-flight-suspend
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
-Chris
* Re: [PATCH v2] drm/i915: Skip HW reinitialisation on resume if still wedged
2017-10-15 14:37 ` [PATCH v2] " Chris Wilson
2017-10-15 20:31 ` Chris Wilson
@ 2017-10-16 14:24 ` Mika Kuoppala
2017-10-16 14:28 ` Chris Wilson
2017-10-16 15:30 ` Mika Kuoppala
2 siblings, 1 reply; 11+ messages in thread
From: Mika Kuoppala @ 2017-10-16 14:24 UTC
To: Chris Wilson, intel-gfx
Chris Wilson <chris@chris-wilson.co.uk> writes:
> If we fail to recover the HW state upon resume (i.e. our attempt to
> clear the wedged bit and reset during i915_gem_sanitize() fails), then
> skip the HW restart inside i915_gem_init_hw(). We will ultimately do the
> HW restart when successfully unwedging and resetting the HW later,
> but attempting to restore a wedged device upon resume is risky as the HW
> is in an unknown state.
>
> v2: Suppress the error message when detecting the already wedged HW.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index d9d39b309ce8..449f8c3788b1 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4835,6 +4835,10 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
> init_unused_rings(dev_priv);
>
> BUG_ON(!dev_priv->kernel_context);
> + if (i915_terminally_wedged(&dev_priv->gpu_error)) {
> + ret = -EIO;
> + goto out;
> + }
>
You have done some hw initialization already before this point.
Is there a reason for not moving this right before acquiring
forcewake?
-Mika
> ret = i915_ppgtt_init_hw(dev_priv);
> if (ret) {
> @@ -4933,8 +4937,10 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
> * wedged. But we only want to do this where the GPU is angry,
> * for all other failure, such as an allocation failure, bail.
> */
> - DRM_ERROR("Failed to initialize GPU, declaring it wedged\n");
> - i915_gem_set_wedged(dev_priv);
> + if (!i915_terminally_wedged(&dev_priv->gpu_error)) {
> + DRM_ERROR("Failed to initialize GPU, declaring it wedged\n");
> + i915_gem_set_wedged(dev_priv);
> + }
> ret = 0;
> }
>
> --
> 2.15.0.rc0
* Re: [PATCH v2] drm/i915: Skip HW reinitialisation on resume if still wedged
2017-10-16 14:24 ` Mika Kuoppala
@ 2017-10-16 14:28 ` Chris Wilson
0 siblings, 0 replies; 11+ messages in thread
From: Chris Wilson @ 2017-10-16 14:28 UTC
To: Mika Kuoppala, intel-gfx
Quoting Mika Kuoppala (2017-10-16 15:24:33)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
> > If we fail to recover the HW state upon resume (i.e. our attempt to
> > clear the wedged bit and reset during i915_gem_sanitize() fails), then
> > skip the HW restart inside i915_gem_init_hw(). We will ultimately do the
> > HW restart when successfully unwedging and resetting the HW later,
> > but attempting to restore a wedged device upon resume is risky as the HW
> > is in an unknown state.
> >
> > v2: Suppress the error message when detecting the already wedged HW.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > ---
> > drivers/gpu/drm/i915/i915_gem.c | 10 ++++++++--
> > 1 file changed, 8 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index d9d39b309ce8..449f8c3788b1 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -4835,6 +4835,10 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
> > init_unused_rings(dev_priv);
> >
> > BUG_ON(!dev_priv->kernel_context);
> > + if (i915_terminally_wedged(&dev_priv->gpu_error)) {
> > + ret = -EIO;
> > + goto out;
> > + }
> >
>
> You have done some hw initialization already before this point.
> Is there a reason for not moving this right before acquiring
> forcewake?
init_unused_rings() is part of the sanitisation I wanted to keep. The
other mmio writes we need to sort out in the right w/a category; if they
are display related we need to keep them. Hence, being chicken and
sticking the escape clause here, right before we commit to restarting
the engines.
-Chris
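
The placement argument in the reply above can be sketched as a small model of the step ordering. The names here (`enum mock_step`, `struct mock_hw`, `mock_init_hw_ordered()`) are hypothetical and exist only for illustration; the real function performs more work than these three steps, and this sketch assumes only what the thread states: ring sanitisation is kept, and the wedged check sits just before the part that restarts the engines.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical step markers; each stands in for a stage named in the thread. */
enum mock_step {
	STEP_SANITISE_RINGS = 1 << 0, /* stands in for init_unused_rings() */
	STEP_PPGTT          = 1 << 1, /* stands in for i915_ppgtt_init_hw() */
	STEP_ENGINES        = 1 << 2, /* stands in for restarting the engines */
};

/* Hypothetical device model recording which steps ran. */
struct mock_hw {
	bool wedged;
	unsigned int steps_done;
};

int mock_init_hw_ordered(struct mock_hw *hw)
{
	assert(hw != NULL);

	/* Sanitisation is kept, even when the device is wedged. */
	hw->steps_done |= STEP_SANITISE_RINGS;

	/* Escape clause: bail out right before committing to the restart. */
	if (hw->wedged)
		return -EIO;

	hw->steps_done |= STEP_PPGTT;
	hw->steps_done |= STEP_ENGINES;
	return 0;
}
```

With `wedged` set, only the sanitisation step is recorded and the call returns -EIO; otherwise all three steps run. Moving the check earlier, as the review question suggested, would also skip the sanitisation step in this model.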
* Re: [PATCH] drm/i915: Skip HW reinitialisation on resume if still wedged
2017-10-15 14:30 [PATCH] drm/i915: Skip HW reinitialisation on resume if still wedged Chris Wilson
` (3 preceding siblings ...)
2017-10-15 16:10 ` ✓ Fi.CI.IGT: " Patchwork
@ 2017-10-16 15:29 ` Mika Kuoppala
4 siblings, 0 replies; 11+ messages in thread
From: Mika Kuoppala @ 2017-10-16 15:29 UTC
To: Chris Wilson, intel-gfx
Chris Wilson <chris@chris-wilson.co.uk> writes:
> If we fail to recover the HW state upon resume (i.e. our attempt to
> clear the wedged bit and reset during i915_gem_sanitize() fails), then
> skip the HW restart inside i915_gem_init_hw(). We will ultimate do the
> the HW restart when sucessfully unwedgeding and reseting the HW later,
successfully unwedging
> but attempting to restore a wedged device upon resume is risky as the HW
> is in an unknown state.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index d9d39b309ce8..5993222c81ae 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4835,6 +4835,10 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
> init_unused_rings(dev_priv);
>
> BUG_ON(!dev_priv->kernel_context);
> + if (i915_terminally_wedged(&dev_priv->gpu_error)) {
> + ret = -EIO;
> + goto out;
> + }
>
> ret = i915_ppgtt_init_hw(dev_priv);
> if (ret) {
> --
> 2.15.0.rc0
* Re: [PATCH v2] drm/i915: Skip HW reinitialisation on resume if still wedged
2017-10-15 14:37 ` [PATCH v2] " Chris Wilson
2017-10-15 20:31 ` Chris Wilson
2017-10-16 14:24 ` Mika Kuoppala
@ 2017-10-16 15:30 ` Mika Kuoppala
2017-10-16 20:15 ` Chris Wilson
2 siblings, 1 reply; 11+ messages in thread
From: Mika Kuoppala @ 2017-10-16 15:30 UTC
To: Chris Wilson, intel-gfx
Chris Wilson <chris@chris-wilson.co.uk> writes:
> If we fail to recover the HW state upon resume (i.e. our attempt to
> clear the wedged bit and reset during i915_gem_sanitize() fails), then
> skip the HW restart inside i915_gem_init_hw(). We will ultimately do the
> HW restart when successfully unwedging and resetting the HW later,
> but attempting to restore a wedged device upon resume is risky as the HW
> is in an unknown state.
>
> v2: Suppress the error message when detecting the already wedged HW.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Stamping the right version is also helpful.
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index d9d39b309ce8..449f8c3788b1 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4835,6 +4835,10 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
> init_unused_rings(dev_priv);
>
> BUG_ON(!dev_priv->kernel_context);
> + if (i915_terminally_wedged(&dev_priv->gpu_error)) {
> + ret = -EIO;
> + goto out;
> + }
>
> ret = i915_ppgtt_init_hw(dev_priv);
> if (ret) {
> @@ -4933,8 +4937,10 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
> * wedged. But we only want to do this where the GPU is angry,
> * for all other failure, such as an allocation failure, bail.
> */
> - DRM_ERROR("Failed to initialize GPU, declaring it wedged\n");
> - i915_gem_set_wedged(dev_priv);
> + if (!i915_terminally_wedged(&dev_priv->gpu_error)) {
> + DRM_ERROR("Failed to initialize GPU, declaring it wedged\n");
> + i915_gem_set_wedged(dev_priv);
> + }
> ret = 0;
> }
>
> --
> 2.15.0.rc0
* Re: [PATCH v2] drm/i915: Skip HW reinitialisation on resume if still wedged
2017-10-16 15:30 ` Mika Kuoppala
@ 2017-10-16 20:15 ` Chris Wilson
0 siblings, 0 replies; 11+ messages in thread
From: Chris Wilson @ 2017-10-16 20:15 UTC
To: Mika Kuoppala, intel-gfx
Quoting Mika Kuoppala (2017-10-16 16:30:33)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
> > If we fail to recover the HW state upon resume (i.e. our attempt to
> > clear the wedged bit and reset during i915_gem_sanitize() fails), then
> > skip the HW restart inside i915_gem_init_hw(). We will ultimately do the
> > HW restart when successfully unwedging and resetting the HW later,
> > but attempting to restore a wedged device upon resume is risky as the HW
> > is in an unknown state.
> >
> > v2: Suppress the error message when detecting the already wedged HW.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>
> Stamping the right version is also helpful.
>
> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Thanks for taking the time to question the code carefully. This clears
up CI, but I am still able to kill snb at the moment with
gem_exec_whisper/hang-normal (though since we have reset enabled, it
looks to be a different problem).
One step forward, and pushed,
-Chris