* [PATCH] drm/i915: Skip HW reinitialisation on resume if still wedged
@ 2017-10-15 14:30 Chris Wilson
2017-10-15 14:37 ` [PATCH v2] " Chris Wilson
` (4 more replies)
0 siblings, 5 replies; 11+ messages in thread
From: Chris Wilson @ 2017-10-15 14:30 UTC (permalink / raw)
To: intel-gfx
If we fail to recover the HW state upon resume (i.e. our attempt to
clear the wedged bit and reset during i915_gem_sanitize() fails), then
skip the HW restart inside i915_gem_init_hw(). We will ultimately do the
HW restart when successfully unwedging and resetting the HW later,
but attempting to restore a wedged device upon resume is risky as the HW
is in an unknown state.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
drivers/gpu/drm/i915/i915_gem.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d9d39b309ce8..5993222c81ae 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4835,6 +4835,10 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
 	init_unused_rings(dev_priv);
 
 	BUG_ON(!dev_priv->kernel_context);
+	if (i915_terminally_wedged(&dev_priv->gpu_error)) {
+		ret = -EIO;
+		goto out;
+	}
 
 	ret = i915_ppgtt_init_hw(dev_priv);
 	if (ret) {
--
2.15.0.rc0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* [PATCH v2] drm/i915: Skip HW reinitialisation on resume if still wedged
2017-10-15 14:30 [PATCH] drm/i915: Skip HW reinitialisation on resume if still wedged Chris Wilson
@ 2017-10-15 14:37 ` Chris Wilson
2017-10-15 20:31 ` Chris Wilson
` (2 more replies)
2017-10-15 14:53 ` ✓ Fi.CI.BAT: success for " Patchwork
` (3 subsequent siblings)
4 siblings, 3 replies; 11+ messages in thread
From: Chris Wilson @ 2017-10-15 14:37 UTC (permalink / raw)
To: intel-gfx
If we fail to recover the HW state upon resume (i.e. our attempt to
clear the wedged bit and reset during i915_gem_sanitize() fails), then
skip the HW restart inside i915_gem_init_hw(). We will ultimately do the
HW restart when successfully unwedging and resetting the HW later,
but attempting to restore a wedged device upon resume is risky as the HW
is in an unknown state.
v2: Suppress the error message when detecting the already wedged HW.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
drivers/gpu/drm/i915/i915_gem.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d9d39b309ce8..449f8c3788b1 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4835,6 +4835,10 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
 	init_unused_rings(dev_priv);
 
 	BUG_ON(!dev_priv->kernel_context);
+	if (i915_terminally_wedged(&dev_priv->gpu_error)) {
+		ret = -EIO;
+		goto out;
+	}
 
 	ret = i915_ppgtt_init_hw(dev_priv);
 	if (ret) {
@@ -4933,8 +4937,10 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 		 * wedged. But we only want to do this where the GPU is angry,
 		 * for all other failure, such as an allocation failure, bail.
 		 */
-		DRM_ERROR("Failed to initialize GPU, declaring it wedged\n");
-		i915_gem_set_wedged(dev_priv);
+		if (!i915_terminally_wedged(&dev_priv->gpu_error)) {
+			DRM_ERROR("Failed to initialize GPU, declaring it wedged\n");
+			i915_gem_set_wedged(dev_priv);
+		}
 		ret = 0;
 	}
 
--
2.15.0.rc0
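
[Illustrative note, not part of the patch] The combined effect of the two
hunks in v2 can be sketched as a small userspace model. This is only a sketch
under stated assumptions: struct fake_gpu, fake_init_hw() and fake_init() are
hypothetical stand-ins invented for illustration, the "wedged" flag models
what i915_terminally_wedged() reports, and fprintf() stands in for
DRM_ERROR(). None of this is the driver's actual code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for driver state; not the real drm_i915_private. */
struct fake_gpu {
	bool wedged;         /* models the result of i915_terminally_wedged() */
	bool hw_init_fails;  /* models a genuine -EIO while restarting the HW */
	int engine_restarts; /* number of times the engine restart was attempted */
	int errors_logged;   /* number of times the error message was emitted */
};

/* Models the first hunk: bail out with -EIO before touching the engines. */
static int fake_init_hw(struct fake_gpu *gpu)
{
	if (gpu->wedged)
		return -EIO;

	gpu->engine_restarts++;
	return gpu->hw_init_fails ? -EIO : 0;
}

/* Models the second hunk: only complain and wedge if not already wedged. */
static int fake_init(struct fake_gpu *gpu)
{
	int ret = fake_init_hw(gpu);

	if (ret == -EIO) {
		if (!gpu->wedged) {
			fprintf(stderr,
				"Failed to initialize GPU, declaring it wedged\n");
			gpu->errors_logged++;
			gpu->wedged = true;
		}
		ret = 0; /* -EIO is absorbed in both cases */
	}
	return ret;
}
```

In this model, calling fake_init() on a device that is already wedged returns
0 without restarting the engines or logging anything, whereas a fresh -EIO
from the restart still logs once and marks the device wedged.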
* ✓ Fi.CI.BAT: success for drm/i915: Skip HW reinitialisation on resume if still wedged
2017-10-15 14:30 [PATCH] drm/i915: Skip HW reinitialisation on resume if still wedged Chris Wilson
2017-10-15 14:37 ` [PATCH v2] " Chris Wilson
@ 2017-10-15 14:53 ` Patchwork
2017-10-15 15:12 ` ✓ Fi.CI.BAT: success for drm/i915: Skip HW reinitialisation on resume if still wedged (rev2) Patchwork
` (2 subsequent siblings)
4 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2017-10-15 14:53 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Skip HW reinitialisation on resume if still wedged
URL : https://patchwork.freedesktop.org/series/31987/
State : success
== Summary ==
Series 31987v1 drm/i915: Skip HW reinitialisation on resume if still wedged
https://patchwork.freedesktop.org/api/1.0/series/31987/revisions/1/mbox/
Test chamelium:
Subgroup dp-edid-read:
pass -> FAIL (fi-kbl-7500u) fdo#102672
fdo#102672 https://bugs.freedesktop.org/show_bug.cgi?id=102672
fi-bdw-5557u total:289 pass:268 dwarn:0 dfail:0 fail:0 skip:21 time:460s
fi-bdw-gvtdvm total:289 pass:265 dwarn:0 dfail:0 fail:0 skip:24 time:472s
fi-blb-e6850 total:289 pass:223 dwarn:1 dfail:0 fail:0 skip:65 time:389s
fi-bsw-n3050 total:289 pass:243 dwarn:0 dfail:0 fail:0 skip:46 time:582s
fi-bwr-2160 total:289 pass:183 dwarn:0 dfail:0 fail:0 skip:106 time:288s
fi-bxt-dsi total:289 pass:259 dwarn:0 dfail:0 fail:0 skip:30 time:530s
fi-bxt-j4205 total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:516s
fi-byt-j1900 total:289 pass:253 dwarn:1 dfail:0 fail:0 skip:35 time:542s
fi-byt-n2820 total:289 pass:249 dwarn:1 dfail:0 fail:0 skip:39 time:525s
fi-cfl-s total:289 pass:253 dwarn:4 dfail:0 fail:0 skip:32 time:567s
fi-elk-e7500 total:289 pass:229 dwarn:0 dfail:0 fail:0 skip:60 time:440s
fi-gdg-551 total:289 pass:178 dwarn:1 dfail:0 fail:1 skip:109 time:275s
fi-glk-1 total:289 pass:261 dwarn:0 dfail:0 fail:0 skip:28 time:607s
fi-hsw-4770r total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:445s
fi-ilk-650 total:289 pass:228 dwarn:0 dfail:0 fail:0 skip:61 time:465s
fi-ivb-3520m total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:512s
fi-ivb-3770 total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:481s
fi-kbl-7500u total:289 pass:263 dwarn:1 dfail:0 fail:1 skip:24 time:506s
fi-kbl-7567u total:289 pass:269 dwarn:0 dfail:0 fail:0 skip:20 time:490s
fi-kbl-r total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:600s
fi-pnv-d510 total:289 pass:222 dwarn:1 dfail:0 fail:0 skip:66 time:653s
fi-skl-6260u total:289 pass:269 dwarn:0 dfail:0 fail:0 skip:20 time:471s
fi-skl-6700hq total:289 pass:263 dwarn:0 dfail:0 fail:0 skip:26 time:663s
fi-skl-6700k total:289 pass:265 dwarn:0 dfail:0 fail:0 skip:24 time:536s
fi-skl-6770hq total:289 pass:269 dwarn:0 dfail:0 fail:0 skip:20 time:511s
fi-skl-gvtdvm total:289 pass:266 dwarn:0 dfail:0 fail:0 skip:23 time:477s
fi-snb-2520m total:289 pass:250 dwarn:0 dfail:0 fail:0 skip:39 time:591s
fi-snb-2600 total:289 pass:249 dwarn:0 dfail:0 fail:0 skip:40 time:430s
3d7ee91be487380ef6cad329fafbe424f6885372 drm-tip: 2017y-10m-14d-00h-14m-47s UTC integration manifest
bac6521dc7aa drm/i915: Skip HW reinitialisation on resume if still wedged
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6041/
* ✓ Fi.CI.BAT: success for drm/i915: Skip HW reinitialisation on resume if still wedged (rev2)
2017-10-15 14:30 [PATCH] drm/i915: Skip HW reinitialisation on resume if still wedged Chris Wilson
2017-10-15 14:37 ` [PATCH v2] " Chris Wilson
2017-10-15 14:53 ` ✓ Fi.CI.BAT: success for " Patchwork
@ 2017-10-15 15:12 ` Patchwork
2017-10-15 16:10 ` ✓ Fi.CI.IGT: " Patchwork
2017-10-16 15:29 ` [PATCH] drm/i915: Skip HW reinitialisation on resume if still wedged Mika Kuoppala
4 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2017-10-15 15:12 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Skip HW reinitialisation on resume if still wedged (rev2)
URL : https://patchwork.freedesktop.org/series/31987/
State : success
== Summary ==
Series 31987v2 drm/i915: Skip HW reinitialisation on resume if still wedged
https://patchwork.freedesktop.org/api/1.0/series/31987/revisions/2/mbox/
Test gem_exec_suspend:
Subgroup basic-s3:
dmesg-warn -> PASS (fi-cfl-s) fdo#103186
Test drv_module_reload:
Subgroup basic-reload-inject:
dmesg-warn -> INCOMPLETE (fi-cfl-s) fdo#103206
fdo#103186 https://bugs.freedesktop.org/show_bug.cgi?id=103186
fdo#103206 https://bugs.freedesktop.org/show_bug.cgi?id=103206
fi-bdw-5557u total:289 pass:268 dwarn:0 dfail:0 fail:0 skip:21 time:463s
fi-bdw-gvtdvm total:289 pass:265 dwarn:0 dfail:0 fail:0 skip:24 time:480s
fi-blb-e6850 total:289 pass:223 dwarn:1 dfail:0 fail:0 skip:65 time:387s
fi-bsw-n3050 total:289 pass:243 dwarn:0 dfail:0 fail:0 skip:46 time:570s
fi-bwr-2160 total:289 pass:183 dwarn:0 dfail:0 fail:0 skip:106 time:287s
fi-bxt-dsi total:289 pass:259 dwarn:0 dfail:0 fail:0 skip:30 time:520s
fi-bxt-j4205 total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:521s
fi-byt-j1900 total:289 pass:253 dwarn:1 dfail:0 fail:0 skip:35 time:531s
fi-byt-n2820 total:289 pass:249 dwarn:1 dfail:0 fail:0 skip:39 time:528s
fi-cfl-s total:288 pass:254 dwarn:2 dfail:0 fail:0 skip:31
fi-elk-e7500 total:289 pass:229 dwarn:0 dfail:0 fail:0 skip:60 time:442s
fi-gdg-551 total:289 pass:178 dwarn:1 dfail:0 fail:1 skip:109 time:273s
fi-glk-1 total:289 pass:261 dwarn:0 dfail:0 fail:0 skip:28 time:602s
fi-hsw-4770r total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:438s
fi-ilk-650 total:289 pass:228 dwarn:0 dfail:0 fail:0 skip:61 time:456s
fi-ivb-3520m total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:501s
fi-ivb-3770 total:289 pass:260 dwarn:0 dfail:0 fail:0 skip:29 time:471s
fi-kbl-7500u total:289 pass:264 dwarn:1 dfail:0 fail:0 skip:24 time:502s
fi-kbl-7567u total:289 pass:269 dwarn:0 dfail:0 fail:0 skip:20 time:485s
fi-kbl-r total:289 pass:262 dwarn:0 dfail:0 fail:0 skip:27 time:596s
fi-pnv-d510 total:289 pass:222 dwarn:1 dfail:0 fail:0 skip:66 time:651s
fi-skl-6260u total:289 pass:269 dwarn:0 dfail:0 fail:0 skip:20 time:469s
fi-skl-6700hq total:289 pass:263 dwarn:0 dfail:0 fail:0 skip:26 time:656s
fi-skl-6700k total:289 pass:265 dwarn:0 dfail:0 fail:0 skip:24 time:531s
fi-skl-6770hq total:289 pass:269 dwarn:0 dfail:0 fail:0 skip:20 time:572s
fi-skl-gvtdvm total:289 pass:266 dwarn:0 dfail:0 fail:0 skip:23 time:471s
fi-snb-2520m total:289 pass:250 dwarn:0 dfail:0 fail:0 skip:39 time:582s
fi-snb-2600 total:289 pass:249 dwarn:0 dfail:0 fail:0 skip:40 time:429s
3d7ee91be487380ef6cad329fafbe424f6885372 drm-tip: 2017y-10m-14d-00h-14m-47s UTC integration manifest
fd6af56a8f26 drm/i915: Skip HW reinitialisation on resume if still wedged
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6042/
* ✓ Fi.CI.IGT: success for drm/i915: Skip HW reinitialisation on resume if still wedged (rev2)
2017-10-15 14:30 [PATCH] drm/i915: Skip HW reinitialisation on resume if still wedged Chris Wilson
` (2 preceding siblings ...)
2017-10-15 15:12 ` ✓ Fi.CI.BAT: success for drm/i915: Skip HW reinitialisation on resume if still wedged (rev2) Patchwork
@ 2017-10-15 16:10 ` Patchwork
2017-10-16 15:29 ` [PATCH] drm/i915: Skip HW reinitialisation on resume if still wedged Mika Kuoppala
4 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2017-10-15 16:10 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Skip HW reinitialisation on resume if still wedged (rev2)
URL : https://patchwork.freedesktop.org/series/31987/
State : success
== Summary ==
Test kms_plane:
Subgroup plane-panning-bottom-right-suspend-pipe-C-planes:
skip -> PASS (shard-hsw)
Test kms_frontbuffer_tracking:
Subgroup fbc-rgb101010-draw-mmap-gtt:
skip -> PASS (shard-hsw)
shard-hsw total:2553 pass:1441 dwarn:0 dfail:0 fail:9 skip:1103 time:9681s
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_6042/shards.html
* Re: [PATCH v2] drm/i915: Skip HW reinitialisation on resume if still wedged
2017-10-15 14:37 ` [PATCH v2] " Chris Wilson
@ 2017-10-15 20:31 ` Chris Wilson
2017-10-16 14:24 ` Mika Kuoppala
2017-10-16 15:30 ` Mika Kuoppala
2 siblings, 0 replies; 11+ messages in thread
From: Chris Wilson @ 2017-10-15 20:31 UTC (permalink / raw)
To: intel-gfx
Quoting Chris Wilson (2017-10-15 15:37:25)
> If we fail to recover the HW state upon resume (i.e. our attempt to
> clear the wedged bit and reset during i915_gem_sanitize() fails), then
> skip the HW restart inside i915_gem_init_hw(). We will ultimately do the
> HW restart when successfully unwedging and resetting the HW later,
> but attempting to restore a wedged device upon resume is risky as the HW
> is in an unknown state.
>
> v2: Suppress the error message when detecting the already wedged HW.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103240
Testcase: igt/gem_eio/in-flight-suspend
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
-Chris
* Re: [PATCH v2] drm/i915: Skip HW reinitialisation on resume if still wedged
2017-10-15 14:37 ` [PATCH v2] " Chris Wilson
2017-10-15 20:31 ` Chris Wilson
@ 2017-10-16 14:24 ` Mika Kuoppala
2017-10-16 14:28 ` Chris Wilson
2017-10-16 15:30 ` Mika Kuoppala
2 siblings, 1 reply; 11+ messages in thread
From: Mika Kuoppala @ 2017-10-16 14:24 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
Chris Wilson <chris@chris-wilson.co.uk> writes:
> If we fail to recover the HW state upon resume (i.e. our attempt to
> clear the wedged bit and reset during i915_gem_sanitize() fails), then
> skip the HW restart inside i915_gem_init_hw(). We will ultimately do the
> HW restart when successfully unwedging and resetting the HW later,
> but attempting to restore a wedged device upon resume is risky as the HW
> is in an unknown state.
>
> v2: Suppress the error message when detecting the already wedged HW.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index d9d39b309ce8..449f8c3788b1 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4835,6 +4835,10 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
> init_unused_rings(dev_priv);
>
> BUG_ON(!dev_priv->kernel_context);
> + if (i915_terminally_wedged(&dev_priv->gpu_error)) {
> + ret = -EIO;
> + goto out;
> + }
>
You have done some hw initialization already before this point.
Is there a reason for not moving this right before acquiring
forcewake?
-Mika
> ret = i915_ppgtt_init_hw(dev_priv);
> if (ret) {
> @@ -4933,8 +4937,10 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
> * wedged. But we only want to do this where the GPU is angry,
> * for all other failure, such as an allocation failure, bail.
> */
> - DRM_ERROR("Failed to initialize GPU, declaring it wedged\n");
> - i915_gem_set_wedged(dev_priv);
> + if (!i915_terminally_wedged(&dev_priv->gpu_error)) {
> + DRM_ERROR("Failed to initialize GPU, declaring it wedged\n");
> + i915_gem_set_wedged(dev_priv);
> + }
> ret = 0;
> }
>
> --
> 2.15.0.rc0
* Re: [PATCH v2] drm/i915: Skip HW reinitialisation on resume if still wedged
2017-10-16 14:24 ` Mika Kuoppala
@ 2017-10-16 14:28 ` Chris Wilson
0 siblings, 0 replies; 11+ messages in thread
From: Chris Wilson @ 2017-10-16 14:28 UTC (permalink / raw)
To: Mika Kuoppala, intel-gfx
Quoting Mika Kuoppala (2017-10-16 15:24:33)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
> > If we fail to recover the HW state upon resume (i.e. our attempt to
> > clear the wedged bit and reset during i915_gem_sanitize() fails), then
> > skip the HW restart inside i915_gem_init_hw(). We will ultimately do the
> > HW restart when successfully unwedging and resetting the HW later,
> > but attempting to restore a wedged device upon resume is risky as the HW
> > is in an unknown state.
> >
> > v2: Suppress the error message when detecting the already wedged HW.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > ---
> > drivers/gpu/drm/i915/i915_gem.c | 10 ++++++++--
> > 1 file changed, 8 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index d9d39b309ce8..449f8c3788b1 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -4835,6 +4835,10 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
> > init_unused_rings(dev_priv);
> >
> > BUG_ON(!dev_priv->kernel_context);
> > + if (i915_terminally_wedged(&dev_priv->gpu_error)) {
> > + ret = -EIO;
> > + goto out;
> > + }
> >
>
> You have done some hw initialization already before this point.
> Is there a reason for not moving this right before acquiring
> forcewake?
init_unused_rings() is part of the sanitisation I wanted to keep. The
other mmio writes we need to sort out in the right w/a category; if they
are display related we need to keep them. Hence, being chicken and
sticking the escape clause here, right before we commit to restarting
the engines.
-Chris
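
[Illustrative note, not part of the thread] The placement argued for above,
where the sanitisation steps still run and the escape clause sits just before
the engines are restarted, can be sketched roughly as follows. Everything here
is an assumption made for illustration: struct init_trace, record() and
sketch_init_hw() are hypothetical, and the stage names are loose labels for
the steps mentioned in the discussion, not an exact listing of what
i915_gem_init_hw() does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MAX_STEPS 8

/* Hypothetical trace of which init stages ran; purely illustrative. */
struct init_trace {
	const char *steps[MAX_STEPS];
	int count;
};

static void record(struct init_trace *t, const char *step)
{
	assert(t->count < MAX_STEPS);
	t->steps[t->count++] = step;
}

/* Sketch of the ordering: sanitise first, then decide whether to restart. */
static int sketch_init_hw(struct init_trace *t, bool wedged)
{
	/* Stages kept even when wedged (possibly display related). */
	record(t, "early mmio writes");
	record(t, "init_unused_rings");

	/* Escape clause: last point before committing to the restart. */
	if (wedged)
		return -EIO;

	record(t, "ppgtt init");
	record(t, "engine restart");
	return 0;
}
```

With wedged set, only the first two stages are recorded and -EIO is returned;
otherwise all four stages run. Moving the check to the top of the function
would skip the first two stages as well, which is the trade-off being
discussed.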
* Re: [PATCH] drm/i915: Skip HW reinitialisation on resume if still wedged
2017-10-15 14:30 [PATCH] drm/i915: Skip HW reinitialisation on resume if still wedged Chris Wilson
` (3 preceding siblings ...)
2017-10-15 16:10 ` ✓ Fi.CI.IGT: " Patchwork
@ 2017-10-16 15:29 ` Mika Kuoppala
4 siblings, 0 replies; 11+ messages in thread
From: Mika Kuoppala @ 2017-10-16 15:29 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
Chris Wilson <chris@chris-wilson.co.uk> writes:
> If we fail to recover the HW state upon resume (i.e. our attempt to
> clear the wedged bit and reset during i915_gem_sanitize() fails), then
> skip the HW restart inside i915_gem_init_hw(). We will ultimate do the
> the HW restart when sucessfully unwedgeding and reseting the HW later,
successfully unwedging
> but attempting to restore a wedged device upon resume is risky as the HW
> is in an unknown state.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index d9d39b309ce8..5993222c81ae 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4835,6 +4835,10 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
> init_unused_rings(dev_priv);
>
> BUG_ON(!dev_priv->kernel_context);
> + if (i915_terminally_wedged(&dev_priv->gpu_error)) {
> + ret = -EIO;
> + goto out;
> + }
>
> ret = i915_ppgtt_init_hw(dev_priv);
> if (ret) {
> --
> 2.15.0.rc0
* Re: [PATCH v2] drm/i915: Skip HW reinitialisation on resume if still wedged
2017-10-15 14:37 ` [PATCH v2] " Chris Wilson
2017-10-15 20:31 ` Chris Wilson
2017-10-16 14:24 ` Mika Kuoppala
@ 2017-10-16 15:30 ` Mika Kuoppala
2017-10-16 20:15 ` Chris Wilson
2 siblings, 1 reply; 11+ messages in thread
From: Mika Kuoppala @ 2017-10-16 15:30 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
Chris Wilson <chris@chris-wilson.co.uk> writes:
> If we fail to recover the HW state upon resume (i.e. our attempt to
> clear the wedged bit and reset during i915_gem_sanitize() fails), then
> skip the HW restart inside i915_gem_init_hw(). We will ultimately do the
> HW restart when successfully unwedging and resetting the HW later,
> but attempting to restore a wedged device upon resume is risky as the HW
> is in an unknown state.
>
> v2: Suppress the error message when detecting the already wedged HW.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Stamping the right version is also helpful.
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index d9d39b309ce8..449f8c3788b1 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4835,6 +4835,10 @@ int i915_gem_init_hw(struct drm_i915_private *dev_priv)
> init_unused_rings(dev_priv);
>
> BUG_ON(!dev_priv->kernel_context);
> + if (i915_terminally_wedged(&dev_priv->gpu_error)) {
> + ret = -EIO;
> + goto out;
> + }
>
> ret = i915_ppgtt_init_hw(dev_priv);
> if (ret) {
> @@ -4933,8 +4937,10 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
> * wedged. But we only want to do this where the GPU is angry,
> * for all other failure, such as an allocation failure, bail.
> */
> - DRM_ERROR("Failed to initialize GPU, declaring it wedged\n");
> - i915_gem_set_wedged(dev_priv);
> + if (!i915_terminally_wedged(&dev_priv->gpu_error)) {
> + DRM_ERROR("Failed to initialize GPU, declaring it wedged\n");
> + i915_gem_set_wedged(dev_priv);
> + }
> ret = 0;
> }
>
> --
> 2.15.0.rc0
* Re: [PATCH v2] drm/i915: Skip HW reinitialisation on resume if still wedged
2017-10-16 15:30 ` Mika Kuoppala
@ 2017-10-16 20:15 ` Chris Wilson
0 siblings, 0 replies; 11+ messages in thread
From: Chris Wilson @ 2017-10-16 20:15 UTC (permalink / raw)
To: Mika Kuoppala, intel-gfx
Quoting Mika Kuoppala (2017-10-16 16:30:33)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
> > If we fail to recover the HW state upon resume (i.e. our attempt to
> > clear the wedged bit and reset during i915_gem_sanitize() fails), then
> > skip the HW restart inside i915_gem_init_hw(). We will ultimately do the
> > HW restart when successfully unwedging and resetting the HW later,
> > but attempting to restore a wedged device upon resume is risky as the HW
> > is in an unknown state.
> >
> > v2: Suppress the error message when detecting the already wedged HW.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>
> Stamping the right version is also helpful.
>
> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Thanks for taking the time to question the code carefully. This clears
up CI, but I am still able to kill snb at the moment with
gem_exec_whisper/hang-normal (though since we have reset enabled, it
looks to be a different problem).
One step forward, and pushed,
-Chris
Thread overview: 11+ messages
2017-10-15 14:30 [PATCH] drm/i915: Skip HW reinitialisation on resume if still wedged Chris Wilson
2017-10-15 14:37 ` [PATCH v2] " Chris Wilson
2017-10-15 20:31 ` Chris Wilson
2017-10-16 14:24 ` Mika Kuoppala
2017-10-16 14:28 ` Chris Wilson
2017-10-16 15:30 ` Mika Kuoppala
2017-10-16 20:15 ` Chris Wilson
2017-10-15 14:53 ` ✓ Fi.CI.BAT: success for " Patchwork
2017-10-15 15:12 ` ✓ Fi.CI.BAT: success for drm/i915: Skip HW reinitialisation on resume if still wedged (rev2) Patchwork
2017-10-15 16:10 ` ✓ Fi.CI.IGT: " Patchwork
2017-10-16 15:29 ` [PATCH] drm/i915: Skip HW reinitialisation on resume if still wedged Mika Kuoppala