public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
* [PATCH] drm/i915/hsw: Fix GPU hang during resume from S3-devices state
@ 2016-10-12 14:46 Imre Deak
  2016-10-12 15:44 ` Chris Wilson
  2016-10-12 17:19 ` ✗ Fi.CI.BAT: warning for " Patchwork
  0 siblings, 2 replies; 5+ messages in thread
From: Imre Deak @ 2016-10-12 14:46 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

Currently resuming on HSW from S3 pm_test/devices state leads to an
unrecoverable GPU hang. Resetting the GPU during suspend fixes this. For
a full S3 cycle this change only means the reset happens earlier (before
reaching S3). For S4 the reset will happen now both during the freeze
and quiesce phases, which is a benefit since it will guarantee that the
GPU is idle before creating and loading the hibernation image.

Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 28 ++--------------------------
 drivers/gpu/drm/i915/i915_gem.c | 24 ++++++++++++++++++++++++
 2 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 89d3222..e9b3bfc 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -532,32 +532,6 @@ static const struct vga_switcheroo_client_ops i915_switcheroo_ops = {
 
 static void i915_gem_fini(struct drm_device *dev)
 {
-	struct drm_i915_private *dev_priv = to_i915(dev);
-
-	/*
-	 * Neither the BIOS, ourselves or any other kernel
-	 * expects the system to be in execlists mode on startup,
-	 * so we need to reset the GPU back to legacy mode. And the only
-	 * known way to disable logical contexts is through a GPU reset.
-	 *
-	 * So in order to leave the system in a known default configuration,
-	 * always reset the GPU upon unload. Afterwards we then clean up the
-	 * GEM state tracking, flushing off the requests and leaving the
-	 * system in a known idle state.
-	 *
-	 * Note that is of the upmost importance that the GPU is idle and
-	 * all stray writes are flushed *before* we dismantle the backing
-	 * storage for the pinned objects.
-	 *
-	 * However, since we are uncertain that reseting the GPU on older
-	 * machines is a good idea, we don't - just in case it leaves the
-	 * machine in an unusable condition.
-	 */
-	if (HAS_HW_CONTEXTS(dev)) {
-		int reset = intel_gpu_reset(dev_priv, ALL_ENGINES);
-		WARN_ON(reset && reset != -ENODEV);
-	}
-
 	mutex_lock(&dev->struct_mutex);
 	i915_gem_cleanup_engines(dev);
 	i915_gem_context_fini(dev);
@@ -636,6 +610,8 @@ static int i915_load_modeset_init(struct drm_device *dev)
 	return 0;
 
 cleanup_gem:
+	if (i915_gem_suspend(dev))
+		DRM_ERROR("failed to idle hardware; continuing to unload!\n");
 	i915_gem_fini(dev);
 cleanup_irq:
 	intel_guc_fini(dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fdd496e..a86bc8f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4273,6 +4273,30 @@ int i915_gem_suspend(struct drm_device *dev)
 	 */
 	WARN_ON(dev_priv->gt.awake);
 
+	/*
+	 * Neither the BIOS, ourselves or any other kernel
+	 * expects the system to be in execlists mode on startup,
+	 * so we need to reset the GPU back to legacy mode. And the only
+	 * known way to disable logical contexts is through a GPU reset.
+	 *
+	 * So in order to leave the system in a known default configuration,
+	 * always reset the GPU upon unload and suspend. Afterwards we then
+	 * clean up the GEM state tracking, flushing off the requests and
+	 * leaving the system in a known idle state.
+	 *
+	 * Note that is of the upmost importance that the GPU is idle and
+	 * all stray writes are flushed *before* we dismantle the backing
+	 * storage for the pinned objects.
+	 *
+	 * However, since we are uncertain that reseting the GPU on older
+	 * machines is a good idea, we don't - just in case it leaves the
+	 * machine in an unusable condition.
+	 */
+	if (HAS_HW_CONTEXTS(dev)) {
+		int reset = intel_gpu_reset(dev_priv, ALL_ENGINES);
+		WARN_ON(reset && reset != -ENODEV);
+	}
+
 	return 0;
 
 err:
-- 
2.5.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] drm/i915/hsw: Fix GPU hang during resume from S3-devices state
  2016-10-12 14:46 [PATCH] drm/i915/hsw: Fix GPU hang during resume from S3-devices state Imre Deak
@ 2016-10-12 15:44 ` Chris Wilson
  2016-10-12 17:19 ` ✗ Fi.CI.BAT: warning for " Patchwork
  1 sibling, 0 replies; 5+ messages in thread
From: Chris Wilson @ 2016-10-12 15:44 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx, Mika Kuoppala

On Wed, Oct 12, 2016 at 05:46:37PM +0300, Imre Deak wrote:
> Currently resuming on HSW from S3 pm_test/devices state leads to an
> unrecoverable GPU hang. Resetting the GPU during suspend fixes this. For
> a full S3 cycle this change only means the reset happens earlier (before
> reaching S3). For S4 the reset will happen now both during the freeze
> and quiesce phases, which is a benefit since it will guarantee that the
> GPU is idle before creating and loading the hibernation image.
> 
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Imre Deak <imre.deak@intel.com>

Makes sense, we should treat the transition to suspend just like unload.
We should do the symmetric reset on load/resume as well.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 5+ messages in thread

* ✗ Fi.CI.BAT: warning for drm/i915/hsw: Fix GPU hang during resume from S3-devices state
  2016-10-12 14:46 [PATCH] drm/i915/hsw: Fix GPU hang during resume from S3-devices state Imre Deak
  2016-10-12 15:44 ` Chris Wilson
@ 2016-10-12 17:19 ` Patchwork
  2016-10-12 18:32   ` Saarinen, Jani
  1 sibling, 1 reply; 5+ messages in thread
From: Patchwork @ 2016-10-12 17:19 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

== Series Details ==

Series: drm/i915/hsw: Fix GPU hang during resume from S3-devices state
URL   : https://patchwork.freedesktop.org/series/13654/
State : warning

== Summary ==

Series 13654v1 drm/i915/hsw: Fix GPU hang during resume from S3-devices state
https://patchwork.freedesktop.org/api/1.0/series/13654/revisions/1/mbox/

Test drv_module_reload_basic:
                skip       -> PASS       (fi-skl-6770hq)
Test kms_cursor_legacy:
        Subgroup basic-flip-after-cursor-legacy:
                pass       -> DMESG-WARN (fi-ivb-3770)
Test kms_flip:
        Subgroup basic-flip-vs-modeset:
                dmesg-warn -> PASS       (fi-skl-6770hq)
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-b:
                dmesg-warn -> PASS       (fi-byt-j1900)
Test kms_psr_sink_crc:
        Subgroup psr_basic:
                dmesg-warn -> PASS       (fi-skl-6700hq)
Test vgem_basic:
        Subgroup unload:
                pass       -> SKIP       (fi-bdw-5557u)
                skip       -> PASS       (fi-hsw-4770)

fi-bdw-5557u     total:248  pass:231  dwarn:0   dfail:0   fail:0   skip:17 
fi-bsw-n3050     total:248  pass:205  dwarn:0   dfail:0   fail:0   skip:43 
fi-bxt-t5700     total:248  pass:217  dwarn:0   dfail:0   fail:0   skip:31 
fi-byt-j1900     total:248  pass:214  dwarn:1   dfail:0   fail:1   skip:32 
fi-byt-n2820     total:248  pass:211  dwarn:0   dfail:0   fail:1   skip:36 
fi-hsw-4770      total:248  pass:225  dwarn:0   dfail:0   fail:0   skip:23 
fi-hsw-4770r     total:248  pass:225  dwarn:0   dfail:0   fail:0   skip:23 
fi-ilk-650       total:248  pass:185  dwarn:0   dfail:0   fail:2   skip:61 
fi-ivb-3520m     total:248  pass:222  dwarn:0   dfail:0   fail:0   skip:26 
fi-ivb-3770      total:248  pass:221  dwarn:1   dfail:0   fail:0   skip:26 
fi-kbl-7200u     total:248  pass:222  dwarn:0   dfail:0   fail:0   skip:26 
fi-skl-6260u     total:248  pass:233  dwarn:0   dfail:0   fail:0   skip:15 
fi-skl-6700hq    total:248  pass:225  dwarn:0   dfail:0   fail:0   skip:23 
fi-skl-6700k     total:248  pass:222  dwarn:1   dfail:0   fail:0   skip:25 
fi-skl-6770hq    total:248  pass:231  dwarn:1   dfail:0   fail:1   skip:15 
fi-snb-2520m     total:248  pass:211  dwarn:0   dfail:0   fail:0   skip:37 
fi-snb-2600      total:248  pass:210  dwarn:0   dfail:0   fail:0   skip:38 

Results at /archive/results/CI_IGT_test/Patchwork_2690/

14740bb25ec36fe4ce8042af3eb48aeb45e5bc13 drm-intel-nightly: 2016y-10m-12d-16h-18m-24s UTC integration manifest
78a5d29 drm/i915/hsw: Fix GPU hang during resume from S3-devices state

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: ✗ Fi.CI.BAT: warning for drm/i915/hsw: Fix GPU hang during resume from S3-devices state
  2016-10-12 17:19 ` ✗ Fi.CI.BAT: warning for " Patchwork
@ 2016-10-12 18:32   ` Saarinen, Jani
  2016-10-13  9:18     ` Imre Deak
  0 siblings, 1 reply; 5+ messages in thread
From: Saarinen, Jani @ 2016-10-12 18:32 UTC (permalink / raw)
  To: intel-gfx@lists.freedesktop.org, Deak, Imre

> == Series Details ==
> 
> Series: drm/i915/hsw: Fix GPU hang during resume from S3-devices state
> URL   : https://patchwork.freedesktop.org/series/13654/
> State : warning
> 
> == Summary ==
> 
> Series 13654v1 drm/i915/hsw: Fix GPU hang during resume from S3-devices
> state
> https://patchwork.freedesktop.org/api/1.0/series/13654/revisions/1/mbox/
> 
> Test drv_module_reload_basic:
>                 skip       -> PASS       (fi-skl-6770hq)
> Test kms_cursor_legacy:
>         Subgroup basic-flip-after-cursor-legacy:
>                 pass       -> DMESG-WARN (fi-ivb-3770)

 [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 144
 [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 144

> Test kms_flip:
>         Subgroup basic-flip-vs-modeset:
>                 dmesg-warn -> PASS       (fi-skl-6770hq)
> Test kms_pipe_crc_basic:
>         Subgroup suspend-read-crc-pipe-b:
>                 dmesg-warn -> PASS       (fi-byt-j1900)
> Test kms_psr_sink_crc:
>         Subgroup psr_basic:
>                 dmesg-warn -> PASS       (fi-skl-6700hq)
> Test vgem_basic:
>         Subgroup unload:
>                 pass       -> SKIP       (fi-bdw-5557u)
>                 skip       -> PASS       (fi-hsw-4770)
> 
> fi-bdw-5557u     total:248  pass:231  dwarn:0   dfail:0   fail:0   skip:17
> fi-bsw-n3050     total:248  pass:205  dwarn:0   dfail:0   fail:0   skip:43
> fi-bxt-t5700     total:248  pass:217  dwarn:0   dfail:0   fail:0   skip:31
> fi-byt-j1900     total:248  pass:214  dwarn:1   dfail:0   fail:1   skip:32
> fi-byt-n2820     total:248  pass:211  dwarn:0   dfail:0   fail:1   skip:36
> fi-hsw-4770      total:248  pass:225  dwarn:0   dfail:0   fail:0   skip:23
> fi-hsw-4770r     total:248  pass:225  dwarn:0   dfail:0   fail:0   skip:23
> fi-ilk-650       total:248  pass:185  dwarn:0   dfail:0   fail:2   skip:61
> fi-ivb-3520m     total:248  pass:222  dwarn:0   dfail:0   fail:0   skip:26
> fi-ivb-3770      total:248  pass:221  dwarn:1   dfail:0   fail:0   skip:26
> fi-kbl-7200u     total:248  pass:222  dwarn:0   dfail:0   fail:0   skip:26
> fi-skl-6260u     total:248  pass:233  dwarn:0   dfail:0   fail:0   skip:15
> fi-skl-6700hq    total:248  pass:225  dwarn:0   dfail:0   fail:0   skip:23
> fi-skl-6700k     total:248  pass:222  dwarn:1   dfail:0   fail:0   skip:25
> fi-skl-6770hq    total:248  pass:231  dwarn:1   dfail:0   fail:1   skip:15
> fi-snb-2520m     total:248  pass:211  dwarn:0   dfail:0   fail:0   skip:37
> fi-snb-2600      total:248  pass:210  dwarn:0   dfail:0   fail:0   skip:38
> 
> Results at /archive/results/CI_IGT_test/Patchwork_2690/
> 
> 14740bb25ec36fe4ce8042af3eb48aeb45e5bc13 drm-intel-nightly: 2016y-10m-
> 12d-16h-18m-24s UTC integration manifest
> 78a5d29 drm/i915/hsw: Fix GPU hang during resume from S3-devices state
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Jani Saarinen
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: ✗ Fi.CI.BAT: warning for drm/i915/hsw: Fix GPU hang during resume from S3-devices state
  2016-10-12 18:32   ` Saarinen, Jani
@ 2016-10-13  9:18     ` Imre Deak
  0 siblings, 0 replies; 5+ messages in thread
From: Imre Deak @ 2016-10-13  9:18 UTC (permalink / raw)
  To: Saarinen, Jani, intel-gfx@lists.freedesktop.org, Chris Wilson

On ke, 2016-10-12 at 21:32 +0300, Saarinen, Jani wrote:
> > == Series Details ==
> > 
> > Series: drm/i915/hsw: Fix GPU hang during resume from S3-devices state
> > URL   : https://patchwork.freedesktop.org/series/13654/
> > State : warning
> > 
> > == Summary ==
> > 
> > Series 13654v1 drm/i915/hsw: Fix GPU hang during resume from S3-devices
> > state
> > https://patchwork.freedesktop.org/api/1.0/series/13654/revisions/1/mbox/
> > 
> > Test drv_module_reload_basic:
> >                 skip       -> PASS       (fi-skl-6770hq)
> > Test kms_cursor_legacy:
> >         Subgroup basic-flip-after-cursor-legacy:
> >                 pass       -> DMESG-WARN (fi-ivb-3770)
> 
>  [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 144
>  [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 144

A pre-existing bug, it seems it started in CI_DRM_1701, not long after attaching
the HDMI monitor the EDID belongs to. I opened a new bug:
https://bugs.freedesktop.org/show_bug.cgi?id=98228

Thanks for the review, I pushed the patch to -dinq.

> 
> > Test kms_flip:
> >         Subgroup basic-flip-vs-modeset:
> >                 dmesg-warn -> PASS       (fi-skl-6770hq)
> > Test kms_pipe_crc_basic:
> >         Subgroup suspend-read-crc-pipe-b:
> >                 dmesg-warn -> PASS       (fi-byt-j1900)
> > Test kms_psr_sink_crc:
> >         Subgroup psr_basic:
> >                 dmesg-warn -> PASS       (fi-skl-6700hq)
> > Test vgem_basic:
> >         Subgroup unload:
> >                 pass       -> SKIP       (fi-bdw-5557u)
> >                 skip       -> PASS       (fi-hsw-4770)
> > 
> > fi-bdw-5557u     total:248  pass:231  dwarn:0   dfail:0   fail:0   skip:17
> > fi-bsw-n3050     total:248  pass:205  dwarn:0   dfail:0   fail:0   skip:43
> > fi-bxt-t5700     total:248  pass:217  dwarn:0   dfail:0   fail:0   skip:31
> > fi-byt-j1900     total:248  pass:214  dwarn:1   dfail:0   fail:1   skip:32
> > fi-byt-n2820     total:248  pass:211  dwarn:0   dfail:0   fail:1   skip:36
> > fi-hsw-4770      total:248  pass:225  dwarn:0   dfail:0   fail:0   skip:23
> > fi-hsw-4770r     total:248  pass:225  dwarn:0   dfail:0   fail:0   skip:23
> > fi-ilk-650       total:248  pass:185  dwarn:0   dfail:0   fail:2   skip:61
> > fi-ivb-3520m     total:248  pass:222  dwarn:0   dfail:0   fail:0   skip:26
> > fi-ivb-3770      total:248  pass:221  dwarn:1   dfail:0   fail:0   skip:26
> > fi-kbl-7200u     total:248  pass:222  dwarn:0   dfail:0   fail:0   skip:26
> > fi-skl-6260u     total:248  pass:233  dwarn:0   dfail:0   fail:0   skip:15
> > fi-skl-6700hq    total:248  pass:225  dwarn:0   dfail:0   fail:0   skip:23
> > fi-skl-6700k     total:248  pass:222  dwarn:1   dfail:0   fail:0   skip:25
> > fi-skl-6770hq    total:248  pass:231  dwarn:1   dfail:0   fail:1   skip:15
> > fi-snb-2520m     total:248  pass:211  dwarn:0   dfail:0   fail:0   skip:37
> > fi-snb-2600      total:248  pass:210  dwarn:0   dfail:0   fail:0   skip:38
> > 
> > Results at /archive/results/CI_IGT_test/Patchwork_2690/
> > 
> > 14740bb25ec36fe4ce8042af3eb48aeb45e5bc13 drm-intel-nightly: 2016y-10m-
> > 12d-16h-18m-24s UTC integration manifest
> > 78a5d29 drm/i915/hsw: Fix GPU hang during resume from S3-devices state
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> Jani Saarinen
> Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-10-13  9:18 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-10-12 14:46 [PATCH] drm/i915/hsw: Fix GPU hang during resume from S3-devices state Imre Deak
2016-10-12 15:44 ` Chris Wilson
2016-10-12 17:19 ` ✗ Fi.CI.BAT: warning for " Patchwork
2016-10-12 18:32   ` Saarinen, Jani
2016-10-13  9:18     ` Imre Deak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox