public inbox for intel-gfx@lists.freedesktop.org
* [PATCH] drm/i915: Flush delayed fence releases after reset
@ 2016-08-19 13:17 Chris Wilson
  2016-08-19 13:39 ` ✗ Ro.CI.BAT: failure for " Patchwork
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Chris Wilson @ 2016-08-19 13:17 UTC
  To: intel-gfx; +Cc: Mika Kuoppala

What I never hit in testing, but Mika immediately did, was a GPU hang
with a pending fence release (where a tiled object has been changed by
the user to be untiled, and the update has not yet been committed to the
fence register). As the stride/tiling is 0, this causes a divide-by-zero
error when trying to write the new fence parameters:

[   28.784518] drm/i915: Resetting chip after gpu hang
[   28.784551] divide error: 0000 [#1] PREEMPT SMP
[   28.784565] Modules linked in: nls_iso8859_1 nls_cp437 vfat fat mxm_wmi x86_pkg_temp_thermal snd_hda_codec_hdmi kvm irqbypass snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec serio_raw snd_hwdep snd_hda_core snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_timer snd_seq_device snd soundcore mac_hid wmi efivarfs autofs4 raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx raid0 multipath linear psmouse e1000e ptp pps_core nvme nvme_core i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm video
[   28.784738] CPU: 0 PID: 1692 Comm: kworker/0:2 Not tainted 4.8.0-rc2+ #895
[   28.784752] Hardware name: System manufacturer System Product Name/Z170M-PLUS, BIOS 1803 05/09/2016
[   28.784786] Workqueue: events_long i915_hangcheck_elapsed [i915]
[   28.784814] task: ffff923c18f59d40 task.stack: ffff923c1b7e4000
[   28.784827] RIP: 0010:[<ffffffffc0475b5f>]  [<ffffffffc0475b5f>] fence_write+0x9f/0x3b0 [i915]
[   28.784854] RSP: 0018:ffff923c1b7e7b30  EFLAGS: 00010246
[   28.784866] RAX: 00000000008ca000 RBX: ffff923c18540000 RCX: 0000000000000020
[   28.784880] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000596d000
[   28.784894] RBP: ffff923c1b7e7b68 R08: 0000000000000000 R09: 0000000000000000
[   28.784908] R10: 0000000000000000 R11: 00000000008ca000 R12: ffff923c1ef9d600
[   28.784921] R13: 0000000000100040 R14: 0000000000100044 R15: ffff923c18549908
[   28.784935] FS:  0000000000000000(0000) GS:ffff923c36c00000(0000) knlGS:0000000000000000
[   28.784951] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   28.784962] CR2: 00007f193373c893 CR3: 0000000419c78000 CR4: 00000000003406f0
[   28.784976] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   28.784990] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   28.785004] Stack:
[   28.785009]  000000000596c03b ffff923c1b7e7b68 ffff923c18549938 0000000000000009
[   28.785026]  ffff923c18540000 ffff923c18549280 ffff923c18547ce8 ffff923c1b7e7b90
[   28.785044]  ffffffffc04761f9 ffff923c18540000 ffff923c18547d00 ffff923c18548ff8
[   28.785062] Call Trace:
[   28.785078]  [<ffffffffc04761f9>] i915_gem_restore_fences+0x39/0x50 [i915]
[   28.785102]  [<ffffffffc047fe89>] i915_gem_reset+0x179/0x300 [i915]

Reported-by: Mika Kuoppala <mika.kuoppala@intel.com>
Fixes: 49ef5294cda2 ("drm/i915: Move fence tracking from object to vma")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_fence.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_fence.c b/drivers/gpu/drm/i915/i915_gem_fence.c
index 3f8f328f5c76..b79bcd6288e5 100644
--- a/drivers/gpu/drm/i915/i915_gem_fence.c
+++ b/drivers/gpu/drm/i915/i915_gem_fence.c
@@ -373,12 +373,16 @@ void i915_gem_restore_fences(struct drm_device *dev)
 
 	for (i = 0; i < dev_priv->num_fence_regs; i++) {
 		struct drm_i915_fence_reg *reg = &dev_priv->fence_regs[i];
+		struct i915_vma *vma = reg->vma;
 
 		/*
 		 * Commit delayed tiling changes if we have an object still
 		 * attached to the fence, otherwise just clear the fence.
 		 */
-		fence_write(reg, reg->vma);
+		if (vma && !i915_gem_object_is_tiled(vma->obj))
+			vma = NULL;
+
+		fence_update(reg, vma);
 	}
 }
 
-- 
2.9.3
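
For readers following the thread, the guard added in the hunk above can be
modelled in userspace. This is only a sketch under stated assumptions, not
the driver code: struct toy_vma, struct toy_fence_reg, toy_vma_is_tiled(),
toy_fence_write() and toy_restore_fences() are hypothetical stand-ins, and
the size / stride division merely represents whichever computation divides
by the stride when the register is programmed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver structures; not the i915 types. */
struct toy_vma {
	uint32_t size;   /* bytes covered by the fence */
	uint32_t stride; /* bytes per row; 0 once the object is untiled */
};

struct toy_fence_reg {
	struct toy_vma *vma; /* NULL when the fence is unused */
	uint32_t rows;       /* stand-in for the value programmed into hardware */
};

bool toy_vma_is_tiled(const struct toy_vma *vma)
{
	return vma->stride != 0;
}

/* Program the register; a NULL vma clears it. Callers must pass a tiled vma. */
void toy_fence_write(struct toy_fence_reg *reg, struct toy_vma *vma)
{
	if (vma) {
		/* A zero stride here is the divide error from the oops. */
		assert(toy_vma_is_tiled(vma));
		reg->rows = vma->size / vma->stride;
	} else {
		reg->rows = 0;
	}
	reg->vma = vma;
}

/* After reset: clear any fence whose object was untiled in the meantime. */
void toy_restore_fences(struct toy_fence_reg *regs, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		struct toy_vma *vma = regs[i].vma;

		/* Pending release: clear the fence instead of rewriting it. */
		if (vma && !toy_vma_is_tiled(vma))
			vma = NULL;

		toy_fence_write(&regs[i], vma);
	}
}
```

Calling toy_restore_fences() on an array that mixes a tiled vma, an untiled
vma and an empty slot leaves the first register programmed and clears the
other two, so the division is never reached with a zero stride.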

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* ✗ Ro.CI.BAT: failure for drm/i915: Flush delayed fence releases after reset
  2016-08-19 13:17 [PATCH] drm/i915: Flush delayed fence releases after reset Chris Wilson
@ 2016-08-19 13:39 ` Patchwork
  2016-08-19 13:50 ` [PATCH] " Mika Kuoppala
  2016-08-22  9:16 ` Joonas Lahtinen
  2 siblings, 0 replies; 4+ messages in thread
From: Patchwork @ 2016-08-19 13:39 UTC
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Flush delayed fence releases after reset
URL   : https://patchwork.freedesktop.org/series/11323/
State : failure

== Summary ==

Series 11323v1 drm/i915: Flush delayed fence releases after reset
http://patchwork.freedesktop.org/api/1.0/series/11323/revisions/1/mbox

Test kms_cursor_legacy:
        Subgroup basic-flip-vs-cursor-legacy:
                pass       -> FAIL       (ro-byt-n2820)
        Subgroup basic-flip-vs-cursor-varying-size:
                pass       -> FAIL       (ro-byt-n2820)
                fail       -> PASS       (fi-hsw-i7-4770k)
                pass       -> FAIL       (ro-skl3-i5-6260u)
                fail       -> PASS       (ro-bdw-i5-5250u)
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-a:
                pass       -> INCOMPLETE (fi-hsw-i7-4770k)
        Subgroup suspend-read-crc-pipe-b:
                dmesg-warn -> PASS       (ro-bdw-i7-5600u)
                skip       -> DMESG-WARN (ro-bdw-i5-5250u)

fi-hsw-i7-4770k  total:201  pass:180  dwarn:0   dfail:0   fail:0   skip:20 
fi-kbl-qkkr      total:244  pass:186  dwarn:28  dfail:0   fail:3   skip:27 
fi-skl-i7-6700k  total:244  pass:208  dwarn:4   dfail:2   fail:2   skip:28 
fi-snb-i7-2600   total:244  pass:202  dwarn:0   dfail:0   fail:0   skip:42 
ro-bdw-i5-5250u  total:240  pass:219  dwarn:3   dfail:0   fail:1   skip:17 
ro-bdw-i7-5557U  total:240  pass:220  dwarn:2   dfail:0   fail:0   skip:18 
ro-bdw-i7-5600u  total:240  pass:206  dwarn:0   dfail:0   fail:2   skip:32 
ro-bsw-n3050     total:240  pass:194  dwarn:0   dfail:0   fail:4   skip:42 
ro-byt-n2820     total:240  pass:196  dwarn:0   dfail:0   fail:4   skip:40 
ro-hsw-i3-4010u  total:240  pass:213  dwarn:0   dfail:0   fail:1   skip:26 
ro-hsw-i7-4770r  total:240  pass:185  dwarn:0   dfail:0   fail:0   skip:55 
ro-ilk1-i5-650   total:235  pass:174  dwarn:0   dfail:0   fail:2   skip:59 
ro-ivb-i7-3770   total:240  pass:204  dwarn:0   dfail:0   fail:1   skip:35 
ro-ivb2-i7-3770  total:240  pass:208  dwarn:0   dfail:0   fail:1   skip:31 
ro-skl3-i5-6260u total:240  pass:222  dwarn:0   dfail:0   fail:4   skip:14 

Results at /archive/results/CI_IGT_test/RO_Patchwork_1941/

21defee drm-intel-nightly: 2016y-08m-19d-09h-06m-36s UTC integration manifest
2d6dab0 drm/i915: Flush delayed fence releases after reset

* Re: [PATCH] drm/i915: Flush delayed fence releases after reset
  2016-08-19 13:17 [PATCH] drm/i915: Flush delayed fence releases after reset Chris Wilson
  2016-08-19 13:39 ` ✗ Ro.CI.BAT: failure for " Patchwork
@ 2016-08-19 13:50 ` Mika Kuoppala
  2016-08-22  9:16 ` Joonas Lahtinen
  2 siblings, 0 replies; 4+ messages in thread
From: Mika Kuoppala @ 2016-08-19 13:50 UTC
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> What I never hit in testing, but Mika immediately did, was a GPU hang
> with a pending fence release (where a tiled object has been changed by
> the user to be untiled, and the update has not yet been committed to the
> fence register). As the stride/tiling is 0, this causes a divide-by-zero
> error when trying to write the new fence parameters:
>
> [   28.784518] drm/i915: Resetting chip after gpu hang
> [   28.784551] divide error: 0000 [#1] PREEMPT SMP
> [   28.784565] Modules linked in: nls_iso8859_1 nls_cp437 vfat fat mxm_wmi x86_pkg_temp_thermal snd_hda_codec_hdmi kvm irqbypass snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec serio_raw snd_hwdep snd_hda_core snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_timer snd_seq_device snd soundcore mac_hid wmi efivarfs autofs4 raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx raid0 multipath linear psmouse e1000e ptp pps_core nvme nvme_core i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm video
> [   28.784738] CPU: 0 PID: 1692 Comm: kworker/0:2 Not tainted 4.8.0-rc2+ #895
> [   28.784752] Hardware name: System manufacturer System Product Name/Z170M-PLUS, BIOS 1803 05/09/2016
> [   28.784786] Workqueue: events_long i915_hangcheck_elapsed [i915]
> [   28.784814] task: ffff923c18f59d40 task.stack: ffff923c1b7e4000
> [   28.784827] RIP: 0010:[<ffffffffc0475b5f>]  [<ffffffffc0475b5f>] fence_write+0x9f/0x3b0 [i915]
> [   28.784854] RSP: 0018:ffff923c1b7e7b30  EFLAGS: 00010246
> [   28.784866] RAX: 00000000008ca000 RBX: ffff923c18540000 RCX: 0000000000000020
> [   28.784880] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000596d000
> [   28.784894] RBP: ffff923c1b7e7b68 R08: 0000000000000000 R09: 0000000000000000
> [   28.784908] R10: 0000000000000000 R11: 00000000008ca000 R12: ffff923c1ef9d600
> [   28.784921] R13: 0000000000100040 R14: 0000000000100044 R15: ffff923c18549908
> [   28.784935] FS:  0000000000000000(0000) GS:ffff923c36c00000(0000) knlGS:0000000000000000
> [   28.784951] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   28.784962] CR2: 00007f193373c893 CR3: 0000000419c78000 CR4: 00000000003406f0
> [   28.784976] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   28.784990] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   28.785004] Stack:
> [   28.785009]  000000000596c03b ffff923c1b7e7b68 ffff923c18549938 0000000000000009
> [   28.785026]  ffff923c18540000 ffff923c18549280 ffff923c18547ce8 ffff923c1b7e7b90
> [   28.785044]  ffffffffc04761f9 ffff923c18540000 ffff923c18547d00 ffff923c18548ff8
> [   28.785062] Call Trace:
> [   28.785078]  [<ffffffffc04761f9>] i915_gem_restore_fences+0x39/0x50 [i915]
> [   28.785102]  [<ffffffffc047fe89>] i915_gem_reset+0x179/0x300 [i915]
>
> Reported-by: Mika Kuoppala <mika.kuoppala@intel.com>
> Fixes: 49ef5294cda2 ("drm/i915: Move fence tracking from object to vma")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Can't test as I lost the machine.

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>

> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_fence.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_fence.c b/drivers/gpu/drm/i915/i915_gem_fence.c
> index 3f8f328f5c76..b79bcd6288e5 100644
> --- a/drivers/gpu/drm/i915/i915_gem_fence.c
> +++ b/drivers/gpu/drm/i915/i915_gem_fence.c
> @@ -373,12 +373,16 @@ void i915_gem_restore_fences(struct drm_device *dev)
>  
>  	for (i = 0; i < dev_priv->num_fence_regs; i++) {
>  		struct drm_i915_fence_reg *reg = &dev_priv->fence_regs[i];
> +		struct i915_vma *vma = reg->vma;
>  
>  		/*
>  		 * Commit delayed tiling changes if we have an object still
>  		 * attached to the fence, otherwise just clear the fence.
>  		 */
> -		fence_write(reg, reg->vma);
> +		if (vma && !i915_gem_object_is_tiled(vma->obj))
> +			vma = NULL;
> +
> +		fence_update(reg, vma);
>  	}
>  }
>  
> -- 
> 2.9.3

* Re: [PATCH] drm/i915: Flush delayed fence releases after reset
  2016-08-19 13:17 [PATCH] drm/i915: Flush delayed fence releases after reset Chris Wilson
  2016-08-19 13:39 ` ✗ Ro.CI.BAT: failure for " Patchwork
  2016-08-19 13:50 ` [PATCH] " Mika Kuoppala
@ 2016-08-22  9:16 ` Joonas Lahtinen
  2 siblings, 0 replies; 4+ messages in thread
From: Joonas Lahtinen @ 2016-08-22  9:16 UTC
  To: Chris Wilson, intel-gfx; +Cc: Mika Kuoppala

On Fri, 2016-08-19 at 14:17 +0100, Chris Wilson wrote:
> What I never hit in testing, but Mika immediately did, was a GPU hang
> with a pending fence release (where a tiled object has been changed by
> the user to be untiled, and the update has not yet been committed to the
> fence register). As the stride/tiling is 0, this causes a divide-by-zero
> error when trying to write the new fence parameters:
> 
> [   28.784518] drm/i915: Resetting chip after gpu hang
> [   28.784551] divide error: 0000 [#1] PREEMPT SMP
> [   28.784565] Modules linked in: nls_iso8859_1 nls_cp437 vfat fat mxm_wmi x86_pkg_temp_thermal snd_hda_codec_hdmi kvm irqbypass snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec serio_raw snd_hwdep snd_hda_core snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_timer snd_seq_device snd soundcore mac_hid wmi efivarfs autofs4 raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx raid0 multipath linear psmouse e1000e ptp pps_core nvme nvme_core i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm video
> [   28.784738] CPU: 0 PID: 1692 Comm: kworker/0:2 Not tainted 4.8.0-rc2+ #895
> [   28.784752] Hardware name: System manufacturer System Product Name/Z170M-PLUS, BIOS 1803 05/09/2016
> [   28.784786] Workqueue: events_long i915_hangcheck_elapsed [i915]
> [   28.784814] task: ffff923c18f59d40 task.stack: ffff923c1b7e4000
> [   28.784827] RIP: 0010:[<ffffffffc0475b5f>]  [<ffffffffc0475b5f>] fence_write+0x9f/0x3b0 [i915]
> [   28.784854] RSP: 0018:ffff923c1b7e7b30  EFLAGS: 00010246
> [   28.784866] RAX: 00000000008ca000 RBX: ffff923c18540000 RCX: 0000000000000020
> [   28.784880] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000596d000
> [   28.784894] RBP: ffff923c1b7e7b68 R08: 0000000000000000 R09: 0000000000000000
> [   28.784908] R10: 0000000000000000 R11: 00000000008ca000 R12: ffff923c1ef9d600
> [   28.784921] R13: 0000000000100040 R14: 0000000000100044 R15: ffff923c18549908
> [   28.784935] FS:  0000000000000000(0000) GS:ffff923c36c00000(0000) knlGS:0000000000000000
> [   28.784951] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   28.784962] CR2: 00007f193373c893 CR3: 0000000419c78000 CR4: 00000000003406f0
> [   28.784976] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   28.784990] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   28.785004] Stack:
> [   28.785009]  000000000596c03b ffff923c1b7e7b68 ffff923c18549938 0000000000000009
> [   28.785026]  ffff923c18540000 ffff923c18549280 ffff923c18547ce8 ffff923c1b7e7b90
> [   28.785044]  ffffffffc04761f9 ffff923c18540000 ffff923c18547d00 ffff923c18548ff8
> [   28.785062] Call Trace:
> [   28.785078]  [<ffffffffc04761f9>] i915_gem_restore_fences+0x39/0x50 [i915]
> [   28.785102]  [<ffffffffc047fe89>] i915_gem_reset+0x179/0x300 [i915]
> 
> Reported-by: Mika Kuoppala <mika.kuoppala@intel.com>
> Fixes: 49ef5294cda2 ("drm/i915: Move fence tracking from object to vma")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
