* [PATCH] drm/i915: Prevent TLB error on first execution on SNB
@ 2015-02-13 12:59 Chris Wilson
2015-02-13 13:43 ` Daniel Vetter
2015-02-13 19:42 ` [PATCH] " shuang.he
0 siblings, 2 replies; 11+ messages in thread
From: Chris Wilson @ 2015-02-13 12:59 UTC (permalink / raw)
To: intel-gfx
Long ago I found that I was getting sporadic errors when booting SNB,
with the symptom being that the first batch died with IPEHR != *ACTHD,
typically caused by the TLB being invalid. These magically disappeared
if I held the forcewake during the entire ring initialisation sequence.
(It can probably be shortened to a short critical section, but the whole
initialisation is full of register writes and so we would be taking and
releasing forcewake almost continually, and so holding it over the
entire sequence will probably be a net win!)
Note some of the kernels I encounted the issue already had the deferred
forcewake release, so it is still relevant.
I know that there have been a few other reports with similar failure
conditions on SNB, I think such as
References: https://bugs.freedesktop.org/show_bug.cgi?id=80913
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_gem.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8d15c8110962..2426f6d9b5a5 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4868,6 +4868,7 @@ int i915_gem_init(struct drm_device *dev)
dev_priv->gt.stop_ring = intel_logical_ring_stop;
}
+ intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
ret = i915_gem_init_userptr(dev);
if (ret)
goto out_unlock;
@@ -4894,6 +4895,7 @@ int i915_gem_init(struct drm_device *dev)
}
out_unlock:
+ intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
mutex_unlock(&dev->struct_mutex);
return ret;
--
2.1.4
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] drm/i915: Prevent TLB error on first execution on SNB
2015-02-13 12:59 [PATCH] drm/i915: Prevent TLB error on first execution on SNB Chris Wilson
@ 2015-02-13 13:43 ` Daniel Vetter
2015-02-13 14:12 ` Chris Wilson
2015-02-13 14:35 ` [PATCH v2] " Chris Wilson
2015-02-13 19:42 ` [PATCH] " shuang.he
1 sibling, 2 replies; 11+ messages in thread
From: Daniel Vetter @ 2015-02-13 13:43 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
On Fri, Feb 13, 2015 at 12:59:45PM +0000, Chris Wilson wrote:
> Long ago I found that I was getting sporadic errors when booting SNB,
> with the symptom being that the first batch died with IPEHR != *ACTHD,
> typically caused by the TLB being invalid. These magically disappeared
> if I held the forcewake during the entire ring initialisation sequence.
> (It can probably be shortened to a short critical section, but the whole
> initialisation is full of register writes and so we would be taking and
> releasing forcewake almost continually, and so holding it over the
> entire sequence will probably be a net win!)
>
> Note some of the kernels I encounted the issue already had the deferred
> forcewake release, so it is still relevant.
>
> I know that there have been a few other reports with similar failure
> conditions on SNB, I think such as
> References: https://bugs.freedesktop.org/show_bug.cgi?id=80913
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Given that we've already added a forcewake critical section around
individual ring inits this makes maybe a bit too much sense. But I do
wonder whether we don't need the same for resume and gpu resets?
With the split into hw/sw setup we could get that by pusing the
forcewake_get/put inti i915_gem_init_hw. Does the magic still work with
that? And if we put it there there fw_get/put in init_ring_common is fully
redundant and could be remove.
-Daniel
> ---
> drivers/gpu/drm/i915/i915_gem.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 8d15c8110962..2426f6d9b5a5 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4868,6 +4868,7 @@ int i915_gem_init(struct drm_device *dev)
> dev_priv->gt.stop_ring = intel_logical_ring_stop;
> }
>
> + intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
> ret = i915_gem_init_userptr(dev);
> if (ret)
> goto out_unlock;
> @@ -4894,6 +4895,7 @@ int i915_gem_init(struct drm_device *dev)
> }
>
> out_unlock:
> + intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
> mutex_unlock(&dev->struct_mutex);
>
> return ret;
> --
> 2.1.4
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] drm/i915: Prevent TLB error on first execution on SNB
2015-02-13 13:43 ` Daniel Vetter
@ 2015-02-13 14:12 ` Chris Wilson
2015-02-23 22:54 ` Daniel Vetter
2015-02-13 14:35 ` [PATCH v2] " Chris Wilson
1 sibling, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2015-02-13 14:12 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx
On Fri, Feb 13, 2015 at 02:43:40PM +0100, Daniel Vetter wrote:
> On Fri, Feb 13, 2015 at 12:59:45PM +0000, Chris Wilson wrote:
> > Long ago I found that I was getting sporadic errors when booting SNB,
> > with the symptom being that the first batch died with IPEHR != *ACTHD,
> > typically caused by the TLB being invalid. These magically disappeared
> > if I held the forcewake during the entire ring initialisation sequence.
> > (It can probably be shortened to a short critical section, but the whole
> > initialisation is full of register writes and so we would be taking and
> > releasing forcewake almost continually, and so holding it over the
> > entire sequence will probably be a net win!)
> >
> > Note some of the kernels I encounted the issue already had the deferred
> > forcewake release, so it is still relevant.
> >
> > I know that there have been a few other reports with similar failure
> > conditions on SNB, I think such as
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=80913
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>
> Given that we've already added a forcewake critical section around
> individual ring inits this makes maybe a bit too much sense. But I do
> wonder whether we don't need the same for resume and gpu resets?
>
> With the split into hw/sw setup we could get that by pusing the
> forcewake_get/put inti i915_gem_init_hw. Does the magic still work with
> that? And if we put it there there fw_get/put in init_ring_common is fully
> redundant and could be remove.
Hmm, my original thought was to keep the engine alive from the first
programming of CTL up until we fed in the first request (which is the
ppgtt/context init). We can add a second forcewake layer into init_hw to
give the same security blanket for resume/reset. Sound reasonable?
And I should add a comment saying that this is a security blanket.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v2] drm/i915: Prevent TLB error on first execution on SNB
2015-02-13 13:43 ` Daniel Vetter
2015-02-13 14:12 ` Chris Wilson
@ 2015-02-13 14:35 ` Chris Wilson
2015-02-14 0:19 ` shuang.he
2015-03-10 10:31 ` Daniel Vetter
1 sibling, 2 replies; 11+ messages in thread
From: Chris Wilson @ 2015-02-13 14:35 UTC (permalink / raw)
To: intel-gfx
Long ago I found that I was getting sporadic errors when booting SNB,
with the symptom being that the first batch died with IPEHR != *ACTHD,
typically caused by the TLB being invalid. These magically disappeared
if I held the forcewake during the entire ring initialisation sequence.
(It can probably be shortened to a short critical section, but the whole
initialisation is full of register writes and so we would be taking and
releasing forcewake almost continually, and so holding it over the
entire sequence will probably be a net win!)
Note some of the kernels I encounted the issue already had the deferred
forcewake release, so it is still relevant.
I know that there have been a few other reports with similar failure
conditions on SNB, I think such as
References: https://bugs.freedesktop.org/show_bug.cgi?id=80913
v2: Wrap i915_gem_init_hw() with its own security blanket as we take
that path following resume and reset.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_gem.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8d15c8110962..08450922f373 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4783,6 +4783,9 @@ i915_gem_init_hw(struct drm_device *dev)
if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
return -EIO;
+ /* Double layer security blanket, see i915_gem_init() */
+ intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+
if (dev_priv->ellc_size)
I915_WRITE(HSW_IDICR, I915_READ(HSW_IDICR) | IDIHASHMSK(0xf));
@@ -4815,7 +4818,7 @@ i915_gem_init_hw(struct drm_device *dev)
for_each_ring(ring, dev_priv, i) {
ret = ring->init_hw(ring);
if (ret)
- return ret;
+ goto out;
}
for (i = 0; i < NUM_L3_SLICES(dev); i++)
@@ -4832,9 +4835,11 @@ i915_gem_init_hw(struct drm_device *dev)
DRM_ERROR("Context enable failed %d\n", ret);
i915_gem_cleanup_ringbuffer(dev);
- return ret;
+ goto out;
}
+out:
+ intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
return ret;
}
@@ -4868,6 +4873,14 @@ int i915_gem_init(struct drm_device *dev)
dev_priv->gt.stop_ring = intel_logical_ring_stop;
}
+ /* This is just a security blanket to placate dragons.
+ * On some systems, we very sporadically observe that the first TLBs
+ * used by the CS may be stale, despite us poking the TLB reset. If
+ * we hold the forcewake during initialisation these problems
+ * just magically go away.
+ */
+ intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+
ret = i915_gem_init_userptr(dev);
if (ret)
goto out_unlock;
@@ -4894,6 +4907,7 @@ int i915_gem_init(struct drm_device *dev)
}
out_unlock:
+ intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
mutex_unlock(&dev->struct_mutex);
return ret;
--
2.1.4
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] drm/i915: Prevent TLB error on first execution on SNB
2015-02-13 12:59 [PATCH] drm/i915: Prevent TLB error on first execution on SNB Chris Wilson
2015-02-13 13:43 ` Daniel Vetter
@ 2015-02-13 19:42 ` shuang.he
1 sibling, 0 replies; 11+ messages in thread
From: shuang.he @ 2015-02-13 19:42 UTC (permalink / raw)
To: shuang.he, ethan.gao, intel-gfx, chris
Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 5770
-------------------------------------Summary-------------------------------------
Platform Delta drm-intel-nightly Series Applied
PNV 281/281 281/281
ILK 313/313 313/313
SNB 309/309 309/309
IVB 382/382 382/382
BYT 296/296 296/296
HSW -2 426/426 424/426
BDW -1 318/318 317/318
-------------------------------------Detailed-------------------------------------
Platform Test drm-intel-nightly Series Applied
*HSW igt_kms_flip_plain-flip-fb-recreate PASS(1) TIMEOUT(1)
*HSW igt_kms_flip_plain-flip-fb-recreate-interruptible PASS(1) TIMEOUT(1)
*BDW igt_gem_gtt_hog PASS(1) DMESG_WARN(1)
Note: You need to pay more attention to line start with '*'
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] drm/i915: Prevent TLB error on first execution on SNB
2015-02-13 14:35 ` [PATCH v2] " Chris Wilson
@ 2015-02-14 0:19 ` shuang.he
2015-03-10 10:31 ` Daniel Vetter
1 sibling, 0 replies; 11+ messages in thread
From: shuang.he @ 2015-02-14 0:19 UTC (permalink / raw)
To: shuang.he, ethan.gao, intel-gfx, chris
Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 5773
-------------------------------------Summary-------------------------------------
Platform Delta drm-intel-nightly Series Applied
PNV -1 281/281 280/281
ILK 313/313 313/313
SNB 309/309 309/309
IVB 382/382 382/382
BYT 296/296 296/296
HSW 426/426 426/426
BDW -1 318/318 317/318
-------------------------------------Detailed-------------------------------------
Platform Test drm-intel-nightly Series Applied
*PNV igt_gen3_render_mixed_blits PASS(1) CRASH(1)
*BDW igt_gem_gtt_hog PASS(1) DMESG_WARN(1)
Note: You need to pay more attention to line start with '*'
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] drm/i915: Prevent TLB error on first execution on SNB
2015-02-13 14:12 ` Chris Wilson
@ 2015-02-23 22:54 ` Daniel Vetter
0 siblings, 0 replies; 11+ messages in thread
From: Daniel Vetter @ 2015-02-23 22:54 UTC (permalink / raw)
To: Chris Wilson, Daniel Vetter, intel-gfx
On Fri, Feb 13, 2015 at 02:12:48PM +0000, Chris Wilson wrote:
> On Fri, Feb 13, 2015 at 02:43:40PM +0100, Daniel Vetter wrote:
> > On Fri, Feb 13, 2015 at 12:59:45PM +0000, Chris Wilson wrote:
> > > Long ago I found that I was getting sporadic errors when booting SNB,
> > > with the symptom being that the first batch died with IPEHR != *ACTHD,
> > > typically caused by the TLB being invalid. These magically disappeared
> > > if I held the forcewake during the entire ring initialisation sequence.
> > > (It can probably be shortened to a short critical section, but the whole
> > > initialisation is full of register writes and so we would be taking and
> > > releasing forcewake almost continually, and so holding it over the
> > > entire sequence will probably be a net win!)
> > >
> > > Note some of the kernels I encounted the issue already had the deferred
> > > forcewake release, so it is still relevant.
> > >
> > > I know that there have been a few other reports with similar failure
> > > conditions on SNB, I think such as
> > > References: https://bugs.freedesktop.org/show_bug.cgi?id=80913
> > >
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >
> > Given that we've already added a forcewake critical section around
> > individual ring inits this makes maybe a bit too much sense. But I do
> > wonder whether we don't need the same for resume and gpu resets?
> >
> > With the split into hw/sw setup we could get that by pusing the
> > forcewake_get/put inti i915_gem_init_hw. Does the magic still work with
> > that? And if we put it there there fw_get/put in init_ring_common is fully
> > redundant and could be remove.
>
> Hmm, my original thought was to keep the engine alive from the first
> programming of CTL up until we fed in the first request (which is the
> ppgtt/context init). We can add a second forcewake layer into init_hw to
> give the same security blanket for resume/reset. Sound reasonable?
With the split into sw/hw setup init_hw should be all that's needed for
coverage, nothing touches the hw outside of it. Hence I think the original
outer layer is redundant. Does the magic still work if we drop that part,
or have I missed some hw access (there really shouldn't be any)?
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] drm/i915: Prevent TLB error on first execution on SNB
2015-02-13 14:35 ` [PATCH v2] " Chris Wilson
2015-02-14 0:19 ` shuang.he
@ 2015-03-10 10:31 ` Daniel Vetter
2015-03-10 10:35 ` Chris Wilson
1 sibling, 1 reply; 11+ messages in thread
From: Daniel Vetter @ 2015-03-10 10:31 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
On Fri, Feb 13, 2015 at 02:35:59PM +0000, Chris Wilson wrote:
> Long ago I found that I was getting sporadic errors when booting SNB,
> with the symptom being that the first batch died with IPEHR != *ACTHD,
> typically caused by the TLB being invalid. These magically disappeared
> if I held the forcewake during the entire ring initialisation sequence.
> (It can probably be shortened to a short critical section, but the whole
> initialisation is full of register writes and so we would be taking and
> releasing forcewake almost continually, and so holding it over the
> entire sequence will probably be a net win!)
>
> Note some of the kernels I encounted the issue already had the deferred
> forcewake release, so it is still relevant.
>
> I know that there have been a few other reports with similar failure
> conditions on SNB, I think such as
> References: https://bugs.freedesktop.org/show_bug.cgi?id=80913
>
> v2: Wrap i915_gem_init_hw() with its own security blanket as we take
> that path following resume and reset.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/i915_gem.c | 18 ++++++++++++++++--
> 1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 8d15c8110962..08450922f373 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4783,6 +4783,9 @@ i915_gem_init_hw(struct drm_device *dev)
> if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
> return -EIO;
>
> + /* Double layer security blanket, see i915_gem_init() */
> + intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
> +
> if (dev_priv->ellc_size)
> I915_WRITE(HSW_IDICR, I915_READ(HSW_IDICR) | IDIHASHMSK(0xf));
>
> @@ -4815,7 +4818,7 @@ i915_gem_init_hw(struct drm_device *dev)
> for_each_ring(ring, dev_priv, i) {
> ret = ring->init_hw(ring);
> if (ret)
> - return ret;
> + goto out;
> }
>
> for (i = 0; i < NUM_L3_SLICES(dev); i++)
> @@ -4832,9 +4835,11 @@ i915_gem_init_hw(struct drm_device *dev)
> DRM_ERROR("Context enable failed %d\n", ret);
> i915_gem_cleanup_ringbuffer(dev);
>
> - return ret;
> + goto out;
> }
>
> +out:
> + intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
> return ret;
> }
>
> @@ -4868,6 +4873,14 @@ int i915_gem_init(struct drm_device *dev)
> dev_priv->gt.stop_ring = intel_logical_ring_stop;
> }
>
> + /* This is just a security blanket to placate dragons.
> + * On some systems, we very sporadically observe that the first TLBs
> + * used by the CS may be stale, despite us poking the TLB reset. If
> + * we hold the forcewake during initialisation these problems
> + * just magically go away.
> + */
> + intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
gem_init shouldn't ever touch the hw except through gem_init_hw. Do we
really need the double-layer here? Also the forcewake hack in the ring
init code should now be redundant, too.
-Daniel
> +
> ret = i915_gem_init_userptr(dev);
> if (ret)
> goto out_unlock;
> @@ -4894,6 +4907,7 @@ int i915_gem_init(struct drm_device *dev)
> }
>
> out_unlock:
> + intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
> mutex_unlock(&dev->struct_mutex);
>
> return ret;
> --
> 2.1.4
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] drm/i915: Prevent TLB error on first execution on SNB
2015-03-10 10:31 ` Daniel Vetter
@ 2015-03-10 10:35 ` Chris Wilson
2015-03-10 12:55 ` Daniel Vetter
0 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2015-03-10 10:35 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx
On Tue, Mar 10, 2015 at 11:31:04AM +0100, Daniel Vetter wrote:
> On Fri, Feb 13, 2015 at 02:35:59PM +0000, Chris Wilson wrote:
> > Long ago I found that I was getting sporadic errors when booting SNB,
> > with the symptom being that the first batch died with IPEHR != *ACTHD,
> > typically caused by the TLB being invalid. These magically disappeared
> > if I held the forcewake during the entire ring initialisation sequence.
> > (It can probably be shortened to a short critical section, but the whole
> > initialisation is full of register writes and so we would be taking and
> > releasing forcewake almost continually, and so holding it over the
> > entire sequence will probably be a net win!)
> >
> > Note some of the kernels I encounted the issue already had the deferred
> > forcewake release, so it is still relevant.
> >
> > I know that there have been a few other reports with similar failure
> > conditions on SNB, I think such as
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=80913
> >
> > v2: Wrap i915_gem_init_hw() with its own security blanket as we take
> > that path following resume and reset.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> > drivers/gpu/drm/i915/i915_gem.c | 18 ++++++++++++++++--
> > 1 file changed, 16 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 8d15c8110962..08450922f373 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -4783,6 +4783,9 @@ i915_gem_init_hw(struct drm_device *dev)
> > if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
> > return -EIO;
> >
> > + /* Double layer security blanket, see i915_gem_init() */
> > + intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
> > +
> > if (dev_priv->ellc_size)
> > I915_WRITE(HSW_IDICR, I915_READ(HSW_IDICR) | IDIHASHMSK(0xf));
> >
> > @@ -4815,7 +4818,7 @@ i915_gem_init_hw(struct drm_device *dev)
> > for_each_ring(ring, dev_priv, i) {
> > ret = ring->init_hw(ring);
> > if (ret)
> > - return ret;
> > + goto out;
> > }
> >
> > for (i = 0; i < NUM_L3_SLICES(dev); i++)
> > @@ -4832,9 +4835,11 @@ i915_gem_init_hw(struct drm_device *dev)
> > DRM_ERROR("Context enable failed %d\n", ret);
> > i915_gem_cleanup_ringbuffer(dev);
> >
> > - return ret;
> > + goto out;
> > }
> >
> > +out:
> > + intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
> > return ret;
> > }
> >
> > @@ -4868,6 +4873,14 @@ int i915_gem_init(struct drm_device *dev)
> > dev_priv->gt.stop_ring = intel_logical_ring_stop;
> > }
> >
> > + /* This is just a security blanket to placate dragons.
> > + * On some systems, we very sporadically observe that the first TLBs
> > + * used by the CS may be stale, despite us poking the TLB reset. If
> > + * we hold the forcewake during initialisation these problems
> > + * just magically go away.
> > + */
> > + intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
>
> gem_init shouldn't ever touch the hw except through gem_init_hw. Do we
> really need the double-layer here?
There are register accesses before, so yes since that's how I tested
it...
> Also the forcewake hack in the ring
> init code should now be redundant, too.
I am of the opinion that they still serve documentary value. Unless you
have an assert_force_wake() handy.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] drm/i915: Prevent TLB error on first execution on SNB
2015-03-10 10:35 ` Chris Wilson
@ 2015-03-10 12:55 ` Daniel Vetter
2015-03-10 13:31 ` Jani Nikula
0 siblings, 1 reply; 11+ messages in thread
From: Daniel Vetter @ 2015-03-10 12:55 UTC (permalink / raw)
To: Chris Wilson, Daniel Vetter, intel-gfx
On Tue, Mar 10, 2015 at 10:35:41AM +0000, Chris Wilson wrote:
> On Tue, Mar 10, 2015 at 11:31:04AM +0100, Daniel Vetter wrote:
> > On Fri, Feb 13, 2015 at 02:35:59PM +0000, Chris Wilson wrote:
> > > Long ago I found that I was getting sporadic errors when booting SNB,
> > > with the symptom being that the first batch died with IPEHR != *ACTHD,
> > > typically caused by the TLB being invalid. These magically disappeared
> > > if I held the forcewake during the entire ring initialisation sequence.
> > > (It can probably be shortened to a short critical section, but the whole
> > > initialisation is full of register writes and so we would be taking and
> > > releasing forcewake almost continually, and so holding it over the
> > > entire sequence will probably be a net win!)
> > >
> > > Note some of the kernels I encounted the issue already had the deferred
> > > forcewake release, so it is still relevant.
> > >
> > > I know that there have been a few other reports with similar failure
> > > conditions on SNB, I think such as
> > > References: https://bugs.freedesktop.org/show_bug.cgi?id=80913
> > >
> > > v2: Wrap i915_gem_init_hw() with its own security blanket as we take
> > > that path following resume and reset.
> > >
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > ---
> > > drivers/gpu/drm/i915/i915_gem.c | 18 ++++++++++++++++--
> > > 1 file changed, 16 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index 8d15c8110962..08450922f373 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -4783,6 +4783,9 @@ i915_gem_init_hw(struct drm_device *dev)
> > > if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
> > > return -EIO;
> > >
> > > + /* Double layer security blanket, see i915_gem_init() */
> > > + intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
> > > +
> > > if (dev_priv->ellc_size)
> > > I915_WRITE(HSW_IDICR, I915_READ(HSW_IDICR) | IDIHASHMSK(0xf));
> > >
> > > @@ -4815,7 +4818,7 @@ i915_gem_init_hw(struct drm_device *dev)
> > > for_each_ring(ring, dev_priv, i) {
> > > ret = ring->init_hw(ring);
> > > if (ret)
> > > - return ret;
> > > + goto out;
> > > }
> > >
> > > for (i = 0; i < NUM_L3_SLICES(dev); i++)
> > > @@ -4832,9 +4835,11 @@ i915_gem_init_hw(struct drm_device *dev)
> > > DRM_ERROR("Context enable failed %d\n", ret);
> > > i915_gem_cleanup_ringbuffer(dev);
> > >
> > > - return ret;
> > > + goto out;
> > > }
> > >
> > > +out:
> > > + intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
> > > return ret;
> > > }
> > >
> > > @@ -4868,6 +4873,14 @@ int i915_gem_init(struct drm_device *dev)
> > > dev_priv->gt.stop_ring = intel_logical_ring_stop;
> > > }
> > >
> > > + /* This is just a security blanket to placate dragons.
> > > + * On some systems, we very sporadically observe that the first TLBs
> > > + * used by the CS may be stale, despite us poking the TLB reset. If
> > > + * we hold the forcewake during initialisation these problems
> > > + * just magically go away.
> > > + */
> > > + intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
> >
> > gem_init shouldn't ever touch the hw except through gem_init_hw. Do we
> > really need the double-layer here?
>
> There are register accesses before, so yes since that's how I tested
> it...
>
> > Also the forcewake hack in the ring
> > init code should now be redundant, too.
>
> I am of the opinion that they still serve documentary value. Unless you
> have an assert_force_wake() handy.
Ok, count me convinced.
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
And I guess this is for Jani + cc: stable.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] drm/i915: Prevent TLB error on first execution on SNB
2015-03-10 12:55 ` Daniel Vetter
@ 2015-03-10 13:31 ` Jani Nikula
0 siblings, 0 replies; 11+ messages in thread
From: Jani Nikula @ 2015-03-10 13:31 UTC (permalink / raw)
To: Daniel Vetter, Chris Wilson
On Tue, 10 Mar 2015, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Tue, Mar 10, 2015 at 10:35:41AM +0000, Chris Wilson wrote:
>> On Tue, Mar 10, 2015 at 11:31:04AM +0100, Daniel Vetter wrote:
>> > On Fri, Feb 13, 2015 at 02:35:59PM +0000, Chris Wilson wrote:
>> > > Long ago I found that I was getting sporadic errors when booting SNB,
>> > > with the symptom being that the first batch died with IPEHR != *ACTHD,
>> > > typically caused by the TLB being invalid. These magically disappeared
>> > > if I held the forcewake during the entire ring initialisation sequence.
>> > > (It can probably be shortened to a short critical section, but the whole
>> > > initialisation is full of register writes and so we would be taking and
>> > > releasing forcewake almost continually, and so holding it over the
>> > > entire sequence will probably be a net win!)
>> > >
>> > > Note some of the kernels I encounted the issue already had the deferred
>> > > forcewake release, so it is still relevant.
>> > >
>> > > I know that there have been a few other reports with similar failure
>> > > conditions on SNB, I think such as
>> > > References: https://bugs.freedesktop.org/show_bug.cgi?id=80913
>> > >
>> > > v2: Wrap i915_gem_init_hw() with its own security blanket as we take
>> > > that path following resume and reset.
>> > >
>> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> > > ---
>> > > drivers/gpu/drm/i915/i915_gem.c | 18 ++++++++++++++++--
>> > > 1 file changed, 16 insertions(+), 2 deletions(-)
>> > >
>> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> > > index 8d15c8110962..08450922f373 100644
>> > > --- a/drivers/gpu/drm/i915/i915_gem.c
>> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
>> > > @@ -4783,6 +4783,9 @@ i915_gem_init_hw(struct drm_device *dev)
>> > > if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
>> > > return -EIO;
>> > >
>> > > + /* Double layer security blanket, see i915_gem_init() */
>> > > + intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
>> > > +
>> > > if (dev_priv->ellc_size)
>> > > I915_WRITE(HSW_IDICR, I915_READ(HSW_IDICR) | IDIHASHMSK(0xf));
>> > >
>> > > @@ -4815,7 +4818,7 @@ i915_gem_init_hw(struct drm_device *dev)
>> > > for_each_ring(ring, dev_priv, i) {
>> > > ret = ring->init_hw(ring);
>> > > if (ret)
>> > > - return ret;
>> > > + goto out;
>> > > }
>> > >
>> > > for (i = 0; i < NUM_L3_SLICES(dev); i++)
>> > > @@ -4832,9 +4835,11 @@ i915_gem_init_hw(struct drm_device *dev)
>> > > DRM_ERROR("Context enable failed %d\n", ret);
>> > > i915_gem_cleanup_ringbuffer(dev);
>> > >
>> > > - return ret;
>> > > + goto out;
>> > > }
>> > >
>> > > +out:
>> > > + intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
>> > > return ret;
>> > > }
>> > >
>> > > @@ -4868,6 +4873,14 @@ int i915_gem_init(struct drm_device *dev)
>> > > dev_priv->gt.stop_ring = intel_logical_ring_stop;
>> > > }
>> > >
>> > > + /* This is just a security blanket to placate dragons.
>> > > + * On some systems, we very sporadically observe that the first TLBs
>> > > + * used by the CS may be stale, despite us poking the TLB reset. If
>> > > + * we hold the forcewake during initialisation these problems
>> > > + * just magically go away.
>> > > + */
>> > > + intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
>> >
>> > gem_init shouldn't ever touch the hw except through gem_init_hw. Do we
>> > really need the double-layer here?
>>
>> There are register accesses before, so yes since that's how I tested
>> it...
>>
>> > Also the forcewake hack in the ring
>> > init code should now be redundant, too.
>>
>> I am of the opinion that they still serve documentary value. Unless you
>> have an assert_force_wake() handy.
>
> Ok, count me convinced.
>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> And I guess this is for Jani + cc: stable.
Pushed to drm-intel-fixes, thanks for the patch and review.
BR,
Jani.
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
--
Jani Nikula, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2015-03-10 13:30 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-02-13 12:59 [PATCH] drm/i915: Prevent TLB error on first execution on SNB Chris Wilson
2015-02-13 13:43 ` Daniel Vetter
2015-02-13 14:12 ` Chris Wilson
2015-02-23 22:54 ` Daniel Vetter
2015-02-13 14:35 ` [PATCH v2] " Chris Wilson
2015-02-14 0:19 ` shuang.he
2015-03-10 10:31 ` Daniel Vetter
2015-03-10 10:35 ` Chris Wilson
2015-03-10 12:55 ` Daniel Vetter
2015-03-10 13:31 ` Jani Nikula
2015-02-13 19:42 ` [PATCH] " shuang.he
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox