stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] drm/i915: Perform an invalidate prior to executing golden renderstate
@ 2017-08-08 13:19 Chris Wilson
  2017-08-08 13:36 ` Mika Kuoppala
  0 siblings, 1 reply; 6+ messages in thread
From: Chris Wilson @ 2017-08-08 13:19 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson, Mika Kuoppala, stable

As we may have just bound the renderstate into the GGTT for execution, we
need to ensure that the GTT TLB are also flushed.

On snb-gt2, this would cause a random GPU hang at the start of a new
context (e.g. boot) and on snb-gt1, it was causing the renderstate batch
to take ~10s. It was the GPU hang that revealed the truth, as the CS
gleefully executed beyond the end of the golden renderstate batch, a good
indicator for a GTT TLB miss.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/i915/i915_gem_render_state.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
index 241d827b85fb..3703dc91eeda 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.c
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
@@ -242,6 +242,10 @@ int i915_gem_render_state_emit(struct drm_i915_gem_request *req)
 			goto err_unpin;
 	}
 
+	ret = req->engine->emit_flush(req, EMIT_INVALIDATE);
+	if (ret)
+		goto err_unpin;
+
 	ret = req->engine->emit_bb_start(req,
 					 so->batch_offset, so->batch_size,
 					 I915_DISPATCH_SECURE);
-- 
2.13.3

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm/i915: Perform an invalidate prior to executing golden renderstate
  2017-08-08 13:19 [PATCH] drm/i915: Perform an invalidate prior to executing golden renderstate Chris Wilson
@ 2017-08-08 13:36 ` Mika Kuoppala
  2017-08-08 13:37   ` Mika Kuoppala
                     ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Mika Kuoppala @ 2017-08-08 13:36 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Chris Wilson, stable

Chris Wilson <chris@chris-wilson.co.uk> writes:

> As we may have just bound the renderstate into the GGTT for execution, we
> need to ensure that the GTT TLB are also flushed.
>
> On snb-gt2, this would cause a random GPU hang at the start of a new
> context (e.g. boot) and on snb-gt1, it was causing the renderstate batch
> to take ~10s. It was the GPU hang that revealed the truth, as the CS
> gleefully executed beyond the end of the golden renderstate batch, a good
> indicator for a GTT TLB miss.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: stable@vger.kernel.org

The flush has been there but got stomped by:

Fixes: dc4be6071a24 ("drm/i915: Add explicit request management to i915_gem_init_hw()")

Now we can fix the gen6 renderstate too ;)

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>

> ---
>  drivers/gpu/drm/i915/i915_gem_render_state.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
> index 241d827b85fb..3703dc91eeda 100644
> --- a/drivers/gpu/drm/i915/i915_gem_render_state.c
> +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
> @@ -242,6 +242,10 @@ int i915_gem_render_state_emit(struct drm_i915_gem_request *req)
>  			goto err_unpin;
>  	}
>  
> +	ret = req->engine->emit_flush(req, EMIT_INVALIDATE);
> +	if (ret)
> +		goto err_unpin;
> +
>  	ret = req->engine->emit_bb_start(req,
>  					 so->batch_offset, so->batch_size,
>  					 I915_DISPATCH_SECURE);
> -- 
> 2.13.3

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm/i915: Perform an invalidate prior to executing golden renderstate
  2017-08-08 13:36 ` Mika Kuoppala
@ 2017-08-08 13:37   ` Mika Kuoppala
  2017-08-08 13:43     ` Mika Kuoppala
  2017-08-08 13:52   ` Chris Wilson
  2017-08-08 14:00   ` Chris Wilson
  2 siblings, 1 reply; 6+ messages in thread
From: Mika Kuoppala @ 2017-08-08 13:37 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Chris Wilson, stable

Mika Kuoppala <mika.kuoppala@linux.intel.com> writes:

> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
>> As we may have just bound the renderstate into the GGTT for execution, we
>> need to ensure that the GTT TLB are also flushed.
>>
>> On snb-gt2, this would cause a random GPU hang at the start of a new
>> context (e.g. boot) and on snb-gt1, it was causing the renderstate batch
>> to take ~10s. It was the GPU hang that revealed the truth, as the CS
>> gleefully executed beyond the end of the golden renderstate batch, a good
>> indicator for a GTT TLB miss.
>>
>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>> Cc: stable@vger.kernel.org
>
> The flush has been there but got stomped by:
>
> Fixes: dc4be6071a24 ("drm/i915: Add explicit request management to i915_gem_init_hw()")
>
> Now we can fix the gen6 renderstate too ;)
>
> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>

On hindsight, should we actually do the flush through add request?
-Mika

>
>> ---
>>  drivers/gpu/drm/i915/i915_gem_render_state.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
>> index 241d827b85fb..3703dc91eeda 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_render_state.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
>> @@ -242,6 +242,10 @@ int i915_gem_render_state_emit(struct drm_i915_gem_request *req)
>>  			goto err_unpin;
>>  	}
>>  
>> +	ret = req->engine->emit_flush(req, EMIT_INVALIDATE);
>> +	if (ret)
>> +		goto err_unpin;
>> +
>>  	ret = req->engine->emit_bb_start(req,
>>  					 so->batch_offset, so->batch_size,
>>  					 I915_DISPATCH_SECURE);
>> -- 
>> 2.13.3

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm/i915: Perform an invalidate prior to executing golden renderstate
  2017-08-08 13:37   ` Mika Kuoppala
@ 2017-08-08 13:43     ` Mika Kuoppala
  0 siblings, 0 replies; 6+ messages in thread
From: Mika Kuoppala @ 2017-08-08 13:43 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: stable

Mika Kuoppala <mika.kuoppala@linux.intel.com> writes:

> Mika Kuoppala <mika.kuoppala@linux.intel.com> writes:
>
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>>
>>> As we may have just bound the renderstate into the GGTT for execution, we
>>> need to ensure that the GTT TLB are also flushed.
>>>
>>> On snb-gt2, this would cause a random GPU hang at the start of a new
>>> context (e.g. boot) and on snb-gt1, it was causing the renderstate batch
>>> to take ~10s. It was the GPU hang that revealed the truth, as the CS
>>> gleefully executed beyond the end of the golden renderstate batch, a good
>>> indicator for a GTT TLB miss.
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>>> Cc: stable@vger.kernel.org
>>
>> The flush has been there but got stomped by:
>>
>> Fixes: dc4be6071a24 ("drm/i915: Add explicit request management to i915_gem_init_hw()")
>>
>> Now we can fix the gen6 renderstate too ;)
>>
>> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
>
> On hindsight, should we actually do the flush through add request?

No, as it is not there anymore in gem_init_hw. -ETOOMUCHCOFFEE.

> -Mika
>
>>
>>> ---
>>>  drivers/gpu/drm/i915/i915_gem_render_state.c | 4 ++++
>>>  1 file changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
>>> index 241d827b85fb..3703dc91eeda 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_render_state.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
>>> @@ -242,6 +242,10 @@ int i915_gem_render_state_emit(struct drm_i915_gem_request *req)
>>>  			goto err_unpin;
>>>  	}
>>>  
>>> +	ret = req->engine->emit_flush(req, EMIT_INVALIDATE);
>>> +	if (ret)
>>> +		goto err_unpin;
>>> +
>>>  	ret = req->engine->emit_bb_start(req,
>>>  					 so->batch_offset, so->batch_size,
>>>  					 I915_DISPATCH_SECURE);
>>> -- 
>>> 2.13.3
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm/i915: Perform an invalidate prior to executing golden renderstate
  2017-08-08 13:36 ` Mika Kuoppala
  2017-08-08 13:37   ` Mika Kuoppala
@ 2017-08-08 13:52   ` Chris Wilson
  2017-08-08 14:00   ` Chris Wilson
  2 siblings, 0 replies; 6+ messages in thread
From: Chris Wilson @ 2017-08-08 13:52 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx; +Cc: stable

Quoting Mika Kuoppala (2017-08-08 14:36:39)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > As we may have just bound the renderstate into the GGTT for execution, we
> > need to ensure that the GTT TLB are also flushed.
> >
> > On snb-gt2, this would cause a random GPU hang at the start of a new
> > context (e.g. boot) and on snb-gt1, it was causing the renderstate batch
> > to take ~10s. It was the GPU hang that revealed the truth, as the CS
> > gleefully executed beyond the end of the golden renderstate batch, a good
> > indicator for a GTT TLB miss.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: stable@vger.kernel.org
> 
> The flush has been there but got stomped by:
> 
> Fixes: dc4be6071a24 ("drm/i915: Add explicit request management to i915_gem_init_hw()")

Hmm, I think it is actually 20fe17aa52dc ("drm/i915: Remove
redundant TLB invalidate on switching contexts"). The importance is in
having the invalidate after the GTT bind and before the MI_BB_START.
Which just happens to be implied by the ordering of the context switch
emission, but given the indirect link, not guaranteed.

Commit dc4be6071a24 only modifies the sequence after the MI_BB_START, so
doesn't look like the culprit.
-Chris

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm/i915: Perform an invalidate prior to executing golden renderstate
  2017-08-08 13:36 ` Mika Kuoppala
  2017-08-08 13:37   ` Mika Kuoppala
  2017-08-08 13:52   ` Chris Wilson
@ 2017-08-08 14:00   ` Chris Wilson
  2 siblings, 0 replies; 6+ messages in thread
From: Chris Wilson @ 2017-08-08 14:00 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx; +Cc: stable

Quoting Mika Kuoppala (2017-08-08 14:36:39)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > As we may have just bound the renderstate into the GGTT for execution, we
> > need to ensure that the GTT TLB are also flushed.
> >
> > On snb-gt2, this would cause a random GPU hang at the start of a new
> > context (e.g. boot) and on snb-gt1, it was causing the renderstate batch
> > to take ~10s. It was the GPU hang that revealed the truth, as the CS
> > gleefully executed beyond the end of the golden renderstate batch, a good
> > indicator for a GTT TLB miss.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: stable@vger.kernel.org
> 
> The flush has been there but got stomped by:
> 
> Fixes: dc4be6071a24 ("drm/i915: Add explicit request management to i915_gem_init_hw()")
> 
> Now we can fix the gen6 renderstate too ;)
> 
> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>

Added 
Fixes: 20fe17aa52dc ("drm/i915: Remove redundant TLB invalidate on switching contexts")
and pushed. Still weird that I can't see anything resembling this on the
farm, despite the two snb machines I have having weird problems. Also
the effect is so random, a TLB miss!, it is hard to imagine devising a
better test (every igt allocates at least one new context executing the
renderstate, and so having an opportunity to hit a bug.)

Thanks,
-Chris

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-08-08 14:00 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-08-08 13:19 [PATCH] drm/i915: Perform an invalidate prior to executing golden renderstate Chris Wilson
2017-08-08 13:36 ` Mika Kuoppala
2017-08-08 13:37   ` Mika Kuoppala
2017-08-08 13:43     ` Mika Kuoppala
2017-08-08 13:52   ` Chris Wilson
2017-08-08 14:00   ` Chris Wilson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).