* [PATCH] drm/i915: set ctx->initialized only after RCS
@ 2013-12-28 21:31 Ben Widawsky
2013-12-29 9:59 ` Chris Wilson
2014-01-07 7:10 ` Daniel Vetter
0 siblings, 2 replies; 9+ messages in thread
From: Ben Widawsky @ 2013-12-28 21:31 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
The initialized flag is used to specify a context has been initialized
and it's context is safe to load, ie. the 3d state is setup properly.
With full PPGTT, we emit the address space loads during context switch
and this currently marks a context as initialized. With full PPGTT
patches, if a client first emits a batch to !RCS, then later, RCS, the
code will mistake the context as initialized and try to reload an
uninitialized context.
1. context 1 blit // context initialized
2. context 2 <X operation> // saves context 1 random state
3. context 1 render // loads random state from step 2
It is really easy to hit this with a planned upcoming patch which makes
default context reuse possible.
NOTE: This should only effect full PPGTT branches, ie. current
drm-intel-nightly.
Thanks to Chris for helping me track this down.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_gem_context.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index ebe0f67..28b9f30 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -686,10 +686,11 @@ static int do_switch(struct intel_ring_buffer *ring,
i915_gem_context_unreference(from);
}
+ to->is_initialized = true;
+
done:
i915_gem_context_reference(to);
ring->last_context = to;
- to->is_initialized = true;
to->last_ring = ring;
return 0;
--
1.8.5.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] drm/i915: set ctx->initialized only after RCS
2013-12-28 21:31 [PATCH] drm/i915: set ctx->initialized only after RCS Ben Widawsky
@ 2013-12-29 9:59 ` Chris Wilson
2013-12-30 21:34 ` Ben Widawsky
2014-01-07 7:10 ` Daniel Vetter
1 sibling, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2013-12-29 9:59 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Sat, Dec 28, 2013 at 01:31:49PM -0800, Ben Widawsky wrote:
> The initialized flag is used to specify a context has been initialized
> and it's context is safe to load, ie. the 3d state is setup properly.
> With full PPGTT, we emit the address space loads during context switch
> and this currently marks a context as initialized. With full PPGTT
> patches, if a client first emits a batch to !RCS, then later, RCS, the
> code will mistake the context as initialized and try to reload an
> uninitialized context.
>
> 1. context 1 blit // context initialized
+context marked as initialised
> 2. context 2 <X operation> // saves context 1 random state
> 3. context 1 render // loads random state from step 2
Note that step 2 is not required since the tracking is per-ring.
> It is really easy to hit this with a planned upcoming patch which makes
> default context reuse possible.
>
> NOTE: This should only effect full PPGTT branches, ie. current
> drm-intel-nightly.
Though we did postulate that mesa could hit this in a pathological
scenario with a freshly created context.
> Thanks to Chris for helping me track this down.
>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] drm/i915: set ctx->initialized only after RCS
2013-12-29 9:59 ` Chris Wilson
@ 2013-12-30 21:34 ` Ben Widawsky
2013-12-31 11:26 ` Chris Wilson
0 siblings, 1 reply; 9+ messages in thread
From: Ben Widawsky @ 2013-12-30 21:34 UTC (permalink / raw)
To: Chris Wilson, Ben Widawsky, Intel GFX
On Sun, Dec 29, 2013 at 09:59:26AM +0000, Chris Wilson wrote:
> On Sat, Dec 28, 2013 at 01:31:49PM -0800, Ben Widawsky wrote:
> > The initialized flag is used to specify a context has been initialized
> > and it's context is safe to load, ie. the 3d state is setup properly.
> > With full PPGTT, we emit the address space loads during context switch
> > and this currently marks a context as initialized. With full PPGTT
> > patches, if a client first emits a batch to !RCS, then later, RCS, the
> > code will mistake the context as initialized and try to reload an
> > uninitialized context.
> >
> > 1. context 1 blit // context initialized
> +context marked as initialised
>
> > 2. context 2 <X operation> // saves context 1 random state
> > 3. context 1 render // loads random state from step 2
>
> Note that step 2 is not required since the tracking is per-ring.
>
This is missing an extremely important caveat which I was
incorrectly correlating earlier in our discussion. Yes, step 2 is not
required. The step which is required is the page allocated must be
non-zero when allocated, and the contents of the page must be capable of
hanging the GPU when used as a context object.
Otherwise, the uninitialized context would always be all 0, which if I
understand the HW correctly, is "safe"
As long as you don't disagree, I'll fix the commit message with that
info.
> > It is really easy to hit this with a planned upcoming patch which makes
> > default context reuse possible.
> >
> > NOTE: This should only effect full PPGTT branches, ie. current
> > drm-intel-nightly.
>
> Though we did postulate that mesa could hit this in a pathological
> scenario with a freshly created context.
>
> > Thanks to Chris for helping me track this down.
> >
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre
--
Ben Widawsky, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] drm/i915: set ctx->initialized only after RCS
2013-12-30 21:34 ` Ben Widawsky
@ 2013-12-31 11:26 ` Chris Wilson
2014-01-01 18:10 ` Ben Widawsky
0 siblings, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2013-12-31 11:26 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Mon, Dec 30, 2013 at 01:34:46PM -0800, Ben Widawsky wrote:
> On Sun, Dec 29, 2013 at 09:59:26AM +0000, Chris Wilson wrote:
> > On Sat, Dec 28, 2013 at 01:31:49PM -0800, Ben Widawsky wrote:
> > > The initialized flag is used to specify a context has been initialized
> > > and it's context is safe to load, ie. the 3d state is setup properly.
> > > With full PPGTT, we emit the address space loads during context switch
> > > and this currently marks a context as initialized. With full PPGTT
> > > patches, if a client first emits a batch to !RCS, then later, RCS, the
> > > code will mistake the context as initialized and try to reload an
> > > uninitialized context.
> > >
> > > 1. context 1 blit // context initialized
> > +context marked as initialised
> >
> > > 2. context 2 <X operation> // saves context 1 random state
> > > 3. context 1 render // loads random state from step 2
> >
> > Note that step 2 is not required since the tracking is per-ring.
> >
>
> This is missing an extremely important caveat which I was
> incorrectly correlating earlier in our discussion. Yes, step 2 is not
> required. The step which is required is the page allocated must be
> non-zero when allocated, and the contents of the page must be capable of
> hanging the GPU when used as a context object.
Really? I'm pretty sure the last error state we looked at, the context
was all zeroes.
> Otherwise, the uninitialized context would always be all 0, which if I
> understand the HW correctly, is "safe"
>
> As long as you don't disagree, I'll fix the commit message with that
> info.
Whatever you feel matches our best understanding of the problem. Just
double check that last error state first ;-)
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] drm/i915: set ctx->initialized only after RCS
2013-12-31 11:26 ` Chris Wilson
@ 2014-01-01 18:10 ` Ben Widawsky
0 siblings, 0 replies; 9+ messages in thread
From: Ben Widawsky @ 2014-01-01 18:10 UTC (permalink / raw)
To: Chris Wilson, Ben Widawsky, Intel GFX
On Tue, Dec 31, 2013 at 11:26:17AM +0000, Chris Wilson wrote:
> On Mon, Dec 30, 2013 at 01:34:46PM -0800, Ben Widawsky wrote:
> > On Sun, Dec 29, 2013 at 09:59:26AM +0000, Chris Wilson wrote:
> > > On Sat, Dec 28, 2013 at 01:31:49PM -0800, Ben Widawsky wrote:
> > > > The initialized flag is used to specify a context has been initialized
> > > > and it's context is safe to load, ie. the 3d state is setup properly.
> > > > With full PPGTT, we emit the address space loads during context switch
> > > > and this currently marks a context as initialized. With full PPGTT
> > > > patches, if a client first emits a batch to !RCS, then later, RCS, the
> > > > code will mistake the context as initialized and try to reload an
> > > > uninitialized context.
> > > >
> > > > 1. context 1 blit // context initialized
> > > +context marked as initialised
> > >
> > > > 2. context 2 <X operation> // saves context 1 random state
> > > > 3. context 1 render // loads random state from step 2
> > >
> > > Note that step 2 is not required since the tracking is per-ring.
> > >
> >
> > This is missing an extremely important caveat which I was
> > incorrectly correlating earlier in our discussion. Yes, step 2 is not
> > required. The step which is required is the page allocated must be
> > non-zero when allocated, and the contents of the page must be capable of
> > hanging the GPU when used as a context object.
>
> Really? I'm pretty sure the last error state we looked at, the context
> was all zeroes.
>
Well... the first page of the context is all 0.
> > Otherwise, the uninitialized context would always be all 0, which if I
> > understand the HW correctly, is "safe"
> >
> > As long as you don't disagree, I'll fix the commit message with that
> > info.
>
> Whatever you feel matches our best understanding of the problem. Just
> double check that last error state first ;-)
> -Chris
>
The real problem is that I've gone and convinced myself that no other
means exists to actually make the GPU hang. This incidentally also gives
some merit to my assumption that IPEHR is accurate through context loads
- in other words, we try to do MI_SET_CONTEXT during a MI_SET_CONTEXT.
--
Ben Widawsky, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] drm/i915: set ctx->initialized only after RCS
2013-12-28 21:31 [PATCH] drm/i915: set ctx->initialized only after RCS Ben Widawsky
2013-12-29 9:59 ` Chris Wilson
@ 2014-01-07 7:10 ` Daniel Vetter
2014-01-07 10:47 ` Chris Wilson
1 sibling, 1 reply; 9+ messages in thread
From: Daniel Vetter @ 2014-01-07 7:10 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Sat, Dec 28, 2013 at 01:31:49PM -0800, Ben Widawsky wrote:
> The initialized flag is used to specify a context has been initialized
> and it's context is safe to load, ie. the 3d state is setup properly.
> With full PPGTT, we emit the address space loads during context switch
> and this currently marks a context as initialized. With full PPGTT
> patches, if a client first emits a batch to !RCS, then later, RCS, the
> code will mistake the context as initialized and try to reload an
> uninitialized context.
>
> 1. context 1 blit // context initialized
> 2. context 2 <X operation> // saves context 1 random state
> 3. context 1 render // loads random state from step 2
>
> It is really easy to hit this with a planned upcoming patch which makes
> default context reuse possible.
>
> NOTE: This should only effect full PPGTT branches, ie. current
> drm-intel-nightly.
>
> Thanks to Chris for helping me track this down.
>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Do we have a testcase for this or a bug report?
-Daniel
> ---
> drivers/gpu/drm/i915/i915_gem_context.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index ebe0f67..28b9f30 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -686,10 +686,11 @@ static int do_switch(struct intel_ring_buffer *ring,
> i915_gem_context_unreference(from);
> }
>
> + to->is_initialized = true;
> +
> done:
> i915_gem_context_reference(to);
> ring->last_context = to;
> - to->is_initialized = true;
> to->last_ring = ring;
>
> return 0;
> --
> 1.8.5.2
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] drm/i915: set ctx->initialized only after RCS
2014-01-07 7:10 ` Daniel Vetter
@ 2014-01-07 10:47 ` Chris Wilson
2014-01-09 23:17 ` Ben Widawsky
0 siblings, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2014-01-07 10:47 UTC (permalink / raw)
To: Daniel Vetter; +Cc: Intel GFX, Ben Widawsky, Ben Widawsky
On Tue, Jan 07, 2014 at 08:10:33AM +0100, Daniel Vetter wrote:
> On Sat, Dec 28, 2013 at 01:31:49PM -0800, Ben Widawsky wrote:
> > The initialized flag is used to specify a context has been initialized
> > and it's context is safe to load, ie. the 3d state is setup properly.
> > With full PPGTT, we emit the address space loads during context switch
> > and this currently marks a context as initialized. With full PPGTT
> > patches, if a client first emits a batch to !RCS, then later, RCS, the
> > code will mistake the context as initialized and try to reload an
> > uninitialized context.
> >
> > 1. context 1 blit // context initialized
> > 2. context 2 <X operation> // saves context 1 random state
> > 3. context 1 render // loads random state from step 2
> >
> > It is really easy to hit this with a planned upcoming patch which makes
> > default context reuse possible.
> >
> > NOTE: This should only effect full PPGTT branches, ie. current
> > drm-intel-nightly.
> >
> > Thanks to Chris for helping me track this down.
> >
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>
> Do we have a testcase for this or a bug report?
Odd, since QA should have hit this...
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] drm/i915: set ctx->initialized only after RCS
2014-01-07 10:47 ` Chris Wilson
@ 2014-01-09 23:17 ` Ben Widawsky
2014-01-10 7:18 ` Daniel Vetter
0 siblings, 1 reply; 9+ messages in thread
From: Ben Widawsky @ 2014-01-09 23:17 UTC (permalink / raw)
To: Chris Wilson, Daniel Vetter, Ben Widawsky, Intel GFX
On Tue, Jan 07, 2014 at 10:47:18AM +0000, Chris Wilson wrote:
> On Tue, Jan 07, 2014 at 08:10:33AM +0100, Daniel Vetter wrote:
> > On Sat, Dec 28, 2013 at 01:31:49PM -0800, Ben Widawsky wrote:
> > > The initialized flag is used to specify a context has been initialized
> > > and it's context is safe to load, ie. the 3d state is setup properly.
> > > With full PPGTT, we emit the address space loads during context switch
> > > and this currently marks a context as initialized. With full PPGTT
> > > patches, if a client first emits a batch to !RCS, then later, RCS, the
> > > code will mistake the context as initialized and try to reload an
> > > uninitialized context.
> > >
> > > 1. context 1 blit // context initialized
> > > 2. context 2 <X operation> // saves context 1 random state
> > > 3. context 1 render // loads random state from step 2
> > >
> > > It is really easy to hit this with a planned upcoming patch which makes
> > > default context reuse possible.
> > >
> > > NOTE: This should only effect full PPGTT branches, ie. current
> > > drm-intel-nightly.
> > >
> > > Thanks to Chris for helping me track this down.
> > >
> > > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> >
> > Do we have a testcase for this or a bug report?
>
> Odd, since QA should have hit this...
> -Chris
I was only able to hit it on one machine. The test case was 'startx.' I
think getting it to fail is somewhat dependent upon how the backing
pages for the context are initialized (at least, that is my theory).
--
Ben Widawsky, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] drm/i915: set ctx->initialized only after RCS
2014-01-09 23:17 ` Ben Widawsky
@ 2014-01-10 7:18 ` Daniel Vetter
0 siblings, 0 replies; 9+ messages in thread
From: Daniel Vetter @ 2014-01-10 7:18 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Thu, Jan 09, 2014 at 03:17:25PM -0800, Ben Widawsky wrote:
> On Tue, Jan 07, 2014 at 10:47:18AM +0000, Chris Wilson wrote:
> > On Tue, Jan 07, 2014 at 08:10:33AM +0100, Daniel Vetter wrote:
> > > On Sat, Dec 28, 2013 at 01:31:49PM -0800, Ben Widawsky wrote:
> > > > The initialized flag is used to specify a context has been initialized
> > > > and it's context is safe to load, ie. the 3d state is setup properly.
> > > > With full PPGTT, we emit the address space loads during context switch
> > > > and this currently marks a context as initialized. With full PPGTT
> > > > patches, if a client first emits a batch to !RCS, then later, RCS, the
> > > > code will mistake the context as initialized and try to reload an
> > > > uninitialized context.
> > > >
> > > > 1. context 1 blit // context initialized
> > > > 2. context 2 <X operation> // saves context 1 random state
> > > > 3. context 1 render // loads random state from step 2
> > > >
> > > > It is really easy to hit this with a planned upcoming patch which makes
> > > > default context reuse possible.
> > > >
> > > > NOTE: This should only effect full PPGTT branches, ie. current
> > > > drm-intel-nightly.
> > > >
> > > > Thanks to Chris for helping me track this down.
> > > >
> > > > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > >
> > > Do we have a testcase for this or a bug report?
> >
> > Odd, since QA should have hit this...
> > -Chris
>
> I was only able to hit it on one machine. The test case was 'startx.' I
> think getting it to fail is somewhat dependent upon how the backing
> pages for the context are initialized (at least, that is my theory).
Well I've hoped we'd understand the fallout a bit better, but meh. Patch
merged with the failure scenario in the commit message simplified as noted
by Chris.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2014-01-10 7:17 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-12-28 21:31 [PATCH] drm/i915: set ctx->initialized only after RCS Ben Widawsky
2013-12-29 9:59 ` Chris Wilson
2013-12-30 21:34 ` Ben Widawsky
2013-12-31 11:26 ` Chris Wilson
2014-01-01 18:10 ` Ben Widawsky
2014-01-07 7:10 ` Daniel Vetter
2014-01-07 10:47 ` Chris Wilson
2014-01-09 23:17 ` Ben Widawsky
2014-01-10 7:18 ` Daniel Vetter
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox