* Re: [PATCH] drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display
2014-04-02 18:21 [PATCH] drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display oscar.mateo
@ 2014-04-02 17:59 ` Chris Wilson
2014-04-03 9:34 ` Daniel Vetter
2014-05-16 11:08 ` [PATCH v2] " oscar.mateo
2 siblings, 0 replies; 18+ messages in thread
From: Chris Wilson @ 2014-04-02 17:59 UTC (permalink / raw)
To: oscar.mateo; +Cc: intel-gfx
On Wed, Apr 02, 2014 at 07:21:01PM +0100, oscar.mateo@intel.com wrote:
> From: Oscar Mateo <oscar.mateo@intel.com>
>
> Otherwise, we do a NULL pointer dereference.
Or we could just do
was_pin_display = obj->pin_display;
obj->pin_display = true;
err_unpin_display:
obj->pin_display = was_pin_display;
And then we wouldn't even need the essay... However, it is almost a
theory of operation and quite useful...
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH] drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display
@ 2014-04-02 18:21 oscar.mateo
2014-04-02 17:59 ` Chris Wilson
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: oscar.mateo @ 2014-04-02 18:21 UTC (permalink / raw)
To: intel-gfx
From: Oscar Mateo <oscar.mateo@intel.com>
Otherwise, we do a NULL pointer dereference.
I've seen this happen while handling an error in
i915_gem_object_pin_to_display_plane():
If i915_gem_object_set_cache_level() fails, we call is_pin_display() to
handle the error. At this point, the object is still not pinned to GGTT
and maybe not even bound, so we have to check before we dereference its
GGTT vma.
Issue: VIZ-3772
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
drivers/gpu/drm/i915/i915_gem.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c70121d..1d161c7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3619,6 +3619,10 @@ unlock:
static bool is_pin_display(struct drm_i915_gem_object *obj)
{
+ struct i915_vma *vma = i915_gem_obj_to_ggtt(obj);
+ if (!vma)
+ return false;
+
/* There are 3 sources that pin objects:
* 1. The display engine (scanouts, sprites, cursors);
* 2. Reservations for execbuffer;
@@ -3630,7 +3634,7 @@ static bool is_pin_display(struct drm_i915_gem_object *obj)
* subtracting the potential reference by the user, any pin_count
* remains, it must be due to another use by the display engine.
*/
- return i915_gem_obj_to_ggtt(obj)->pin_count - !!obj->user_pin_count;
+ return vma->pin_count - !!obj->user_pin_count;
}
/*
--
1.9.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display
2014-04-02 18:21 [PATCH] drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display oscar.mateo
2014-04-02 17:59 ` Chris Wilson
@ 2014-04-03 9:34 ` Daniel Vetter
2014-05-12 9:05 ` Mateo Lozano, Oscar
2014-05-16 11:08 ` [PATCH v2] " oscar.mateo
2 siblings, 1 reply; 18+ messages in thread
From: Daniel Vetter @ 2014-04-03 9:34 UTC (permalink / raw)
To: oscar.mateo; +Cc: intel-gfx
On Wed, Apr 02, 2014 at 07:21:01PM +0100, oscar.mateo@intel.com wrote:
> From: Oscar Mateo <oscar.mateo@intel.com>
>
> Otherwise, we do a NULL pointer dereference.
>
> I've seen this happen while handling an error in
> i915_gem_object_pin_to_display_plane():
>
> If i915_gem_object_set_cache_level() fails, we call is_pin_display() to
> handle the error. At this point, the object is still not pinned to GGTT
> and maybe not even bound, so we have to check before we dereference its
> GGTT vma.
>
> Issue: VIZ-3772
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Have you looked into provoking this with an igt testcase? On a hunch a
busy load (to extend the race window) plus the usual interruptor trick to
jump out of wait_seqno calls should be able to make this go kaboom on
command. But I haven't analyzed the bug in detail.
Thanks, Daniel
> ---
> drivers/gpu/drm/i915/i915_gem.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index c70121d..1d161c7 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3619,6 +3619,10 @@ unlock:
>
> static bool is_pin_display(struct drm_i915_gem_object *obj)
> {
> + struct i915_vma *vma = i915_gem_obj_to_ggtt(obj);
> + if (!vma)
> + return false;
> +
> /* There are 3 sources that pin objects:
> * 1. The display engine (scanouts, sprites, cursors);
> * 2. Reservations for execbuffer;
> @@ -3630,7 +3634,7 @@ static bool is_pin_display(struct drm_i915_gem_object *obj)
> * subtracting the potential reference by the user, any pin_count
> * remains, it must be due to another use by the display engine.
> */
> - return i915_gem_obj_to_ggtt(obj)->pin_count - !!obj->user_pin_count;
> + return vma->pin_count - !!obj->user_pin_count;
> }
>
> /*
> --
> 1.9.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display
2014-04-03 9:34 ` Daniel Vetter
@ 2014-05-12 9:05 ` Mateo Lozano, Oscar
2014-05-12 10:09 ` Chris Wilson
2014-05-12 16:11 ` Daniel Vetter
0 siblings, 2 replies; 18+ messages in thread
From: Mateo Lozano, Oscar @ 2014-05-12 9:05 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx@lists.freedesktop.org
Hi Daniel,
Sorry, this fell through the cracks:
> Subject: Re: [Intel-gfx] [PATCH] drm/i915: Gracefully handle obj not bound to
> GGTT in is_pin_display
>
> On Wed, Apr 02, 2014 at 07:21:01PM +0100, oscar.mateo@intel.com wrote:
> > From: Oscar Mateo <oscar.mateo@intel.com>
> >
> > Otherwise, we do a NULL pointer dereference.
> >
> > I've seen this happen while handling an error in
> > i915_gem_object_pin_to_display_plane():
> >
> > If i915_gem_object_set_cache_level() fails, we call is_pin_display()
> > to handle the error. At this point, the object is still not pinned to
> > GGTT and maybe not even bound, so we have to check before we
> > dereference its GGTT vma.
> >
> > Issue: VIZ-3772
> > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
>
> Have you looked into provoking this with an igt testcase? On a hunch a busy
> load (to extend the race window) plus the usual interruptor trick to jump out of
> wait_seqno calls should be able to make this go kaboom on command. But I
> haven't analyzed the bug in detail.
AFAICT, the only sequence where this likely to happen (because we are handling a recently created object) is:
intelfb_alloc -> intel_pin_and_fence_fb_obj -> i915_gem_object_pin_to_display_plane -> i915_gem_object_set_cache_level -> is_pin_display
Now, I think intelfb_alloc only happens during bootup, so reproducing it on IGT would be difficult.
I still feel the fix is valid though :)
-- Oscar
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display
2014-05-12 9:05 ` Mateo Lozano, Oscar
@ 2014-05-12 10:09 ` Chris Wilson
2014-05-12 10:30 ` Mateo Lozano, Oscar
2014-05-12 16:11 ` Daniel Vetter
1 sibling, 1 reply; 18+ messages in thread
From: Chris Wilson @ 2014-05-12 10:09 UTC (permalink / raw)
To: Mateo Lozano, Oscar; +Cc: intel-gfx@lists.freedesktop.org
On Mon, May 12, 2014 at 09:05:45AM +0000, Mateo Lozano, Oscar wrote:
> Hi Daniel,
>
> Sorry, this fell through the cracks:
>
> > Subject: Re: [Intel-gfx] [PATCH] drm/i915: Gracefully handle obj not bound to
> > GGTT in is_pin_display
> >
> > On Wed, Apr 02, 2014 at 07:21:01PM +0100, oscar.mateo@intel.com wrote:
> > > From: Oscar Mateo <oscar.mateo@intel.com>
> > >
> > > Otherwise, we do a NULL pointer dereference.
> > >
> > > I've seen this happen while handling an error in
> > > i915_gem_object_pin_to_display_plane():
> > >
> > > If i915_gem_object_set_cache_level() fails, we call is_pin_display()
> > > to handle the error. At this point, the object is still not pinned to
> > > GGTT and maybe not even bound, so we have to check before we
> > > dereference its GGTT vma.
> > >
> > > Issue: VIZ-3772
> > > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> >
> > Have you looked into provoking this with an igt testcase? On a hunch a busy
> > load (to extend the race window) plus the usual interruptor trick to jump out of
> > wait_seqno calls should be able to make this go kaboom on command. But I
> > haven't analyzed the bug in detail.
>
> AFAICT, the only sequence where this likely to happen (because we are handling a recently created object) is:
>
> intelfb_alloc -> intel_pin_and_fence_fb_obj -> i915_gem_object_pin_to_display_plane -> i915_gem_object_set_cache_level -> is_pin_display
>
> Now, I think intelfb_alloc only happens during bootup, so reproducing it on IGT would be difficult.
> I still feel the fix is valid though :)
Did you consider my alternative fix of restoring the old value in the
error path?
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display
2014-05-12 10:09 ` Chris Wilson
@ 2014-05-12 10:30 ` Mateo Lozano, Oscar
2014-05-12 10:37 ` Chris Wilson
0 siblings, 1 reply; 18+ messages in thread
From: Mateo Lozano, Oscar @ 2014-05-12 10:30 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx@lists.freedesktop.org
---------------------------------------------------------------------
Intel Corporation (UK) Limited
Registered No. 1134945 (England)
Registered Office: Pipers Way, Swindon SN3 1RJ
VAT No: 860 2173 47
> -----Original Message-----
> From: Chris Wilson [mailto:chris@chris-wilson.co.uk]
> Sent: Monday, May 12, 2014 11:09 AM
> To: Mateo Lozano, Oscar
> Cc: Daniel Vetter; intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH] drm/i915: Gracefully handle obj not bound to
> GGTT in is_pin_display
>
> On Mon, May 12, 2014 at 09:05:45AM +0000, Mateo Lozano, Oscar wrote:
> > Hi Daniel,
> >
> > Sorry, this fell through the cracks:
> >
> > > Subject: Re: [Intel-gfx] [PATCH] drm/i915: Gracefully handle obj not
> > > bound to GGTT in is_pin_display
> > >
> > > On Wed, Apr 02, 2014 at 07:21:01PM +0100, oscar.mateo@intel.com
> wrote:
> > > > From: Oscar Mateo <oscar.mateo@intel.com>
> > > >
> > > > Otherwise, we do a NULL pointer dereference.
> > > >
> > > > I've seen this happen while handling an error in
> > > > i915_gem_object_pin_to_display_plane():
> > > >
> > > > If i915_gem_object_set_cache_level() fails, we call
> > > > is_pin_display() to handle the error. At this point, the object is
> > > > still not pinned to GGTT and maybe not even bound, so we have to
> > > > check before we dereference its GGTT vma.
> > > >
> > > > Issue: VIZ-3772
> > > > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > >
> > > Have you looked into provoking this with an igt testcase? On a hunch
> > > a busy load (to extend the race window) plus the usual interruptor
> > > trick to jump out of wait_seqno calls should be able to make this go
> > > kaboom on command. But I haven't analyzed the bug in detail.
> >
> > AFAICT, the only sequence where this likely to happen (because we are
> handling a recently created object) is:
> >
> > intelfb_alloc -> intel_pin_and_fence_fb_obj ->
> > i915_gem_object_pin_to_display_plane ->
> > i915_gem_object_set_cache_level -> is_pin_display
> >
> > Now, I think intelfb_alloc only happens during bootup, so reproducing it on
> IGT would be difficult.
> > I still feel the fix is valid though :)
>
> Did you consider my alternative fix of restoring the old value in the error path?
> -Chris
Is that directed to Daniel or me? Restoring the old value is way easier, but I thought you wanted to keep is_pin_display as a theory of operation? And Daniel might still want an IGT test, I suppose...
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display
2014-05-12 10:30 ` Mateo Lozano, Oscar
@ 2014-05-12 10:37 ` Chris Wilson
0 siblings, 0 replies; 18+ messages in thread
From: Chris Wilson @ 2014-05-12 10:37 UTC (permalink / raw)
To: Mateo Lozano, Oscar; +Cc: intel-gfx@lists.freedesktop.org
On Mon, May 12, 2014 at 10:30:11AM +0000, Mateo Lozano, Oscar wrote:
> > Did you consider my alternative fix of restoring the old value in the error path?
>
> Is that directed to Daniel or me? Restoring the old value is way easier, but I thought you wanted to keep is_pin_display as a theory of operation? And Daniel might still want an IGT test, I suppose...
>
I'd like to be able to kill is_pin_display(), it's very fragile but I
don't have a better alternative yet.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display
2014-05-12 9:05 ` Mateo Lozano, Oscar
2014-05-12 10:09 ` Chris Wilson
@ 2014-05-12 16:11 ` Daniel Vetter
2014-05-12 16:14 ` Daniel Vetter
1 sibling, 1 reply; 18+ messages in thread
From: Daniel Vetter @ 2014-05-12 16:11 UTC (permalink / raw)
To: Mateo Lozano, Oscar; +Cc: intel-gfx@lists.freedesktop.org
On Mon, May 12, 2014 at 09:05:45AM +0000, Mateo Lozano, Oscar wrote:
> Hi Daniel,
>
> Sorry, this fell through the cracks:
>
> > Subject: Re: [Intel-gfx] [PATCH] drm/i915: Gracefully handle obj not bound to
> > GGTT in is_pin_display
> >
> > On Wed, Apr 02, 2014 at 07:21:01PM +0100, oscar.mateo@intel.com wrote:
> > > From: Oscar Mateo <oscar.mateo@intel.com>
> > >
> > > Otherwise, we do a NULL pointer dereference.
> > >
> > > I've seen this happen while handling an error in
> > > i915_gem_object_pin_to_display_plane():
> > >
> > > If i915_gem_object_set_cache_level() fails, we call is_pin_display()
> > > to handle the error. At this point, the object is still not pinned to
> > > GGTT and maybe not even bound, so we have to check before we
> > > dereference its GGTT vma.
> > >
> > > Issue: VIZ-3772
> > > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> >
> > Have you looked into provoking this with an igt testcase? On a hunch a busy
> > load (to extend the race window) plus the usual interruptor trick to jump out of
> > wait_seqno calls should be able to make this go kaboom on command. But I
> > haven't analyzed the bug in detail.
>
> AFAICT, the only sequence where this likely to happen (because we are handling a recently created object) is:
>
> intelfb_alloc -> intel_pin_and_fence_fb_obj -> i915_gem_object_pin_to_display_plane -> i915_gem_object_set_cache_level -> is_pin_display
Pageflipping to a freshly allocated BO without ever touching it beforehand
should be able to achive the same. If this is really all that's needed.
But looking at the code a better way should be:
1. Create new bo, wrap it in a kms fb.
2. Slap busy load onto that bo, e.g. reapeatedly fill it with the blitter.
3. Enable evil interruptor (igt_fork_signal_helper).
4. Submit pageflip
-> Boom since the set_cache_level will block, get interrupted and exit
early with -EINTR.
Given sufficient overkill in 2. this should be 100% reliable to reproduce.
Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display
2014-05-12 16:11 ` Daniel Vetter
@ 2014-05-12 16:14 ` Daniel Vetter
2014-05-12 17:10 ` Mateo Lozano, Oscar
2014-05-15 13:14 ` Mateo Lozano, Oscar
0 siblings, 2 replies; 18+ messages in thread
From: Daniel Vetter @ 2014-05-12 16:14 UTC (permalink / raw)
To: Mateo Lozano, Oscar; +Cc: intel-gfx@lists.freedesktop.org
On Mon, May 12, 2014 at 06:11:18PM +0200, Daniel Vetter wrote:
> On Mon, May 12, 2014 at 09:05:45AM +0000, Mateo Lozano, Oscar wrote:
> > Hi Daniel,
> >
> > Sorry, this fell through the cracks:
> >
> > > Subject: Re: [Intel-gfx] [PATCH] drm/i915: Gracefully handle obj not bound to
> > > GGTT in is_pin_display
> > >
> > > On Wed, Apr 02, 2014 at 07:21:01PM +0100, oscar.mateo@intel.com wrote:
> > > > From: Oscar Mateo <oscar.mateo@intel.com>
> > > >
> > > > Otherwise, we do a NULL pointer dereference.
> > > >
> > > > I've seen this happen while handling an error in
> > > > i915_gem_object_pin_to_display_plane():
> > > >
> > > > If i915_gem_object_set_cache_level() fails, we call is_pin_display()
> > > > to handle the error. At this point, the object is still not pinned to
> > > > GGTT and maybe not even bound, so we have to check before we
> > > > dereference its GGTT vma.
> > > >
> > > > Issue: VIZ-3772
> > > > Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> > >
> > > Have you looked into provoking this with an igt testcase? On a hunch a busy
> > > load (to extend the race window) plus the usual interruptor trick to jump out of
> > > wait_seqno calls should be able to make this go kaboom on command. But I
> > > haven't analyzed the bug in detail.
> >
> > AFAICT, the only sequence where this likely to happen (because we are handling a recently created object) is:
> >
> > intelfb_alloc -> intel_pin_and_fence_fb_obj -> i915_gem_object_pin_to_display_plane -> i915_gem_object_set_cache_level -> is_pin_display
>
> Pageflipping to a freshly allocated BO without ever touching it beforehand
> should be able to achive the same. If this is really all that's needed.
>
> But looking at the code a better way should be:
> 1. Create new bo, wrap it in a kms fb.
> 2. Slap busy load onto that bo, e.g. reapeatedly fill it with the blitter.
> 3. Enable evil interruptor (igt_fork_signal_helper).
> 4. Submit pageflip
>
> -> Boom since the set_cache_level will block, get interrupted and exit
> early with -EINTR.
>
> Given sufficient overkill in 2. this should be 100% reliable to reproduce.
Aside: Those kinds of tricks are a big reason why I think igts aren't just
useful as testcases, but also to really understand how a bug comes about.
At least ime finally figuring out the last ingredient to make an igt fully
reliably often resulted in a suddenly much clearer understanding of the
bug at hand.
I call this "review by asking for an igt" ;-)
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display
2014-05-12 16:14 ` Daniel Vetter
@ 2014-05-12 17:10 ` Mateo Lozano, Oscar
2014-05-15 13:14 ` Mateo Lozano, Oscar
1 sibling, 0 replies; 18+ messages in thread
From: Mateo Lozano, Oscar @ 2014-05-12 17:10 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx@lists.freedesktop.org
> I call this "review by asking for an igt" ;-) -Daniel
Ok, I´ll give it a try. At least I will learn something about the kms code, a.k.a. "learning by igt" :D
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display
2014-05-12 16:14 ` Daniel Vetter
2014-05-12 17:10 ` Mateo Lozano, Oscar
@ 2014-05-15 13:14 ` Mateo Lozano, Oscar
2014-05-15 13:33 ` Ville Syrjälä
2014-05-15 13:45 ` Daniel Vetter
1 sibling, 2 replies; 18+ messages in thread
From: Mateo Lozano, Oscar @ 2014-05-15 13:14 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx@lists.freedesktop.org
> > But looking at the code a better way should be:
> > 1. Create new bo, wrap it in a kms fb.
> > 2. Slap busy load onto that bo, e.g. reapeatedly fill it with the blitter.
> > 3. Enable evil interruptor (igt_fork_signal_helper).
> > 4. Submit pageflip
> >
> > -> Boom since the set_cache_level will block, get interrupted and exit
> > early with -EINTR.
> >
> > Given sufficient overkill in 2. this should be 100% reliable to reproduce.
As soon as I execbuffer to the bo, it gets a vma for the GGTT vm:
vm = ctx->vm;
if (!USES_FULL_PPGTT(dev))
vm = &dev_priv->gtt.base;
...
/* Look up object handles */
ret = eb_lookup_vmas(eb, exec, args, vm, file);
if (ret)
goto err;
And then it becomes impossible to reproduce the problem :(
Is there any other trick to make set_cache_level fail?
-- Oscar
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display
2014-05-15 13:14 ` Mateo Lozano, Oscar
@ 2014-05-15 13:33 ` Ville Syrjälä
2014-05-16 10:43 ` Mateo Lozano, Oscar
2014-05-15 13:45 ` Daniel Vetter
1 sibling, 1 reply; 18+ messages in thread
From: Ville Syrjälä @ 2014-05-15 13:33 UTC (permalink / raw)
To: Mateo Lozano, Oscar; +Cc: intel-gfx@lists.freedesktop.org
On Thu, May 15, 2014 at 01:14:54PM +0000, Mateo Lozano, Oscar wrote:
> > > But looking at the code a better way should be:
> > > 1. Create new bo, wrap it in a kms fb.
> > > 2. Slap busy load onto that bo, e.g. reapeatedly fill it with the blitter.
> > > 3. Enable evil interruptor (igt_fork_signal_helper).
> > > 4. Submit pageflip
> > >
> > > -> Boom since the set_cache_level will block, get interrupted and exit
> > > early with -EINTR.
> > >
> > > Given sufficient overkill in 2. this should be 100% reliable to reproduce.
>
> As soon as I execbuffer to the bo, it gets a vma for the GGTT vm:
>
> vm = ctx->vm;
> if (!USES_FULL_PPGTT(dev))
> vm = &dev_priv->gtt.base;
>
> ...
>
> /* Look up object handles */
> ret = eb_lookup_vmas(eb, exec, args, vm, file);
> if (ret)
> goto err;
>
> And then it becomes impossible to reproduce the problem :(
> Is there any other trick to make set_cache_level fail?
What if you make the pin_to_ggtt fail instead? Can you create an object
that's too big for the ggtt?
--
Ville Syrjälä
Intel OTC
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display
2014-05-15 13:14 ` Mateo Lozano, Oscar
2014-05-15 13:33 ` Ville Syrjälä
@ 2014-05-15 13:45 ` Daniel Vetter
1 sibling, 0 replies; 18+ messages in thread
From: Daniel Vetter @ 2014-05-15 13:45 UTC (permalink / raw)
To: Mateo Lozano, Oscar; +Cc: intel-gfx@lists.freedesktop.org
On Thu, May 15, 2014 at 01:14:54PM +0000, Mateo Lozano, Oscar wrote:
> > > But looking at the code a better way should be:
> > > 1. Create new bo, wrap it in a kms fb.
> > > 2. Slap busy load onto that bo, e.g. reapeatedly fill it with the blitter.
> > > 3. Enable evil interruptor (igt_fork_signal_helper).
> > > 4. Submit pageflip
> > >
> > > -> Boom since the set_cache_level will block, get interrupted and exit
> > > early with -EINTR.
> > >
> > > Given sufficient overkill in 2. this should be 100% reliable to reproduce.
>
> As soon as I execbuffer to the bo, it gets a vma for the GGTT vm:
>
> vm = ctx->vm;
> if (!USES_FULL_PPGTT(dev))
> vm = &dev_priv->gtt.base;
>
> ...
>
> /* Look up object handles */
> ret = eb_lookup_vmas(eb, exec, args, vm, file);
> if (ret)
> goto err;
>
> And then it becomes impossible to reproduce the problem :(
> Is there any other trick to make set_cache_level fail?
i915.ppgtt=2 should still make this blow up. The bug kinda doesn't exist
without full ppgtt I think ...
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH] drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display
2014-05-15 13:33 ` Ville Syrjälä
@ 2014-05-16 10:43 ` Mateo Lozano, Oscar
0 siblings, 0 replies; 18+ messages in thread
From: Mateo Lozano, Oscar @ 2014-05-16 10:43 UTC (permalink / raw)
To: Ville Syrjälä; +Cc: intel-gfx@lists.freedesktop.org
> -----Original Message-----
> From: Ville Syrjälä [mailto:ville.syrjala@linux.intel.com]
> Sent: Thursday, May 15, 2014 2:34 PM
> To: Mateo Lozano, Oscar
> Cc: Daniel Vetter; intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH] drm/i915: Gracefully handle obj not bound to
> GGTT in is_pin_display
>
> On Thu, May 15, 2014 at 01:14:54PM +0000, Mateo Lozano, Oscar wrote:
> > > > But looking at the code a better way should be:
> > > > 1. Create new bo, wrap it in a kms fb.
> > > > 2. Slap busy load onto that bo, e.g. reapeatedly fill it with the blitter.
> > > > 3. Enable evil interruptor (igt_fork_signal_helper).
> > > > 4. Submit pageflip
> > > >
> > > > -> Boom since the set_cache_level will block, get interrupted and
> > > > -> exit
> > > > early with -EINTR.
> > > >
> > > > Given sufficient overkill in 2. this should be 100% reliable to reproduce.
> >
> > As soon as I execbuffer to the bo, it gets a vma for the GGTT vm:
> >
> > vm = ctx->vm;
> > if (!USES_FULL_PPGTT(dev))
> > vm = &dev_priv->gtt.base;
> >
> > ...
> >
> > /* Look up object handles */
> > ret = eb_lookup_vmas(eb, exec, args, vm, file);
> > if (ret)
> > goto err;
> >
> > And then it becomes impossible to reproduce the problem :( Is there
> > any other trick to make set_cache_level fail?
>
> What if you make the pin_to_ggtt fail instead? Can you create an object that's
> too big for the ggtt?
Thanks Ville: that worked like a charm ;)
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v2] drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display
2014-04-02 18:21 [PATCH] drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display oscar.mateo
2014-04-02 17:59 ` Chris Wilson
2014-04-03 9:34 ` Daniel Vetter
@ 2014-05-16 11:08 ` oscar.mateo
2014-05-16 11:26 ` Chris Wilson
2014-05-16 13:20 ` [PATCH v3] " oscar.mateo
2 siblings, 2 replies; 18+ messages in thread
From: oscar.mateo @ 2014-05-16 11:08 UTC (permalink / raw)
To: intel-gfx
From: Oscar Mateo <oscar.mateo@intel.com>
Otherwise, we do a NULL pointer dereference.
I've seen this happen while handling an error in
i915_gem_object_pin_to_display_plane():
If i915_gem_object_set_cache_level() fails, we call is_pin_display()
to handle the error. At this point, the object is still not pinned
to GGTT and maybe not even bound, so we have to check before we
dereference its GGTT vma.
v2: Chris Wilson says restoring the old value is easier, but that
is_pin_display is useful as a theory of operation. Take the solomonic
decision: at least this way is_pin_display is a little more robust
(until Chris can kill it off).
Issue: VIZ-3772
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
drivers/gpu/drm/i915/i915_gem.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 034ba2c..211b778 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3641,6 +3641,15 @@ unlock:
static bool is_pin_display(struct drm_i915_gem_object *obj)
{
+ struct i915_vma *vma;
+
+ if (list_empty(&obj->vma_list))
+ return false;
+
+ vma = i915_gem_obj_to_ggtt(obj);
+ if (!vma)
+ return false;
+
/* There are 3 sources that pin objects:
* 1. The display engine (scanouts, sprites, cursors);
* 2. Reservations for execbuffer;
@@ -3652,7 +3661,7 @@ static bool is_pin_display(struct drm_i915_gem_object *obj)
* subtracting the potential reference by the user, any pin_count
* remains, it must be due to another use by the display engine.
*/
- return i915_gem_obj_to_ggtt(obj)->pin_count - !!obj->user_pin_count;
+ return vma->pin_count - !!obj->user_pin_count;
}
/*
@@ -3666,6 +3675,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
struct intel_ring_buffer *pipelined)
{
u32 old_read_domains, old_write_domain;
+ bool was_pin_display;
int ret;
if (pipelined != obj->ring) {
@@ -3677,6 +3687,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
/* Mark the pin_display early so that we account for the
* display coherency whilst setting up the cache domains.
*/
+ was_pin_display = obj->pin_display;
obj->pin_display = true;
/* The display engine is not coherent with the LLC cache on gen6. As
@@ -3719,7 +3730,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
return 0;
err_unpin_display:
- obj->pin_display = is_pin_display(obj);
+ WARN_ON(was_pin_display != is_pin_display(obj));
+ obj->pin_display = was_pin_display;
return ret;
}
--
1.9.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v2] drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display
2014-05-16 11:08 ` [PATCH v2] " oscar.mateo
@ 2014-05-16 11:26 ` Chris Wilson
2014-05-16 13:20 ` [PATCH v3] " oscar.mateo
1 sibling, 0 replies; 18+ messages in thread
From: Chris Wilson @ 2014-05-16 11:26 UTC (permalink / raw)
To: oscar.mateo; +Cc: intel-gfx
On Fri, May 16, 2014 at 12:08:22PM +0100, oscar.mateo@intel.com wrote:
> From: Oscar Mateo <oscar.mateo@intel.com>
>
> Otherwise, we do a NULL pointer dereference.
>
> I've seen this happen while handling an error in
> i915_gem_object_pin_to_display_plane():
>
> If i915_gem_object_set_cache_level() fails, we call is_pin_display()
> to handle the error. At this point, the object is still not pinned
> to GGTT and maybe not even bound, so we have to check before we
> dereference its GGTT vma.
>
> v2: Chris Wilson says restoring the old value is easier, but that
> is_pin_display is useful as a theory of operation. Take the solomonic
> decision: at least this way is_pin_display is a little more robust
> (until Chris can kill it off).
>
> Issue: VIZ-3772
I heard you wrote a testcase?
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem.c | 16 ++++++++++++++--
> 1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 034ba2c..211b778 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3641,6 +3641,15 @@ unlock:
>
> static bool is_pin_display(struct drm_i915_gem_object *obj)
> {
> + struct i915_vma *vma;
> +
> + if (list_empty(&obj->vma_list))
> + return false;
Hmm, this is so that we don't trigger the WARN from inside
i915_gem_obj_to_ggtt(). I would say that means the WARN in the callee
has outlived its usefulness. Other callers WARN if they fail to find the
ggtt_vma they expect, so I think we can just drop the WARN and save the
duplication here.
> +
> + vma = i915_gem_obj_to_ggtt(obj);
> + if (!vma)
> + return false;
> +
> /* There are 3 sources that pin objects:
> * 1. The display engine (scanouts, sprites, cursors);
> * 2. Reservations for execbuffer;
> @@ -3652,7 +3661,7 @@ static bool is_pin_display(struct drm_i915_gem_object *obj)
> * subtracting the potential reference by the user, any pin_count
> * remains, it must be due to another use by the display engine.
> */
> - return i915_gem_obj_to_ggtt(obj)->pin_count - !!obj->user_pin_count;
> + return vma->pin_count - !!obj->user_pin_count;
> }
>
> /*
> @@ -3666,6 +3675,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> struct intel_ring_buffer *pipelined)
> {
> u32 old_read_domains, old_write_domain;
> + bool was_pin_display;
> int ret;
>
> if (pipelined != obj->ring) {
> @@ -3677,6 +3687,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> /* Mark the pin_display early so that we account for the
> * display coherency whilst setting up the cache domains.
> */
> + was_pin_display = obj->pin_display;
> obj->pin_display = true;
>
> /* The display engine is not coherent with the LLC cache on gen6. As
> @@ -3719,7 +3730,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> return 0;
>
> err_unpin_display:
> - obj->pin_display = is_pin_display(obj);
> + WARN_ON(was_pin_display != is_pin_display(obj));
> + obj->pin_display = was_pin_display;
> return ret;
> }
Ok, this looks like a useful check.
Other than the debate over the placement of the WARN() in
i915_gem_obj_to_ggtt() (maybe leave a comment here to remind us to drop
the WARN and the check later?),
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3] drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display
2014-05-16 11:08 ` [PATCH v2] " oscar.mateo
2014-05-16 11:26 ` Chris Wilson
@ 2014-05-16 13:20 ` oscar.mateo
2014-05-16 14:25 ` Daniel Vetter
1 sibling, 1 reply; 18+ messages in thread
From: oscar.mateo @ 2014-05-16 13:20 UTC (permalink / raw)
To: intel-gfx
From: Oscar Mateo <oscar.mateo@intel.com>
Otherwise, we do a NULL pointer dereference.
I've seen this happen while handling an error in
i915_gem_object_pin_to_display_plane():
If i915_gem_object_set_cache_level() fails, we call is_pin_display()
to handle the error. At this point, the object is still not pinned
to GGTT and maybe not even bound, so we have to check before we
dereference its GGTT vma.
The IGT kms_flip/bo-too-big tests for this bug.
v2: Chris Wilson says restoring the old value is easier, but that
is_pin_display is useful as a theory of operation. Take the solomonic
decision: at least this way is_pin_display is a little more robust
(until Chris can kill it off).
v3: Chris suggests the WARN in i915_gem_obj_to_ggtt has outlived its
usefulness: add a reminder to remove it.
Issue: VIZ-3772
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
drivers/gpu/drm/i915/i915_gem.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 034ba2c..f6c0351 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3641,6 +3641,15 @@ unlock:
static bool is_pin_display(struct drm_i915_gem_object *obj)
{
+ struct i915_vma *vma;
+
+ if (list_empty(&obj->vma_list))
+ return false;
+
+ vma = i915_gem_obj_to_ggtt(obj);
+ if (!vma)
+ return false;
+
/* There are 3 sources that pin objects:
* 1. The display engine (scanouts, sprites, cursors);
* 2. Reservations for execbuffer;
@@ -3652,7 +3661,7 @@ static bool is_pin_display(struct drm_i915_gem_object *obj)
* subtracting the potential reference by the user, any pin_count
* remains, it must be due to another use by the display engine.
*/
- return i915_gem_obj_to_ggtt(obj)->pin_count - !!obj->user_pin_count;
+ return vma->pin_count - !!obj->user_pin_count;
}
/*
@@ -3666,6 +3675,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
struct intel_ring_buffer *pipelined)
{
u32 old_read_domains, old_write_domain;
+ bool was_pin_display;
int ret;
if (pipelined != obj->ring) {
@@ -3677,6 +3687,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
/* Mark the pin_display early so that we account for the
* display coherency whilst setting up the cache domains.
*/
+ was_pin_display = obj->pin_display;
obj->pin_display = true;
/* The display engine is not coherent with the LLC cache on gen6. As
@@ -3719,7 +3730,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
return 0;
err_unpin_display:
- obj->pin_display = is_pin_display(obj);
+ WARN_ON(was_pin_display != is_pin_display(obj));
+ obj->pin_display = was_pin_display;
return ret;
}
@@ -5115,6 +5127,9 @@ struct i915_vma *i915_gem_obj_to_ggtt(struct drm_i915_gem_object *obj)
{
struct i915_vma *vma;
+ /* This WARN has probably outlived its usefulness (callers already
+ * WARN if they don't find the GGTT vma they expect). When removing,
+ * remember to remove the pre-check in is_pin_display() as well */
if (WARN_ON(list_empty(&obj->vma_list)))
return NULL;
--
1.9.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v3] drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display
2014-05-16 13:20 ` [PATCH v3] " oscar.mateo
@ 2014-05-16 14:25 ` Daniel Vetter
0 siblings, 0 replies; 18+ messages in thread
From: Daniel Vetter @ 2014-05-16 14:25 UTC (permalink / raw)
To: oscar.mateo; +Cc: intel-gfx
On Fri, May 16, 2014 at 02:20:43PM +0100, oscar.mateo@intel.com wrote:
> From: Oscar Mateo <oscar.mateo@intel.com>
>
> Otherwise, we do a NULL pointer dereference.
>
> I've seen this happen while handling an error in
> i915_gem_object_pin_to_display_plane():
>
> If i915_gem_object_set_cache_level() fails, we call is_pin_display()
> to handle the error. At this point, the object is still not pinned
> to GGTT and maybe not even bound, so we have to check before we
> dereference its GGTT vma.
>
> The IGT kms_flip/bo-too-big tests for this bug.
>
> v2: Chris Wilson says restoring the old value is easier, but that
> is_pin_display is useful as a theory of operation. Take the solomonic
> decision: at least this way is_pin_display is a little more robust
> (until Chris can kill it off).
>
> v3: Chris suggests the WARN in i915_gem_obj_to_ggtt has outlived its
> usefulness: add a reminder to remove it.
>
> Issue: VIZ-3772
> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Queued for -next, thanks for the patch.
-Daniel
> ---
> drivers/gpu/drm/i915/i915_gem.c | 19 +++++++++++++++++--
> 1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 034ba2c..f6c0351 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3641,6 +3641,15 @@ unlock:
>
> static bool is_pin_display(struct drm_i915_gem_object *obj)
> {
> + struct i915_vma *vma;
> +
> + if (list_empty(&obj->vma_list))
> + return false;
> +
> + vma = i915_gem_obj_to_ggtt(obj);
> + if (!vma)
> + return false;
> +
> /* There are 3 sources that pin objects:
> * 1. The display engine (scanouts, sprites, cursors);
> * 2. Reservations for execbuffer;
> @@ -3652,7 +3661,7 @@ static bool is_pin_display(struct drm_i915_gem_object *obj)
> * subtracting the potential reference by the user, any pin_count
> * remains, it must be due to another use by the display engine.
> */
> - return i915_gem_obj_to_ggtt(obj)->pin_count - !!obj->user_pin_count;
> + return vma->pin_count - !!obj->user_pin_count;
> }
>
> /*
> @@ -3666,6 +3675,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> struct intel_ring_buffer *pipelined)
> {
> u32 old_read_domains, old_write_domain;
> + bool was_pin_display;
> int ret;
>
> if (pipelined != obj->ring) {
> @@ -3677,6 +3687,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> /* Mark the pin_display early so that we account for the
> * display coherency whilst setting up the cache domains.
> */
> + was_pin_display = obj->pin_display;
> obj->pin_display = true;
>
> /* The display engine is not coherent with the LLC cache on gen6. As
> @@ -3719,7 +3730,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
> return 0;
>
> err_unpin_display:
> - obj->pin_display = is_pin_display(obj);
> + WARN_ON(was_pin_display != is_pin_display(obj));
> + obj->pin_display = was_pin_display;
> return ret;
> }
>
> @@ -5115,6 +5127,9 @@ struct i915_vma *i915_gem_obj_to_ggtt(struct drm_i915_gem_object *obj)
> {
> struct i915_vma *vma;
>
> + /* This WARN has probably outlived its usefulness (callers already
> + * WARN if they don't find the GGTT vma they expect). When removing,
> + * remember to remove the pre-check in is_pin_display() as well */
> if (WARN_ON(list_empty(&obj->vma_list)))
> return NULL;
>
> --
> 1.9.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2014-05-16 14:25 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-04-02 18:21 [PATCH] drm/i915: Gracefully handle obj not bound to GGTT in is_pin_display oscar.mateo
2014-04-02 17:59 ` Chris Wilson
2014-04-03 9:34 ` Daniel Vetter
2014-05-12 9:05 ` Mateo Lozano, Oscar
2014-05-12 10:09 ` Chris Wilson
2014-05-12 10:30 ` Mateo Lozano, Oscar
2014-05-12 10:37 ` Chris Wilson
2014-05-12 16:11 ` Daniel Vetter
2014-05-12 16:14 ` Daniel Vetter
2014-05-12 17:10 ` Mateo Lozano, Oscar
2014-05-15 13:14 ` Mateo Lozano, Oscar
2014-05-15 13:33 ` Ville Syrjälä
2014-05-16 10:43 ` Mateo Lozano, Oscar
2014-05-15 13:45 ` Daniel Vetter
2014-05-16 11:08 ` [PATCH v2] " oscar.mateo
2014-05-16 11:26 ` Chris Wilson
2014-05-16 13:20 ` [PATCH v3] " oscar.mateo
2014-05-16 14:25 ` Daniel Vetter
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox