All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: Matthew Auld <matthew.auld@intel.com>, intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH v2 1/6] drm/i915: move the pre_pin earlier
Date: Thu, 18 Nov 2021 07:57:10 +0100	[thread overview]
Message-ID: <ea91cc10df3f0a5463566c1eea00ff044efe0db4.camel@linux.intel.com> (raw)
In-Reply-To: <9ea7ae8e-c9c4-8b1e-2057-5be69eb35555@linux.intel.com>

On Wed, 2021-11-17 at 19:49 +0100, Thomas Hellström wrote:
> 
> On 11/17/21 15:20, Matthew Auld wrote:
> > In intel_context_do_pin_ww, when calling into the pre_pin
> > hook(which is
> > passed the ww context) it could in theory return -EDEADLK(which is
> > very
> > likely with debug kernels), once we start adding more ww locking in
> > there,
> > like in the next patch. If so then we need to be mindful of having
> > to
> > restart the do_pin at this point.
> > 
> > If this is the kernel_context, or some other early in-kernel
> > context
> > where we have yet to setup the default_state, then we always
> > inhibit the
> > context restore, and instead rely on the delayed active_release to
> > set
> > the CONTEXT_VALID_BIT for us(if we even care), which should
> > indicate
> > that we have context switched away, and that our newly saved
> > context
> > state should now be valid. However, since we currently grab the
> > active
> > reference before the potential ww dance, we can end up setting the
> > CONTEXT_VALID_BIT much too early, if we need to backoff, and then
> > upon
> > re-trying the do_pin, we could potentially cause the hardware to
> > incorrectly load some garbage context state when later context
> > switching
> > to that context, but at the very least this will trigger the
> > GEM_BUG_ON() in __engine_unpark. For now let's just move any ww
> > dance
> > stuff prior to arming the active reference.
> > 
> > For normal user contexts this shouldn't be a concern, since we
> > should
> > already have the default_state ready when initialising the lrc
> > state,
> > and so there should be no concern with active_release somehow
> > prematurely setting the CONTEXT_VALID_BIT.
> > 
> > v2(Thomas):
> >    - Also re-order the union unwind

Oh should this be 

s/union/onion/ ?


> > 
> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> 
> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> 
> 
> > ---
> >   drivers/gpu/drm/i915/gt/intel_context.c | 12 ++++++------
> >   1 file changed, 6 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context.c
> > b/drivers/gpu/drm/i915/gt/intel_context.c
> > index 5634d14052bc..4c296de1d67d 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> > @@ -228,17 +228,17 @@ int __intel_context_do_pin_ww(struct
> > intel_context *ce,
> >         if (err)
> >                 return err;
> >   
> > -       err = i915_active_acquire(&ce->active);
> > +       err = ce->ops->pre_pin(ce, ww, &vaddr);
> >         if (err)
> >                 goto err_ctx_unpin;
> >   
> > -       err = ce->ops->pre_pin(ce, ww, &vaddr);
> > +       err = i915_active_acquire(&ce->active);
> >         if (err)
> > -               goto err_release;
> > +               goto err_post_unpin;
> >   
> >         err = mutex_lock_interruptible(&ce->pin_mutex);
> >         if (err)
> > -               goto err_post_unpin;
> > +               goto err_release;
> >   
> >         intel_engine_pm_might_get(ce->engine);
> >   
> > @@ -273,11 +273,11 @@ int __intel_context_do_pin_ww(struct
> > intel_context *ce,
> >   
> >   err_unlock:
> >         mutex_unlock(&ce->pin_mutex);
> > +err_release:
> > +       i915_active_release(&ce->active);
> >   err_post_unpin:
> >         if (!handoff)
> >                 ce->ops->post_unpin(ce);
> > -err_release:
> > -       i915_active_release(&ce->active);
> >   err_ctx_unpin:
> >         intel_context_post_unpin(ce);
> >   



WARNING: multiple messages have this Message-ID (diff)
From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: Matthew Auld <matthew.auld@intel.com>, intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: Re: [PATCH v2 1/6] drm/i915: move the pre_pin earlier
Date: Thu, 18 Nov 2021 07:57:10 +0100	[thread overview]
Message-ID: <ea91cc10df3f0a5463566c1eea00ff044efe0db4.camel@linux.intel.com> (raw)
In-Reply-To: <9ea7ae8e-c9c4-8b1e-2057-5be69eb35555@linux.intel.com>

On Wed, 2021-11-17 at 19:49 +0100, Thomas Hellström wrote:
> 
> On 11/17/21 15:20, Matthew Auld wrote:
> > In intel_context_do_pin_ww, when calling into the pre_pin
> > hook(which is
> > passed the ww context) it could in theory return -EDEADLK(which is
> > very
> > likely with debug kernels), once we start adding more ww locking in
> > there,
> > like in the next patch. If so then we need to be mindful of having
> > to
> > restart the do_pin at this point.
> > 
> > If this is the kernel_context, or some other early in-kernel
> > context
> > where we have yet to setup the default_state, then we always
> > inhibit the
> > context restore, and instead rely on the delayed active_release to
> > set
> > the CONTEXT_VALID_BIT for us(if we even care), which should
> > indicate
> > that we have context switched away, and that our newly saved
> > context
> > state should now be valid. However, since we currently grab the
> > active
> > reference before the potential ww dance, we can end up setting the
> > CONTEXT_VALID_BIT much too early, if we need to backoff, and then
> > upon
> > re-trying the do_pin, we could potentially cause the hardware to
> > incorrectly load some garbage context state when later context
> > switching
> > to that context, but at the very least this will trigger the
> > GEM_BUG_ON() in __engine_unpark. For now let's just move any ww
> > dance
> > stuff prior to arming the active reference.
> > 
> > For normal user contexts this shouldn't be a concern, since we
> > should
> > already have the default_state ready when initialising the lrc
> > state,
> > and so there should be no concern with active_release somehow
> > prematurely setting the CONTEXT_VALID_BIT.
> > 
> > v2(Thomas):
> >    - Also re-order the union unwind

Oh should this be 

s/union/onion/ ?


> > 
> > Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> > Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> 
> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> 
> 
> > ---
> >   drivers/gpu/drm/i915/gt/intel_context.c | 12 ++++++------
> >   1 file changed, 6 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context.c
> > b/drivers/gpu/drm/i915/gt/intel_context.c
> > index 5634d14052bc..4c296de1d67d 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> > @@ -228,17 +228,17 @@ int __intel_context_do_pin_ww(struct
> > intel_context *ce,
> >         if (err)
> >                 return err;
> >   
> > -       err = i915_active_acquire(&ce->active);
> > +       err = ce->ops->pre_pin(ce, ww, &vaddr);
> >         if (err)
> >                 goto err_ctx_unpin;
> >   
> > -       err = ce->ops->pre_pin(ce, ww, &vaddr);
> > +       err = i915_active_acquire(&ce->active);
> >         if (err)
> > -               goto err_release;
> > +               goto err_post_unpin;
> >   
> >         err = mutex_lock_interruptible(&ce->pin_mutex);
> >         if (err)
> > -               goto err_post_unpin;
> > +               goto err_release;
> >   
> >         intel_engine_pm_might_get(ce->engine);
> >   
> > @@ -273,11 +273,11 @@ int __intel_context_do_pin_ww(struct
> > intel_context *ce,
> >   
> >   err_unlock:
> >         mutex_unlock(&ce->pin_mutex);
> > +err_release:
> > +       i915_active_release(&ce->active);
> >   err_post_unpin:
> >         if (!handoff)
> >                 ce->ops->post_unpin(ce);
> > -err_release:
> > -       i915_active_release(&ce->active);
> >   err_ctx_unpin:
> >         intel_context_post_unpin(ce);
> >   



  reply	other threads:[~2021-11-18  6:57 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-17 14:20 [Intel-gfx] [PATCH v2 1/6] drm/i915: move the pre_pin earlier Matthew Auld
2021-11-17 14:20 ` Matthew Auld
2021-11-17 14:20 ` [Intel-gfx] [PATCH v2 2/6] drm/i915: Create a dummy object for gen6 ppgtt Matthew Auld
2021-11-17 14:20   ` Matthew Auld
2021-11-17 14:20 ` [Intel-gfx] [PATCH v2 3/6] drm/i915: Create a full object for mock_ring, v2 Matthew Auld
2021-11-17 14:20   ` Matthew Auld
2021-11-17 14:20 ` [Intel-gfx] [PATCH v2 4/6] drm/i915: vma is always backed by an object Matthew Auld
2021-11-17 14:20   ` Matthew Auld
2021-11-17 14:20 ` [Intel-gfx] [PATCH v2 5/6] drm/i915: Remove resv from i915_vma Matthew Auld
2021-11-17 14:20   ` Matthew Auld
2021-11-17 14:20 ` [Intel-gfx] [PATCH v2 6/6] drm/i915: Drain the ttm delayed workqueue too Matthew Auld
2021-11-17 14:20   ` Matthew Auld
2021-11-17 18:49 ` [Intel-gfx] [PATCH v2 1/6] drm/i915: move the pre_pin earlier Thomas Hellström
2021-11-17 18:49   ` Thomas Hellström
2021-11-18  6:57   ` Thomas Hellström [this message]
2021-11-18  6:57     ` Thomas Hellström
2021-11-18  9:58     ` [Intel-gfx] " Matthew Auld
2021-11-18  9:58       ` Matthew Auld
2021-11-17 19:23 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for series starting with [v2,1/6] " Patchwork
2021-11-17 19:53 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2021-11-18 10:28   ` Matthew Auld
2021-11-18 16:31     ` Vudum, Lakshminarayana
2021-11-18 16:06 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for series starting with [v2,1/6] drm/i915: move the pre_pin earlier (rev2) Patchwork
2021-11-18 16:36 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2021-11-19 10:27   ` Matthew Auld
2021-11-19 17:05     ` Vudum, Lakshminarayana
2021-11-19 10:30 ` [Intel-gfx] ✗ Fi.CI.SPARSE: warning for series starting with [v2,1/6] drm/i915: move the pre_pin earlier (rev3) Patchwork
2021-11-19 10:58 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2021-11-19 13:53 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
2021-11-19 15:30   ` Matthew Auld
2021-11-19 16:58 ` [Intel-gfx] ✓ Fi.CI.IGT: success " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ea91cc10df3f0a5463566c1eea00ff044efe0db4.camel@linux.intel.com \
    --to=thomas.hellstrom@linux.intel.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=matthew.auld@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.