public inbox for intel-gfx@lists.freedesktop.org
From: Yu Dai <yu.dai@intel.com>
To: Daniel Vetter <daniel@ffwll.ch>, Imre Deak <imre.deak@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/i915/guc: Move wait for GuC out of spinlock/unlock
Date: Tue, 24 Nov 2015 10:36:54 -0800	[thread overview]
Message-ID: <5654AE46.1010903@intel.com> (raw)
In-Reply-To: <20151124180813.GC17050@phenom.ffwll.local>



On 11/24/2015 10:08 AM, Daniel Vetter wrote:
> On Tue, Nov 24, 2015 at 07:05:47PM +0200, Imre Deak wrote:
> > On ti, 2015-11-24 at 09:00 -0800, Yu Dai wrote:
> > >
> > > On 11/24/2015 05:26 AM, Imre Deak wrote:
> > > > On ti, 2015-11-24 at 14:04 +0100, Daniel Vetter wrote:
> > > > > On Mon, Nov 23, 2015 at 03:02:58PM -0800, yu.dai@intel.com wrote:
> > > > > > From: Alex Dai <yu.dai@intel.com>
> > > > > >
> > > > > > When the GuC work queue is full, the driver waits for the GuC to
> > > > > > free space by delaying 1ms per attempt. The wait needs to be outside
> > > > > > the spin_lock_irqsave/unlock pair; otherwise a lockup happens
> > > > > > because jiffies is not updated while interrupts are disabled.
> > > > > >
> > > > > > The issue was found by igt/gem_close_race.
> > > > > >
> > > > > > Signed-off-by: Alex Dai <yu.dai@intel.com>
> > > > > > ---
> > > > > >  drivers/gpu/drm/i915/i915_guc_submission.c | 27 +++++++++++++++++----------
> > > > > >  1 file changed, 17 insertions(+), 10 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> > > > > > index 0a6b007..1418397 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> > > > > > +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> > > > > > @@ -201,10 +201,13 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
> > > > > >  	union guc_doorbell_qw *db;
> > > > > >  	void *base;
> > > > > >  	int attempt = 2, ret = -EAGAIN;
> > > > > > +	unsigned long flags;
> > > > > >
> > > > > >  	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
> > > > >
> > > > > We don't need kmap_atomic anymore here now, since it's outside of
> > > > > the spinlock.
> > > > >
> > > > > >  	desc = base + gc->proc_desc_offset;
> > > > > >
> > > > > > +	spin_lock_irqsave(&gc->wq_lock, flags);
> > > > >
> > > > > Please don't use the super-generic _irqsave. It's expensive and
> > > > > results in fragile code when someone accidentally reuses something
> > > > > in an interrupt handler that was never meant to run in that context.
> > > > >
> > > > > Instead please use the most specific function:
> > > > > - spin_lock if you know you are in irq context.
> > > > > - spin_lock_irq if you know you are not.
> > > >
> > > > Right, and simply spin_lock() if the lock is not ever taken in IRQ
> > > > context.
> > >
> > > This is not in IRQ context. So I will use spin_lock_irq instead.
> >
> > You can just use spin_lock(). spin_lock_irq() only makes sense if you
> > take the lock in IRQ context too, which is not the case.
>
> Imo just drop both spinlocks, adding locks for debugfs is overkill imo.
>
How about using mutex_lock_interruptible(&dev->struct_mutex) in debugfs
instead, to replace the host2guc lock?

A spinlock around ringing the doorbell is still needed.

Alex
> >
> > > > > - spin_lock_irqsave should be a big warning sign that your code has
> > > > >   layering issues.
> > > > >
> > > > > Please audit the entire guc code for the above two issues.
> > > >
> > > > Agreed, it looks inconsistent atm: we do spin_lock(wq_lock) from
> > > > debugfs and spin_lock_irq(wq_lock) from i915_guc_submit(). Neither of
> > > > them is called from IRQ context AFAICS, in which case a simple
> > > > spin_lock() would do.
> > > >
> > > > --Imre
> > > >
> > > > > > +
> > > > > >  	/* Update the tail so it is visible to GuC */
> > > > > >  	desc->tail = gc->wq_tail;
> > > > > >
> > > > > > @@ -248,7 +251,10 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
> > > > > >  			db_exc.cookie = 1;
> > > > > >  	}
> > > > > >
> > > > > > +	spin_unlock_irqrestore(&gc->wq_lock, flags);
> > > > > > +
> > > > > >  	kunmap_atomic(base);
> > > > > > +
> > > > > >  	return ret;
> > > > > >  }
> > > > > >
> > > > > > @@ -487,16 +493,16 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
> > > > > >  	struct guc_process_desc *desc;
> > > > > >  	void *base;
> > > > > >  	u32 size = sizeof(struct guc_wq_item);
> > > > > > -	int ret = 0, timeout_counter = 200;
> > > > > > +	int ret = -ETIMEDOUT, timeout_counter = 200;
> > > > > > +	unsigned long flags;
> > > > > >
> > > > > >  	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
> > > > > >  	desc = base + gc->proc_desc_offset;
> > > > > >
> > > > > >  	while (timeout_counter-- > 0) {
> > > > > > -		ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head,
> > > > > > -				gc->wq_size) >= size, 1);
> > > > > > +		spin_lock_irqsave(&gc->wq_lock, flags);
> > > > > >
> > > > > > -		if (!ret) {
> > > > > > +		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
> > > > > >  			*offset = gc->wq_tail;
> > > > > >
> > > > > >  			/* advance the tail for next workqueue item */
> > > > > > @@ -505,7 +511,13 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
> > > > > >
> > > > > >  			/* this will break the loop */
> > > > > >  			timeout_counter = 0;
> > > > > > +			ret = 0;
> > > > > >  		}
> > > > > > +
> > > > > > +		spin_unlock_irqrestore(&gc->wq_lock, flags);
> > > > > > +
> > > > > > +		if (timeout_counter)
> > > > > > +			usleep_range(1000, 2000);
> > > > >
> > > > > Do we really not have an interrupt/signal from the GuC when it has
> > > > > cleared up some space?
> > > > >
> > >
> > > This is not implemented in the firmware, although I think it could be
> > > done through the GuC-to-host interrupt. My worry is that if we
> > > implement this, the driver will end up handling too many interrupts
> > > (maybe as many as there are context switches). Ideally, though, we
> > > don't want to handle interrupts at all.
> > > > > >  	};
> > > > > >
> > > > > >  	kunmap_atomic(base);
> > > > > > @@ -597,19 +609,17 @@ int i915_guc_submit(struct i915_guc_client *client,
> > > > > >  {
> > > > > >  	struct intel_guc *guc = client->guc;
> > > > > >  	enum intel_ring_id ring_id = rq->ring->id;
> > > > > > -	unsigned long flags;
> > > > > >  	int q_ret, b_ret;
> > > > > >
> > > > > >  	/* Need this because of the deferred pin ctx and ring */
> > > > > >  	/* Shall we move this right after ring is pinned? */
> > > > > >  	lr_context_update(rq);
> > > > > >
> > > > > > -	spin_lock_irqsave(&client->wq_lock, flags);
> > > > > > -
> > > > > >  	q_ret = guc_add_workqueue_item(client, rq);
> > > > > >  	if (q_ret == 0)
> > > > > >  		b_ret = guc_ring_doorbell(client);
> > > > > >
> > > > > > +	spin_lock(&guc->host2guc_lock);
> > > > >
> > > > > So at first I thought there's a race now, but then I looked at what
> > > > > host2guc and wq_lock protect. It seems like the only thing they do
> > > > > is protect against debugfs, all the real protection against
> > > > > inconsistent state is done through dev->struct_mutex.
> > > > >
> > > > > Can't we just rip out all this spinlock business from the guc code?
> > > > > It would be easier than fixing up the races in here.
> > > >
> > > >
> > >
> > > Yes, the host2guc lock can be replaced by dev->struct_mutex. But we
> > > definitely don't want the process interrupted while the driver programs
> > > the GuC work queue and rings the doorbell.
> > > > > -Daniel
> > > > >
> > > > > >  	client->submissions[ring_id] += 1;
> > > > > >  	if (q_ret) {
> > > > > >  		client->q_fail += 1;
> > > > > > @@ -620,9 +630,6 @@ int i915_guc_submit(struct i915_guc_client *client,
> > > > > >  	} else {
> > > > > >  		client->retcode = 0;
> > > > > >  	}
> > > > > > -	spin_unlock_irqrestore(&client->wq_lock, flags);
> > > > > > -
> > > > > > -	spin_lock(&guc->host2guc_lock);
> > > > > >  	guc->submissions[ring_id] += 1;
> > > > > >  	guc->last_seqno[ring_id] = rq->seqno;
> > > > > >  	spin_unlock(&guc->host2guc_lock);
> > > > > > --
> > > > > > 2.5.0
> > > > > >
> > > > >
> > >
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Thread overview: 24+ messages
2015-11-20  0:15 [PATCH] drm/i915/guc: Keep irq enabled during GuC cmd submission yu.dai
2015-11-20  8:45 ` Daniel Vetter
2015-11-23 23:02   ` [PATCH] drm/i915/guc: Move wait for GuC out of spinlock/unlock yu.dai
2015-11-24 13:04     ` Daniel Vetter
2015-11-24 13:26       ` Imre Deak
2015-11-24 17:00         ` Yu Dai
2015-11-24 17:05           ` Imre Deak
2015-11-24 18:08             ` Daniel Vetter
2015-11-24 18:36               ` Yu Dai [this message]
2015-11-24 19:13                 ` Daniel Vetter
2015-11-24 22:34                   ` Yu Dai
2015-11-25  8:45                     ` Daniel Vetter
2015-11-25 19:29     ` [PATCH v2] drm/i915/guc: Clean up locks in GuC yu.dai
2015-11-26  9:16       ` Nick Hoath
2015-12-01  0:15     ` [PATCH v3] " yu.dai
2015-12-02 19:50       ` Dave Gordon
2015-12-03  0:56     ` [PATCH v4] " yu.dai
2015-12-03  7:52       ` Daniel Vetter
2016-01-22 10:54       ` Tvrtko Ursulin
2016-01-25 16:01         ` Daniel Vetter
2016-01-26 11:53           ` Tvrtko Ursulin
2016-01-26 12:08             ` Chris Wilson
2016-01-26 16:54               ` Daniel Vetter
2016-01-26 17:11                 ` Chris Wilson
