public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
From: Daniel Vetter <daniel@ffwll.ch>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 04/33] drm/i915: Use RCU to annotate and enforce protection for breadcrumb's bh
Date: Mon, 8 Aug 2016 11:33:56 +0200	[thread overview]
Message-ID: <20160808093356.GZ6232@phenom.ffwll.local> (raw)
In-Reply-To: <1470581141-14432-5-git-send-email-chris@chris-wilson.co.uk>

On Sun, Aug 07, 2016 at 03:45:12PM +0100, Chris Wilson wrote:
> The bottom-half we use for processing the breadcrumb interrupt is a
> task, which is an RCU protected struct. When accessing this struct, we
> need to be holding the RCU read lock to prevent it disappearing beneath
> us. We can use the RCU annotation to mark our irq_seqno_bh pointer as
> being under RCU guard and then use the RCU accessors to both provide
> correct ordering of access through the pointer.
> 
> Most notably, this fixes the access from hard irq context to use the RCU
> read lock, which both Daniel and Tvrtko complained about.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>

I'll leave sparse-checking this to 0day and runtime lockdep checking to CI
;-)

> ---
>  drivers/gpu/drm/i915/i915_drv.h          |  2 +-
>  drivers/gpu/drm/i915/intel_breadcrumbs.c | 22 +++++++++-------------
>  drivers/gpu/drm/i915/intel_ringbuffer.c  |  2 --
>  drivers/gpu/drm/i915/intel_ringbuffer.h  | 21 ++++++++++++++-------
>  4 files changed, 24 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index feec00f769e1..3d546b5c2e4c 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -3848,7 +3848,7 @@ static inline bool __i915_request_irq_complete(struct drm_i915_gem_request *req)
>  	 * is woken.
>  	 */
>  	if (engine->irq_seqno_barrier &&
> -	    READ_ONCE(engine->breadcrumbs.irq_seqno_bh) == current &&
> +	    rcu_access_pointer(engine->breadcrumbs.irq_seqno_bh) == current &&
>  	    cmpxchg_relaxed(&engine->breadcrumbs.irq_posted, 1, 0)) {
>  		struct task_struct *tsk;
>  
> diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> index 8ecb3b6538fc..7552bd039565 100644
> --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> @@ -60,10 +60,8 @@ static void intel_breadcrumbs_fake_irq(unsigned long data)
>  	 * every jiffie in order to kick the oldest waiter to do the
>  	 * coherent seqno check.
>  	 */
> -	rcu_read_lock();
>  	if (intel_engine_wakeup(engine))
>  		mod_timer(&engine->breadcrumbs.fake_irq, jiffies + 1);
> -	rcu_read_unlock();
>  }
>  
>  static void irq_enable(struct intel_engine_cs *engine)
> @@ -232,7 +230,7 @@ static bool __intel_engine_add_wait(struct intel_engine_cs *engine,
>  	}
>  	rb_link_node(&wait->node, parent, p);
>  	rb_insert_color(&wait->node, &b->waiters);
> -	GEM_BUG_ON(!first && !b->irq_seqno_bh);
> +	GEM_BUG_ON(!first && !rcu_access_pointer(b->irq_seqno_bh));

Nit: reading through rcu docs I think the suggested accessor here is
rcu_dereference_protected for write-side access. That one allows the
compiler full freedom for reordering. otoh it's a bit more noise-y and meh
about optional debug checks anyway. So with or without that change:

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

>  
>  	if (completed) {
>  		struct rb_node *next = rb_next(completed);
> @@ -242,7 +240,7 @@ static bool __intel_engine_add_wait(struct intel_engine_cs *engine,
>  			GEM_BUG_ON(first);
>  			b->timeout = wait_timeout();
>  			b->first_wait = to_wait(next);
> -			smp_store_mb(b->irq_seqno_bh, b->first_wait->tsk);
> +			rcu_assign_pointer(b->irq_seqno_bh, b->first_wait->tsk);
>  			/* As there is a delay between reading the current
>  			 * seqno, processing the completed tasks and selecting
>  			 * the next waiter, we may have missed the interrupt
> @@ -269,7 +267,7 @@ static bool __intel_engine_add_wait(struct intel_engine_cs *engine,
>  		GEM_BUG_ON(rb_first(&b->waiters) != &wait->node);
>  		b->timeout = wait_timeout();
>  		b->first_wait = wait;
> -		smp_store_mb(b->irq_seqno_bh, wait->tsk);
> +		rcu_assign_pointer(b->irq_seqno_bh, wait->tsk);
>  		/* After assigning ourselves as the new bottom-half, we must
>  		 * perform a cursory check to prevent a missed interrupt.
>  		 * Either we miss the interrupt whilst programming the hardware,
> @@ -280,7 +278,7 @@ static bool __intel_engine_add_wait(struct intel_engine_cs *engine,
>  		 */
>  		__intel_breadcrumbs_enable_irq(b);
>  	}
> -	GEM_BUG_ON(!b->irq_seqno_bh);
> +	GEM_BUG_ON(!rcu_access_pointer(b->irq_seqno_bh));
>  	GEM_BUG_ON(!b->first_wait);
>  	GEM_BUG_ON(rb_first(&b->waiters) != &b->first_wait->node);
>  
> @@ -335,7 +333,7 @@ void intel_engine_remove_wait(struct intel_engine_cs *engine,
>  		const int priority = wakeup_priority(b, wait->tsk);
>  		struct rb_node *next;
>  
> -		GEM_BUG_ON(b->irq_seqno_bh != wait->tsk);
> +		GEM_BUG_ON(rcu_access_pointer(b->irq_seqno_bh) != wait->tsk);
>  
>  		/* We are the current bottom-half. Find the next candidate,
>  		 * the first waiter in the queue on the remaining oldest
> @@ -379,13 +377,13 @@ void intel_engine_remove_wait(struct intel_engine_cs *engine,
>  			 */
>  			b->timeout = wait_timeout();
>  			b->first_wait = to_wait(next);
> -			smp_store_mb(b->irq_seqno_bh, b->first_wait->tsk);
> +			rcu_assign_pointer(b->irq_seqno_bh, b->first_wait->tsk);
>  			if (b->first_wait->seqno != wait->seqno)
>  				__intel_breadcrumbs_enable_irq(b);
> -			wake_up_process(b->irq_seqno_bh);
> +			wake_up_process(b->first_wait->tsk);
>  		} else {
>  			b->first_wait = NULL;
> -			WRITE_ONCE(b->irq_seqno_bh, NULL);
> +			rcu_assign_pointer(b->irq_seqno_bh, NULL);
>  			__intel_breadcrumbs_disable_irq(b);
>  		}
>  	} else {
> @@ -399,7 +397,7 @@ out_unlock:
>  	GEM_BUG_ON(b->first_wait == wait);
>  	GEM_BUG_ON(rb_first(&b->waiters) !=
>  		   (b->first_wait ? &b->first_wait->node : NULL));
> -	GEM_BUG_ON(!b->irq_seqno_bh ^ RB_EMPTY_ROOT(&b->waiters));
> +	GEM_BUG_ON(!rcu_access_pointer(b->irq_seqno_bh) ^ RB_EMPTY_ROOT(&b->waiters));
>  	spin_unlock(&b->lock);
>  }
>  
> @@ -596,11 +594,9 @@ unsigned int intel_kick_waiters(struct drm_i915_private *i915)
>  	 * RCU lock, i.e. as we call wake_up_process() we must be holding the
>  	 * rcu_read_lock().
>  	 */
> -	rcu_read_lock();
>  	for_each_engine(engine, i915)
>  		if (unlikely(intel_engine_wakeup(engine)))
>  			mask |= intel_engine_flag(engine);
> -	rcu_read_unlock();
>  
>  	return mask;
>  }
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index e08a1e1b04e4..16b726fe33eb 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2410,9 +2410,7 @@ void intel_engine_init_seqno(struct intel_engine_cs *engine, u32 seqno)
>  	/* After manually advancing the seqno, fake the interrupt in case
>  	 * there are any waiters for that seqno.
>  	 */
> -	rcu_read_lock();
>  	intel_engine_wakeup(engine);
> -	rcu_read_unlock();
>  }
>  
>  static void gen6_bsd_submit_request(struct drm_i915_gem_request *request)
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 4aed4586b0b6..66dc93469076 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -171,7 +171,7 @@ struct intel_engine_cs {
>  	 * the overhead of waking that client is much preferred.
>  	 */
>  	struct intel_breadcrumbs {
> -		struct task_struct *irq_seqno_bh; /* bh for user interrupts */
> +		struct task_struct __rcu *irq_seqno_bh; /* bh for interrupts */
>  		bool irq_posted;
>  
>  		spinlock_t lock; /* protects the lists of requests */
> @@ -541,23 +541,30 @@ void intel_engine_enable_signaling(struct drm_i915_gem_request *request);
>  
>  static inline bool intel_engine_has_waiter(struct intel_engine_cs *engine)
>  {
> -	return READ_ONCE(engine->breadcrumbs.irq_seqno_bh);
> +	return rcu_access_pointer(engine->breadcrumbs.irq_seqno_bh);
>  }
>  
>  static inline bool intel_engine_wakeup(struct intel_engine_cs *engine)
>  {
>  	bool wakeup = false;
> -	struct task_struct *tsk = READ_ONCE(engine->breadcrumbs.irq_seqno_bh);
> +
>  	/* Note that for this not to dangerously chase a dangling pointer,
> -	 * the caller is responsible for ensure that the task remain valid for
> -	 * wake_up_process() i.e. that the RCU grace period cannot expire.
> +	 * we must hold the rcu_read_lock here.
>  	 *
>  	 * Also note that tsk is likely to be in !TASK_RUNNING state so an
>  	 * early test for tsk->state != TASK_RUNNING before wake_up_process()
>  	 * is unlikely to be beneficial.
>  	 */
> -	if (tsk)
> -		wakeup = wake_up_process(tsk);
> +	if (rcu_access_pointer(engine->breadcrumbs.irq_seqno_bh)) {
> +		struct task_struct *tsk;
> +
> +		rcu_read_lock();
> +		tsk = rcu_dereference(engine->breadcrumbs.irq_seqno_bh);
> +		if (tsk)
> +			wakeup = wake_up_process(tsk);
> +		rcu_read_unlock();
> +	}
> +
>  	return wakeup;
>  }
>  
> -- 
> 2.8.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2016-08-08  9:34 UTC|newest]

Thread overview: 125+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-07 14:45 First class VMA, take 2 Chris Wilson
2016-08-07 14:45 ` [PATCH 01/33] drm/i915: Add smp_rmb() to busy ioctl's RCU dance Chris Wilson
2016-08-08  9:12   ` Daniel Vetter
2016-08-08  9:30     ` Chris Wilson
2016-08-08  9:45       ` Chris Wilson
2016-08-09  6:36         ` Joonas Lahtinen
2016-08-09  7:14           ` Chris Wilson
2016-08-09  8:48             ` Joonas Lahtinen
2016-08-09  9:05               ` Chris Wilson
2016-08-10 10:12                 ` Daniel Vetter
2016-08-10 10:13                   ` Daniel Vetter
2016-08-10 11:00                     ` Joonas Lahtinen
2016-08-12  9:50                       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 02/33] drm/i915: Do not overwrite the request with zero on reallocation Chris Wilson
2016-08-08  9:25   ` Daniel Vetter
2016-08-08  9:56     ` Chris Wilson
2016-08-09  6:32       ` Daniel Vetter
2016-08-07 14:45 ` [PATCH 03/33] drm/i915: Move missed interrupt detection from hangcheck to breadcrumbs Chris Wilson
2016-08-09 14:08   ` [PATCH v2] " Chris Wilson
2016-08-09 14:10   ` [PATCH v3] " Chris Wilson
2016-08-09 15:24     ` Mika Kuoppala
2016-08-07 14:45 ` [PATCH 04/33] drm/i915: Use RCU to annotate and enforce protection for breadcrumb's bh Chris Wilson
2016-08-08  9:33   ` Daniel Vetter [this message]
2016-08-12  9:56   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 05/33] drm/i915: Reduce amount of duplicate buffer information captured on error Chris Wilson
2016-08-10  7:04   ` Joonas Lahtinen
2016-08-10  7:15     ` Chris Wilson
2016-08-10  8:07       ` Joonas Lahtinen
2016-08-10  8:36         ` Chris Wilson
2016-08-10 10:51           ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 06/33] drm/i915: Stop the machine whilst capturing the GPU crash dump Chris Wilson
2016-08-07 14:45 ` [PATCH 07/33] drm/i915: Store the active context object on all engines upon error Chris Wilson
2016-08-09  9:02   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 08/33] drm/i915: Move setting of request->batch into its single callsite Chris Wilson
2016-08-09 15:53   ` Mika Kuoppala
2016-08-09 16:04     ` Chris Wilson
2016-08-10  7:19   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 09/33] drm/i915: Mark unmappable GGTT entries as PIN_HIGH Chris Wilson
2016-08-08  9:09   ` Joonas Lahtinen
2016-08-09 11:05   ` Tvrtko Ursulin
2016-08-09 11:13     ` Chris Wilson
2016-08-09 11:20       ` Chris Wilson
2016-08-07 14:45 ` [PATCH 10/33] drm/i915: Remove inactive/active list from debugfs Chris Wilson
2016-08-09 10:29   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 11/33] drm/i915: Focus debugfs/i915_gem_pinned to show only display pins Chris Wilson
2016-08-09 10:39   ` Joonas Lahtinen
2016-08-09 10:46     ` Chris Wilson
2016-08-09 11:32       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 12/33] drm/i915: Reduce i915_gem_objects to only show object information Chris Wilson
2016-08-10  7:29   ` Joonas Lahtinen
2016-08-10  7:38     ` Chris Wilson
2016-08-10  8:10       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 13/33] drm/i915: Remove redundant WARN_ON from __i915_add_request() Chris Wilson
2016-08-08  9:03   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 14/33] drm/i915: Create a VMA for an object Chris Wilson
2016-08-08  9:01   ` Joonas Lahtinen
2016-08-08  9:09     ` Chris Wilson
2016-08-10 10:58       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 15/33] drm/i915: Track pinned vma inside guc Chris Wilson
2016-08-11 16:19   ` Dave Gordon
2016-08-11 16:41     ` Chris Wilson
2016-08-07 14:45 ` [PATCH 16/33] drm/i915: Convert fence computations to use vma directly Chris Wilson
2016-08-09 10:27   ` Joonas Lahtinen
2016-08-09 10:33     ` Chris Wilson
2016-08-07 14:45 ` [PATCH 17/33] drm/i915: Use VMA directly for checking tiling parameters Chris Wilson
2016-08-09  6:18   ` Joonas Lahtinen
2016-08-09  8:03     ` Chris Wilson
2016-08-07 14:45 ` [PATCH 18/33] drm/i915: Use VMA as the primary object for context state Chris Wilson
2016-08-10  8:03   ` Joonas Lahtinen
2016-08-10  8:25     ` Chris Wilson
2016-08-10 10:54       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 19/33] drm/i915: Only clflush the context object when binding Chris Wilson
2016-08-10  8:41   ` Joonas Lahtinen
2016-08-10  9:02     ` Chris Wilson
2016-08-10 10:50       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 20/33] drm/i915: Use VMA for ringbuffer tracking Chris Wilson
2016-08-11  9:32   ` Joonas Lahtinen
2016-08-11  9:58     ` Chris Wilson
2016-08-07 14:45 ` [PATCH 21/33] drm/i915: Use VMA for scratch page tracking Chris Wilson
2016-08-08  8:00   ` [PATCH 1/3] " Chris Wilson
2016-08-08  8:00     ` [PATCH 2/3] drm/i915: Move common scratch allocation/destroy to intel_engine_cs.c Chris Wilson
2016-08-08  9:24       ` Matthew Auld
2016-08-08  8:00     ` [PATCH 3/3] drm/i915: Move common seqno reset " Chris Wilson
2016-08-08  9:40       ` Matthew Auld
2016-08-08 10:15         ` Chris Wilson
2016-08-08 15:34           ` Matthew Auld
2016-08-11 10:06   ` [PATCH 21/33] drm/i915: Use VMA for scratch page tracking Joonas Lahtinen
2016-08-11 10:22     ` Chris Wilson
2016-08-07 14:45 ` [PATCH 22/33] drm/i915/overlay: Use VMA as the primary tracker for images Chris Wilson
2016-08-11 10:17   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 23/33] drm/i915: Use VMA as the primary tracker for semaphore page Chris Wilson
2016-08-11 10:42   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 24/33] drm/i915: Use VMA for render state page tracking Chris Wilson
2016-08-11 10:46   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 25/33] drm/i915: Use VMA for wa_ctx tracking Chris Wilson
2016-08-11 10:53   ` Joonas Lahtinen
2016-08-11 11:02     ` Chris Wilson
2016-08-11 12:41       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 26/33] drm/i915: Track pinned VMA Chris Wilson
2016-08-11 12:18   ` Joonas Lahtinen
2016-08-11 12:37     ` Chris Wilson
2016-08-07 14:45 ` [PATCH 27/33] drm/i915: Print the batchbuffer offset next to BBADDR in error state Chris Wilson
2016-08-11 12:24   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 28/33] drm/i915: Move per-request pid from request to ctx Chris Wilson
2016-08-11 12:32   ` Joonas Lahtinen
2016-08-11 12:41     ` Chris Wilson
2016-08-07 14:45 ` [PATCH 29/33] drm/i915: Only record active and pending requests upon a GPU hang Chris Wilson
2016-08-11 12:36   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 30/33] drm/i915: Record the RING_MODE register for post-mortem debugging Chris Wilson
2016-08-08 11:35   ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 31/33] drm/i915: Always use the GTT for error capture Chris Wilson
2016-08-07 14:45 ` [PATCH 32/33] drm/i915: Consolidate error object printing Chris Wilson
2016-08-09 11:44   ` Joonas Lahtinen
2016-08-09 11:53     ` Chris Wilson
2016-08-10 10:55       ` Joonas Lahtinen
2016-08-07 14:45 ` [PATCH 33/33] drm/i915: Compress GPU objects in error state Chris Wilson
2016-08-10 10:32   ` Joonas Lahtinen
2016-08-10 10:52     ` Chris Wilson
2016-08-10 11:26       ` Joonas Lahtinen
2016-08-07 15:16 ` ✗ Ro.CI.BAT: failure for series starting with [01/33] drm/i915: Add smp_rmb() to busy ioctl's RCU dance Patchwork
2016-08-08  9:46 ` ✗ Ro.CI.BAT: failure for series starting with [01/33] drm/i915: Add smp_rmb() to busy ioctl's RCU dance (rev4) Patchwork
2016-08-08 10:34 ` ✗ Fi.CI.BAT: " Patchwork
2016-08-09 14:10 ` ✗ Ro.CI.BAT: failure for series starting with [01/33] drm/i915: Add smp_rmb() to busy ioctl's RCU dance (rev5) Patchwork
2016-08-09 14:20 ` ✗ Ro.CI.BAT: failure for series starting with [01/33] drm/i915: Add smp_rmb() to busy ioctl's RCU dance (rev6) Patchwork
2016-08-10  6:43 ` Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160808093356.GZ6232@phenom.ffwll.local \
    --to=daniel@ffwll.ch \
    --cc=chris@chris-wilson.co.uk \
    --cc=daniel.vetter@ffwll.ch \
    --cc=intel-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox