Re: [tip:core/rcu] rcu: Break call_rcu() deadlock involving scheduler and perf

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org, mingo@kernel.org, hpa@zytor.com,
	tglx@linutronix.de, davej@redhat.com,
	linux-tip-commits@vger.kernel.org
Subject: Re: [tip:core/rcu] rcu: Break call_rcu() deadlock involving scheduler and perf
Date: Mon, 16 Dec 2013 07:32:48 -0800	[thread overview]
Message-ID: <20131216153248.GA4200@linux.vnet.ibm.com> (raw)
In-Reply-To: <20131216152636.GX21999@twins.programming.kicks-ass.net>

On Mon, Dec 16, 2013 at 04:26:36PM +0100, Peter Zijlstra wrote:
> On Mon, Dec 16, 2013 at 07:19:22AM -0800, tip-bot for Paul E. McKenney wrote:
> > The underlying problem is that perf is invoking call_rcu() with the
> > scheduler locks held, but in NOCB mode, call_rcu() will with high
> > probability invoke the scheduler -- which just might want to use its
> > locks.  The reason that call_rcu() needs to invoke the scheduler is
> > to wake up the corresponding rcuo callback-offload kthread, which
> > does the job of starting up a grace period and invoking the callbacks
> > afterwards.
> > 
> > One solution (championed on a related problem by Lai Jiangshan) is to
> > simply defer the wakeup to some point where scheduler locks are no longer
> > held.  Since we don't want to unnecessarily incur the cost of such
> > deferral, the task before us is threefold:
> > 
> > 1.	Determine when it is likely that a relevant scheduler lock is held.
> > 
> > 2.	Defer the wakeup in such cases.
> > 
> > 3.	Ensure that all deferred wakeups eventually happen, preferably
> > 	sooner rather than later.
> > 
> > We use irqs_disabled_flags() as a proxy for relevant scheduler locks
> > being held.  This works because the relevant locks are always acquired
> > with interrupts disabled.  We may defer more often than needed, but that
> > is at least safe.
> 
> This would also allow us to do away with things like the below patch,
> right?

It takes care of one problem, but there are others, including
rcu_read_unlock() inovking the scheduler to deboost itself.  So for the
moment, we still need the below patch.

							Thanx, Paul

> ---
> commit 058ebd0eba3aff16b144eabf4510ed9510e1416e
> Author: Peter Zijlstra <peterz@infradead.org>
> Date:   Fri Jul 12 11:08:33 2013 +0200
> 
>     perf: Fix perf_lock_task_context() vs RCU
>     
>     Jiri managed to trigger this warning:
>     
>      [] ======================================================
>      [] [ INFO: possible circular locking dependency detected ]
>      [] 3.10.0+ #228 Tainted: G        W
>      [] -------------------------------------------------------
>      [] p/6613 is trying to acquire lock:
>      []  (rcu_node_0){..-...}, at: [<ffffffff810ca797>] rcu_read_unlock_special+0xa7/0x250
>      []
>      [] but task is already holding lock:
>      []  (&ctx->lock){-.-...}, at: [<ffffffff810f2879>] perf_lock_task_context+0xd9/0x2c0
>      []
>      [] which lock already depends on the new lock.
>      []
>      [] the existing dependency chain (in reverse order) is:
>      []
>      [] -> #4 (&ctx->lock){-.-...}:
>      [] -> #3 (&rq->lock){-.-.-.}:
>      [] -> #2 (&p->pi_lock){-.-.-.}:
>      [] -> #1 (&rnp->nocb_gp_wq[1]){......}:
>      [] -> #0 (rcu_node_0){..-...}:
>     
>     Paul was quick to explain that due to preemptible RCU we cannot call
>     rcu_read_unlock() while holding scheduler (or nested) locks when part
>     of the read side critical section was preemptible.
>     
>     Therefore solve it by making the entire RCU read side non-preemptible.
>     
>     Also pull out the retry from under the non-preempt to play nice with RT.
>     
>     Reported-by: Jiri Olsa <jolsa@redhat.com>
>     Helped-out-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>     Cc: <stable@kernel.org>
>     Signed-off-by: Peter Zijlstra <peterz@infradead.org>
>     Signed-off-by: Ingo Molnar <mingo@kernel.org>
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index ef5e7cc686e3..eba8fb5834ae 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -947,8 +947,18 @@ perf_lock_task_context(struct task_struct *task, int ctxn, unsigned long *flags)
>  {
>  	struct perf_event_context *ctx;
> 
> -	rcu_read_lock();
>  retry:
> +	/*
> +	 * One of the few rules of preemptible RCU is that one cannot do
> +	 * rcu_read_unlock() while holding a scheduler (or nested) lock when
> +	 * part of the read side critical section was preemptible -- see
> +	 * rcu_read_unlock_special().
> +	 *
> +	 * Since ctx->lock nests under rq->lock we must ensure the entire read
> +	 * side critical section is non-preemptible.
> +	 */
> +	preempt_disable();
> +	rcu_read_lock();
>  	ctx = rcu_dereference(task->perf_event_ctxp[ctxn]);
>  	if (ctx) {
>  		/*
> @@ -964,6 +974,8 @@ perf_lock_task_context(struct task_struct *task, int ctxn, unsigned long *flags)
>  		raw_spin_lock_irqsave(&ctx->lock, *flags);
>  		if (ctx != rcu_dereference(task->perf_event_ctxp[ctxn])) {
>  			raw_spin_unlock_irqrestore(&ctx->lock, *flags);
> +			rcu_read_unlock();
> +			preempt_enable();
>  			goto retry;
>  		}
> 
> @@ -973,6 +985,7 @@ perf_lock_task_context(struct task_struct *task, int ctxn, unsigned long *flags)
>  		}
>  	}
>  	rcu_read_unlock();
> +	preempt_enable();
>  	return ctx;
>  }
> 
>

next prev parent reply	other threads:[~2013-12-16 15:32 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <tip-96d3fd0d315a949e30adc80f086031c5cdf070d1@git.kernel.org>
2013-12-16 15:26 ` [tip:core/rcu] rcu: Break call_rcu() deadlock involving scheduler and perf Peter Zijlstra
2013-12-16 15:32   ` Paul E. McKenney [this message]
2013-12-16 15:45     ` Peter Zijlstra
2013-12-16 16:10       ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131216153248.GA4200@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=davej@redhat.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-tip-commits@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.