Re: rcu: endless stalls - Paul E. McKenney

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Mike Galbraith <efault@gmx.de>
Cc: LKML <linux-kernel@vger.kernel.org>
Subject: Re: rcu: endless stalls
Date: Thu, 14 Jun 2012 09:47:47 -0700	[thread overview]
Message-ID: <20120614164746.GG2458@linux.vnet.ibm.com> (raw)
In-Reply-To: <1339659932.7347.76.camel@marge.simpson.net>

On Thu, Jun 14, 2012 at 09:45:32AM +0200, Mike Galbraith wrote:
> On Wed, 2012-06-13 at 09:12 +0200, Mike Galbraith wrote: 
> > On Wed, 2012-06-13 at 07:56 +0200, Mike Galbraith wrote:
> > 
> > > Question remains though.  Maybe the box hit some other problem that led
> > > to death by RCU gripage, but the info I received indicated the box was
> > > in the midst of a major spin-fest.
> > 
> > To (maybe) speak more clearly, since it's a mutex like any other mutex
> > that loads of CPUs can hit if you've got loads of CPUs, did huge box
> > driver do something that we don't expect so many CPUs to be doing, thus
> > instigate simultaneous exit trouble (ie shoot self in foot), or did that
> > mutex addition create the exit trouble which box appeared to be having?
> 
> Crickets chirping.. I know what _that_ means: "tsk tsk, you dummy" :)

In my case, it means that I suspect you would rather me continue working
on my series of patches to further reduce RCU grace-period-initialization
latencies on large systems than worry about this patch.

If I understand correctly, your patch did what you wanted in the
situation at hand.  I have some quibbles, please see below.

> I suspected that would happen, but asked anyway because I couldn't
> imagine even 4096 CPUs getting tangled up for an _eternity_ trying to go
> to sleep, but the lock which landed after 32-stable where these beasts
> earn their daily fuel rods was splattered all over the event.  Oh well.
> 
> So, I can forget that and just make the thing not gripe itself to death
> should a stall for whatever reason be encountered again.
> 
> Rather than mucking about with rcu_cpu_stall_suppress, how about adjust
> timeout as you proceed, and block report functions?  That way, there's
> no fiddling with things used elsewhere, and it shouldn't matter how
> badly console is being be hammered, you get a full report, and maybe
> even only one.

That sounds like a very good interim approach to me!

> Hm, maybe I should forget hoping to keep check_cpu_stall() happy too,
> and only silently ignore it when busy.

But this is a good way to accumulate a variety of stalls, so not
recommended.

							Thanx, Paul

> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 0da7b88..e9dd654 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -727,24 +727,29 @@ static void record_gp_stall_check_time(struct rcu_state *rsp)
>  	rsp->jiffies_stall = jiffies + jiffies_till_stall_check();
>  }
> 
> +int rcu_stall_report_in_progress;
> +
>  static void print_other_cpu_stall(struct rcu_state *rsp)
>  {
>  	int cpu;
>  	long delta;
>  	unsigned long flags;
>  	int ndetected;
> -	struct rcu_node *rnp = rcu_get_root(rsp);
> +	struct rcu_node *root = rcu_get_root(rsp);

s/root/rnp_root/, please -- consistency with other places in the code.
But see below.

> +	struct rcu_node *rnp;
> 
>  	/* Only let one CPU complain about others per time interval. */
> 
> -	raw_spin_lock_irqsave(&rnp->lock, flags);
> +	raw_spin_lock_irqsave(&root->lock, flags);

On a 4096-CPU system, I bet that this needs to be trylock, with
a bare "return" on failure to acquire the lock.

Of course, that fails if someone is holding this lock for some other
reason.  So I believe that you need a separate ->stalllock in the
rcu_state structure for this purpose.  You won't need to disable irqs
when acquiring the lock.

>  	delta = jiffies - rsp->jiffies_stall;
> -	if (delta < RCU_STALL_RAT_DELAY || !rcu_gp_in_progress(rsp)) {
> -		raw_spin_unlock_irqrestore(&rnp->lock, flags);
> +	if (delta < RCU_STALL_RAT_DELAY || !rcu_gp_in_progress(rsp) ||
> +			rcu_stall_report_in_progress) {

If you conditionally acquire the new ->stalllock, you shouldn't need
this added check.

> +		raw_spin_unlock_irqrestore(&root->lock, flags);
>  		return;
>  	}
>  	rsp->jiffies_stall = jiffies + 3 * jiffies_till_stall_check() + 3;
> -	raw_spin_unlock_irqrestore(&rnp->lock, flags);
> +	rcu_stall_report_in_progress++;

And with the new ->stalllock, rcu_stall_report_in_progress can go away.

> +	raw_spin_unlock_irqrestore(&root->lock, flags);
> 
>  	/*
>  	 * OK, time to rat on our buddy...
> @@ -765,16 +770,23 @@ static void print_other_cpu_stall(struct rcu_state *rsp)
>  				print_cpu_stall_info(rsp, rnp->grplo + cpu);
>  				ndetected++;
>  			}
> +
> +		/*
> +		 * Push the timeout back as we go.  With a slow serial
> +		 * console on a large machine, this may take a while.
> +		 */ 
> +		raw_spin_lock_irqsave(&root->lock, flags);
> +		rsp->jiffies_stall = jiffies + 3 * jiffies_till_stall_check() + 3;
> +		raw_spin_unlock_irqrestore(&root->lock, flags);

And the separate ->stalllock should make this unnecessary as well.
However, updating rsp->jiffies_stall periodically is a good idea
in order to decrease cache thrashing on ->stalllock.

>  	}
> 
>  	/*
>  	 * Now rat on any tasks that got kicked up to the root rcu_node
>  	 * due to CPU offlining.
>  	 */
> -	rnp = rcu_get_root(rsp);
> -	raw_spin_lock_irqsave(&rnp->lock, flags);
> -	ndetected = rcu_print_task_stall(rnp);
> -	raw_spin_unlock_irqrestore(&rnp->lock, flags);
> +	raw_spin_lock_irqsave(&root->lock, flags);
> +	ndetected = rcu_print_task_stall(root);
> +	raw_spin_unlock_irqrestore(&root->lock, flags);

And the separate lock makes this change unnecessary, also.

>  	print_cpu_stall_info_end();
>  	printk(KERN_CONT "(detected by %d, t=%ld jiffies)\n",
> @@ -784,6 +796,10 @@ static void print_other_cpu_stall(struct rcu_state *rsp)
>  	else if (!trigger_all_cpu_backtrace())
>  		dump_stack();
> 
> +	raw_spin_lock_irqsave(&root->lock, flags);
> +	rcu_stall_report_in_progress--;
> +	raw_spin_unlock_irqrestore(&root->lock, flags);

Ditto here.

> +
>  	/* If so configured, complain about tasks blocking the grace period. */
> 
>  	rcu_print_detail_task_stall(rsp);
> @@ -796,6 +812,17 @@ static void print_cpu_stall(struct rcu_state *rsp)
>  	unsigned long flags;
>  	struct rcu_node *rnp = rcu_get_root(rsp);
> 
> +	raw_spin_lock_irqsave(&rnp->lock, flags);
> +	if (rcu_stall_report_in_progress) {
> +		raw_spin_unlock_irqrestore(&rnp->lock, flags);
> +		return;
> +	}
> +
> +	/* Reset timeout, dump_stack() may take a while on large machines. */
> +	rsp->jiffies_stall = jiffies + 3 * jiffies_till_stall_check() + 3;
> +	rcu_stall_report_in_progress++;
> +	raw_spin_unlock_irqrestore(&rnp->lock, flags);

And do trylock (without disabling irqs) here as well.  No need for
rcu_stall_report_in_progress.  You can update rsp->jiffies_stall
to reduce cache thrashing on ->stalllock.

> +
>  	/*
>  	 * OK, time to rat on ourselves...
>  	 * See Documentation/RCU/stallwarn.txt for info on how to debug
> @@ -813,6 +840,7 @@ static void print_cpu_stall(struct rcu_state *rsp)
>  	if (ULONG_CMP_GE(jiffies, rsp->jiffies_stall))
>  		rsp->jiffies_stall = jiffies +
>  				     3 * jiffies_till_stall_check() + 3;
> +	rcu_stall_report_in_progress--;

And ->stalllock makes this unnecessary as well.

>  	raw_spin_unlock_irqrestore(&rnp->lock, flags);
> 
>  	set_need_resched();  /* kick ourselves to get things going. */
> 
> -Mike
>

next prev parent reply	other threads:[~2012-06-14 16:51 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-06-11 10:06 rcu: endless stalls Mike Galbraith
2012-06-11 13:39 ` Paul E. McKenney
2012-06-11 14:22   ` Mike Galbraith
2012-06-11 16:54     ` Paul E. McKenney
2012-06-11 17:20     ` Mike Galbraith
2012-06-11 18:01       ` Paul E. McKenney
2012-06-11 18:10         ` Mike Galbraith
2012-06-13  3:35           ` Mike Galbraith
2012-06-13  4:31             ` Hugh Dickins
2012-06-13  5:56               ` Mike Galbraith
2012-06-13  7:12                 ` Mike Galbraith
2012-06-14  7:45                   ` Mike Galbraith
2012-06-14 16:47                     ` Paul E. McKenney [this message]
2012-06-14 21:34                       ` Mike Galbraith
2012-06-15  7:49                       ` Mike Galbraith

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120614164746.GG2458@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=efault@gmx.de \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox