From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Steffen Persvold <sp@numascale.com>
Cc: Daniel J Blueman <daniel@numascale-asia.com>,
Dipankar Sarma <dipankar@in.ibm.com>,
linux-kernel@vger.kernel.org, x86@kernel.org
Subject: Re: RCU qsmask !=0 warnings on large-SMP...
Date: Mon, 30 Jan 2012 08:15:29 -0800
Message-ID: <20120130161529.GA5118@linux.vnet.ibm.com>
In-Reply-To: <20120129060921.GC17696@linux.vnet.ibm.com>
On Sat, Jan 28, 2012 at 10:09:21PM -0800, Paul E. McKenney wrote:
> On Fri, Jan 27, 2012 at 12:09:25PM +0100, Steffen Persvold wrote:
> > On 1/26/2012 20:26, Paul E. McKenney wrote:
> > >On Thu, Jan 26, 2012 at 04:04:37PM +0100, Steffen Persvold wrote:
> > >>On 1/26/2012 02:58, Paul E. McKenney wrote:
> > >>>On Wed, Jan 25, 2012 at 11:48:58PM +0100, Steffen Persvold wrote:
> > >>[]
> > >>>
> > >>>This looks like it will produce useful information, but I am not seeing
> > >>>output from it below.
> > >>>
> > >>> Thanx, Paul
> > >>>
> > >>This run, it was CPU 24 that triggered the issue:
> > >>>>
> > >>
> > >>This line is the printout for the root level:
> > >>
> > >>>>[ 231.572688] CPU 24, treason uncloaked, rsp @ ffffffff81a1cd80 (rcu_sched), rnp @ ffffffff81a1cd80(r) qsmask=0x1f, c=5132 g=5132 nc=5132 ng=5133 sc=5132 sg=5133 mc=5132 mg=5133
> > >
> > >OK, so the rcu_state structure (sc and sg) believes that grace period
> > >5133 has started but not completed, as expected. Strangely enough, so
> > >does the root rcu_node structure (nc and ng) and the CPU's leaf rcu_node
> > >structure (mc and mg).
> > >
> > >The per-CPU rcu_data structure (c and g) does not yet know about the
> > >new 5133 grace period, as expected.
> > >
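For reference, here is a minimal sketch of the kind of debug print
that would produce the c/g, nc/ng, sc/sg, and mc/mg values above.
The mapping comes from the output itself; the helper and its name are
hypothetical, not the actual debug patch (rcu_get_root() is the real
kernel/rcutree.c helper of this era):

        /* Hypothetical helper mirroring the debug output above. */
        static void rcu_debug_print_gp(struct rcu_state *rsp,
                                       struct rcu_data *rdp)
        {
                struct rcu_node *rnp_root = rcu_get_root(rsp);
                struct rcu_node *rnp_leaf = rdp->mynode;

                pr_err("c=%lu g=%lu nc=%lu ng=%lu sc=%lu sg=%lu "
                       "mc=%lu mg=%lu\n",
                       rdp->completed, rdp->gpnum,            /* c, g   */
                       rnp_root->completed, rnp_root->gpnum,  /* nc, ng */
                       rsp->completed, rsp->gpnum,            /* sc, sg */
                       rnp_leaf->completed, rnp_leaf->gpnum); /* mc, mg */
        }
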
> > >So this is the code in kernel/rcutree.c:rcu_start_gp() that does the
> > >initialization:
> > >
> > >	rcu_for_each_node_breadth_first(rsp, rnp) {
> > >		raw_spin_lock(&rnp->lock); /* irqs already disabled. */
> > >		rcu_preempt_check_blocked_tasks(rnp);
> > >		rnp->qsmask = rnp->qsmaskinit;
> > >		rnp->gpnum = rsp->gpnum;
> > >		rnp->completed = rsp->completed;
> > >		if (rnp == rdp->mynode)
> > >			rcu_start_gp_per_cpu(rsp, rnp, rdp);
> > >		rcu_preempt_boost_start_gp(rnp);
> > >		trace_rcu_grace_period_init(rsp->name, rnp->gpnum,
> > >					    rnp->level, rnp->grplo,
> > >					    rnp->grphi, rnp->qsmask);
> > >		raw_spin_unlock(&rnp->lock); /* irqs remain disabled. */
> > >	}
> > >
> > >I am assuming that your debug prints are still invoked right after
> > >the raw_spin_lock() above. If so, I would expect nc==ng and mc==mg.
> > >Even if your debug prints followed the assignments to rnp->gpnum and
> > >rnp->completed, I would expect mc==mg for the root and internal rcu_node
> > >structures. But you say below that you get the same values throughout,
> > >and in that case, I would expect the leaf rcu_node structure to show
> > >something different than the root and internal structures.
> > >
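To make those expectations concrete, here is the same loop body with
the two candidate print points annotated; the comments are mine and
only restate the reasoning above:

        raw_spin_lock(&rnp->lock); /* irqs already disabled. */
        /*
         * A print here, before the assignments below, still sees the
         * just-completed grace period in this rnp: nc == ng on the
         * first (root) pass, and mc == mg until the traversal reaches
         * the leaf.
         */
        rcu_preempt_check_blocked_tasks(rnp);
        rnp->qsmask = rnp->qsmaskinit;
        rnp->gpnum = rsp->gpnum;
        rnp->completed = rsp->completed;
        /*
         * A print here sees the new grace period in this rnp, but any
         * rcu_node structure the breadth-first traversal has not yet
         * reached -- the leaf included, until its own pass -- still
         * shows the old one, so mc == mg on root and internal passes.
         */
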
> > >The code really does hold the root rcu_node lock across all calls to
> > >rcu_start_gp(), so I don't see how two CPUs could be in that code at
> > >the same time.  That would be one way for the rcu_node and rcu_data
> > >structures to get advance notice of the new grace period, but in that
> > >case you would see more than one bit set in ->qsmask.
> > >
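If it would help rule out concurrent entry, a quick sanity check along
these lines could be dropped into rcu_start_gp(); a sketch only -- the
helper and the WARN placement are hypothetical:

        /*
         * rcu_start_gp() is supposed to be entered with the root
         * rcu_node lock held, so two CPUs should never run the
         * initialization loop concurrently.
         */
        static void rcu_start_gp_sanity_check(struct rcu_state *rsp)
        {
                struct rcu_node *rnp_root = rcu_get_root(rsp);

                WARN_ON_ONCE(!raw_spin_is_locked(&rnp_root->lock));
        }
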
> > >So, any luck with the trace events for rcu_grace_period and
> > >rcu_grace_period_init?
> > >
> >
> > I've successfully enabled them and they seem to work; however, once
> > the issue is triggered, any attempt to access
> > /sys/kernel/debug/tracing/trace just hangs :/
>
> Hmmm... I wonder if it waits for a grace period?
>
> If it cannot be made to work, I can probably put together some
> alternative diagnostics, but it will take me a day or three.
Actually, another thing to try is "torture_type=rcu_bh" on the modprobe
line for rcutorture. Also, it would be good to get a stack dump of the
hung process -- it might be hung for some other reason.
Thanx, Paul
Thread overview: 19+ messages
2012-01-25 9:44 RCU qsmask !=0 warnings on large-SMP Daniel J Blueman
2012-01-25 14:00 ` Paul E. McKenney
2012-01-25 14:18 ` Steffen Persvold
2012-01-25 18:14 ` Paul E. McKenney
2012-01-25 20:35 ` Steffen Persvold
2012-01-25 21:51 ` Paul E. McKenney
2012-01-25 22:51 ` Steffen Persvold
2012-01-26 1:57 ` Paul E. McKenney
2012-01-25 21:14 ` Steffen Persvold
2012-01-25 21:34 ` Paul E. McKenney
2012-01-25 22:48 ` Steffen Persvold
2012-01-26 1:58 ` Paul E. McKenney
2012-01-26 15:04 ` Steffen Persvold
2012-01-26 19:26 ` Paul E. McKenney
2012-01-27 11:09 ` Steffen Persvold
2012-01-29 6:09 ` Paul E. McKenney
2012-01-30 16:15 ` Paul E. McKenney [this message]
2012-01-31 17:33 ` Steffen Persvold
2012-01-31 17:38 ` Paul E. McKenney