From: Steffen Persvold <sp@numascale.com>
To: paulmck@linux.vnet.ibm.com
Cc: Daniel J Blueman <daniel@numascale-asia.com>,
    Dipankar Sarma <dipankar@in.ibm.com>,
    linux-kernel@vger.kernel.org, x86@kernel.org
Subject: Re: RCU qsmask !=0 warnings on large-SMP...
Date: Fri, 27 Jan 2012 12:09:25 +0100
Message-ID: <4F2285E5.9050705@numascale.com>
In-Reply-To: <20120126192653.GC2437@linux.vnet.ibm.com>

On 1/26/2012 20:26, Paul E. McKenney wrote:
> On Thu, Jan 26, 2012 at 04:04:37PM +0100, Steffen Persvold wrote:
>> On 1/26/2012 02:58, Paul E. McKenney wrote:
>>> On Wed, Jan 25, 2012 at 11:48:58PM +0100, Steffen Persvold wrote:
>> [...]
>>>
>>> This looks like it will produce useful information, but I am not seeing
>>> output from it below.
>>>
>>> Thanx, Paul
>>>
>>>> This run it was CPU24 that triggered the issue :
>>>>
>>
>> This line is the printout for the root level :
>>
>>>> [ 231.572688] CPU 24, treason uncloaked, rsp @ ffffffff81a1cd80 (rcu_sched), rnp @ ffffffff81a1cd80(r) qsmask=0x1f, c=5132 g=5132 nc=5132 ng=5133 sc=5132 sg=5133 mc=5132 mg=5133
>
> OK, so the rcu_state structure (sc and sg) believes that grace period
> 5133 has started but not completed, as expected. Strangely enough, so
> does the root rcu_node structure (nc and ng) and the CPU's leaf rcu_node
> structure (mc and mg).
>
> The per-CPU rcu_data structure (c and g) does not yet know about the
> new 5133 grace period, as expected.
>
> So this is the code in kernel/rcutree.c:rcu_start_gp() that does the
> initialization:
>
> 	rcu_for_each_node_breadth_first(rsp, rnp) {
> 		raw_spin_lock(&rnp->lock);	/* irqs already disabled. */
> 		rcu_preempt_check_blocked_tasks(rnp);
> 		rnp->qsmask = rnp->qsmaskinit;
> 		rnp->gpnum = rsp->gpnum;
> 		rnp->completed = rsp->completed;
> 		if (rnp == rdp->mynode)
> 			rcu_start_gp_per_cpu(rsp, rnp, rdp);
> 		rcu_preempt_boost_start_gp(rnp);
> 		trace_rcu_grace_period_init(rsp->name, rnp->gpnum,
> 					    rnp->level, rnp->grplo,
> 					    rnp->grphi, rnp->qsmask);
> 		raw_spin_unlock(&rnp->lock);	/* irqs remain disabled. */
> 	}
>
> I am assuming that your debug prints are still invoked right after
> the raw_spin_lock() above. If so, I would expect nc==ng and mc==mg.
> Even if your debug prints followed the assignments to rnp->gpnum and
> rnp->completed, I would expect mc==mg for the root and internal rcu_node
> structures. But you say below that you get the same values throughout,
> and in that case, I would expect the leaf rcu_node structure to show
> something different than the root and internal structures.
>
> The code really does hold the root rcu_node lock at all calls to
> rcu_start_gp(), so I don't see how we could be getting two CPUs in that
> code at the same time, which would be one way that the rcu_node and
> rcu_data structures might get advance notice of the new grace period,
> but in that case, you would have more than one bit set in ->qsmask.
>
> So, any luck with the trace events for rcu_grace_period and
> rcu_grace_period_init?
>
I've successfully enabled them and they seem to work. However, once the
issue is triggered, any attempt to read /sys/kernel/debug/tracing/trace
just hangs :/
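
For reference, here is a minimal sketch of how the counter pairs in the
debug line quoted above can be read. It assumes that a structure sees a
grace period as in progress whenever its gpnum differs from its
completed value, and that each bit set in ->qsmask stands for one child
that has not yet reported a quiescent state. The struct and helper
names are hypothetical and chosen for illustration only; this is not
kernel code.

```c
#include <stdio.h>

/* One (completed, gpnum) pair as printed in the debug line.
 * Hypothetical type, used only for this illustration. */
struct gp_pair {
	const char *label;       /* e.g. "rcu_state", "root rcu_node" */
	unsigned long completed; /* last grace period seen as completed */
	unsigned long gpnum;     /* last grace period seen as started */
};

/* Assumption: a grace period is in progress from this structure's
 * point of view when gpnum has moved past completed. */
static int gp_in_progress(const struct gp_pair *p)
{
	return p->completed != p->gpnum;
}

/* Count the bits set in a qsmask value, i.e. the number of children
 * that still owe a quiescent state. */
static unsigned int qsmask_pending(unsigned long qsmask)
{
	unsigned int n = 0;

	while (qsmask) {
		qsmask &= qsmask - 1; /* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Print a one-line summary for a single counter pair. */
static void report(const struct gp_pair *p)
{
	if (gp_in_progress(p))
		printf("%s: grace period %lu in progress\n",
		       p->label, p->gpnum);
	else
		printf("%s: idle, last completed %lu\n",
		       p->label, p->completed);
}
```

Fed the values from the printout, report() would describe the per-CPU
rcu_data pair (c=5132 g=5132) as idle and the rcu_state, root rcu_node
and leaf rcu_node pairs (5132/5133) as having grace period 5133 in
progress, while qsmask_pending(0x1f) counts five children still
pending at the root.
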
Cheers,
--
Steffen Persvold, Chief Architect NumaChip
Numascale AS - www.numascale.com
Tel: +47 92 49 25 54 Skype: spersvold