From: Martin Schwidefsky <schwidefsky@de.ibm.com>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
linux-kernel@vger.kernel.org
Cc: Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
Gerald Schaefer <gerald.schaefer@de.ibm.com>
Subject: [BUG] race of RCU vs NOHU
Date: Fri, 7 Aug 2009 15:15:29 +0200
Message-ID: <20090807151529.2806b8b4@skybase>
Hi Paul,
I analysed a dump of a hanging 2.6.30 system and found what I think is
a bug of RCU vs NOHZ. There are a number of patches on top of that
kernel but they should be independent of the bug.

The system has 4 cpus and uses classic RCU. cpus #0, #2 and #3 woke up
recently, cpu #1 has been sleeping for 5 minutes, but there is a pending
rcu batch. The timer wheel for cpu #1 is empty, so it will continue to
sleep for NEXT_TIMER_MAX_DELTA ticks.
Now if I look at the RCU data structures I find this:
rcu_ctrlblk
>> px *(struct rcu_ctrlblk *) 0x810000
struct rcu_ctrlblk {
    cur = 0xffffffffffffff99
    completed = 0xffffffffffffff98
    pending = 0xffffffffffffff99
    signaled = 0x0
    lock = spinlock_t {
        raw_lock = raw_spinlock_t {
            owner_cpu = 0x0
        }
        break_lock = 0x0
        magic = 0xdead4ead
        owner_cpu = 0xffffffff
        owner = 0xffffffffffffffff
        dep_map = struct lockdep_map {
            key = 0x810118
            class_cache = 0xcbcff0
            name = 0x63e944
            cpu = 0x0
            ip = 0x1a7f64
        }
    }
    cpumask = {
        [0] 0x2
    }
}
rcu_data cpu #0
>> px *(struct rcu_data *) 0x872f8430
struct rcu_data {
    quiescbatch = 0xffffffffffffff99
    passed_quiesc = 0x1
    qs_pending = 0x0
    batch = 0xffffffffffffff97
    nxtlist = (nil)
    nxttail = {
        [0] 0x872f8448
        [1] 0x872f8448
        [2] 0x872f8448
    }
    qlen = 0x0
    donelist = (nil)
    donetail = 0x872f8470
    blimit = 0xa
    cpu = 0x0
    barrier = struct rcu_head {
        next = (nil)
        func = 0x0
    }
}
rcu_data cpu #1
>> px *(struct rcu_data *) 0x874be430
struct rcu_data {
    quiescbatch = 0xffffffffffffff98
    passed_quiesc = 0x1
    qs_pending = 0x0
    batch = 0xffffffffffffff97
    nxtlist = (nil)
    nxttail = {
        [0] 0x874be448
        [1] 0x874be448
        [2] 0x874be448
    }
    qlen = 0x0
    donelist = (nil)
    donetail = 0x874be470
    blimit = 0xa
    cpu = 0x1
    barrier = struct rcu_head {
        next = (nil)
        func = 0x0
    }
}
rcu_data cpu #2
>> px *(struct rcu_data *) 0x87684430
struct rcu_data {
    quiescbatch = 0xffffffffffffff99
    passed_quiesc = 0x1
    qs_pending = 0x0
    batch = 0xffffffffffffff99
    nxtlist = 0xffc1fc18
    nxttail = {
        [0] 0x87684448
        [1] 0x87684448
        [2] 0xffc1fc18
    }
    qlen = 0x1
    donelist = (nil)
    donetail = 0x87684470
    blimit = 0xa
    cpu = 0x2
    barrier = struct rcu_head {
        next = (nil)
        func = 0x0
    }
}
rcu_data cpu #3
>> px *(struct rcu_data *) 0x8784a430
struct rcu_data {
    quiescbatch = 0xffffffffffffff99
    passed_quiesc = 0x1
    qs_pending = 0x0
    batch = 0xffffffffffffff63
    nxtlist = (nil)
    nxttail = {
        [0] 0x8784a448
        [1] 0x8784a448
        [2] 0x8784a448
    }
    qlen = 0x0
    donelist = (nil)
    donetail = 0x8784a470
    blimit = 0xa
    cpu = 0x3
    barrier = struct rcu_head {
        next = (nil)
        func = 0x0
    }
}
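As a reading aid for the dump above, here is a small user-space sketch (not kernel code) of the two facts that matter: a batch is still in flight because completed lags behind cur, and the cpumask word 0x2 names cpu #1 as the only cpu that has not yet reported a quiescent state. The helper names batch_before, batch_in_progress and cpu_still_owes_qs are made up for this illustration; the signed-difference comparison is the usual wraparound-safe idiom and is only assumed to match what the kernel does for these counters.

```c
#include <stdint.h>

/*
 * Wraparound-safe "a is before b" test for batch numbers.
 * Hypothetical helper: the subtraction is done unsigned, then the
 * difference is interpreted as signed, so counters near the top of
 * the 64-bit range (as in the dump) compare correctly.
 */
static int batch_before(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}

/* A batch is still in flight while completed lags behind cur. */
static int batch_in_progress(uint64_t cur, uint64_t completed)
{
	return batch_before(completed, cur);
}

/*
 * Non-zero if bit `cpu` is set in the first word of the cpumask,
 * i.e. that cpu has not yet passed through a quiescent state for
 * the current batch. Only the first 64 cpus are modelled.
 */
static int cpu_still_owes_qs(uint64_t cpumask_word, unsigned int cpu)
{
	if (cpu >= 64)
		return 0;
	return (int)((cpumask_word >> cpu) & 1u);
}
```

With the values from the dump, batch_in_progress(0xffffffffffffff99, 0xffffffffffffff98) is true and cpu_still_owes_qs(0x2, 1) is true, while the same query for cpus 0, 2 and 3 is false.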
At the time cpu #1 went to sleep rcu_needs_cpu must have answered false,
otherwise a 1 tick delay would have been programmed. rcu_pending compares
rcu_ctrlblk.cur with rcu_data.quiescbatch for cpu #1. So these two must
have been equal, otherwise rcu_needs_cpu would have answered true.

That means that the rcu_needs_cpu check was completed before
rcu_start_batch ran for batch 0xffffffffffffff99. The bit for cpu #1 is
still set in rcu_ctrlblk.cpumask, therefore the bit for cpu #1 in
nohz_cpu_mask cannot have been set at the time rcu_start_batch
completed. That gives the following race (cpu 0 is starting the batch,
cpu 1 is going to sleep):
cpu 1: tick_nohz_stop_sched_tick: rcu_needs_cpu();
cpu 0: rcu_start_batch: rcp->cur++;
cpu 0: rcu_start_batch: cpumask_andnot(to_cpumask(rcp->cpumask),
                                       cpu_online_mask, nohz_cpu_mask);
cpu 1: tick_nohz_stop_sched_tick: cpumask_set_cpu(1, nohz_cpu_mask);
The order of i) setting the bit in nohz_cpu_mask and ii) the rcu_needs_cpu()
check in tick_nohz_stop_sched_tick is wrong, no? Or did I miss some subtle
check that comes afterwards?
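The ordering question can be explored with a toy model in plain C. This is a sketch under stated assumptions, not the kernel's code: every step is treated as atomic and sequentially consistent (real code would also have to consider memory ordering), rcu_start_batch is collapsed into one step, and the back-out path that clears the nohz bit again in the swapped order is purely hypothetical. All names (struct model, cpu1_step, cpu1_stuck, count_stuck) are invented for the illustration. The model places cpu 0's batch start before, between, or after cpu 1's two steps and reports whether cpu 1 ends up asleep while still listed in the batch's cpumask.

```c
#include <stdint.h>

enum order { CHECK_THEN_SET, SET_THEN_CHECK };

#define CPU1_BIT (1u << 1)

struct model {
	uint64_t cur;		/* stands in for rcp->cur */
	uint64_t quiescbatch1;	/* cpu 1's quiescbatch */
	unsigned int online;	/* stands in for cpu_online_mask */
	unsigned int nohz;	/* stands in for nohz_cpu_mask */
	unsigned int waiting;	/* stands in for rcp->cpumask */
	int cpu1_done;		/* cpu 1 has made its decision */
	int cpu1_sleeps;	/* cpu 1 stopped its tick */
};

/* cpu 0: start a new batch and wait only for cpus not in nohz. */
static void start_batch(struct model *m)
{
	m->cur++;
	m->waiting = m->online & ~m->nohz;
}

/* Simplified check: cpu 1 is needed if it has not seen the batch. */
static int needs_cpu(const struct model *m)
{
	return m->cur != m->quiescbatch1;
}

/* One of cpu 1's two steps on its way to sleep. */
static void cpu1_step(struct model *m, enum order o, int step)
{
	if (m->cpu1_done)
		return;
	if (o == CHECK_THEN_SET) {
		if (step == 0) {
			if (needs_cpu(m))
				m->cpu1_done = 1;	/* keeps its tick */
		} else {
			m->nohz |= CPU1_BIT;
			m->cpu1_sleeps = 1;
			m->cpu1_done = 1;
		}
	} else {
		if (step == 0) {
			m->nohz |= CPU1_BIT;
		} else if (needs_cpu(m)) {
			m->nohz &= ~CPU1_BIT;	/* assumed back-out path */
			m->cpu1_done = 1;
		} else {
			m->cpu1_sleeps = 1;
			m->cpu1_done = 1;
		}
	}
}

/*
 * batch_pos is the number of cpu 1 steps that run before cpu 0
 * starts the batch (0, 1 or 2). Returns 1 if cpu 1 sleeps while
 * the batch still waits for it.
 */
static int cpu1_stuck(enum order o, int batch_pos)
{
	struct model m = {
		.cur = 0xffffffffffffff98ULL,
		.quiescbatch1 = 0xffffffffffffff98ULL,
		.online = 0xf,
	};
	int step;

	for (step = 0; step < 2; step++) {
		if (batch_pos == step)
			start_batch(&m);
		cpu1_step(&m, o, step);
	}
	if (batch_pos >= 2)
		start_batch(&m);
	return m.cpu1_sleeps && (m.waiting & CPU1_BIT) != 0;
}

/* Number of the three interleavings that leave cpu 1 stuck. */
static int count_stuck(enum order o)
{
	int pos, n = 0;

	for (pos = 0; pos <= 2; pos++)
		n += cpu1_stuck(o, pos);
	return n;
}
```

In this model, cpu1_stuck(CHECK_THEN_SET, 1) reproduces the interleaving listed above and is the only stuck case for that order, while count_stuck(SET_THEN_CHECK) finds none. That supports the suspicion about the order within the limits of the model; it says nothing about other checks the real code might perform afterwards.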
--
blue skies,
Martin.
"Reality continues to ruin my life." - Calvin.