From: Martin Schwidefsky <schwidefsky@de.ibm.com>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
linux-kernel@vger.kernel.org
Cc: Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
Gerald Schaefer <gerald.schaefer@de.ibm.com>
Subject: [BUG] race of RCU vs NOHU
Date: Fri, 7 Aug 2009 15:15:29 +0200
Message-ID: <20090807151529.2806b8b4@skybase>
Hi Paul,
I analysed a dump of a hanging 2.6.30 system and found what I think is
a bug of RCU vs NOHZ. There are a number of patches on top of that
kernel but they should be independent of the bug.
The system has 4 cpus and uses classic RCU. cpus #0, #2 and #3 woke up
recently, cpu #1 has been sleeping for 5 minutes, but there is a pending
rcu batch. The timer wheel for cpu #1 is empty, it will continue to
sleep for NEXT_TIMER_MAX_DELTA ticks.
Now if I look at the RCU data structures I find this:
rcu_ctrlblk
>> px *(struct rcu_ctrlblk *) 0x810000
struct rcu_ctrlblk {
cur = 0xffffffffffffff99
completed = 0xffffffffffffff98
pending = 0xffffffffffffff99
signaled = 0x0
lock = spinlock_t {
raw_lock = raw_spinlock_t {
owner_cpu = 0x0
}
break_lock = 0x0
magic = 0xdead4ead
owner_cpu = 0xffffffff
owner = 0xffffffffffffffff
dep_map = struct lockdep_map {
key = 0x810118
class_cache = 0xcbcff0
name = 0x63e944
cpu = 0x0
ip = 0x1a7f64
}
}
cpumask = {
[0] 0x2
}
}
rcu_data cpu #0
>> px *(struct rcu_data *) 0x872f8430
struct rcu_data {
quiescbatch = 0xffffffffffffff99
passed_quiesc = 0x1
qs_pending = 0x0
batch = 0xffffffffffffff97
nxtlist = (nil)
nxttail = {
[0] 0x872f8448
[1] 0x872f8448
[2] 0x872f8448
}
qlen = 0x0
donelist = (nil)
donetail = 0x872f8470
blimit = 0xa
cpu = 0x0
barrier = struct rcu_head {
next = (nil)
func = 0x0
}
}
rcu_data cpu #1
>> px *(struct rcu_data *) 0x874be430
struct rcu_data {
quiescbatch = 0xffffffffffffff98
passed_quiesc = 0x1
qs_pending = 0x0
batch = 0xffffffffffffff97
nxtlist = (nil)
nxttail = {
[0] 0x874be448
[1] 0x874be448
[2] 0x874be448
}
qlen = 0x0
donelist = (nil)
donetail = 0x874be470
blimit = 0xa
cpu = 0x1
barrier = struct rcu_head {
next = (nil)
func = 0x0
}
}
rcu_data cpu #2
>> px *(struct rcu_data *) 0x87684430
struct rcu_data {
quiescbatch = 0xffffffffffffff99
passed_quiesc = 0x1
qs_pending = 0x0
batch = 0xffffffffffffff99
nxtlist = 0xffc1fc18
nxttail = {
[0] 0x87684448
[1] 0x87684448
[2] 0xffc1fc18
}
qlen = 0x1
donelist = (nil)
donetail = 0x87684470
blimit = 0xa
cpu = 0x2
barrier = struct rcu_head {
next = (nil)
func = 0x0
}
}
rcu_data cpu #3
>> px *(struct rcu_data *) 0x8784a430
struct rcu_data {
quiescbatch = 0xffffffffffffff99
passed_quiesc = 0x1
qs_pending = 0x0
batch = 0xffffffffffffff63
nxtlist = (nil)
nxttail = {
[0] 0x8784a448
[1] 0x8784a448
[2] 0x8784a448
}
qlen = 0x0
donelist = (nil)
donetail = 0x8784a470
blimit = 0xa
cpu = 0x3
barrier = struct rcu_head {
next = (nil)
func = 0x0
}
}
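The fields that matter for the analysis can be pulled out of the dump
with a few lines of C. This is only a sketch for reading the numbers:
the constants are copied from the dump above, while the helper names
(batch_before, awaited, behind, print_report) are made up for this
illustration and are not kernel functions. The signed-difference
comparison is there because the batch counters are close to wrapping.

```c
#include <stdint.h>
#include <stdio.h>

#define NR_DUMPED_CPUS 4

/* Values copied from the rcu_ctrlblk dump. */
const uint64_t dump_cur       = 0xffffffffffffff99ULL;
const uint64_t dump_completed = 0xffffffffffffff98ULL;
const uint64_t dump_cpumask   = 0x2;

/* rcu_data.quiescbatch for cpus #0..#3, copied from the dump. */
const uint64_t dump_quiescbatch[NR_DUMPED_CPUS] = {
    0xffffffffffffff99ULL,
    0xffffffffffffff98ULL,
    0xffffffffffffff99ULL,
    0xffffffffffffff99ULL,
};

/* Hypothetical helper: is batch a before batch b? The signed
 * difference keeps the answer correct across counter wraparound. */
int batch_before(uint64_t a, uint64_t b)
{
    return (int64_t)(a - b) < 0;
}

/* Hypothetical helper: is this cpu's bit still set in cpumask, i.e.
 * does the current batch still wait for a quiescent state from it? */
int awaited(int cpu)
{
    return (int)((dump_cpumask >> cpu) & 1);
}

/* Hypothetical helper: has this cpu not yet caught up with cur? */
int behind(int cpu)
{
    return dump_quiescbatch[cpu] != dump_cur;
}

/* Print one summary line per cpu. */
void print_report(void)
{
    printf("batch in progress: %s\n",
           batch_before(dump_completed, dump_cur) ? "yes" : "no");
    for (int cpu = 0; cpu < NR_DUMPED_CPUS; cpu++)
        printf("cpu #%d: awaited=%d behind=%d\n",
               cpu, awaited(cpu), behind(cpu));
}
```

Calling print_report() from a main() lists, for each cpu, whether the
batch still waits for it and whether its quiescbatch lags behind cur.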
At the time cpu #1 went to sleep, rcu_needs_cpu must have answered false;
otherwise a 1 tick delay would have been programmed. rcu_pending compares
rcu_ctrlblk.cur with rcu_data.quiescbatch for cpu #1, so these two must
have been equal, otherwise rcu_needs_cpu would have answered true.
That means that the rcu_needs_cpu check was completed before
rcu_start_batch ran for batch 0xffffffffffffff99. The bit for cpu #1 is
still set in rcu_ctrlblk.cpumask, therefore the bit for cpu #1
in nohz_cpu_mask cannot have been set by the time rcu_start_batch
completed. That gives the following race (cpu 0 is starting the batch,
cpu 1 is going to sleep):
cpu 1: tick_nohz_stop_sched_tick: rcu_needs_cpu();
cpu 0: rcu_start_batch: rcp->cur++;
cpu 0: rcu_start_batch: cpumask_andnot(to_cpumask(rcp->cpumask),
                                       cpu_online_mask, nohz_cpu_mask);
cpu 1: tick_nohz_stop_sched_tick: cpumask_set_cpu(1, nohz_cpu_mask);
The order of i) setting the bit in nohz_cpu_mask and ii) the rcu_needs_cpu()
check in tick_nohz_stop_sched_tick is wrong, no? Or did I miss some subtle
check that comes afterwards?
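To make the ordering question concrete, here is a small single-threaded
model that replays the idle entry of cpu 1 with the batch start placed
before, between or after its two steps. It is a sketch under strong
assumptions: needs_cpu, start_batch and replay are hypothetical
stand-ins that keep only the cur/quiescbatch comparison and the cpumask
computation, and locking, memory barriers and the other rcu_pending
conditions are ignored. The SET_THEN_CHECK variant, which clears the bit
again when RCU still needs the cpu, is one possible reordering for
illustration and not a claim about how the kernel should be changed.

```c
#include <stdbool.h>

#define NR_CPUS_MODEL 4
#define IDLE_CPU 1   /* the cpu entering nohz idle, as in the report */

enum order { CHECK_THEN_SET, SET_THEN_CHECK };

struct model {
    long cur;                          /* models rcu_ctrlblk.cur */
    long quiescbatch[NR_CPUS_MODEL];   /* models rcu_data.quiescbatch */
    unsigned long cpumask;             /* cpus the batch still waits for */
    unsigned long nohz_cpu_mask;       /* cpus with a stopped tick */
    bool tick_stopped[NR_CPUS_MODEL];
};

/* Stand-in for rcu_needs_cpu(): only the quiescbatch vs cur test. */
bool needs_cpu(const struct model *m, int cpu)
{
    return m->quiescbatch[cpu] != m->cur;
}

/* Stand-in for rcu_start_batch(): bump cur, wait for non-nohz cpus. */
void start_batch(struct model *m)
{
    const unsigned long online = (1UL << NR_CPUS_MODEL) - 1;

    m->cur++;
    m->cpumask = online & ~m->nohz_cpu_mask;
}

/*
 * Replay the idle entry of IDLE_CPU. batch_at selects where the batch
 * starts: 0 = before both steps, 1 = between them, 2 = after both.
 * Returns the cpumask left over once all cpus with a running tick
 * have reported a quiescent state; non-zero means the batch is stuck.
 */
unsigned long replay(enum order order, int batch_at)
{
    struct model m = { .cur = 1, .quiescbatch = {1, 1, 1, 1} };
    const unsigned long bit = 1UL << IDLE_CPU;
    bool needed = false;

    if (batch_at == 0)
        start_batch(&m);

    /* First step of the idle entry. */
    if (order == CHECK_THEN_SET)
        needed = needs_cpu(&m, IDLE_CPU);
    else
        m.nohz_cpu_mask |= bit;

    if (batch_at == 1)
        start_batch(&m);

    /* Second step of the idle entry. */
    if (order == CHECK_THEN_SET) {
        if (!needed)
            m.nohz_cpu_mask |= bit;
    } else {
        needed = needs_cpu(&m, IDLE_CPU);
        if (needed)
            m.nohz_cpu_mask &= ~bit;   /* back out, keep the tick */
    }
    m.tick_stopped[IDLE_CPU] = !needed;

    if (batch_at == 2)
        start_batch(&m);

    /* Every cpu whose tick still runs reports a quiescent state. */
    for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
        if (!m.tick_stopped[cpu]) {
            m.quiescbatch[cpu] = m.cur;
            m.cpumask &= ~(1UL << cpu);
        }
    }
    return m.cpumask;
}
```

In this model replay(CHECK_THEN_SET, 1) is the interleaving listed
above: it leaves only the bit for cpu 1 set in cpumask, as in the dump,
while the tick of cpu 1 is stopped. The other placements of the batch
start, and all three placements with SET_THEN_CHECK, leave cpumask
empty.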
--
blue skies,
Martin.
"Reality continues to ruin my life." - Calvin.