From: Lai Jiangshan <laijs@cn.fujitsu.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Peter Zijlstra <peterz@infradead.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
Date: Thu, 19 Mar 2009 11:06:39 +0800 [thread overview]
Message-ID: <49C1B6BF.5090702@cn.fujitsu.com> (raw)
In-Reply-To: <20090308160005.GE19658@elte.hu>
Ingo Molnar wrote:
> * Lai Jiangshan <laijs@cn.fujitsu.com> wrote:
>
>> [RFC]
>> I don't like this patch, but I thought for some days and I can't
>> thought out a better one.
>
> Interesting find. Found via code review or via testing? If via
> testing, what is the symptom of the bug when it hits - did you
> see CPU hotplug stress-tests hanging? Crashing too perhaps? How
> frequently did it occur?
I found this bug when I tested the draft version of kfree_rcu(V3).
I noticed kfree_rcu_cpu_notify() is called earlier than
rcu_cpu_notify(). This means rcu_barrier() is called earlier than
RCU callbacks migration, it should lockup as expectation. But actually,
this lockup can not occurred, I tried to explore it, and I found that
rcu_barrier() does not handle cpu_hotplug. It includes two bugs.
kfree_rcu(V3) (V4 is available too, it will be sent soon):
http://lkml.org/lkml/2009/3/6/156
The V1 fix of this bug:
http://lkml.org/lkml/2009/3/7/38
The fix of the other bug: (it changed the scheduler's code too)
http://lkml.org/lkml/2009/3/7/39
Subject: [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu (V2)
cpu hotplug may be happened asynchronously, some rcu callbacks are maybe
still in dead cpu, rcu_barrier() also needs to wait for these rcu callbacks
to complete, so we must ensure callbacks in dead cpu are migrated to
online cpu.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index cae8a05..2c7b845 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -122,6 +122,8 @@ static void rcu_barrier_func(void *type)
}
}
+static inline void wait_migrated_callbacks(void);
+
/*
* Orchestrate the specified type of RCU barrier, waiting for all
* RCU callbacks of the specified type to complete.
@@ -147,6 +149,7 @@ static void _rcu_barrier(enum rcu_barrier type)
complete(&rcu_barrier_completion);
wait_for_completion(&rcu_barrier_completion);
mutex_unlock(&rcu_barrier_mutex);
+ wait_migrated_callbacks();
}
/**
@@ -176,9 +179,50 @@ void rcu_barrier_sched(void)
}
EXPORT_SYMBOL_GPL(rcu_barrier_sched);
+static atomic_t rcu_migrate_type_count = ATOMIC_INIT(0);
+static struct rcu_head rcu_migrate_head[3];
+static DECLARE_WAIT_QUEUE_HEAD(rcu_migrate_wq);
+
+static void rcu_migrate_callback(struct rcu_head *notused)
+{
+ if (atomic_dec_and_test(&rcu_migrate_type_count))
+ wake_up(&rcu_migrate_wq);
+}
+
+static inline void wait_migrated_callbacks(void)
+{
+ wait_event(rcu_migrate_wq, !atomic_read(&rcu_migrate_type_count));
+}
+
+static int __cpuinit rcu_barrier_cpu_hotplug(struct notifier_block *self,
+ unsigned long action, void *hcpu)
+{
+ if (action == CPU_DYING) {
+ /*
+ * preempt_disable() in on_each_cpu() prevents stop_machine(),
+ * so when "on_each_cpu(rcu_barrier_func, (void *)type, 1);"
+ * returns, all online cpus have queued rcu_barrier_func(),
+ * and the dead cpu(if it exist) queues rcu_migrate_callback()s.
+ *
+ * These callbacks ensure _rcu_barrier() waits for all
+ * RCU callbacks of the specified type to complete.
+ */
+ atomic_set(&rcu_migrate_type_count, 3);
+ call_rcu_bh(rcu_migrate_head, rcu_migrate_callback);
+ call_rcu_sched(rcu_migrate_head + 1, rcu_migrate_callback);
+ call_rcu(rcu_migrate_head + 2, rcu_migrate_callback);
+ } else if (action == CPU_POST_DEAD) {
+ /* rcu_migrate_head is protected by cpu_add_remove_lock */
+ wait_migrated_callbacks();
+ }
+
+ return NOTIFY_OK;
+}
+
void __init rcu_init(void)
{
__rcu_init();
+ hotcpu_notifier(rcu_barrier_cpu_hotplug, 0);
}
void rcu_scheduler_starting(void)
next prev parent reply other threads:[~2009-03-19 3:08 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-03-07 10:54 [PATCH] rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu Lai Jiangshan
2009-03-07 17:29 ` Paul E. McKenney
2009-03-08 2:58 ` Lai Jiangshan
2009-03-08 6:20 ` Paul E. McKenney
2009-03-09 2:56 ` Lai Jiangshan
2009-03-09 4:28 ` Paul E. McKenney
2009-03-08 16:00 ` Ingo Molnar
2009-03-19 3:06 ` Lai Jiangshan [this message]
2009-03-19 4:05 ` Paul E. McKenney
[not found] ` <20090319082237.GA32179@elte.hu>
2009-03-20 9:40 ` Lai Jiangshan
2009-03-20 20:00 ` [tip:core/rcu] rcu: " Lai Jiangshan
2009-03-30 22:12 ` Lai Jiangshan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=49C1B6BF.5090702@cn.fujitsu.com \
--to=laijs@cn.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=paulmck@linux.vnet.ibm.com \
--cc=peterz@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox