From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: linux-kernel@vger.kernel.org, cl@linux-foundation.org,
mingo@elte.hu, akpm@linux-foundation.org,
manfred@colorfullife.com, dipankar@in.ibm.com,
josht@linux.vnet.ibm.com, schamp@sgi.com, niv@us.ibm.com,
dvhltc@us.ibm.com, ego@in.ibm.com, rostedt@goodmis.org,
peterz@infradead.org
Subject: Re: [PATCH, RFC, tip/core/rcu] v3 scalable classic RCU implementation
Date: Sat, 30 Aug 2008 07:29:35 -0700 [thread overview]
Message-ID: <20080830142935.GE7107@linux.vnet.ibm.com> (raw)
In-Reply-To: <48B919C2.1040809@cn.fujitsu.com>
On Sat, Aug 30, 2008 at 05:58:26PM +0800, Lai Jiangshan wrote:
> I just had a fast review. so my comments is nothing but cleanup.
Thank you for looking it over!!!
> Thanks, Lai.
>
> Paul E. McKenney wrote:
> > Hello!
>
> > +rcu_start_gp(struct rcu_state *rsp, unsigned long iflg)
> > + __releases(rsp->rda[smp_processor_id()]->lock)
> > +{
> > + unsigned long flags = iflg;
> > + struct rcu_data *rdp = rsp->rda[smp_processor_id()];
> > + struct rcu_node *rnp = rcu_get_root(rsp);
> > + struct rcu_node *rnp_cur;
> > + struct rcu_node *rnp_end;
> > +
> > + if (!cpu_needs_another_gp(rsp, rdp)) {
> >
> > /*
> > - * Accessing nohz_cpu_mask before incrementing rcp->cur needs a
> > - * Barrier Otherwise it can cause tickless idle CPUs to be
> > - * included in rcp->cpumask, which will extend graceperiods
> > - * unnecessarily.
> > + * Either there is no need to detect any more grace periods
> > + * at the moment, or we are already in the process of
> > + * detecting one. Either way, we should not start a new
> > + * RCU grace period, so drop the lock and return.
> > */
> > - smp_mb();
> > - cpus_andnot(rcp->cpumask, cpu_online_map, nohz_cpu_mask);
> > + spin_unlock_irqrestore(&rnp->lock, flags);
> > + return;
> > + }
> > +
> > + /* Advance to a new grace period and initialize state. */
> > +
> > + rsp->gpnum++;
> > + rsp->signaled = RCU_SIGNAL_INIT;
> > + rsp->jiffies_force_qs = jiffies + RCU_JIFFIES_TILL_FORCE_QS;
> > + record_gp_stall_check_time();
> > + dyntick_save_completed(rsp, rsp->completed - 1);
> > + note_new_gpnum(rsp, rdp);
> > +
> > + /*
> > + * Because we are first, we know that all our callbacks will
> > + * be covered by this upcoming grace period, even the ones
> > + * that were registered arbitrarily recently.
> > + */
> > +
> > + rcu_next_callbacks_are_ready(rdp);
> > + rdp->nxttail[RCU_WAIT_TAIL] = rdp->nxttail[RCU_NEXT_TAIL];
> >
> > - rcp->signaled = 0;
> > + /* Special-case the common single-level case. */
> > +
> > + if (NUM_RCU_NODES == 1) {
> > + rnp->qsmask = rnp->qsmaskinit;
>
> I tried a mask like qsmaskinit before. The system came to deadlock
> when I did on/offline cpus.
And I did need to address this.
> I didn't find out the whys for I bethought of these two problem:
>
> problem 1:
> ----race condition 1:
> <cpu_down>
> synchronize_rcu <called from offline handler in other subsystem>
> rcu_offline_cpu
>
>
> -----race condition 2:
> rcu_online_cpu
> synchronize_rcu <called from online handler in other subsystem>
> <cpu_up>
>
> in these two condition, synchronize_rcu isblocked for ever for
> synchronize_rcu have to wait a cpu in rnp->qsmask, but this
> cpu don't run.
First, only one of these race conditions can happen at a time, since
only one online/offline action can be happening at a time.
What I did to solve it was to make force_quiescent_state() check
to see if the CPU currently blocking the grace period is offline.
(The actual checking for offline is in rcu_implicit_offline_qs(),
which is called indirectly from force_quiescent_state().)
So when this race occurs, it is taken care of within three jiffies.
This happened -many- times during my most recent test ("of=" in the
rcudata trace).
> problem 2:
> we need call rcu_offline_cpu() in these two cases in rcu_cpu_notify()
> since qsmaskinit had changed by rcu_online_cpu()
>
> case CPU_UP_CANCELED:
> case CPU_UP_CANCELED_FROZEN:
Good catch!!! Fixed. The current code would work in this case, but grace
periods would be unnecessarily extended until force_quiescent_state()
got a chance to clean things up. So very good to fix this one.
> > +static void
> > +cpu_quiet(int cpu, struct rcu_state *rsp, struct rcu_data *rdp, long *lastcomp)
> > {
> > unsigned long flags;
> > + long mask;
>
> long mask -> unsigned long mask
Good eyes! Fixed.
> > +static void __rcu_offline_cpu(int cpu, struct rcu_state *rsp)
> > {
> > - if (list) {
> > - local_irq_disable();
> > - this_rdp->batch = batch;
> > - *this_rdp->nxttail[2] = list;
> > - this_rdp->nxttail[2] = tail;
> > - local_irq_enable();
> > + int i;
> > + unsigned long flags;
> > + long mask;
>
> long mask -> unsigned long mask
Here too!
> > + * Queue an RCU callback for invocation after a grace period.
> > + */
> > +void call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *rcu))
> > +{
> > + unsigned long flags;
> > +
> > + head->func = func;
> > + head->next = NULL;
> > + local_irq_save(flags);
> > + __call_rcu(head, &rcu_state, &__get_cpu_var(rcu_data));
> > + local_irq_restore(flags);
> > +}
>
> struct rcu_state has a field: struct rcu_data *rda[NR_CPUS]
> so we can move these lines around __call_rcu into __call_rcu.
>
> __call_rcu(struct rcu_head *head, struct rcu_state *rsp)
> {
> local_irq_save(flags);
> struct rcu_data *rdp = rsp->rda[smp_processor_id()];
> .....
> local_irq_save(flags);
> }
Very good point!!! And then call_rcu() and call_rcu_bh() become
one-liners. ;-)
> > +static void
> > +rcu_init_percpu_data(int cpu, struct rcu_state *rsp)
> > {
> > - if (user ||
> > - (idle_cpu(cpu) && !in_softirq() &&
> > - hardirq_count() <= (1 << HARDIRQ_SHIFT))) {
> > -
> > - /*
> > - * Get here if this CPU took its interrupt from user
> > - * mode or from the idle loop, and if this is not a
> > - * nested interrupt. In this case, the CPU is in
> > - * a quiescent state, so count it.
> > - *
> > - * Also do a memory barrier. This is needed to handle
> > - * the case where writes from a preempt-disable section
> > - * of code get reordered into schedule() by this CPU's
> > - * write buffer. The memory barrier makes sure that
> > - * the rcu_qsctr_inc() and rcu_bh_qsctr_inc() are see
> > - * by other CPUs to happen after any such write.
> > - */
> > + unsigned long flags;
> > + int i;
> > + long mask;
>
> long mask -> unsigned long mask
And again! ;-) Very good eyes!!!
> > +
> > +/*
> > + * Helper function for rcu_init() that initializes one rcu_state structure.
> > + */
> > +static void __init rcu_init_one(struct rcu_state *rsp)
> > +{
> > + int i;
> > + int j;
> > + struct rcu_node *rnp;
> > +
> > + /* Initialize the level-tracking arrays. */
> > +
> > + for (i = 1; i < NUM_RCU_LVLS; i++) {
> > + rsp->level[i] = rsp->level[i - 1] + rsp->levelcnt[i - 1];
> > + }
> > + rcu_init_levelspread(rsp);
> > +
> > + /* Initialize the elements themselves, starting from the leaves. */
> > +
> > + for (i = NUM_RCU_LVLS - 1; i >= 0; i--) {
> > + rnp = rsp->level[i];
> > + for (j = 0; j < rsp->levelcnt[i]; j++, rnp++) {
> > + spin_lock_init(&rnp->lock);
> > + rnp->qsmask = 0;
> > + rnp->grplo = j * rsp->levelspread[i];
> > + rnp->grphi = (j + 1) * rsp->levelspread[i] - 1;
> > + if (rnp->grphi >= rsp->levelcnt[i + 1])
> > + rnp->grphi = rsp->levelcnt[i + 1] - 1;
> > + rnp->qsmaskinit = 0;
>
> if no other reason, I will init fields with the order as they are declared.
Good point, moved.
> > + if (i != NUM_RCU_LVLS - 1)
> > + rnp->grplo = rnp->grphi = 0;
> > + if (i == 0) {
> > + rnp->grpnum = 0;
> > + rnp->parent = NULL;
> > + } else {
> > + rnp->grpnum = j % rsp->levelspread[i - 1];
> > + rnp->parent = rsp->level[i - 1] +
> > + j / rsp->levelspread[i - 1];
> > + }
> > + rnp->level = i;
> > + }
> > + }
> > +}
> > +
> > +/*
> > + * Helper macro for rcu_init(). To be used nowhere else!
>
> rcu_init -> __rcu_init
Good catch, fixed.
> > + * Assigns leaf node pointers into each CPU's rcu_data structure.
> > + */
> > +#define RCU_DATA_PTR_INIT(rsp, rcu_data) \
> > +do { \
> > + rnp = (rsp)->level[NUM_RCU_LVLS - 1]; \
> > + j = 0; \
> > + for_each_possible_cpu(i) { \
> > + if (i > rnp[j].grphi) \
> > + j++; \
> > + per_cpu(rcu_data, i).mynode = &rnp[j]; \
> > + (rsp)->rda[i] = &per_cpu(rcu_data, i); \
> > + } \
> > +} while (0)
> > +
> > static struct notifier_block __cpuinitdata rcu_nb = {
> > .notifier_call = rcu_cpu_notify,
> > };
> >
>
next prev parent reply other threads:[~2008-08-30 14:29 UTC|newest]
Thread overview: 94+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-08-21 23:43 [PATCH, RFC, tip/core/rcu] scalable classic RCU implementation Paul E. McKenney
2008-08-22 4:37 ` Ingo Molnar
2008-08-22 13:47 ` Paul E. McKenney
2008-08-22 17:22 ` Paul E. McKenney
2008-08-22 18:16 ` Josh Triplett
2008-08-23 16:07 ` Ingo Molnar
2008-08-24 2:44 ` Paul E. McKenney
2008-08-22 23:29 ` Josh Triplett
2008-08-23 1:53 ` Paul E. McKenney
2008-08-25 22:02 ` Josh Triplett
2008-08-26 16:05 ` Paul E. McKenney
2008-08-27 0:38 ` Josh Triplett
2008-08-27 18:34 ` Paul E. McKenney
2008-08-27 20:23 ` Josh Triplett
2008-08-27 20:41 ` Paul E. McKenney
2008-08-25 10:34 ` Peter Zijlstra
2008-08-25 15:16 ` Paul E. McKenney
2008-08-25 15:26 ` Peter Zijlstra
2008-08-27 18:28 ` Paul E. McKenney
2008-08-24 8:08 ` Manfred Spraul
2008-08-24 16:32 ` Paul E. McKenney
2008-08-24 18:25 ` Manfred Spraul
2008-08-24 21:19 ` Paul E. McKenney
2008-08-25 0:07 ` [PATCH, RFC, tip/core/rcu] v2 " Paul E. McKenney
2008-08-30 0:49 ` [PATCH, RFC, tip/core/rcu] v3 " Paul E. McKenney
2008-08-30 9:33 ` Peter Zijlstra
2008-08-30 14:10 ` Paul E. McKenney
2008-08-30 15:40 ` Peter Zijlstra
2008-08-30 19:38 ` Paul E. McKenney
2008-09-02 13:26 ` Mathieu Desnoyers
2008-09-02 13:41 ` Peter Zijlstra
2008-09-02 14:55 ` Paul E. McKenney
2008-08-30 9:58 ` Lai Jiangshan
2008-08-30 13:32 ` Manfred Spraul
2008-08-30 14:34 ` Paul E. McKenney
2008-08-31 10:58 ` Manfred Spraul
2008-08-31 17:20 ` Paul E. McKenney
2008-08-31 17:45 ` Manfred Spraul
2008-08-31 17:55 ` Paul E. McKenney
2008-08-31 18:18 ` Manfred Spraul
2008-08-31 19:23 ` Paul E. McKenney
2008-08-30 14:29 ` Paul E. McKenney [this message]
2008-09-01 9:38 ` Andi Kleen
2008-09-02 1:05 ` Paul E. McKenney
2008-09-02 6:18 ` Andi Kleen
2008-09-05 15:29 ` [PATCH, RFC] v4 " Paul E. McKenney
2008-09-05 19:33 ` Andrew Morton
2008-09-05 23:04 ` Paul E. McKenney
2008-09-05 23:52 ` Andrew Morton
2008-09-06 4:16 ` Paul E. McKenney
2008-09-06 16:37 ` Manfred Spraul
2008-09-07 17:25 ` Paul E. McKenney
2008-09-07 10:18 ` [RFC, PATCH] Add a CPU_STARTING notifier (was: Re: [PATCH, RFC] v4 scalable classic RCU implementation) Manfred Spraul
2008-09-07 11:07 ` Andi Kleen
2008-09-07 19:46 ` Paul E. McKenney
2008-09-15 16:02 ` [PATCH, RFC] v4 scalable classic RCU implementation Paul E. McKenney
2008-09-16 16:52 ` Manfred Spraul
2008-09-16 17:30 ` Paul E. McKenney
2008-09-16 17:48 ` Manfred Spraul
2008-09-16 18:22 ` Paul E. McKenney
2008-09-21 11:09 ` Manfred Spraul
2008-09-21 21:14 ` Paul E. McKenney
2008-09-23 23:53 ` [PATCH, RFC] v6 " Paul E. McKenney
2008-09-25 7:26 ` Ingo Molnar
2008-09-25 14:05 ` Paul E. McKenney
2008-09-25 7:29 ` Ingo Molnar
2008-09-25 14:18 ` Paul E. McKenney
2008-10-10 16:09 ` [PATCH, RFC] v7 " Paul E. McKenney
2008-10-12 15:52 ` Manfred Spraul
2008-10-12 22:46 ` Paul E. McKenney
2008-10-13 18:03 ` Manfred Spraul
2008-10-15 1:11 ` Paul E. McKenney
2008-10-15 8:13 ` Manfred Spraul
2008-10-15 15:26 ` Paul E. McKenney
2008-10-22 18:41 ` Manfred Spraul
2008-10-22 21:02 ` Paul E. McKenney
2008-10-22 21:24 ` Manfred Spraul
2008-10-27 16:45 ` Paul E. McKenney
2008-10-27 19:48 ` Manfred Spraul
2008-10-27 23:52 ` Paul E. McKenney
2008-10-28 5:30 ` Manfred Spraul
2008-10-28 15:17 ` Paul E. McKenney
2008-10-28 17:21 ` Manfred Spraul
2008-10-28 17:35 ` Paul E. McKenney
2008-10-17 8:34 ` Gautham R Shenoy
2008-10-17 15:35 ` Gautham R Shenoy
2008-10-17 15:46 ` Paul E. McKenney
2008-10-17 15:43 ` Paul E. McKenney
2008-12-08 18:42 ` Paul E. McKenney
2008-11-02 20:10 ` Manfred Spraul
2008-11-03 20:33 ` Paul E. McKenney
2008-11-05 19:48 ` Manfred Spraul
2008-11-05 21:27 ` Paul E. McKenney
2008-11-15 23:20 ` [PATCH, RFC] v8 " Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080830142935.GE7107@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=akpm@linux-foundation.org \
--cc=cl@linux-foundation.org \
--cc=dipankar@in.ibm.com \
--cc=dvhltc@us.ibm.com \
--cc=ego@in.ibm.com \
--cc=josht@linux.vnet.ibm.com \
--cc=laijs@cn.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=manfred@colorfullife.com \
--cc=mingo@elte.hu \
--cc=niv@us.ibm.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=schamp@sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.