public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	linux-kernel@vger.kernel.org, linux-rt-users@vger.kernel.org,
	mingo@elte.hu, akpm@linux-foundation.org, dipankar@in.ibm.com,
	josht@linux.vnet.ibm.com, tytso@us.ibm.com, dvhltc@us.ibm.com,
	tglx@linutronix.de, bunk@kernel.org, ego@in.ibm.com,
	oleg@tv-sign.ru
Subject: Re: [PATCH RFC 3/9] RCU: Preemptible RCU
Date: Fri, 21 Sep 2007 17:46:53 +0200	[thread overview]
Message-ID: <20070921174653.64c15a92@twins> (raw)
In-Reply-To: <20070921144003.GC15697@goodmis.org>

On Fri, 21 Sep 2007 10:40:03 -0400 Steven Rostedt <rostedt@goodmis.org>
wrote:

> On Mon, Sep 10, 2007 at 11:34:12AM -0700, Paul E. McKenney wrote:


> Can you have a pointer somewhere that explains these states. And not a
> "it's in this paper or directory". Either have a short discription here,
> or specify where exactly to find the information (perhaps a
> Documentation/RCU/preemptible_states.txt?).
> 
> Trying to understand these states has caused me the most agony in
> reviewing these patches.
> 
> > + */
> > +
> > +enum rcu_try_flip_states {
> > +	rcu_try_flip_idle_state,	/* "I" */
> > +	rcu_try_flip_waitack_state, 	/* "A" */
> > +	rcu_try_flip_waitzero_state,	/* "Z" */
> > +	rcu_try_flip_waitmb_state	/* "M" */
> > +};

I thought the 4 flip states corresponded to the 4 GP stages, but now
you confused me. It seems to indeed progress one stage for every 4 flip
states.

Hmm, now I have to puzzle how these 4 stages are required by the lock
and unlock magic.

> > +/*
> > + * Return the number of RCU batches processed thus far.  Useful for debug
> > + * and statistics.  The _bh variant is identical to straight RCU.
> > + */
> 
> If they are identical, then why the separation?

I guess a smaller RCU domain makes for quicker grace periods.
 
> > +void __rcu_read_lock(void)
> > +{
> > +	int idx;
> > +	struct task_struct *me = current;
> 
> Nitpick, but other places in the kernel usually use "t" or "p" as a
> variable to assign current to.  It's just that "me" thows me off a
> little while reviewing this.  But this is just a nitpick, so do as you
> will.

struct task_struct *curr = current;

is also not uncommon.
 
> > +	int nesting;
> > +
> > +	nesting = ORDERED_WRT_IRQ(me->rcu_read_lock_nesting);
> > +	if (nesting != 0) {
> > +
> > +		/* An earlier rcu_read_lock() covers us, just count it. */
> > +
> > +		me->rcu_read_lock_nesting = nesting + 1;
> > +
> > +	} else {
> > +		unsigned long oldirq;
> 
> > +
> > +		/*
> > +		 * Disable local interrupts to prevent the grace-period
> > +		 * detection state machine from seeing us half-done.
> > +		 * NMIs can still occur, of course, and might themselves
> > +		 * contain rcu_read_lock().
> > +		 */
> > +
> > +		local_irq_save(oldirq);
> 
> Isn't the GP detection done via a tasklet/softirq. So wouldn't a
> local_bh_disable be sufficient here? You already cover NMIs, which would
> also handle normal interrupts.

This is also my understanding, but I think this disable is an
'optimization' in that it avoids the regular IRQs from jumping through
these hoops outlined below.

> > +
> > +		/*
> > +		 * Outermost nesting of rcu_read_lock(), so increment
> > +		 * the current counter for the current CPU.  Use volatile
> > +		 * casts to prevent the compiler from reordering.
> > +		 */
> > +
> > +		idx = ORDERED_WRT_IRQ(rcu_ctrlblk.completed) & 0x1;
> > +		smp_read_barrier_depends();  /* @@@@ might be unneeded */
> > +		ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])++;
> > +
> > +		/*
> > +		 * Now that the per-CPU counter has been incremented, we
> > +		 * are protected from races with rcu_read_lock() invoked
> > +		 * from NMI handlers on this CPU.  We can therefore safely
> > +		 * increment the nesting counter, relieving further NMIs
> > +		 * of the need to increment the per-CPU counter.
> > +		 */
> > +
> > +		ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1;
> > +
> > +		/*
> > +		 * Now that we have preventing any NMIs from storing
> > +		 * to the ->rcu_flipctr_idx, we can safely use it to
> > +		 * remember which counter to decrement in the matching
> > +		 * rcu_read_unlock().
> > +		 */
> > +
> > +		ORDERED_WRT_IRQ(me->rcu_flipctr_idx) = idx;
> > +		local_irq_restore(oldirq);
> > +	}
> > +}

> > +/*
> > + * Attempt a single flip of the counters.  Remember, a single flip does
> > + * -not- constitute a grace period.  Instead, the interval between
> > + * at least three consecutive flips is a grace period.
> > + *
> > + * If anyone is nuts enough to run this CONFIG_PREEMPT_RCU implementation
> 
> Oh, come now! It's not "nuts" to use this ;-)
> 
> > + * on a large SMP, they might want to use a hierarchical organization of
> > + * the per-CPU-counter pairs.
> > + */

Its the large SMP case that's nuts, and on that I have to agree with
Paul, its not really large SMP friendly.



  reply	other threads:[~2007-09-21 15:47 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-09-10 18:30 [PATCH RFC 0/9] RCU: Preemptible RCU Paul E. McKenney
2007-09-10 18:32 ` [PATCH RFC 1/9] RCU: Split API to permit multiple RCU implementations Paul E. McKenney
2007-09-21  4:14   ` Steven Rostedt
2007-09-10 18:33 ` [PATCH RFC 2/9] RCU: Fix barriers Paul E. McKenney
2007-09-10 18:34 ` [PATCH RFC 3/9] RCU: Preemptible RCU Paul E. McKenney
2007-09-21  4:17   ` Steven Rostedt
2007-09-21  5:50     ` Paul E. McKenney
2007-09-21  5:56     ` Dipankar Sarma
2007-09-21 14:40   ` Steven Rostedt
2007-09-21 15:46     ` Peter Zijlstra [this message]
2007-09-21 22:06       ` Paul E. McKenney
2007-09-21 22:31       ` Steven Rostedt
2007-09-21 22:44         ` Paul E. McKenney
2007-09-21 23:23           ` Steven Rostedt
2007-09-21 23:44             ` Paul E. McKenney
2007-09-22  0:26     ` Paul E. McKenney
2007-09-22  1:15       ` Steven Rostedt
2007-09-22  1:53         ` Paul E. McKenney
2007-09-22  3:15           ` Steven Rostedt
2007-09-22  4:07             ` Paul E. McKenney
2007-09-21 15:20   ` Steven Rostedt
2007-09-21 23:03     ` Paul E. McKenney
2007-09-22  0:32       ` Paul E. McKenney
2007-09-22  1:19         ` Steven Rostedt
2007-09-22  1:43           ` Paul E. McKenney
2007-09-22  2:56             ` Steven Rostedt
2007-09-22  4:10               ` Paul E. McKenney
2007-09-23 17:38   ` Oleg Nesterov
2007-09-24  0:15     ` Paul E. McKenney
2007-09-26 15:13       ` Oleg Nesterov
2007-09-27 15:46         ` Paul E. McKenney
2007-09-28 14:47           ` Oleg Nesterov
2007-09-28 18:57             ` Paul E. McKenney
2007-09-30 16:31               ` Oleg Nesterov
2007-09-30 23:02                 ` Davide Libenzi
2007-10-01  1:37                   ` Paul E. McKenney
2007-10-01 18:44                     ` Davide Libenzi
2007-10-01 19:21                       ` Paul E. McKenney
2007-10-01 22:09                         ` Davide Libenzi
2007-10-01 22:24                           ` Paul E. McKenney
2007-10-02 18:02                     ` Oleg Nesterov
2007-10-01  1:20                 ` Paul E. McKenney
2007-09-10 18:35 ` [PATCH RFC 4/9] RCU: synchronize_sched() workaround for CPU hotplug Paul E. McKenney
2007-09-10 18:36 ` [PATCH RFC 5/9] RCU: CPU hotplug support for preemptible RCU Paul E. McKenney
2007-09-30 16:38   ` Oleg Nesterov
2007-10-01  1:41     ` Paul E. McKenney
2007-09-10 18:39 ` [PATCH RFC 6/9] RCU priority boosting " Paul E. McKenney
2007-09-28 22:56   ` Gautham R Shenoy
2007-09-28 23:05     ` Steven Rostedt
2007-09-30  3:11       ` Paul E. McKenney
2007-10-05 11:46   ` Gautham R Shenoy
2007-10-05 12:24     ` Steven Rostedt
2007-10-05 13:21       ` Gautham R Shenoy
2007-10-05 14:07         ` Paul E. McKenney
2007-09-10 18:39 ` [PATCH RFC 7/9] RCU: rcutorture testing for RCU priority boosting Paul E. McKenney
2007-09-10 18:41 ` [PATCH RFC 8/9] RCU: Make RCU priority boosting consume less power Paul E. McKenney
2007-09-10 18:42 ` [PATCH RFC 9/9] RCU: preemptible documentation and comment cleanups Paul E. McKenney
2007-09-10 18:44 ` [PATCH RFC 0/9] RCU: Preemptible RCU Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070921174653.64c15a92@twins \
    --to=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=bunk@kernel.org \
    --cc=dipankar@in.ibm.com \
    --cc=dvhltc@us.ibm.com \
    --cc=ego@in.ibm.com \
    --cc=josht@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=oleg@tv-sign.ru \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=tytso@us.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox