From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Vegard Nossum <vegard.nossum@gmail.com>
Cc: linux-kernel@vger.kernel.org, mingo@elte.hu, stable@kernel.org,
akpm@linux-foundation.org, npiggin@suse.de,
penberg@cs.helsinki.fi
Subject: Re: [PATCH] v4 Teach RCU that idle task is not quiscent state at boot
Date: Wed, 25 Feb 2009 11:01:18 -0800 [thread overview]
Message-ID: <20090225190118.GI6797@linux.vnet.ibm.com> (raw)
In-Reply-To: <19f34abd0902251038y4c0a5261ya16dfb3db7a3302@mail.gmail.com>
On Wed, Feb 25, 2009 at 07:38:23PM +0100, Vegard Nossum wrote:
> 2009/2/25 Paul E. McKenney <paulmck@linux.vnet.ibm.com>:
> > This patch fixes a bug located by Vegard Nossum with the aid of
> > kmemcheck, updated based on review comments from Nick Piggin,
> > Ingo Molnar, and Andrew Morton.
> >
> > The boot CPU runs in the context of its idle thread during boot-up.
> > During this time, idle_cpu(0) will always return nonzero, which will
> > fool Classic and Hierarchical RCU into deciding that a large chunk of
> > the boot-up sequence is a big long quiescent state. This in turn causes
> > RCU to prematurely end grace periods during this time.
> >
> > This patch changes the rcutree.c and rcuclassic.c rcu_check_callbacks()
> > function to ignore the idle task as a quiescent state until the
> > system has started up the scheduler in rest_init(), introducing a
> > new non-API function rcu_idle_now_means_idle() to inform RCU of this
> > transition. RCU maintains an internal rcu_idle_cpu_truthful variable
> > to track this state, which is then used by rcu_check_callback() to
> > determine if it should believe idle_cpu().
> >
> > Because this patch has the effect of disallowing RCU grace periods
> > during long stretches of the boot-up sequence, this patch also introduces
> > Josh Triplett's UP-only optimization that makes synchronize_rcu() be a
> > no-op if num_online_cpus() returns 1. This allows boot-time code that
> > calls synchronize_rcu() to proceed normally. Note, however, that RCU
> > callbacks registered by call_rcu() will likely queue up until later in
> > the boot sequence. Although rcuclassic and rcutree can also use this
> > same optimization after boot completes, rcupreempt must restrict its
> > use of this optimization to the portion of the boot sequence before the
> > scheduler starts up, given that an rcupreempt RCU read-side critical
> > section may be preeempted.
> >
> > In addition, this patch takes Nick Piggin's suggestion to make the
> > system_state global variable be __read_mostly.
> >
> > Changes since v3:
> >
> > o WARN_ON(nr_context_switches() > 0) to verify that RCU
> > switches out of boot-time mode before the first context
> > switch, as suggested by Nick Piggin.
> >
> > Changes since v2:
> >
> > o Created rcu_blocking_is_gp() internal-to-RCU API that
> > determines whether a call to synchronize_rcu() is itself
> > a grace period.
> >
> > o The definition of rcu_blocking_is_gp() for rcuclassic and
> > rcutree checks to see if but a single CPU is online.
> >
> > o The definition of rcu_blocking_is_gp() for rcupreempt
> > checks to see both if but a single CPU is online and if
> > the system is still in early boot.
> >
> > This allows rcupreempt to again work correctly if running
> > on a single CPU after booting is complete.
> >
> > o Added check to rcupreempt's synchronize_sched() for there
> > being but one online CPU.
> >
> > Tested all three variants both SMP and !SMP, booted fine, passed a short
> > rcutorture test on both x86 and Power.
> >
> > Located-by: Vegard Nossum <vegard.nossum@gmail.com>
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
>
> Tested-by: Vegard Nossum <vegard.nossum@gmail.com>
>
> I used such a patch and config (as attached) to verify. On unpatched
> kernel, I get this:
>
> rcu callbacks called: 2300 (should be 4000)
>
> And with the patch, it seems better:
>
> rcu callbacks called: 4000 (should be 4000)
>
> Actually, the first part of the text itself is wrong, because it used
> to count upwards. But it should really say "rcu callbacks NOT called".
Yes, your test patch does look like it validates my patch, thank you!!!
> Of course, this kind of check will always be inaccurate, so it is not
> fit for mainline. The inaccuracy stems from the fact that there could
> be other users of RCU that disturb our expected behaviour, and we
> don't know how quickly the callbacks will be run, only that if
> "enough" of them run inside the spinlock, something is probably wrong.
> But... what am I saying, you're the expert ;-)
On RCU, maybe, but the boot-time code has changed considerably since
I last looked at it. ;-)
> But it is useful enough to verify your patch. Some other configs I
> tried gave the correct result all the time, but I don't know what is
> special about this one.
Thank you again!!!
I will be submitting another patch that differs only in the names of
the variable and function that Ingo pointed out, so I believe it will
be legitimate to attach your Tested-by: to that upcoming patch.
Thanx, Paul
next prev parent reply other threads:[~2009-02-25 19:01 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-02-23 16:16 [PATCH] Teach RCU that idle task is not quiscent state at boot Paul E. McKenney
2009-02-23 17:13 ` Ingo Molnar
2009-02-23 20:43 ` [PATCH] v2 " Paul E. McKenney
2009-02-24 4:02 ` Nick Piggin
2009-02-24 5:08 ` Paul E. McKenney
2009-02-24 14:50 ` Paul E. McKenney
2009-02-25 0:29 ` [PATCH] v3 " Paul E. McKenney
2009-02-25 4:12 ` Nick Piggin
2009-02-25 4:43 ` Paul E. McKenney
2009-02-25 14:19 ` [PATCH] v4 " Paul E. McKenney
2009-02-25 16:00 ` Ingo Molnar
2009-02-25 16:26 ` Nick Piggin
2009-02-25 17:12 ` Paul E. McKenney
2009-02-25 17:18 ` Nick Piggin
2009-02-25 17:36 ` Paul E. McKenney
2009-02-25 17:08 ` Paul E. McKenney
2009-02-26 3:09 ` Ingo Molnar
2009-02-25 18:38 ` Vegard Nossum
2009-02-25 19:01 ` Paul E. McKenney [this message]
2009-02-26 2:03 ` [PATCH] v5 " Paul E. McKenney
2009-02-26 3:08 ` Ingo Molnar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090225190118.GI6797@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=akpm@linux-foundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=npiggin@suse.de \
--cc=penberg@cs.helsinki.fi \
--cc=stable@kernel.org \
--cc=vegard.nossum@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.