All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Josh Triplett <josh@joshtriplett.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Lai Jiangshan <laijs@cn.fujitsu.com>,
	linux-kernel@vger.kernel.org, Tejun Heo <tj@kernel.org>,
	Oleg Nesterov <oleg@redhat.com>
Subject: Re: [PATCH] rcu: Deinline rcu_read_lock_sched_held() if DEBUG_LOCK_ALLOC
Date: Tue, 26 May 2015 08:18:03 -0700	[thread overview]
Message-ID: <20150526151803.GE5989@linux.vnet.ibm.com> (raw)
In-Reply-To: <555E53D1.8070500@redhat.com>

On Thu, May 21, 2015 at 11:53:21PM +0200, Denys Vlasenko wrote:
> On 05/21/2015 05:26 PM, Paul E. McKenney wrote:
> > On Thu, May 21, 2015 at 05:09:27PM +0200, Denys Vlasenko wrote:
> >> On 05/21/2015 03:45 PM, Steven Rostedt wrote:
> >>> On Thu, 21 May 2015 12:04:07 +0200
> >>> Denys Vlasenko <dvlasenk@redhat.com> wrote:
> >>>
> >>>> DEBUG_LOCK_ALLOC=y is not a production setting, but it is
> >>>> not very unusual either. Many developers routinely
> >>>> use kernels built with it enabled.
> >>>>
> >>>> Apart from being selected by hand, it is also auto-selected by
> >>>> PROVE_LOCKING "Lock debugging: prove locking correctness" and
> >>>> LOCK_STAT "Lock usage statistics" config options.
> >>>> LOCK STAT is necessary for "perf lock" to work.
> >>>>
> >>>> I wouldn't spend too much time optimizing it, but this particular
> >>>> function has a very large cost in code size: when it is deinlined,
> >>>> code size decreases by 830,000 bytes:
> >>>>
> >>>>     text     data      bss       dec     hex filename
> >>>> 85674192 22294776 20627456 128596424 7aa39c8 vmlinux.before
> >>>> 84837612 22294424 20627456 127759492 79d7484 vmlinux
> >>>>
> >>>> (with this config: http://busybox.net/~vda/kernel_config)
> >>>>
> >>>> Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
> >>>> CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> >>>> CC: Josh Triplett <josh@joshtriplett.org>
> >>>> CC: Steven Rostedt <rostedt@goodmis.org>
> >>>> CC: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> >>>> CC: Lai Jiangshan <laijs@cn.fujitsu.com>
> >>>> CC: linux-kernel@vger.kernel.org
> >>>> CC: Tejun Heo <tj@kernel.org>
> >>>> CC: Oleg Nesterov <oleg@redhat.com>
> >>>> ---
> >>>>  include/linux/rcupdate.h | 40 ++-----------------------------------
> >>>>  kernel/rcu/update.c      | 52 ++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>  2 files changed, 54 insertions(+), 38 deletions(-)
> >>>>
> >>>> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> >>>> index 7809749..6024a65 100644
> >>>> --- a/include/linux/rcupdate.h
> >>>> +++ b/include/linux/rcupdate.h
> >>>> @@ -439,46 +439,10 @@ int rcu_read_lock_bh_held(void);
> >>>>   * If CONFIG_DEBUG_LOCK_ALLOC is selected, returns nonzero iff in an
> >>>>   * RCU-sched read-side critical section.  In absence of
> >>>>   * CONFIG_DEBUG_LOCK_ALLOC, this assumes we are in an RCU-sched read-side
> >>>> - * critical section unless it can prove otherwise.  Note that disabling
> >>>> - * of preemption (including disabling irqs) counts as an RCU-sched
> >>>> - * read-side critical section.  This is useful for debug checks in functions
> >>>> - * that required that they be called within an RCU-sched read-side
> >>>> - * critical section.
> >>>> - *
> >>>> - * Check debug_lockdep_rcu_enabled() to prevent false positives during boot
> >>>> - * and while lockdep is disabled.
> >>>> - *
> >>>> - * Note that if the CPU is in the idle loop from an RCU point of
> >>>> - * view (ie: that we are in the section between rcu_idle_enter() and
> >>>> - * rcu_idle_exit()) then rcu_read_lock_held() returns false even if the CPU
> >>>> - * did an rcu_read_lock().  The reason for this is that RCU ignores CPUs
> >>>> - * that are in such a section, considering these as in extended quiescent
> >>>> - * state, so such a CPU is effectively never in an RCU read-side critical
> >>>> - * section regardless of what RCU primitives it invokes.  This state of
> >>>> - * affairs is required --- we need to keep an RCU-free window in idle
> >>>> - * where the CPU may possibly enter into low power mode. This way we can
> >>>> - * notice an extended quiescent state to other CPUs that started a grace
> >>>> - * period. Otherwise we would delay any grace period as long as we run in
> >>>> - * the idle task.
> >>>> - *
> >>>> - * Similarly, we avoid claiming an SRCU read lock held if the current
> >>>> - * CPU is offline.
> >>>> + * critical section unless it can prove otherwise.
> >>>>   */
> >>>>  #ifdef CONFIG_PREEMPT_COUNT
> >>>> -static inline int rcu_read_lock_sched_held(void)
> >>>> -{
> >>>> -	int lockdep_opinion = 0;
> >>>> -
> >>>> -	if (!debug_lockdep_rcu_enabled())
> >>>> -		return 1;
> >>>> -	if (!rcu_is_watching())
> >>>> -		return 0;
> >>>> -	if (!rcu_lockdep_current_cpu_online())
> >>>> -		return 0;
> >>>> -	if (debug_locks)
> >>>> -		lockdep_opinion = lock_is_held(&rcu_sched_lock_map);
> >>>> -	return lockdep_opinion || preempt_count() != 0 || irqs_disabled();
> >>>> -}
> >>>> +int rcu_read_lock_sched_held(void);
> >>>>  #else /* #ifdef CONFIG_PREEMPT_COUNT */
> >>>>  static inline int rcu_read_lock_sched_held(void)
> >>>>  {
> >>>> diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> >>>> index e0d31a3..e02218f 100644
> >>>> --- a/kernel/rcu/update.c
> >>>> +++ b/kernel/rcu/update.c
> >>>> @@ -62,6 +62,58 @@ MODULE_ALIAS("rcupdate");
> >>>>  
> >>>>  module_param(rcu_expedited, int, 0);
> >>>>  
> >>>> +#ifdef CONFIG_DEBUG_LOCK_ALLOC
> >>>> +/**
> >>>> + * rcu_read_lock_sched_held() - might we be in RCU-sched read-side critical section?
> >>>> + *
> >>>> + * If CONFIG_DEBUG_LOCK_ALLOC is selected, returns nonzero iff in an
> >>>> + * RCU-sched read-side critical section.  In absence of
> >>>> + * CONFIG_DEBUG_LOCK_ALLOC, this assumes we are in an RCU-sched read-side
> >>>> + * critical section unless it can prove otherwise.  Note that disabling
> >>>> + * of preemption (including disabling irqs) counts as an RCU-sched
> >>>> + * read-side critical section.  This is useful for debug checks in functions
> >>>> + * that required that they be called within an RCU-sched read-side
> >>>> + * critical section.
> >>>> + *
> >>>> + * Check debug_lockdep_rcu_enabled() to prevent false positives during boot
> >>>> + * and while lockdep is disabled.
> >>>> + *
> >>>> + * Note that if the CPU is in the idle loop from an RCU point of
> >>>> + * view (ie: that we are in the section between rcu_idle_enter() and
> >>>> + * rcu_idle_exit()) then rcu_read_lock_held() returns false even if the CPU
> >>>> + * did an rcu_read_lock().  The reason for this is that RCU ignores CPUs
> >>>> + * that are in such a section, considering these as in extended quiescent
> >>>> + * state, so such a CPU is effectively never in an RCU read-side critical
> >>>> + * section regardless of what RCU primitives it invokes.  This state of
> >>>> + * affairs is required --- we need to keep an RCU-free window in idle
> >>>> + * where the CPU may possibly enter into low power mode. This way we can
> >>>> + * notice an extended quiescent state to other CPUs that started a grace
> >>>> + * period. Otherwise we would delay any grace period as long as we run in
> >>>> + * the idle task.
> >>>> + *
> >>>> + * Similarly, we avoid claiming an SRCU read lock held if the current
> >>>> + * CPU is offline.
> >>>> + */
> >>>> +#ifdef CONFIG_PREEMPT_COUNT
> >>>> +int rcu_read_lock_sched_held(void)
> >>>> +{
> >>>> +	int lockdep_opinion = 0;
> >>>> +
> >>>> +	if (!debug_lockdep_rcu_enabled())
> >>>> +		return 1;
> >>>> +	if (!rcu_is_watching())
> >>>> +		return 0;
> >>>> +	if (!rcu_lockdep_current_cpu_online())
> >>>> +		return 0;
> >>>> +	if (debug_locks)
> >>>> +		lockdep_opinion = lock_is_held(&rcu_sched_lock_map);
> >>>> +	return lockdep_opinion || preempt_count() != 0 || irqs_disabled();
> >>>> +}
> >>>> +EXPORT_SYMBOL(rcu_read_lock_sched_held);
> >>>> +#else
> >>>> +/* !CONFIG_PREEMPT_COUNT - the function is inlined to always return 1 */
> >>>
> >>> Nuke the #else. It's not needed and this is a common enough practice to
> >>> have the static inline foo() { } when disabled that we do not need to
> >>> comment about it here.
> >>
> >> Sending patch v2 in a few minutes.
> >>
> >>>> +#endif
> >>>
> >>> Hmm, you added two #ifdef, and one #endif. How does this even compile??
> >>
> >> Er... it... doesn't.
> >>
> >> There was "#if defined CONFIG_DEBUG_LOCK_ALLOC && defined CONFIG_PREEMPT_COUNT"
> >> but then I split it into two #ifs to have a nice explanatory empty
> >> #else clause. And I did not test it after that edit
> >> ("what could possibly go wrong?"). Sorry.
> >>
> >> Patch v2 I'm sending _is_ tested.
> > 
> > Boot/run as well as build?
> 
> I did not even dare to try booting 128 megabyte allyesconfig monstrosity,
> before you asked.
> 
> It took me 4 hours of rebuilds to figure out that CONFIG_CMDLINE_BOOL needs to be
> disabled for my qemu boot to succeed :) :)

Just make sure you retain the .config file to allow you to more easily
test future changes!  ;-)

> So, yes, patched allyesconfig kernel seems to boot... it takes 250 seconds in qemu.

Steve, are you OK with Denys's most recent patch?

							Thanx, Paul


  reply	other threads:[~2015-05-26 15:18 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-21 10:04 [PATCH] rcu: Deinline rcu_read_lock_sched_held() if DEBUG_LOCK_ALLOC Denys Vlasenko
2015-05-21 12:52 ` Paul E. McKenney
2015-05-21 13:09   ` Steven Rostedt
2015-05-21 13:25     ` Paul E. McKenney
2015-05-21 13:41       ` Steven Rostedt
2015-05-21 13:45 ` Steven Rostedt
2015-05-21 15:09   ` Denys Vlasenko
2015-05-21 15:26     ` Paul E. McKenney
2015-05-21 21:53       ` Denys Vlasenko
2015-05-26 15:18         ` Paul E. McKenney [this message]
2015-05-26 15:37           ` Steven Rostedt
2015-05-26 15:47             ` Denys Vlasenko
2015-05-26 15:51             ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150526151803.GE5989@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=dvlasenk@redhat.com \
    --cc=josh@joshtriplett.org \
    --cc=laijs@cn.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=oleg@redhat.com \
    --cc=rostedt@goodmis.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.