From: Frederic Weisbecker <fweisbec@gmail.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: LKML <linux-kernel@vger.kernel.org>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH] lockdep: Make lockstats counting per cpu
Date: Fri, 26 Mar 2010 04:13:02 +0100 [thread overview]
Message-ID: <20100326031301.GC9858@nowhere> (raw)
In-Reply-To: <1269573118-11120-1-git-send-regression-fweisbec@gmail.com>
I forgot to mention in the title of this patch, this is
the "v2".
On Fri, Mar 26, 2010 at 04:11:58AM +0100, Frederic Weisbecker wrote:
> Locking statistics are implemented using global atomic variables.
> This is usually fine unless some path write them very often.
>
> This is the case for the function and function graph tracers
> that disable irqs for each entry saved (except if the function
> tracer is in preempt disabled only mode).
> And calls to local_irq_save/restore() increment hardirqs_on_events
> and hardirqs_off_events stats (or similar stats for redundant
> versions).
>
> Incrementing these global vars for each function ends up in too
> much cache bouncing if lockstats are enabled.
>
> To solve this, implement the debug_atomic_*() operations using
> per cpu vars. We can't use irqsafe per cpu counters for that as
> these stats might be also written from NMI path, and irqsafe per
> cpu counters are not NMI safe, but local_t oprations are.
>
> This version then uses local_t based per cpu counters.
>
> v2: Use per_cpu() instead of get_cpu_var() to fetch the desired
> cpu vars on debug_atomic_read()
>
> Suggested-by: Steven Rostedt <rostedt@goodmis.org>
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> ---
> kernel/lockdep.c | 28 ++++++++--------
> kernel/lockdep_internals.h | 71 +++++++++++++++++++++++++++++++-------------
> 2 files changed, 64 insertions(+), 35 deletions(-)
>
> diff --git a/kernel/lockdep.c b/kernel/lockdep.c
> index 65b5f5b..55e60a0 100644
> --- a/kernel/lockdep.c
> +++ b/kernel/lockdep.c
> @@ -430,20 +430,20 @@ static struct stack_trace lockdep_init_trace = {
> /*
> * Various lockdep statistics:
> */
> -atomic_t chain_lookup_hits;
> -atomic_t chain_lookup_misses;
> -atomic_t hardirqs_on_events;
> -atomic_t hardirqs_off_events;
> -atomic_t redundant_hardirqs_on;
> -atomic_t redundant_hardirqs_off;
> -atomic_t softirqs_on_events;
> -atomic_t softirqs_off_events;
> -atomic_t redundant_softirqs_on;
> -atomic_t redundant_softirqs_off;
> -atomic_t nr_unused_locks;
> -atomic_t nr_cyclic_checks;
> -atomic_t nr_find_usage_forwards_checks;
> -atomic_t nr_find_usage_backwards_checks;
> +DEFINE_PER_CPU(local_t, chain_lookup_hits);
> +DEFINE_PER_CPU(local_t, chain_lookup_misses);
> +DEFINE_PER_CPU(local_t, hardirqs_on_events);
> +DEFINE_PER_CPU(local_t, hardirqs_off_events);
> +DEFINE_PER_CPU(local_t, redundant_hardirqs_on);
> +DEFINE_PER_CPU(local_t, redundant_hardirqs_off);
> +DEFINE_PER_CPU(local_t, softirqs_on_events);
> +DEFINE_PER_CPU(local_t, softirqs_off_events);
> +DEFINE_PER_CPU(local_t, redundant_softirqs_on);
> +DEFINE_PER_CPU(local_t, redundant_softirqs_off);
> +DEFINE_PER_CPU(local_t, nr_unused_locks);
> +DEFINE_PER_CPU(local_t, nr_cyclic_checks);
> +DEFINE_PER_CPU(local_t, nr_find_usage_forwards_checks);
> +DEFINE_PER_CPU(local_t, nr_find_usage_backwards_checks);
> #endif
>
> /*
> diff --git a/kernel/lockdep_internals.h b/kernel/lockdep_internals.h
> index a2ee95a..38c8ac7 100644
> --- a/kernel/lockdep_internals.h
> +++ b/kernel/lockdep_internals.h
> @@ -110,29 +110,58 @@ lockdep_count_backward_deps(struct lock_class *class)
> #endif
>
> #ifdef CONFIG_DEBUG_LOCKDEP
> +
> +#include <asm/local.h>
> +/*
> + * Various lockdep statistics.
> + * We want them per cpu as they are often accessed in fast path
> + * and we want to avoid too much cache bouncing.
> + * We can't use irqsafe per cpu counters as those are not NMI safe, as
> + * opposite to local_t.
> + */
> +DECLARE_PER_CPU(local_t, chain_lookup_hits);
> +DECLARE_PER_CPU(local_t, chain_lookup_misses);
> +DECLARE_PER_CPU(local_t, hardirqs_on_events);
> +DECLARE_PER_CPU(local_t, hardirqs_off_events);
> +DECLARE_PER_CPU(local_t, redundant_hardirqs_on);
> +DECLARE_PER_CPU(local_t, redundant_hardirqs_off);
> +DECLARE_PER_CPU(local_t, softirqs_on_events);
> +DECLARE_PER_CPU(local_t, softirqs_off_events);
> +DECLARE_PER_CPU(local_t, redundant_softirqs_on);
> +DECLARE_PER_CPU(local_t, redundant_softirqs_off);
> +DECLARE_PER_CPU(local_t, nr_unused_locks);
> +DECLARE_PER_CPU(local_t, nr_cyclic_checks);
> +DECLARE_PER_CPU(local_t, nr_cyclic_check_recursions);
> +DECLARE_PER_CPU(local_t, nr_find_usage_forwards_checks);
> +DECLARE_PER_CPU(local_t, nr_find_usage_forwards_recursions);
> +DECLARE_PER_CPU(local_t, nr_find_usage_backwards_checks);
> +DECLARE_PER_CPU(local_t, nr_find_usage_backwards_recursions);
> +
> +# define debug_atomic_inc(ptr) { \
> + WARN_ON_ONCE(!irq_disabled()); \
> + local_t *__ptr = &__get_cpu_var(ptr); \
> + local_inc(__ptr); \
> +}
> +
> +# define debug_atomic_dec(ptr) { \
> + WARN_ON_ONCE(!irq_disabled()); \
> + local_t *__ptr = &__get_cpu_var(ptr); \
> + local_dec(__ptr); \
> +}
> +
> /*
> - * Various lockdep statistics:
> + * It's fine to use local_read from other cpus. Read is racy anyway,
> + * but it's guaranteed high and low parts are read atomically.
> */
> -extern atomic_t chain_lookup_hits;
> -extern atomic_t chain_lookup_misses;
> -extern atomic_t hardirqs_on_events;
> -extern atomic_t hardirqs_off_events;
> -extern atomic_t redundant_hardirqs_on;
> -extern atomic_t redundant_hardirqs_off;
> -extern atomic_t softirqs_on_events;
> -extern atomic_t softirqs_off_events;
> -extern atomic_t redundant_softirqs_on;
> -extern atomic_t redundant_softirqs_off;
> -extern atomic_t nr_unused_locks;
> -extern atomic_t nr_cyclic_checks;
> -extern atomic_t nr_cyclic_check_recursions;
> -extern atomic_t nr_find_usage_forwards_checks;
> -extern atomic_t nr_find_usage_forwards_recursions;
> -extern atomic_t nr_find_usage_backwards_checks;
> -extern atomic_t nr_find_usage_backwards_recursions;
> -# define debug_atomic_inc(ptr) atomic_inc(ptr)
> -# define debug_atomic_dec(ptr) atomic_dec(ptr)
> -# define debug_atomic_read(ptr) atomic_read(ptr)
> +# define debug_atomic_read(ptr) ({ \
> + unsigned long long __total = 0; \
> + int __cpu; \
> + for_each_possible_cpu(__cpu) { \
> + local_t *__ptr = &per_cpu(ptr, cpu); \
> + __total += local_read(__ptr); \
> + } \
> + __total; \
> +})
> #else
> # define debug_atomic_inc(ptr) do { } while (0)
> # define debug_atomic_dec(ptr) do { } while (0)
> --
> 1.6.2.3
>
next prev parent reply other threads:[~2010-03-26 3:13 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-03-26 2:22 [PATCH] lockdep: Make lockstats counting per cpu Frederic Weisbecker
2010-03-26 3:11 ` Frederic Weisbecker
2010-03-26 3:13 ` Frederic Weisbecker [this message]
2010-03-26 9:16 ` Ingo Molnar
2010-03-26 17:11 ` Frederic Weisbecker
2010-03-26 6:52 ` Peter Zijlstra
2010-03-26 7:48 ` Peter Zijlstra
2010-03-26 17:43 ` Frederic Weisbecker
2010-03-26 17:39 ` Frederic Weisbecker
-- strict thread matches above, loose matches on Subject: below --
2010-03-28 0:29 [PATCH v3] " Frederic Weisbecker
2010-04-05 22:10 ` [PATCH] " Frederic Weisbecker
2010-04-06 8:46 ` Peter Zijlstra
2010-04-06 9:24 ` Frederic Weisbecker
2010-04-06 9:25 ` Frederic Weisbecker
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100326031301.GC9858@nowhere \
--to=fweisbec@gmail.com \
--cc=a.p.zijlstra@chello.nl \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=rostedt@goodmis.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.