From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Joel Fernandes <joelaf@google.com>,
linux-kernel <linux-kernel@vger.kernel.org>,
"Joel Fernandes, Google" <joel@joelfernandes.org>,
rostedt <rostedt@goodmis.org>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Tom Zanussi <tom.zanussi@linux.intel.com>,
Namhyung Kim <namhyung@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Boqun Feng <boqun.feng@gmail.com>, fweisbec <fweisbec@gmail.com>,
Randy Dunlap <rdunlap@infradead.org>,
Masami Hiramatsu <mhiramat@kernel.org>,
kbuild test robot <fengguang.wu@intel.com>,
baohong liu <baohong.liu@intel.com>,
vedang patel <vedang.patel@intel.com>,
kernel-team <kernel-team@android.com>
Subject: Re: [PATCH RFC v6 4/5] tracepoint: Make rcuidle tracepoint callers use SRCU
Date: Mon, 7 May 2018 17:17:53 -0400 (EDT) [thread overview]
Message-ID: <2001761952.217.1525727873194.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <20180507210801.GZ26088@linux.vnet.ibm.com>
----- On May 7, 2018, at 5:08 PM, Paul E. McKenney paulmck@linux.vnet.ibm.com wrote:
> On Mon, May 07, 2018 at 01:41:42PM -0700, Joel Fernandes wrote:
>> From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
>>
>> In recent tests with IRQ on/off tracepoints, a large performance
>> overhead ~10% is noticed when running hackbench. This is root caused to
>> calls to rcu_irq_enter_irqson and rcu_irq_exit_irqson from the
>> tracepoint code. Following a long discussion on the list [1] about this,
>> we concluded that srcu is a better alternative for use during rcu idle.
>> Although it does involve extra barriers, its lighter than the sched-rcu
>> version which has to do additional RCU calls to notify RCU idle about
>> entry into RCU sections.
>>
>> In this patch, we change the underlying implementation of the
>> trace_*_rcuidle API to use SRCU. This has shown to improve performance
>> alot for the high frequency irq enable/disable tracepoints.
>>
>> Test: Tested idle and preempt/irq tracepoints.
>>
>> Here are some performance numbers:
>>
>> With a run of the following 30 times on a single core x86 Qemu instance
>> with 1GB memory:
>> hackbench -g 4 -f 2 -l 3000
>>
>> Completion times in seconds. CONFIG_PROVE_LOCKING=y.
>>
>> No patches (without this series)
>> Mean: 3.048
>> Median: 3.025
>> Std Dev: 0.064
>>
>> With Lockdep using irq tracepoints with RCU implementation:
>> Mean: 3.451 (-11.66 %)
>> Median: 3.447 (-12.22%)
>> Std Dev: 0.049
>>
>> With Lockdep using irq tracepoints with SRCU implementation (this series):
>> Mean: 3.020 (I would consider the improvement against the "without
>> this series" case as just noise).
>> Median: 3.013
>> Std Dev: 0.033
>>
>> [1] https://patchwork.kernel.org/patch/10344297/
>>
>> Cc: Steven Rostedt <rostedt@goodmis.org>
>> Cc: Peter Zilstra <peterz@infradead.org>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
>> Cc: Namhyung Kim <namhyung@kernel.org>
>> Cc: Thomas Glexiner <tglx@linutronix.de>
>> Cc: Boqun Feng <boqun.feng@gmail.com>
>> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
>> Cc: Frederic Weisbecker <fweisbec@gmail.com>
>> Cc: Randy Dunlap <rdunlap@infradead.org>
>> Cc: Masami Hiramatsu <mhiramat@kernel.org>
>> Cc: Fenguang Wu <fengguang.wu@intel.com>
>> Cc: Baohong Liu <baohong.liu@intel.com>
>> Cc: Vedang Patel <vedang.patel@intel.com>
>> Cc: kernel-team@android.com
>> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
>> ---
>> include/linux/tracepoint.h | 46 +++++++++++++++++++++++++++++++-------
>> kernel/tracepoint.c | 15 ++++++++++++-
>> 2 files changed, 52 insertions(+), 9 deletions(-)
>>
>> diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
>> index c94f466d57ef..f56f290cf8eb 100644
>> --- a/include/linux/tracepoint.h
>> +++ b/include/linux/tracepoint.h
>> @@ -15,6 +15,7 @@
>> */
>>
>> #include <linux/smp.h>
>> +#include <linux/srcu.h>
>> #include <linux/errno.h>
>> #include <linux/types.h>
>> #include <linux/cpumask.h>
>> @@ -33,6 +34,8 @@ struct trace_eval_map {
>>
>> #define TRACEPOINT_DEFAULT_PRIO 10
>>
>> +extern struct srcu_struct tracepoint_srcu;
>> +
>> extern int
>> tracepoint_probe_register(struct tracepoint *tp, void *probe, void *data);
>> extern int
>> @@ -77,6 +80,9 @@ int unregister_tracepoint_module_notifier(struct
>> notifier_block *nb)
>> */
>> static inline void tracepoint_synchronize_unregister(void)
>> {
>> +#ifdef CONFIG_TRACEPOINTS
>> + synchronize_srcu(&tracepoint_srcu);
>> +#endif
>> synchronize_sched();
>> }
>>
>> @@ -129,18 +135,38 @@ extern void syscall_unregfunc(void);
>> * as "(void *, void)". The DECLARE_TRACE_NOARGS() will pass in just
>> * "void *data", where as the DECLARE_TRACE() will pass in "void *data, proto".
>> */
>> -#define __DO_TRACE(tp, proto, args, cond, rcucheck) \
>> +#define __DO_TRACE(tp, proto, args, cond, rcuidle) \
>> do { \
>> struct tracepoint_func *it_func_ptr; \
>> void *it_func; \
>> void *__data; \
>> + int __maybe_unused idx = 0; \
>> \
>> if (!(cond)) \
>> return; \
>> - if (rcucheck) \
>> - rcu_irq_enter_irqson(); \
>> - rcu_read_lock_sched_notrace(); \
>> - it_func_ptr = rcu_dereference_sched((tp)->funcs); \
>> + \
>> + /* \
>> + * For rcuidle callers, use srcu since sched-rcu \
>> + * doesn't work from the idle path. \
>> + */ \
>> + if (rcuidle) { \
>> + if (in_nmi()) { \
>> + WARN_ON_ONCE(1); \
>> + return; /* no srcu from nmi */ \
>> + } \
>> + \
>> + idx = srcu_read_lock_notrace(&tracepoint_srcu); \
>> + it_func_ptr = \
>> + srcu_dereference_notrace((tp)->funcs, \
>> + &tracepoint_srcu); \
>> + /* To keep it consistent with !rcuidle path */ \
>> + preempt_disable_notrace(); \
>> + } else { \
>> + rcu_read_lock_sched_notrace(); \
>> + it_func_ptr = \
>> + rcu_dereference_sched((tp)->funcs); \
>> + } \
>> + \
>> if (it_func_ptr) { \
>> do { \
>> it_func = (it_func_ptr)->func; \
>> @@ -148,9 +174,13 @@ extern void syscall_unregfunc(void);
>> ((void(*)(proto))(it_func))(args); \
>> } while ((++it_func_ptr)->func); \
>> } \
>> - rcu_read_unlock_sched_notrace(); \
>> - if (rcucheck) \
>> - rcu_irq_exit_irqson(); \
>> + \
>> + if (rcuidle) { \
>
> Don't we also need an in_nmi() check here in order to avoid unbalanced
> srcu_read_unlock_notrace() calls?
AFAIU the "return;" in the if (in_nmi()) branch above takes care of never
executing the following code. diff appears to be a bit confused by the
preprocessor macros, but in reality this is all part of the same static
inline function.
Thanks,
Mathieu
>
> Thanx, Paul
>
>> + preempt_enable_notrace(); \
>> + srcu_read_unlock_notrace(&tracepoint_srcu, idx);\
>> + } else { \
>> + rcu_read_unlock_sched_notrace(); \
>> + } \
>> } while (0)
>>
>> #ifndef MODULE
>> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
>> index 671b13457387..2089f579f790 100644
>> --- a/kernel/tracepoint.c
>> +++ b/kernel/tracepoint.c
>> @@ -31,6 +31,9 @@
>> extern struct tracepoint * const __start___tracepoints_ptrs[];
>> extern struct tracepoint * const __stop___tracepoints_ptrs[];
>>
>> +DEFINE_SRCU(tracepoint_srcu);
>> +EXPORT_SYMBOL_GPL(tracepoint_srcu);
>> +
>> /* Set to 1 to enable tracepoint debug output */
>> static const int tracepoint_debug;
>>
>> @@ -67,16 +70,26 @@ static inline void *allocate_probes(int count)
>> return p == NULL ? NULL : p->probes;
>> }
>>
>> -static void rcu_free_old_probes(struct rcu_head *head)
>> +static void srcu_free_old_probes(struct rcu_head *head)
>> {
>> kfree(container_of(head, struct tp_probes, rcu));
>> }
>>
>> +static void rcu_free_old_probes(struct rcu_head *head)
>> +{
>> + call_srcu(&tracepoint_srcu, head, srcu_free_old_probes);
>> +}
>> +
>> static inline void release_probes(struct tracepoint_func *old)
>> {
>> if (old) {
>> struct tp_probes *tp_probes = container_of(old,
>> struct tp_probes, probes[0]);
>> + /*
>> + * Tracepoint probes are protected by both sched RCU and SRCU,
>> + * by calling the SRCU callback in the sched RCU callback we
>> + * cover both cases. So lets chain the SRCU and RCU callbacks.
>> + */
>> call_rcu_sched(&tp_probes->rcu, rcu_free_old_probes);
>> }
>> }
>> --
>> 2.17.0.441.gb46fe60e1d-goog
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
next prev parent reply other threads:[~2018-05-07 21:17 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-05-07 20:41 [PATCH RFC v6 0/5] Centralize and unify usage of preempt/irq tracepoints Joel Fernandes
2018-05-07 20:41 ` [PATCH RFC v6 1/5] srcu: Add notrace variants of srcu_read_{lock,unlock} Joel Fernandes
2018-05-07 20:41 ` [PATCH RFC v6 2/5] srcu: Add notrace variant of srcu_dereference Joel Fernandes
2018-05-07 20:41 ` [PATCH RFC v6 3/5] trace/irqsoff: Split reset into separate functions Joel Fernandes
2018-05-07 20:41 ` [PATCH RFC v6 4/5] tracepoint: Make rcuidle tracepoint callers use SRCU Joel Fernandes
2018-05-07 21:05 ` Mathieu Desnoyers
2018-05-07 21:46 ` Joel Fernandes
2018-05-08 0:37 ` Mathieu Desnoyers
2018-05-08 21:10 ` Joel Fernandes
2018-05-07 21:08 ` Paul E. McKenney
2018-05-07 21:17 ` Mathieu Desnoyers [this message]
2018-05-07 21:45 ` Joel Fernandes
2018-05-07 21:55 ` Paul E. McKenney
2018-05-07 23:37 ` Joel Fernandes
2018-05-07 20:41 ` [PATCH RFC v6 5/5] tracing: Centralize preemptirq tracepoints and unify their usage Joel Fernandes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2001761952.217.1525727873194.JavaMail.zimbra@efficios.com \
--to=mathieu.desnoyers@efficios.com \
--cc=baohong.liu@intel.com \
--cc=boqun.feng@gmail.com \
--cc=fengguang.wu@intel.com \
--cc=fweisbec@gmail.com \
--cc=joel@joelfernandes.org \
--cc=joelaf@google.com \
--cc=kernel-team@android.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mhiramat@kernel.org \
--cc=mingo@redhat.com \
--cc=namhyung@kernel.org \
--cc=paulmck@linux.vnet.ibm.com \
--cc=peterz@infradead.org \
--cc=rdunlap@infradead.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
--cc=tom.zanussi@linux.intel.com \
--cc=vedang.patel@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.