public inbox for linux-kernel@vger.kernel.org
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: John Stultz <john.stultz@linaro.org>
Cc: Tejun Heo <tj@kernel.org>, Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	lkml <linux-kernel@vger.kernel.org>,
	Dmitry Shmidt <dimitrysh@google.com>,
	Rom Lemarchand <romlem@google.com>,
	Colin Cross <ccross@google.com>, Todd Kjos <tkjos@google.com>,
	Oleg Nesterov <oleg@redhat.com>
Subject: Re: Severe performance regression w/ 4.4+ on Android due to cgroup locking changes
Date: Wed, 13 Jul 2016 16:02:38 -0700
Message-ID: <20160713230238.GU7094@linux.vnet.ibm.com>
In-Reply-To: <CALAqxLU880FGWM+sCha1LAi8KCtzreG2jrMEPU92PSHy2tOOJg@mail.gmail.com>

On Wed, Jul 13, 2016 at 03:39:37PM -0700, John Stultz wrote:
> On Wed, Jul 13, 2016 at 3:17 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Wed, Jul 13, 2016 at 02:46:37PM -0700, John Stultz wrote:
> >> On Wed, Jul 13, 2016 at 2:42 PM, Paul E. McKenney
> >> <paulmck@linux.vnet.ibm.com> wrote:
> >> > On Wed, Jul 13, 2016 at 02:18:41PM -0700, Paul E. McKenney wrote:
> >> >> On Wed, Jul 13, 2016 at 05:05:26PM -0400, Tejun Heo wrote:
> >> >> > On Wed, Jul 13, 2016 at 02:03:15PM -0700, Paul E. McKenney wrote:
> >> >> > > Take the patch that I just sent out and make the choice of normal
> >> >> > > vs. expedited depend on CONFIG_PREEMPT_RT or whatever the -rt guys are
> >> >> > > calling it these days.  Is there a low-latency Kconfig option other
> >> >> > > than CONFIG_NO_HZ_FULL?
> >> >> >
> >> >> > Sounds like a plan to me.
> >> >>
> >> >> I like the way we like each other's idea.  Mutually assured laziness?  ;-)
> >> >
> >> > But here is what mine might look like.  Untested, probably does
> >> > not even build.  Note that the default is -no- expediting, use the
> >> > rcusync.expedited kernel parameter to enable it.
> >>
> >> I was working on something similar, but using a config option. Would
> >> adding a config option for the default make sense here, since I'd
> >> probably prefer to have one less thing to always specify on the
> >> cmdline?
> >
> > As long as you don't mind it depending on CONFIG_RCU_EXPERT, no problem.
> >
> > Perhaps like the following, on top of the previous patch?
> >
> > Or if you are going to put it in defconfig files only, I can make it
> > so that it isn't changeable at menuconfig time.
> 
> I think having it discoverable via menuconfig is useful, and I've got
> no objections to it being under RCU_EXPERT
> (assuming I don't badly muck up my RCU settings accidentally :).

But isn't mucking up your RCU settings half of the fun?  ;-)

> I only had that one nit about maybe wanting to put something in dmesg
> when we're using the expedited methods.

Updated, please see below.

> But otherwise both patches look great and are working well!
> 
> Do you mind marking them both for stable 4.4+?

OK, looks like it does qualify in the "fix a notable performance or
interactivity issue" category.

> Tested-by: John Stultz <john.stultz@linaro.org>
> Acked-by: John Stultz <john.stultz@linaro.org>
> 
> Also, do make sure Dmitry gets the reported-by credit for the first patch.

Done!  The updated first patch is below, and the second will follow.

							Thanx, Paul

------------------------------------------------------------------------

commit 59435eb836ee73b30ed6ada525125b67b4029321
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Wed Jul 13 14:43:46 2016 -0700

    rcu: Provide rcusync.expedited kernel boot parameter
    
    Dmitry Shmidt and John Stultz noticed that __cgroup_procs_write()
    sometimes incurred excessive overheads, ranging up into the tens of
    milliseconds.  Further testing confirmed speculation that this was due
    to synchronize_sched() within rcusync being invoked by per-CPU rwsems.
    This testing also showed that substituting synchronize_sched_expedited()
    for synchronize_sched() reduced the overheads to below 200
    microseconds, with occasional excursions into the low single digits
    of milliseconds.
    
    This commit therefore provides a rcusync.expedited kernel boot parameter
    that causes rcusync to use expedited grace-period primitives.
    
    Reported-by: Dmitry Shmidt <dimitrysh@google.com>
    Reported-by: John Stultz <john.stultz@linaro.org>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Tested-by: John Stultz <john.stultz@linaro.org>
    Acked-by: John Stultz <john.stultz@linaro.org>
    Cc: <stable@vger.kernel.org> # 4.4.x-

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 82b42c958d1c..b8bc9854e548 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -3229,6 +3229,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			energy efficiency by requiring that the kthreads
 			periodically wake up to do the polling.
 
+	rcusync.expedited	[KNL]
+			Specify that the rcusync mechanism use expedited
+			grace periods.  As of mid-2016, this affects
+			per-CPU rwsems.
+
 	rcutree.blimit=	[KNL]
 			Set maximum number of finished RCU callbacks to
 			process in one batch.
diff --git a/kernel/rcu/sync.c b/kernel/rcu/sync.c
index be922c9f3d37..0d0dc992cce7 100644
--- a/kernel/rcu/sync.c
+++ b/kernel/rcu/sync.c
@@ -22,6 +22,14 @@
 
 #include <linux/rcu_sync.h>
 #include <linux/sched.h>
+#include <linux/moduleparam.h>
+#include <linux/module.h>
+
+MODULE_ALIAS("rcusync");
+#ifdef MODULE_PARAM_PREFIX
+#undef MODULE_PARAM_PREFIX
+#endif
+#define MODULE_PARAM_PREFIX "rcusync."
 
 #ifdef CONFIG_PROVE_RCU
 #define __INIT_HELD(func)	.held = func,
@@ -29,14 +37,14 @@
 #define __INIT_HELD(func)
 #endif
 
-static const struct {
+static struct {
 	void (*sync)(void);
 	void (*call)(struct rcu_head *, void (*)(struct rcu_head *));
 	void (*wait)(void);
 #ifdef CONFIG_PROVE_RCU
 	int  (*held)(void);
 #endif
-} gp_ops[] = {
+} gp_ops[] __read_mostly = {
 	[RCU_SYNC] = {
 		.sync = synchronize_rcu,
 		.call = call_rcu,
@@ -62,6 +70,21 @@ enum { CB_IDLE = 0, CB_PENDING, CB_REPLAY };
 
 #define	rss_lock	gp_wait.lock
 
+static bool expedited;
+module_param(expedited, bool, 0444);
+
+static int __init rcu_sync_early_init(void)
+{
+	if (expedited) {
+		pr_info("RCU_SYNC: Expedited operation in effect.\n");
+		gp_ops[RCU_SYNC].sync = synchronize_rcu_expedited;
+		gp_ops[RCU_SCHED_SYNC].sync = synchronize_sched_expedited;
+		gp_ops[RCU_BH_SYNC].sync = synchronize_rcu_bh_expedited;
+	}
+	return 0;
+}
+early_initcall(rcu_sync_early_init);
+
 #ifdef CONFIG_PROVE_RCU
 void rcu_sync_lockdep_assert(struct rcu_sync *rsp)
 {
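
For readers following the patch, the pattern it relies on (a non-const
table of function pointers that an early init hook patches once, so that
callers never test the flag themselves) can be sketched in userspace C.
This is only an illustration under stated assumptions, not kernel code:
sync_normal(), sync_expedited(), ops[], sync_early_init() and
wait_for_grace_period() are hypothetical stand-ins for the grace-period
primitives, gp_ops[], rcu_sync_early_init() and the rcusync callers, and
the boolean argument stands in for the rcusync.expedited boot parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the normal and expedited primitives. */
static int normal_calls;
static int expedited_calls;

static void sync_normal(void)    { normal_calls++; }
static void sync_expedited(void) { expedited_calls++; }

enum sync_type { SYNC_A, SYNC_B, NR_SYNC_TYPES };

/*
 * Not const, so that init code can patch it once.  After that single
 * write it is only read, which is the case __read_mostly targets in
 * the kernel.
 */
static struct {
	void (*sync)(void);
} ops[NR_SYNC_TYPES] = {
	[SYNC_A] = { .sync = sync_normal },
	[SYNC_B] = { .sync = sync_normal },
};

/* Models the early init hook: runs once, before the table is used. */
static void sync_early_init(bool expedited)
{
	if (!expedited)
		return;	/* the default is no expediting */
	printf("sync: expedited operation in effect\n");
	for (int i = 0; i < NR_SYNC_TYPES; i++)
		ops[i].sync = sync_expedited;
}

/* Callers go through the table and never check the flag themselves. */
static void wait_for_grace_period(enum sync_type type)
{
	assert(type < NR_SYNC_TYPES);
	ops[type].sync();
}
```

Calling sync_early_init(true) before the first wait_for_grace_period()
routes every later wait through the expedited stand-in.  Leaving it
false keeps the normal path, and the callers themselves are unchanged
in either case.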
