public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Patrick Bellasi <patrick.bellasi@arm.com>
To: linux-kernel@vger.kernel.org
Cc: Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Steve Muckle <steve.muckle@linaro.org>,
	Leo Yan <leo.yan@linaro.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	"Rafael J . Wysocki" <rjw@rjwysocki.net>,
	Todd Kjos <tkjos@google.com>,
	Srinath Sridharan <srinathsr@google.com>,
	Andres Oportus <andresoportus@google.com>,
	Juri Lelli <juri.lelli@arm.com>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Chris Redpath <chris.redpath@arm.com>,
	Robin Randhawa <robin.randhawa@arm.com>,
	Patrick Bellasi <patrick.bellasi@arm.com>,
	Ingo Molnar <mingo@redhat.com>
Subject: [RFC v2 2/8] sched/tune: add sysctl interface to define a boost value
Date: Thu, 27 Oct 2016 18:41:02 +0100	[thread overview]
Message-ID: <20161027174108.31139-3-patrick.bellasi@arm.com> (raw)
In-Reply-To: <20161027174108.31139-1-patrick.bellasi@arm.com>

The current (CFS) scheduler implementation does not allow "to boost"
tasks performance by running them at a higher OPP compared to the
minimum required to meet their workload demands.

To support tasks performance boosting the scheduler should provide a
"knob" which allows to tune how much the system is going to be optimised
for energy efficiency vs performance boosting.
It's worth to notice that by energy-efficiency we mean running a CPU at
the minimum OPP which satisfy its utilization while for performance
boosting we mean running a task as fast as possible.

This patch is the first of a series which provides a simple interface to
define a tuning knob. One system-wide "boost" tunable is exposed via:
  /proc/sys/kernel/sched_cfs_boost
which can be configured in the range [0..100], to define a percentage
where:
    0% boost requires to operate in "standard" mode by scheduling
       tasks at the minimum capacities required by the workload demand
  100% boost requires to push at maximum the task performances,
       "regardless" of the incurred energy consumption

A boost value in between these two boundaries is used to bias the
power/performance trade-off, the higher the boost value the more the
scheduler is biased toward performance boosting instead of energy
efficiency.

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
---
 include/linux/sched/sysctl.h | 16 ++++++++++++++++
 init/Kconfig                 | 31 +++++++++++++++++++++++++++++++
 kernel/sched/Makefile        |  1 +
 kernel/sched/tune.c          | 23 +++++++++++++++++++++++
 kernel/sysctl.c              | 11 +++++++++++
 5 files changed, 82 insertions(+)
 create mode 100644 kernel/sched/tune.c

diff --git a/include/linux/sched/sysctl.h b/include/linux/sched/sysctl.h
index 4411453..5bfbb14 100644
--- a/include/linux/sched/sysctl.h
+++ b/include/linux/sched/sysctl.h
@@ -55,6 +55,22 @@ extern int sysctl_sched_rt_runtime;
 extern unsigned int sysctl_sched_cfs_bandwidth_slice;
 #endif
 
+#ifdef CONFIG_SCHED_TUNE
+extern unsigned int sysctl_sched_cfs_boost;
+int sysctl_sched_cfs_boost_handler(struct ctl_table *table, int write,
+				   void __user *buffer, size_t *length,
+				   loff_t *ppos);
+static inline unsigned int get_sysctl_sched_cfs_boost(void)
+{
+	return sysctl_sched_cfs_boost;
+}
+#else
+static inline unsigned int get_sysctl_sched_cfs_boost(void)
+{
+	return 0;
+}
+#endif
+
 #ifdef CONFIG_SCHED_AUTOGROUP
 extern unsigned int sysctl_sched_autogroup_enabled;
 #endif
diff --git a/init/Kconfig b/init/Kconfig
index 34407f1..461e052 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1248,6 +1248,37 @@ config SCHED_AUTOGROUP
 	  desktop applications.  Task group autogeneration is currently based
 	  upon task session.
 
+config SCHED_TUNE
+	bool "Boosting for CFS tasks (EXPERIMENTAL)"
+	depends on SMP
+	help
+	  This option enables the system-wide support for task boosting.
+	  When this support is enabled a new sysctl interface is exposed to
+	  user-space via:
+	     /proc/sys/kernel/sched_cfs_boost
+	  which allows to set a system-wide boost value in range [0..100].
+
+	  The currently boosting strategy is implemented in such a way that:
+	  - a 0% boost value requires to operate in "standard" mode by
+	    scheduling all tasks at the minimum capacities required by their
+	    workload demand
+	  - a 100% boost value requires to push at maximum the task
+	    performances, "regardless" of the incurred energy consumption
+
+	  A boost value in between these two boundaries is used to bias the
+	  power/performance trade-off, the higher the boost value the more the
+	  scheduler is biased toward performance boosting instead of energy
+	  efficiency.
+
+	  Since this support exposes a single system-wide knob, the specified
+	  boost value is applied to all (CFS) tasks in the system.
+
+	  NOTE: SchedTune support is available only on SMP system since only
+	  for those systems is currently defined and tracked the utilization
+	  signal for RQs and SEs.
+
+	  If unsure, say N.
+
 config SYSFS_DEPRECATED
 	bool "Enable deprecated sysfs features to support old userspace tools"
 	depends on SYSFS
diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
index 5e59b83..26ab2a6 100644
--- a/kernel/sched/Makefile
+++ b/kernel/sched/Makefile
@@ -22,6 +22,7 @@ obj-$(CONFIG_SMP) += cpupri.o cpudeadline.o
 obj-$(CONFIG_SCHED_AUTOGROUP) += auto_group.o
 obj-$(CONFIG_SCHEDSTATS) += stats.o
 obj-$(CONFIG_SCHED_DEBUG) += debug.o
+obj-$(CONFIG_SCHED_TUNE) += tune.o
 obj-$(CONFIG_CGROUP_CPUACCT) += cpuacct.o
 obj-$(CONFIG_CPU_FREQ) += cpufreq.o
 obj-$(CONFIG_CPU_FREQ_GOV_SCHEDUTIL) += cpufreq_schedutil.o
diff --git a/kernel/sched/tune.c b/kernel/sched/tune.c
new file mode 100644
index 0000000..7336118
--- /dev/null
+++ b/kernel/sched/tune.c
@@ -0,0 +1,23 @@
+/*
+ * Scheduler Tunability (SchedTune) Extensions for CFS
+ *
+ * Copyright (C) 2016 ARM Ltd, Patrick Bellasi <patrick.bellasi@arm.com>
+ */
+
+#include "sched.h"
+
+unsigned int sysctl_sched_cfs_boost __read_mostly;
+
+int
+sysctl_sched_cfs_boost_handler(struct ctl_table *table, int write,
+			       void __user *buffer, size_t *lenp,
+			       loff_t *ppos)
+{
+	int ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+
+	if (ret || !write)
+		return ret;
+
+	return 0;
+}
+
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 739fb17..43b6d14 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -442,6 +442,17 @@ static struct ctl_table kern_table[] = {
 		.extra1		= &one,
 	},
 #endif
+#ifdef CONFIG_SCHED_TUNE
+	{
+		.procname	= "sched_cfs_boost",
+		.data		= &sysctl_sched_cfs_boost,
+		.maxlen		= sizeof(sysctl_sched_cfs_boost),
+		.mode		= 0644,
+		.proc_handler	= &sysctl_sched_cfs_boost_handler,
+		.extra1		= &zero,
+		.extra2		= &one_hundred,
+	},
+#endif
 #ifdef CONFIG_PROVE_LOCKING
 	{
 		.procname	= "prove_locking",
-- 
2.10.1

  parent reply	other threads:[~2016-10-27 17:41 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-27 17:41 [RFC v2 0/8] SchedTune: central, scheduler-driven, power-perfomance control Patrick Bellasi
2016-10-27 17:41 ` [RFC v2 1/8] sched/tune: add detailed documentation Patrick Bellasi
2016-11-04  9:46   ` Viresh Kumar
2016-11-08 10:53     ` Patrick Bellasi
2016-10-27 17:41 ` Patrick Bellasi [this message]
2016-10-27 17:41 ` [RFC v2 3/8] sched/fair: add function to convert boost value into "margin" Patrick Bellasi
2016-10-27 17:41 ` [RFC v2 4/8] sched/fair: add boosted CPU usage Patrick Bellasi
2016-10-27 17:41 ` [RFC v2 5/8] sched/tune: add initial support for CGroups based boosting Patrick Bellasi
2016-10-27 18:30   ` Tejun Heo
2016-10-27 20:14     ` Patrick Bellasi
2016-10-27 20:39       ` Tejun Heo
2016-10-27 22:34         ` Patrick Bellasi
2016-10-27 17:41 ` [RFC v2 6/8] sched/tune: compute and keep track of per CPU boost value Patrick Bellasi
2016-10-27 17:41 ` [RFC v2 7/8] sched/{fair,tune}: track RUNNABLE tasks impact on " Patrick Bellasi
2016-10-27 17:41 ` [RFC v2 8/8] sched/{fair,tune}: add support for negative boosting Patrick Bellasi
2016-10-27 20:58 ` [RFC v2 0/8] SchedTune: central, scheduler-driven, power-perfomance control Peter Zijlstra
2016-10-28 10:49   ` Patrick Bellasi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161027174108.31139-3-patrick.bellasi@arm.com \
    --to=patrick.bellasi@arm.com \
    --cc=andresoportus@google.com \
    --cc=chris.redpath@arm.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=juri.lelli@arm.com \
    --cc=leo.yan@linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=mingo@redhat.com \
    --cc=morten.rasmussen@arm.com \
    --cc=peterz@infradead.org \
    --cc=rjw@rjwysocki.net \
    --cc=robin.randhawa@arm.com \
    --cc=srinathsr@google.com \
    --cc=steve.muckle@linaro.org \
    --cc=tkjos@google.com \
    --cc=vincent.guittot@linaro.org \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox