All of lore.kernel.org
 help / color / mirror / Atom feed
From: Patrick Bellasi <patrick.bellasi@arm.com>
To: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	linux-api@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>, Tejun Heo <tj@kernel.org>,
	"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Paul Turner <pjt@google.com>,
	Quentin Perret <quentin.perret@arm.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	Juri Lelli <juri.lelli@redhat.com>, Todd Kjos <tkjos@google.com>,
	Joel Fernandes <joelaf@google.com>,
	Steve Muckle <smuckle@google.com>,
	Suren Baghdasaryan <surenb@google.com>
Subject: [PATCH v6 15/16] sched/core: uclamp: Use TG's clamps to restrict TASK's clamps
Date: Tue, 15 Jan 2019 10:15:12 +0000	[thread overview]
Message-ID: <20190115101513.2822-16-patrick.bellasi@arm.com> (raw)
In-Reply-To: <20190115101513.2822-1-patrick.bellasi@arm.com>

When a task specific clamp value is configured via sched_setattr(2),
this value is accounted in the corresponding clamp bucket every time the
task is {en,de}qeued. However, when cgroups are also in use, the task
specific clamp values could be restricted by the task_group (TG)
clamp values.

Update uclamp_cpu_inc() to aggregate task and TG clamp values. Every
time a task is enqueued, it's accounted in the clamp_bucket defining the
smaller clamp between the task specific value and its TG effective
value. This allows to:

1. ensure cgroup clamps are always used to restrict task specific
   requests, i.e. boosted only up to the effective granted value or
   clamped at least to a certain value

2. implement a "nice-like" policy, where tasks are still allowed to
   request less then what enforced by their current TG

This mimics what already happens for a task's CPU affinity mask when the
task is also in a cpuset, i.e. cgroup attributes are always used to
restrict per-task attributes.

Do this by exploiting the concept of "effective" clamp, which is already
used by a TG to track parent enforced restrictions.

Apply task group clamp restrictions only to tasks belonging to a child
group. While, for tasks in the root group or in an autogroup, only
system defaults are enforced.

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>

---
Changes in v6:
 Others:
 - wholesale s/group/bucket/
---
 include/linux/sched.h | 10 ++++++++++
 kernel/sched/core.c   | 42 +++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 3f02128fe6b2..bb4e3b1085f9 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -602,6 +602,7 @@ struct sched_dl_entity {
  * @bucket_id:		the bucket index used by the fast-path
  * @mapped:		the bucket index is valid
  * @active:		the se is currently refcounted in a CPU's clamp bucket
+ * @user_defined:	calmp value explicitly required from user-space
  *
  * A utilization clamp bucket maps a:
  *   clamp value (value), i.e.
@@ -619,12 +620,21 @@ struct sched_dl_entity {
  * The active bit is set whenever a task has got an effective clamp bucket
  * and value assigned, and it allows to know a task is actually refcounting a
  * CPU's clamp bucket.
+ *
+ * The user_defined bit is set whenever a task has got a task-specific clamp
+ * value requested from userspace, i.e. the system defaults apply to this
+ * task just as a restriction. This allows to relax TG's clamps when a less
+ * restrictive task specific value has been defined, thus allowing to
+ * implement a "nice" semantic when both task bucket and task specific values
+ * are used. For example, a task running on a 20% boosted TG can still drop
+ * its own boosting to 0%.
  */
 struct uclamp_se {
 	unsigned int value		: bits_per(SCHED_CAPACITY_SCALE);
 	unsigned int bucket_id		: bits_per(UCLAMP_BUCKETS);
 	unsigned int mapped		: 1;
 	unsigned int active		: 1;
+	unsigned int user_defined	: 1;
 	/*
 	 * Clamp bucket and value actually used by a scheduling entity,
 	 * i.e. a (RUNNABLE) task or a task group.
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 734b769db2ca..c8d1fc9880ff 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -845,10 +845,23 @@ static inline void uclamp_cpu_update(struct rq *rq, unsigned int clamp_id,
 	WRITE_ONCE(rq->uclamp[clamp_id].value, max_value);
 }
 
+static inline bool uclamp_apply_defaults(struct task_struct *p)
+{
+	if (!IS_ENABLED(CONFIG_UCLAMP_TASK_GROUP))
+		return true;
+	if (task_group_is_autogroup(task_group(p)))
+		return true;
+	if (task_group(p) == &root_task_group)
+		return true;
+	return false;
+}
+
 /*
  * The effective clamp bucket index of a task depends on, by increasing
  * priority:
  * - the task specific clamp value, explicitly requested from userspace
+ * - the task group effective clamp value, for tasks not in the root group or
+ *   in an autogroup
  * - the system default clamp value, defined by the sysadmin
  *
  * As a side effect, update the task's effective value:
@@ -865,6 +878,29 @@ uclamp_effective_get(struct task_struct *p, unsigned int clamp_id,
 	*clamp_value = p->uclamp[clamp_id].value;
 	*bucket_id = p->uclamp[clamp_id].bucket_id;
 
+	if (!uclamp_apply_defaults(p)) {
+#ifdef CONFIG_UCLAMP_TASK_GROUP
+		unsigned int clamp_max, bucket_max;
+		struct uclamp_se *tg_clamp;
+
+		tg_clamp = &task_group(p)->uclamp[clamp_id];
+		clamp_max = tg_clamp->effective.value;
+		bucket_max = tg_clamp->effective.bucket_id;
+
+		if (!p->uclamp[clamp_id].user_defined ||
+		    *clamp_value > clamp_max) {
+			*clamp_value = clamp_max;
+			*bucket_id = bucket_max;
+		}
+#endif
+		/*
+		 * If we have task groups and we are running in a child group,
+		 * system default does not apply anymore since we assume task
+		 * group clamps are properly configured.
+		 */
+		return;
+	}
+
 	/* RT tasks have different default values */
 	default_clamp = task_has_rt_policy(p)
 		? uclamp_default_perf
@@ -1223,10 +1259,12 @@ static int __setscheduler_uclamp(struct task_struct *p,
 
 	mutex_lock(&uclamp_mutex);
 	if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
+		p->uclamp[UCLAMP_MIN].user_defined = true;
 		uclamp_bucket_inc(p, &p->uclamp[UCLAMP_MIN],
 				  UCLAMP_MIN, lower_bound);
 	}
 	if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MAX) {
+		p->uclamp[UCLAMP_MAX].user_defined = true;
 		uclamp_bucket_inc(p, &p->uclamp[UCLAMP_MAX],
 				  UCLAMP_MAX, upper_bound);
 	}
@@ -1259,8 +1297,10 @@ static void uclamp_fork(struct task_struct *p, bool reset)
 	for (clamp_id = 0; clamp_id < UCLAMP_CNT; ++clamp_id) {
 		unsigned int clamp_value = p->uclamp[clamp_id].value;
 
-		if (unlikely(reset))
+		if (unlikely(reset)) {
 			clamp_value = uclamp_none(clamp_id);
+			p->uclamp[clamp_id].user_defined = false;
+		}
 
 		p->uclamp[clamp_id].mapped = false;
 		p->uclamp[clamp_id].active = false;
-- 
2.19.2

  parent reply	other threads:[~2019-01-15 10:15 UTC|newest]

Thread overview: 90+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-15 10:14 [PATCH v6 00/16] Add utilization clamping support Patrick Bellasi
2019-01-15 10:14 ` Patrick Bellasi
2019-01-15 10:14 ` [PATCH v6 01/16] sched/core: Allow sched_setattr() to use the current policy Patrick Bellasi
2019-01-25 13:56   ` Alessio Balsini
2019-01-15 10:14 ` [PATCH v6 02/16] sched/core: uclamp: Extend sched_setattr() to support utilization clamping Patrick Bellasi
2019-01-15 10:15 ` [PATCH v6 03/16] sched/core: uclamp: Map TASK's clamp values into CPU's clamp buckets Patrick Bellasi
2019-01-21 10:15   ` Peter Zijlstra
2019-01-21 12:27     ` Patrick Bellasi
2019-01-21 12:51       ` Peter Zijlstra
2019-01-21 15:05   ` Peter Zijlstra
2019-01-21 15:34     ` Patrick Bellasi
2019-01-15 10:15 ` [PATCH v6 04/16] sched/core: uclamp: Add CPU's clamp buckets refcounting Patrick Bellasi
2019-01-21 14:59   ` Peter Zijlstra
2019-01-21 15:23     ` Patrick Bellasi
2019-01-21 16:12       ` Peter Zijlstra
2019-01-21 16:33         ` Patrick Bellasi
2019-01-22  9:45           ` Peter Zijlstra
2019-01-22 10:31             ` Patrick Bellasi
2019-01-21 15:17   ` Peter Zijlstra
2019-01-21 15:54     ` Patrick Bellasi
2019-01-22 10:03       ` Peter Zijlstra
2019-01-22 10:53         ` Patrick Bellasi
2019-01-15 10:15 ` [PATCH v6 05/16] sched/core: uclamp: Update CPU's refcount on clamp changes Patrick Bellasi
2019-01-21 15:33   ` Peter Zijlstra
2019-01-21 15:44     ` Patrick Bellasi
2019-01-22  9:37       ` Peter Zijlstra
2019-01-22 10:43         ` Patrick Bellasi
2019-01-22 13:28           ` Peter Zijlstra
2019-01-22 14:01             ` Patrick Bellasi
2019-01-22 14:57               ` Peter Zijlstra
2019-01-22 15:33                 ` Patrick Bellasi
2019-01-23  9:16                   ` Peter Zijlstra
2019-01-23 14:14                     ` Patrick Bellasi
2019-01-23 18:59                       ` Peter Zijlstra
2019-01-24 11:21                         ` Patrick Bellasi
2019-01-24 12:38                           ` Peter Zijlstra
2019-01-15 10:15 ` [PATCH v6 06/16] sched/core: uclamp: Enforce last task UCLAMP_MAX Patrick Bellasi
2019-01-15 10:15 ` [PATCH v6 07/16] sched/core: uclamp: Add system default clamps Patrick Bellasi
2019-01-22 13:56   ` Peter Zijlstra
2019-01-22 14:43     ` Patrick Bellasi
2019-01-22 15:13       ` Peter Zijlstra
2019-01-22 15:41         ` Patrick Bellasi
2019-01-23  9:22           ` Peter Zijlstra
2019-01-23 14:19             ` Patrick Bellasi
2019-01-23 19:10               ` Peter Zijlstra
2019-01-15 10:15 ` [PATCH v6 08/16] sched/cpufreq: uclamp: Add utilization clamping for FAIR tasks Patrick Bellasi
2019-01-22 10:37   ` Rafael J. Wysocki
2019-01-22 11:02     ` Patrick Bellasi
2019-01-22 11:04       ` Rafael J. Wysocki
2019-01-22 11:27         ` Patrick Bellasi
2019-01-22 15:21   ` Peter Zijlstra
2019-01-22 15:45     ` Patrick Bellasi
2019-01-22 17:13   ` Peter Zijlstra
2019-01-22 18:18     ` Patrick Bellasi
2019-01-23  9:52       ` Peter Zijlstra
2019-01-23 14:24         ` Patrick Bellasi
2019-01-15 10:15 ` [PATCH v6 09/16] sched/cpufreq: uclamp: Add utilization clamping for RT tasks Patrick Bellasi
2019-01-22 12:30   ` Quentin Perret
2019-01-22 12:37     ` Patrick Bellasi
2019-01-23 10:28   ` Peter Zijlstra
2019-01-23 14:33     ` Patrick Bellasi
2019-01-23 10:49   ` Peter Zijlstra
2019-01-23 14:40     ` Patrick Bellasi
2019-01-23 20:11       ` Peter Zijlstra
2019-01-24 12:30         ` Patrick Bellasi
2019-01-24 12:38           ` Patrick Bellasi
2019-01-24 15:12             ` Peter Zijlstra
2019-01-24 16:00               ` Patrick Bellasi
2019-01-24 15:31           ` Peter Zijlstra
2019-01-24 16:14             ` Patrick Bellasi
2019-01-24 15:33           ` Peter Zijlstra
2019-01-24 15:15   ` Peter Zijlstra
2019-01-24 16:05     ` Patrick Bellasi
2019-01-15 10:15 ` [PATCH v6 10/16] sched/core: Add uclamp_util_with() Patrick Bellasi
2019-01-23 13:33   ` Peter Zijlstra
2019-01-23 14:51     ` Patrick Bellasi
2019-01-23 19:22       ` Peter Zijlstra
2019-01-15 10:15 ` [PATCH v6 11/16] sched/fair: Add uclamp support to energy_compute() Patrick Bellasi
2019-01-22 12:13   ` Quentin Perret
2019-01-22 12:45     ` Patrick Bellasi
2019-01-22 13:29       ` Quentin Perret
2019-01-22 14:26         ` Patrick Bellasi
2019-01-22 14:39           ` Quentin Perret
2019-01-22 15:01             ` Patrick Bellasi
2019-01-22 15:14               ` Quentin Perret
2019-01-15 10:15 ` [PATCH v6 12/16] sched/core: uclamp: Extend CPU's cgroup controller Patrick Bellasi
2019-01-15 10:15 ` [PATCH v6 13/16] sched/core: uclamp: Propagate parent clamps Patrick Bellasi
2019-01-15 10:15 ` [PATCH v6 14/16] sched/core: uclamp: Map TG's clamp values into CPU's clamp buckets Patrick Bellasi
2019-01-15 10:15 ` Patrick Bellasi [this message]
2019-01-15 10:15 ` [PATCH v6 16/16] sched/core: uclamp: Update CPU's refcount on TG's clamp changes Patrick Bellasi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190115101513.2822-16-patrick.bellasi@arm.com \
    --to=patrick.bellasi@arm.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=joelaf@google.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=morten.rasmussen@arm.com \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=quentin.perret@arm.com \
    --cc=rafael.j.wysocki@intel.com \
    --cc=smuckle@google.com \
    --cc=surenb@google.com \
    --cc=tj@kernel.org \
    --cc=tkjos@google.com \
    --cc=vincent.guittot@linaro.org \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.