From: Patrick Bellasi <patrick.bellasi@matbug.net>
To: Qais Yousef <qais.yousef@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Jonathan Corbet <corbet@lwn.net>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Luis Chamberlain <mcgrof@kernel.org>,
Kees Cook <keescook@chromium.org>,
Iurii Zaikin <yzaikin@google.com>,
Quentin Perret <qperret@google.com>,
Valentin Schneider <valentin.schneider@arm.com>,
Pavan Kondeti <pkondeti@codeaurora.org>,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 1/2] sched/uclamp: Add a new sysctl to control RT default boost value
Date: Tue, 14 Apr 2020 20:21:52 +0200 [thread overview]
Message-ID: <20200414182152.GB20442@darkstar> (raw)
In-Reply-To: <20200403123020.13897-1-qais.yousef@arm.com>
Hi Qais!
On 03-Apr 13:30, Qais Yousef wrote:
[...]
> diff --git a/include/linux/sched/sysctl.h b/include/linux/sched/sysctl.h
> index d4f6215ee03f..91204480fabc 100644
> --- a/include/linux/sched/sysctl.h
> +++ b/include/linux/sched/sysctl.h
> @@ -59,6 +59,7 @@ extern int sysctl_sched_rt_runtime;
> #ifdef CONFIG_UCLAMP_TASK
> extern unsigned int sysctl_sched_uclamp_util_min;
> extern unsigned int sysctl_sched_uclamp_util_max;
> +extern unsigned int sysctl_sched_rt_default_uclamp_util_min;
nit-pick: I would prefer to keep the same prefix of the already
exising knobs, i.e. sysctl_sched_uclamp_util_min_rt
The same change for consistency should be applied to all the following
symbols related to "uclamp_util_min_rt".
NOTE: I would not use "default" as I think that what we are doing is
exactly force setting a user_defined value for all RT tasks. More on
that later...
> #endif
>
> #ifdef CONFIG_CFS_BANDWIDTH
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 1a9983da4408..a726b26a5056 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -797,6 +797,27 @@ unsigned int sysctl_sched_uclamp_util_min = SCHED_CAPACITY_SCALE;
> /* Max allowed maximum utilization */
> unsigned int sysctl_sched_uclamp_util_max = SCHED_CAPACITY_SCALE;
>
> +/*
> + * By default RT tasks run at the maximum performance point/capacity of the
> + * system. Uclamp enforces this by always setting UCLAMP_MIN of RT tasks to
> + * SCHED_CAPACITY_SCALE.
> + *
> + * This knob allows admins to change the default behavior when uclamp is being
> + * used. In battery powered devices, particularly, running at the maximum
> + * capacity and frequency will increase energy consumption and shorten the
> + * battery life.
> + *
> + * This knob only affects the default value RT has when a new RT task is
> + * forked or has just changed policy to RT, given the user hasn't modified the
> + * uclamp.min value of the task via sched_setattr().
> + *
> + * This knob will not override the system default sched_util_clamp_min defined
> + * above.
> + *
> + * Any modification is applied lazily on the next RT task wakeup.
> + */
> +unsigned int sysctl_sched_rt_default_uclamp_util_min = SCHED_CAPACITY_SCALE;
> +
> /* All clamps are required to be less or equal than these values */
> static struct uclamp_se uclamp_default[UCLAMP_CNT];
>
> @@ -924,6 +945,14 @@ uclamp_eff_get(struct task_struct *p, enum uclamp_id clamp_id)
> return uc_req;
> }
>
> +static void uclamp_rt_sync_default_util_min(struct task_struct *p)
> +{
> + struct uclamp_se *uc_se = &p->uclamp_req[UCLAMP_MIN];
Don't we have to filter for RT tasks only here?
> +
> + if (!uc_se->user_defined)
> + uclamp_se_set(uc_se, sysctl_sched_rt_default_uclamp_util_min, false);
Here you are actually setting a user-requested value, why not marking
it as that, i.e. by using true for the last parameter?
Moreover, by keeping user_defined=false I think you are not getting
what you want for RT tasks running in a nested cgroup.
Let say a subgroup is still with the util_min=1024 inherited from the
system defaults, in uclamp_tg_restrict() we will still return the max
value and not what you requested for. Isn't it?
IOW, what about:
---8<---
static void uclamp_sync_util_min_rt(struct task_struct *p)
{
struct uclamp_se *uc_se = &p->uclamp_req[UCLAMP_MIN];
if (likely(uc_se->user_defined || !rt_task(p)))
return;
uclamp_se_set(uc_se, sysctl_sched_uclamp_util_min_rt, true);
}
---8<---
> +}
> +
> unsigned long uclamp_eff_value(struct task_struct *p, enum uclamp_id clamp_id)
> {
> struct uclamp_se uc_eff;
> @@ -1030,6 +1059,12 @@ static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p)
> if (unlikely(!p->sched_class->uclamp_enabled))
> return;
>
> + /*
> + * When sysctl_sched_rt_default_uclamp_util_min value is changed by the
> + * user, we apply any new value on the next wakeup, which is here.
> + */
> + uclamp_rt_sync_default_util_min(p);
> +
> for_each_clamp_id(clamp_id)
> uclamp_rq_inc_id(rq, p, clamp_id);
>
> @@ -1121,12 +1156,13 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
> loff_t *ppos)
> {
> bool update_root_tg = false;
> - int old_min, old_max;
> + int old_min, old_max, old_rt_min;
> int result;
>
> mutex_lock(&uclamp_mutex);
> old_min = sysctl_sched_uclamp_util_min;
> old_max = sysctl_sched_uclamp_util_max;
> + old_rt_min = sysctl_sched_rt_default_uclamp_util_min;
Perpahs it's just my OCD but, is not "old_min_rt" reading better?
>
> result = proc_dointvec(table, write, buffer, lenp, ppos);
> if (result)
> @@ -1134,12 +1170,23 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
> if (!write)
> goto done;
>
> + /*
> + * The new value will be applied to all RT tasks the next time they
> + * wakeup, assuming the task is using the system default and not a user
> + * specified value. In the latter we shall leave the value as the user
> + * requested.
> + */
Should not this comment go before the next block?
> if (sysctl_sched_uclamp_util_min > sysctl_sched_uclamp_util_max ||
> sysctl_sched_uclamp_util_max > SCHED_CAPACITY_SCALE) {
> result = -EINVAL;
> goto undo;
> }
>
> + if (sysctl_sched_rt_default_uclamp_util_min > SCHED_CAPACITY_SCALE) {
> + result = -EINVAL;
> + goto undo;
> + }
> +
> if (old_min != sysctl_sched_uclamp_util_min) {
> uclamp_se_set(&uclamp_default[UCLAMP_MIN],
> sysctl_sched_uclamp_util_min, false);
> @@ -1165,6 +1212,7 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
> undo:
> sysctl_sched_uclamp_util_min = old_min;
> sysctl_sched_uclamp_util_max = old_max;
> + sysctl_sched_rt_default_uclamp_util_min = old_rt_min;
> done:
> mutex_unlock(&uclamp_mutex);
>
> @@ -1207,9 +1255,13 @@ static void __setscheduler_uclamp(struct task_struct *p,
> if (uc_se->user_defined)
> continue;
>
> - /* By default, RT tasks always get 100% boost */
> + /*
> + * By default, RT tasks always get 100% boost, which the admins
> + * are allowed to change via
> + * sysctl_sched_rt_default_uclamp_util_min knob.
> + */
> if (unlikely(rt_task(p) && clamp_id == UCLAMP_MIN))
> - clamp_value = uclamp_none(UCLAMP_MAX);
> + clamp_value = sysctl_sched_rt_default_uclamp_util_min;
>
> uclamp_se_set(uc_se, clamp_value, false);
> }
> @@ -1241,9 +1293,13 @@ static void uclamp_fork(struct task_struct *p)
> for_each_clamp_id(clamp_id) {
> unsigned int clamp_value = uclamp_none(clamp_id);
>
> - /* By default, RT tasks always get 100% boost */
> + /*
> + * By default, RT tasks always get 100% boost, which the admins
> + * are allowed to change via
> + * sysctl_sched_rt_default_uclamp_util_min knob.
> + */
> if (unlikely(rt_task(p) && clamp_id == UCLAMP_MIN))
> - clamp_value = uclamp_none(UCLAMP_MAX);
> + clamp_value = sysctl_sched_rt_default_uclamp_util_min;
>
This is not required, look at this Quentin's patch:
Message-ID: <20200414161320.251897-1-qperret@google.com>
https://lore.kernel.org/lkml/20200414161320.251897-1-qperret@google.com/
> uclamp_se_set(&p->uclamp_req[clamp_id], clamp_value, false);
> }
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index ad5b88a53c5a..0272ae8c6147 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -465,6 +465,13 @@ static struct ctl_table kern_table[] = {
> .mode = 0644,
> .proc_handler = sysctl_sched_uclamp_handler,
> },
> + {
> + .procname = "sched_rt_default_util_clamp_min",
> + .data = &sysctl_sched_rt_default_uclamp_util_min,
> + .maxlen = sizeof(unsigned int),
> + .mode = 0644,
> + .proc_handler = sysctl_sched_uclamp_handler,
> + },
> #endif
> #ifdef CONFIG_SCHED_AUTOGROUP
> {
Best,
Patrick
--
#include <best/regards.h>
Patrick Bellasi
next prev parent reply other threads:[~2020-04-14 18:22 UTC|newest]
Thread overview: 68+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-04-03 12:30 [PATCH 1/2] sched/uclamp: Add a new sysctl to control RT default boost value Qais Yousef
2020-04-03 12:30 ` [PATCH 2/2] Documentation/sysctl: Document uclamp sysctl knobs Qais Yousef
2020-04-14 18:21 ` Patrick Bellasi [this message]
2020-04-15 7:46 ` [PATCH 1/2] sched/uclamp: Add a new sysctl to control RT default boost value Patrick Bellasi
2020-04-20 15:04 ` Qais Yousef
2020-04-20 8:24 ` Dietmar Eggemann
2020-04-20 15:19 ` Qais Yousef
2020-04-21 0:52 ` Steven Rostedt
2020-04-21 11:16 ` Dietmar Eggemann
2020-04-21 11:23 ` Qais Yousef
2020-04-20 14:50 ` Qais Yousef
2020-04-15 10:11 ` Quentin Perret
2020-04-20 15:08 ` Qais Yousef
2020-04-20 8:29 ` Dietmar Eggemann
2020-04-20 15:13 ` Qais Yousef
2020-04-21 11:18 ` Dietmar Eggemann
2020-04-21 11:27 ` Qais Yousef
2020-04-22 10:59 ` Dietmar Eggemann
2020-04-22 13:13 ` Qais Yousef
-- strict thread matches above, loose matches on Subject: below --
2020-05-11 15:40 Qais Yousef
2020-05-11 17:18 ` Qais Yousef
2020-05-12 2:10 ` Pavan Kondeti
2020-05-12 11:46 ` Qais Yousef
2020-05-15 11:08 ` Patrick Bellasi
2020-05-18 8:31 ` Dietmar Eggemann
2020-05-18 16:49 ` Qais Yousef
2020-05-28 13:23 ` Peter Zijlstra
2020-05-28 15:58 ` Qais Yousef
2020-05-28 16:11 ` Peter Zijlstra
2020-05-28 16:51 ` Qais Yousef
2020-05-28 18:29 ` Peter Zijlstra
2020-05-28 19:08 ` Patrick Bellasi
2020-05-28 19:20 ` Dietmar Eggemann
2020-05-29 9:11 ` Qais Yousef
2020-05-29 10:21 ` Mel Gorman
2020-05-29 15:11 ` Qais Yousef
2020-05-29 16:02 ` Mel Gorman
2020-05-29 16:05 ` Qais Yousef
2020-05-29 10:08 ` Mel Gorman
2020-05-29 16:04 ` Qais Yousef
2020-05-29 16:57 ` Mel Gorman
2020-06-02 16:46 ` Dietmar Eggemann
2020-06-03 8:29 ` Patrick Bellasi
2020-06-03 10:10 ` Mel Gorman
2020-06-03 14:59 ` Vincent Guittot
2020-06-03 16:52 ` Qais Yousef
2020-06-04 12:14 ` Vincent Guittot
2020-06-05 10:45 ` Qais Yousef
2020-06-09 15:29 ` Vincent Guittot
2020-06-08 12:31 ` Qais Yousef
2020-06-08 13:06 ` Valentin Schneider
2020-06-08 14:44 ` Steven Rostedt
2020-06-11 10:13 ` Qais Yousef
2020-06-09 17:10 ` Vincent Guittot
2020-06-11 10:24 ` Qais Yousef
2020-06-11 12:01 ` Vincent Guittot
2020-06-23 15:44 ` Qais Yousef
2020-06-24 8:45 ` Vincent Guittot
2020-06-05 7:55 ` Patrick Bellasi
2020-06-05 11:32 ` Qais Yousef
2020-06-05 13:27 ` Patrick Bellasi
2020-06-03 9:40 ` Mel Gorman
2020-06-03 12:41 ` Qais Yousef
2020-06-04 13:40 ` Mel Gorman
2020-06-05 10:58 ` Qais Yousef
2020-06-11 10:58 ` Qais Yousef
2020-06-16 11:08 ` Qais Yousef
2020-06-16 13:56 ` Lukasz Luba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200414182152.GB20442@darkstar \
--to=patrick.bellasi@matbug.net \
--cc=bsegall@google.com \
--cc=corbet@lwn.net \
--cc=dietmar.eggemann@arm.com \
--cc=juri.lelli@redhat.com \
--cc=keescook@chromium.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mcgrof@kernel.org \
--cc=mgorman@suse.de \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=pkondeti@codeaurora.org \
--cc=qais.yousef@arm.com \
--cc=qperret@google.com \
--cc=rostedt@goodmis.org \
--cc=valentin.schneider@arm.com \
--cc=vincent.guittot@linaro.org \
--cc=yzaikin@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.