From: Juri Lelli <juri.lelli@redhat.com>
To: peterz@infradead.org, mingo@redhat.com
Cc: linux-kernel@vger.kernel.org, tglx@linutronix.de,
vincent.guittot@linaro.org, rostedt@goodmis.org,
luca.abeni@santannapisa.it, claudio@evidence.eu.com,
tommaso.cucinotta@santannapisa.it, bristot@redhat.com,
mathieu.poirier@linaro.org, tkjos@android.com, joelaf@google.com,
morten.rasmussen@arm.com, dietmar.eggemann@arm.com,
patrick.bellasi@arm.com, alessio.balsini@arm.com,
juri.lelli@redhat.com
Subject: [RFC PATCH 0/3] SCHED_DEADLINE cgroups support
Date: Mon, 12 Feb 2018 14:40:27 +0100
Message-ID: <20180212134030.12846-1-juri.lelli@redhat.com>
Hi,
A long time ago there was a patch [1] (written by Dario) adding DEADLINE
bandwidth management control for task groups. It was then dropped from
the set of patches that made it to mainline, both because it fell outside
the bare minimum of features needed to start playing with SCHED_DEADLINE
and because quite a few discussion points remained open.
Fast forward to the present day: more features have been added, but
DEADLINE usage is still reserved to root only. Several things are still
missing before we can comfortably relax privileges, bandwidth
management for groups of tasks being one of the most important (together
with a better/safer PI mechanism, I'd say).
Another (different) attempt to add cgroup support was proposed last year
[2]. That set implemented hierarchical scheduling support (RT
entities running inside DEADLINE servers). Complexity (and maybe not
enough documentation? :) made it difficult to get a discussion going
around that proposal.
Even though hierarchical scheduling is still what we want in the end,
this set tries to start getting there by adding cgroup-based bandwidth
management for SCHED_DEADLINE. The following design choices have been
made (also detailed in changelog/doc):
- implementation _is not_ hierarchical: only single/plain DEADLINE
  entities can be handled, and they get scheduled at root rq level
- DEADLINE_GROUP_SCHED requires RT_GROUP_SCHED (because of the points
  below)
- DEADLINE and RT share bandwidth; therefore, DEADLINE tasks will eat
  RT bandwidth, as they do today at root level; support for
  RT_RUNTIME_SHARE is however missing: an RT task might be able to
  exceed its group bandwidth constraint if that feature is enabled
  (more thinking required)
- cpu.rt_runtime_us and cpu.rt_period_us therefore still control a
  group's bandwidth; however, two additional (read-only) knobs are added
    # cpu.dl_bw : maximum bandwidth available for the group on each CPU
                  (rt_runtime_us/rt_period_us)
    # cpu.dl_total_bw : current total (across CPUs) amount of bandwidth
                        allocated by the group (sum of tasks' bandwidth)
- parent/children/siblings rules are the same as for RT
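
To make the two read-only knobs concrete, here is a minimal userspace
sketch of the arithmetic they describe. It is not code from the patches.
The fixed-point ratio follows the kernel's to_ratio()/BW_SHIFT convention;
struct group_dl_bw and group_admit() are hypothetical names, and the
admission rule (sum of task bandwidths must stay within dl_bw times the
number of CPUs) is an assumption modelled on the existing root-level check.

```c
#include <stdbool.h>
#include <stdint.h>

#define BW_SHIFT 20	/* fixed-point shift: 1 << BW_SHIFT is 100% of one CPU */

/* runtime/period as a fixed-point fraction (any unit, as long as both match) */
uint64_t to_ratio(uint64_t period, uint64_t runtime)
{
	if (period == 0)
		return 0;
	/* fine for realistic values; runtime must stay below 2^44 */
	return (runtime << BW_SHIFT) / period;
}

/* Hypothetical per-group accounting, named after the two knobs. */
struct group_dl_bw {
	uint64_t dl_bw;		/* per-CPU cap: rt_runtime_us / rt_period_us */
	uint64_t dl_total_bw;	/* sum of admitted tasks' bandwidth, all CPUs */
};

/*
 * ASSUMPTION: a task is admitted if the group's total, including the new
 * task, does not exceed the per-CPU cap multiplied by the number of CPUs.
 * On success the task's bandwidth is added to dl_total_bw.
 */
bool group_admit(struct group_dl_bw *g, int ncpus,
		 uint64_t task_runtime, uint64_t task_period)
{
	uint64_t new_bw = to_ratio(task_period, task_runtime);

	if (g->dl_total_bw + new_bw > g->dl_bw * (uint64_t)ncpus)
		return false;

	g->dl_total_bw += new_bw;
	return true;
}
```

For instance, with rt_runtime_us=200000 and rt_period_us=1000000 on 4 CPUs,
the group may hold up to 80% of one CPU's worth of reservations in total,
so under the assumed rule eight tasks asking for 10ms every 100ms fit and
a ninth is refused.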
Adding this kind of support should make it possible to let normal
users use DEADLINE, as the sysadmin (with root privileges) could
reserve a fraction of the total available bandwidth for users and let
them allocate what they need inside that space.
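
From the user's side, "allocating inside such space" would presumably look
like an ordinary sched_setattr(2) request. The sketch below only shows how
such a request is built today; struct dl_attr is a local mirror of the
kernel's first-version struct sched_attr layout (named differently to
avoid clashing with libc headers that define it), and dl_attr_init() and
dl_request() are hypothetical helpers. Whether an unprivileged task inside
a suitably configured group would be admitted is what this set is about,
so the comment on the return value is an expectation, not tested behaviour.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef SCHED_DEADLINE
#define SCHED_DEADLINE 6	/* policy number used by the kernel */
#endif

/* Local mirror of the kernel's struct sched_attr, first version (48 bytes). */
struct dl_attr {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;
	uint32_t sched_priority;
	uint64_t sched_runtime;		/* ns */
	uint64_t sched_deadline;	/* ns */
	uint64_t sched_period;		/* ns */
};

/* Fill in a reservation of runtime_ns every period_ns (deadline == period). */
void dl_attr_init(struct dl_attr *a, uint64_t runtime_ns, uint64_t period_ns)
{
	memset(a, 0, sizeof(*a));
	a->size = sizeof(*a);
	a->sched_policy = SCHED_DEADLINE;
	a->sched_runtime = runtime_ns;
	a->sched_deadline = period_ns;
	a->sched_period = period_ns;
}

/*
 * Ask for the reservation for the calling thread (pid 0, flags 0).
 * Returns 0 on success, -1 with errno set. Today EPERM means missing
 * privileges and EBUSY means admission control refused the request;
 * the expectation (ASSUMPTION) is that a group limit would show up
 * the same way.
 */
int dl_request(const struct dl_attr *a)
{
	return (int)syscall(SYS_sched_setattr, 0, a, 0);
}
```

A task wanting 10ms every 100ms would call dl_attr_init(&a, 10000000,
100000000) followed by dl_request(&a) and check errno on failure.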
I'm more than sure that there are problems lurking in this set (e.g.,
too much ifdeffery) and many discussion points are still open, but I
wanted to share what I have early and see what people think about it
(possibly understanding how to move forward).
The first patch might actually be a standalone cleanup change.
The set (based on tip/sched/core as of today) is available at:
https://github.com/jlelli/linux.git upstream/deadline/cgroup-rfc-v1
Comments and feedback are the purpose of this RFC. Thanks in advance!
Best,
- Juri
[1] https://lkml.org/lkml/2010/2/28/119
[2] https://lwn.net/Articles/718645/
Juri Lelli (3):
sched/deadline: merge dl_bw into dl_bandwidth
sched/deadline: add task groups bandwidth management support
Documentation/scheduler/sched-deadline: add info about cgroup support
Documentation/scheduler/sched-deadline.txt | 36 +++--
init/Kconfig | 12 ++
kernel/sched/autogroup.c | 7 +
kernel/sched/core.c | 56 ++++++-
kernel/sched/deadline.c | 241 +++++++++++++++++++++++------
kernel/sched/debug.c | 6 +-
kernel/sched/rt.c | 52 ++++++-
kernel/sched/sched.h | 68 ++++----
kernel/sched/topology.c | 2 +-
9 files changed, 381 insertions(+), 99 deletions(-)
--
2.14.3