* RT-Scheduler/cgroups: Possible overuse of resources assigned via cpu.rt_period_us and cpu.rt_runtime_us
From: Daniel K. @ 2008-06-18 14:12 UTC (permalink / raw)
To: Peter Zijlstra, mingo, Linux Kernel Mailing List

mkdir /dev/cgroup
mount -t cgroup -o cpu,cpuset cgroup /dev/cgroup

mkdir /dev/cgroup/0

echo 3 > /dev/cgroup/0/cpuset.cpus
echo 0 > /dev/cgroup/0/cpuset.mems
echo 100000 > /dev/cgroup/0/cpu.rt_period_us
echo 5000 > /dev/cgroup/0/cpu.rt_runtime_us

schedtool -R -p 1 -e burnP6 &
[1] 3309
echo 3309 > /dev/cgroup/0/tasks

At this point I'd expect the burnP6 task to use 5% of the available CPU
resources in the cgroup (5000/100000), but the real CPU usage, as
reported by top, is 20%. This is 4 times the expected result, and as I
have 4 cores, I think there is a strong hint of correlation there.

Maybe with a 4 core system there really is 4 000 000 us available for
every 1 wall-time second?

However, I have only assigned one core (3) to _this_ cgroup, so I think
this cgroup is overusing its assigned resources.

What do you think?

Daniel K.
* Re: RT-Scheduler/cgroups: Possible overuse of resources assigned via cpu.rt_period_us and cpu.rt_runtime_us
From: Peter Zijlstra @ 2008-06-18 14:37 UTC (permalink / raw)
To: Daniel K.
Cc: mingo, Linux Kernel Mailing List, Max Krasnyanskiy, Paul Jackson,
Gregory Haskins
On Wed, 2008-06-18 at 16:12 +0200, Daniel K. wrote:
> mkdir /dev/cgroup
> mount -t cgroup -o cpu,cpuset cgroup /dev/cgroup
>
> mkdir /dev/cgroup/0
>
> echo 3 > /dev/cgroup/0/cpuset.cpus
> echo 0 > /dev/cgroup/0/cpuset.mems
> echo 100000 > /dev/cgroup/0/cpu.rt_period_us
> echo 5000 > /dev/cgroup/0/cpu.rt_runtime_us
>
> schedtool -R -p 1 -e burnP6 &
> [1] 3309
> echo 3309 > /dev/cgroup/0/tasks
>
> At this point I'd expect the burnP6 task to use 5% of the available CPU
> resources in the cgroup (5000/100000), but the real CPU usage, as
> reported by top, is 20% This is 4 times the expected result, and as I
> have 4 cores, I think there is a strong hint of correlation there.
>
> Maybe with a 4 core system there really is 4 000 000 us available for
> every 1 wall-time second?
Indeed. In effect each cpu (see below on specifics) gets the
runtime/period you specify, and it moves unused runtime between cpus.
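The arithmetic behind the observed 20% can be sketched as follows. This is only an illustration, assuming the task can borrow all unused runtime from every CPU in its root domain; the function name and its arguments are made up for this example and are not part of any tool.

```shell
# Sketch: percentage of one CPU an RT task may consume when it can borrow
# unused runtime from every CPU in its root domain.
# rt_share_percent and its arguments are illustrative names only.
rt_share_percent() {
    runtime_us=$1
    period_us=$2
    ncpus=$3
    share=$(( runtime_us * ncpus * 100 / period_us ))
    # A task cannot run for more than the whole period on a single CPU.
    if [ "$share" -gt 100 ]; then
        share=100
    fi
    echo "$share"
}

rt_share_percent 5000 100000 4   # 4 CPUs in the root domain: prints 20
rt_share_percent 5000 100000 1   # isolated single-CPU domain: prints 5
```

With four CPUs in one root domain the result matches what top reported; with the domain reduced to one CPU it drops back to the 5% that was expected.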
> However, I have only assigned one core (3) to _this_ cgroup, so I think
> this cgroup is overusing its assigned resources.
>
> What do you think?

I think you're on to something :-)

It uses root domains, that is the largest domain this cpu is part of
that has load-balancing enabled.

So while you have made your process part of the cgroup and the cpuset,
there is no strong relation between them, that is to say, I could either
mount the cpuset or cpu controller on a different mount point and add
tasks to one but not the other.

So the relation I used is that of load-balance domains.
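As a rough way to see which CPUs end up sharing RT runtime, one could look at the sched_load_balance flag of the top-level cpuset. The helper below is a simplified sketch, not how the kernel computes root domains: it assumes only two levels (the mount point and one child cpuset) and non-overlapping children. The function name and both arguments are hypothetical.

```shell
# Sketch only: approximates which CPUs share RT runtime with a cpuset,
# based on the sched_load_balance flag of the top-level cpuset.
# Arguments (both hypothetical): the cgroup mount point and a child cpuset name.
balance_domain_cpus() {
    mnt=$1
    child=$2
    if [ "$(cat "$mnt/cpuset.sched_load_balance")" = "1" ]; then
        # Top-level balancing is on: every CPU is in one root domain.
        cat "$mnt/cpuset.cpus"
    else
        # Top-level balancing is off: the child's own CPUs bound the domain.
        cat "$mnt/$child/cpuset.cpus"
    fi
}
```

For the setup at the start of this thread, `balance_domain_cpus /dev/cgroup 0` would report all CPUs of the machine rather than just CPU 3, because top-level load balancing was never switched off.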

So in order to get what you intended, do something like:

mount -t cgroup -o cpuset none /dev/cpuset
mount -t cgroup -o cpu none /cgroup/cpu

mkdir /dev/cpuset/root
mkdir /dev/cpuset/rt

#
# this might not actually make the kernel happy
# as it might attempt (and possibly succeed in)
# moving cpu bound kernel threads
#
for i in `cat /dev/cpuset/tasks`; do
echo $i > /dev/cpuset/root/tasks;
done

echo 0-2 > /dev/cpuset/root/cpuset.cpus
echo 3 > /dev/cpuset/rt/cpuset.cpus

echo 0 > /dev/cpuset/cpuset.sched_load_balance

mkdir /cgroup/cpu/foo
echo 100000 > /cgroup/cpu/foo/cpu.rt_period_us
echo 5000 > /cgroup/cpu/foo/cpu.rt_runtime_us

echo $$ > /dev/cpuset/rt/tasks
echo $$ > /cgroup/cpu/foo/tasks

chrt -r 1 burnP6 &
* Re: RT-Scheduler/cgroups: Possible overuse of resources assigned via cpu.rt_period_us and cpu.rt_runtime_us
From: Max Krasnyansky @ 2008-06-24 6:14 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Daniel K., mingo, Linux Kernel Mailing List, Paul Jackson,
Gregory Haskins
Peter Zijlstra wrote:
> On Wed, 2008-06-18 at 16:12 +0200, Daniel K. wrote:
>> mkdir /dev/cgroup
>> mount -t cgroup -o cpu,cpuset cgroup /dev/cgroup
>>
>> mkdir /dev/cgroup/0
>>
>> echo 3 > /dev/cgroup/0/cpuset.cpus
>> echo 0 > /dev/cgroup/0/cpuset.mems
>> echo 100000 > /dev/cgroup/0/cpu.rt_period_us
>> echo 5000 > /dev/cgroup/0/cpu.rt_runtime_us
>>
>> schedtool -R -p 1 -e burnP6 &
>> [1] 3309
>> echo 3309 > /dev/cgroup/0/tasks
>>
>> At this point I'd expect the burnP6 task to use 5% of the available CPU
>> resources in the cgroup (5000/100000), but the real CPU usage, as
>> reported by top, is 20% This is 4 times the expected result, and as I
>> have 4 cores, I think there is a strong hint of correlation there.
>>
>> Maybe with a 4 core system there really is 4 000 000 us available for
>> every 1 wall-time second?
>
> Indeed. In effect each cpu (see below on specifics) gets the
> runtime/period you specify, and it moves unused runtime between cpus.
>
>> However, I have only assigned one core (3) to _this_ cgroup, so I think
>> this cgroup is overusing its assigned resources.
>>
>> What do you think?
>
> I think you're on to something :-)
>
> It uses root domains, that is the largest domain this cpu is part of
> that has load-balancing enabled.
>
> So while you have made your process part of the cgroup and the cpuset,
> there is no strong relation between them, that is to say, I could either
> mount the cpuset or cpu controller on a different mount point and add
> tasks to one but not the other.
Daniel is probably really confused by now :).
> So the relation I used is that of load-balance domains.
That's the key thing.
> So in order to get what you intended, do something like:
>
> mount -t cgroup -o cpuset none /dev/cpuset
> mount -t cgroup -o cpu none /cgroup/cpu
>
> mkdir /dev/cpuset/root
> mkdir /dev/cpuset/rt
>
> #
> # this might not actually make the kernel happy
> # as it might attempt (and possibly succeed in)
> # moving cpu bound kernel threads
> #
> for i in `cat /dev/cpuset/tasks`; do
> echo $i > /dev/cpuset/root/tasks;
> done
It won't let you add tasks before adding cpus.
> echo 0-2 > /dev/cpuset/root/cpuset.cpus
> echo 3 > /dev/cpuset/rt/cpuset.cpus
>
> echo 0 > /dev/cpuset/cpuset.sched_load_balance
>
> mkdir /cgroup/cpu/foo
> echo 100000 > /cgroup/cpu/foo/cpu.rt_period_us
> echo 5000 > /cgroup/cpu/foo/cpu.rt_runtime_us
>
> echo $$ > /dev/cpuset/rt/tasks
> echo $$ > /cgroup/cpu/foo/tasks
>
> chrt -r 1 burnP6 &

That seems too complicated :). There is no need to mount them separately. The
only part that was missing from Daniel's example is the sched_load_balance
setting; otherwise he can still have a single cgroup mount, unless I'm missing
something.

In other words:

mkdir /dev/cgroup
mount -t cgroup -o cpu,cpuset cgroup /dev/cgroup

# Setup first domain (cpu 0-2)
mkdir /dev/cgroup/0
echo 0-2 > /dev/cgroup/0/cpuset.cpus
echo 0 > /dev/cgroup/0/cpuset.mems

# Setup second domain (cpu 3)
mkdir /dev/cgroup/1
echo 3 > /dev/cgroup/1/cpuset.cpus
echo 0 > /dev/cgroup/1/cpuset.mems

# Do not balance between domains
echo 0 > /dev/cgroup/cpuset.sched_load_balance

# Move all tasks into first domain if needed
...

# Setup RT bandwidth for second domain
echo 100000 > /dev/cgroup/1/cpu.rt_period_us
echo 5000 > /dev/cgroup/1/cpu.rt_runtime_us

schedtool -R -p 1 -e burnP6 &
[1] 3309
echo 3309 > /dev/cgroup/1/tasks

Max
* Re: RT-Scheduler/cgroups: Possible overuse of resources assigned via cpu.rt_period_us and cpu.rt_runtime_us
From: Peter Zijlstra @ 2008-06-24 9:53 UTC (permalink / raw)
To: Max Krasnyansky
Cc: Daniel K., mingo, Linux Kernel Mailing List, Paul Jackson,
Gregory Haskins
On Mon, 2008-06-23 at 23:14 -0700, Max Krasnyansky wrote:
> That seems too complicated :). There is no need to mount them separately.
Mounting them independently gives you much greater flexibility.
You quickly run into overlapping cpu sets if you want multiple groups on
the isolated part.
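To illustrate the flexibility argument: with the cpu controller on its own mount, several RT bandwidth groups can be created independently of the cpuset layout, and their tasks can all be placed in the same isolated cpuset. The helper below is a minimal sketch under that assumption. `make_rt_group` and `CPU_ROOT` are hypothetical names; the default path is the /cgroup/cpu mount point used earlier in this thread.

```shell
# Hypothetical helper: create one RT bandwidth group under the cpu
# controller mount. CPU_ROOT is an assumed variable; it defaults to the
# /cgroup/cpu mount point used earlier in this thread.
CPU_ROOT=${CPU_ROOT:-/cgroup/cpu}

make_rt_group() {
    name=$1
    period_us=$2
    runtime_us=$3
    mkdir -p "$CPU_ROOT/$name" || return 1
    echo "$period_us" > "$CPU_ROOT/$name/cpu.rt_period_us" || return 1
    echo "$runtime_us" > "$CPU_ROOT/$name/cpu.rt_runtime_us" || return 1
}

# Example use (needs root and a mounted cpu controller):
#   make_rt_group foo 100000 5000
#   make_rt_group bar 100000 10000
# Tasks from both groups can then go into the same cpuset, e.g.
#   echo "$pid" > /dev/cpuset/rt/tasks
```

With a combined cpu,cpuset mount, the same layout would need two sibling cgroups that both claim CPU 3, which is the overlap mentioned above.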
* Re: RT-Scheduler/cgroups: Possible overuse of resources assigned via cpu.rt_period_us and cpu.rt_runtime_us
From: Max Krasnyanskiy @ 2008-06-24 16:50 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Daniel K., mingo, Linux Kernel Mailing List, Paul Jackson,
Gregory Haskins
Peter Zijlstra wrote:
> On Mon, 2008-06-23 at 23:14 -0700, Max Krasnyansky wrote:
>
>> That seems too complicated :). There is no need to mount them separately.
>
> Mounting them independently gives you much greater flexibility.
>
> You quickly run into overlapping cpu sets if you want multiple groups on
> the isolated part.
Sure, it is more flexible but for simple cases it's more confusing.
Max