From: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
To: Paul Turner <pjt@google.com>
Cc: linux-kernel@vger.kernel.org,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Ingo Molnar <mingo@elte.hu>, Ben Segall <bsegall@google.com>,
	Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Subject: Re: [patch 3/3] From: Ben Segall <bsegall@google.com>
Date: Mon, 14 Nov 2011 15:33:06 +0530
Message-ID: <20111114100306.GA10520@linux.vnet.ibm.com>
In-Reply-To: <4EBB3742.7060404@google.com>
[snip]
> Since throttling occurs in the put_prev_task() path we do not get to observe
> this delta against nr_running when making the decision to idle_balance().
>
> Fix this by first enumerating cfs_rq throttle states so that we can distinguish
> throttling cfs_rqs. Then remove tasks that will be throttled in put_prev_task
> from rq->nr_running/cfs_rq->h_nr_running when in account_cfs_rq_runtime,
> rather than delaying until put_prev_task.
>
> This allows schedule() to call idle_balance when we go idle due to throttling.
>
> Using Kamalesh's nested-cgroup test case[1] we see the following improvement on
> a 16 core system:
> baseline: Average CPU Idle percentage 13.9667%
> +patch: Average CPU Idle percentage 3.53333%
> [1]: https://lkml.org/lkml/2011/9/15/261
>
> Signed-off-by: Ben Segall <bsegall@google.com>
> Signed-off-by: Paul Turner <pjt@google.com>
Tested-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
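To make the quoted description concrete, here is a toy model of the ordering problem. It is only a sketch, not kernel code: ToyRq, pending_throttle and the simplified schedule() below are hypothetical names, and the only assumption carried over from the description is that the idle_balance() decision looks at nr_running before put_prev_task() runs.

```python
from dataclasses import dataclass


@dataclass
class ToyRq:
    """Toy stand-in for a runqueue (hypothetical, not the kernel's struct rq)."""
    nr_running: int
    pending_throttle: int = 0  # tasks that ran out of quota but are still counted

    def account_runtime(self, out_of_quota: int, early: bool) -> None:
        # Quota exhaustion is noticed here (account_cfs_rq_runtime in the mail).
        if early:
            # Patched behaviour: drop the tasks from nr_running right away.
            self.nr_running -= out_of_quota
        else:
            # Old behaviour: leave the accounting for put_prev_task().
            self.pending_throttle += out_of_quota

    def put_prev_task(self) -> None:
        # Deferred throttle accounting happens here in the old behaviour.
        self.nr_running -= self.pending_throttle
        self.pending_throttle = 0


def schedule(rq: ToyRq, out_of_quota: int, early: bool) -> bool:
    """Return True if this toy schedule() would have called idle_balance()."""
    rq.account_runtime(out_of_quota, early)
    would_idle_balance = rq.nr_running == 0  # decided before put_prev_task()
    rq.put_prev_task()
    return would_idle_balance


print("deferred accounting:", schedule(ToyRq(nr_running=1), 1, early=False))
print("early accounting:   ", schedule(ToyRq(nr_running=1), 1, early=True))
```

In this model both variants end with nr_running at 0, but only the early-accounting variant sees that at the point where the idle_balance() decision is taken.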
Thanks for the patch. I tested the patches in the same test environment in which
the CPU idle time was first reported (https://lkml.org/lkml/2011/6/7/352). In
brief, the tests were run on a 2-socket, quad-core machine with a three-level
nested cgroup hierarchy and five cgroups created below the third level. The
five cgroups had 2, 2, 4, 8 and 16 while1 or cpu-matrix
(https://lkml.org/lkml/2011/11/4/107) tasks attached to them, respectively.
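As a rough illustration of that layout, the sketch below builds the five leaf cgroup paths and their task counts. The mount point /cgroup/cpu and the names level1..level3 and group1..group5 are assumptions made for the example; only the nesting depth, the number of leaf groups, the task counts and the 2-socket quad-core machine come from the description above. Nothing is created on disk.

```python
from pathlib import PurePosixPath

# Task counts per leaf cgroup, as described in the mail.
TASKS_PER_LEAF = [2, 2, 4, 8, 16]


def build_layout(root="/cgroup/cpu", depth=3):
    """Return (path, task_count) pairs for the leaf cgroups.

    root and the directory names are hypothetical; they only illustrate
    five groups sitting below a three-level nested hierarchy.
    """
    parent = PurePosixPath(root)
    for level in range(1, depth + 1):
        parent = parent / f"level{level}"
    return [
        (str(parent / f"group{i}"), ntasks)
        for i, ntasks in enumerate(TASKS_PER_LEAF, start=1)
    ]


layout = build_layout()
total_tasks = sum(ntasks for _, ntasks in layout)
cpus = 2 * 4  # 2 sockets x 4 cores

for path, ntasks in layout:
    print(f"{path}: {ntasks} tasks")
print(f"{total_tasks} tasks on {cpus} CPUs")
```

With these counts the machine is oversubscribed (32 CPU-bound tasks on 8 CPUs), so any idle time that shows up is time the scheduler failed to hand to a runnable task.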
Patches tested on top of tip:
[1] CFS Bandwidth tweaks: the patches posted by Paul Turner (https://lkml.org/lkml/2011/11/7/603)
[2] nohz idle balance RFC patch by Srivatsa Vaddagiri (https://lkml.org/lkml/2011/11/2/117)
With the cpu-matrix benchmark, the CFS bandwidth patches reduced idle time by
around 50 to 55% over tip, and the nohz idle balance patch gave an additional
~3% benefit. With the while1 loop, the improvement was around 36 to 40% over
tip, with an additional benefit of ~4 to 5% from the nohz idle balance patch.
(1) cpu-matrix benchmark with nohz=on
-------------------------------------
Run  Metric                                    Base (tip)  tip + CFS Bandwidth tweaks  tip + CFS Bandwidth tweaks + nohz idle patch
-----------------------------------------------------------------------------------------------------------------------------------
1    Average CPU Idle percentage               4.1%        2.36667%                    2.23333%
     Bandwidth shared with remaining non-Idle  95.9%       97.63333%                   97.76667%
2    Average CPU Idle percentage               4.23%       2.3%                        2.16667%
     Bandwidth shared with remaining non-Idle  95.77%      97.7%                       97.83333%
(2) cpu-matrix benchmark with nohz=off
--------------------------------------
Run  Metric                                    Base (tip)  tip + CFS Bandwidth tweaks  tip + CFS Bandwidth tweaks + nohz idle patch
-----------------------------------------------------------------------------------------------------------------------------------
1    Average CPU Idle percentage               4.53333%    2.43333%                    2.36667%
     Bandwidth shared with remaining non-Idle  95.46667%   97.56667%                   97.63333%
2    Average CPU Idle percentage               4.4%        2.36667%                    2.4%
     Bandwidth shared with remaining non-Idle  95.6%       97.63333%                   97.6%
(3) while1 loop with nohz=on
----------------------------
Run  Metric                                    Base (tip)  tip + CFS Bandwidth tweaks  tip + CFS Bandwidth tweaks + nohz idle patch
-----------------------------------------------------------------------------------------------------------------------------------
1    Average CPU Idle percentage               6.26667%    2.5%                        2.23333%
     Bandwidth shared with remaining non-Idle  93.73333%   97.5%                       97.76667%
2    Average CPU Idle percentage               6.73333%    2.46667%                    2.13333%
     Bandwidth shared with remaining non-Idle  93.26667%   97.53333%                   97.86667%
(4) while1 loop with nohz=off
-----------------------------
Run  Metric                                    Base (tip)  tip + CFS Bandwidth tweaks  tip + CFS Bandwidth tweaks + nohz idle patch
-----------------------------------------------------------------------------------------------------------------------------------
1    Average CPU Idle percentage               3.6%        2.4%                        2.43333%
     Bandwidth shared with remaining non-Idle  96.4%       97.6%                       97.56667%
2    Average CPU Idle percentage               3.46667%    2.33333%                    2.4%
     Bandwidth shared with remaining non-Idle  96.53333%   97.66667%                   97.6%
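For anyone who wants to compare the runs directly, one possible way to summarise the tables above is the relative reduction in average idle percentage against tip. This is a sketch under that assumption only: the mail does not say how its summary percentages were derived, so the figures printed here need not match them. The idle values are copied from the tables; relative_reduction is a helper name introduced for the example.

```python
# (base, tip + CFS bandwidth tweaks, tip + tweaks + nohz idle patch)
# Average CPU idle percentages copied from the four tables, two runs each.
IDLE = {
    "cpu-matrix nohz=on": [(4.1, 2.36667, 2.23333), (4.23, 2.3, 2.16667)],
    "cpu-matrix nohz=off": [(4.53333, 2.43333, 2.36667), (4.4, 2.36667, 2.4)],
    "while1 nohz=on": [(6.26667, 2.5, 2.23333), (6.73333, 2.46667, 2.13333)],
    "while1 nohz=off": [(3.6, 2.4, 2.43333), (3.46667, 2.33333, 2.4)],
}


def relative_reduction(base: float, patched: float) -> float:
    """Percentage by which the idle time dropped relative to the base kernel."""
    return 100.0 * (base - patched) / base


for name, runs in IDLE.items():
    for run, (base, tweaks, tweaks_nohz) in enumerate(runs, start=1):
        print(
            f"{name} run {run}: "
            f"tweaks {relative_reduction(base, tweaks):.1f}%, "
            f"tweaks + nohz {relative_reduction(base, tweaks_nohz):.1f}%"
        )
```

The non-idle bandwidth rows need no separate treatment, since each is simply 100 minus the idle percentage in the same column.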
Each cpu-matrix benchmark task was run as:
# perf sched cpu-matrix -s1k -i 1000 -p100
Kamalesh.
Thread overview:
2011-11-08 4:26 [patch 0/3] sched: bandwidth-control tweaks for v3.2 Paul Turner
2011-11-08 4:26 ` [patch 1/3] sched: use jump labels to reduce overhead when bandwidth control is inactive Paul Turner
2011-11-08 9:26 ` Peter Zijlstra
2011-11-08 9:28 ` Peter Zijlstra
2011-11-08 9:29 ` Peter Zijlstra
2011-11-11 4:23 ` Paul Turner
2011-11-18 23:42 ` [tip:sched/core] sched: Use " tip-bot for Paul Turner
2011-11-08 4:26 ` [patch 2/3] sched: fix buglet in return_cfs_rq_runtime() Paul Turner
2011-11-18 23:41 ` [tip:sched/core] sched: Fix " tip-bot for Paul Turner
2011-11-08 4:26 ` [patch 3/3] From: Ben Segall <bsegall@google.com> Paul Turner
2011-11-10 2:28 ` Paul Turner
2011-11-10 2:30 ` Paul Turner
2011-11-14 10:03 ` Kamalesh Babulal [this message]
2011-11-14 12:01 ` Peter Zijlstra
2011-11-15 21:14 ` Benjamin Segall