From: "Huang, Ying" <ying.huang@intel.com>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: "Huang, Ying" <ying.huang@intel.com>,
Stephen Rothwell <sfr@canb.auug.org.au>,
Andi Kleen <ak@linux.intel.com>,
Tim Chen <tim.c.chen@linux.intel.com>,
Peter Zijlstra <peterz@infradead.org>, LKP <lkp@01.org>,
LKML <linux-kernel@vger.kernel.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Dave Hansen <dave.hansen@intel.com>,
"Thomas Gleixner" <tglx@linutronix.de>,
Linus Torvalds <torvalds@linux-foundation.org>,
Ingo Molnar <mingo@kernel.org>
Subject: Re: [LKP] [lkp-developer] [sched/fair] 4e5160766f: +149% ftq.noise.50% regression
Date: Wed, 28 Dec 2016 16:17:27 +0800
Message-ID: <87r34swjqg.fsf@yhuang-dev.intel.com>
In-Reply-To: <20161222151215.GA23448@linaro.org> (Vincent Guittot's message of "Thu, 22 Dec 2016 16:12:15 +0100")

Vincent Guittot <vincent.guittot@linaro.org> writes:
> On Tuesday 13 Dec 2016 at 09:47:30 (+0800), Huang, Ying wrote:
>> Hi, Vincent,
>>
>> Vincent Guittot <vincent.guittot@linaro.org> writes:
>>
>> > Hi Ying,
>> >
>> > On 12 December 2016 at 06:43, kernel test robot
>> > <ying.huang@linux.intel.com> wrote:
>> >> Greetings,
>> >>
>> >> FYI, we noticed a 149% regression of ftq.noise.50% due to commit:
>> >>
>> >>
>> >> commit: 4e5160766fcc9f41bbd38bac11f92dce993644aa ("sched/fair: Propagate asynchrous detach")
>> >> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
>> >>
>> >> in testcase: ftq
>> >> on test machine: 8 threads Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz with 8G memory
>> >> with following parameters:
>> >>
>> >> nr_task: 100%
>> >> samples: 6000ss
>> >> test: cache
>> >> freq: 20
>> >> cpufreq_governor: powersave
>> >
>> > Why use powersave? Are you testing every governor?
>>
>> We will test the performance and powersave governors for FTQ.
>
> Ok thanks
>
>>
>> >>
>> >> test-description: The FTQ benchmarks measure hardware and software interference or 'noise' on a node from the application's perspective.
>> >> test-url: https://github.com/rminnich/ftq
>> >
>> > It's a bit difficult to understand exactly what is measured and what
>> > ftq.noise.50% is, because this result is not part of the benchmark, which
>> > seems only to record a log of data in a file, and ftq.noise.50% seems
>> > to be LKP-specific.
>>
>> Yes. FTQ itself has no built-in noise statistics, although it is an OS
>> noise benchmark. ftq.noise.50% is calculated as follows:
>>
>> There is a score for every sample of ftq. The lower the score, the
>> higher the noise. ftq.noise.50% is the number (per 1000000 samples) of
>> samples whose score is less than 50% of the mean score.
>>
>
> OK, so IIUC we have moved from 0.03% to 0.11% for ftq.noise.50%.
>
> I have not been able to reproduce the regression on the different systems
> that I have access to, so I can only guess at the root cause of the
> regression.
>
> Could you test whether the patch below fixes the regression?
>
>
> ---
> kernel/sched/fair.c | 29 ++++++++++++++++++++++++++++-
> 1 file changed, 28 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 090a9bb..8efa113 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3138,6 +3138,31 @@ static inline int propagate_entity_load_avg(struct sched_entity *se)
>  	return 1;
>  }
>
> +/* Check if we need to update the load and the utilization of a group_entity */
> +static inline bool skip_blocked_update(struct sched_entity *se)
> +{
> +	struct cfs_rq *gcfs_rq = group_cfs_rq(se);
> +
> +	/*
> +	 * If sched_entity still have not null load or utilization, we have to
> +	 * decay it.
> +	 */
> +	if (se->avg.load_avg || se->avg.util_avg)
> +		return false;
> +
> +	/*
> +	 * If there is a pending propagation, we have to update the load and
> +	 * the utilization of the sched_entity
> +	 */
> +	if (gcfs_rq->propagate_avg)
> +		return false;
> +
> +	/*
> +	 * Otherwise, the load and the utilization of the sched_entity is
> +	 * already null so it will be a waste of time to try to decay it
> +	 */
> +	return true;
> +}
> #else /* CONFIG_FAIR_GROUP_SCHED */
>
> static inline void update_tg_load_avg(struct cfs_rq *cfs_rq, int force) {}
> @@ -6858,6 +6883,7 @@ static void update_blocked_averages(int cpu)
>  {
>  	struct rq *rq = cpu_rq(cpu);
>  	struct cfs_rq *cfs_rq;
> +	struct sched_entity *se;
>  	unsigned long flags;
>
>  	raw_spin_lock_irqsave(&rq->lock, flags);
> @@ -6876,7 +6902,8 @@ static void update_blocked_averages(int cpu)
>  			update_tg_load_avg(cfs_rq, 0);
>
>  		/* Propagate pending load changes to the parent */
> -		if (cfs_rq->tg->se[cpu])
> +		se = cfs_rq->tg->se[cpu];
> +		if (se && !skip_blocked_update(se))
>  			update_load_avg(cfs_rq->tg->se[cpu], 0);
>  	}
>  	raw_spin_unlock_irqrestore(&rq->lock, flags);
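
The check in the quoted patch can be read as a three-way decision. Below is a minimal user-space sketch of that decision only, not kernel code: struct group_entity_model and model_skip_blocked_update() are hypothetical names, and the three fields are simplified stand-ins for se->avg.load_avg, se->avg.util_avg and group_cfs_rq(se)->propagate_avg.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a group sched_entity. */
struct group_entity_model {
	unsigned long load_avg;   /* models se->avg.load_avg */
	unsigned long util_avg;   /* models se->avg.util_avg */
	bool propagate_pending;   /* models group_cfs_rq(se)->propagate_avg */
};

/* Returns true when updating the entity would be wasted work. */
bool model_skip_blocked_update(const struct group_entity_model *ge)
{
	assert(ge != NULL);

	/* Non-zero load or utilization still has to be decayed. */
	if (ge->load_avg || ge->util_avg)
		return false;

	/* A pending propagation still has to reach the parent. */
	if (ge->propagate_pending)
		return false;

	/* Nothing left to decay and nothing to propagate. */
	return true;
}
```

In this model, only an entity with zero load, zero utilization and no pending propagation is skipped; any other state still goes through the update.
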
The test results are as follows:
=========================================================================================
compiler/cpufreq_governor/freq/kconfig/nr_task/rootfs/samples/tbox_group/test/testcase:
  gcc-6/powersave/20/x86_64-rhel-7.2/100%/debian-x86_64-2016-08-31.cgz/6000ss/lkp-hsw-d01/cache/ftq

commit:
  4e5160766fcc9f41bbd38bac11f92dce993644aa: first bad commit
  09a43ace1f986b003c118fdf6ddf1fd685692d49: parent of first bad commit
  0613870ea53a7a279d8d37f2a3ce40aafc155fc8: debug commit with above patch

4e5160766fcc9f41 09a43ace1f986b003c118fdf6d 0613870ea53a7a279d8d37f2a3
---------------- -------------------------- --------------------------
         %stddev    %change          %stddev    %change          %stddev
             \          |                \          |                \
     61670 ±228%     -96.5%       2148 ± 11%     -94.7%       3281 ± 58%  ftq.noise.25%
      3463 ± 10%     -60.0%       1386 ± 19%     -26.3%       2552 ± 58%  ftq.noise.50%
      1116 ± 23%     -72.6%     305.99 ± 30%     -35.8%     716.15 ± 64%  ftq.noise.75%
   3843815 ±  3%      +3.1%    3963589 ±  1%     -49.6%    1938221 ±100%  ftq.time.involuntary_context_switches
      5.33 ± 30%     +21.4%       6.46 ± 14%     -71.7%       1.50 ±108%  time.system_time
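
Note that the %change columns are relative to the first column (the first bad commit), so the regression shows up as a negative change for the parent commit. A small sketch of that arithmetic, using the ftq.noise.50% values from the table (pct_change() is a hypothetical helper name, not part of LKP):

```c
#include <assert.h>

/* Relative change of `value` against `base`, in percent. */
double pct_change(double base, double value)
{
	assert(base != 0.0);
	return (value - base) / base * 100.0;
}
```

pct_change(3463, 1386) is about -60.0 (parent relative to the first bad commit) and pct_change(3463, 2552) is about -26.3 (debug commit). Taking the parent as the base instead, pct_change(1386, 3463) is about +149.9, in line with the regression reported in the subject.
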
It appears that system_time and involuntary_context_switches dropped
considerably after applying the debug patch, which is good from a noise
point of view. ftq.noise.50% is lower than with the first bad commit, but
it has not returned to the level of the parent of the first bad commit.
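
For reference, the ftq.noise.N% calculation described in the quoted text earlier in this message can be sketched as below. This is only an illustration of that description, not the actual LKP script; ftq_noise_per_million() and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the samples whose score is below `fraction` of the mean score
 * (fraction = 0.50 for ftq.noise.50%) and scale the count to a
 * per-1000000-samples rate.
 */
double ftq_noise_per_million(const double *scores, size_t n, double fraction)
{
	double sum = 0.0;
	size_t below = 0;

	assert(scores != NULL && n > 0);

	for (size_t i = 0; i < n; i++)
		sum += scores[i];

	const double threshold = fraction * (sum / (double)n);

	for (size_t i = 0; i < n; i++)
		if (scores[i] < threshold)
			below++;

	return (double)below * 1000000.0 / (double)n;
}
```

With this definition a lower value means fewer noisy samples, so the 1386 measured for the parent commit is better than the 2552 measured with the debug patch.
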
Best Regards,
Huang, Ying
Thread overview: 14+ messages
2016-12-12 5:43 [lkp-developer] [sched/fair] 4e5160766f: +149% ftq.noise.50% regression kernel test robot
2016-12-12 13:25 ` Vincent Guittot
2016-12-13 1:47 ` [LKP] " Huang, Ying
2016-12-22 15:12 ` Vincent Guittot
2016-12-28 8:17 ` Huang, Ying [this message]
2017-01-02 15:42 ` Vincent Guittot
2017-01-03 10:38 ` Dietmar Eggemann
2017-01-03 11:37 ` Vincent Guittot
2017-01-04 3:08 ` Huang, Ying
2017-01-04 14:06 ` Vincent Guittot
2017-02-21 2:40 ` Huang, Ying
2017-02-27 9:44 ` Vincent Guittot
2017-02-28 0:33 ` Huang, Ying
2017-02-28 9:35 ` Vincent Guittot