From: Vincent Guittot <vincent.guittot@linaro.org>
To: lkp@lists.01.org
Subject: Re: [lkp-developer] [sched/fair] 4e5160766f: +149% ftq.noise.50% regression
Date: Thu, 22 Dec 2016 16:12:15 +0100
Message-ID: <20161222151215.GA23448@linaro.org>
In-Reply-To: <878trk8urx.fsf@yhuang-dev.intel.com>
On Tuesday 13 Dec 2016 at 09:47:30 (+0800), Huang, Ying wrote:
> Hi, Vincent,
>
> Vincent Guittot <vincent.guittot@linaro.org> writes:
>
> > Hi Ying,
> >
> > On 12 December 2016 at 06:43, kernel test robot
> > <ying.huang@linux.intel.com> wrote:
> >> Greetings,
> >>
> >> FYI, we noticed a 149% regression of ftq.noise.50% due to commit:
> >>
> >>
> >> commit: 4e5160766fcc9f41bbd38bac11f92dce993644aa ("sched/fair: Propagate asynchrous detach")
> >> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> >>
> >> in testcase: ftq
> >> on test machine: 8 threads Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz with 8G memory
> >> with the following parameters:
> >>
> >> nr_task: 100%
> >> samples: 6000ss
> >> test: cache
> >> freq: 20
> >> cpufreq_governor: powersave
> >
> > Why use powersave? Are you testing every governor?
>
> We will test the performance and powersave governors for FTQ.
Ok thanks
>
> >>
> >> test-description: The FTQ benchmarks measure hardware and software interference or 'noise' on a node from the application's perspective.
> >> test-url: https://github.com/rminnich/ftq
> >
> > It's a bit difficult to understand exactly what is measured and what
> > ftq.noise.50% is, because this result is not part of the benchmark,
> > which seems to only record a log of data in a file; ftq.noise.50%
> > seems to be lkp specific.
>
> Yes. FTQ itself has no noise statistics built in, although it is an OS
> noise benchmark. ftq.noise.50% is calculated as below:
>
> There is a score for every sample of ftq. The lower the score, the
> higher the noise. ftq.noise.50% is the number (per 1000000 samples) of
> samples whose score is less than 50% of the mean score.
>
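The calculation described in the quoted text above can be sketched in C as
follows. This is only an illustration under stated assumptions, not the
lkp-tests implementation: the function names noise_per_million() and
per_million_to_percent() are hypothetical, and the scores are assumed to be
available as an array of doubles.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of FTQ or lkp-tests): number of samples,
 * scaled to "per 1000000 samples", whose score is below
 * fraction * mean(scores). fraction = 0.5 corresponds to the description
 * of ftq.noise.50% given above.
 */
double noise_per_million(const double *scores, size_t n, double fraction)
{
	double sum = 0.0, threshold;
	size_t i, below = 0;

	if (n == 0)
		return 0.0;

	/* First pass: mean score over all samples. */
	for (i = 0; i < n; i++)
		sum += scores[i];
	threshold = fraction * (sum / (double)n);

	/* Second pass: count the samples below the threshold. */
	for (i = 0; i < n; i++)
		if (scores[i] < threshold)
			below++;

	return (double)below * 1000000.0 / (double)n;
}

/* Convert a per-1000000 count into a percentage of all samples. */
double per_million_to_percent(double per_million)
{
	return per_million / 10000.0;
}
```

With this scaling, a value of 2500 per 1000000 samples means that 0.25% of
the samples fell below the threshold.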
OK, so IIUC ftq.noise.50% has moved from about 0.14% to about 0.35% of the samples (1386 to 3457 per 1000000).
I have not been able to reproduce the regression on the different systems that I have access to, so I can only guess at its root cause.
Could you test whether the patch below fixes the regression?
---
kernel/sched/fair.c | 29 ++++++++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 090a9bb..8efa113 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3138,6 +3138,31 @@ static inline int propagate_entity_load_avg(struct sched_entity *se)
 	return 1;
 }
+/* Check if we need to update the load and the utilization of a group_entity */
+static inline bool skip_blocked_update(struct sched_entity *se)
+{
+	struct cfs_rq *gcfs_rq = group_cfs_rq(se);
+
+	/*
+	 * If the sched_entity still has non-null load or utilization, we have
+	 * to decay it.
+	 */
+	if (se->avg.load_avg || se->avg.util_avg)
+		return false;
+
+	/*
+	 * If there is a pending propagation, we have to update the load and
+	 * the utilization of the sched_entity.
+	 */
+	if (gcfs_rq->propagate_avg)
+		return false;
+
+	/*
+	 * Otherwise, the load and the utilization of the sched_entity are
+	 * already null, so it would be a waste of time to try to decay them.
+	 */
+	return true;
+}
 #else /* CONFIG_FAIR_GROUP_SCHED */
 static inline void update_tg_load_avg(struct cfs_rq *cfs_rq, int force) {}
@@ -6858,6 +6883,7 @@ static void update_blocked_averages(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
 	struct cfs_rq *cfs_rq;
+	struct sched_entity *se;
 	unsigned long flags;
 	raw_spin_lock_irqsave(&rq->lock, flags);
@@ -6876,7 +6902,8 @@ static void update_blocked_averages(int cpu)
 			update_tg_load_avg(cfs_rq, 0);
 		/* Propagate pending load changes to the parent */
-		if (cfs_rq->tg->se[cpu])
+		se = cfs_rq->tg->se[cpu];
+		if (se && !skip_blocked_update(se))
 			update_load_avg(cfs_rq->tg->se[cpu], 0);
 	}
 	raw_spin_unlock_irqrestore(&rq->lock, flags);
--
2.7.4
Thanks
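The decision taken by skip_blocked_update() in the patch above can be
exercised outside the kernel with a small user-space model. This is a sketch
under stated assumptions: the model_* structures and
model_skip_blocked_update() are hypothetical stand-ins that keep only the
fields the check reads, and the result of group_cfs_rq(se) is modelled as a
plain pointer field.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel structures (illustration only). */
struct model_avg {
	unsigned long load_avg;
	unsigned long util_avg;
};

struct model_cfs_rq {
	unsigned long propagate_avg;	/* non-zero: a propagation is pending */
};

struct model_se {
	struct model_avg avg;
	struct model_cfs_rq *group_rq;	/* models the result of group_cfs_rq(se) */
};

/* Same three-step decision as skip_blocked_update() in the patch. */
bool model_skip_blocked_update(const struct model_se *se)
{
	/* Remaining load or utilization still has to be decayed. */
	if (se->avg.load_avg || se->avg.util_avg)
		return false;

	/* A pending propagation still has to be applied. */
	if (se->group_rq->propagate_avg)
		return false;

	/* Nothing left to decay or propagate: the update can be skipped. */
	return true;
}
```

In this model the update is skipped only when load_avg, util_avg and
propagate_avg are all zero; any non-zero value among the three forces the
update, which matches the order of the checks in the patch.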
> Best Regards,
> Huang, Ying
>
> > I have tried to reproduce the lkp test on Debian jessie and then on
> > Ubuntu Server 16.10, but lkp doesn't seem to install cleanly, as there
> > are some errors:
> >
> > sudo bin/lkp run job.yaml
> > IPMI BMC is not supported on this machine, skip bmc-watchdog setup!
> > 2016-12-12 13:58:39 ./ftq_cache -f 20 -n 6000 -t 8 -a 524288
> > Start 5088418680237 end 5438443372098 elapsed 350024691861
> > cyclestart 14236344834332 cycleend 15214154208877 elapsed 977809374545
> > Avg Cycles(ticks) per ns. is 2.793544; nspercycle is 0.357968
> > Pre-computed ticks per ns: 2.793541
> > Sample frequency is 20.000000
> > ticks per ns 2.79354
> > chown: invalid user: 'lkp.lkp'
> > chown: invalid user: 'lkp.lkp'
> > wait for background monitors: 9405 9407 oom-killer nfs-hang
> > curl: (6) Could not resolve host: ftq.time
> >
> >
> >>
> >> In addition to that, the commit also has significant impact on the following tests:
> >>
> >> +------------------+--------------------------------------------------------------------------------+
> >> | testcase: change | unixbench: unixbench.score 2.7% improvement |
> >> | test machine | 4 threads Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz with 4G memory |
> >> | test parameters | cpufreq_governor=performance |
> >> | | nr_task=100% |
> >> | | runtime=300s |
> >> | | test=execl |
> >> +------------------+--------------------------------------------------------------------------------+
> >>
> >>
> >> Details are as below:
> >> -------------------------------------------------------------------------------------------------->
> >>
> >>
> >> To reproduce:
> >>
> >> git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
> >> cd lkp-tests
> >> bin/lkp install job.yaml # job file is attached in this email
> >> bin/lkp run job.yaml
> >>
> >> testcase/path_params/tbox_group/run: ftq/100%-6000ss-cache-20-powersave/lkp-hsw-d01
> >>
> >> 09a43ace1f986b00 4e5160766fcc9f41bbd38bac11
> >> ---------------- --------------------------
> >>          %stddev      change         %stddev
> >>              \          |                \
> >>        305 ± 30%       260%       1100 ± 14%  ftq.noise.75%
> >>       1386 ± 19%       149%       3457 ±  7%  ftq.noise.50%
> >>       2148 ± 11%        98%       4257 ±  4%  ftq.noise.25%
> >>    3963589                     3898578        ftq.time.involuntary_context_switches
> >>
> >>
> >>
> >> [ASCII plot of ftq.noise.50% per bisect sample, y-axis from 1000 to 4000:
> >> bisect-good samples (*) lie between about 1000 and just under 2000,
> >> bisect-bad samples (O) lie between just under 3000 and 4000]
> >>
> >>
> >> Disclaimer:
> >> Results have been estimated based on internal Intel analysis and are provided
> >> for informational purposes only. Any difference in system hardware or software
> >> design or configuration may affect actual performance.
> >>
> >>
> >> Thanks,
> >> Ying Huang
> > _______________________________________________
> > LKP mailing list
> > LKP@lists.01.org
> > https://lists.01.org/mailman/listinfo/lkp
WARNING: multiple messages have this Message-ID (diff)
From: Vincent Guittot <vincent.guittot@linaro.org>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>,
Andi Kleen <ak@linux.intel.com>,
Tim Chen <tim.c.chen@linux.intel.com>,
Peter Zijlstra <peterz@infradead.org>, LKP <lkp@01.org>,
LKML <linux-kernel@vger.kernel.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Dave Hansen <dave.hansen@intel.com>,
Thomas Gleixner <tglx@linutronix.de>,
Linus Torvalds <torvalds@linux-foundation.org>,
Ingo Molnar <mingo@kernel.org>
Subject: Re: [LKP] [lkp-developer] [sched/fair] 4e5160766f: +149% ftq.noise.50% regression