All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Thara Gopinath <thara.gopinath@linaro.org>
Cc: linux-kernel@vger.kernel.org, mingo@redhat.com,
	peterz@infradead.org, rui.zhang@intel.com,
	gregkh@linuxfoundation.org, rafael@kernel.org,
	amit.kachhap@gmail.com, viresh.kumar@linaro.org,
	javi.merino@kernel.org, edubezval@gmail.com,
	daniel.lezcano@linaro.org, linux-pm@vger.kernel.org,
	quentin.perret@arm.com, ionela.voinescu@arm.com,
	vincent.guittot@linaro.org
Subject: Re: [RFC PATCH 0/7] Introduce thermal pressure
Date: Thu, 18 Oct 2018 08:48:49 +0200	[thread overview]
Message-ID: <20181018064849.GA42813@gmail.com> (raw)
In-Reply-To: <5BC76181.90105@linaro.org>


* Thara Gopinath <thara.gopinath@linaro.org> wrote:

> On 10/16/2018 03:33 AM, Ingo Molnar wrote:
> > 
> > * Thara Gopinath <thara.gopinath@linaro.org> wrote:
> > 
> >>>> Regarding testing, basic build, boot and sanity testing have been
> >>>> performed on hikey960 mainline kernel with debian file system.
> >>>> Further aobench (An occlusion renderer for benchmarking realworld
> >>>> floating point performance) showed the following results on hikey960
> >>>> with debain.
> >>>>
> >>>>                                         Result          Standard        Standard
> >>>>                                         (Time secs)     Error           Deviation
> >>>> Hikey 960 - no thermal pressure applied 138.67          6.52            11.52%
> >>>> Hikey 960 -  thermal pressure applied   122.37          5.78            11.57%
> >>>
> >>> Wow, +13% speedup, impressive! We definitely want this outcome.
> >>>
> >>> I'm wondering what happens if we do not track and decay the thermal 
> >>> load at all at the PELT level, but instantaneously decrease/increase 
> >>> effective CPU capacity in reaction to thermal events we receive from 
> >>> the CPU.
> >>
> >> The problem with instantaneous update is that sometimes thermal events 
> >> happen at a much faster pace than cpu_capacity is updated in the 
> >> scheduler. This means that at the moment when scheduler uses the 
> >> value, it might not be correct anymore.
> > 
> > Let me offer a different interpretation: if we average throttling events 
> > then we create a 'smooth' average of 'true CPU capacity' that doesn't 
> > fluctuate much. This allows more stable yet asymmetric task placement if 
> > the thermal characteristics of the different cores is different 
> > (asymmetric). This, compared to instantaneous updates, would reduce 
> > unnecessary task migrations between cores.
> > 
> > Is that accurate?
> 
> Yes. I think it is accurate. I will also add that if we don't average
> throttling events, we will miss the events that occur in between load
> balancing(LB) period.

Yeah, so I'd definitely suggest to not integrate this averaging into 
pelt.c in the fashion presented, because:

 - This couples your thermal throttling averaging to the PELT decay
   half-time AFAICS, which would break the other user every time the
   decay is changed/tuned.

 - The boolean flag that changes behavior in pelt.c is not particularly
   clean either and complicates the code.

 - Instead maybe factor out a decaying average library into
   kernel/sched/avg.h perhaps (if this truly improves the code), and use
   those methods both in pelt.c and any future thermal.c - and maybe
   other places where we do decaying averages.

 - But simple decaying averages are not that complex either, so I think
   your original solution of open coding it is probably fine as well. 

Furthermore, any logic introduced by thermal.c and the resulting change 
to load-balancing behavior would have to be in perfect sync with cpufreq 
governor actions - one mechanism should not work against the other.

The only long term maintainable solution is to move all high level 
cpufreq logic and policy handling code into kernel/sched/cpufreq*.c, 
which has been done to a fair degree already in the past ~2 years - but 
it's unclear to me to what extent this is true for thermal throttling 
policy currently: there might be more governor surgery and code 
reshuffling required?

The short term goal would be to at minimum have all the bugs lined up in 
kernel/sched/* neatly, so that we have the chance to see and fix them in 
a single place. ;-)

Thanks,

	Ingo

  reply	other threads:[~2018-10-18  6:48 UTC|newest]

Thread overview: 69+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20181009162509epcas1p4fdd2e23039caa24586a4a52c6d2e7336@epcas1p4.samsung.com>
2018-10-09 16:24 ` [RFC PATCH 0/7] Introduce thermal pressure Thara Gopinath
2018-10-09 16:24   ` [RFC PATCH 1/7] sched/pelt.c: Add option to make load and util calculations frequency invariant Thara Gopinath
2018-10-09 16:24   ` [RFC PATCH 2/7] sched/pelt.c: Add support to track thermal pressure Thara Gopinath
2018-10-09 16:24   ` [RFC PATCH 3/7] sched: Add infrastructure to store and update instantaneous " Thara Gopinath
2018-10-09 16:24   ` [RFC PATCH 4/7] sched: Initialize per cpu thermal pressure structure Thara Gopinath
2018-10-09 16:25   ` [RFC PATCH 5/7] sched/fair: Enable CFS periodic tick to update thermal pressure Thara Gopinath
2018-12-04 15:43     ` Vincent Guittot
2018-10-09 16:25   ` [RFC PATCH 6/7] sched/fair: update cpu_capcity to reflect " Thara Gopinath
2018-10-10  5:57     ` Javi Merino
2018-10-10 14:22       ` Thara Gopinath
2018-10-09 16:25   ` [RFC PATCH 7/7] thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping Thara Gopinath
2018-10-10  5:44   ` [RFC PATCH 0/7] Introduce thermal pressure Javi Merino
2018-10-10 14:15     ` Thara Gopinath
2018-10-10  6:17   ` Ingo Molnar
2018-10-10  8:29     ` Quentin Perret
2018-10-10  8:50       ` Vincent Guittot
2018-10-10  9:55         ` Quentin Perret
2018-10-10 10:14           ` Vincent Guittot
2018-10-10 10:36             ` Quentin Perret
2018-10-10 12:04               ` Vincent Guittot
2018-10-10 12:23                 ` Juri Lelli
2018-10-10 12:34                   ` Vincent Guittot
2018-10-10 12:50                     ` Juri Lelli
2018-10-10 13:08                       ` Vincent Guittot
2018-10-10 13:34                         ` Juri Lelli
2018-10-10 13:38                           ` Vincent Guittot
2018-10-10 17:08                           ` Thara Gopinath
2018-10-10 13:11                       ` Quentin Perret
2018-10-10 13:05                 ` Quentin Perret
2018-10-10 13:27                   ` Vincent Guittot
2018-10-10 13:47                     ` Quentin Perret
2018-10-10 15:19                       ` Vincent Guittot
2018-10-10 16:15                       ` Ionela Voinescu
2018-10-10 17:03           ` Thara Gopinath
2018-10-10 15:43     ` Thara Gopinath
2018-10-16  7:33       ` Ingo Molnar
2018-10-16  9:28         ` Lukasz Luba
2018-10-17 16:21         ` Thara Gopinath
2018-10-18  6:48           ` Ingo Molnar [this message]
2018-10-18  7:08             ` Rafael J. Wysocki
2018-10-18  7:50               ` Ingo Molnar
2018-10-18  8:14                 ` Rafael J. Wysocki
2018-10-18  9:35                   ` [PATCH 1/2] sched/cpufreq: Reorganize the cpufreq files Daniel Lezcano
2018-10-18  9:35                     ` [PATCH 2/2] sched/cpufreq: Add the SPDX tags Daniel Lezcano
2018-10-18  9:42                     ` [PATCH 1/2] sched/cpufreq: Reorganize the cpufreq files Rafael J. Wysocki
2018-10-18  9:42                       ` Rafael J. Wysocki
2018-10-18  9:54                       ` Daniel Lezcano
2018-10-18 10:06                         ` Rafael J. Wysocki
2018-10-18 10:06                           ` Rafael J. Wysocki
2018-10-18 10:13                           ` Daniel Lezcano
2018-10-18  9:45                     ` Daniel Lezcano
2018-10-19  5:24                     ` kbuild test robot
2018-10-19  5:52                     ` kbuild test robot
2018-10-18  9:44                   ` [PATCH V2 " Daniel Lezcano
2018-10-18  9:44                     ` [PATCH V2 2/2] sched/cpufreq: Add the SPDX tags Daniel Lezcano
2018-10-18 16:17             ` [RFC PATCH 0/7] Introduce thermal pressure Thara Gopinath
2018-10-19  8:02               ` Ingo Molnar
2018-10-19 11:29                 ` Valentin Schneider
2018-10-10 15:35   ` Lukasz Luba
2018-10-10 16:54     ` Daniel Lezcano
2018-10-11  7:35       ` Lukasz Luba
2018-10-11  8:23         ` Daniel Lezcano
2018-10-12  9:37           ` Lukasz Luba
2018-10-10 17:30     ` Thara Gopinath
2018-10-11 11:10       ` Lukasz Luba
2018-10-16 17:11         ` Vincent Guittot
2018-10-17 16:24           ` Thara Gopinath
2018-10-18  8:00             ` Lukasz Luba
2018-10-18  8:12           ` Lukasz Luba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181018064849.GA42813@gmail.com \
    --to=mingo@kernel.org \
    --cc=amit.kachhap@gmail.com \
    --cc=daniel.lezcano@linaro.org \
    --cc=edubezval@gmail.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=ionela.voinescu@arm.com \
    --cc=javi.merino@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=quentin.perret@arm.com \
    --cc=rafael@kernel.org \
    --cc=rui.zhang@intel.com \
    --cc=thara.gopinath@linaro.org \
    --cc=vincent.guittot@linaro.org \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.