From: "Rafael J. Wysocki" <rjw@sisk.pl>
To: Colin Cross <ccross@android.com>
Cc: Kevin Hilman <khilman@ti.com>, Len Brown <len.brown@intel.com>,
Russell King <linux@arm.linux.org.uk>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Kay Sievers <kay.sievers@vrfy.org>,
linux-kernel@vger.kernel.org,
Amit Kucheria <amit.kucheria@linaro.org>,
linux-pm@lists.linux-foundation.org,
Arjan van de Ven <arjan@linux.intel.com>,
Arnd Bergmann <arnd.bergmann@linaro.org>,
linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCHv3 0/5] coupled cpuidle state support
Date: Thu, 3 May 2012 22:43:56 +0200 [thread overview]
Message-ID: <201205032243.56848.rjw@sisk.pl> (raw)
In-Reply-To: <CAMbhsRS8D=d9aAbikosOis+Q+jMU04Oj9xaRg0jMVxtQ7nijTw@mail.gmail.com>
On Thursday, May 03, 2012, Colin Cross wrote:
> On Thu, May 3, 2012 at 1:00 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> <snip>
> > There are two distinct cases to consider here, (1) when the last I/O
> > device in the domain becomes idle and the question is whether or not to
> > power off the entire domain and (2) when a CPU core in a power domain
> > becomes idle while all of the devices in the domain are idle already.
> >
> > Case (2) is quite straightforward, the .enter() routine for the
> > "domain" C-state has to check whether the domain can be turned off and
> > do it eventually.
> >
> > Case (1) is more difficult and (assuming that all CPU cores in the domain
> > are already idle at this point) i see two possible ways to handle it:
> > (a) Wake up all of the (idle) CPU cores in the domain and let the
> > "domain" C-state's .enter() do the job (ie. turn it into case (2)),
> > similarly to your patchset.
> > (b) If cpuidle has prepared the cores for going into deeper idle,
> > turn the domain off directly without waking up the cores.
>
> Multiple clusters is a design that has been considered in this
> patchset (all the data structures are in the right place to support
> it), and can be supported in the future, but does not exist in any
> current systems that would be using this. In all of today's SoCs,
> there is a single cluster, so (1) can't happen - no code can be
> executing while all cpus are idle.
OK, but I think it should be taken into consideration nonetheless.
> (b) is an optimization that would not be possible on any future SoC
> that is similar to the current SoCs, where "turn the domain off" is
> very tightly integrated with TrustZone secure code running on the
> primary cpu of the cluster.
I see.
> <snip>
>
> > Having considered this for a while I think that it may be more straightforward
> > to avoid waking up the already idled cores.
> >
> > For instance, say we have 4 CPU cores in a cluster (package) such that each
> > core has its own idle state (call it C1) and there is a multicore idle state
> > entered by turning off the entire cluster (call this state C-multi). One of
> > the possible ways to handle this seems to be to use an identical table of
> > C-states for each core containing the C1 entry and a kind of fake entry called
> > (for example) C4 with the time characteristics of C-multi and a special
> > .enter() callback. That callback will prepare the core it is called for to
> > enter C-multi, but instead of simply turning off the whole package it will
> > decrement a counter. If the counte happens to be 0 at this point, the
> > package will be turned off. Otherwise, the core will be put into the idle
> > state corresponding to C1, but it will be ready for entering C-multi at
> > any time. The counter will be incremented on exiting the C4 "state".
>
> I implemented something very similar to this on Tegra2 (having each
> cpu go to C1, but with enough state saved for C-multi), but it turns
> out not to work in hardware. On every existing ARM SMP system where I
> have worked with cpuidle (Tegra2, OMAP4, Exynos5, and some Tegra3),
> only cpu 0 can trigger the transition to C-multi. The cause of this
> restriction is different on every platform - sometimes it's by design,
> sometimes it's a bug in the SoC ROM code, but the restriction exists.
> The primary cpu of the cluster always needs to be awake.
OK, so that means we need to do the wakeup for technical reasons.
> In addition, it may not be possible to transition secondary cpus from
> C1 to C-multi without waking them. That would generally involve
> cutting power to a CPU that is in clock gating, which is not a
> supported power transition in any SoC that I have a datasheet for. I
> made it work for cpu1 on Tegra2, but I can't guarantee that there are
> not unsolvable HW race conditions.
>
> The only generic way to make this work is to wake up all cpus. Waking
> up a subset of cpus is certainly worth investigating as an
> optimization, but it would not be used on Tegra2, OMAP4, or Exynos5.
> Tegra3 may benefit from it.
OK
> > It looks like this should work without modifying the cpuidle core, but
> > the drawback here is that the cpuidle core doesn't know how much time
> > spend in C4 is really in C1 and how much of it is in C-multi, so the
> > statistics reported by it won't reflect the real energy usage.
>
> Idle statistics are extremely important when determining why a
> particular use case is drawing too much power, and it is worth
> modifying the cpuidle core if only to keep them accurate. Especially
> when justifying the move from the cpufreq hotplug governor based code
> that every SoC vendor uses in their BSP to a proper multi-CPU cpuidle
> implementation.
I see.
Thanks for the explanation,
Rafael
WARNING: multiple messages have this Message-ID (diff)
From: rjw@sisk.pl (Rafael J. Wysocki)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCHv3 0/5] coupled cpuidle state support
Date: Thu, 3 May 2012 22:43:56 +0200 [thread overview]
Message-ID: <201205032243.56848.rjw@sisk.pl> (raw)
In-Reply-To: <CAMbhsRS8D=d9aAbikosOis+Q+jMU04Oj9xaRg0jMVxtQ7nijTw@mail.gmail.com>
On Thursday, May 03, 2012, Colin Cross wrote:
> On Thu, May 3, 2012 at 1:00 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> <snip>
> > There are two distinct cases to consider here, (1) when the last I/O
> > device in the domain becomes idle and the question is whether or not to
> > power off the entire domain and (2) when a CPU core in a power domain
> > becomes idle while all of the devices in the domain are idle already.
> >
> > Case (2) is quite straightforward, the .enter() routine for the
> > "domain" C-state has to check whether the domain can be turned off and
> > do it eventually.
> >
> > Case (1) is more difficult and (assuming that all CPU cores in the domain
> > are already idle at this point) i see two possible ways to handle it:
> > (a) Wake up all of the (idle) CPU cores in the domain and let the
> > "domain" C-state's .enter() do the job (ie. turn it into case (2)),
> > similarly to your patchset.
> > (b) If cpuidle has prepared the cores for going into deeper idle,
> > turn the domain off directly without waking up the cores.
>
> Multiple clusters is a design that has been considered in this
> patchset (all the data structures are in the right place to support
> it), and can be supported in the future, but does not exist in any
> current systems that would be using this. In all of today's SoCs,
> there is a single cluster, so (1) can't happen - no code can be
> executing while all cpus are idle.
OK, but I think it should be taken into consideration nonetheless.
> (b) is an optimization that would not be possible on any future SoC
> that is similar to the current SoCs, where "turn the domain off" is
> very tightly integrated with TrustZone secure code running on the
> primary cpu of the cluster.
I see.
> <snip>
>
> > Having considered this for a while I think that it may be more straightforward
> > to avoid waking up the already idled cores.
> >
> > For instance, say we have 4 CPU cores in a cluster (package) such that each
> > core has its own idle state (call it C1) and there is a multicore idle state
> > entered by turning off the entire cluster (call this state C-multi). One of
> > the possible ways to handle this seems to be to use an identical table of
> > C-states for each core containing the C1 entry and a kind of fake entry called
> > (for example) C4 with the time characteristics of C-multi and a special
> > .enter() callback. That callback will prepare the core it is called for to
> > enter C-multi, but instead of simply turning off the whole package it will
> > decrement a counter. If the counte happens to be 0 at this point, the
> > package will be turned off. Otherwise, the core will be put into the idle
> > state corresponding to C1, but it will be ready for entering C-multi at
> > any time. The counter will be incremented on exiting the C4 "state".
>
> I implemented something very similar to this on Tegra2 (having each
> cpu go to C1, but with enough state saved for C-multi), but it turns
> out not to work in hardware. On every existing ARM SMP system where I
> have worked with cpuidle (Tegra2, OMAP4, Exynos5, and some Tegra3),
> only cpu 0 can trigger the transition to C-multi. The cause of this
> restriction is different on every platform - sometimes it's by design,
> sometimes it's a bug in the SoC ROM code, but the restriction exists.
> The primary cpu of the cluster always needs to be awake.
OK, so that means we need to do the wakeup for technical reasons.
> In addition, it may not be possible to transition secondary cpus from
> C1 to C-multi without waking them. That would generally involve
> cutting power to a CPU that is in clock gating, which is not a
> supported power transition in any SoC that I have a datasheet for. I
> made it work for cpu1 on Tegra2, but I can't guarantee that there are
> not unsolvable HW race conditions.
>
> The only generic way to make this work is to wake up all cpus. Waking
> up a subset of cpus is certainly worth investigating as an
> optimization, but it would not be used on Tegra2, OMAP4, or Exynos5.
> Tegra3 may benefit from it.
OK
> > It looks like this should work without modifying the cpuidle core, but
> > the drawback here is that the cpuidle core doesn't know how much time
> > spend in C4 is really in C1 and how much of it is in C-multi, so the
> > statistics reported by it won't reflect the real energy usage.
>
> Idle statistics are extremely important when determining why a
> particular use case is drawing too much power, and it is worth
> modifying the cpuidle core if only to keep them accurate. Especially
> when justifying the move from the cpufreq hotplug governor based code
> that every SoC vendor uses in their BSP to a proper multi-CPU cpuidle
> implementation.
I see.
Thanks for the explanation,
Rafael
WARNING: multiple messages have this Message-ID (diff)
From: "Rafael J. Wysocki" <rjw@sisk.pl>
To: Colin Cross <ccross@android.com>
Cc: linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
linux-pm@lists.linux-foundation.org,
Kevin Hilman <khilman@ti.com>, Len Brown <len.brown@intel.com>,
Trinabh Gupta <g.trinabh@gmail.com>,
Arjan van de Ven <arjan@linux.intel.com>,
Deepthi Dharwar <deepthi@linux.vnet.ibm.com>,
"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
Kay Sievers <kay.sievers@vrfy.org>,
Santosh Shilimkar <santosh.shilimkar@ti.com>,
Daniel Lezcano <daniel.lezcano@linaro.org>,
Amit Kucheria <amit.kucheria@linaro.org>,
Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>,
Arnd Bergmann <arnd.bergmann@linaro.org>,
Russell King <linux@arm.linux.org.uk>,
Len Brown <lenb@kernel.org>
Subject: Re: [PATCHv3 0/5] coupled cpuidle state support
Date: Thu, 3 May 2012 22:43:56 +0200 [thread overview]
Message-ID: <201205032243.56848.rjw@sisk.pl> (raw)
In-Reply-To: <CAMbhsRS8D=d9aAbikosOis+Q+jMU04Oj9xaRg0jMVxtQ7nijTw@mail.gmail.com>
On Thursday, May 03, 2012, Colin Cross wrote:
> On Thu, May 3, 2012 at 1:00 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> <snip>
> > There are two distinct cases to consider here, (1) when the last I/O
> > device in the domain becomes idle and the question is whether or not to
> > power off the entire domain and (2) when a CPU core in a power domain
> > becomes idle while all of the devices in the domain are idle already.
> >
> > Case (2) is quite straightforward, the .enter() routine for the
> > "domain" C-state has to check whether the domain can be turned off and
> > do it eventually.
> >
> > Case (1) is more difficult and (assuming that all CPU cores in the domain
> > are already idle at this point) i see two possible ways to handle it:
> > (a) Wake up all of the (idle) CPU cores in the domain and let the
> > "domain" C-state's .enter() do the job (ie. turn it into case (2)),
> > similarly to your patchset.
> > (b) If cpuidle has prepared the cores for going into deeper idle,
> > turn the domain off directly without waking up the cores.
>
> Multiple clusters is a design that has been considered in this
> patchset (all the data structures are in the right place to support
> it), and can be supported in the future, but does not exist in any
> current systems that would be using this. In all of today's SoCs,
> there is a single cluster, so (1) can't happen - no code can be
> executing while all cpus are idle.
OK, but I think it should be taken into consideration nonetheless.
> (b) is an optimization that would not be possible on any future SoC
> that is similar to the current SoCs, where "turn the domain off" is
> very tightly integrated with TrustZone secure code running on the
> primary cpu of the cluster.
I see.
> <snip>
>
> > Having considered this for a while I think that it may be more straightforward
> > to avoid waking up the already idled cores.
> >
> > For instance, say we have 4 CPU cores in a cluster (package) such that each
> > core has its own idle state (call it C1) and there is a multicore idle state
> > entered by turning off the entire cluster (call this state C-multi). One of
> > the possible ways to handle this seems to be to use an identical table of
> > C-states for each core containing the C1 entry and a kind of fake entry called
> > (for example) C4 with the time characteristics of C-multi and a special
> > .enter() callback. That callback will prepare the core it is called for to
> > enter C-multi, but instead of simply turning off the whole package it will
> > decrement a counter. If the counte happens to be 0 at this point, the
> > package will be turned off. Otherwise, the core will be put into the idle
> > state corresponding to C1, but it will be ready for entering C-multi at
> > any time. The counter will be incremented on exiting the C4 "state".
>
> I implemented something very similar to this on Tegra2 (having each
> cpu go to C1, but with enough state saved for C-multi), but it turns
> out not to work in hardware. On every existing ARM SMP system where I
> have worked with cpuidle (Tegra2, OMAP4, Exynos5, and some Tegra3),
> only cpu 0 can trigger the transition to C-multi. The cause of this
> restriction is different on every platform - sometimes it's by design,
> sometimes it's a bug in the SoC ROM code, but the restriction exists.
> The primary cpu of the cluster always needs to be awake.
OK, so that means we need to do the wakeup for technical reasons.
> In addition, it may not be possible to transition secondary cpus from
> C1 to C-multi without waking them. That would generally involve
> cutting power to a CPU that is in clock gating, which is not a
> supported power transition in any SoC that I have a datasheet for. I
> made it work for cpu1 on Tegra2, but I can't guarantee that there are
> not unsolvable HW race conditions.
>
> The only generic way to make this work is to wake up all cpus. Waking
> up a subset of cpus is certainly worth investigating as an
> optimization, but it would not be used on Tegra2, OMAP4, or Exynos5.
> Tegra3 may benefit from it.
OK
> > It looks like this should work without modifying the cpuidle core, but
> > the drawback here is that the cpuidle core doesn't know how much time
> > spend in C4 is really in C1 and how much of it is in C-multi, so the
> > statistics reported by it won't reflect the real energy usage.
>
> Idle statistics are extremely important when determining why a
> particular use case is drawing too much power, and it is worth
> modifying the cpuidle core if only to keep them accurate. Especially
> when justifying the move from the cpufreq hotplug governor based code
> that every SoC vendor uses in their BSP to a proper multi-CPU cpuidle
> implementation.
I see.
Thanks for the explanation,
Rafael
next prev parent reply other threads:[~2012-05-03 20:43 UTC|newest]
Thread overview: 78+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-04-30 20:09 [PATCHv3 0/5] coupled cpuidle state support Colin Cross
2012-04-30 20:09 ` Colin Cross
2012-04-30 20:09 ` [PATCHv3 1/5] cpuidle: refactor out cpuidle_enter_state Colin Cross
2012-04-30 20:09 ` Colin Cross
2012-05-03 20:50 ` Rafael J. Wysocki
2012-05-03 20:50 ` Rafael J. Wysocki
2012-05-03 20:50 ` Rafael J. Wysocki
2012-04-30 20:09 ` [PATCHv3 2/5] cpuidle: fix error handling in __cpuidle_register_device Colin Cross
2012-04-30 20:09 ` Colin Cross
2012-04-30 20:09 ` Colin Cross
2012-05-03 20:50 ` Rafael J. Wysocki
2012-05-03 20:50 ` [linux-pm] " Rafael J. Wysocki
2012-05-03 20:50 ` Rafael J. Wysocki
2012-04-30 20:09 ` [PATCHv3 3/5] cpuidle: add support for states that affect multiple cpus Colin Cross
2012-04-30 20:09 ` Colin Cross
2012-04-30 20:09 ` Colin Cross
2012-05-03 22:14 ` Rafael J. Wysocki
2012-05-03 22:14 ` [linux-pm] " Rafael J. Wysocki
2012-05-03 22:14 ` Rafael J. Wysocki
2012-05-03 23:09 ` Colin Cross
2012-05-03 23:09 ` [linux-pm] " Colin Cross
2012-05-03 23:09 ` Colin Cross
2012-05-04 11:51 ` Rafael J. Wysocki
2012-05-04 11:51 ` [linux-pm] " Rafael J. Wysocki
2012-05-04 11:51 ` Rafael J. Wysocki
2012-05-04 18:56 ` Colin Cross
2012-05-04 18:56 ` [linux-pm] " Colin Cross
2012-05-04 18:56 ` Colin Cross
2012-05-04 22:27 ` Rafael J. Wysocki
2012-05-04 22:27 ` [linux-pm] " Rafael J. Wysocki
2012-05-04 22:27 ` Rafael J. Wysocki
2012-04-30 20:09 ` [PATCHv3 4/5] cpuidle: coupled: add parallel barrier function Colin Cross
2012-04-30 20:09 ` Colin Cross
2012-04-30 20:09 ` [PATCHv3 5/5] cpuidle: coupled: add trace events Colin Cross
2012-04-30 20:09 ` Colin Cross
2012-04-30 20:09 ` Colin Cross
2012-05-03 21:00 ` Steven Rostedt
2012-05-03 21:00 ` Steven Rostedt
2012-05-03 21:00 ` Steven Rostedt
2012-05-03 21:13 ` Colin Cross
2012-05-03 21:13 ` Colin Cross
2012-05-03 21:13 ` Colin Cross
2012-04-30 21:18 ` [PATCHv3 0/5] coupled cpuidle state support Colin Cross
2012-04-30 21:18 ` Colin Cross
2012-04-30 21:18 ` Colin Cross
2012-04-30 21:25 ` Rafael J. Wysocki
2012-04-30 21:25 ` Rafael J. Wysocki
2012-04-30 21:25 ` Rafael J. Wysocki
2012-04-30 21:37 ` Colin Cross
2012-04-30 21:37 ` Colin Cross
2012-04-30 21:37 ` Colin Cross
2012-04-30 21:54 ` Rafael J. Wysocki
2012-04-30 21:54 ` Rafael J. Wysocki
2012-04-30 21:54 ` Rafael J. Wysocki
2012-04-30 22:01 ` Colin Cross
2012-04-30 22:01 ` Colin Cross
2012-04-30 22:01 ` Colin Cross
2012-05-03 20:00 ` Rafael J. Wysocki
2012-05-03 20:00 ` Rafael J. Wysocki
2012-05-03 20:00 ` Rafael J. Wysocki
2012-05-03 20:18 ` Colin Cross
2012-05-03 20:18 ` Colin Cross
2012-05-03 20:18 ` Colin Cross
2012-05-03 20:43 ` Rafael J. Wysocki [this message]
2012-05-03 20:43 ` Rafael J. Wysocki
2012-05-03 20:43 ` Rafael J. Wysocki
2012-05-04 10:04 ` Lorenzo Pieralisi
2012-05-04 10:04 ` Lorenzo Pieralisi
2012-05-04 10:04 ` Lorenzo Pieralisi
2012-05-01 10:43 ` Lorenzo Pieralisi
2012-05-01 10:43 ` Lorenzo Pieralisi
2012-05-01 10:43 ` Lorenzo Pieralisi
2012-05-02 0:11 ` Colin Cross
2012-05-02 0:11 ` Colin Cross
2012-05-02 0:11 ` Colin Cross
2012-05-02 7:22 ` Santosh Shilimkar
2012-05-02 7:22 ` Santosh Shilimkar
2012-05-02 7:22 ` Santosh Shilimkar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=201205032243.56848.rjw@sisk.pl \
--to=rjw@sisk.pl \
--cc=amit.kucheria@linaro.org \
--cc=arjan@linux.intel.com \
--cc=arnd.bergmann@linaro.org \
--cc=ccross@android.com \
--cc=gregkh@linuxfoundation.org \
--cc=kay.sievers@vrfy.org \
--cc=khilman@ti.com \
--cc=len.brown@intel.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pm@lists.linux-foundation.org \
--cc=linux@arm.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.