* Re: [RFC PATCH v2 0/3] New device-tree format and Opal based idle save-restore
[not found] <20181011132237.14604-1-akshay.adiga@linux.vnet.ibm.com>
@ 2018-10-11 19:55 ` Frank Rowand
[not found] ` <20181011132237.14604-2-akshay.adiga@linux.vnet.ibm.com>
` (2 subsequent siblings)
3 siblings, 0 replies; 4+ messages in thread
From: Frank Rowand @ 2018-10-11 19:55 UTC
To: Akshay Adiga, linux-kernel, linuxppc-dev,
devicetree@vger.kernel.org
Cc: huntbag, npiggin, benh, mpe, ego
+ devicetree mail list
On 10/11/18 06:22, Akshay Adiga wrote:
> Previously, if an older kernel ran on newer firmware, it could enable
> all available states irrespective of its ability to handle them.
> The new device tree format adds a compatible string, so that only a
> kernel capable of handling a given version of a stop state will enable
> it.
>
> Older kernels will still see stop0 and stop0_lite in the older format,
> and we will deprecate it after some time.
>
> 1) The idea is to bump up the version string in firmware if we find a bug
> or regression in stop states. A fix will be provided in Linux, which would
> then know about the bumped-up version of the stop states, whereas kernels
> without the fix would ignore those states.
>
> 2) Slowly deprecate the cpuidle/cpuhotplug threshold which is hard-coded
> into the cpuidle-powernv driver. Instead, use compatible strings to
> indicate if an idle state is suitable for cpuidle and hotplug.
>
> New idle state device tree format :
> power-mgt {
> ...
> ibm,enabled-stop-levels = <0xec000000>;
> ibm,cpu-idle-state-psscr-mask = <0x0 0x3003ff 0x0 0x3003ff>;
> ibm,cpu-idle-state-latencies-ns = <0x3e8 0x7d0>;
> ibm,cpu-idle-state-psscr = <0x0 0x330 0x0 0x300330>;
> ibm,cpu-idle-state-flags = <0x100000 0x101000>;
> ibm,cpu-idle-state-residency-ns = <0x2710 0x4e20>;
> ibm,idle-states {
> stop4 {
> flags = <0x207000>;
> compatible = "ibm,state-v1",
> "opal-supported";
> type = "cpuidle";
> psscr-mask = <0x0 0x3003ff>;
> handle = <0x102>;
> latency-ns = <0x186a0>;
> residency-ns = <0x989680>;
> psscr = <0x0 0x300374>;
> };
> ...
> stop11 {
> ...
> compatible = "ibm,state-v1",
> "opal-supported";
> type = "cpuoffline";
> ...
> };
> };
>
> High-level parsing algorithm :
>
> Say Known version string = "ibm,state-v1"
>
> for each stop state node in device tree:
> if (compatible has known version string)
> kernel takes care of stop-transitions
> else if (compatible has "opal-supported")
> OPAL takes care of stop-transitions
> else
> Skip All deeper states
>
> When a state has neither version support nor OPAL support, all deeper
> states are skipped as well, because a CPU that requested a deeper state
> may end up exiting from this shallower one.
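For readers following the thread, the parsing algorithm above can be sketched in plain C. This is a minimal userspace sketch and not the code from the series: `struct stop_node`, `has_compatible()` and `select_stop_states()` are hypothetical names, the nodes are assumed to be ordered from shallowest to deepest, and the fallback string is spelled "opal-supported" as in the patch 1/3 description.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Who performs the stop transition for a state (hypothetical names). */
enum stop_handler { HANDLER_KERNEL, HANDLER_OPAL, HANDLER_NONE };

/* Hypothetical stand-in for a device-tree node and its "compatible" list. */
struct stop_node {
	const char *name;
	const char *compatible[4];	/* NULL-terminated */
};

static bool has_compatible(const struct stop_node *n, const char *s)
{
	for (size_t i = 0; i < 4 && n->compatible[i]; i++)
		if (strcmp(n->compatible[i], s) == 0)
			return true;
	return false;
}

/*
 * Walk the states from shallowest to deepest and record a handler for
 * each one. Stop at the first state that neither the kernel nor OPAL can
 * handle; that state and all deeper ones are marked HANDLER_NONE.
 * Returns the number of usable states.
 */
static size_t select_stop_states(const struct stop_node *nodes, size_t n,
				 const char *known_version,
				 enum stop_handler *out)
{
	size_t usable = 0;

	for (size_t i = 0; i < n; i++) {
		if (has_compatible(&nodes[i], known_version))
			out[i] = HANDLER_KERNEL;
		else if (has_compatible(&nodes[i], "opal-supported"))
			out[i] = HANDLER_OPAL;
		else
			break;	/* skip this and all deeper states */
		usable++;
	}
	for (size_t i = usable; i < n; i++)
		out[i] = HANDLER_NONE;
	return usable;
}
```

The early `break` is what implements "skip all deeper states": a deeper node is never examined once a shallower one is rejected, even if the deeper node carries a known version string.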
>
> OPAL support for idle states
> ----------------------------
>
> With this patch series, all the states that lose hypervisor state
> will be handled through an OPAL call.
>
> Patch 3 adds support for saving/restoring of SPRs and resyncing the
> timebase in OPAL. All the decision making, such as identifying the first
> thread in the core and taking locks before restoring, is also implemented
> in OPAL.
>
> How does it work ?
> -------------------
>
> Consider a case that stop4 has a bug. We take the following steps to
> mitigate the problem.
>
> 1) Change the compatible string for stop4 in OPAL to "ibm,state-v2" and
> remove "opal-supported". Ship the new firmware.
> The kernel ignores stop4 and all deeper states, but we still have the
> shallower states. This avoids disabling stop states completely.
>
> 2) Implement a workaround in OPAL and add "opal-supported". Ship the new
> firmware. The kernel uses OPAL for stop transitions, which has the
> workaround implemented. We get stop4 and deeper states working without
> kernel changes and backports (and in considerably less time).
>
> 3) Implement the workaround in the kernel and add "ibm,state-v2" to the
> known versions. The kernel will now be able to handle stop4 and deeper
> states.
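The three mitigation steps can be traced with a small classifier for a single state. This is only a sketch under assumptions: `classify()` and `in_list()` are hypothetical helpers, the version strings are written in the "ibm,state-vN" form used by the device-tree example, and the kernel's set of known versions is modelled as a NULL-terminated list.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum handler { BY_KERNEL, BY_OPAL, IGNORED };

/* True if string s appears in the NULL-terminated list. */
static bool in_list(const char *const *list, const char *s)
{
	for (; *list; list++)
		if (strcmp(*list, s) == 0)
			return true;
	return false;
}

/*
 * Decide who handles one stop state, given the compatible strings the
 * firmware advertises for it and the version strings this kernel knows.
 * A known version wins over the OPAL fallback.
 */
static enum handler classify(const char *const *compatible,
			     const char *const *known_versions)
{
	for (const char *const *c = compatible; *c; c++)
		if (in_list(known_versions, *c))
			return BY_KERNEL;
	if (in_list(compatible, "opal-supported"))
		return BY_OPAL;
	return IGNORED;
}
```

With a kernel that knows only "ibm,state-v1", a stop4 node advertising just "ibm,state-v2" (step 1) classifies as IGNORED, the same node with "opal-supported" added (step 2) as BY_OPAL, and once the kernel also knows "ibm,state-v2" (step 3) as BY_KERNEL.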
>
> Changes from v1 :
> - Code is rebased on Nick Piggin's v4 patch "powerpc/64s: reimplement book3s
> idle code in C"
> http://patchwork.ozlabs.org/patch/969596/
> - All the states that lose hypervisor state will be handled by OPAL
> - All the decision making, such as identifying the first thread in
> the core and taking locks before restoring in such cases, has also been
> moved to OPAL
>
>
> Abhishek Goel (1):
> cpuidle/powernv: save-restore sprs in opal
>
> Akshay Adiga (2):
> cpuidle/powernv: Add support for states with ibm,cpuidle-state-v1
> powernv/cpuidle: Pass pointers instead of values to stop loop
>
> arch/powerpc/include/asm/cpuidle.h | 9 +
> arch/powerpc/include/asm/opal-api.h | 4 +-
> arch/powerpc/include/asm/opal.h | 3 +
> arch/powerpc/include/asm/processor.h | 8 +-
> arch/powerpc/kernel/idle_book3s.S | 6 +-
> arch/powerpc/platforms/powernv/idle.c | 247 ++++++++++++++----
> .../powerpc/platforms/powernv/opal-wrappers.S | 2 +
> drivers/cpuidle/cpuidle-powernv.c | 46 ++--
> 8 files changed, 251 insertions(+), 74 deletions(-)
>
* Re: [RFC PATCH v2 1/3] cpuidle/powernv: Add support for states with ibm,cpuidle-state-v1
[not found] ` <20181011132237.14604-2-akshay.adiga@linux.vnet.ibm.com>
@ 2018-10-11 19:55 ` Frank Rowand
0 siblings, 0 replies; 4+ messages in thread
From: Frank Rowand @ 2018-10-11 19:55 UTC
To: Akshay Adiga, linux-kernel, linuxppc-dev,
devicetree@vger.kernel.org
Cc: huntbag, npiggin, benh, mpe, ego
+ devicetree mail list
On 10/11/18 06:22, Akshay Adiga wrote:
> This patch adds support for the new device-tree format for idle state
> description.
>
> Previously, if an older kernel ran on newer firmware, it could enable
> all available states irrespective of its ability to handle them.
> The new device tree format adds a compatible string, so that only a
> kernel capable of handling a given version of a stop state will enable
> it.
>
> Older kernels will still see stop0 and stop0_lite in the older format,
> and we will deprecate it after some time.
>
> 1) The idea is to bump up the version in firmware if we find a bug or
> regression in stop states. A fix will be provided in Linux, which would
> then know about the bumped-up version of the stop states, whereas kernels
> without the fix would ignore those states.
>
> 2) Slowly deprecate the cpuidle/cpuhotplug threshold which is hard-coded
> into the cpuidle-powernv driver. Instead, use compatible strings to
> indicate if an idle state is suitable for cpuidle and hotplug.
>
> New idle state device tree format :
> power-mgt {
> ...
> ibm,enabled-stop-levels = <0xec000000>;
> ibm,cpu-idle-state-psscr-mask = <0x0 0x3003ff 0x0 0x3003ff>;
> ibm,cpu-idle-state-latencies-ns = <0x3e8 0x7d0>;
> ibm,cpu-idle-state-psscr = <0x0 0x330 0x0 0x300330>;
> ibm,cpu-idle-state-flags = <0x100000 0x101000>;
> ibm,cpu-idle-state-residency-ns = <0x2710 0x4e20>;
> ibm,idle-states {
> stop4 {
> flags = <0x207000>;
> compatible = "ibm,state-v1",
> "opal-supported";
> type = "cpuidle";
> psscr-mask = <0x0 0x3003ff>;
> handle = <0x102>;
> latency-ns = <0x186a0>;
> residency-ns = <0x989680>;
> psscr = <0x0 0x300374>;
> };
> ...
> stop11 {
> ...
> compatible = "ibm,state-v1",
> "opal-supported";
> type = "cpuoffline";
> ...
> };
> };
> type strings :
> "cpuidle" : indicates it should be used by cpuidle-driver
> "cpuoffline" : indicates it should be used by hotplug driver
>
> compatible strings :
> "ibm,state-v1" : kernel checks if it knows about this version
> "opal-supported" : indicates kernel can fall back to use opal
> for stop-transitions
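The handling of the "type" property described above, together with the latency-based fallback that the diff below applies to old-format states, can be sketched as two small functions. This is an illustrative userspace sketch: `parse_state_type()` and `legacy_state_type()` are hypothetical names, and the sketch returns -1 where the patch returns -EINVAL.

```c
#include <string.h>

enum idle_state_type_t { CPUIDLE_TYPE, CPUOFFLINE_TYPE };

#define POWERNV_THRESHOLD_LATENCY_NS 200000

/*
 * Map the "type" property string to the enum. Returns 0 on success and
 * -1 for an unrecognised string, in which case *out is left untouched.
 */
static int parse_state_type(const char *type, enum idle_state_type_t *out)
{
	if (strcmp(type, "cpuidle") == 0)
		*out = CPUIDLE_TYPE;
	else if (strcmp(type, "cpuoffline") == 0)
		*out = CPUOFFLINE_TYPE;
	else
		return -1;
	return 0;
}

/*
 * Old-format states carry no "type" property, so the type is derived
 * from the exit latency: anything above the threshold is hotplug-only.
 */
static enum idle_state_type_t legacy_state_type(unsigned int latency_ns)
{
	return latency_ns > POWERNV_THRESHOLD_LATENCY_NS ?
		CPUOFFLINE_TYPE : CPUIDLE_TYPE;
}
```

For the stop4 example above, a latency-ns of 0x186a0 (100000 ns) is below the threshold, so the legacy rule and the explicit "cpuidle" type agree.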
>
> Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
> ---
>
> Changes from v1 :
> - Code is rebased on Nick Piggin's v4 patch "powerpc/64s: reimplement book3s
> idle code in C"
> - Moved "cpuidle" and "cpuoffline" into a separate property called
> "type"
>
>
> arch/powerpc/include/asm/cpuidle.h | 9 ++
> arch/powerpc/platforms/powernv/idle.c | 132 +++++++++++++++++++++++++-
> drivers/cpuidle/cpuidle-powernv.c | 31 ++++--
> 3 files changed, 160 insertions(+), 12 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/cpuidle.h b/arch/powerpc/include/asm/cpuidle.h
> index 9844b3ded187..e920a15e797f 100644
> --- a/arch/powerpc/include/asm/cpuidle.h
> +++ b/arch/powerpc/include/asm/cpuidle.h
> @@ -70,14 +70,23 @@
>
> #ifndef __ASSEMBLY__
>
> +enum idle_state_type_t {
> + CPUIDLE_TYPE,
> + CPUOFFLINE_TYPE
> +};
> +
> +#define POWERNV_THRESHOLD_LATENCY_NS 200000
> +#define PNV_VER_NAME_LEN 32
> #define PNV_IDLE_NAME_LEN 16
> struct pnv_idle_states_t {
> char name[PNV_IDLE_NAME_LEN];
> + char version[PNV_VER_NAME_LEN];
> u32 latency_ns;
> u32 residency_ns;
> u64 psscr_val;
> u64 psscr_mask;
> u32 flags;
> + enum idle_state_type_t type;
> bool valid;
> };
>
> diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
> index 96186af9e953..755918402591 100644
> --- a/arch/powerpc/platforms/powernv/idle.c
> +++ b/arch/powerpc/platforms/powernv/idle.c
> @@ -54,6 +54,20 @@ static bool default_stop_found;
> static u64 pnv_first_tb_loss_level = MAX_STOP_STATE + 1;
> static u64 pnv_first_hv_loss_level = MAX_STOP_STATE + 1;
>
> +
> +static int parse_dt_v1(struct device_node *np);
> +struct stop_version_t {
> + const char name[PNV_VER_NAME_LEN];
> + int (*parser_fn)(struct device_node *np);
> +};
> +struct stop_version_t known_versions[] = {
> + {
> + .name = "ibm,state-v1",
> + .parser_fn = parse_dt_v1,
> + }
> + };
> +const int nr_known_versions = 1;
> +
> /*
> * psscr value and mask of the deepest stop idle state.
> * Used when a cpu is offlined.
> @@ -1195,6 +1209,77 @@ static void __init pnv_probe_idle_states(void)
> supported_cpuidle_states |= pnv_idle_states[i].flags;
> }
>
> +static int parse_dt_v1(struct device_node *dt_node)
> +{
> + const char *temp_str;
> + int rc;
> + int i = nr_pnv_idle_states;
> +
> + if (!dt_node) {
> + pr_err("Invalid device_node\n");
> + return -EINVAL;
> + }
> +
> + rc = of_property_read_string(dt_node, "name", &temp_str);
> + if (rc) {
> + pr_err("error reading names rc= %d\n", rc);
> + return -EINVAL;
> + }
> + strncpy(pnv_idle_states[i].name, temp_str, PNV_IDLE_NAME_LEN);
> + rc = of_property_read_u32(dt_node, "residency-ns",
> + &pnv_idle_states[i].residency_ns);
> + if (rc) {
> + pr_err("error reading residency rc= %d\n", rc);
> + return -EINVAL;
> + }
> + rc = of_property_read_u32(dt_node, "latency-ns",
> + &pnv_idle_states[i].latency_ns);
> + if (rc) {
> + pr_err("error reading latency rc= %d\n", rc);
> + return -EINVAL;
> + }
> + rc = of_property_read_u32(dt_node, "flags",
> + &pnv_idle_states[i].flags);
> + if (rc) {
> + pr_err("error reading flags rc= %d\n", rc);
> + return -EINVAL;
> + }
> +
> + /* We are not expecting power8 device-tree in this format */
> + rc = of_property_read_u64(dt_node, "psscr-mask",
> + &pnv_idle_states[i].psscr_mask);
> + if (rc) {
> + pr_err("error reading psscr-mask rc= %d\n", rc);
> + return -EINVAL;
> + }
> + rc = of_property_read_u64(dt_node, "psscr",
> + &pnv_idle_states[i].psscr_val);
> + if (rc) {
> + pr_err("error reading psscr rc= %d\n", rc);
> + return -EINVAL;
> + }
> +
> + /*
> + * TODO : save the version strings in data structure
> + */
> + rc = of_property_read_string(dt_node, "type", &temp_str);
> + if (rc) {
> + pr_err("error reading type rc= %d\n", rc);
> + return -EINVAL;
> + }
> + pr_info("type = %s\n", temp_str);
> + if (strcmp(temp_str, "cpuidle") == 0)
> + pnv_idle_states[i].type = CPUIDLE_TYPE;
> + else if (strcmp(temp_str, "cpuoffline") == 0)
> + pnv_idle_states[i].type = CPUOFFLINE_TYPE;
> + else {
> + pr_err("Invalid type skipping %s\n",
> + pnv_idle_states[i].name);
> + return -EINVAL;
> + }
> + return 0;
> +
> +}
> /*
> * This function parses device-tree and populates all the information
> * into pnv_idle_states structure. It also sets up nr_pnv_idle_states
> @@ -1203,8 +1288,9 @@ static void __init pnv_probe_idle_states(void)
>
> static int pnv_parse_cpuidle_dt(void)
> {
> - struct device_node *np;
> + struct device_node *np, *np1, *dt_node;
> int nr_idle_states, i;
> + int additional_states = 0;
> int rc = 0;
> u32 *temp_u32;
> u64 *temp_u64;
> @@ -1218,8 +1304,14 @@ static int pnv_parse_cpuidle_dt(void)
> nr_idle_states = of_property_count_u32_elems(np,
> "ibm,cpu-idle-state-flags");
>
> - pnv_idle_states = kcalloc(nr_idle_states, sizeof(*pnv_idle_states),
> - GFP_KERNEL);
> + np1 = of_find_node_by_path("/ibm,opal/power-mgt/ibm,idle-states");
> + if (np1) {
> + for_each_child_of_node(np1, dt_node)
> + additional_states++;
> + }
> + pr_info("states in new format : %d\n", additional_states);
> + pnv_idle_states = kcalloc(nr_idle_states + additional_states,
> + sizeof(*pnv_idle_states), GFP_KERNEL);
> temp_u32 = kcalloc(nr_idle_states, sizeof(u32), GFP_KERNEL);
> temp_u64 = kcalloc(nr_idle_states, sizeof(u64), GFP_KERNEL);
> temp_string = kcalloc(nr_idle_states, sizeof(char *), GFP_KERNEL);
> @@ -1298,8 +1390,40 @@ static int pnv_parse_cpuidle_dt(void)
> for (i = 0; i < nr_idle_states; i++)
> strlcpy(pnv_idle_states[i].name, temp_string[i],
> PNV_IDLE_NAME_LEN);
> +
> + /* Mark states as CPUIDLE_TYPE /CPUOFFLINE for older version*/
> + for (i = 0; i < nr_idle_states; i++) {
> + if (pnv_idle_states[i].latency_ns > POWERNV_THRESHOLD_LATENCY_NS)
> + pnv_idle_states[i].type = CPUOFFLINE_TYPE;
> + else
> + pnv_idle_states[i].type = CPUIDLE_TYPE;
> + }
> nr_pnv_idle_states = nr_idle_states;
> - rc = 0;
> + /* Parsing node-based idle states device-tree format */
> + if (!np1) {
> + pr_info("dt does not contain ibm,idle_states");
> + goto out;
> + }
> + /* Parse each child node with appropriate parser_fn */
> + for_each_child_of_node(np1, dt_node) {
> + bool found_known_version = false;
> + /* we don't have state falling back to opal*/
> + for (i = 0; i < nr_known_versions ; i++) {
> + if (of_device_is_compatible(dt_node, known_versions[i].name)) {
> + rc = known_versions[i].parser_fn(dt_node);
> + if (rc) {
> + pr_err("%s could not parse\n", known_versions[i].name);
> + continue;
> + }
> + found_known_version = true;
> + }
> + }
> + if (!found_known_version) {
> + pr_info("Unsupported state, skipping all further state\n");
> + goto out;
> + }
> + nr_pnv_idle_states++;
> + }
> out:
> kfree(temp_u32);
> kfree(temp_u64);
> diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
> index 84b1ebe212b3..a15514ebd1c3 100644
> --- a/drivers/cpuidle/cpuidle-powernv.c
> +++ b/drivers/cpuidle/cpuidle-powernv.c
> @@ -26,7 +26,6 @@
> * Expose only those Hardware idle states via the cpuidle framework
> * that have latency value below POWERNV_THRESHOLD_LATENCY_NS.
> */
> -#define POWERNV_THRESHOLD_LATENCY_NS 200000
>
> static struct cpuidle_driver powernv_idle_driver = {
> .name = "powernv_idle",
> @@ -265,7 +264,7 @@ extern u32 pnv_get_supported_cpuidle_states(void);
> static int powernv_add_idle_states(void)
> {
> int nr_idle_states = 1; /* Snooze */
> - int dt_idle_states;
> + int dt_idle_states = 0;
> u32 has_stop_states = 0;
> int i;
> u32 supported_flags = pnv_get_supported_cpuidle_states();
> @@ -277,14 +276,19 @@ static int powernv_add_idle_states(void)
> goto out;
> }
>
> - /* TODO: Count only states which are eligible for cpuidle */
> - dt_idle_states = nr_pnv_idle_states;
> + /* Count only cpuidle states*/
> + for (i = 0; i < nr_pnv_idle_states; i++) {
> + if (pnv_idle_states[i].type == CPUIDLE_TYPE)
> + dt_idle_states++;
> + }
> + pr_info("idle states in dt = %d , states with idle flag = %d",
> + nr_pnv_idle_states, dt_idle_states);
>
> /*
> * Since snooze is used as first idle state, max idle states allowed is
> * CPUIDLE_STATE_MAX -1
> */
> - if (nr_pnv_idle_states > CPUIDLE_STATE_MAX - 1) {
> + if (dt_idle_states > CPUIDLE_STATE_MAX - 1) {
> pr_warn("cpuidle-powernv: discovered idle states more than allowed");
> dt_idle_states = CPUIDLE_STATE_MAX - 1;
> }
> @@ -305,8 +309,15 @@ static int powernv_add_idle_states(void)
> * Skip the platform idle state whose flag isn't in
> * the supported_cpuidle_states flag mask.
> */
> - if ((state->flags & supported_flags) != state->flags)
> + if ((state->flags & supported_flags) != state->flags) {
> + pr_warn("State %d does not have supported flag\n", i);
> + continue;
> + }
> + if (state->type != CPUIDLE_TYPE) {
> + pr_info("State %d is not idletype, it of %d type\n", i,
> + state->type);
> continue;
> + }
> /*
> * If an idle state has exit latency beyond
> * POWERNV_THRESHOLD_LATENCY_NS then don't use it
> @@ -321,8 +332,10 @@ static int powernv_add_idle_states(void)
> exit_latency = DIV_ROUND_UP(state->latency_ns, 1000);
> target_residency = DIV_ROUND_UP(state->residency_ns, 1000);
>
> - if (has_stop_states && !(state->valid))
> + if (has_stop_states && !(state->valid)) {
> + pr_warn("State %d is invalid\n", i);
> continue;
> + }
>
> if (state->flags & OPAL_PM_TIMEBASE_STOP)
> stops_timebase = true;
> @@ -360,8 +373,10 @@ static int powernv_add_idle_states(void)
> state->psscr_mask);
> }
> #endif
> - else
> + else {
> + pr_warn("cpuidle-powernv : could not add state\n");
> continue;
> + }
> nr_idle_states++;
> }
> out:
>
* Re: [RFC PATCH v2 2/3] powernv/cpuidle: Pass pointers instead of values to stop loop
[not found] ` <20181011132237.14604-3-akshay.adiga@linux.vnet.ibm.com>
@ 2018-10-11 19:56 ` Frank Rowand
0 siblings, 0 replies; 4+ messages in thread
From: Frank Rowand @ 2018-10-11 19:56 UTC
To: Akshay Adiga, linux-kernel, linuxppc-dev,
devicetree@vger.kernel.org
Cc: huntbag, npiggin, benh, mpe, ego
+ devicetree mail list
On 10/11/18 06:22, Akshay Adiga wrote:
> Pass a pointer to the pnv_idle_states_t entry instead of the psscr value
> and mask. This lets us pass more information to the stop loop, which will
> help determine the method used to enter/exit the idle state.
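The effect of the interface change can be sketched outside the kernel as follows. This is a simplified illustration, not the patch's code: `struct idle_state` is a cut-down, hypothetical stand-in for `struct pnv_idle_states_t`, and `compose_psscr()` is a hypothetical helper that mirrors the expression `(psscr & ~mask) | val` used in the diff below.

```c
#include <stdint.h>

/* Cut-down stand-in for struct pnv_idle_states_t (hypothetical). */
struct idle_state {
	uint64_t psscr_val;
	uint64_t psscr_mask;
	uint32_t flags;
};

/*
 * Build the PSSCR value to request: clear the bits covered by the
 * state's mask in the current register value, then OR in the state's
 * value. Taking the whole state by pointer means the callee can also
 * look at other fields (flags here) without another signature change.
 */
static uint64_t compose_psscr(uint64_t cur, const struct idle_state *st)
{
	return (cur & ~st->psscr_mask) | st->psscr_val;
}
```

With the stop4 values from the cover letter (psscr 0x300374, mask 0x3003ff), every bit outside the mask is preserved from the current register value and every bit inside the mask is taken from the state.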
>
> Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
>
> ---
> Changes from v1 :
> - Code is rebased on Nick Piggin's v4 patch "powerpc/64s: reimplement book3s
> idle code in C"
>
> arch/powerpc/include/asm/processor.h | 5 ++-
> arch/powerpc/platforms/powernv/idle.c | 47 ++++++++++-----------------
> drivers/cpuidle/cpuidle-powernv.c | 15 +++------
> 3 files changed, 24 insertions(+), 43 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
> index 936795acba48..822d3236ad7f 100644
> --- a/arch/powerpc/include/asm/processor.h
> +++ b/arch/powerpc/include/asm/processor.h
> @@ -43,6 +43,7 @@
> #include <asm/thread_info.h>
> #include <asm/ptrace.h>
> #include <asm/hw_breakpoint.h>
> +#include <asm/cpuidle.h>
>
> /* We do _not_ want to define new machine types at all, those must die
> * in favor of using the device-tree
> @@ -518,9 +519,7 @@ enum idle_boot_override {IDLE_NO_OVERRIDE = 0, IDLE_POWERSAVE_OFF};
> extern int powersave_nap; /* set if nap mode can be used in idle loop */
>
> extern void power7_idle_type(unsigned long type);
> -extern void power9_idle_type(unsigned long stop_psscr_val,
> - unsigned long stop_psscr_mask);
> -
> +extern void power9_idle_type(struct pnv_idle_states_t *state);
> extern void flush_instruction_cache(void);
> extern void hard_reset_now(void);
> extern void poweroff_now(void);
> diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
> index 755918402591..681a23a066bb 100644
> --- a/arch/powerpc/platforms/powernv/idle.c
> +++ b/arch/powerpc/platforms/powernv/idle.c
> @@ -44,8 +44,7 @@ int nr_pnv_idle_states;
> * The default stop state that will be used by ppc_md.power_save
> * function on platforms that support stop instruction.
> */
> -static u64 pnv_default_stop_val;
> -static u64 pnv_default_stop_mask;
> +struct pnv_idle_states_t *pnv_default_state;
> static bool default_stop_found;
>
> /*
> @@ -72,9 +71,7 @@ const int nr_known_versions = 1;
> * psscr value and mask of the deepest stop idle state.
> * Used when a cpu is offlined.
> */
> -static u64 pnv_deepest_stop_psscr_val;
> -static u64 pnv_deepest_stop_psscr_mask;
> -static u64 pnv_deepest_stop_flag;
> +static struct pnv_idle_states_t *pnv_deepest_state;
> static bool deepest_stop_found;
>
> static unsigned long power7_offline_type;
> @@ -96,7 +93,7 @@ static int pnv_save_sprs_for_deep_states(void)
> uint64_t hid5_val = mfspr(SPRN_HID5);
> uint64_t hmeer_val = mfspr(SPRN_HMEER);
> uint64_t msr_val = MSR_IDLE;
> - uint64_t psscr_val = pnv_deepest_stop_psscr_val;
> + uint64_t psscr_val = pnv_deepest_state->psscr_val;
>
> for_each_present_cpu(cpu) {
> uint64_t pir = get_hard_smp_processor_id(cpu);
> @@ -820,17 +817,15 @@ static unsigned long power9_offline_stop(unsigned long psscr)
> return srr1;
> }
>
> -static unsigned long __power9_idle_type(unsigned long stop_psscr_val,
> - unsigned long stop_psscr_mask)
> +static unsigned long __power9_idle_type(struct pnv_idle_states_t *state)
> {
> unsigned long psscr;
> unsigned long srr1;
>
> if (!prep_irq_for_idle_irqsoff())
> return 0;
> -
> psscr = mfspr(SPRN_PSSCR);
> - psscr = (psscr & ~stop_psscr_mask) | stop_psscr_val;
> + psscr = (psscr & ~state->psscr_mask) | state->psscr_val;
>
> __ppc64_runlatch_off();
> srr1 = power9_idle_stop(psscr, true);
> @@ -841,12 +836,10 @@ static unsigned long __power9_idle_type(unsigned long stop_psscr_val,
> return srr1;
> }
>
> -void power9_idle_type(unsigned long stop_psscr_val,
> - unsigned long stop_psscr_mask)
> +void power9_idle_type(struct pnv_idle_states_t *state)
> {
> unsigned long srr1;
> -
> - srr1 = __power9_idle_type(stop_psscr_val, stop_psscr_mask);
> + srr1 = __power9_idle_type(state);
> irq_set_pending_from_srr1(srr1);
> }
>
> @@ -855,7 +848,7 @@ void power9_idle_type(unsigned long stop_psscr_val,
> */
> void power9_idle(void)
> {
> - power9_idle_type(pnv_default_stop_val, pnv_default_stop_mask);
> + power9_idle_type(pnv_default_state);
> }
>
> #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
> @@ -974,8 +967,8 @@ unsigned long pnv_cpu_offline(unsigned int cpu)
> unsigned long psscr;
>
> psscr = mfspr(SPRN_PSSCR);
> - psscr = (psscr & ~pnv_deepest_stop_psscr_mask) |
> - pnv_deepest_stop_psscr_val;
> + psscr = (psscr & ~pnv_deepest_state->psscr_mask) |
> + pnv_deepest_state->psscr_val;
> srr1 = power9_offline_stop(psscr);
> } else if (cpu_has_feature(CPU_FTR_ARCH_206) && power7_offline_type) {
> srr1 = power7_offline();
> @@ -1123,16 +1116,13 @@ static void __init pnv_power9_idle_init(void)
>
> if (max_residency_ns < state->residency_ns) {
> max_residency_ns = state->residency_ns;
> - pnv_deepest_stop_psscr_val = state->psscr_val;
> - pnv_deepest_stop_psscr_mask = state->psscr_mask;
> - pnv_deepest_stop_flag = state->flags;
> + pnv_deepest_state = state;
> deepest_stop_found = true;
> }
>
> if (!default_stop_found &&
> (state->flags & OPAL_PM_STOP_INST_FAST)) {
> - pnv_default_stop_val = state->psscr_val;
> - pnv_default_stop_mask = state->psscr_mask;
> + pnv_default_state = state;
> default_stop_found = true;
> WARN_ON(state->flags & OPAL_PM_LOSE_FULL_CONTEXT);
> }
> @@ -1143,15 +1133,15 @@ static void __init pnv_power9_idle_init(void)
> } else {
> ppc_md.power_save = power9_idle;
> pr_info("cpuidle-powernv: Default stop: psscr = 0x%016llx,mask=0x%016llx\n",
> - pnv_default_stop_val, pnv_default_stop_mask);
> + pnv_default_state->psscr_val, pnv_default_state->psscr_mask);
> }
>
> if (unlikely(!deepest_stop_found)) {
> pr_warn("cpuidle-powernv: No suitable stop state for CPU-Hotplug. Offlined CPUs will busy wait");
> } else {
> pr_info("cpuidle-powernv: Deepest stop: psscr = 0x%016llx,mask=0x%016llx\n",
> - pnv_deepest_stop_psscr_val,
> - pnv_deepest_stop_psscr_mask);
> + pnv_deepest_state->psscr_val,
> + pnv_deepest_state->psscr_mask);
> }
>
> pr_info("cpuidle-powernv: First stop level that may lose SPRs = 0x%lld\n",
> @@ -1173,16 +1163,15 @@ static void __init pnv_disable_deep_states(void)
> pr_warn("cpuidle-powernv: Idle power-savings, CPU-Hotplug affected\n");
>
> if (cpu_has_feature(CPU_FTR_ARCH_300) &&
> - (pnv_deepest_stop_flag & OPAL_PM_LOSE_FULL_CONTEXT)) {
> + (pnv_deepest_state->flags & OPAL_PM_LOSE_FULL_CONTEXT)) {
> /*
> * Use the default stop state for CPU-Hotplug
> * if available.
> */
> if (default_stop_found) {
> - pnv_deepest_stop_psscr_val = pnv_default_stop_val;
> - pnv_deepest_stop_psscr_mask = pnv_default_stop_mask;
> + pnv_deepest_state = pnv_default_state;
> pr_warn("cpuidle-powernv: Offlined CPUs will stop with psscr = 0x%016llx\n",
> - pnv_deepest_stop_psscr_val);
> + pnv_deepest_state->psscr_val);
> } else { /* Fallback to snooze loop for CPU-Hotplug */
> deepest_stop_found = false;
> pr_warn("cpuidle-powernv: Offlined CPUs will busy wait\n");
> diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
> index a15514ebd1c3..5116d5991d30 100644
> --- a/drivers/cpuidle/cpuidle-powernv.c
> +++ b/drivers/cpuidle/cpuidle-powernv.c
> @@ -35,13 +35,7 @@ static struct cpuidle_driver powernv_idle_driver = {
> static int max_idle_state __read_mostly;
> static struct cpuidle_state *cpuidle_state_table __read_mostly;
>
> -struct stop_psscr_table {
> - u64 val;
> - u64 mask;
> -};
> -
> -static struct stop_psscr_table stop_psscr_table[CPUIDLE_STATE_MAX] __read_mostly;
> -
> +struct pnv_idle_states_t idx_to_state_ptr[CPUIDLE_STATE_MAX] __read_mostly;
> static u64 default_snooze_timeout __read_mostly;
> static bool snooze_timeout_en __read_mostly;
>
> @@ -143,8 +137,9 @@ static int stop_loop(struct cpuidle_device *dev,
> struct cpuidle_driver *drv,
> int index)
> {
> - power9_idle_type(stop_psscr_table[index].val,
> - stop_psscr_table[index].mask);
> + struct pnv_idle_states_t *state;
> + state = &pnv_idle_states[index];
> + power9_idle_type(state);
> return index;
> }
>
> @@ -242,8 +237,6 @@ static inline void add_powernv_state(int index, const char *name,
> powernv_states[index].exit_latency = exit_latency;
> powernv_states[index].enter = idle_fn;
> /* For power8 and below psscr_* will be 0 */
> - stop_psscr_table[index].val = psscr_val;
> - stop_psscr_table[index].mask = psscr_mask;
> }
>
> /*
>
* Re: [RFC PATCH v2 3/3] cpuidle/powernv: save-restore sprs in opal
[not found] ` <20181011132237.14604-4-akshay.adiga@linux.vnet.ibm.com>
@ 2018-10-11 19:56 ` Frank Rowand
0 siblings, 0 replies; 4+ messages in thread
From: Frank Rowand @ 2018-10-11 19:56 UTC
To: Akshay Adiga, linux-kernel, linuxppc-dev,
devicetree@vger.kernel.org
Cc: huntbag, npiggin, benh, mpe, ego
+ devicetree mail list
On 10/11/18 06:22, Akshay Adiga wrote:
> From: Abhishek Goel <huntbag@linux.vnet.ibm.com>
>
> This patch moves the saving and restoring of SPRs for P9 cpuidle
> from the kernel to OPAL.
> In an attempt to make the powernv idle code backward compatible,
> and to some extent forward compatible, add support for pre-stop entry
> and post-stop exit actions in OPAL. If a kernel knows about this
> OPAL call, then only firmware supporting the newer hardware is required,
> instead of waiting for kernel updates.
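The decision this patch introduces (who saves and restores the hypervisor SPRs for a given stop request) can be sketched as a pure function. This is a sketch under assumptions: `spr_save_owner()` and `enum spr_owner` are hypothetical names, `PSSCR_RL_MASK` is taken to be the low four bits of PSSCR, and the first-HV-loss level passed in is purely illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Requested-level field of PSSCR; assumed to be the low 4 bits. */
#define PSSCR_RL_MASK 0xFULL

/* Who saves/restores the hypervisor SPRs (hypothetical names). */
enum spr_owner { SPR_NOBODY, SPR_KERNEL, SPR_OPAL };

/* True if the requested stop level is deep enough to lose HV state. */
static bool loses_hv_state(uint64_t psscr, uint64_t first_hv_loss_level)
{
	return (psscr & PSSCR_RL_MASK) >= first_hv_loss_level;
}

/*
 * Shallow states need no hypervisor SPR save at all. For deep states the
 * kernel does the work unless the OPAL fallback was requested, in which
 * case the save/restore is delegated to firmware.
 */
static enum spr_owner spr_save_owner(uint64_t psscr,
				     uint64_t first_hv_loss_level,
				     bool request_opal_call)
{
	if (!loses_hv_state(psscr, first_hv_loss_level))
		return SPR_NOBODY;
	return request_opal_call ? SPR_OPAL : SPR_KERNEL;
}
```

This matches the two conditions in the diff below: the kernel saves SPRs only when `is_hv_loss && !request_opal_call`, and the assembly stub is told to call into OPAL only when `is_hv_loss && request_opal_call`.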
>
> Signed-off-by: Abhishek Goel <huntbag@linux.vnet.ibm.com>
> Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
> ---
> Changes from v1 :
> - Code is rebased on Nick Piggin's v4 patch "powerpc/64s: reimplement book3s
> idle code in C"
> - Set a global variable "request_opal_call" to indicate that deep
> states should make an OPAL call.
> - All the states that lose hypervisor state will be handled by OPAL
> - All the decision making, such as identifying the first thread in
> the core and taking locks before restoring in such cases, has also been
> moved to OPAL
> arch/powerpc/include/asm/opal-api.h | 4 +-
> arch/powerpc/include/asm/opal.h | 3 +
> arch/powerpc/include/asm/processor.h | 3 +-
> arch/powerpc/kernel/idle_book3s.S | 6 +-
> arch/powerpc/platforms/powernv/idle.c | 88 +++++++++++++------
> .../powerpc/platforms/powernv/opal-wrappers.S | 2 +
> 6 files changed, 77 insertions(+), 29 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
> index 8365353330b4..93ea1f79e295 100644
> --- a/arch/powerpc/include/asm/opal-api.h
> +++ b/arch/powerpc/include/asm/opal-api.h
> @@ -210,7 +210,9 @@
> #define OPAL_PCI_GET_PBCQ_TUNNEL_BAR 164
> #define OPAL_PCI_SET_PBCQ_TUNNEL_BAR 165
> #define OPAL_NX_COPROC_INIT 167
> -#define OPAL_LAST 167
> +#define OPAL_IDLE_SAVE 170
> +#define OPAL_IDLE_RESTORE 171
> +#define OPAL_LAST 171
>
> #define QUIESCE_HOLD 1 /* Spin all calls at entry */
> #define QUIESCE_REJECT 2 /* Fail all calls with OPAL_BUSY */
> diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
> index ff3866473afe..26995e16171e 100644
> --- a/arch/powerpc/include/asm/opal.h
> +++ b/arch/powerpc/include/asm/opal.h
> @@ -356,6 +356,9 @@ extern int opal_handle_hmi_exception(struct pt_regs *regs);
> extern void opal_shutdown(void);
> extern int opal_resync_timebase(void);
>
> +extern int opal_cpuidle_save(u64 psscr);
> +extern int opal_cpuidle_restore(u64 psscr, u64 srr1);
> +
> extern void opal_lpc_init(void);
>
> extern void opal_kmsg_init(void);
> diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
> index 822d3236ad7f..26fa6c1836f4 100644
> --- a/arch/powerpc/include/asm/processor.h
> +++ b/arch/powerpc/include/asm/processor.h
> @@ -510,7 +510,8 @@ static inline unsigned long get_clean_sp(unsigned long sp, int is_32)
>
> /* asm stubs */
> extern unsigned long isa300_idle_stop_noloss(unsigned long psscr_val);
> -extern unsigned long isa300_idle_stop_mayloss(unsigned long psscr_val);
> +extern unsigned long isa300_idle_stop_mayloss(unsigned long psscr_val,
> + bool request_opal_call);
> extern unsigned long isa206_idle_insn_mayloss(unsigned long type);
>
> extern unsigned long cpuidle_disable;
> diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
> index ffdee1ab4388..a2014d152035 100644
> --- a/arch/powerpc/kernel/idle_book3s.S
> +++ b/arch/powerpc/kernel/idle_book3s.S
> @@ -52,14 +52,16 @@ _GLOBAL(isa300_idle_stop_noloss)
> _GLOBAL(isa300_idle_stop_mayloss)
> mtspr SPRN_PSSCR,r3
> std r1,PACAR1(r13)
> - mflr r4
> + mflr r7
> mfcr r5
> /* use stack red zone rather than a new frame */
> addi r6,r1,-INT_FRAME_SIZE
> SAVE_GPR(2, r6)
> SAVE_NVGPRS(r6)
> - std r4,_LINK(r6)
> + std r7,_LINK(r6)
> std r5,_CCR(r6)
> + cmpwi r4,0
> + bne opal_cpuidle_save
> PPC_STOP
> b . /* catch bugs */
>
> diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
> index 681a23a066bb..bcfe08022e65 100644
> --- a/arch/powerpc/platforms/powernv/idle.c
> +++ b/arch/powerpc/platforms/powernv/idle.c
> @@ -171,6 +171,7 @@ static void pnv_fastsleep_workaround_apply(void *info)
>
> static bool power7_fastsleep_workaround_entry = true;
> static bool power7_fastsleep_workaround_exit = true;
> +static bool request_opal_call = false;
>
> /*
> * Used to store fastsleep workaround state
> @@ -604,6 +605,7 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
> unsigned long mmcr0 = 0;
> struct p9_sprs sprs;
> bool sprs_saved = false;
> + bool is_hv_loss = false;
>
> memset(&sprs, 0, sizeof(sprs));
>
> @@ -648,7 +650,9 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
> */
> mmcr0 = mfspr(SPRN_MMCR0);
> }
> - if ((psscr & PSSCR_RL_MASK) >= pnv_first_hv_loss_level) {
> +
> + is_hv_loss = (psscr & PSSCR_RL_MASK) >= pnv_first_hv_loss_level;
> + if (is_hv_loss && (!request_opal_call)) {
> sprs.lpcr = mfspr(SPRN_LPCR);
> sprs.hfscr = mfspr(SPRN_HFSCR);
> sprs.fscr = mfspr(SPRN_FSCR);
> @@ -674,7 +678,8 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
> atomic_start_thread_idle();
> }
>
> - srr1 = isa300_idle_stop_mayloss(psscr);
> + srr1 = isa300_idle_stop_mayloss(psscr,
> + is_hv_loss && request_opal_call);
>
> #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
> local_paca->requested_psscr = 0;
> @@ -685,6 +690,25 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
> WARN_ON_ONCE(!srr1);
> WARN_ON_ONCE(mfmsr() & (MSR_IR|MSR_DR));
>
> + /*
> + * On POWER9, SRR1 bits do not match exactly as expected.
> + * SRR1_WS_GPRLOSS (10b) can also result in SPR loss, so
> + * always test PSSCR if there is any state loss.
> + */
> + if (likely(((psscr & PSSCR_PLS) >> 60) < pnv_first_hv_loss_level)) {
> + if (sprs_saved)
> + atomic_stop_thread_idle();
> + goto out;
> + }
> +
> + if (request_opal_call) {
> + opal_cpuidle_restore(psscr, srr1);
> + goto opal_return;
> + }
> +
> + /* HV state loss */
> + BUG_ON(!sprs_saved);
> +
> if ((srr1 & SRR1_WAKESTATE) != SRR1_WS_NOLOSS) {
> unsigned long mmcra;
>
> @@ -712,19 +736,6 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
> if (unlikely((srr1 & SRR1_WAKEMASK_P8) == SRR1_WAKEHMI))
> hmi_exception_realmode(NULL);
>
> - /*
> - * On POWER9, SRR1 bits do not match exactly as expected.
> - * SRR1_WS_GPRLOSS (10b) can also result in SPR loss, so
> - * always test PSSCR if there is any state loss.
> - */
> - if (likely((psscr & PSSCR_RL_MASK) < pnv_first_hv_loss_level)) {
> - if (sprs_saved)
> - atomic_stop_thread_idle();
> - goto out;
> - }
> -
> - /* HV state loss */
> - BUG_ON(!sprs_saved);
>
> atomic_lock_thread_idle();
>
> @@ -771,6 +782,7 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
>
> mtspr(SPRN_SPRG3, local_paca->sprg_vdso);
>
> +opal_return:
> if (!radix_enabled())
> __slb_restore_bolted_realmode();
>
> @@ -1284,6 +1296,7 @@ static int pnv_parse_cpuidle_dt(void)
> u32 *temp_u32;
> u64 *temp_u64;
> const char **temp_string;
> + bool fall_back_to_opal = false;
>
> np = of_find_node_by_path("/ibm,opal/power-mgt");
> if (!np) {
> @@ -1396,23 +1409,48 @@ static int pnv_parse_cpuidle_dt(void)
> /* Parse each child node with appropriate parser_fn */
> for_each_child_of_node(np1, dt_node) {
> bool found_known_version = false;
> - /* we don't have state falling back to opal*/
> - for (i = 0; i < nr_known_versions ; i++) {
> - if (of_device_is_compatible(dt_node, known_versions[i].name)) {
> - rc = known_versions[i].parser_fn(dt_node);
> + if (!fall_back_to_opal) {
> + /* we don't have state falling back to opal*/
> + for (i = 0; i < nr_known_versions ; i++) {
> + if (of_device_is_compatible(dt_node, known_versions[i].name)) {
> + rc = known_versions[i].parser_fn(dt_node);
> + if (rc) {
> + pr_err("%s could not parse\n", known_versions[i].name);
> + continue;
> + }
> + found_known_version = true;
> + }
> + }
> + }
> +
> + /*
> + * If any previous state falls back to opal_call,
> + * then all further states will either call opal_call
> + * or not be included for cpuidle/cpuoffline.
> + *
> + * Moreover, having any intermediate state with no
> + * kernel support or opal support can be potentially
> + * dangerous, as hardware can potentially wake up from
> + * that state. Hence, no further states are added to
> + * cpuidle/cpuoffline.
> + */
> + if (!found_known_version || fall_back_to_opal) {
> + if (of_device_is_compatible(dt_node, "opal-support")) {
> + rc = known_versions[0].parser_fn(dt_node);
> if (rc) {
> - pr_err("%s could not parse\n", known_versions[i].name);
> + pr_err("%s could not parse\n", "opal-support");
> continue;
> }
> - found_known_version = true;
> + fall_back_to_opal = true;
> + } else {
> + pr_info("Unsupported state, skipping all further state\n");
> + goto out;
> }
> }
> - if (!found_known_version) {
> - pr_info("Unsupported state, skipping all further state\n");
> - goto out;
> - }
> nr_pnv_idle_states++;
> }
> + if (fall_back_to_opal)
> + request_opal_call = true;
> out:
> kfree(temp_u32);
> kfree(temp_u64);
> diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S b/arch/powerpc/platforms/powernv/opal-wrappers.S
> index 251528231a9e..7a039a81a67e 100644
> --- a/arch/powerpc/platforms/powernv/opal-wrappers.S
> +++ b/arch/powerpc/platforms/powernv/opal-wrappers.S
> @@ -331,3 +331,5 @@ OPAL_CALL(opal_pci_set_pbcq_tunnel_bar, OPAL_PCI_SET_PBCQ_TUNNEL_BAR);
> OPAL_CALL(opal_sensor_read_u64, OPAL_SENSOR_READ_U64);
> OPAL_CALL(opal_sensor_group_enable, OPAL_SENSOR_GROUP_ENABLE);
> OPAL_CALL(opal_nx_coproc_init, OPAL_NX_COPROC_INIT);
> +OPAL_CALL(opal_cpuidle_save, OPAL_IDLE_SAVE);
> +OPAL_CALL(opal_cpuidle_restore, OPAL_IDLE_RESTORE);
>
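
The fallback rule described in the pnv_parse_cpuidle_dt() comment above
can be modelled outside the kernel. What follows is only a minimal
sketch, not the patch's code: struct idle_state, its two fields and
count_usable_states() are hypothetical names, each device-tree child
node is reduced to two booleans, and the parser_fn failure path
("continue") is left out. Note one deliberate difference: in the quoted
hunk the early "goto out" also jumps past the request_opal_call
assignment, whereas this sketch always reports whether an accepted
state fell back to OPAL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one child node of ibm,idle-states. */
struct idle_state {
	bool kernel_known;	/* compatible matches a version the kernel parses */
	bool opal_support;	/* compatible list also contains "opal-support" */
};

/*
 * Walk the states in device-tree order and return how many would be
 * enabled. Once one state has fallen back to OPAL, every later state
 * must also offer "opal-support"; the first state that cannot be
 * handled either way ends the walk.
 */
static size_t count_usable_states(const struct idle_state *states, size_t n,
				  bool *uses_opal)
{
	bool fall_back_to_opal = false;
	size_t nr = 0;

	for (size_t i = 0; i < n; i++) {
		bool found_known_version = false;

		/* Kernel-side handling is only tried before any fallback. */
		if (!fall_back_to_opal)
			found_known_version = states[i].kernel_known;

		if (!found_known_version || fall_back_to_opal) {
			if (!states[i].opal_support)
				break;	/* unsupported: skip all further states */
			fall_back_to_opal = true;
		}
		nr++;
	}

	*uses_opal = fall_back_to_opal;
	return nr;
}
```

With states listed as (kernel_known, opal_support), the sequence
(1,0) (0,1) (1,1) (0,0) (1,0) enables the first three states and
reports OPAL fallback: the third state is routed through OPAL even
though the kernel knows its version, and the fourth state ends the walk.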
Thread overview: 4+ messages (newest: 2018-10-11 19:56 UTC)
[not found] <20181011132237.14604-1-akshay.adiga@linux.vnet.ibm.com>
2018-10-11 19:55 ` [RFC PATCH v2 0/3] New device-tree format and Opal based idle save-restore Frank Rowand
[not found] ` <20181011132237.14604-2-akshay.adiga@linux.vnet.ibm.com>
2018-10-11 19:55 ` [RFC PATCH v2 1/3] cpuidle/powernv: Add support for states with ibm,cpuidle-state-v1 Frank Rowand
[not found] ` <20181011132237.14604-3-akshay.adiga@linux.vnet.ibm.com>
2018-10-11 19:56 ` [RFC PATCH v2 2/3] powernv/cpuidle: Pass pointers instead of values to stop loop Frank Rowand
[not found] ` <20181011132237.14604-4-akshay.adiga@linux.vnet.ibm.com>
2018-10-11 19:56 ` [RFC PATCH v2 3/3] cpuidle/powernv: save-restore sprs in opal Frank Rowand