devicetree.vger.kernel.org archive mirror
* Re: [RFC PATCH v2 0/3] New device-tree format and Opal based idle save-restore
       [not found] <20181011132237.14604-1-akshay.adiga@linux.vnet.ibm.com>
@ 2018-10-11 19:55 ` Frank Rowand
       [not found] ` <20181011132237.14604-2-akshay.adiga@linux.vnet.ibm.com>
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 4+ messages in thread
From: Frank Rowand @ 2018-10-11 19:55 UTC (permalink / raw)
  To: Akshay Adiga, linux-kernel, linuxppc-dev,
	devicetree@vger.kernel.org
  Cc: huntbag, npiggin, benh, mpe, ego

+ devicetree mail list

On 10/11/18 06:22, Akshay Adiga wrote:
> Previously, if an older kernel ran on newer firmware, it could enable
> all available states irrespective of its ability to handle them.
> The new device tree format adds a compatible string, so that only a
> kernel that can handle a given version of a stop state will enable
> it.
> 
> Older kernels will still see stop0 and stop0_lite in the older format,
> and we will deprecate that format after some time.
> 
> 1) The idea is to bump up the version string in firmware if we find a bug
> or regression in stop states. A fix will be provided in Linux, which
> would then know about the bumped-up version of stop states, whereas
> kernels without the fix would ignore the states.
> 
> 2) Slowly deprecate the cpuidle/cpuhotplug threshold which is hard-coded
> into the cpuidle-powernv driver. Instead, use the "type" property to
> indicate whether an idle state is suitable for cpuidle or hotplug.
> 
> New idle state device tree format:
>        power-mgt {
>                ...
>                ibm,enabled-stop-levels = <0xec000000>;
>                ibm,cpu-idle-state-psscr-mask = <0x0 0x3003ff 0x0 0x3003ff>;
>                ibm,cpu-idle-state-latencies-ns = <0x3e8 0x7d0>;
>                ibm,cpu-idle-state-psscr = <0x0 0x330 0x0 0x300330>;
>                ibm,cpu-idle-state-flags = <0x100000 0x101000>;
>                ibm,cpu-idle-state-residency-ns = <0x2710 0x4e20>;
>                ibm,idle-states {
>                        stop4 {
>                                flags = <0x207000>;
>                                compatible = "ibm,state-v1",
>                                             "opal-supported";
>                                type = "cpuidle";
>                                psscr-mask = <0x0 0x3003ff>;
>                                handle = <0x102>;
>                                latency-ns = <0x186a0>;
>                                residency-ns = <0x989680>;
>                                psscr = <0x0 0x300374>;
>                        };
>                        ...
>                        stop11 {
>                                ...
>                                compatible = "ibm,state-v1",
>                                             "opal-supported";
>                                type = "cpuoffline";
>                                ...
>                        };
>                };
>        };
> 
> High-level parsing algorithm :
> 
> Say the known version string is "ibm,state-v1".
> 
> for each stop state node in device tree:
> 	if (compatible has known version string)
> 		kernel takes care of stop-transitions
> 	else if (compatible has "opal-supported")
> 		OPAL takes care of stop-transitions
> 	else
> 		skip this state and all deeper states
> 
> When a state has neither version support nor OPAL support, all deeper
> states are skipped as well, because a CPU that enters a deeper state
> may still exit from a shallower state.
> 
> OPAL support for idle states
> ----------------------------
> 
> With this patch series, all the states that lose hypervisor state
> will be handled through opal_call.
> 
> Patch 3 adds support for saving/restoring SPRs and resyncing the
> timebase in OPAL. All the decision making, such as identifying the
> first thread in the core and taking locks before restoring, is also
> implemented in OPAL.
> 
> How does it work?
> -----------------
> 
> Consider a case where stop4 has a bug. We take the following steps to
> mitigate the problem.
> 
> 1) Change the compatible string for stop4 in OPAL to "ibm,state-v2" and
> remove "opal-supported". Ship the new firmware.
> The kernel ignores stop4 and all deeper states, but the shallower states
> remain available. This avoids disabling stop states completely.
> 
> 2) Implement a workaround in OPAL and add "opal-supported". Ship the new
> firmware.
> The kernel uses OPAL for stop transitions, which has the workaround
> implemented. We get stop4 and deeper states working without kernel
> changes and backports (and in considerably less time).
> 
> 3) Implement the workaround in the kernel and add "ibm,state-v2" to the
> known versions.
> The kernel will now be able to handle stop4 and deeper states.
> 
> Changes from v1 :
>  - Code is rebased on Nick Piggin's v4 patch "powerpc/64s: reimplement book3s
>    idle code in C"
> 	http://patchwork.ozlabs.org/patch/969596/
>  - All the states that lose hypervisor state will be handled by OPAL
>  - All the decision making, such as identifying the first thread in
>    the core and taking locks before restoring, has also been
>    moved to OPAL
> 
> 
> Abhishek Goel (1):
>   cpuidle/powernv: save-restore sprs in opal
> 
> Akshay Adiga (2):
>   cpuidle/powernv: Add support for states with ibm,cpuidle-state-v1
>   powernv/cpuidle: Pass pointers instead of values to stop loop
> 
>  arch/powerpc/include/asm/cpuidle.h            |   9 +
>  arch/powerpc/include/asm/opal-api.h           |   4 +-
>  arch/powerpc/include/asm/opal.h               |   3 +
>  arch/powerpc/include/asm/processor.h          |   8 +-
>  arch/powerpc/kernel/idle_book3s.S             |   6 +-
>  arch/powerpc/platforms/powernv/idle.c         | 247 ++++++++++++++----
>  .../powerpc/platforms/powernv/opal-wrappers.S |   2 +
>  drivers/cpuidle/cpuidle-powernv.c             |  46 ++--
>  8 files changed, 251 insertions(+), 74 deletions(-)
> 
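
The high-level parsing algorithm quoted above can be sketched in plain
user-space C. This is only an illustration under stated assumptions, not
the code from the series: struct stop_node, classify() and
select_states() are hypothetical names, the nodes are assumed to be
ordered from shallowest to deepest, and the fallback string is spelled
"opal-supported" as in the patch 1 description.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum stop_handler { HANDLED_BY_KERNEL, HANDLED_BY_OPAL, NOT_HANDLED };

/* Hypothetical stand-in for one child node of ibm,idle-states. */
struct stop_node {
	const char *name;
	const char *compatible[4];	/* unused slots are NULL */
};

/* Version strings this (imaginary) kernel knows how to handle. */
static const char *const known_versions[] = { "ibm,state-v1" };

static int has_compatible(const struct stop_node *node, const char *str)
{
	for (size_t i = 0; i < 4 && node->compatible[i]; i++)
		if (strcmp(node->compatible[i], str) == 0)
			return 1;
	return 0;
}

/* Decide who performs the stop transition for a single state. */
static enum stop_handler classify(const struct stop_node *node)
{
	size_t nr = sizeof(known_versions) / sizeof(known_versions[0]);

	assert(node != NULL);
	for (size_t i = 0; i < nr; i++)
		if (has_compatible(node, known_versions[i]))
			return HANDLED_BY_KERNEL;
	if (has_compatible(node, "opal-supported"))
		return HANDLED_BY_OPAL;
	return NOT_HANDLED;
}

/*
 * Walk the states from shallowest to deepest and stop at the first one
 * that nobody can handle, so that all deeper states are skipped too.
 * Stores one handler per usable state in out[] and returns the count.
 */
static size_t select_states(const struct stop_node *nodes, size_t n,
			    enum stop_handler *out)
{
	size_t used = 0;

	for (size_t i = 0; i < n; i++) {
		enum stop_handler h = classify(&nodes[i]);

		if (h == NOT_HANDLED)
			break;
		out[used++] = h;
	}
	return used;
}
```

With a state list where stop4 carries "ibm,state-v2" plus
"opal-supported", the sketch hands stop4 to OPAL; once a state carries
neither string, that state and everything deeper is dropped.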


* Re: [RFC PATCH v2 1/3] cpuidle/powernv: Add support for states with ibm,cpuidle-state-v1
       [not found] ` <20181011132237.14604-2-akshay.adiga@linux.vnet.ibm.com>
@ 2018-10-11 19:55   ` Frank Rowand
  0 siblings, 0 replies; 4+ messages in thread
From: Frank Rowand @ 2018-10-11 19:55 UTC (permalink / raw)
  To: Akshay Adiga, linux-kernel, linuxppc-dev,
	devicetree@vger.kernel.org
  Cc: huntbag, npiggin, benh, mpe, ego

+ devicetree mail list

On 10/11/18 06:22, Akshay Adiga wrote:
> This patch adds support for the new device-tree format for idle state
> description.
> 
> Previously, if an older kernel ran on newer firmware, it could enable
> all available states irrespective of its ability to handle them.
> The new device tree format adds a compatible string, so that only a
> kernel that can handle a given version of a stop state will enable
> it.
> 
> Older kernels will still see stop0 and stop0_lite in the older format,
> and we will deprecate that format after some time.
> 
> 1) The idea is to bump up the version in firmware if we find a bug or
> regression in stop states. A fix will be provided in Linux, which
> would then know about the bumped-up version of stop states, whereas
> kernels without the fix would ignore the states.
> 
> 2) Slowly deprecate the cpuidle/cpuhotplug threshold which is hard-coded
> into the cpuidle-powernv driver. Instead, use the "type" property to
> indicate whether an idle state is suitable for cpuidle or hotplug.
> 
> New idle state device tree format:
>        power-mgt {
>                ...
>                ibm,enabled-stop-levels = <0xec000000>;
>                ibm,cpu-idle-state-psscr-mask = <0x0 0x3003ff 0x0 0x3003ff>;
>                ibm,cpu-idle-state-latencies-ns = <0x3e8 0x7d0>;
>                ibm,cpu-idle-state-psscr = <0x0 0x330 0x0 0x300330>;
>                ibm,cpu-idle-state-flags = <0x100000 0x101000>;
>                ibm,cpu-idle-state-residency-ns = <0x2710 0x4e20>;
>                ibm,idle-states {
>                        stop4 {
>                                flags = <0x207000>;
>                                compatible = "ibm,state-v1",
>                                             "opal-supported";
>                                type = "cpuidle";
>                                psscr-mask = <0x0 0x3003ff>;
>                                handle = <0x102>;
>                                latency-ns = <0x186a0>;
>                                residency-ns = <0x989680>;
>                                psscr = <0x0 0x300374>;
>                        };
>                        ...
>                        stop11 {
>                                ...
>                                compatible = "ibm,state-v1",
>                                             "opal-supported";
>                                type = "cpuoffline";
>                                ...
>                        };
>                };
>        };
> type strings :
> "cpuidle" : indicates it should be used by cpuidle-driver
> "cpuoffline" : indicates it should be used by hotplug driver
> 
> compatible strings :
> "ibm,state-v1" : kernel checks if it knows about this version
> "opal-supported" : indicates kernel can fall back to use opal
> 		   for stop-transitions
> 
> Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
> ---
> 
> Changes from v1 :
>  - Code is rebased on Nick Piggin's v4 patch "powerpc/64s: reimplement book3s
>    idle code in C"
>  - Moved "cpuidle" and "cpuoffline" into a separate property called
>    "type"
>  
> 
>  arch/powerpc/include/asm/cpuidle.h    |   9 ++
>  arch/powerpc/platforms/powernv/idle.c | 132 +++++++++++++++++++++++++-
>  drivers/cpuidle/cpuidle-powernv.c     |  31 ++++--
>  3 files changed, 160 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/cpuidle.h b/arch/powerpc/include/asm/cpuidle.h
> index 9844b3ded187..e920a15e797f 100644
> --- a/arch/powerpc/include/asm/cpuidle.h
> +++ b/arch/powerpc/include/asm/cpuidle.h
> @@ -70,14 +70,23 @@
>  
>  #ifndef __ASSEMBLY__
>  
> +enum idle_state_type_t {
> +	CPUIDLE_TYPE,
> +	CPUOFFLINE_TYPE
> +};
> +
> +#define POWERNV_THRESHOLD_LATENCY_NS 200000
> +#define PNV_VER_NAME_LEN    32
>  #define PNV_IDLE_NAME_LEN    16
>  struct pnv_idle_states_t {
>  	char name[PNV_IDLE_NAME_LEN];
> +	char version[PNV_VER_NAME_LEN];
>  	u32 latency_ns;
>  	u32 residency_ns;
>  	u64 psscr_val;
>  	u64 psscr_mask;
>  	u32 flags;
> +	enum idle_state_type_t type;
>  	bool valid;
>  };
>  
> diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
> index 96186af9e953..755918402591 100644
> --- a/arch/powerpc/platforms/powernv/idle.c
> +++ b/arch/powerpc/platforms/powernv/idle.c
> @@ -54,6 +54,20 @@ static bool default_stop_found;
>  static u64 pnv_first_tb_loss_level = MAX_STOP_STATE + 1;
>  static u64 pnv_first_hv_loss_level = MAX_STOP_STATE + 1;
>  
> +
> +static int parse_dt_v1(struct device_node *np);
> +struct stop_version_t {
> +	const char name[PNV_VER_NAME_LEN];
> +	int (*parser_fn)(struct device_node *np);
> +};
> +struct stop_version_t known_versions[] = {
> +		{
> +			.name =  "ibm,state-v1",
> +			.parser_fn = parse_dt_v1,
> +		}
> +	};
> +const int nr_known_versions = 1;
> +
>  /*
>   * psscr value and mask of the deepest stop idle state.
>   * Used when a cpu is offlined.
> @@ -1195,6 +1209,77 @@ static void __init pnv_probe_idle_states(void)
>  		supported_cpuidle_states |= pnv_idle_states[i].flags;
>  }
>  
> +static int parse_dt_v1(struct device_node *dt_node)
> +{
> +	const char *temp_str;
> +	int rc;
> +	int i = nr_pnv_idle_states;
> +
> +	if (!dt_node) {
> +		pr_err("Invalid device_node\n");
> +		return -EINVAL;
> +	}
> +
> +	rc = of_property_read_string(dt_node, "name", &temp_str);
> +	if (rc) {
> +		pr_err("error reading names rc= %d\n", rc);
> +		return -EINVAL;
> +	}
> +	strlcpy(pnv_idle_states[i].name, temp_str, PNV_IDLE_NAME_LEN);
> +	rc = of_property_read_u32(dt_node, "residency-ns",
> +				  &pnv_idle_states[i].residency_ns);
> +	if (rc) {
> +		pr_err("error reading residency rc= %d\n", rc);
> +		return -EINVAL;
> +	}
> +	rc = of_property_read_u32(dt_node, "latency-ns",
> +				  &pnv_idle_states[i].latency_ns);
> +	if (rc) {
> +		pr_err("error reading latency rc= %d\n", rc);
> +		return -EINVAL;
> +	}
> +	rc = of_property_read_u32(dt_node, "flags",
> +				  &pnv_idle_states[i].flags);
> +	if (rc) {
> +		pr_err("error reading flags rc= %d\n", rc);
> +		return -EINVAL;
> +	}
> +
> +	/* We are not expecting power8 device-tree in this format */
> +	rc = of_property_read_u64(dt_node, "psscr-mask",
> +				  &pnv_idle_states[i].psscr_mask);
> +	if (rc) {
> +		pr_err("error reading psscr-mask rc= %d\n", rc);
> +		return -EINVAL;
> +	}
> +	rc = of_property_read_u64(dt_node, "psscr",
> +				  &pnv_idle_states[i].psscr_val);
> +	if (rc) {
> +		pr_err("error reading psscr rc= %d\n", rc);
> +		return -EINVAL;
> +	}
> +
> +	/*
> +	 * TODO : save the version strings in data structure
> +	 */
> +	rc = of_property_read_string(dt_node, "type", &temp_str);
> +	if (rc) {
> +		pr_err("error reading type rc= %d\n", rc);
> +		return -EINVAL;
> +	}
> +	pr_info("type = %s\n", temp_str);
> +	if (strcmp(temp_str, "cpuidle") == 0)
> +		pnv_idle_states[i].type = CPUIDLE_TYPE;
> +	else if (strcmp(temp_str, "cpuoffline") == 0)
> +		pnv_idle_states[i].type = CPUOFFLINE_TYPE;
> +	else {
> +		pr_err("Invalid type skipping %s\n",
> +					pnv_idle_states[i].name);
> +		return -EINVAL;
> +	}
> +	return 0;
> +
> +}
>  /*
>   * This function parses device-tree and populates all the information
>   * into pnv_idle_states structure. It also sets up nr_pnv_idle_states
> @@ -1203,8 +1288,9 @@ static void __init pnv_probe_idle_states(void)
>  
>  static int pnv_parse_cpuidle_dt(void)
>  {
> -	struct device_node *np;
> +	struct device_node *np, *np1, *dt_node;
>  	int nr_idle_states, i;
> +	int additional_states = 0;
>  	int rc = 0;
>  	u32 *temp_u32;
>  	u64 *temp_u64;
> @@ -1218,8 +1304,14 @@ static int pnv_parse_cpuidle_dt(void)
>  	nr_idle_states = of_property_count_u32_elems(np,
>  						"ibm,cpu-idle-state-flags");
>  
> -	pnv_idle_states = kcalloc(nr_idle_states, sizeof(*pnv_idle_states),
> -				  GFP_KERNEL);
> +	np1 = of_find_node_by_path("/ibm,opal/power-mgt/ibm,idle-states");
> +	if (np1) {
> +		for_each_child_of_node(np1, dt_node)
> +			additional_states++;
> +	}
> +	pr_info("states in new format : %d\n", additional_states);
> +	pnv_idle_states = kcalloc(nr_idle_states + additional_states,
> +				  sizeof(*pnv_idle_states), GFP_KERNEL);
>  	temp_u32 = kcalloc(nr_idle_states, sizeof(u32),  GFP_KERNEL);
>  	temp_u64 = kcalloc(nr_idle_states, sizeof(u64),  GFP_KERNEL);
>  	temp_string = kcalloc(nr_idle_states, sizeof(char *),  GFP_KERNEL);
> @@ -1298,8 +1390,40 @@ static int pnv_parse_cpuidle_dt(void)
>  	for (i = 0; i < nr_idle_states; i++)
>  		strlcpy(pnv_idle_states[i].name, temp_string[i],
>  			PNV_IDLE_NAME_LEN);
> +
> +	/* Mark states as CPUIDLE_TYPE /CPUOFFLINE for older version*/
> +	for (i = 0; i < nr_idle_states; i++) {
> +		if (pnv_idle_states[i].latency_ns > POWERNV_THRESHOLD_LATENCY_NS)
> +			pnv_idle_states[i].type  = CPUOFFLINE_TYPE;
> +		else
> +			pnv_idle_states[i].type  = CPUIDLE_TYPE;
> +	}
>  	nr_pnv_idle_states = nr_idle_states;
> -	rc = 0;
> +	/* Parsing node-based idle states device-tree format */
> +	if (!np1) {
> +		pr_info("dt does not contain ibm,idle-states\n");
> +		goto out;
> +	}
> +	/* Parse each child node with appropriate parser_fn */
> +	for_each_child_of_node(np1, dt_node) {
> +		bool found_known_version = false;
> +		/* we don't have state falling back to opal*/
> +		for (i = 0; i < nr_known_versions ; i++) {
> +			if (of_device_is_compatible(dt_node, known_versions[i].name)) {
> +				rc = known_versions[i].parser_fn(dt_node);
> +				if (rc) {
> +					pr_err("%s could not parse\n", known_versions[i].name);
> +					continue;
> +				}
> +				found_known_version = true;
> +			}
> +		}
> +		if (!found_known_version) {
> +			pr_info("Unsupported state, skipping all further state\n");
> +			goto out;
> +		}
> +		nr_pnv_idle_states++;
> +	}
>  out:
>  	kfree(temp_u32);
>  	kfree(temp_u64);
> diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
> index 84b1ebe212b3..a15514ebd1c3 100644
> --- a/drivers/cpuidle/cpuidle-powernv.c
> +++ b/drivers/cpuidle/cpuidle-powernv.c
> @@ -26,7 +26,6 @@
>   * Expose only those Hardware idle states via the cpuidle framework
>   * that have latency value below POWERNV_THRESHOLD_LATENCY_NS.
>   */
> -#define POWERNV_THRESHOLD_LATENCY_NS 200000
>  
>  static struct cpuidle_driver powernv_idle_driver = {
>  	.name             = "powernv_idle",
> @@ -265,7 +264,7 @@ extern u32 pnv_get_supported_cpuidle_states(void);
>  static int powernv_add_idle_states(void)
>  {
>  	int nr_idle_states = 1; /* Snooze */
> -	int dt_idle_states;
> +	int dt_idle_states = 0;
>  	u32 has_stop_states = 0;
>  	int i;
>  	u32 supported_flags = pnv_get_supported_cpuidle_states();
> @@ -277,14 +276,19 @@ static int powernv_add_idle_states(void)
>  		goto out;
>  	}
>  
> -	/* TODO: Count only states which are eligible for cpuidle */
> -	dt_idle_states = nr_pnv_idle_states;
> +	/* Count only cpuidle states*/
> +	for (i = 0; i < nr_pnv_idle_states; i++) {
> +		if (pnv_idle_states[i].type == CPUIDLE_TYPE)
> +			dt_idle_states++;
> +	}
> +	pr_info("idle states in dt = %d , states with idle flag = %d",
> +					nr_pnv_idle_states, dt_idle_states);
>  
>  	/*
>  	 * Since snooze is used as first idle state, max idle states allowed is
>  	 * CPUIDLE_STATE_MAX -1
>  	 */
> -	if (nr_pnv_idle_states > CPUIDLE_STATE_MAX - 1) {
> +	if (dt_idle_states > CPUIDLE_STATE_MAX - 1) {
>  		pr_warn("cpuidle-powernv: discovered idle states more than allowed");
>  		dt_idle_states = CPUIDLE_STATE_MAX - 1;
>  	}
> @@ -305,8 +309,15 @@ static int powernv_add_idle_states(void)
>  		 * Skip the platform idle state whose flag isn't in
>  		 * the supported_cpuidle_states flag mask.
>  		 */
> -		if ((state->flags & supported_flags) != state->flags)
> +		if ((state->flags & supported_flags) != state->flags) {
> +			pr_warn("State %d does not have supported flag\n", i);
> +			continue;
> +		}
> +		if (state->type != CPUIDLE_TYPE) {
> +			pr_info("State %d is not of cpuidle type, it is of type %d\n", i,
> +								state->type);
>  			continue;
> +		}
>  		/*
>  		 * If an idle state has exit latency beyond
>  		 * POWERNV_THRESHOLD_LATENCY_NS then don't use it
> @@ -321,8 +332,10 @@ static int powernv_add_idle_states(void)
>  		exit_latency = DIV_ROUND_UP(state->latency_ns, 1000);
>  		target_residency = DIV_ROUND_UP(state->residency_ns, 1000);
>  
> -		if (has_stop_states && !(state->valid))
> +		if (has_stop_states && !(state->valid)) {
> +			pr_warn("State %d is invalid\n", i);
>  				continue;
> +		}
>  
>  		if (state->flags & OPAL_PM_TIMEBASE_STOP)
>  			stops_timebase = true;
> @@ -360,8 +373,10 @@ static int powernv_add_idle_states(void)
>  					  state->psscr_mask);
>  		}
>  #endif
> -		else
> +		else {
> +			pr_warn("cpuidle-powernv : could not add state\n");
>  			continue;
> +		}
>  		nr_idle_states++;
>  	}
>  out:
> 
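
The way a state ends up as cpuidle or cpuoffline in the patch quoted
above can be condensed into a small user-space C sketch. It mirrors the
two paths in the diff (the "type" string for node-based states, the
latency threshold for states in the older format) and the ns-to-us
rounding done with DIV_ROUND_UP. The helper names type_from_string(),
type_from_latency() and ns_to_us_round_up() are hypothetical and exist
only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define POWERNV_THRESHOLD_LATENCY_NS 200000

enum idle_state_type_t {
	CPUIDLE_TYPE,
	CPUOFFLINE_TYPE
};

/*
 * New format: map the "type" property string to the enum.
 * Returns 0 on success, -1 if the string is not recognised.
 */
static int type_from_string(const char *str, enum idle_state_type_t *type)
{
	assert(str != NULL && type != NULL);
	if (strcmp(str, "cpuidle") == 0)
		*type = CPUIDLE_TYPE;
	else if (strcmp(str, "cpuoffline") == 0)
		*type = CPUOFFLINE_TYPE;
	else
		return -1;
	return 0;
}

/* Older format: fall back to the hard-coded latency threshold. */
static enum idle_state_type_t type_from_latency(uint32_t latency_ns)
{
	return latency_ns > POWERNV_THRESHOLD_LATENCY_NS ?
		CPUOFFLINE_TYPE : CPUIDLE_TYPE;
}

/* Nanoseconds to microseconds, rounded up like DIV_ROUND_UP(ns, 1000). */
static uint32_t ns_to_us_round_up(uint32_t ns)
{
	return (uint32_t)(((uint64_t)ns + 999) / 1000);
}
```

For the stop4 node in the example, latency-ns = <0x186a0> is 100000 ns,
which rounds to 100 us, and residency-ns = <0x989680> is 10000000 ns,
which rounds to 10000 us.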


* Re: [RFC PATCH v2 2/3] powernv/cpuidle: Pass pointers instead of values to stop loop
       [not found] ` <20181011132237.14604-3-akshay.adiga@linux.vnet.ibm.com>
@ 2018-10-11 19:56   ` Frank Rowand
  0 siblings, 0 replies; 4+ messages in thread
From: Frank Rowand @ 2018-10-11 19:56 UTC (permalink / raw)
  To: Akshay Adiga, linux-kernel, linuxppc-dev,
	devicetree@vger.kernel.org
  Cc: huntbag, npiggin, benh, mpe, ego

+ devicetree mail list

On 10/11/18 06:22, Akshay Adiga wrote:
> Pass a pointer to the pnv_idle_states_t entry instead of the psscr value
> and mask. This lets us pass more information to the stop loop, which will
> help it figure out which method to use to enter/exit the idle state.
> 
> Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
> 
> ---
> Changes from v1 :
>  - Code is rebased on Nick Piggin's v4 patch "powerpc/64s: reimplement book3s
>    idle code in C"
> 
>  arch/powerpc/include/asm/processor.h  |  5 ++-
>  arch/powerpc/platforms/powernv/idle.c | 47 ++++++++++-----------------
>  drivers/cpuidle/cpuidle-powernv.c     | 15 +++------
>  3 files changed, 24 insertions(+), 43 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
> index 936795acba48..822d3236ad7f 100644
> --- a/arch/powerpc/include/asm/processor.h
> +++ b/arch/powerpc/include/asm/processor.h
> @@ -43,6 +43,7 @@
>  #include <asm/thread_info.h>
>  #include <asm/ptrace.h>
>  #include <asm/hw_breakpoint.h>
> +#include <asm/cpuidle.h>
>  
>  /* We do _not_ want to define new machine types at all, those must die
>   * in favor of using the device-tree
> @@ -518,9 +519,7 @@ enum idle_boot_override {IDLE_NO_OVERRIDE = 0, IDLE_POWERSAVE_OFF};
>  extern int powersave_nap;	/* set if nap mode can be used in idle loop */
>  
>  extern void power7_idle_type(unsigned long type);
> -extern void power9_idle_type(unsigned long stop_psscr_val,
> -			      unsigned long stop_psscr_mask);
> -
> +extern void power9_idle_type(struct pnv_idle_states_t *state);
>  extern void flush_instruction_cache(void);
>  extern void hard_reset_now(void);
>  extern void poweroff_now(void);
> diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
> index 755918402591..681a23a066bb 100644
> --- a/arch/powerpc/platforms/powernv/idle.c
> +++ b/arch/powerpc/platforms/powernv/idle.c
> @@ -44,8 +44,7 @@ int nr_pnv_idle_states;
>   * The default stop state that will be used by ppc_md.power_save
>   * function on platforms that support stop instruction.
>   */
> -static u64 pnv_default_stop_val;
> -static u64 pnv_default_stop_mask;
> +struct pnv_idle_states_t *pnv_default_state;
>  static bool default_stop_found;
>  
>  /*
> @@ -72,9 +71,7 @@ const int nr_known_versions = 1;
>   * psscr value and mask of the deepest stop idle state.
>   * Used when a cpu is offlined.
>   */
> -static u64 pnv_deepest_stop_psscr_val;
> -static u64 pnv_deepest_stop_psscr_mask;
> -static u64 pnv_deepest_stop_flag;
> +static struct pnv_idle_states_t *pnv_deepest_state;
>  static bool deepest_stop_found;
>  
>  static unsigned long power7_offline_type;
> @@ -96,7 +93,7 @@ static int pnv_save_sprs_for_deep_states(void)
>  	uint64_t hid5_val	= mfspr(SPRN_HID5);
>  	uint64_t hmeer_val	= mfspr(SPRN_HMEER);
>  	uint64_t msr_val = MSR_IDLE;
> -	uint64_t psscr_val = pnv_deepest_stop_psscr_val;
> +	uint64_t psscr_val = pnv_deepest_state->psscr_val;
>  
>  	for_each_present_cpu(cpu) {
>  		uint64_t pir = get_hard_smp_processor_id(cpu);
> @@ -820,17 +817,15 @@ static unsigned long power9_offline_stop(unsigned long psscr)
>  	return srr1;
>  }
>  
> -static unsigned long __power9_idle_type(unsigned long stop_psscr_val,
> -				      unsigned long stop_psscr_mask)
> +static unsigned long __power9_idle_type(struct pnv_idle_states_t *state)
>  {
>  	unsigned long psscr;
>  	unsigned long srr1;
>  
>  	if (!prep_irq_for_idle_irqsoff())
>  		return 0;
> -
>  	psscr = mfspr(SPRN_PSSCR);
> -	psscr = (psscr & ~stop_psscr_mask) | stop_psscr_val;
> +	psscr = (psscr & ~state->psscr_mask) | state->psscr_val;
>  
>  	__ppc64_runlatch_off();
>  	srr1 = power9_idle_stop(psscr, true);
> @@ -841,12 +836,10 @@ static unsigned long __power9_idle_type(unsigned long stop_psscr_val,
>  	return srr1;
>  }
>  
> -void power9_idle_type(unsigned long stop_psscr_val,
> -				      unsigned long stop_psscr_mask)
> +void power9_idle_type(struct pnv_idle_states_t *state)
>  {
>  	unsigned long srr1;
> -
> -	srr1 = __power9_idle_type(stop_psscr_val, stop_psscr_mask);
> +	srr1 = __power9_idle_type(state);
>  	irq_set_pending_from_srr1(srr1);
>  }
>  
> @@ -855,7 +848,7 @@ void power9_idle_type(unsigned long stop_psscr_val,
>   */
>  void power9_idle(void)
>  {
> -	power9_idle_type(pnv_default_stop_val, pnv_default_stop_mask);
> +	power9_idle_type(pnv_default_state);
>  }
>  
>  #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
> @@ -974,8 +967,8 @@ unsigned long pnv_cpu_offline(unsigned int cpu)
>  		unsigned long psscr;
>  
>  		psscr = mfspr(SPRN_PSSCR);
> -		psscr = (psscr & ~pnv_deepest_stop_psscr_mask) |
> -						pnv_deepest_stop_psscr_val;
> +		psscr = (psscr & ~pnv_deepest_state->psscr_mask) |
> +						pnv_deepest_state->psscr_val;
>  		srr1 = power9_offline_stop(psscr);
>  	} else if (cpu_has_feature(CPU_FTR_ARCH_206) && power7_offline_type) {
>  		srr1 = power7_offline();
> @@ -1123,16 +1116,13 @@ static void __init pnv_power9_idle_init(void)
>  
>  		if (max_residency_ns < state->residency_ns) {
>  			max_residency_ns = state->residency_ns;
> -			pnv_deepest_stop_psscr_val = state->psscr_val;
> -			pnv_deepest_stop_psscr_mask = state->psscr_mask;
> -			pnv_deepest_stop_flag = state->flags;
> +			pnv_deepest_state = state;
>  			deepest_stop_found = true;
>  		}
>  
>  		if (!default_stop_found &&
>  		    (state->flags & OPAL_PM_STOP_INST_FAST)) {
> -			pnv_default_stop_val = state->psscr_val;
> -			pnv_default_stop_mask = state->psscr_mask;
> +			pnv_default_state = state;
>  			default_stop_found = true;
>  			WARN_ON(state->flags & OPAL_PM_LOSE_FULL_CONTEXT);
>  		}
> @@ -1143,15 +1133,15 @@ static void __init pnv_power9_idle_init(void)
>  	} else {
>  		ppc_md.power_save = power9_idle;
>  		pr_info("cpuidle-powernv: Default stop: psscr = 0x%016llx,mask=0x%016llx\n",
> -			pnv_default_stop_val, pnv_default_stop_mask);
> +			pnv_default_state->psscr_val, pnv_default_state->psscr_mask);
>  	}
>  
>  	if (unlikely(!deepest_stop_found)) {
>  		pr_warn("cpuidle-powernv: No suitable stop state for CPU-Hotplug. Offlined CPUs will busy wait");
>  	} else {
>  		pr_info("cpuidle-powernv: Deepest stop: psscr = 0x%016llx,mask=0x%016llx\n",
> -			pnv_deepest_stop_psscr_val,
> -			pnv_deepest_stop_psscr_mask);
> +			pnv_deepest_state->psscr_val,
> +			pnv_deepest_state->psscr_mask);
>  	}
>  
>  	pr_info("cpuidle-powernv: First stop level that may lose SPRs = 0x%lld\n",
> @@ -1173,16 +1163,15 @@ static void __init pnv_disable_deep_states(void)
>  	pr_warn("cpuidle-powernv: Idle power-savings, CPU-Hotplug affected\n");
>  
>  	if (cpu_has_feature(CPU_FTR_ARCH_300) &&
> -	    (pnv_deepest_stop_flag & OPAL_PM_LOSE_FULL_CONTEXT)) {
> +	    (pnv_deepest_state->flags & OPAL_PM_LOSE_FULL_CONTEXT)) {
>  		/*
>  		 * Use the default stop state for CPU-Hotplug
>  		 * if available.
>  		 */
>  		if (default_stop_found) {
> -			pnv_deepest_stop_psscr_val = pnv_default_stop_val;
> -			pnv_deepest_stop_psscr_mask = pnv_default_stop_mask;
> +			pnv_deepest_state = pnv_default_state;
>  			pr_warn("cpuidle-powernv: Offlined CPUs will stop with psscr = 0x%016llx\n",
> -				pnv_deepest_stop_psscr_val);
> +				pnv_deepest_state->psscr_val);
>  		} else { /* Fallback to snooze loop for CPU-Hotplug */
>  			deepest_stop_found = false;
>  			pr_warn("cpuidle-powernv: Offlined CPUs will busy wait\n");
> diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
> index a15514ebd1c3..5116d5991d30 100644
> --- a/drivers/cpuidle/cpuidle-powernv.c
> +++ b/drivers/cpuidle/cpuidle-powernv.c
> @@ -35,13 +35,7 @@ static struct cpuidle_driver powernv_idle_driver = {
>  static int max_idle_state __read_mostly;
>  static struct cpuidle_state *cpuidle_state_table __read_mostly;
>  
> -struct stop_psscr_table {
> -	u64 val;
> -	u64 mask;
> -};
> -
> -static struct stop_psscr_table stop_psscr_table[CPUIDLE_STATE_MAX] __read_mostly;
> -
> +struct pnv_idle_states_t idx_to_state_ptr[CPUIDLE_STATE_MAX] __read_mostly;
>  static u64 default_snooze_timeout __read_mostly;
>  static bool snooze_timeout_en __read_mostly;
>  
> @@ -143,8 +137,9 @@ static int stop_loop(struct cpuidle_device *dev,
>  		     struct cpuidle_driver *drv,
>  		     int index)
>  {
> -	power9_idle_type(stop_psscr_table[index].val,
> -			 stop_psscr_table[index].mask);
> +	struct pnv_idle_states_t *state;
> +	state = &pnv_idle_states[index];
> +	power9_idle_type(state);
>  	return index;
>  }
>  
> @@ -242,8 +237,6 @@ static inline void add_powernv_state(int index, const char *name,
>  	powernv_states[index].exit_latency = exit_latency;
>  	powernv_states[index].enter = idle_fn;
>  	/* For power8 and below psscr_* will be 0 */
> -	stop_psscr_table[index].val = psscr_val;
> -	stop_psscr_table[index].mask = psscr_mask;
>  }
>  
>  /*
> 
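
The effect of passing a state pointer instead of a value/mask pair can
be shown with a reduced user-space C sketch. It assumes a cut-down,
hypothetical struct idle_state holding only the fields used here;
compose_psscr() repeats the mask-and-merge expression from
__power9_idle_type(), and pick_deepest() mimics how a single pointer can
replace the separate "deepest" val/mask/flag globals. The real code
applies further checks that are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced version of struct pnv_idle_states_t. */
struct idle_state {
	uint32_t residency_ns;
	uint64_t psscr_val;
	uint64_t psscr_mask;
};

/* Replace only the bits covered by the state's mask in the current PSSCR. */
static uint64_t compose_psscr(uint64_t cur, const struct idle_state *state)
{
	assert(state != NULL);
	return (cur & ~state->psscr_mask) | state->psscr_val;
}

/*
 * Return a pointer to the state with the largest residency, or NULL if
 * no state has a non-zero residency. One pointer carries val, mask and
 * any other field the stop loop may need later.
 */
static const struct idle_state *pick_deepest(const struct idle_state *states,
					     size_t n)
{
	const struct idle_state *deepest = NULL;
	uint32_t max_residency_ns = 0;

	for (size_t i = 0; i < n; i++) {
		if (max_residency_ns < states[i].residency_ns) {
			max_residency_ns = states[i].residency_ns;
			deepest = &states[i];
		}
	}
	return deepest;
}
```

Using the stop4 values from the device tree example (psscr 0x300374,
mask 0x3003ff), a current PSSCR of 0x1234 becomes 0x301374: the bits
outside the mask are kept and the masked bits come from the state.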


* Re: [RFC PATCH v2 3/3] cpuidle/powernv: save-restore sprs in opal
       [not found] ` <20181011132237.14604-4-akshay.adiga@linux.vnet.ibm.com>
@ 2018-10-11 19:56   ` Frank Rowand
  0 siblings, 0 replies; 4+ messages in thread
From: Frank Rowand @ 2018-10-11 19:56 UTC (permalink / raw)
  To: Akshay Adiga, linux-kernel, linuxppc-dev,
	devicetree@vger.kernel.org
  Cc: huntbag, npiggin, benh, mpe, ego

+ devicetree mail list

On 10/11/18 06:22, Akshay Adiga wrote:
> From: Abhishek Goel <huntbag@linux.vnet.ibm.com>
> 
> This patch moves the saving and restoring of SPRs for P9 cpuidle
> from the kernel to OPAL.
> In an attempt to make the powernv idle code backward compatible,
> and to some extent forward compatible, add support for pre-stop entry
> and post-stop exit actions in OPAL. If a kernel knows about this
> OPAL call, then only firmware supporting the newer hardware is required,
> instead of waiting for kernel updates.
> 
> Signed-off-by: Abhishek Goel <huntbag@linux.vnet.ibm.com>
> Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
> ---
> Changes from v1 :
>  - Code is rebased on Nick Piggin's v4 patch "powerpc/64s: reimplement book3s
>    idle code in C"
>  - Set a global variable "request_opal_call" to indicate that deep
>    states should make an opal_call.
>  - All the states that lose hypervisor state will be handled by OPAL
>  - All the decision making, such as identifying the first thread in
>    the core and taking locks before restoring, has also been
>    moved to OPAL
>  arch/powerpc/include/asm/opal-api.h           |  4 +-
>  arch/powerpc/include/asm/opal.h               |  3 +
>  arch/powerpc/include/asm/processor.h          |  3 +-
>  arch/powerpc/kernel/idle_book3s.S             |  6 +-
>  arch/powerpc/platforms/powernv/idle.c         | 88 +++++++++++++------
>  .../powerpc/platforms/powernv/opal-wrappers.S |  2 +
>  6 files changed, 77 insertions(+), 29 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
> index 8365353330b4..93ea1f79e295 100644
> --- a/arch/powerpc/include/asm/opal-api.h
> +++ b/arch/powerpc/include/asm/opal-api.h
> @@ -210,7 +210,9 @@
>  #define OPAL_PCI_GET_PBCQ_TUNNEL_BAR		164
>  #define OPAL_PCI_SET_PBCQ_TUNNEL_BAR		165
>  #define	OPAL_NX_COPROC_INIT			167
> -#define OPAL_LAST				167
> +#define OPAL_IDLE_SAVE				170
> +#define OPAL_IDLE_RESTORE			171
> +#define OPAL_LAST				171
>  
>  #define QUIESCE_HOLD			1 /* Spin all calls at entry */
>  #define QUIESCE_REJECT			2 /* Fail all calls with OPAL_BUSY */
> diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
> index ff3866473afe..26995e16171e 100644
> --- a/arch/powerpc/include/asm/opal.h
> +++ b/arch/powerpc/include/asm/opal.h
> @@ -356,6 +356,9 @@ extern int opal_handle_hmi_exception(struct pt_regs *regs);
>  extern void opal_shutdown(void);
>  extern int opal_resync_timebase(void);
>  
> +extern int opal_cpuidle_save(u64 psscr);
> +extern int opal_cpuidle_restore(u64 psscr, u64 srr1);
> +
>  extern void opal_lpc_init(void);
>  
>  extern void opal_kmsg_init(void);
> diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
> index 822d3236ad7f..26fa6c1836f4 100644
> --- a/arch/powerpc/include/asm/processor.h
> +++ b/arch/powerpc/include/asm/processor.h
> @@ -510,7 +510,8 @@ static inline unsigned long get_clean_sp(unsigned long sp, int is_32)
>  
>  /* asm stubs */
>  extern unsigned long isa300_idle_stop_noloss(unsigned long psscr_val);
> -extern unsigned long isa300_idle_stop_mayloss(unsigned long psscr_val);
> +extern unsigned long isa300_idle_stop_mayloss(unsigned long psscr_val,
> +						bool request_opal_call);
>  extern unsigned long isa206_idle_insn_mayloss(unsigned long type);
>  
>  extern unsigned long cpuidle_disable;
> diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S
> index ffdee1ab4388..a2014d152035 100644
> --- a/arch/powerpc/kernel/idle_book3s.S
> +++ b/arch/powerpc/kernel/idle_book3s.S
> @@ -52,14 +52,16 @@ _GLOBAL(isa300_idle_stop_noloss)
>  _GLOBAL(isa300_idle_stop_mayloss)
>  	mtspr 	SPRN_PSSCR,r3
>  	std	r1,PACAR1(r13)
> -	mflr	r4
> +	mflr	r7
>  	mfcr	r5
>  	/* use stack red zone rather than a new frame */
>  	addi	r6,r1,-INT_FRAME_SIZE
>  	SAVE_GPR(2, r6)
>  	SAVE_NVGPRS(r6)
> -	std	r4,_LINK(r6)
> +	std	r7,_LINK(r6)
>  	std	r5,_CCR(r6)
> +	cmpwi	r4,0
> +	bne	opal_cpuidle_save
>  	PPC_STOP
>  	b	.	/* catch bugs */
>  
> diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
> index 681a23a066bb..bcfe08022e65 100644
> --- a/arch/powerpc/platforms/powernv/idle.c
> +++ b/arch/powerpc/platforms/powernv/idle.c
> @@ -171,6 +171,7 @@ static void pnv_fastsleep_workaround_apply(void *info)
>  
>  static bool power7_fastsleep_workaround_entry = true;
>  static bool power7_fastsleep_workaround_exit = true;
> +static bool request_opal_call = false;
>  
>  /*
>   * Used to store fastsleep workaround state
> @@ -604,6 +605,7 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
>  	unsigned long mmcr0 = 0;
>  	struct p9_sprs sprs;
>  	bool sprs_saved = false;
> +	bool is_hv_loss = false;
>  
>  	memset(&sprs, 0, sizeof(sprs));
>  
> @@ -648,7 +650,9 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
>  		  */
>  		mmcr0		= mfspr(SPRN_MMCR0);
>  	}
> -	if ((psscr & PSSCR_RL_MASK) >= pnv_first_hv_loss_level) {
> +
> +	is_hv_loss = (psscr & PSSCR_RL_MASK) >= pnv_first_hv_loss_level;
> +	if (is_hv_loss && (!request_opal_call)) {
>  		sprs.lpcr	= mfspr(SPRN_LPCR);
>  		sprs.hfscr	= mfspr(SPRN_HFSCR);
>  		sprs.fscr	= mfspr(SPRN_FSCR);
> @@ -674,7 +678,8 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
>  		atomic_start_thread_idle();
>  	}
>  
> -	srr1 = isa300_idle_stop_mayloss(psscr);
> +	srr1 = isa300_idle_stop_mayloss(psscr,
> +			is_hv_loss && request_opal_call);
>  
>  #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
>  	local_paca->requested_psscr = 0;
> @@ -685,6 +690,25 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
>  	WARN_ON_ONCE(!srr1);
>  	WARN_ON_ONCE(mfmsr() & (MSR_IR|MSR_DR));
>  
> +	/*
> +	 * On POWER9, SRR1 bits do not match exactly as expected.
> +	 * SRR1_WS_GPRLOSS (10b) can also result in SPR loss, so
> +	 * always test PSSCR if there is any state loss.
> +	 */
> +	if (likely(((psscr & PSSCR_PLS) >> 60) < pnv_first_hv_loss_level)) {
> +		if (sprs_saved)
> +			atomic_stop_thread_idle();
> +		goto out;
> +	}
> +
> +	if (request_opal_call) {
> +		opal_cpuidle_restore(psscr, srr1);
> +		goto opal_return;
> +	}
> +
> +	/* HV state loss */
> +	BUG_ON(!sprs_saved);
> +
>  	if ((srr1 & SRR1_WAKESTATE) != SRR1_WS_NOLOSS) {
>  		unsigned long mmcra;
>  
> @@ -712,19 +736,6 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
>  	if (unlikely((srr1 & SRR1_WAKEMASK_P8) == SRR1_WAKEHMI))
>  		hmi_exception_realmode(NULL);
>  
> -	/*
> -	 * On POWER9, SRR1 bits do not match exactly as expected.
> -	 * SRR1_WS_GPRLOSS (10b) can also result in SPR loss, so
> -	 * always test PSSCR if there is any state loss.
> -	 */
> -	if (likely((psscr & PSSCR_RL_MASK) < pnv_first_hv_loss_level)) {
> -		if (sprs_saved)
> -			atomic_stop_thread_idle();
> -		goto out;
> -	}
> -
> -	/* HV state loss */
> -	BUG_ON(!sprs_saved);
>  
>  	atomic_lock_thread_idle();
>  
> @@ -771,6 +782,7 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
>  
>  	mtspr(SPRN_SPRG3,	local_paca->sprg_vdso);
>  
> +opal_return:
>  	if (!radix_enabled())
>  		__slb_restore_bolted_realmode();
>  
> @@ -1284,6 +1296,7 @@ static int pnv_parse_cpuidle_dt(void)
>  	u32 *temp_u32;
>  	u64 *temp_u64;
>  	const char **temp_string;
> +	bool fall_back_to_opal = false;
>  
>  	np = of_find_node_by_path("/ibm,opal/power-mgt");
>  	if (!np) {
> @@ -1396,23 +1409,48 @@ static int pnv_parse_cpuidle_dt(void)
>  	/* Parse each child node with appropriate parser_fn */
>  	for_each_child_of_node(np1, dt_node) {
>  		bool found_known_version = false;
> -		/* we don't have state falling back to opal*/
> -		for (i = 0; i < nr_known_versions ; i++) {
> -			if (of_device_is_compatible(dt_node, known_versions[i].name)) {
> -				rc = known_versions[i].parser_fn(dt_node);
> +		if (!fall_back_to_opal) {
> +			/* we don't have state falling back to opal*/
> +			for (i = 0; i < nr_known_versions ; i++) {
> +				if (of_device_is_compatible(dt_node, known_versions[i].name)) {
> +					rc = known_versions[i].parser_fn(dt_node);
> +					if (rc) {
> +						pr_err("%s could not parse\n", known_versions[i].name);
> +						continue;
> +					}
> +					found_known_version = true;
> +				}
> +			}
> +		}
> +
> +		/*
> +		 * If any previous state falls back to opal_call,
> +		 * then all further states will either call opal_call
> +		 * or not be included for cpuidle/cpuoffline.
> +		 *
> +		 * Moreover, having any intermediate state with no
> +		 * kernel support or opal support can be potentially
> +		 * dangerous, as hardware can potentially wakeup from
> +		 * that state. Hence, no further states are added
> +		 * to cpuidle/cpuoffline.
> +		 */
> +		if (!found_known_version || fall_back_to_opal) {
> +			if (of_device_is_compatible(dt_node, "opal-support")) {
> +				rc = known_versions[0].parser_fn(dt_node);
>  				if (rc) {
> -					pr_err("%s could not parse\n", known_versions[i].name);
> +					pr_err("%s could not parse\n", "opal-support");
>  					continue;
>  				}
> -				found_known_version = true;
> +				fall_back_to_opal = true;
> +			} else {
> +				pr_info("Unsupported state, skipping all further state\n");
> +				goto out;
>  			}
>  		}
> -		if (!found_known_version) {
> -			pr_info("Unsupported state, skipping all further state\n");
> -			goto out;
> -		}
>  		nr_pnv_idle_states++;
>  	}
> +	if (fall_back_to_opal)
> +		request_opal_call = true;
>  out:
>  	kfree(temp_u32);
>  	kfree(temp_u64);
> diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S b/arch/powerpc/platforms/powernv/opal-wrappers.S
> index 251528231a9e..7a039a81a67e 100644
> --- a/arch/powerpc/platforms/powernv/opal-wrappers.S
> +++ b/arch/powerpc/platforms/powernv/opal-wrappers.S
> @@ -331,3 +331,5 @@ OPAL_CALL(opal_pci_set_pbcq_tunnel_bar,		OPAL_PCI_SET_PBCQ_TUNNEL_BAR);
>  OPAL_CALL(opal_sensor_read_u64,			OPAL_SENSOR_READ_U64);
>  OPAL_CALL(opal_sensor_group_enable,		OPAL_SENSOR_GROUP_ENABLE);
>  OPAL_CALL(opal_nx_coproc_init,			OPAL_NX_COPROC_INIT);
> +OPAL_CALL(opal_cpuidle_save,			OPAL_IDLE_SAVE);
> +OPAL_CALL(opal_cpuidle_restore,			OPAL_IDLE_RESTORE);
> 
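
For readers following the power9_idle_stop() hunks quoted above, here is a minimal user-space sketch of the decision they appear to encode: who is responsible for saving and restoring the SPRs for a given stop request. The enum, the function name pick_spr_saver() and the example PSSCR values are hypothetical and exist only for illustration; the mask value is assumed to match the kernel's PSSCR_RL_MASK. This is not the patch's code, only a model of its two conditions (is_hv_loss and request_opal_call).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Requested-level field of PSSCR; assumed to match the kernel's PSSCR_RL_MASK. */
#define PSSCR_RL_MASK 0xFULL

/* Hypothetical labels for the three outcomes; these names are not in the patch. */
enum spr_saver { SAVE_NONE, SAVE_KERNEL, SAVE_OPAL };

/*
 * Model of the checks in the quoted hunks:
 *   - below the first HV-loss level, no full SPR save is needed;
 *   - at or above it, the kernel saves the SPRs itself unless the
 *     device tree made it fall back to OPAL (request_opal_call).
 */
static enum spr_saver pick_spr_saver(uint64_t psscr,
				     uint64_t first_hv_loss_level,
				     bool request_opal_call)
{
	bool is_hv_loss = (psscr & PSSCR_RL_MASK) >= first_hv_loss_level;

	if (!is_hv_loss)
		return SAVE_NONE;
	return request_opal_call ? SAVE_OPAL : SAVE_KERNEL;
}
```

With an assumed first HV-loss level of 4, a request whose RL field is 0 needs no save, while a request with RL 5 goes to OPAL when request_opal_call is set and to the kernel's own save path otherwise.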

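The fallback rule described in the pnv_parse_cpuidle_dt() comment quoted above can be sketched in plain C as follows. This is an illustration under stated assumptions, not the patch itself: struct idle_node, node_is_compatible() and select_idle_states() are hypothetical stand-ins for the device-tree helpers, parser failures are ignored, and "ibm,cpuidle-state-v2" in the usage note is a made-up compatible string. One deliberate difference: in the patch as posted, the "goto out" path appears to skip the "request_opal_call = true" assignment, whereas this sketch reports the flag on both paths, which is an assumption about the intent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_COMPAT 4

/* Hypothetical stand-in for an idle-state node: only its compatible strings. */
struct idle_node {
	const char *compat[MAX_COMPAT];	/* unused slots are NULL */
};

static bool node_is_compatible(const struct idle_node *n, const char *name)
{
	for (size_t i = 0; i < MAX_COMPAT && n->compat[i]; i++)
		if (strcmp(n->compat[i], name) == 0)
			return true;
	return false;
}

/*
 * Walk the states in order. A state is accepted if the kernel knows its
 * version, or, failing that, if it advertises "opal-support". Once one
 * state has fallen back to OPAL, every later state must also advertise
 * "opal-support"; the first state that does not ends the walk.
 * Returns the number of accepted states.
 */
static int select_idle_states(const struct idle_node *nodes, size_t nr_nodes,
			      const char *const *known, size_t nr_known,
			      bool *request_opal_call)
{
	bool fall_back_to_opal = false;
	int nr_states = 0;

	for (size_t n = 0; n < nr_nodes; n++) {
		bool found_known_version = false;

		if (!fall_back_to_opal) {
			for (size_t i = 0; i < nr_known; i++)
				if (node_is_compatible(&nodes[n], known[i]))
					found_known_version = true;
		}
		if (!found_known_version) {
			if (!node_is_compatible(&nodes[n], "opal-support"))
				break;	/* unsupported: skip all further states */
			fall_back_to_opal = true;
		}
		nr_states++;
	}
	*request_opal_call = fall_back_to_opal;
	return nr_states;
}
```

For example, with only "ibm,cpuidle-state-v1" known to the kernel, the sequence v1, (v2 + opal-support), v1 yields two accepted states with the OPAL flag set: the third state is dropped because it follows an OPAL fallback and does not itself advertise "opal-support".
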

end of thread, other threads:[~2018-10-11 19:56 UTC | newest]

Thread overview: 4+ messages
     [not found] <20181011132237.14604-1-akshay.adiga@linux.vnet.ibm.com>
2018-10-11 19:55 ` [RFC PATCH v2 0/3] New device-tree format and Opal based idle save-restore Frank Rowand
     [not found] ` <20181011132237.14604-2-akshay.adiga@linux.vnet.ibm.com>
2018-10-11 19:55   ` [RFC PATCH v2 1/3] cpuidle/powernv: Add support for states with ibm,cpuidle-state-v1 Frank Rowand
     [not found] ` <20181011132237.14604-3-akshay.adiga@linux.vnet.ibm.com>
2018-10-11 19:56   ` [RFC PATCH v2 2/3] powernv/cpuidle: Pass pointers instead of values to stop loop Frank Rowand
     [not found] ` <20181011132237.14604-4-akshay.adiga@linux.vnet.ibm.com>
2018-10-11 19:56   ` [RFC PATCH v2 3/3] cpuidle/powernv: save-restore sprs in opal Frank Rowand
