xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/3] Support controlling the max C-state sub-state
@ 2014-06-23 11:09 Ross Lagerwall
  2014-06-23 11:09 ` [PATCH v3 1/3] x86: Allow limiting " Ross Lagerwall
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Ross Lagerwall @ 2014-06-23 11:09 UTC (permalink / raw)
  To: Xen-devel; +Cc: Ross Lagerwall

As discussed previously on the list, here is a patch series to allow
controlling the maximum C-state sub-state. It doesn't fix the output of
xenpm to correctly show the C-states sub-states (that can be for later).

Changes since v2:
- Drop patch that's in staging
- Update ACPI idle function
- Fix handling of cpuid for single processor machines.
- Document overloading of cpuid for sub-state control.
- Formatting changes.

Changes since v1:
- Use a single boot parameter to control max_cstate and max_csubstate.
- Use max_csubstate rather than max_substate global variable,
  it is less generic.
- Reuse sysctl sub-ops rather than adding new ones.
- Limit the sub-state in the ACPI cpu_idle driver.
- Use unsigned rather than signed in places
- Update docs.
- Formatting changes.

Ross Lagerwall (3):
  x86: Allow limiting the max C-state sub-state
  tools/libxc: Alow controlling the max C-state sub-state
  xenpm: Allow controlling the max C-state sub-state

 docs/misc/xen-command-line.markdown |  8 +++++++-
 tools/libxc/xc_pm.c                 | 28 ++++++++++++++++++++++++----
 tools/libxc/xenctrl.h               |  3 +++
 tools/misc/xenpm.c                  | 34 +++++++++++++++++++++++++++++++++-
 xen/arch/x86/acpi/cpu_idle.c        | 21 ++++++++++++++++++---
 xen/arch/x86/cpu/mwait-idle.c       |  4 +++-
 xen/drivers/acpi/pmstat.c           | 20 ++++++++++++++++----
 xen/include/public/sysctl.h         |  5 ++++-
 xen/include/xen/acpi.h              | 29 +++++++++++++++++++++++++----
 9 files changed, 133 insertions(+), 19 deletions(-)

-- 
1.9.3

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v3 1/3] x86: Allow limiting the max C-state sub-state
  2014-06-23 11:09 [PATCH v3 0/3] Support controlling the max C-state sub-state Ross Lagerwall
@ 2014-06-23 11:09 ` Ross Lagerwall
  2014-06-25 12:37   ` Jan Beulich
  2014-06-23 11:09 ` [PATCH v3 2/3] tools/libxc: Alow controlling " Ross Lagerwall
  2014-06-23 11:09 ` [PATCH v3 3/3] xenpm: Allow " Ross Lagerwall
  2 siblings, 1 reply; 16+ messages in thread
From: Ross Lagerwall @ 2014-06-23 11:09 UTC (permalink / raw)
  To: Xen-devel; +Cc: Ross Lagerwall, Liu Jinsong

Allow limiting the max C-state sub-state by appending to the max_cstate
command-line parameter. E.g. max_cstate=1,0
The limit only applies to the highest legal C-state. For example:
 max_cstate = 1, max_csubstate = 0 ==> C0, C1 okay, but not C1E
 max_cstate = 1, max_csubstate = 1 ==> C0, C1 and C1E okay, but not C2
 max_cstate = 2, max_csubstate = 0 ==> C0, C1, C1E, C2 okay, but not C3
 max_cstate = 2, max_csubstate = 1 ==> C0, C1, C1E, C2 okay, but not C3

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
CC: Liu Jinsong <jinsong.liu@alibaba-inc.com>
---
 docs/misc/xen-command-line.markdown |  8 +++++++-
 xen/arch/x86/acpi/cpu_idle.c        | 21 ++++++++++++++++++---
 xen/arch/x86/cpu/mwait-idle.c       |  4 +++-
 xen/include/xen/acpi.h              | 16 ++++++++++++----
 4 files changed, 40 insertions(+), 9 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 94f04bd..f4f4109 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -715,7 +715,13 @@ so the crash kernel may find find them.  Should be used in combination
 with **crashinfo_maxaddr**.
 
 ### max\_cstate
-> `= <integer>`
+> `= <cstate>[,<substate>]`
+
+`cstate` is an integer which limits the maximum C-state that Xen uses.
+
+`substate` is an integer which limits the maximum sub C-state that Xen
+uses.  The limit only applies to the highest legal C-state.
+
 
 ### max\_gsi\_irqs
 > `= <integer>`
diff --git a/xen/arch/x86/acpi/cpu_idle.c b/xen/arch/x86/acpi/cpu_idle.c
index b05fb39..9cb5076 100644
--- a/xen/arch/x86/acpi/cpu_idle.c
+++ b/xen/arch/x86/acpi/cpu_idle.c
@@ -105,7 +105,16 @@ static uint64_t (*__read_mostly tick_to_ns)(uint64_t) = acpi_pm_tick_to_ns;
 
 void (*__read_mostly pm_idle_save)(void);
 unsigned int max_cstate __read_mostly = ACPI_PROCESSOR_MAX_POWER - 1;
-integer_param("max_cstate", max_cstate);
+unsigned int max_csubstate __read_mostly = UINT_MAX;
+
+static void __init parse_cstate(const char *s)
+{
+    max_cstate = simple_strtoul(s, &s, 0);
+    if ( *s == ',' )
+        max_csubstate = simple_strtoul(s + 1, &s, 0);
+}
+custom_param("max_cstate", parse_cstate);
+
 static bool_t __read_mostly local_apic_timer_c2_ok;
 boolean_param("lapic_timer_c2_ok", local_apic_timer_c2_ok);
 
@@ -240,6 +249,7 @@ static void print_acpi_power(uint32_t cpu, struct acpi_processor_power *power)
     last_state_idx = power->last_state ? power->last_state->idx : -1;
     printk("active state:\t\tC%d\n", last_state_idx);
     printk("max_cstate:\t\tC%d\n", max_cstate);
+    printk("max_csubstate:\t\t%u\n", max_csubstate);
     printk("states:\n");
     
     for ( i = 1; i < power->count; i++ )
@@ -484,8 +494,13 @@ static void acpi_processor_idle(void)
         if ( cx->type == ACPI_STATE_C3 && power->flags.bm_check &&
              acpi_idle_bm_check() )
             cx = power->safe_state;
-        if ( cx->idx > max_cstate )
-            cx = &power->states[max_cstate];
+        while ( (cx->type > max_cstate ||
+                 cx->entry_method == ACPI_CSTATE_EM_NONE ||
+                 (cx->entry_method == ACPI_CSTATE_EM_FFH &&
+                   cx->type == max_cstate &&
+                   (cx->address & MWAIT_SUBSTATE_MASK) > max_csubstate)) &&
+               next_state-- )
+            cx = &power->states[next_state];
         menu_get_trace_data(&exp, &pred);
     }
     if ( !cx )
diff --git a/xen/arch/x86/cpu/mwait-idle.c b/xen/arch/x86/cpu/mwait-idle.c
index 38172e5..3bad6d8 100644
--- a/xen/arch/x86/cpu/mwait-idle.c
+++ b/xen/arch/x86/cpu/mwait-idle.c
@@ -330,7 +330,9 @@ static void mwait_idle(void)
 	    (next_state = cpuidle_current_governor->select(power)) > 0) {
 		do {
 			cx = &power->states[next_state];
-		} while (cx->type > max_cstate && --next_state);
+		} while ((cx->type > max_cstate || (cx->type == max_cstate &&
+			  MWAIT_HINT2SUBSTATE(cx->address) > max_csubstate)) &&
+			 --next_state);
 		if (!next_state)
 			cx = NULL;
 		menu_get_trace_data(&exp, &pred);
diff --git a/xen/include/xen/acpi.h b/xen/include/xen/acpi.h
index 3aeba4a..331dc8d 100644
--- a/xen/include/xen/acpi.h
+++ b/xen/include/xen/acpi.h
@@ -124,13 +124,21 @@ void acpi_unregister_gsi (u32 gsi);
 
 #ifdef	CONFIG_ACPI_CSTATE
 /*
- * Set highest legal C-state
- * 0: C0 okay, but not C1
- * 1: C1 okay, but not C2
- * 2: C2 okay, but not C3 etc.
+ * max_cstate sets the highest legal C-state.
+ * max_cstate = 0: C0 okay, but not C1
+ * max_cstate = 1: C1 okay, but not C2
+ * max_cstate = 2: C2 okay, but not C3 etc.
+
+ * max_csubstate sets the highest legal C-state sub-state. Only applies to the
+ * highest legal C-state.
+ * max_cstate = 1, max_csubstate = 0 ==> C0, C1 okay, but not C1E
+ * max_cstate = 1, max_csubstate = 1 ==> C0, C1 and C1E okay, but not C2
+ * max_cstate = 2, max_csubstate = 0 ==> C0, C1, C1E, C2 okay, but not C3
+ * max_cstate = 2, max_csubstate = 1 ==> C0, C1, C1E, C2 okay, but not C3
  */
 
 extern unsigned int max_cstate;
+extern unsigned int max_csubstate;
 
 static inline unsigned int acpi_get_cstate_limit(void)
 {
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 2/3] tools/libxc: Alow controlling the max C-state sub-state
  2014-06-23 11:09 [PATCH v3 0/3] Support controlling the max C-state sub-state Ross Lagerwall
  2014-06-23 11:09 ` [PATCH v3 1/3] x86: Allow limiting " Ross Lagerwall
@ 2014-06-23 11:09 ` Ross Lagerwall
  2014-06-23 15:43   ` [PATCH v3A 2/3] tools/libxc: allow " Jan Beulich
  2014-06-23 11:09 ` [PATCH v3 3/3] xenpm: Allow " Ross Lagerwall
  2 siblings, 1 reply; 16+ messages in thread
From: Ross Lagerwall @ 2014-06-23 11:09 UTC (permalink / raw)
  To: Xen-devel; +Cc: Ross Lagerwall, Ian Jackson, Ian Campbell, Stefano Stabellini

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxc/xc_pm.c         | 28 ++++++++++++++++++++++++----
 tools/libxc/xenctrl.h       |  3 +++
 xen/drivers/acpi/pmstat.c   | 20 ++++++++++++++++----
 xen/include/public/sysctl.h |  5 ++++-
 xen/include/xen/acpi.h      | 13 +++++++++++++
 5 files changed, 60 insertions(+), 9 deletions(-)

diff --git a/tools/libxc/xc_pm.c b/tools/libxc/xc_pm.c
index e4e0fb9..9631d99 100644
--- a/tools/libxc/xc_pm.c
+++ b/tools/libxc/xc_pm.c
@@ -386,7 +386,7 @@ int xc_get_vcpu_migration_delay(xc_interface *xch, uint32_t *value)
    return rc;
 }
 
-int xc_get_cpuidle_max_cstate(xc_interface *xch, uint32_t *value)
+static int get_max_cstate(xc_interface *xch, uint32_t *value, uint32_t type)
 {
     int rc;
     DECLARE_SYSCTL;
@@ -396,7 +396,7 @@ int xc_get_cpuidle_max_cstate(xc_interface *xch, uint32_t *value)
 
     sysctl.cmd = XEN_SYSCTL_pm_op;
     sysctl.u.pm_op.cmd = XEN_SYSCTL_pm_op_get_max_cstate;
-    sysctl.u.pm_op.cpuid = 0;
+    sysctl.u.pm_op.cpuid = type;
     sysctl.u.pm_op.u.get_max_cstate = 0;
     rc = do_sysctl(xch, &sysctl);
     *value = sysctl.u.pm_op.u.get_max_cstate;
@@ -404,7 +404,17 @@ int xc_get_cpuidle_max_cstate(xc_interface *xch, uint32_t *value)
     return rc;
 }
 
-int xc_set_cpuidle_max_cstate(xc_interface *xch, uint32_t value)
+int xc_get_cpuidle_max_cstate(xc_interface *xch, uint32_t *value)
+{
+    return get_max_cstate(xch, value, 0);
+}
+
+int xc_get_cpuidle_max_csubstate(xc_interface *xch, uint32_t *value)
+{
+    return get_max_cstate(xch, value, 1);
+}
+
+static int set_max_cstate(xc_interface *xch, uint32_t value, uint32_t type)
 {
     DECLARE_SYSCTL;
 
@@ -413,12 +423,22 @@ int xc_set_cpuidle_max_cstate(xc_interface *xch, uint32_t value)
 
     sysctl.cmd = XEN_SYSCTL_pm_op;
     sysctl.u.pm_op.cmd = XEN_SYSCTL_pm_op_set_max_cstate;
-    sysctl.u.pm_op.cpuid = 0;
+    sysctl.u.pm_op.cpuid = type;
     sysctl.u.pm_op.u.set_max_cstate = value;
 
     return do_sysctl(xch, &sysctl);
 }
 
+int xc_set_cpuidle_max_cstate(xc_interface *xch, uint32_t value)
+{
+    return set_max_cstate(xch, value, 0);
+}
+
+int xc_set_cpuidle_max_csubstate(xc_interface *xch, uint32_t value)
+{
+    return set_max_cstate(xch, value, 1);
+}
+
 int xc_enable_turbo(xc_interface *xch, int cpuid)
 {
     DECLARE_SYSCTL;
diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h
index b55d857..a4856a2 100644
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -2159,6 +2159,9 @@ int xc_get_vcpu_migration_delay(xc_interface *xch, uint32_t *value);
 int xc_get_cpuidle_max_cstate(xc_interface *xch, uint32_t *value);
 int xc_set_cpuidle_max_cstate(xc_interface *xch, uint32_t value);
 
+int xc_get_cpuidle_max_csubstate(xc_interface *xch, uint32_t *value);
+int xc_set_cpuidle_max_csubstate(xc_interface *xch, uint32_t value);
+
 int xc_enable_turbo(xc_interface *xch, int cpuid);
 int xc_disable_turbo(xc_interface *xch, int cpuid);
 /**
diff --git a/xen/drivers/acpi/pmstat.c b/xen/drivers/acpi/pmstat.c
index daac2da..cd74835 100644
--- a/xen/drivers/acpi/pmstat.c
+++ b/xen/drivers/acpi/pmstat.c
@@ -400,15 +400,17 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
     int ret = 0;
     const struct processor_pminfo *pmpt;
 
-    if ( !op || op->cpuid >= nr_cpu_ids || !cpu_online(op->cpuid) )
+    if ( !op || (op->cmd != XEN_SYSCTL_pm_op_get_max_cstate &&
+                 op->cmd != XEN_SYSCTL_pm_op_set_max_cstate &&
+                 (op->cpuid >= nr_cpu_ids || !cpu_online(op->cpuid))) )
         return -EINVAL;
-    pmpt = processor_pminfo[op->cpuid];
 
     switch ( op->cmd & PM_PARA_CATEGORY_MASK )
     {
     case CPUFREQ_PARA:
         if ( !(xen_processor_pmbits & XEN_PROCESSOR_PM_PX) )
             return -ENODEV;
+        pmpt = processor_pminfo[op->cpuid];
         if ( !pmpt || !(pmpt->perf.init & XEN_PX_INIT) )
             return -EINVAL;
         break;
@@ -465,13 +467,23 @@ int do_pm_op(struct xen_sysctl_pm_op *op)
 
     case XEN_SYSCTL_pm_op_get_max_cstate:
     {
-        op->u.get_max_cstate = acpi_get_cstate_limit();
+        if ( op->cpuid == 0 )
+            op->u.get_max_cstate = acpi_get_cstate_limit();
+        else if ( op->cpuid == 1 )
+            op->u.get_max_cstate = acpi_get_csubstate_limit();
+        else
+            ret = -EINVAL;
         break;
     }
 
     case XEN_SYSCTL_pm_op_set_max_cstate:
     {
-        acpi_set_cstate_limit(op->u.set_max_cstate);
+        if ( op->cpuid == 0 )
+            acpi_set_cstate_limit(op->u.set_max_cstate);
+        else if ( op->cpuid == 1 )
+            acpi_set_csubstate_limit(op->u.set_max_cstate);
+        else
+            ret = -EINVAL;
         break;
     }
 
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 3588698..77f820b 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -354,7 +354,10 @@ struct xen_sysctl_pm_op {
     /* set/reset scheduler power saving option */
     #define XEN_SYSCTL_pm_op_set_sched_opt_smt    0x21
 
-    /* cpuidle max_cstate access command */
+    /* cpuidle max_cstate and max_csubstate access command
+     * Set cpuid to 0 for max_cstate.
+     * Set cpuid to 1 for max_csubstate.
+     */
     #define XEN_SYSCTL_pm_op_get_max_cstate       0x22
     #define XEN_SYSCTL_pm_op_set_max_cstate       0x23
 
diff --git a/xen/include/xen/acpi.h b/xen/include/xen/acpi.h
index 331dc8d..2724fd0 100644
--- a/xen/include/xen/acpi.h
+++ b/xen/include/xen/acpi.h
@@ -149,9 +149,22 @@ static inline void acpi_set_cstate_limit(unsigned int new_limit)
 	max_cstate = new_limit;
 	return;
 }
+
+static inline unsigned int acpi_get_csubstate_limit(void)
+{
+	return max_csubstate;
+}
+
+static inline void acpi_set_csubstate_limit(unsigned int new_limit)
+{
+	max_csubstate = new_limit;
+}
+
 #else
 static inline unsigned int acpi_get_cstate_limit(void) { return 0; }
 static inline void acpi_set_cstate_limit(unsigned int new_limit) { return; }
+static inline unsigned int acpi_get_csubstate_limit(void) { return 0; }
+static inline void acpi_set_csubstate_limit(unsigned int new_limit) { return; }
 #endif
 
 #ifdef XEN_GUEST_HANDLE_PARAM
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 3/3] xenpm: Allow controlling the max C-state sub-state
  2014-06-23 11:09 [PATCH v3 0/3] Support controlling the max C-state sub-state Ross Lagerwall
  2014-06-23 11:09 ` [PATCH v3 1/3] x86: Allow limiting " Ross Lagerwall
  2014-06-23 11:09 ` [PATCH v3 2/3] tools/libxc: Alow controlling " Ross Lagerwall
@ 2014-06-23 11:09 ` Ross Lagerwall
  2014-06-27 15:03   ` Ian Campbell
  2 siblings, 1 reply; 16+ messages in thread
From: Ross Lagerwall @ 2014-06-23 11:09 UTC (permalink / raw)
  To: Xen-devel; +Cc: Ross Lagerwall, Ian Jackson, Ian Campbell, Stefano Stabellini

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
---
 tools/misc/xenpm.c | 34 +++++++++++++++++++++++++++++++++-
 1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/tools/misc/xenpm.c b/tools/misc/xenpm.c
index e43924c..9b52b00 100644
--- a/tools/misc/xenpm.c
+++ b/tools/misc/xenpm.c
@@ -64,6 +64,7 @@ void show_help(void)
             " set-vcpu-migration-delay      <num> set scheduler vcpu migration delay in us\n"
             " get-vcpu-migration-delay            get scheduler vcpu migration delay\n"
             " set-max-cstate        <num>         set the C-State limitation (<num> >= 0)\n"
+            " set-max-csubstate     <num>         set the C-State sub-state limitation (<num> >= 0)\n"
             " start [seconds]                     start collect Cx/Px statistics,\n"
             "                                     output after CTRL-C or SIGINT or several seconds.\n"
             " enable-turbo-mode     [cpuid]       enable Turbo Mode for processors that support it.\n"
@@ -188,7 +189,19 @@ static int show_max_cstate(xc_interface *xc_handle)
     if ( (ret = xc_get_cpuidle_max_cstate(xc_handle, &value)) )
         return ret;
 
-    printf("Max possible C-state: C%d\n\n", value);
+    printf("Max possible C-state: C%d\n", value);
+    return 0;
+}
+
+static int show_max_csubstate(xc_interface *xc_handle)
+{
+    int ret = 0;
+    uint32_t value;
+
+    if ( (ret = xc_get_cpuidle_max_csubstate(xc_handle, &value)) )
+        return ret;
+
+    printf("Max possible C-state sub-state: %u\n\n", value);
     return 0;
 }
 
@@ -223,6 +236,7 @@ void cxstat_func(int argc, char *argv[])
         parse_cpuid(argv[0], &cpuid);
 
     show_max_cstate(xc_handle);
+    show_max_csubstate(xc_handle);
 
     if ( cpuid < 0 )
     {
@@ -1088,6 +1102,23 @@ void set_max_cstate_func(int argc, char *argv[])
                 value, errno, strerror(errno));
 }
 
+void set_max_csubstate_func(int argc, char *argv[])
+{
+    uint32_t value;
+
+    if ( argc != 1 || sscanf(argv[0], "%u", &value) != 1 )
+    {
+        fprintf(stderr, "Missing or invalid argument(s)\n");
+        exit(EINVAL);
+    }
+
+    if ( !xc_set_cpuidle_max_csubstate(xc_handle, value) )
+        printf("set max_csubstate to %u succeeded\n", value);
+    else
+        fprintf(stderr, "set max_csubstate to %u failed (%d - %s)\n",
+                value, errno, strerror(errno));
+}
+
 void enable_turbo_mode(int argc, char *argv[])
 {
     int cpuid = -1;
@@ -1154,6 +1185,7 @@ struct {
     { "get-vcpu-migration-delay", get_vcpu_migration_delay_func},
     { "set-vcpu-migration-delay", set_vcpu_migration_delay_func},
     { "set-max-cstate", set_max_cstate_func},
+    { "set-max-csubstate", set_max_csubstate_func},
     { "enable-turbo-mode", enable_turbo_mode },
     { "disable-turbo-mode", disable_turbo_mode },
 };
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3A 2/3] tools/libxc: allow controlling the max C-state sub-state
  2014-06-23 11:09 ` [PATCH v3 2/3] tools/libxc: Alow controlling " Ross Lagerwall
@ 2014-06-23 15:43   ` Jan Beulich
  2014-06-23 16:00     ` Ross Lagerwall
  2014-06-27 15:02     ` Ian Campbell
  0 siblings, 2 replies; 16+ messages in thread
From: Jan Beulich @ 2014-06-23 15:43 UTC (permalink / raw)
  To: Ross Lagerwall, Xen-devel; +Cc: Ian Jackson, Ian Campbell, Stefano Stabellini

[-- Attachment #1: Type: text/plain, Size: 6795 bytes --]

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>

Make handling in do_pm_op() more homogeneous: Before interpreting
op->cpuid as such, handle all operations not acting on a particular
CPU.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/tools/libxc/xc_pm.c
+++ b/tools/libxc/xc_pm.c
@@ -386,7 +386,7 @@ int xc_get_vcpu_migration_delay(xc_inter
    return rc;
 }
 
-int xc_get_cpuidle_max_cstate(xc_interface *xch, uint32_t *value)
+static int get_max_cstate(xc_interface *xch, uint32_t *value, uint32_t type)
 {
     int rc;
     DECLARE_SYSCTL;
@@ -396,7 +396,7 @@ int xc_get_cpuidle_max_cstate(xc_interfa
 
     sysctl.cmd = XEN_SYSCTL_pm_op;
     sysctl.u.pm_op.cmd = XEN_SYSCTL_pm_op_get_max_cstate;
-    sysctl.u.pm_op.cpuid = 0;
+    sysctl.u.pm_op.cpuid = type;
     sysctl.u.pm_op.u.get_max_cstate = 0;
     rc = do_sysctl(xch, &sysctl);
     *value = sysctl.u.pm_op.u.get_max_cstate;
@@ -404,7 +404,17 @@ int xc_get_cpuidle_max_cstate(xc_interfa
     return rc;
 }
 
-int xc_set_cpuidle_max_cstate(xc_interface *xch, uint32_t value)
+int xc_get_cpuidle_max_cstate(xc_interface *xch, uint32_t *value)
+{
+    return get_max_cstate(xch, value, 0);
+}
+
+int xc_get_cpuidle_max_csubstate(xc_interface *xch, uint32_t *value)
+{
+    return get_max_cstate(xch, value, 1);
+}
+
+static int set_max_cstate(xc_interface *xch, uint32_t value, uint32_t type)
 {
     DECLARE_SYSCTL;
 
@@ -413,12 +423,22 @@ int xc_set_cpuidle_max_cstate(xc_interfa
 
     sysctl.cmd = XEN_SYSCTL_pm_op;
     sysctl.u.pm_op.cmd = XEN_SYSCTL_pm_op_set_max_cstate;
-    sysctl.u.pm_op.cpuid = 0;
+    sysctl.u.pm_op.cpuid = type;
     sysctl.u.pm_op.u.set_max_cstate = value;
 
     return do_sysctl(xch, &sysctl);
 }
 
+int xc_set_cpuidle_max_cstate(xc_interface *xch, uint32_t value)
+{
+    return set_max_cstate(xch, value, 0);
+}
+
+int xc_set_cpuidle_max_csubstate(xc_interface *xch, uint32_t value)
+{
+    return set_max_cstate(xch, value, 1);
+}
+
 int xc_enable_turbo(xc_interface *xch, int cpuid)
 {
     DECLARE_SYSCTL;
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -2159,6 +2159,9 @@ int xc_get_vcpu_migration_delay(xc_inter
 int xc_get_cpuidle_max_cstate(xc_interface *xch, uint32_t *value);
 int xc_set_cpuidle_max_cstate(xc_interface *xch, uint32_t value);
 
+int xc_get_cpuidle_max_csubstate(xc_interface *xch, uint32_t *value);
+int xc_set_cpuidle_max_csubstate(xc_interface *xch, uint32_t value);
+
 int xc_enable_turbo(xc_interface *xch, int cpuid);
 int xc_disable_turbo(xc_interface *xch, int cpuid);
 /**
--- a/xen/drivers/acpi/pmstat.c
+++ b/xen/drivers/acpi/pmstat.c
@@ -400,7 +400,51 @@ int do_pm_op(struct xen_sysctl_pm_op *op
     int ret = 0;
     const struct processor_pminfo *pmpt;
 
-    if ( !op || op->cpuid >= nr_cpu_ids || !cpu_online(op->cpuid) )
+    switch ( op->cmd )
+    {
+    case XEN_SYSCTL_pm_op_set_sched_opt_smt:
+    {
+        uint32_t saved_value = sched_smt_power_savings;
+
+        if ( op->cpuid != 0 )
+            return -EINVAL;
+        sched_smt_power_savings = !!op->u.set_sched_opt_smt;
+        op->u.set_sched_opt_smt = saved_value;
+        return 0;
+    }
+
+    case XEN_SYSCTL_pm_op_set_vcpu_migration_delay:
+        if ( op->cpuid != 0 )
+            return -EINVAL;
+        set_vcpu_migration_delay(op->u.set_vcpu_migration_delay);
+        return 0;
+
+    case XEN_SYSCTL_pm_op_get_vcpu_migration_delay:
+        if ( op->cpuid != 0 )
+            return -EINVAL;
+        op->u.get_vcpu_migration_delay = get_vcpu_migration_delay();
+        return 0;
+
+    case XEN_SYSCTL_pm_op_get_max_cstate:
+        if ( op->cpuid == 0 )
+            op->u.get_max_cstate = acpi_get_cstate_limit();
+        else if ( op->cpuid == 1 )
+            op->u.get_max_cstate = acpi_get_csubstate_limit();
+        else
+            ret = -EINVAL;
+        return ret;
+
+    case XEN_SYSCTL_pm_op_set_max_cstate:
+        if ( op->cpuid == 0 )
+            acpi_set_cstate_limit(op->u.set_max_cstate);
+        else if ( op->cpuid == 1 )
+            acpi_set_csubstate_limit(op->u.set_max_cstate);
+        else
+            ret = -EINVAL;
+        return ret;
+    }
+
+    if ( op->cpuid >= nr_cpu_ids || !cpu_online(op->cpuid) )
         return -EINVAL;
     pmpt = processor_pminfo[op->cpuid];
 
@@ -440,41 +484,6 @@ int do_pm_op(struct xen_sysctl_pm_op *op
         break;
     }
 
-    case XEN_SYSCTL_pm_op_set_sched_opt_smt:
-    {
-        uint32_t saved_value;
-
-        saved_value = sched_smt_power_savings;
-        sched_smt_power_savings = !!op->u.set_sched_opt_smt;
-        op->u.set_sched_opt_smt = saved_value;
-
-        break;
-    }
-
-    case XEN_SYSCTL_pm_op_set_vcpu_migration_delay:
-    {
-        set_vcpu_migration_delay(op->u.set_vcpu_migration_delay);
-        break;
-    }
-
-    case XEN_SYSCTL_pm_op_get_vcpu_migration_delay:
-    {
-        op->u.get_vcpu_migration_delay = get_vcpu_migration_delay();
-        break;
-    }
-
-    case XEN_SYSCTL_pm_op_get_max_cstate:
-    {
-        op->u.get_max_cstate = acpi_get_cstate_limit();
-        break;
-    }
-
-    case XEN_SYSCTL_pm_op_set_max_cstate:
-    {
-        acpi_set_cstate_limit(op->u.set_max_cstate);
-        break;
-    }
-
     case XEN_SYSCTL_pm_op_enable_turbo:
     {
         ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_ENABLED);
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -354,7 +354,11 @@ struct xen_sysctl_pm_op {
     /* set/reset scheduler power saving option */
     #define XEN_SYSCTL_pm_op_set_sched_opt_smt    0x21
 
-    /* cpuidle max_cstate access command */
+    /*
+     * cpuidle max C-state and max C-sub-state access command:
+     * Set cpuid to 0 for max C-state.
+     * Set cpuid to 1 for max C-sub-state.
+     */
     #define XEN_SYSCTL_pm_op_get_max_cstate       0x22
     #define XEN_SYSCTL_pm_op_set_max_cstate       0x23
 
--- a/xen/include/xen/acpi.h
+++ b/xen/include/xen/acpi.h
@@ -149,9 +149,22 @@ static inline void acpi_set_cstate_limit
 	max_cstate = new_limit;
 	return;
 }
+
+static inline unsigned int acpi_get_csubstate_limit(void)
+{
+	return max_csubstate;
+}
+
+static inline void acpi_set_csubstate_limit(unsigned int new_limit)
+{
+	max_csubstate = new_limit;
+}
+
 #else
 static inline unsigned int acpi_get_cstate_limit(void) { return 0; }
 static inline void acpi_set_cstate_limit(unsigned int new_limit) { return; }
+static inline unsigned int acpi_get_csubstate_limit(void) { return 0; }
+static inline void acpi_set_csubstate_limit(unsigned int new_limit) { return; }
 #endif
 
 #ifdef XEN_GUEST_HANDLE_PARAM



[-- Attachment #2: libxc-allow-controlling-C-sub-state.patch --]
[-- Type: text/plain, Size: 6851 bytes --]

tools/libxc: allow controlling the max C-state sub-state

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>

Make handling in do_pm_op() more homogeneous: Before interpreting
op->cpuid as such, handle all operations not acting on a particular
CPU.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/tools/libxc/xc_pm.c
+++ b/tools/libxc/xc_pm.c
@@ -386,7 +386,7 @@ int xc_get_vcpu_migration_delay(xc_inter
    return rc;
 }
 
-int xc_get_cpuidle_max_cstate(xc_interface *xch, uint32_t *value)
+static int get_max_cstate(xc_interface *xch, uint32_t *value, uint32_t type)
 {
     int rc;
     DECLARE_SYSCTL;
@@ -396,7 +396,7 @@ int xc_get_cpuidle_max_cstate(xc_interfa
 
     sysctl.cmd = XEN_SYSCTL_pm_op;
     sysctl.u.pm_op.cmd = XEN_SYSCTL_pm_op_get_max_cstate;
-    sysctl.u.pm_op.cpuid = 0;
+    sysctl.u.pm_op.cpuid = type;
     sysctl.u.pm_op.u.get_max_cstate = 0;
     rc = do_sysctl(xch, &sysctl);
     *value = sysctl.u.pm_op.u.get_max_cstate;
@@ -404,7 +404,17 @@ int xc_get_cpuidle_max_cstate(xc_interfa
     return rc;
 }
 
-int xc_set_cpuidle_max_cstate(xc_interface *xch, uint32_t value)
+int xc_get_cpuidle_max_cstate(xc_interface *xch, uint32_t *value)
+{
+    return get_max_cstate(xch, value, 0);
+}
+
+int xc_get_cpuidle_max_csubstate(xc_interface *xch, uint32_t *value)
+{
+    return get_max_cstate(xch, value, 1);
+}
+
+static int set_max_cstate(xc_interface *xch, uint32_t value, uint32_t type)
 {
     DECLARE_SYSCTL;
 
@@ -413,12 +423,22 @@ int xc_set_cpuidle_max_cstate(xc_interfa
 
     sysctl.cmd = XEN_SYSCTL_pm_op;
     sysctl.u.pm_op.cmd = XEN_SYSCTL_pm_op_set_max_cstate;
-    sysctl.u.pm_op.cpuid = 0;
+    sysctl.u.pm_op.cpuid = type;
     sysctl.u.pm_op.u.set_max_cstate = value;
 
     return do_sysctl(xch, &sysctl);
 }
 
+int xc_set_cpuidle_max_cstate(xc_interface *xch, uint32_t value)
+{
+    return set_max_cstate(xch, value, 0);
+}
+
+int xc_set_cpuidle_max_csubstate(xc_interface *xch, uint32_t value)
+{
+    return set_max_cstate(xch, value, 1);
+}
+
 int xc_enable_turbo(xc_interface *xch, int cpuid)
 {
     DECLARE_SYSCTL;
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -2159,6 +2159,9 @@ int xc_get_vcpu_migration_delay(xc_inter
 int xc_get_cpuidle_max_cstate(xc_interface *xch, uint32_t *value);
 int xc_set_cpuidle_max_cstate(xc_interface *xch, uint32_t value);
 
+int xc_get_cpuidle_max_csubstate(xc_interface *xch, uint32_t *value);
+int xc_set_cpuidle_max_csubstate(xc_interface *xch, uint32_t value);
+
 int xc_enable_turbo(xc_interface *xch, int cpuid);
 int xc_disable_turbo(xc_interface *xch, int cpuid);
 /**
--- a/xen/drivers/acpi/pmstat.c
+++ b/xen/drivers/acpi/pmstat.c
@@ -400,7 +400,51 @@ int do_pm_op(struct xen_sysctl_pm_op *op
     int ret = 0;
     const struct processor_pminfo *pmpt;
 
-    if ( !op || op->cpuid >= nr_cpu_ids || !cpu_online(op->cpuid) )
+    switch ( op->cmd )
+    {
+    case XEN_SYSCTL_pm_op_set_sched_opt_smt:
+    {
+        uint32_t saved_value = sched_smt_power_savings;
+
+        if ( op->cpuid != 0 )
+            return -EINVAL;
+        sched_smt_power_savings = !!op->u.set_sched_opt_smt;
+        op->u.set_sched_opt_smt = saved_value;
+        return 0;
+    }
+
+    case XEN_SYSCTL_pm_op_set_vcpu_migration_delay:
+        if ( op->cpuid != 0 )
+            return -EINVAL;
+        set_vcpu_migration_delay(op->u.set_vcpu_migration_delay);
+        return 0;
+
+    case XEN_SYSCTL_pm_op_get_vcpu_migration_delay:
+        if ( op->cpuid != 0 )
+            return -EINVAL;
+        op->u.get_vcpu_migration_delay = get_vcpu_migration_delay();
+        return 0;
+
+    case XEN_SYSCTL_pm_op_get_max_cstate:
+        if ( op->cpuid == 0 )
+            op->u.get_max_cstate = acpi_get_cstate_limit();
+        else if ( op->cpuid == 1 )
+            op->u.get_max_cstate = acpi_get_csubstate_limit();
+        else
+            ret = -EINVAL;
+        return ret;
+
+    case XEN_SYSCTL_pm_op_set_max_cstate:
+        if ( op->cpuid == 0 )
+            acpi_set_cstate_limit(op->u.set_max_cstate);
+        else if ( op->cpuid == 1 )
+            acpi_set_csubstate_limit(op->u.set_max_cstate);
+        else
+            ret = -EINVAL;
+        return ret;
+    }
+
+    if ( op->cpuid >= nr_cpu_ids || !cpu_online(op->cpuid) )
         return -EINVAL;
     pmpt = processor_pminfo[op->cpuid];
 
@@ -440,41 +484,6 @@ int do_pm_op(struct xen_sysctl_pm_op *op
         break;
     }
 
-    case XEN_SYSCTL_pm_op_set_sched_opt_smt:
-    {
-        uint32_t saved_value;
-
-        saved_value = sched_smt_power_savings;
-        sched_smt_power_savings = !!op->u.set_sched_opt_smt;
-        op->u.set_sched_opt_smt = saved_value;
-
-        break;
-    }
-
-    case XEN_SYSCTL_pm_op_set_vcpu_migration_delay:
-    {
-        set_vcpu_migration_delay(op->u.set_vcpu_migration_delay);
-        break;
-    }
-
-    case XEN_SYSCTL_pm_op_get_vcpu_migration_delay:
-    {
-        op->u.get_vcpu_migration_delay = get_vcpu_migration_delay();
-        break;
-    }
-
-    case XEN_SYSCTL_pm_op_get_max_cstate:
-    {
-        op->u.get_max_cstate = acpi_get_cstate_limit();
-        break;
-    }
-
-    case XEN_SYSCTL_pm_op_set_max_cstate:
-    {
-        acpi_set_cstate_limit(op->u.set_max_cstate);
-        break;
-    }
-
     case XEN_SYSCTL_pm_op_enable_turbo:
     {
         ret = cpufreq_update_turbo(op->cpuid, CPUFREQ_TURBO_ENABLED);
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -354,7 +354,11 @@ struct xen_sysctl_pm_op {
     /* set/reset scheduler power saving option */
     #define XEN_SYSCTL_pm_op_set_sched_opt_smt    0x21
 
-    /* cpuidle max_cstate access command */
+    /*
+     * cpuidle max C-state and max C-sub-state access command:
+     * Set cpuid to 0 for max C-state.
+     * Set cpuid to 1 for max C-sub-state.
+     */
     #define XEN_SYSCTL_pm_op_get_max_cstate       0x22
     #define XEN_SYSCTL_pm_op_set_max_cstate       0x23
 
--- a/xen/include/xen/acpi.h
+++ b/xen/include/xen/acpi.h
@@ -149,9 +149,22 @@ static inline void acpi_set_cstate_limit
 	max_cstate = new_limit;
 	return;
 }
+
+static inline unsigned int acpi_get_csubstate_limit(void)
+{
+	return max_csubstate;
+}
+
+static inline void acpi_set_csubstate_limit(unsigned int new_limit)
+{
+	max_csubstate = new_limit;
+}
+
 #else
 static inline unsigned int acpi_get_cstate_limit(void) { return 0; }
 static inline void acpi_set_cstate_limit(unsigned int new_limit) { return; }
+static inline unsigned int acpi_get_csubstate_limit(void) { return 0; }
+static inline void acpi_set_csubstate_limit(unsigned int new_limit) { return; }
 #endif
 
 #ifdef XEN_GUEST_HANDLE_PARAM

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3A 2/3] tools/libxc: allow controlling the max C-state sub-state
  2014-06-23 15:43   ` [PATCH v3A 2/3] tools/libxc: allow " Jan Beulich
@ 2014-06-23 16:00     ` Ross Lagerwall
  2014-06-27 15:02     ` Ian Campbell
  1 sibling, 0 replies; 16+ messages in thread
From: Ross Lagerwall @ 2014-06-23 16:00 UTC (permalink / raw)
  To: Jan Beulich, Xen-devel; +Cc: Ian Jackson, Ian Campbell, Stefano Stabellini

On 06/23/2014 04:43 PM, Jan Beulich wrote:
> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
>
> Make handling in do_pm_op() more homogeneous: Before interpreting
> op->cpuid as such, handle all operations not acting on a particular
> CPU.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>

Looks good.

-- 
Ross Lagerwall

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 1/3] x86: Allow limiting the max C-state sub-state
  2014-06-23 11:09 ` [PATCH v3 1/3] x86: Allow limiting " Ross Lagerwall
@ 2014-06-25 12:37   ` Jan Beulich
  2014-06-25 15:52     ` Ross Lagerwall
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2014-06-25 12:37 UTC (permalink / raw)
  To: Ross Lagerwall; +Cc: Liu Jinsong, Xen-devel

>>> On 23.06.14 at 13:09, <ross.lagerwall@citrix.com> wrote:
> Allow limiting the max C-state sub-state by appending to the max_cstate
> command-line parameter. E.g. max_cstate=1,0
> The limit only applies to the highest legal C-state. For example:
>  max_cstate = 1, max_csubstate = 0 ==> C0, C1 okay, but not C1E
>  max_cstate = 1, max_csubstate = 1 ==> C0, C1 and C1E okay, but not C2
>  max_cstate = 2, max_csubstate = 0 ==> C0, C1, C1E, C2 okay, but not C3
>  max_cstate = 2, max_csubstate = 1 ==> C0, C1, C1E, C2 okay, but not C3

While from an abstract perspective this looks okay to me now, I'm
afraid the description, which is also being put into the header file, is
possibly misleading: Neither is the first sub-state of C1 necessarily
C1E, nor is it excluded that C2 and higher also have sub-states (yet
the last of the examples sort of suggests that).

> --- a/xen/arch/x86/cpu/mwait-idle.c
> +++ b/xen/arch/x86/cpu/mwait-idle.c
> @@ -330,7 +330,9 @@ static void mwait_idle(void)
>  	    (next_state = cpuidle_current_governor->select(power)) > 0) {
>  		do {
>  			cx = &power->states[next_state];
> -		} while (cx->type > max_cstate && --next_state);
> +		} while ((cx->type > max_cstate || (cx->type == max_cstate &&
> +			  MWAIT_HINT2SUBSTATE(cx->address) > max_csubstate)) &&
> +			 --next_state);

In the context of the above comment it then is questionable
whether here (and similarly in acpi_processor_idle()) using the
MWAIT parameter value for the comparison here is really
suitable: If you look at hsw_cstates[] and atom_cstates[] you'll
see that there we have states with just a single non-zero sub-
state (which the logic here would exclude in certain cases when
one would expect it to be permitted).

Jan

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 1/3] x86: Allow limiting the max C-state sub-state
  2014-06-25 12:37   ` Jan Beulich
@ 2014-06-25 15:52     ` Ross Lagerwall
  2014-06-26 13:38       ` Jan Beulich
  0 siblings, 1 reply; 16+ messages in thread
From: Ross Lagerwall @ 2014-06-25 15:52 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Liu Jinsong, Xen-devel

On 06/25/2014 01:37 PM, Jan Beulich wrote:
>>>> On 23.06.14 at 13:09, <ross.lagerwall@citrix.com> wrote:
>> Allow limiting the max C-state sub-state by appending to the max_cstate
>> command-line parameter. E.g. max_cstate=1,0
>> The limit only applies to the highest legal C-state. For example:
>>   max_cstate = 1, max_csubstate = 0 ==> C0, C1 okay, but not C1E
>>   max_cstate = 1, max_csubstate = 1 ==> C0, C1 and C1E okay, but not C2
>>   max_cstate = 2, max_csubstate = 0 ==> C0, C1, C1E, C2 okay, but not C3
>>   max_cstate = 2, max_csubstate = 1 ==> C0, C1, C1E, C2 okay, but not C3
>
> While from an abstract perspective this looks okay to me now, I'm
> afraid the description, which is also being put into the header file, is
> possibly misleading: Neither is the first sub-state of C1 necessarily
> C1E, nor is it excluded that C2 and higher also have sub-states (yet
> the last of the examples sort of suggests that).

The comment was meant to clarify how max_cstate and max_csubstate work 
by means of an example from a real machine.  I don't think it suggests 
that the C-states used in the example are necessarily what one would 
find on a real machine.  I could make the example more abstract, but I 
don't think that would be helpful.

>
>> --- a/xen/arch/x86/cpu/mwait-idle.c
>> +++ b/xen/arch/x86/cpu/mwait-idle.c
>> @@ -330,7 +330,9 @@ static void mwait_idle(void)
>>   	    (next_state = cpuidle_current_governor->select(power)) > 0) {
>>   		do {
>>   			cx = &power->states[next_state];
>> -		} while (cx->type > max_cstate && --next_state);
>> +		} while ((cx->type > max_cstate || (cx->type == max_cstate &&
>> +			  MWAIT_HINT2SUBSTATE(cx->address) > max_csubstate)) &&
>> +			 --next_state);
>
> In the context of the above comment it then is questionable
> whether here (and similarly in acpi_processor_idle()) using the
> MWAIT parameter value for the comparison here is really
> suitable: If you look at hsw_cstates[] and atom_cstates[] you'll
> see that there we have states with just a single non-zero sub-
> state (which the logic here would exclude in certain cases when
> one would expect it to be permitted).
>

When would one expect them to be permitted that this logic would exclude?

C7s-HSW has a C-state of 4 and a sub-state of 2.  If you set max_cstate 
= 4, then no C-state > 4 will be selected.
Similarly, if you select max_csubstate = 2, then no sub C-state > 2 will 
be selected (if max_cstate = 4). This seems congruous to me.

Regards
-- 
Ross Lagerwall

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 1/3] x86: Allow limiting the max C-state sub-state
  2014-06-25 15:52     ` Ross Lagerwall
@ 2014-06-26 13:38       ` Jan Beulich
  2014-07-07 15:14         ` Ross Lagerwall
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2014-06-26 13:38 UTC (permalink / raw)
  To: Ross Lagerwall; +Cc: Liu Jinsong, Xen-devel

>>> On 25.06.14 at 17:52, <ross.lagerwall@citrix.com> wrote:
> On 06/25/2014 01:37 PM, Jan Beulich wrote:
>>>>> On 23.06.14 at 13:09, <ross.lagerwall@citrix.com> wrote:
>>> --- a/xen/arch/x86/cpu/mwait-idle.c
>>> +++ b/xen/arch/x86/cpu/mwait-idle.c
>>> @@ -330,7 +330,9 @@ static void mwait_idle(void)
>>>   	    (next_state = cpuidle_current_governor->select(power)) > 0) {
>>>   		do {
>>>   			cx = &power->states[next_state];
>>> -		} while (cx->type > max_cstate && --next_state);
>>> +		} while ((cx->type > max_cstate || (cx->type == max_cstate &&
>>> +			  MWAIT_HINT2SUBSTATE(cx->address) > max_csubstate)) &&
>>> +			 --next_state);
>>
>> In the context of the above comment it then is questionable
>> whether here (and similarly in acpi_processor_idle()) using the
>> MWAIT parameter value for the comparison here is really
>> suitable: If you look at hsw_cstates[] and atom_cstates[] you'll
>> see that there we have states with just a single non-zero sub-
>> state (which the logic here would exclude in certain cases when
>> one would expect it to be permitted).
>>
> 
> When would one expect them to be permitted that this logic would exclude?
> 
> C7s-HSW has a C-state of 4 and a sub-state of 2.  If you set max_cstate 
> = 4, then no C-state > 4 will be selected.
> Similarly, if you select max_csubstate = 2, then no sub C-state > 2 will 
> be selected (if max_cstate = 4). This seems congruous to me.

Actually I think I got it the wrong way round: Taking your example, if
one sets max_csubstate = 1, one may expect the 1st (not necessarily
the one numbered "1") to still be permitted (much like we also don't tie
max_cstate to actual values to be passed to MWAIT). Just look at how
much more interesting the sub-states get with the ports from recent
Linux that I posted earlier today.

Jan

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3A 2/3] tools/libxc: allow controlling the max C-state sub-state
  2014-06-23 15:43   ` [PATCH v3A 2/3] tools/libxc: allow " Jan Beulich
  2014-06-23 16:00     ` Ross Lagerwall
@ 2014-06-27 15:02     ` Ian Campbell
  2014-06-27 15:25       ` Jan Beulich
  1 sibling, 1 reply; 16+ messages in thread
From: Ian Campbell @ 2014-06-27 15:02 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Ross Lagerwall, Stefano Stabellini, Ian Jackson, Xen-devel

On Mon, 2014-06-23 at 16:43 +0100, Jan Beulich wrote:
> --- a/xen/include/public/sysctl.h
> +++ b/xen/include/public/sysctl.h
> @@ -354,7 +354,11 @@ struct xen_sysctl_pm_op {
>      /* set/reset scheduler power saving option */
>      #define XEN_SYSCTL_pm_op_set_sched_opt_smt    0x21
>  
> -    /* cpuidle max_cstate access command */
> +    /*
> +     * cpuidle max C-state and max C-sub-state access command:
> +     * Set cpuid to 0 for max C-state.
> +     * Set cpuid to 1 for max C-sub-state.

This seems pretty nasty. Since the sysctl interface is not fixed can't
we just turn the existing get/set_max_cstate fields into structs? Or
(less preferably) define C-sub-states to have bit 31 set.

Can we at least get some named constants instead of 0 and 1?

Ian.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 3/3] xenpm: Allow controlling the max C-state sub-state
  2014-06-23 11:09 ` [PATCH v3 3/3] xenpm: Allow " Ross Lagerwall
@ 2014-06-27 15:03   ` Ian Campbell
  0 siblings, 0 replies; 16+ messages in thread
From: Ian Campbell @ 2014-06-27 15:03 UTC (permalink / raw)
  To: Ross Lagerwall; +Cc: Stefano Stabellini, Ian Jackson, Xen-devel

On Mon, 2014-06-23 at 12:09 +0100, Ross Lagerwall wrote:
> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
> CC: Ian Jackson <ian.jackson@eu.citrix.com>
> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3A 2/3] tools/libxc: allow controlling the max C-state sub-state
  2014-06-27 15:02     ` Ian Campbell
@ 2014-06-27 15:25       ` Jan Beulich
  2014-06-27 15:26         ` Ian Campbell
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2014-06-27 15:25 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Ross Lagerwall, Xen-devel, Ian Jackson, StefanoStabellini

>>> On 27.06.14 at 17:02, <Ian.Campbell@citrix.com> wrote:
> On Mon, 2014-06-23 at 16:43 +0100, Jan Beulich wrote:
>> --- a/xen/include/public/sysctl.h
>> +++ b/xen/include/public/sysctl.h
>> @@ -354,7 +354,11 @@ struct xen_sysctl_pm_op {
>>      /* set/reset scheduler power saving option */
>>      #define XEN_SYSCTL_pm_op_set_sched_opt_smt    0x21
>>  
>> -    /* cpuidle max_cstate access command */
>> +    /*
>> +     * cpuidle max C-state and max C-sub-state access command:
>> +     * Set cpuid to 0 for max C-state.
>> +     * Set cpuid to 1 for max C-sub-state.
> 
> This seems pretty nasty. Since the sysctl interface is not fixed can't
> we just turn the existing get/set_max_cstate fields into structs? Or
> (less preferably) define C-sub-states to have bit 31 set.
> 
> Can we at least get some named constants instead of 0 and 1?

Any/all of this would be fine with me; what I suggested was merely
an approach with as little changes as possible for an interface that
wasn't defined/implemented well in the first place.

Jan

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3A 2/3] tools/libxc: allow controlling the max C-state sub-state
  2014-06-27 15:25       ` Jan Beulich
@ 2014-06-27 15:26         ` Ian Campbell
  0 siblings, 0 replies; 16+ messages in thread
From: Ian Campbell @ 2014-06-27 15:26 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Ross Lagerwall, Xen-devel, Ian Jackson, StefanoStabellini

On Fri, 2014-06-27 at 16:25 +0100, Jan Beulich wrote:
> >>> On 27.06.14 at 17:02, <Ian.Campbell@citrix.com> wrote:
> > On Mon, 2014-06-23 at 16:43 +0100, Jan Beulich wrote:
> >> --- a/xen/include/public/sysctl.h
> >> +++ b/xen/include/public/sysctl.h
> >> @@ -354,7 +354,11 @@ struct xen_sysctl_pm_op {
> >>      /* set/reset scheduler power saving option */
> >>      #define XEN_SYSCTL_pm_op_set_sched_opt_smt    0x21
> >>  
> >> -    /* cpuidle max_cstate access command */
> >> +    /*
> >> +     * cpuidle max C-state and max C-sub-state access command:
> >> +     * Set cpuid to 0 for max C-state.
> >> +     * Set cpuid to 1 for max C-sub-state.
> > 
> > This seems pretty nasty. Since the sysctl interface is not fixed can't
> > we just turn the existing get/set_max_cstate fields into structs? Or
> > (less preferably) define C-sub-states to have bit 31 set.
> > 
> > Can we at least get some named constants instead of 0 and 1?
> 
> Any/all of this would be fine with me; what I suggested was merely
> an approach with as little changes as possible for an interface that
> wasn't defined/implemented well in the first place.

If the interface wasn't well defined/implemented in the first place
shouldn't we make it well defined/implemented rather than worry about
minimising the changes?

Ian.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 1/3] x86: Allow limiting the max C-state sub-state
  2014-06-26 13:38       ` Jan Beulich
@ 2014-07-07 15:14         ` Ross Lagerwall
  2014-07-23  7:34           ` Jan Beulich
  0 siblings, 1 reply; 16+ messages in thread
From: Ross Lagerwall @ 2014-07-07 15:14 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Liu Jinsong, Xen-devel

On 06/26/2014 02:38 PM, Jan Beulich wrote:
>>>> On 25.06.14 at 17:52, <ross.lagerwall@citrix.com> wrote:
>> On 06/25/2014 01:37 PM, Jan Beulich wrote:
>>>>>> On 23.06.14 at 13:09, <ross.lagerwall@citrix.com> wrote:
>>>> --- a/xen/arch/x86/cpu/mwait-idle.c
>>>> +++ b/xen/arch/x86/cpu/mwait-idle.c
>>>> @@ -330,7 +330,9 @@ static void mwait_idle(void)
>>>>    	    (next_state = cpuidle_current_governor->select(power)) > 0) {
>>>>    		do {
>>>>    			cx = &power->states[next_state];
>>>> -		} while (cx->type > max_cstate && --next_state);
>>>> +		} while ((cx->type > max_cstate || (cx->type == max_cstate &&
>>>> +			  MWAIT_HINT2SUBSTATE(cx->address) > max_csubstate)) &&
>>>> +			 --next_state);
>>>
>>> In the context of the above comment it then is questionable
>>> whether here (and similarly in acpi_processor_idle()) using the
>>> MWAIT parameter value for the comparison here is really
>>> suitable: If you look at hsw_cstates[] and atom_cstates[] you'll
>>> see that there we have states with just a single non-zero sub-
>>> state (which the logic here would exclude in certain cases when
>>> one would expect it to be permitted).
>>>
>>
>> When would one expect them to be permitted that this logic would exclude?
>>
>> C7s-HSW has a C-state of 4 and a sub-state of 2.  If you set max_cstate
>> = 4, then no C-state > 4 will be selected.
>> Similarly, if you select max_csubstate = 2, then no sub C-state > 2 will
>> be selected (if max_cstate = 4). This seems congruous to me.
>
> Actually I think I got it the wrong way round: Taking your example, if
> one sets max_csubstate = 1, one may expect the 1st (not necessarily
> the one numbered "1") to still be permitted (much like we also don't tie
> max_cstate to actual values to be passed to MWAIT). Just look at how
> much more interesting the sub-states get with the ports from recent
> Linux that I posted earlier today.
>

AFAICT, max_cstate _is_ tied to the actual values passed to MWAIT. For 
example, take Haswell's C9-HSW state: it has an MWAIT flag value of 0x50 
which gets assigned as cx->type = 6 and subsequently compared with 
max_cstate.

max_csubstate is implemented in the same way as max_cstate.  However, 
given the states in the Bay Trail patch (specifically C6N-BYT and 
C6S-BYT), this is clearly insufficient.
How do you think max_csubstate should limit the substate? Should it be 
something like a max_csubstate = 2 permits the first and second substates?

Regards
-- 
Ross Lagerwall

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 1/3] x86: Allow limiting the max C-state sub-state
  2014-07-07 15:14         ` Ross Lagerwall
@ 2014-07-23  7:34           ` Jan Beulich
  2014-07-23 12:56             ` Ross Lagerwall
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Beulich @ 2014-07-23  7:34 UTC (permalink / raw)
  To: Ross Lagerwall; +Cc: Liu Jinsong, Xen-devel

>>> On 07.07.14 at 17:14, <ross.lagerwall@citrix.com> wrote:
> max_csubstate is implemented in the same way as max_cstate.  However, 
> given the states in the Bay Trail patch (specifically C6N-BYT and 
> C6S-BYT), this is clearly insufficient.
> How do you think max_csubstate should limit the substate? Should it be 
> something like a max_csubstate = 2 permits the first and second substates?

It's not at all clear to me how to properly translate the CPU internal
state identifiers to/from command line option values, in a forward
compatible manner.

Jan

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 1/3] x86: Allow limiting the max C-state sub-state
  2014-07-23  7:34           ` Jan Beulich
@ 2014-07-23 12:56             ` Ross Lagerwall
  0 siblings, 0 replies; 16+ messages in thread
From: Ross Lagerwall @ 2014-07-23 12:56 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Liu Jinsong, Xen-devel

On 07/23/2014 08:34 AM, Jan Beulich wrote:
>>>> On 07.07.14 at 17:14, <ross.lagerwall@citrix.com> wrote:
>> max_csubstate is implemented in the same way as max_cstate.  However,
>> given the states in the Bay Trail patch (specifically C6N-BYT and
>> C6S-BYT), this is clearly insufficient.
>> How do you think max_csubstate should limit the substate? Should it be
>> something like a max_csubstate = 2 permits the first and second substates?
>
> It's not at all clear to me how to properly translate the CPU internal
> state identifiers to/from command line option values, in a forward
> compatible manner.
>

Well, given that Xen treats the cpuidle_state array as a linear sequence 
of states, and they are ordered by exit latency/target residency, one 
approach is to have the command-line parameters control how deep down 
into the state array the processor is allowed to travel, as was my first 
approach. I know you rejected this before because it was 
hardware-specific and broke the notion of a C-state but I still feel 
it's the simplest solution given that any other way of mapping to CPU 
internal states is likely to be more complex and even more hardware 
dependent...

Regards
-- 
Ross Lagerwall

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2014-07-23 12:56 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-06-23 11:09 [PATCH v3 0/3] Support controlling the max C-state sub-state Ross Lagerwall
2014-06-23 11:09 ` [PATCH v3 1/3] x86: Allow limiting " Ross Lagerwall
2014-06-25 12:37   ` Jan Beulich
2014-06-25 15:52     ` Ross Lagerwall
2014-06-26 13:38       ` Jan Beulich
2014-07-07 15:14         ` Ross Lagerwall
2014-07-23  7:34           ` Jan Beulich
2014-07-23 12:56             ` Ross Lagerwall
2014-06-23 11:09 ` [PATCH v3 2/3] tools/libxc: Alow controlling " Ross Lagerwall
2014-06-23 15:43   ` [PATCH v3A 2/3] tools/libxc: allow " Jan Beulich
2014-06-23 16:00     ` Ross Lagerwall
2014-06-27 15:02     ` Ian Campbell
2014-06-27 15:25       ` Jan Beulich
2014-06-27 15:26         ` Ian Campbell
2014-06-23 11:09 ` [PATCH v3 3/3] xenpm: Allow " Ross Lagerwall
2014-06-27 15:03   ` Ian Campbell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).