xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 0/9] Porting the intel_pstate driver to Xen
@ 2015-09-14  2:32 Wei Wang
  2015-09-14  2:32 ` [PATCH v5 1/9] x86/intel_pstate: add some calculation related support Wei Wang
                   ` (8 more replies)
  0 siblings, 9 replies; 44+ messages in thread
From: Wei Wang @ 2015-09-14  2:32 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, xen-devel; +Cc: Wei Wang

v5 changes:
We have made various changes in this version, including introducing new
data structures, coding styles changes etc. Please see each patch's commit
message for change details.

v4 changes:
1) introduce a new struct, internal_governor, to "cpufreq_policy";
2) add a new header file, xen/include/asm-x86/cpufreq.h;
3) remove the APERF/MPERF feature detection code in cpufreq.c and powernow.c;
4) coding style changes.

Please check each patch's commit message for details.

v3 Changes:
1) coding style changes based on Jan's comments;
2) remove the function - unregister_cpu_notifier();
3) solve a bug in the CPU offline code (Patch 0007);
4) move the perf_limits struct into the per-CPU policy struct, so that
each CPU can be managed individually;
5) "load_intel_pstate" is changed local to the intel_pstate.c file, and
add its description to the xen-command-line.markdown.

v2 Changes:
1) The intel_pstate driver can be controlled via two ways:
A. min_perf_pct and max_perf_pct
   The user directly adjusts min_perf_pct and max_perf_pct to get what 
   they want. For example, if min_perf_pct=max_perf_pct=60%, then the 
   user is asking for something similar to a userspace governor with 
   setting the requested performance=60%.
B. set-scaling-governor
   This one is functionally redundant, since A. can achieve all the
   governor functions. It is remained to give people time to get
   familiar with method A.
   Users can choose from the four governors: Powersave, Ondemand,
   Powersave, Performance. The driver achieves the functionality of 
   the selected governor via adjusting the min_perf_pct and max_perf_pct
   itself.
2) The xenpm "get-cpufreq-para" displays the following things:
cpu id               : 10
affected_cpus        : 10
cpuinfo frequency    : max [3700000] min [1200000] cur [1400000]
scaling_driver       : intel_pstate
scaling_avail_gov    : performance powersave userspace ondemand
current_governor     : ondemand
max_perf_pct         : 100
min_perf_pct         : 32
turbo_pct            : 54
turbo mode           : enabled
3) Changed "intel_pstate=disable" to "intel_pstate=enable". 
If "intel_pstate=enable" is added, but the CPU does not support the
intel_pstate driver, the old P-state driver (acpi-cpufreq) will be loaded.
4) Moved the declarations under xen/include/acpi to an x86-specific header.

v1:
This patch series ports the intel_pstate driver from the Linux kernel to
Xen. The intel_pstate driver is used to tune P states for SandyBridge+
processors. It needs to be enabled by adding "intel_pstate=enable" to the
booting parameter list.

The intel_pstate.c file under xen/arch/x86/acpi/cpufreq/
contains all the logic for selecting the current P-state. It follows its
implementation in the kernel. In order to better support future Intel CPUs
(e.g. the HWP feature on Skylake+), intel_pstate changes to tune P-state
based on percentage values.

The xenpm tool is also upgraded to support the intel_pstate driver. If
intel_pstate is used, "get-cpufreq-para" displays percentage value based
feedback. If the intel_pstate driver is not enabled, xenpm will work in
the old style.

Wei Wang (9):
  x86/intel_pstate: add some calculation related support
  x86/intel_pstate: APERF/MPERF feature detect
  x86/intel_pstate: add a new driver interface, setpolicy()
  x86/intel_pstate: relocate the driver register function
  x86/intel_pstate: changes in cpufreq_del_cpu for CPU offline
  x86/intel_pstate: the main boby of the intel_pstate driver
  x86/intel_pstate: add a booting param to select the driver to load
  x86/intel_pstate: support the use of intel_pstate in pmstat.c
  tools: enable xenpm to control the intel_pstate driver

 docs/misc/xen-command-line.markdown      |   7 +
 tools/libxc/include/xenctrl.h            |  21 +-
 tools/libxc/xc_pm.c                      |  16 +-
 tools/misc/xenpm.c                       | 116 +++-
 xen/arch/x86/acpi/cpufreq/Makefile       |   1 +
 xen/arch/x86/acpi/cpufreq/cpufreq.c      |  15 +-
 xen/arch/x86/acpi/cpufreq/intel_pstate.c | 884 +++++++++++++++++++++++++++++++
 xen/arch/x86/acpi/cpufreq/powernow.c     |   6 +-
 xen/arch/x86/cpu/common.c                |   4 +
 xen/arch/x86/oprofile/op_model_athlon.c  |   9 -
 xen/drivers/acpi/pmstat.c                | 179 ++++++-
 xen/drivers/cpufreq/cpufreq.c            |  21 +-
 xen/drivers/cpufreq/utility.c            |   3 +
 xen/include/acpi/cpufreq/cpufreq.h       |  54 +-
 xen/include/asm-x86/cpufeature.h         |   5 +
 xen/include/asm-x86/cpufreq.h            |  34 ++
 xen/include/asm-x86/div64.h              |  79 +++
 xen/include/asm-x86/msr-index.h          |   3 +
 xen/include/public/sysctl.h              |  29 +-
 xen/include/xen/kernel.h                 |  23 +
 20 files changed, 1384 insertions(+), 125 deletions(-)
 create mode 100644 xen/arch/x86/acpi/cpufreq/intel_pstate.c
 create mode 100644 xen/include/asm-x86/cpufreq.h

-- 
1.9.1

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v5 1/9] x86/intel_pstate: add some calculation related support
  2015-09-14  2:32 [PATCH v5 0/9] Porting the intel_pstate driver to Xen Wei Wang
@ 2015-09-14  2:32 ` Wei Wang
  2015-09-17 15:19   ` Andrew Cooper
  2015-09-14  2:32 ` [PATCH v5 2/9] x86/intel_pstate: APERF/MPERF feature detect Wei Wang
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 44+ messages in thread
From: Wei Wang @ 2015-09-14  2:32 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, xen-devel; +Cc: Wei Wang

The added calculation related functions will be used in the intel_pstate.c.
They are copied from the Linux kernel(commit 2418f4f2, f3002134, eb18cba7).

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
---
 xen/arch/x86/oprofile/op_model_athlon.c |  9 ----
 xen/include/asm-x86/div64.h             | 79 +++++++++++++++++++++++++++++++++
 xen/include/xen/kernel.h                | 23 ++++++++++
 3 files changed, 102 insertions(+), 9 deletions(-)

 changes in v5:
 1) add clamp(), a type checking variant of clamp_t();
 2) remove the private copy of clamp() in op_model_athlon.c.

diff --git a/xen/arch/x86/oprofile/op_model_athlon.c b/xen/arch/x86/oprofile/op_model_athlon.c
index c0a81ed..4122eee 100644
--- a/xen/arch/x86/oprofile/op_model_athlon.c
+++ b/xen/arch/x86/oprofile/op_model_athlon.c
@@ -103,15 +103,6 @@ static u64 ibs_op_ctl;
 #define IBS_FETCH_CODE                  13
 #define IBS_OP_CODE                     14
 
-#define clamp(val, min, max) ({			\
-	typeof(val) __val = (val);		\
-	typeof(min) __min = (min);		\
-	typeof(max) __max = (max);		\
-	(void) (&__val == &__min);		\
-	(void) (&__val == &__max);		\
-	__val = __val < __min ? __min: __val;	\
-	__val > __max ? __max: __val; })
-
 /*
  * 16-bit Linear Feedback Shift Register (LFSR)
  */
diff --git a/xen/include/asm-x86/div64.h b/xen/include/asm-x86/div64.h
index dd49f64..6ba03cb 100644
--- a/xen/include/asm-x86/div64.h
+++ b/xen/include/asm-x86/div64.h
@@ -11,4 +11,83 @@
     __rem;                                      \
 })
 
+static inline uint64_t div_u64_rem(uint64_t dividend, uint32_t divisor,
+                                      uint32_t *remainder)
+{
+    *remainder = do_div(dividend, divisor);
+    return dividend;
+}
+
+static inline uint64_t div_u64(uint64_t dividend, uint32_t  divisor)
+{
+    uint32_t remainder;
+
+    return div_u64_rem(dividend, divisor, &remainder);
+}
+
+/*
+ * div64_u64 - unsigned 64bit divide with 64bit divisor
+ * @dividend:    64bit dividend
+ * @divisor:    64bit divisor
+ *
+ * This implementation is a modified version of the algorithm proposed
+ * by the book 'Hacker's Delight'.  The original source and full proof
+ * can be found here and is available for use without restriction.
+ *
+ * 'http://www.hackersdelight.org/HDcode/newCode/divDouble.c.txt'
+ */
+static inline uint64_t div64_u64(uint64_t dividend, uint64_t divisor)
+{
+    uint32_t high = divisor >> 32;
+    uint64_t quot;
+
+    if ( high == 0 )
+        quot = div_u64(dividend, divisor);
+    else
+    {
+        int n = 1 + fls(high);
+
+        quot = div_u64(dividend >> n, divisor >> n);
+
+        if ( quot != 0 )
+            quot--;
+        if ( (dividend - quot * divisor) >= divisor )
+            quot++;
+    }
+    return quot;
+}
+
+static inline int64_t div_s64_rem(int64_t dividend, int32_t divisor,
+                                     int32_t *remainder)
+{
+    int64_t quotient;
+
+    if ( dividend < 0 )
+    {
+        quotient = div_u64_rem(-dividend, ABS(divisor),
+                        (uint32_t *)remainder);
+        *remainder = -*remainder;
+        if ( divisor > 0 )
+            quotient = -quotient;
+    }
+    else
+    {
+        quotient = div_u64_rem(dividend, ABS(divisor),
+                        (uint32_t *)remainder);
+        if ( divisor < 0 )
+            quotient = -quotient;
+    }
+    return quotient;
+}
+
+/*
+ * div_s64 - signed 64bit divide with 32bit divisor
+ */
+static inline int64_t div_s64(int64_t dividend, int32_t divisor)
+{
+    int32_t remainder;
+
+    return div_s64_rem(dividend, divisor, &remainder);
+}
+
 #endif
diff --git a/xen/include/xen/kernel.h b/xen/include/xen/kernel.h
index 548b64d..9812698 100644
--- a/xen/include/xen/kernel.h
+++ b/xen/include/xen/kernel.h
@@ -43,6 +43,29 @@
 #define MAX(x,y) ((x) > (y) ? (x) : (y))
 
 /**
+ * clamp - return a value clamped to a given range with strict typechecking
+ * @val: current value
+ * @lo: lowest allowable value
+ * @hi: highest allowable value
+ *
+ * This macro does strict typechecking of lo/hi to make sure they are of the
+ * same type as val.  See the unnecessary pointer comparisons.
+ */
+#define clamp(val, lo, hi) min((typeof(val))max(val, lo), hi)
+
+/*
+ * clamp_t - return a value clamped to a given range using a given type
+ * @type: the type of variable to use
+ * @val: current value
+ * @lo: minimum allowable value
+ * @hi: maximum allowable value
+ *
+ * This macro does no typechecking and uses temporary variables of type
+ * 'type' to make all the comparisons.
+ */
+#define clamp_t(type, val, lo, hi) min_t(type, max_t(type, val, lo), hi)
+
+/**
  * container_of - cast a member of a structure out to the containing structure
  *
  * @ptr:	the pointer to the member.
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 2/9] x86/intel_pstate: APERF/MPERF feature detect
  2015-09-14  2:32 [PATCH v5 0/9] Porting the intel_pstate driver to Xen Wei Wang
  2015-09-14  2:32 ` [PATCH v5 1/9] x86/intel_pstate: add some calculation related support Wei Wang
@ 2015-09-14  2:32 ` Wei Wang
  2015-09-17 15:26   ` Andrew Cooper
  2015-10-05 16:14   ` Jan Beulich
  2015-09-14  2:32 ` [PATCH v5 3/9] x86/intel_pstate: add a new driver interface, setpolicy() Wei Wang
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 44+ messages in thread
From: Wei Wang @ 2015-09-14  2:32 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, xen-devel; +Cc: Wei Wang

Add support to detect the APERF/MPERF feature. Also, remove the identical
code in cpufreq.c and powernow.c. This patch is independent of the
earlier patches.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
---
 xen/arch/x86/acpi/cpufreq/cpufreq.c  | 6 ++----
 xen/arch/x86/acpi/cpufreq/powernow.c | 6 ++----
 xen/arch/x86/cpu/common.c            | 4 ++++
 xen/include/asm-x86/cpufeature.h     | 5 +++++
 4 files changed, 13 insertions(+), 8 deletions(-)

 changes in v5:
 1) define macros for 0x1 and CPUID leaf5;
 2) add a statement stating that this patch is independent of the
    previous ones.

diff --git a/xen/arch/x86/acpi/cpufreq/cpufreq.c b/xen/arch/x86/acpi/cpufreq/cpufreq.c
index ef79f77..8494fa0 100644
--- a/xen/arch/x86/acpi/cpufreq/cpufreq.c
+++ b/xen/arch/x86/acpi/cpufreq/cpufreq.c
@@ -50,7 +50,6 @@ enum {
 };
 
 #define INTEL_MSR_RANGE         (0xffffull)
-#define CPUID_6_ECX_APERFMPERF_CAPABILITY       (0x1)
 
 struct acpi_cpufreq_data *cpufreq_drv_data[NR_CPUS];
 
@@ -351,10 +350,9 @@ static unsigned int get_cur_freq_on_cpu(unsigned int cpu)
 static void feature_detect(void *info)
 {
     struct cpufreq_policy *policy = info;
-    unsigned int eax, ecx;
+    unsigned int eax;
 
-    ecx = cpuid_ecx(6);
-    if (ecx & CPUID_6_ECX_APERFMPERF_CAPABILITY) {
+    if ( cpu_has_aperfmperf ) {
         policy->aperf_mperf = 1;
         acpi_cpufreq_driver.getavg = get_measured_perf;
     }
diff --git a/xen/arch/x86/acpi/cpufreq/powernow.c b/xen/arch/x86/acpi/cpufreq/powernow.c
index 4de6f8d..d11da1a 100644
--- a/xen/arch/x86/acpi/cpufreq/powernow.c
+++ b/xen/arch/x86/acpi/cpufreq/powernow.c
@@ -37,7 +37,6 @@
 #include <acpi/acpi.h>
 #include <acpi/cpufreq/cpufreq.h>
 
-#define CPUID_6_ECX_APERFMPERF_CAPABILITY       (0x1)
 #define CPUID_FREQ_VOLT_CAPABILITIES    0x80000007
 #define CPB_CAPABLE             0x00000200
 #define USE_HW_PSTATE           0x00000080
@@ -211,10 +210,9 @@ static int powernow_cpufreq_verify(struct cpufreq_policy *policy)
 static void feature_detect(void *info)
 {
     struct cpufreq_policy *policy = info;
-    unsigned int ecx, edx;
+    unsigned int edx;
 
-    ecx = cpuid_ecx(6);
-    if (ecx & CPUID_6_ECX_APERFMPERF_CAPABILITY) {
+    if ( cpu_has_aperfmperf ) {
         policy->aperf_mperf = 1;
         powernow_cpufreq_driver.getavg = get_measured_perf;
     }
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 35ef21b..5224d10 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -239,6 +239,10 @@ static void __cpuinit generic_identify(struct cpuinfo_x86 *c)
 	if ( cpu_has(c, X86_FEATURE_CLFLSH) )
 		c->x86_clflush_size = ((ebx >> 8) & 0xff) * 8;
 
+	if ( (c->cpuid_level > CPUID_PM_LEAF) &&
+		(cpuid_ecx(CPUID_PM_LEAF) & CPUID6_ECX_APERFMPERF_CAPABILITY) )
+		set_bit(X86_FEATURE_APERFMPERF, c->x86_capability);
+
 	/* AMD-defined flags: level 0x80000001 */
 	c->extended_cpuid_level = cpuid_eax(0x80000000);
 	if ( (c->extended_cpuid_level & 0xffff0000) == 0x80000000 ) {
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 9a01563..d5f532b 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -69,6 +69,7 @@
 #define X86_FEATURE_XTOPOLOGY    (3*32+13) /* cpu topology enum extensions */
 #define X86_FEATURE_CPUID_FAULTING (3*32+14) /* cpuid faulting */
 #define X86_FEATURE_CLFLUSH_MONITOR (3*32+15) /* clflush reqd with monitor */
+#define X86_FEATURE_APERFMPERF (3*32+28) /* APERFMPERF */
 
 /* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
 #define X86_FEATURE_XMM3	(4*32+ 0) /* Streaming SIMD Extensions-3 */
@@ -165,6 +166,9 @@
 #define CPUID5_ECX_EXTENSIONS_SUPPORTED 0x1
 #define CPUID5_ECX_INTERRUPT_BREAK      0x2
 
+#define CPUID_PM_LEAF                    6
+#define CPUID6_ECX_APERFMPERF_CAPABILITY 0x1
+
 #define cpu_has_vme		0
 #define cpu_has_de		1
 #define cpu_has_pse		1
@@ -190,6 +194,7 @@
 #define cpu_has_page1gb		boot_cpu_has(X86_FEATURE_PAGE1GB)
 #define cpu_has_efer		1
 #define cpu_has_fsgsbase	boot_cpu_has(X86_FEATURE_FSGSBASE)
+#define cpu_has_aperfmperf      boot_cpu_has(X86_FEATURE_APERFMPERF)
 
 #define cpu_has_smep            boot_cpu_has(X86_FEATURE_SMEP)
 #define cpu_has_smap            boot_cpu_has(X86_FEATURE_SMAP)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 3/9] x86/intel_pstate: add a new driver interface, setpolicy()
  2015-09-14  2:32 [PATCH v5 0/9] Porting the intel_pstate driver to Xen Wei Wang
  2015-09-14  2:32 ` [PATCH v5 1/9] x86/intel_pstate: add some calculation related support Wei Wang
  2015-09-14  2:32 ` [PATCH v5 2/9] x86/intel_pstate: APERF/MPERF feature detect Wei Wang
@ 2015-09-14  2:32 ` Wei Wang
  2015-09-17 15:34   ` Andrew Cooper
  2015-10-06 15:37   ` Jan Beulich
  2015-09-14  2:32 ` [PATCH v5 4/9] x86/intel_pstate: relocate the driver register function Wei Wang
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 44+ messages in thread
From: Wei Wang @ 2015-09-14  2:32 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, xen-devel; +Cc: Wei Wang

In order to better support future Intel processors, intel_pstate
changes to use percentage values to tune P-states. The setpolicy
driver interface is used to configure the intel_pstate internal
policy. The __cpufreq_set_policy needs to be intercepted to use
the setpolicy driver if it exists.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
---
 xen/drivers/cpufreq/utility.c      |  3 +++
 xen/include/acpi/cpufreq/cpufreq.h | 14 ++++++++++++++
 2 files changed, 17 insertions(+)

 changes in v5:
 1) delay the addition of the structures that are used in later patches.

diff --git a/xen/drivers/cpufreq/utility.c b/xen/drivers/cpufreq/utility.c
index 519f862..53879fe 100644
--- a/xen/drivers/cpufreq/utility.c
+++ b/xen/drivers/cpufreq/utility.c
@@ -456,6 +456,9 @@ int __cpufreq_set_policy(struct cpufreq_policy *data,
 
     data->min = policy->min;
     data->max = policy->max;
+    data->limits = policy->limits;
+    if (cpufreq_driver->setpolicy)
+        return cpufreq_driver->setpolicy(data);
 
     if (policy->governor != data->governor) {
         /* save old, working values */
diff --git a/xen/include/acpi/cpufreq/cpufreq.h b/xen/include/acpi/cpufreq/cpufreq.h
index f96c3e4..1ec04ca 100644
--- a/xen/include/acpi/cpufreq/cpufreq.h
+++ b/xen/include/acpi/cpufreq/cpufreq.h
@@ -41,6 +41,18 @@ struct cpufreq_cpuinfo {
     unsigned int        transition_latency; /* in 10^(-9) s = nanoseconds */
 };
 
+struct perf_limits {
+    bool_t no_turbo;
+    bool_t turbo_disabled;
+    uint32_t turbo_pct;
+    uint32_t max_perf_pct; /* max performance in percentage */
+    uint32_t min_perf_pct; /* min performance in percentage */
+    uint32_t max_perf;
+    uint32_t min_perf;
+    uint32_t max_policy_pct;
+    uint32_t min_policy_pct;
+};
+
 struct cpufreq_policy {
     cpumask_var_t       cpus;          /* affected CPUs */
     unsigned int        shared_type;   /* ANY or ALL affected CPUs
@@ -52,6 +64,7 @@ struct cpufreq_policy {
     unsigned int        max;    /* in kHz */
     unsigned int        cur;    /* in kHz, only needed if cpufreq
                                  * governors are used */
+    struct perf_limits  limits;
     struct cpufreq_governor     *governor;
 
     bool_t              resume; /* flag for cpufreq 1st run
@@ -145,6 +158,7 @@ struct cpufreq_driver {
     char   name[CPUFREQ_NAME_LEN];
     int    (*init)(struct cpufreq_policy *policy);
     int    (*verify)(struct cpufreq_policy *policy);
+    int    (*setpolicy)(struct cpufreq_policy *policy);
     int    (*update)(int cpuid, struct cpufreq_policy *policy);
     int    (*target)(struct cpufreq_policy *policy,
                      unsigned int target_freq,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 4/9] x86/intel_pstate: relocate the driver register function
  2015-09-14  2:32 [PATCH v5 0/9] Porting the intel_pstate driver to Xen Wei Wang
                   ` (2 preceding siblings ...)
  2015-09-14  2:32 ` [PATCH v5 3/9] x86/intel_pstate: add a new driver interface, setpolicy() Wei Wang
@ 2015-09-14  2:32 ` Wei Wang
  2015-09-17 15:38   ` Andrew Cooper
  2015-10-07 15:08   ` Jan Beulich
  2015-09-14  2:32 ` [PATCH v5 5/9] x86/intel_pstate: changes in cpufreq_del_cpu for CPU offline Wei Wang
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 44+ messages in thread
From: Wei Wang @ 2015-09-14  2:32 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, xen-devel; +Cc: Wei Wang

Move the driver register function to
the cpufreq.c.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
---
 xen/drivers/cpufreq/cpufreq.c      | 15 +++++++++++++++
 xen/include/acpi/cpufreq/cpufreq.h | 27 +--------------------------
 2 files changed, 16 insertions(+), 26 deletions(-)

 changes in v5:
 1) keep cpufreq_presmp_init() intact.

diff --git a/xen/drivers/cpufreq/cpufreq.c b/xen/drivers/cpufreq/cpufreq.c
index 567e9e9..0c437d4 100644
--- a/xen/drivers/cpufreq/cpufreq.c
+++ b/xen/drivers/cpufreq/cpufreq.c
@@ -638,3 +638,18 @@ static int __init cpufreq_presmp_init(void)
 }
 presmp_initcall(cpufreq_presmp_init);
 
+int cpufreq_register_driver(struct cpufreq_driver *driver_data)
+{
+   if ( !driver_data || !driver_data->init ||
+        !driver_data->verify || !driver_data->exit ||
+        (!driver_data->target == !driver_data->setpolicy) )
+        return -EINVAL;
+
+    if ( cpufreq_driver )
+        return -EBUSY;
+
+    cpufreq_driver = driver_data;
+
+    register_cpu_notifier(&cpu_nfb);
+    return 0;
+}
diff --git a/xen/include/acpi/cpufreq/cpufreq.h b/xen/include/acpi/cpufreq/cpufreq.h
index 1ec04ca..c6976d0 100644
--- a/xen/include/acpi/cpufreq/cpufreq.h
+++ b/xen/include/acpi/cpufreq/cpufreq.h
@@ -170,32 +170,7 @@ struct cpufreq_driver {
 
 extern struct cpufreq_driver *cpufreq_driver;
 
-static __inline__ 
-int cpufreq_register_driver(struct cpufreq_driver *driver_data)
-{
-    if (!driver_data         || 
-        !driver_data->init   || 
-        !driver_data->exit   || 
-        !driver_data->verify || 
-        !driver_data->target)
-        return -EINVAL;
-
-    if (cpufreq_driver)
-        return -EBUSY;
-
-    cpufreq_driver = driver_data;
-    return 0;
-}
-
-static __inline__ 
-int cpufreq_unregister_driver(struct cpufreq_driver *driver)
-{
-    if (!cpufreq_driver || (driver != cpufreq_driver))
-        return -EINVAL;
-
-    cpufreq_driver = NULL;
-    return 0;
-}
+extern int cpufreq_register_driver(struct cpufreq_driver *driver_data);
 
 static __inline__
 void cpufreq_verify_within_limits(struct cpufreq_policy *policy,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 5/9] x86/intel_pstate: changes in cpufreq_del_cpu for CPU offline
  2015-09-14  2:32 [PATCH v5 0/9] Porting the intel_pstate driver to Xen Wei Wang
                   ` (3 preceding siblings ...)
  2015-09-14  2:32 ` [PATCH v5 4/9] x86/intel_pstate: relocate the driver register function Wei Wang
@ 2015-09-14  2:32 ` Wei Wang
  2015-09-17 15:43   ` Andrew Cooper
  2015-10-07 15:28   ` Jan Beulich
  2015-09-14  2:32 ` [PATCH v5 6/9] x86/intel_pstate: the main boby of the intel_pstate driver Wei Wang
                   ` (3 subsequent siblings)
  8 siblings, 2 replies; 44+ messages in thread
From: Wei Wang @ 2015-09-14  2:32 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, xen-devel; +Cc: Wei Wang

We change to NULL the cpufreq_cpu_policy pointer after the call of
cpufreq_driver->exit, because the pointer is still needed in
intel_pstate_set_pstate().

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
---
 xen/drivers/cpufreq/cpufreq.c      | 6 +++---
 xen/include/acpi/cpufreq/cpufreq.h | 7 +++++++
 2 files changed, 10 insertions(+), 3 deletions(-)

 changes in v5:
 1) put this patch prior to the "main body of intel pstate driver", which is 
    one of the acceptable options suggested by the Jan.

diff --git a/xen/drivers/cpufreq/cpufreq.c b/xen/drivers/cpufreq/cpufreq.c
index 0c437d4..5485944 100644
--- a/xen/drivers/cpufreq/cpufreq.c
+++ b/xen/drivers/cpufreq/cpufreq.c
@@ -334,12 +334,11 @@ int cpufreq_del_cpu(unsigned int cpu)
 
     /* for HW_ALL, stop gov for each core of the _PSD domain */
     /* for SW_ALL & SW_ANY, stop gov for the 1st core of the _PSD domain */
-    if (hw_all || (cpumask_weight(cpufreq_dom->map) ==
-                   perf->domain_info.num_processors))
+    if (!policy->internal_gov && (hw_all || (cpumask_weight(cpufreq_dom->map) ==
+                   perf->domain_info.num_processors)))
         __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
 
     cpufreq_statistic_exit(cpu);
-    per_cpu(cpufreq_cpu_policy, cpu) = NULL;
     cpumask_clear_cpu(cpu, policy->cpus);
     cpumask_clear_cpu(cpu, cpufreq_dom->map);
 
@@ -348,6 +347,7 @@ int cpufreq_del_cpu(unsigned int cpu)
         free_cpumask_var(policy->cpus);
         xfree(policy);
     }
+    per_cpu(cpufreq_cpu_policy, cpu) = NULL;
 
     /* for the last cpu of the domain, clean room */
     /* It's safe here to free freq_table, drv_data and policy */
diff --git a/xen/include/acpi/cpufreq/cpufreq.h b/xen/include/acpi/cpufreq/cpufreq.h
index c6976d0..48bd94d 100644
--- a/xen/include/acpi/cpufreq/cpufreq.h
+++ b/xen/include/acpi/cpufreq/cpufreq.h
@@ -53,6 +53,12 @@ struct perf_limits {
     uint32_t min_policy_pct;
 };
 
+struct internal_governor {
+    char *avail_gov;
+    uint32_t gov_num;
+    uint32_t cur_gov;
+};
+
 struct cpufreq_policy {
     cpumask_var_t       cpus;          /* affected CPUs */
     unsigned int        shared_type;   /* ANY or ALL affected CPUs
@@ -66,6 +72,7 @@ struct cpufreq_policy {
                                  * governors are used */
     struct perf_limits  limits;
     struct cpufreq_governor     *governor;
+    struct internal_governor    *internal_gov;
 
     bool_t              resume; /* flag for cpufreq 1st run
                                  * S3 wakeup, hotplug cpu, etc */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 6/9] x86/intel_pstate: the main boby of the intel_pstate driver
  2015-09-14  2:32 [PATCH v5 0/9] Porting the intel_pstate driver to Xen Wei Wang
                   ` (4 preceding siblings ...)
  2015-09-14  2:32 ` [PATCH v5 5/9] x86/intel_pstate: changes in cpufreq_del_cpu for CPU offline Wei Wang
@ 2015-09-14  2:32 ` Wei Wang
  2015-09-17 15:51   ` Andrew Cooper
  2015-10-07 15:39   ` Jan Beulich
  2015-09-14  2:32 ` [PATCH v5 7/9] x86/intel_pstate: add a booting param to select the driver to load Wei Wang
                   ` (2 subsequent siblings)
  8 siblings, 2 replies; 44+ messages in thread
From: Wei Wang @ 2015-09-14  2:32 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, xen-devel; +Cc: Wei Wang

We simply grab the fundamental logic of the intel_pstate driver
from Linux kernel, and customize it to Xen style. In the kernel,
a user can adjust the limits via sysfs
(limits.min_sysfs_pct/max_sysfs_pct). In Xen, the
policy->limits.min_perf_pct/max_perf_pct acts as the transit
station. A user interacts with it via xenpm.

The new xen/include/asm-x86/cpufreq.h header file is added.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
---
 xen/arch/x86/acpi/cpufreq/Makefile       |   1 +
 xen/arch/x86/acpi/cpufreq/intel_pstate.c | 880 +++++++++++++++++++++++++++++++
 xen/include/acpi/cpufreq/cpufreq.h       |   6 +
 xen/include/asm-x86/cpufreq.h            |  34 ++
 xen/include/asm-x86/msr-index.h          |   3 +
 5 files changed, 924 insertions(+)
 create mode 100644 xen/arch/x86/acpi/cpufreq/intel_pstate.c
 create mode 100644 xen/include/asm-x86/cpufreq.h

 changes in v5:
 1) customize it to Xen style.

diff --git a/xen/arch/x86/acpi/cpufreq/Makefile b/xen/arch/x86/acpi/cpufreq/Makefile
index f75da9b..99fa9f4 100644
--- a/xen/arch/x86/acpi/cpufreq/Makefile
+++ b/xen/arch/x86/acpi/cpufreq/Makefile
@@ -1,2 +1,3 @@
 obj-y += cpufreq.o
+obj-y += intel_pstate.o
 obj-y += powernow.o
diff --git a/xen/arch/x86/acpi/cpufreq/intel_pstate.c b/xen/arch/x86/acpi/cpufreq/intel_pstate.c
new file mode 100644
index 0000000..3292bcd
--- /dev/null
+++ b/xen/arch/x86/acpi/cpufreq/intel_pstate.c
@@ -0,0 +1,880 @@
+#include <xen/kernel.h>
+#include <xen/types.h>
+#include <xen/init.h>
+#include <xen/bitmap.h>
+#include <xen/cpumask.h>
+#include <xen/timer.h>
+#include <asm/msr.h>
+#include <asm/msr-index.h>
+#include <asm/processor.h>
+#include <asm/div64.h>
+#include <asm/cpufreq.h>
+#include <acpi/cpufreq/cpufreq.h>
+
+#define BYT_RATIOS       0x66a
+#define BYT_VIDS         0x66b
+#define BYT_TURBO_RATIOS 0x66c
+#define BYT_TURBO_VIDS   0x66d
+
+#define FRAC_BITS 8
+#define int_tofp(X) ((int64_t)(X) << FRAC_BITS)
+#define fp_toint(X) ((X) >> FRAC_BITS)
+
+static inline int32_t mul_fp(int32_t x, int32_t y)
+{
+    return ((int64_t)x * (int64_t)y) >> FRAC_BITS;
+}
+
+static inline int32_t div_fp(int32_t x, int32_t y)
+{
+    return div_s64((int64_t)x << FRAC_BITS, y);
+}
+
+static inline int ceiling_fp(int32_t x)
+{
+    int mask, ret;
+
+    ret = fp_toint(x);
+    mask = (1 << FRAC_BITS) - 1;
+    if ( x & mask )
+        ret += 1;
+    return ret;
+}
+
+struct sample {
+    int32_t core_pct_busy;
+    u64 aperf;
+    u64 mperf;
+    int freq;
+    s_time_t time;
+};
+
+struct pstate_data {
+    int    current_pstate;
+    int    min_pstate;
+    int    max_pstate;
+    int    scaling;
+    int    turbo_pstate;
+};
+
+struct vid_data {
+    int min;
+    int max;
+    int turbo;
+    int32_t ratio;
+};
+
+struct _pid {
+    int setpoint;
+    int32_t integral;
+    int32_t p_gain;
+    int32_t i_gain;
+    int32_t d_gain;
+    int deadband;
+    int32_t last_err;
+};
+
+struct cpudata {
+    int cpu;
+
+    struct timer timer;
+
+    struct pstate_data pstate;
+    struct vid_data vid;
+    struct _pid pid;
+
+    s_time_t last_sample_time;
+    u64    prev_aperf;
+    u64    prev_mperf;
+    struct sample sample;
+};
+
+static struct cpudata **all_cpu_data;
+
+struct pstate_adjust_policy {
+    int sample_rate_ms;
+    int deadband;
+    int setpoint;
+    int p_gain_pct;
+    int d_gain_pct;
+    int i_gain_pct;
+};
+
+struct pstate_funcs {
+    int (*get_max)(void);
+    int (*get_min)(void);
+    int (*get_turbo)(void);
+    int (*get_scaling)(void);
+    void (*set)(struct perf_limits *, struct cpudata *, int pstate);
+    void (*get_vid)(struct cpudata *);
+};
+
+struct cpu_defaults {
+    struct pstate_adjust_policy pid_policy;
+    struct pstate_funcs funcs;
+};
+
+static struct pstate_adjust_policy pid_params;
+static struct pstate_funcs pstate_funcs;
+
+static inline void pid_reset(struct _pid *pid, int setpoint, int busy,
+                 int deadband, int integral)
+{
+    pid->setpoint = setpoint;
+    pid->deadband  = deadband;
+    pid->integral  = int_tofp(integral);
+    pid->last_err  = int_tofp(setpoint) - int_tofp(busy);
+}
+
+static inline void pid_p_gain_set(struct _pid *pid, int percent)
+{
+    pid->p_gain = div_fp(int_tofp(percent), int_tofp(100));
+}
+
+static inline void pid_i_gain_set(struct _pid *pid, int percent)
+{
+    pid->i_gain = div_fp(int_tofp(percent), int_tofp(100));
+}
+
+static inline void pid_d_gain_set(struct _pid *pid, int percent)
+{
+    pid->d_gain = div_fp(int_tofp(percent), int_tofp(100));
+}
+
+static signed int pid_calc(struct _pid *pid, int32_t busy)
+{
+    signed int result;
+    int32_t pterm, dterm, fp_error;
+    int32_t integral_limit;
+
+    fp_error = int_tofp(pid->setpoint) - busy;
+
+    if ( ABS(fp_error) <= int_tofp(pid->deadband) )
+        return 0;
+
+    pterm = mul_fp(pid->p_gain, fp_error);
+
+    pid->integral += fp_error;
+
+    /*
+     * We limit the integral here so that it will never
+     * get higher than 30.  This prevents it from becoming
+     * too large an input over long periods of time and allows
+     * it to get factored out sooner.
+     * The value of 30 was chosen through experimentation.
+     */
+    integral_limit = int_tofp(30);
+    if ( pid->integral > integral_limit )
+        pid->integral = integral_limit;
+    if ( pid->integral < -integral_limit )
+        pid->integral = -integral_limit;
+
+    dterm = mul_fp(pid->d_gain, fp_error - pid->last_err);
+    pid->last_err = fp_error;
+
+    result = pterm + mul_fp(pid->integral, pid->i_gain) + dterm;
+    result = result + (1 << (FRAC_BITS-1));
+    return (signed int)fp_toint(result);
+}
+
+static inline void intel_pstate_busy_pid_reset(struct cpudata *cpu)
+{
+    pid_p_gain_set(&cpu->pid, pid_params.p_gain_pct);
+    pid_d_gain_set(&cpu->pid, pid_params.d_gain_pct);
+    pid_i_gain_set(&cpu->pid, pid_params.i_gain_pct);
+
+    pid_reset(&cpu->pid, pid_params.setpoint, 100, pid_params.deadband, 0);
+}
+
+static inline void intel_pstate_reset_all_pid(void)
+{
+    unsigned int cpu;
+
+    for_each_online_cpu(cpu)
+    {
+        if ( all_cpu_data[cpu] )
+            intel_pstate_busy_pid_reset(all_cpu_data[cpu]);
+    }
+}
+
+static inline void update_turbo_state(struct cpufreq_policy *policy)
+{
+    u64 misc_en;
+    struct cpudata *cpu;
+
+    cpu = all_cpu_data[policy->cpu];
+    rdmsrl(MSR_IA32_MISC_ENABLE, misc_en);
+    policy->limits.turbo_disabled =
+        (misc_en & MSR_IA32_MISC_ENABLE_TURBO_DISABLE ||
+            cpu->pstate.max_pstate == cpu->pstate.turbo_pstate);
+}
+
+#define BYT_TURBO_CONTROL_BIT 32
+#define BYT_MIN_PSTATE(val) (((value) >> 8) & 0x7f)
+#define BYT_MAX_PSTATE(val) (((value) >> 16) & 0x7f)
+#define BYT_TURBO_PSTATE(value) ((value) & 0x7f)
+static int byt_get_min_pstate(void)
+{
+    u64 value;
+
+    rdmsrl(BYT_RATIOS, value);
+    return BYT_MIN_PSTATE(val);
+}
+
+static int byt_get_max_pstate(void)
+{
+    u64 value;
+
+    rdmsrl(BYT_RATIOS, value);
+    return BYT_MAX_PSTATE(val);
+}
+
+static int byt_get_turbo_pstate(void)
+{
+    u64 value;
+
+    rdmsrl(BYT_TURBO_RATIOS, value);
+    return BYT_TURBO_PSTATE(value);
+}
+
+static void byt_set_pstate(struct perf_limits *limits,
+                struct cpudata *cpudata, int pstate)
+{
+    u64 val;
+    int32_t vid_fp;
+    u32 vid;
+
+    val = pstate << 8;
+    if ( limits->no_turbo && !limits->turbo_disabled )
+        val |= (u64)1 << BYT_TURBO_CONTROL_BIT;
+
+    vid_fp = cpudata->vid.min + mul_fp(
+        int_tofp(pstate - cpudata->pstate.min_pstate),
+        cpudata->vid.ratio);
+
+    vid_fp = clamp_t(int32_t, vid_fp, cpudata->vid.min, cpudata->vid.max);
+    vid = ceiling_fp(vid_fp);
+
+    if ( pstate > cpudata->pstate.max_pstate )
+        vid = cpudata->vid.turbo;
+
+    val |= vid;
+
+    wrmsrl(MSR_IA32_PERF_CTL, val);
+}
+
+#define BYT_BCLK_FREQS 5
+#define TO_FREQ_TABLE_IDX_MASK 0x7
+static const int byt_freq_table[BYT_BCLK_FREQS] =
+                                    { 833, 1000, 1333, 1167, 800};
+
+static int byt_get_scaling(void)
+{
+    u64 value;
+    int i;
+
+    rdmsrl(MSR_FSB_FREQ, value);
+    i = value & TO_FREQ_TABLE_IDX_MASK;
+
+    BUG_ON(i > BYT_BCLK_FREQS);
+
+    return byt_freq_table[i] * 100;
+}
+
+static void byt_get_vid(struct cpudata *cpudata)
+{
+    u64 value;
+
+    rdmsrl(BYT_VIDS, value);
+    cpudata->vid.min = int_tofp(BYT_MIN_PSTATE(val));
+    cpudata->vid.max = int_tofp(BYT_MAX_PSTATE(val));
+    cpudata->vid.ratio = div_fp(
+        cpudata->vid.max - cpudata->vid.min,
+        int_tofp(cpudata->pstate.max_pstate -
+            cpudata->pstate.min_pstate));
+
+    rdmsrl(BYT_TURBO_VIDS, value);
+    cpudata->vid.turbo = BYT_TURBO_PSTATE(value);
+}
+
+#define SCALING_FACTOR 100000
+#define CORE_TURBO_CONTROL_BIT 32
+#define CORE_MIN_PSTATE(val) (((value) >> 40) & 0xff)
+#define CORE_MAX_PSTATE(val) (((value) >> 8) & 0xff)
+#define CORE_TURBO_PSTATE(value) ((value) & 0xff)
+static int core_get_min_pstate(void)
+{
+    u64 value;
+
+    rdmsrl(MSR_INTEL_PLATFORM_INFO, value);
+    return CORE_MIN_PSTATE(val);
+}
+
+static int core_get_max_pstate(void)
+{
+    u64 value;
+
+    rdmsrl(MSR_INTEL_PLATFORM_INFO, value);
+    return CORE_MAX_PSTATE(val);
+}
+
+static int core_get_turbo_pstate(void)
+{
+    u64 value;
+    int nont, ret;
+
+    rdmsrl(MSR_NHM_TURBO_RATIO_LIMIT, value);
+    nont = core_get_max_pstate();
+    ret = CORE_TURBO_PSTATE(value);
+    if ( ret <= nont )
+        ret = nont;
+    return ret;
+}
+
+static inline int core_get_scaling(void)
+{
+    return SCALING_FACTOR;
+}
+
+static void core_set_pstate(struct perf_limits *limits,
+                struct cpudata *cpudata, int pstate)
+{
+    u64 val;
+
+    val = pstate << 8;
+    if ( limits->no_turbo && !limits->turbo_disabled )
+        val |= (u64)1 << CORE_TURBO_CONTROL_BIT;
+
+    wrmsrl(MSR_IA32_PERF_CTL, val);
+}
+
+static const struct cpu_defaults core_params = {
+    .pid_policy = {
+        .sample_rate_ms = 10,
+        .deadband = 0,
+        .setpoint = 97,
+        .p_gain_pct = 20,
+        .d_gain_pct = 0,
+        .i_gain_pct = 0,
+    },
+    .funcs = {
+        .get_max = core_get_max_pstate,
+        .get_min = core_get_min_pstate,
+        .get_turbo = core_get_turbo_pstate,
+        .get_scaling = core_get_scaling,
+        .set = core_set_pstate,
+    },
+};
+
+static const struct cpu_defaults byt_params = {
+    .pid_policy = {
+        .sample_rate_ms = 10,
+        .deadband = 0,
+        .setpoint = 97,
+        .p_gain_pct = 14,
+        .d_gain_pct = 0,
+        .i_gain_pct = 4,
+    },
+    .funcs = {
+        .get_max = byt_get_max_pstate,
+        .get_min = byt_get_min_pstate,
+        .get_turbo = byt_get_turbo_pstate,
+        .set = byt_set_pstate,
+        .get_scaling = byt_get_scaling,
+        .get_vid = byt_get_vid,
+    },
+};
+
+static void intel_pstate_get_min_max(struct perf_limits *limits,
+                struct cpudata *cpu, int *min, int *max)
+{
+    int max_perf = cpu->pstate.turbo_pstate;
+    int max_perf_adj;
+    int min_perf;
+
+    if ( limits->no_turbo || limits->turbo_disabled )
+        max_perf = cpu->pstate.max_pstate;
+
+    /* performance can be limited by user through xenpm */
+    max_perf_adj = fp_toint(mul_fp(int_tofp(max_perf), limits->max_perf));
+    *max = clamp_t(int, max_perf_adj,
+            cpu->pstate.min_pstate, cpu->pstate.turbo_pstate);
+
+    min_perf = fp_toint(mul_fp(int_tofp(max_perf), limits->min_perf));
+    *min = clamp_t(int, min_perf, cpu->pstate.min_pstate, max_perf);
+}
+
+static void intel_pstate_set_pstate(struct cpudata *cpu, int pstate)
+{
+    int max_perf, min_perf;
+    struct cpufreq_policy *policy;
+    struct perf_limits *limits;
+
+    policy = per_cpu(cpufreq_cpu_policy, cpu->cpu);
+    limits = &policy->limits;
+
+    update_turbo_state(policy);
+
+    if ( limits->turbo_disabled )
+        policy->turbo = CPUFREQ_TURBO_UNSUPPORTED;
+    else if ( limits->no_turbo )
+        policy->turbo = CPUFREQ_TURBO_DISABLED;
+    else
+        policy->turbo = CPUFREQ_TURBO_ENABLED;
+
+    intel_pstate_get_min_max(limits, cpu, &min_perf, &max_perf);
+
+    pstate = clamp_t(int, pstate, min_perf, max_perf);
+
+    if ( pstate == cpu->pstate.current_pstate )
+        return;
+
+    cpu->pstate.current_pstate = pstate;
+    policy->cur = pstate * SCALING_FACTOR;
+
+    pstate_funcs.set(limits, cpu, pstate);
+}
+
+static void intel_pstate_get_cpu_pstates(struct cpudata *cpu)
+{
+    cpu->pstate.min_pstate = pstate_funcs.get_min();
+    cpu->pstate.max_pstate = pstate_funcs.get_max();
+    cpu->pstate.turbo_pstate = pstate_funcs.get_turbo();
+    cpu->pstate.scaling = pstate_funcs.get_scaling();
+
+    if ( pstate_funcs.get_vid )
+        pstate_funcs.get_vid(cpu);
+    intel_pstate_set_pstate(cpu, cpu->pstate.min_pstate);
+}
+
+static inline void intel_pstate_calc_busy(struct cpudata *cpu)
+{
+    struct sample *sample = &cpu->sample;
+    int64_t core_pct;
+
+    core_pct = int_tofp(sample->aperf) * int_tofp(100);
+    core_pct = div64_u64(core_pct, int_tofp(sample->mperf));
+
+    sample->freq = fp_toint(
+        mul_fp(int_tofp(
+            cpu->pstate.max_pstate * cpu->pstate.scaling / 100),
+            core_pct));
+
+    sample->core_pct_busy = (int32_t)core_pct;
+}
+
+static inline void intel_pstate_sample(struct cpudata *cpu)
+{
+    u64 aperf, mperf;
+    unsigned long flags;
+
+    local_irq_save(flags);
+    rdmsrl(MSR_IA32_APERF, aperf);
+    rdmsrl(MSR_IA32_MPERF, mperf);
+    local_irq_restore(flags);
+
+    cpu->last_sample_time = cpu->sample.time;
+    cpu->sample.time = get_s_time();
+    cpu->sample.aperf = aperf;
+    cpu->sample.mperf = mperf;
+    cpu->sample.aperf -= cpu->prev_aperf;
+    cpu->sample.mperf -= cpu->prev_mperf;
+
+    intel_pstate_calc_busy(cpu);
+
+    cpu->prev_aperf = aperf;
+    cpu->prev_mperf = mperf;
+}
+
+static inline void intel_pstate_set_sample_time(struct cpudata *cpu)
+{
+    set_timer(&cpu->timer, NOW() + MILLISECS(pid_params.sample_rate_ms));
+}
+
+static inline int32_t intel_pstate_get_scaled_busy(struct cpudata *cpu)
+{
+    int32_t core_busy, max_pstate, current_pstate, sample_ratio;
+    u32 duration_us;
+    u32 sample_time_us;
+
+    /*
+     * core_busy is the ratio of actual performance to max
+     * max_pstate is the max non turbo pstate available
+     * current_pstate was the pstate that was requested during
+     * the last sample period.
+     *
+     * We normalize core_busy, which was our actual percent
+     * performance to what we requested during the last sample
+     * period. The result will be a percentage of busy at a
+     * specified pstate.
+     */
+    core_busy = cpu->sample.core_pct_busy;
+    max_pstate = int_tofp(cpu->pstate.max_pstate);
+    current_pstate = int_tofp(cpu->pstate.current_pstate);
+    core_busy = mul_fp(core_busy, div_fp(max_pstate, current_pstate));
+
+    /*
+     * Since we have a deferred timer, it will not fire unless
+     * we are in C0.  So, determine if the actual elapsed time
+     * is significantly greater (3x) than our sample interval. If it
+     * is, then we were idle for a long enough period of time
+     * to adjust our busyness.
+     */
+    sample_time_us = pid_params.sample_rate_ms  * 1000ULL;
+    duration_us = (u32)((s_time_t)(cpu->sample.time - cpu->last_sample_time)
+            / 1000);
+    if ( duration_us > sample_time_us * 3 )
+    {
+        sample_ratio = div_fp(int_tofp(sample_time_us),
+                      int_tofp(duration_us));
+        core_busy = mul_fp(core_busy, sample_ratio);
+    }
+
+    return core_busy;
+}
+
+static inline void intel_pstate_adjust_busy_pstate(struct cpudata *cpu)
+{
+    int32_t busy_scaled;
+    struct _pid *pid;
+    signed int ctl;
+
+    pid = &cpu->pid;
+    busy_scaled = intel_pstate_get_scaled_busy(cpu);
+
+    ctl = pid_calc(pid, busy_scaled);
+
+    /* Negative values of ctl increase the pstate and vice versa */
+    intel_pstate_set_pstate(cpu, cpu->pstate.current_pstate - ctl);
+}
+
+static void intel_pstate_timer_func(void *data)
+{
+    struct cpudata *cpu = (struct cpudata *) data;
+
+    intel_pstate_sample(cpu);
+
+    intel_pstate_adjust_busy_pstate(cpu);
+
+    intel_pstate_set_sample_time(cpu);
+}
+
+#define ICPU(model, policy) \
+    { X86_VENDOR_INTEL, 6, model, X86_FEATURE_APERFMPERF,\
+            &policy##_params }
+
+static const struct x86_cpu_id intel_pstate_cpu_ids[] __initconst = {
+    ICPU(0x2a, core),
+    ICPU(0x2d, core),
+    ICPU(0x37, byt),
+    ICPU(0x3a, core),
+    ICPU(0x3c, core),
+    ICPU(0x3d, core),
+    ICPU(0x3e, core),
+    ICPU(0x3f, core),
+    ICPU(0x45, core),
+    ICPU(0x46, core),
+    ICPU(0x47, core),
+    ICPU(0x4c, byt),
+    ICPU(0x4e, core),
+    ICPU(0x4f, core),
+    ICPU(0x56, core),
+    {}
+};
+
+static int intel_pstate_init_cpu(unsigned int cpunum)
+{
+    struct cpudata *cpu;
+    s_time_t expires;
+
+    if ( !all_cpu_data[cpunum] )
+        all_cpu_data[cpunum] = xzalloc(struct cpudata);
+    if ( !all_cpu_data[cpunum] )
+        return -ENOMEM;
+
+    cpu = all_cpu_data[cpunum];
+
+    cpu->cpu = cpunum;
+    intel_pstate_get_cpu_pstates(cpu);
+
+    init_timer(&cpu->timer, intel_pstate_timer_func, cpu, cpunum);
+    expires = NOW() + MILLISECS(10);
+
+    intel_pstate_busy_pid_reset(cpu);
+    intel_pstate_sample(cpu);
+
+    set_timer(&cpu->timer, expires);
+
+    return 0;
+}
+
+static int intel_pstate_set_policy(struct cpufreq_policy *policy)
+{
+    struct perf_limits *limits = &policy->limits;
+    uint32_t cur_gov = policy->internal_gov->cur_gov;
+
+    if ( !policy->cpuinfo.max_freq )
+        return -ENODEV;
+
+    switch ( cur_gov )
+    {
+    case INTERNAL_GOV_PERFORMANCE:
+        limits->no_turbo = 0;
+        limits->max_perf_pct = 100;
+        limits->max_perf = int_tofp(1);
+        limits->min_perf_pct = 100;
+        limits->min_perf = int_tofp(1);
+        break;
+    case INTERNAL_GOV_POWERSAVE:
+        limits->min_perf =
+            div_fp(int_tofp(limits->min_policy_pct),
+                        int_tofp(100));
+        limits->max_perf = limits->min_perf;
+        limits->min_perf_pct = limits->min_policy_pct;
+        limits->max_perf_pct = limits->min_perf_pct;
+        break;
+    case INTERNAL_GOV_USERSPACE:
+        limits->max_perf =
+            div_fp(int_tofp(limits->max_perf_pct),
+                        int_tofp(100));
+        limits->min_perf = limits->max_perf;
+        limits->min_perf_pct = limits->max_perf_pct;
+        break;
+    case INTERNAL_GOV_ONDEMAND:
+    default:
+        limits->min_perf =
+            div_fp(int_tofp(limits->min_perf_pct), int_tofp(100));
+        limits->max_perf =
+            div_fp(int_tofp(limits->max_perf_pct), int_tofp(100));
+        break;
+    }
+
+    return 0;
+}
+
+static int intel_pstate_verify_policy(struct cpufreq_policy *policy)
+{
+    uint32_t cur_gov = policy->internal_gov->cur_gov;
+
+    cpufreq_verify_within_limits(policy, policy->cpuinfo.min_freq,
+                         policy->cpuinfo.max_freq);
+
+    switch( cur_gov )
+    {
+    case INTERNAL_GOV_PERFORMANCE:
+    case INTERNAL_GOV_POWERSAVE:
+    case INTERNAL_GOV_USERSPACE:
+    case INTERNAL_GOV_ONDEMAND:
+                    return 0;
+    default:
+                    return -EINVAL;
+    }
+}
+
+static void intel_pstate_internal_gov_release(struct internal_governor *gov)
+{
+    xfree(gov->avail_gov);
+    xfree(gov);
+}
+
+static int intel_pstate_cpu_exit(struct cpufreq_policy *policy)
+{
+    int cpu_num = policy->cpu;
+    struct cpudata *cpu = all_cpu_data[cpu_num];
+
+    kill_timer(&all_cpu_data[cpu_num]->timer);
+
+    intel_pstate_set_pstate(cpu, cpu->pstate.min_pstate);
+
+    intel_pstate_internal_gov_release(policy->internal_gov);
+
+    return 0;
+}
+
+static int intel_pstate_turbo_update(int cpuid, struct cpufreq_policy *policy)
+{
+    struct cpudata *cpu = all_cpu_data[policy->cpu];
+    struct perf_limits *limits = &policy->limits;
+
+    update_turbo_state(policy);
+    if ( limits->turbo_disabled )
+    {
+        printk("Turbo disabled by BIOS or not supported on CPU\n");
+        return -EINVAL;
+    }
+    limits->no_turbo = policy->turbo == CPUFREQ_TURBO_ENABLED ? 0 : 1;
+
+    if ( limits->no_turbo )
+        policy->cpuinfo.max_freq =
+            cpu->pstate.max_pstate * cpu->pstate.scaling;
+    else
+        policy->cpuinfo.max_freq =
+            cpu->pstate.turbo_pstate * cpu->pstate.scaling;
+
+    policy->max = clamp_t(unsigned int, policy->max,
+            policy->cpuinfo.min_freq, policy->cpuinfo.max_freq);
+
+    return 0;
+}
+
+static int get_turbo_pct(struct cpudata *cpu)
+{
+    int total, no_turbo, turbo_pct;
+    uint32_t turbo_fp;
+
+    total = cpu->pstate.turbo_pstate - cpu->pstate.min_pstate + 1;
+    no_turbo = cpu->pstate.max_pstate - cpu->pstate.min_pstate + 1;
+    turbo_fp = div_fp(int_tofp(no_turbo), int_tofp(total));
+    turbo_pct = 100 - fp_toint(mul_fp(turbo_fp, int_tofp(100)));
+    return turbo_pct;
+}
+
+#define INTEL_PSTATE_GOV_NUM 4
+static struct internal_governor* intel_pstate_internal_gov_init(void)
+{
+    unsigned int i = 0;
+    struct internal_governor *gov;
+    char *avail_gov;
+
+    gov = xzalloc(struct internal_governor);
+    if ( !gov )
+        return NULL;
+    avail_gov = xzalloc_array(char,
+            INTEL_PSTATE_GOV_NUM * CPUFREQ_NAME_LEN);
+    if ( !avail_gov )
+        return NULL;
+
+    gov->avail_gov = avail_gov;
+
+    i += scnprintf(&avail_gov[0], CPUFREQ_NAME_LEN, "%s ", "performance");
+    i += scnprintf(&avail_gov[i], CPUFREQ_NAME_LEN, "%s ", "powersave");
+    i += scnprintf(&avail_gov[i], CPUFREQ_NAME_LEN, "%s ", "userspace");
+    i += scnprintf(&avail_gov[i], CPUFREQ_NAME_LEN, "%s ", "ondemand");
+    avail_gov[i-1] = '\0';
+    gov->gov_num = INTEL_PSTATE_GOV_NUM;
+    gov->cur_gov = INTERNAL_GOV_ONDEMAND;
+    return gov;
+}
+
+static int intel_pstate_cpu_setup(struct cpufreq_policy *policy)
+{
+    struct cpudata *cpu;
+    struct perf_limits *limits = &policy->limits;
+    int rc;
+
+    rc = intel_pstate_init_cpu(policy->cpu);
+    if ( rc )
+        return rc;
+
+    policy->internal_gov = intel_pstate_internal_gov_init();
+    if ( !policy->internal_gov )
+        return -ENOMEM;
+
+    cpu = all_cpu_data[policy->cpu];
+    policy->min = cpu->pstate.min_pstate * cpu->pstate.scaling;
+    policy->max = cpu->pstate.turbo_pstate * cpu->pstate.scaling;
+
+    /* cpuinfo and default policy values */
+    policy->cpuinfo.min_freq =
+        cpu->pstate.min_pstate * cpu->pstate.scaling;
+    policy->cpuinfo.max_freq =
+        cpu->pstate.turbo_pstate * cpu->pstate.scaling;
+    policy->cpuinfo.transition_latency = CPUFREQ_ETERNAL;
+    cpumask_set_cpu(policy->cpu, policy->cpus);
+
+    limits->no_turbo = 0;
+    limits->turbo_disabled = 0;
+    limits->turbo_pct = get_turbo_pct(cpu);
+    limits->min_policy_pct =
+        (policy->min * 100) / policy->cpuinfo.max_freq;
+    limits->min_policy_pct =
+        clamp_t(uint32_t, limits->min_policy_pct, 0, 100);
+    limits->max_policy_pct =
+        (policy->max * 100) / policy->cpuinfo.max_freq;
+    limits->max_policy_pct =
+        clamp_t(uint32_t, limits->max_policy_pct, 0, 100);
+    limits->max_perf_pct   = limits->max_policy_pct;
+    limits->min_perf_pct   = limits->min_policy_pct;
+
+    return 0;
+}
+
+static struct cpufreq_driver intel_pstate_driver = {
+    .verify       = intel_pstate_verify_policy,
+    .setpolicy    = intel_pstate_set_policy,
+    .init         = intel_pstate_cpu_setup,
+    .exit         = intel_pstate_cpu_exit,
+    .update       = intel_pstate_turbo_update,
+    .name         = "intel_pstate",
+};
+
+static int intel_pstate_msrs_not_valid(void)
+{
+    if ( !pstate_funcs.get_max() ||
+         !pstate_funcs.get_min() ||
+         !pstate_funcs.get_turbo() )
+        return -ENODEV;
+
+    return 0;
+}
+
+static void __init copy_pid_params(struct pstate_adjust_policy *policy)
+{
+    pid_params.sample_rate_ms = policy->sample_rate_ms;
+    pid_params.p_gain_pct = policy->p_gain_pct;
+    pid_params.i_gain_pct = policy->i_gain_pct;
+    pid_params.d_gain_pct = policy->d_gain_pct;
+    pid_params.deadband = policy->deadband;
+    pid_params.setpoint = policy->setpoint;
+}
+
+static void __init copy_cpu_funcs(struct pstate_funcs *funcs)
+{
+    pstate_funcs.get_max   = funcs->get_max;
+    pstate_funcs.get_min   = funcs->get_min;
+    pstate_funcs.get_turbo = funcs->get_turbo;
+    pstate_funcs.get_scaling = funcs->get_scaling;
+    pstate_funcs.set       = funcs->set;
+    pstate_funcs.get_vid   = funcs->get_vid;
+}
+
+int __init intel_pstate_init(void)
+{
+    int cpu, rc = 0;
+    const struct x86_cpu_id *id;
+    struct cpu_defaults *cpu_info;
+
+
+    id = x86_match_cpu(intel_pstate_cpu_ids);
+    if ( !id )
+        return -ENODEV;
+
+    cpu_info = (struct cpu_defaults *)id->driver_data;
+
+    copy_pid_params(&cpu_info->pid_policy);
+    copy_cpu_funcs(&cpu_info->funcs);
+
+    if ( intel_pstate_msrs_not_valid() )
+        return -ENODEV;
+
+    all_cpu_data = xzalloc_array(struct cpudata *, NR_CPUS);
+    if ( !all_cpu_data )
+        return -ENOMEM;
+
+    rc = cpufreq_register_driver(&intel_pstate_driver);
+    if ( rc )
+        goto out;
+
+    return rc;
+out:
+    for_each_online_cpu(cpu)
+    {
+        if ( all_cpu_data[cpu] )
+        {
+            kill_timer(&all_cpu_data[cpu]->timer);
+            xfree(all_cpu_data[cpu]);
+        }
+    }
+    xfree(all_cpu_data);
+    return -ENODEV;
+}
diff --git a/xen/include/acpi/cpufreq/cpufreq.h b/xen/include/acpi/cpufreq/cpufreq.h
index 48bd94d..502774f 100644
--- a/xen/include/acpi/cpufreq/cpufreq.h
+++ b/xen/include/acpi/cpufreq/cpufreq.h
@@ -107,6 +107,12 @@ struct cpufreq_freqs {
  *                          CPUFREQ GOVERNORS                        *
  *********************************************************************/
 
+/* Please add internal governors here */
+#define INTERNAL_GOV_POWERSAVE        (1)
+#define INTERNAL_GOV_PERFORMANCE      (2)
+#define INTERNAL_GOV_USERSPACE        (3)
+#define INTERNAL_GOV_ONDEMAND         (4)
+
 #define CPUFREQ_GOV_START  1
 #define CPUFREQ_GOV_STOP   2
 #define CPUFREQ_GOV_LIMITS 3
diff --git a/xen/include/asm-x86/cpufreq.h b/xen/include/asm-x86/cpufreq.h
new file mode 100644
index 0000000..94410f8
--- /dev/null
+++ b/xen/include/asm-x86/cpufreq.h
@@ -0,0 +1,34 @@
+#ifndef _ASM_X86_CPUFREQ_H
+#define _ASM_X86_CPUFREQ_H
+
+/*
+ *  Copyright (C) 2015 Wei Wang <wei.w.wang@intel.com>
+ *
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, write to the Free Software
+ *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ *
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ */
+
+extern int intel_pstate_init(void);
+
+/*
+ * Maximum transition latency is in nanoseconds - if it's unknown,
+ * CPUFREQ_ETERNAL shall be used.
+ */
+#define CPUFREQ_ETERNAL        (-1)
+
+#endif /* _ASM_X86_CPUFREQ_H */
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index e9c4723..3b59801 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -52,6 +52,8 @@
 #define MSR_IA32_MCG_STATUS		0x0000017a
 #define MSR_IA32_MCG_CTL		0x0000017b
 
+#define MSR_NHM_TURBO_RATIO_LIMIT	0x000001ad
+
 #define MSR_IA32_PEBS_ENABLE		0x000003f1
 #define MSR_IA32_DS_AREA		0x00000600
 #define MSR_IA32_PERF_CAPABILITIES	0x00000345
@@ -320,6 +322,7 @@
 #define MSR_IA32_MISC_ENABLE_MONITOR_ENABLE (1<<18)
 #define MSR_IA32_MISC_ENABLE_LIMIT_CPUID  (1<<22)
 #define MSR_IA32_MISC_ENABLE_XTPR_DISABLE (1<<23)
+#define MSR_IA32_MISC_ENABLE_TURBO_DISABLE (1ULL<<38)
 
 #define MSR_IA32_TSC_DEADLINE		0x000006E0
 #define MSR_IA32_ENERGY_PERF_BIAS	0x000001b0
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 7/9] x86/intel_pstate: add a booting param to select the driver to load
  2015-09-14  2:32 [PATCH v5 0/9] Porting the intel_pstate driver to Xen Wei Wang
                   ` (5 preceding siblings ...)
  2015-09-14  2:32 ` [PATCH v5 6/9] x86/intel_pstate: the main boby of the intel_pstate driver Wei Wang
@ 2015-09-14  2:32 ` Wei Wang
  2015-09-17 16:08   ` Andrew Cooper
  2015-10-07 15:46   ` Jan Beulich
  2015-09-14  2:32 ` [PATCH v5 8/9] x86/intel_pstate: support the use of intel_pstate in pmstat.c Wei Wang
  2015-09-14  2:32 ` [PATCH v5 9/9] tools: enable xenpm to control the intel_pstate driver Wei Wang
  8 siblings, 2 replies; 44+ messages in thread
From: Wei Wang @ 2015-09-14  2:32 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, xen-devel; +Cc: Wei Wang

By default, the old P-state driver (acpi-freq) is used. Adding
"intel_pstate" to the Xen booting param list to enable the
use of intel_pstate. However, if intel_pstate is enabled on a
machine which does not support the driver (e.g. Nehalem), the
old P-state driver will be loaded due to the failure loading of
intel_pstate.

Also, adding the intel_pstate booting parameter to
xen-command-line.markdown.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
---
 docs/misc/xen-command-line.markdown      | 7 +++++++
 xen/arch/x86/acpi/cpufreq/cpufreq.c      | 9 ++++++---
 xen/arch/x86/acpi/cpufreq/intel_pstate.c | 4 ++++
 3 files changed, 17 insertions(+), 3 deletions(-)

 changes in v5:
 1) move the booting parameter into the intel_pstate_init() function - have
    it be a local variable;
 2) rename "intel_pstate_load" to "load".

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index a2e427c..2d70137 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -849,6 +849,13 @@ debug hypervisor only).
 ### idle\_latency\_factor
 > `= <integer>`
 
+### intel\_pstate
+> `= <boolean>`
+
+> Default: `false`
+
+Enable the loading of the intel pstate driver.
+
 ### ioapic\_ack
 > `= old | new`
 
diff --git a/xen/arch/x86/acpi/cpufreq/cpufreq.c b/xen/arch/x86/acpi/cpufreq/cpufreq.c
index 8494fa0..7e517b9 100644
--- a/xen/arch/x86/acpi/cpufreq/cpufreq.c
+++ b/xen/arch/x86/acpi/cpufreq/cpufreq.c
@@ -40,6 +40,7 @@
 #include <asm/processor.h>
 #include <asm/percpu.h>
 #include <asm/cpufeature.h>
+#include <asm/cpufreq.h>
 #include <acpi/acpi.h>
 #include <acpi/cpufreq/cpufreq.h>
 
@@ -647,9 +648,11 @@ static int __init cpufreq_driver_init(void)
     int ret = 0;
 
     if ((cpufreq_controller == FREQCTL_xen) &&
-        (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL))
-        ret = cpufreq_register_driver(&acpi_cpufreq_driver);
-    else if ((cpufreq_controller == FREQCTL_xen) &&
+        (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)) {
+            ret = intel_pstate_init();
+            if (ret)
+                ret = cpufreq_register_driver(&acpi_cpufreq_driver);
+    } else if ((cpufreq_controller == FREQCTL_xen) &&
         (boot_cpu_data.x86_vendor == X86_VENDOR_AMD))
         ret = powernow_register_driver();
 
diff --git a/xen/arch/x86/acpi/cpufreq/intel_pstate.c b/xen/arch/x86/acpi/cpufreq/intel_pstate.c
index 3292bcd..4ebd9c7 100644
--- a/xen/arch/x86/acpi/cpufreq/intel_pstate.c
+++ b/xen/arch/x86/acpi/cpufreq/intel_pstate.c
@@ -843,7 +843,11 @@ int __init intel_pstate_init(void)
     int cpu, rc = 0;
     const struct x86_cpu_id *id;
     struct cpu_defaults *cpu_info;
+    static bool_t __read_mostly load;
+    boolean_param("intel_pstate", load);
 
+    if ( !load )
+        return -ENODEV;
 
     id = x86_match_cpu(intel_pstate_cpu_ids);
     if ( !id )
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 8/9] x86/intel_pstate: support the use of intel_pstate in pmstat.c
  2015-09-14  2:32 [PATCH v5 0/9] Porting the intel_pstate driver to Xen Wei Wang
                   ` (6 preceding siblings ...)
  2015-09-14  2:32 ` [PATCH v5 7/9] x86/intel_pstate: add a booting param to select the driver to load Wei Wang
@ 2015-09-14  2:32 ` Wei Wang
  2015-10-07 16:10   ` Jan Beulich
  2015-09-14  2:32 ` [PATCH v5 9/9] tools: enable xenpm to control the intel_pstate driver Wei Wang
  8 siblings, 1 reply; 44+ messages in thread
From: Wei Wang @ 2015-09-14  2:32 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, xen-devel; +Cc: Wei Wang

Add support in the pmstat.c so that the xenpm tool can request to
access the intel_pstate driver.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
---
 tools/libxc/xc_pm.c         |  15 ++--
 xen/drivers/acpi/pmstat.c   | 179 +++++++++++++++++++++++++++++++++++++-------
 xen/include/public/sysctl.h |  29 ++++---
 3 files changed, 179 insertions(+), 44 deletions(-)

 changes in v5:
 1) remove a intermediate variable, "scaling_avail_governors";
 2) add condition checks in set_cpufreq_para();
 3) replace the previous union "scaling_max" and "scaling_min" with
   "uint32_t scaling_max_perf" and "uint32_t scaling_min_perf";
 4) add "enum perf_alias", to indicate the meaning of "scaling_max_perf"
   and "scaling_min_perf" - holding Percentage values or Frequency values;
 5) re-organize the xen_get_cpufreq_para structure to make it less than
   128Byte;
 6) coding style changes.

diff --git a/tools/libxc/xc_pm.c b/tools/libxc/xc_pm.c
index 5b38cf1..5ad777a 100644
--- a/tools/libxc/xc_pm.c
+++ b/tools/libxc/xc_pm.c
@@ -260,13 +260,14 @@ int xc_get_cpufreq_para(xc_interface *xch, int cpuid,
     }
     else
     {
-        user_para->cpuinfo_cur_freq = sys_para->cpuinfo_cur_freq;
-        user_para->cpuinfo_max_freq = sys_para->cpuinfo_max_freq;
-        user_para->cpuinfo_min_freq = sys_para->cpuinfo_min_freq;
-        user_para->scaling_cur_freq = sys_para->scaling_cur_freq;
-        user_para->scaling_max_freq = sys_para->scaling_max_freq;
-        user_para->scaling_min_freq = sys_para->scaling_min_freq;
-        user_para->turbo_enabled    = sys_para->turbo_enabled;
+        user_para->cpuinfo_cur_freq  = sys_para->cpuinfo_cur_freq;
+        user_para->cpuinfo_max_freq  = sys_para->cpuinfo_max_freq;
+        user_para->cpuinfo_min_freq  = sys_para->cpuinfo_min_freq;
+        user_para->scaling_cur_freq  = sys_para->scaling_cur_freq;
+        user_para->scaling_max.pct   = sys_para->scaling_max_perf;
+        user_para->scaling_min.pct   = sys_para->scaling_min_perf;
+        user_para->scaling_turbo_pct = sys_para->scaling_turbo_pct;
+        user_para->turbo_enabled     = sys_para->turbo_enabled;
 
         memcpy(user_para->scaling_driver,
                 sys_para->scaling_driver, CPUFREQ_NAME_LEN);
diff --git a/xen/drivers/acpi/pmstat.c b/xen/drivers/acpi/pmstat.c
index 892260d..97893c5 100644
--- a/xen/drivers/acpi/pmstat.c
+++ b/xen/drivers/acpi/pmstat.c
@@ -191,7 +191,9 @@ static int get_cpufreq_para(struct xen_sysctl_pm_op *op)
     uint32_t ret = 0;
     const struct processor_pminfo *pmpt;
     struct cpufreq_policy *policy;
-    uint32_t gov_num = 0;
+    struct perf_limits *limits;
+    struct internal_governor *internal_gov;
+    uint32_t cur_gov, gov_num = 0;
     uint32_t *affected_cpus;
     uint32_t *scaling_available_frequencies;
     char     *scaling_available_governors;
@@ -200,13 +202,21 @@ static int get_cpufreq_para(struct xen_sysctl_pm_op *op)
 
     pmpt = processor_pminfo[op->cpuid];
     policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
+    limits = &policy->limits;
+    internal_gov = policy->internal_gov;
+    cur_gov = internal_gov ? internal_gov->cur_gov : 0;
 
     if ( !pmpt || !pmpt->perf.states ||
-         !policy || !policy->governor )
+         !policy || (!policy->governor && !policy->internal_gov) )
         return -EINVAL;
 
-    list_for_each(pos, &cpufreq_governor_list)
-        gov_num++;
+    if ( internal_gov )
+        gov_num = internal_gov->gov_num;
+    else
+    {
+        list_for_each(pos, &cpufreq_governor_list)
+            gov_num++;
+    }
 
     if ( (op->u.get_para.cpu_num  != cpumask_weight(policy->cpus)) ||
          (op->u.get_para.freq_num != pmpt->perf.state_count)    ||
@@ -240,40 +250,88 @@ static int get_cpufreq_para(struct xen_sysctl_pm_op *op)
     if ( ret )
         return ret;
 
-    if ( !(scaling_available_governors =
-           xzalloc_array(char, gov_num * CPUFREQ_NAME_LEN)) )
-        return -ENOMEM;
-    if ( (ret = read_scaling_available_governors(scaling_available_governors,
-                gov_num * CPUFREQ_NAME_LEN * sizeof(char))) )
+    if ( internal_gov )
     {
+        ret = copy_to_guest(op->u.get_para.scaling_available_governors,
+                    internal_gov->avail_gov, gov_num * CPUFREQ_NAME_LEN);
+        if ( ret )
+            return ret;
+    }
+    else
+    {
+        if ( !(scaling_available_governors =
+               xzalloc_array(char, gov_num * CPUFREQ_NAME_LEN)) )
+            return -ENOMEM;
+        if ( (ret = read_scaling_available_governors(scaling_available_governors,
+                gov_num * CPUFREQ_NAME_LEN * sizeof(char))) )
+        {
+            xfree(scaling_available_governors);
+            return ret;
+        }
+        ret = copy_to_guest(op->u.get_para.scaling_available_governors,
+                    scaling_available_governors, gov_num * CPUFREQ_NAME_LEN);
         xfree(scaling_available_governors);
-        return ret;
+        if ( ret )
+            return ret;
     }
-    ret = copy_to_guest(op->u.get_para.scaling_available_governors,
-                scaling_available_governors, gov_num * CPUFREQ_NAME_LEN);
-    xfree(scaling_available_governors);
-    if ( ret )
-        return ret;
-
     op->u.get_para.cpuinfo_cur_freq =
         cpufreq_driver->get ? cpufreq_driver->get(op->cpuid) : policy->cur;
     op->u.get_para.cpuinfo_max_freq = policy->cpuinfo.max_freq;
     op->u.get_para.cpuinfo_min_freq = policy->cpuinfo.min_freq;
     op->u.get_para.scaling_cur_freq = policy->cur;
-    op->u.get_para.scaling_max_freq = policy->max;
-    op->u.get_para.scaling_min_freq = policy->min;
+    if ( internal_gov )
+    {
+        op->u.get_para.scaling_max_perf = limits->max_perf_pct;
+        op->u.get_para.scaling_min_perf = limits->min_perf_pct;
+        op->u.get_para.scaling_turbo_pct = limits->turbo_pct;
+    }
+    else
+    {
+        op->u.get_para.scaling_max_perf = policy->max;
+        op->u.get_para.scaling_min_perf = policy->min;
+    }
 
     if ( cpufreq_driver->name[0] )
+    {
         strlcpy(op->u.get_para.scaling_driver, 
             cpufreq_driver->name, CPUFREQ_NAME_LEN);
+        if ( !strncmp(cpufreq_driver->name,
+                     "intel_pstate", CPUFREQ_NAME_LEN) )
+            op->u.get_para.perf_alias = PERCENTAGE;
+        else
+            op->u.get_para.perf_alias = FREQUENCY;
+    }
     else
+    {
         strlcpy(op->u.get_para.scaling_driver, "Unknown", CPUFREQ_NAME_LEN);
+        op->u.get_para.perf_alias = FREQUENCY;
+    }
 
-    if ( policy->governor->name[0] )
-        strlcpy(op->u.get_para.scaling_governor, 
-            policy->governor->name, CPUFREQ_NAME_LEN);
-    else
-        strlcpy(op->u.get_para.scaling_governor, "Unknown", CPUFREQ_NAME_LEN);
+    switch ( cur_gov )
+    {
+    case INTERNAL_GOV_PERFORMANCE:
+        strlcpy(op->u.get_para.scaling_governor,
+                "performance", CPUFREQ_NAME_LEN);
+        break;
+    case INTERNAL_GOV_POWERSAVE:
+        strlcpy(op->u.get_para.scaling_governor,
+                "powersave", CPUFREQ_NAME_LEN);
+        break;
+    case INTERNAL_GOV_USERSPACE:
+        strlcpy(op->u.get_para.scaling_governor,
+                "userspace", CPUFREQ_NAME_LEN);
+        break;
+    case INTERNAL_GOV_ONDEMAND:
+        strlcpy(op->u.get_para.scaling_governor,
+                "ondemand", CPUFREQ_NAME_LEN);
+        break;
+    default:
+        if ( policy->governor->name[0] )
+            strlcpy(op->u.get_para.scaling_governor,
+                policy->governor->name, CPUFREQ_NAME_LEN);
+        else
+            strlcpy(op->u.get_para.scaling_governor, "Unknown", CPUFREQ_NAME_LEN);
+    }
 
     /* governor specific para */
     if ( !strnicmp(op->u.get_para.scaling_governor, 
@@ -299,16 +357,36 @@ static int get_cpufreq_para(struct xen_sysctl_pm_op *op)
 static int set_cpufreq_gov(struct xen_sysctl_pm_op *op)
 {
     struct cpufreq_policy new_policy, *old_policy;
+    struct internal_governor *internal_gov;
 
     old_policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
     if ( !old_policy )
         return -EINVAL;
+    internal_gov = old_policy->internal_gov;
 
     memcpy(&new_policy, old_policy, sizeof(struct cpufreq_policy));
 
-    new_policy.governor = __find_governor(op->u.set_gov.scaling_governor);
-    if (new_policy.governor == NULL)
-        return -EINVAL;
+    if ( internal_gov && internal_gov->cur_gov )
+    {
+        if ( !strnicmp(op->u.set_gov.scaling_governor,
+                       "performance", CPUFREQ_NAME_LEN) )
+            internal_gov->cur_gov = INTERNAL_GOV_PERFORMANCE;
+        else if ( !strnicmp(op->u.set_gov.scaling_governor,
+                       "powersave", CPUFREQ_NAME_LEN) )
+            internal_gov->cur_gov = INTERNAL_GOV_POWERSAVE;
+        else if ( !strnicmp(op->u.set_gov.scaling_governor,
+                       "userspace", CPUFREQ_NAME_LEN) )
+            internal_gov->cur_gov = INTERNAL_GOV_USERSPACE;
+        else if ( !strnicmp(op->u.set_gov.scaling_governor,
+                       "ondemand", CPUFREQ_NAME_LEN) )
+            internal_gov->cur_gov = INTERNAL_GOV_ONDEMAND;
+    }
+    else
+    {
+        new_policy.governor = __find_governor(op->u.set_gov.scaling_governor);
+        if ( new_policy.governor == NULL )
+            return -EINVAL;
+    }
 
     return __cpufreq_set_policy(old_policy, &new_policy);
 }
@@ -317,10 +395,12 @@ static int set_cpufreq_para(struct xen_sysctl_pm_op *op)
 {
     int ret = 0;
     struct cpufreq_policy *policy;
+    struct internal_governor *internal_gov;
 
     policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
+    internal_gov = policy->internal_gov;
 
-    if ( !policy || !policy->governor )
+    if ( !policy || (!policy->governor && !internal_gov) )
         return -EINVAL;
 
     switch(op->u.set_para.ctrl_type)
@@ -329,6 +409,9 @@ static int set_cpufreq_para(struct xen_sysctl_pm_op *op)
     {
         struct cpufreq_policy new_policy;
 
+	if ( !policy->governor || internal_gov )
+            return -EINVAL;
+
         memcpy(&new_policy, policy, sizeof(struct cpufreq_policy));
         new_policy.max = op->u.set_para.ctrl_value;
         ret = __cpufreq_set_policy(policy, &new_policy);
@@ -340,6 +423,9 @@ static int set_cpufreq_para(struct xen_sysctl_pm_op *op)
     {
         struct cpufreq_policy new_policy;
 
+	if ( !policy->governor || internal_gov )
+            return -EINVAL;
+
         memcpy(&new_policy, policy, sizeof(struct cpufreq_policy));
         new_policy.min = op->u.set_para.ctrl_value;
         ret = __cpufreq_set_policy(policy, &new_policy);
@@ -347,10 +433,43 @@ static int set_cpufreq_para(struct xen_sysctl_pm_op *op)
         break;
     }
 
+    case SCALING_MAX_PCT:
+    {
+        struct cpufreq_policy new_policy;
+        struct perf_limits *limits = &new_policy.limits;
+
+	if ( policy->governor || !internal_gov )
+            return -EINVAL;
+
+        new_policy = *policy;
+        limits->max_perf_pct = clamp_t(uint32_t, op->u.set_para.ctrl_value,
+			limits->min_policy_pct, limits->max_policy_pct);
+        ret = __cpufreq_set_policy(policy, &new_policy);
+        break;
+    }
+
+    case SCALING_MIN_PCT:
+    {
+        struct cpufreq_policy new_policy;
+        struct perf_limits *limits = &new_policy.limits;
+
+	if ( policy->governor || !internal_gov )
+            return -EINVAL;
+
+        new_policy = *policy;
+        limits->min_perf_pct = clamp_t(uint32_t, op->u.set_para.ctrl_value,
+			limits->min_policy_pct, limits->max_policy_pct);
+        ret = __cpufreq_set_policy(policy, &new_policy);
+        break;
+    }
+
     case SCALING_SETSPEED:
     {
         unsigned int freq =op->u.set_para.ctrl_value;
 
+	if ( !policy->governor || internal_gov )
+            return -EINVAL;
+
         if ( !strnicmp(policy->governor->name,
                        "userspace", CPUFREQ_NAME_LEN) )
             ret = write_userspace_scaling_setspeed(op->cpuid, freq);
@@ -364,6 +483,9 @@ static int set_cpufreq_para(struct xen_sysctl_pm_op *op)
     {
         unsigned int sampling_rate = op->u.set_para.ctrl_value;
 
+	if ( !policy->governor || internal_gov )
+            return -EINVAL;
+
         if ( !strnicmp(policy->governor->name,
                        "ondemand", CPUFREQ_NAME_LEN) )
             ret = write_ondemand_sampling_rate(sampling_rate);
@@ -377,6 +499,9 @@ static int set_cpufreq_para(struct xen_sysctl_pm_op *op)
     {
         unsigned int up_threshold = op->u.set_para.ctrl_value;
 
+	if ( !policy->governor || internal_gov )
+            return -EINVAL;
+
         if ( !strnicmp(policy->governor->name,
                        "ondemand", CPUFREQ_NAME_LEN) )
             ret = write_ondemand_up_threshold(up_threshold);
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 0cacacc..6d39fe9 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -297,11 +297,28 @@ typedef struct xen_ondemand xen_ondemand_t;
  * same as sysfs file name of native linux
  */
 #define CPUFREQ_NAME_LEN 16
+
+enum perf_alias {
+    FREQUENCY  = 0,
+    PERCENTAGE = 1
+};
+
 struct xen_get_cpufreq_para {
     /* IN/OUT variable */
     uint32_t cpu_num;
     uint32_t freq_num;
     uint32_t gov_num;
+    int32_t turbo_enabled;
+
+    uint32_t cpuinfo_cur_freq;
+    uint32_t cpuinfo_max_freq;
+    uint32_t cpuinfo_min_freq;
+    uint32_t scaling_cur_freq;
+
+    uint32_t scaling_turbo_pct;
+    uint32_t scaling_max_perf;
+    uint32_t scaling_min_perf;
+    enum perf_alias perf_alias;
 
     /* for all governors */
     /* OUT variable */
@@ -309,23 +326,13 @@ struct xen_get_cpufreq_para {
     XEN_GUEST_HANDLE_64(uint32) scaling_available_frequencies;
     XEN_GUEST_HANDLE_64(char)   scaling_available_governors;
     char scaling_driver[CPUFREQ_NAME_LEN];
-
-    uint32_t cpuinfo_cur_freq;
-    uint32_t cpuinfo_max_freq;
-    uint32_t cpuinfo_min_freq;
-    uint32_t scaling_cur_freq;
-
     char scaling_governor[CPUFREQ_NAME_LEN];
-    uint32_t scaling_max_freq;
-    uint32_t scaling_min_freq;
 
     /* for specific governor */
     union {
         struct  xen_userspace userspace;
         struct  xen_ondemand ondemand;
     } u;
-
-    int32_t turbo_enabled;
 };
 
 struct xen_set_cpufreq_gov {
@@ -338,6 +345,8 @@ struct xen_set_cpufreq_para {
     #define SCALING_SETSPEED           3
     #define SAMPLING_RATE              4
     #define UP_THRESHOLD               5
+    #define SCALING_MAX_PCT            6
+    #define SCALING_MIN_PCT            7
 
     uint32_t ctrl_type;
     uint32_t ctrl_value;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v5 9/9] tools: enable xenpm to control the intel_pstate driver
  2015-09-14  2:32 [PATCH v5 0/9] Porting the intel_pstate driver to Xen Wei Wang
                   ` (7 preceding siblings ...)
  2015-09-14  2:32 ` [PATCH v5 8/9] x86/intel_pstate: support the use of intel_pstate in pmstat.c Wei Wang
@ 2015-09-14  2:32 ` Wei Wang
  8 siblings, 0 replies; 44+ messages in thread
From: Wei Wang @ 2015-09-14  2:32 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, xen-devel; +Cc: Wei Wang

The intel_pstate driver receives percentage values to set the
performance limits. This patch adds interfaces to support the
input of percentage values to control the intel_pstate driver.
The "get-cpufreq-para" is modified to show percentage
based feedback info.
Also, some changes in identation are made to make the printed
info looks tidy.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
---
 tools/libxc/include/xenctrl.h |  21 ++++----
 tools/libxc/xc_pm.c           |   5 +-
 tools/misc/xenpm.c            | 116 ++++++++++++++++++++++++++++++++++--------
 3 files changed, 108 insertions(+), 34 deletions(-)

 changes in v5:
 1) re-organize "struct xc_get_cpufreq_para" to make it less than 128Byte;
 2) change to use switch() and enum based perf_alias, instead of string
    comparisons.

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 2000f12..27f9f18 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2249,6 +2249,17 @@ struct xc_get_cpufreq_para {
     uint32_t cpu_num;
     uint32_t freq_num;
     uint32_t gov_num;
+    int32_t turbo_enabled;
+
+    uint32_t cpuinfo_cur_freq;
+    uint32_t cpuinfo_max_freq;
+    uint32_t cpuinfo_min_freq;
+    uint32_t scaling_cur_freq;
+
+    uint32_t scaling_turbo_pct;
+    uint32_t scaling_max_perf;
+    uint32_t scaling_min_perf;
+    enum perf_alias perf_alias;
 
     /* for all governors */
     /* OUT variable */
@@ -2256,23 +2267,13 @@ struct xc_get_cpufreq_para {
     uint32_t *scaling_available_frequencies;
     char     *scaling_available_governors;
     char scaling_driver[CPUFREQ_NAME_LEN];
-
-    uint32_t cpuinfo_cur_freq;
-    uint32_t cpuinfo_max_freq;
-    uint32_t cpuinfo_min_freq;
-    uint32_t scaling_cur_freq;
-
     char scaling_governor[CPUFREQ_NAME_LEN];
-    uint32_t scaling_max_freq;
-    uint32_t scaling_min_freq;
 
     /* for specific governor */
     union {
         xc_userspace_t userspace;
         xc_ondemand_t ondemand;
     } u;
-
-    int32_t turbo_enabled;
 };
 
 int xc_get_cpufreq_para(xc_interface *xch, int cpuid,
diff --git a/tools/libxc/xc_pm.c b/tools/libxc/xc_pm.c
index 5ad777a..2e22ae4 100644
--- a/tools/libxc/xc_pm.c
+++ b/tools/libxc/xc_pm.c
@@ -264,8 +264,9 @@ int xc_get_cpufreq_para(xc_interface *xch, int cpuid,
         user_para->cpuinfo_max_freq  = sys_para->cpuinfo_max_freq;
         user_para->cpuinfo_min_freq  = sys_para->cpuinfo_min_freq;
         user_para->scaling_cur_freq  = sys_para->scaling_cur_freq;
-        user_para->scaling_max.pct   = sys_para->scaling_max_perf;
-        user_para->scaling_min.pct   = sys_para->scaling_min_perf;
+        user_para->scaling_max_perf  = sys_para->scaling_max_perf;
+        user_para->scaling_min_perf  = sys_para->scaling_min_perf;
+        user_para->perf_alias        = sys_para->perf_alias;
         user_para->scaling_turbo_pct = sys_para->scaling_turbo_pct;
         user_para->turbo_enabled     = sys_para->turbo_enabled;
 
diff --git a/tools/misc/xenpm.c b/tools/misc/xenpm.c
index 08f2242..49ceb89 100644
--- a/tools/misc/xenpm.c
+++ b/tools/misc/xenpm.c
@@ -32,6 +32,11 @@
 #define MAX_CORE_RESIDENCIES 8
 
 #define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))
+#define min_t(type,x,y) \
+        ({ type __x = (x); type __y = (y); __x < __y ? __x: __y; })
+#define max_t(type,x,y) \
+        ({ type __x = (x); type __y = (y); __x > __y ? __x: __y; })
+#define clamp_t(type, val, lo, hi) min_t(type, max_t(type, val, lo), hi)
 
 static xc_interface *xc_handle;
 static unsigned int max_cpu_nr;
@@ -46,6 +51,9 @@ void show_help(void)
             " get-cpuidle-states    [cpuid]       list cpu idle info of CPU <cpuid> or all\n"
             " get-cpufreq-states    [cpuid]       list cpu freq info of CPU <cpuid> or all\n"
             " get-cpufreq-para      [cpuid]       list cpu freq parameter of CPU <cpuid> or all\n"
+            " set-scaling-max-pct   [cpuid] <num> set max performance limit in percentage\n"
+            "                                     or as scaling speed in percentage in userspace governor\n"
+            " set-scaling-min-pct   [cpuid] <num> set min performance limit in percentage\n"
             " set-scaling-maxfreq   [cpuid] <HZ>  set max cpu frequency <HZ> on CPU <cpuid>\n"
             "                                     or all CPUs\n"
             " set-scaling-minfreq   [cpuid] <HZ>  set min cpu frequency <HZ> on CPU <cpuid>\n"
@@ -59,10 +67,10 @@ void show_help(void)
             " set-up-threshold      [cpuid] <num> set up threshold on CPU <cpuid> or all\n"
             "                                     it is used in ondemand governor.\n"
             " get-cpu-topology                    get thread/core/socket topology info\n"
-            " set-sched-smt           enable|disable enable/disable scheduler smt power saving\n"
+            " set-sched-smt                       enable|disable enable/disable scheduler smt power saving\n"
             " set-vcpu-migration-delay      <num> set scheduler vcpu migration delay in us\n"
             " get-vcpu-migration-delay            get scheduler vcpu migration delay\n"
-            " set-max-cstate        <num>         set the C-State limitation (<num> >= 0)\n"
+            " set-max-cstate                <num> set the C-State limitation (<num> >= 0)\n"
             " start [seconds]                     start collect Cx/Px statistics,\n"
             "                                     output after CTRL-C or SIGINT or several seconds.\n"
             " enable-turbo-mode     [cpuid]       enable Turbo Mode for processors that support it.\n"
@@ -677,37 +685,51 @@ static void print_cpufreq_para(int cpuid, struct xc_get_cpufreq_para *p_cpufreq)
 
     printf("current_governor     : %s\n", p_cpufreq->scaling_governor);
     if ( !strncmp(p_cpufreq->scaling_governor,
-                  "userspace", CPUFREQ_NAME_LEN) )
+                  "userspace", CPUFREQ_NAME_LEN) &&
+          strncmp(p_cpufreq->scaling_driver,
+                  "intel_pstate", CPUFREQ_NAME_LEN) )
     {
-        printf("  userspace specific :\n");
-        printf("    scaling_setspeed : %u\n",
+        printf("userspace specific   :\n");
+        printf("scaling_setspeed     : %u\n",
                p_cpufreq->u.userspace.scaling_setspeed);
     }
     else if ( !strncmp(p_cpufreq->scaling_governor,
-                       "ondemand", CPUFREQ_NAME_LEN) )
+                       "ondemand", CPUFREQ_NAME_LEN) &&
+               strncmp(p_cpufreq->scaling_driver,
+                       "intel_pstate", CPUFREQ_NAME_LEN) )
     {
-        printf("  ondemand specific  :\n");
-        printf("    sampling_rate    : max [%u] min [%u] cur [%u]\n",
+        printf("ondemand specific    :\n");
+        printf("sampling_rate        : max [%u] min [%u] cur [%u]\n",
                p_cpufreq->u.ondemand.sampling_rate_max,
                p_cpufreq->u.ondemand.sampling_rate_min,
                p_cpufreq->u.ondemand.sampling_rate);
-        printf("    up_threshold     : %u\n",
+        printf("up_threshold         : %u\n",
                p_cpufreq->u.ondemand.up_threshold);
     }
 
-    printf("scaling_avail_freq   :");
-    for ( i = 0; i < p_cpufreq->freq_num; i++ )
-        if ( p_cpufreq->scaling_available_frequencies[i] ==
-             p_cpufreq->scaling_cur_freq )
-            printf(" *%d", p_cpufreq->scaling_available_frequencies[i]);
-        else
-            printf(" %d", p_cpufreq->scaling_available_frequencies[i]);
-    printf("\n");
-
-    printf("scaling frequency    : max [%u] min [%u] cur [%u]\n",
-           p_cpufreq->scaling_max_freq,
-           p_cpufreq->scaling_min_freq,
-           p_cpufreq->scaling_cur_freq);
+    switch ( p_cpufreq->perf_alias )
+    {
+     case PERCENTAGE:
+        printf("max_perf_pct         : %d\n", p_cpufreq->scaling_max_perf);
+        printf("min_perf_pct         : %d\n", p_cpufreq->scaling_min_perf);
+        printf("turbo_pct            : %d\n", p_cpufreq->scaling_turbo_pct);
+        break;
+    case FREQUENCY:
+    default:
+        printf("scaling_avail_freq   :");
+        for ( i = 0; i < p_cpufreq->freq_num; i++ )
+            if ( p_cpufreq->scaling_available_frequencies[i] ==
+                 p_cpufreq->scaling_cur_freq )
+                printf(" *%d", p_cpufreq->scaling_available_frequencies[i]);
+            else
+                printf(" %d", p_cpufreq->scaling_available_frequencies[i]);
+        printf("\n");
+        printf("scaling frequency    : max [%u] min [%u] cur [%u]\n",
+               p_cpufreq->scaling_max_perf,
+               p_cpufreq->scaling_min_perf,
+               p_cpufreq->scaling_cur_freq);
+        break;
+    }
 
     printf("turbo mode           : %s\n",
            p_cpufreq->turbo_enabled ? "enabled" : "disabled or n/a");
@@ -854,6 +876,54 @@ void scaling_min_freq_func(int argc, char *argv[])
     }
 }
 
+void scaling_max_pct_func(int argc, char *argv[])
+{
+    int cpuid = -1, pct = -1;
+
+    parse_cpuid_and_int(argc, argv, &cpuid, &pct, "percentage");
+    pct = clamp_t(int, pct, 0, 100);
+
+    if ( cpuid < 0 )
+    {
+        int i;
+        for ( i = 0; i < max_cpu_nr; i++ )
+            if ( xc_set_cpufreq_para(xc_handle, i, SCALING_MAX_PCT, pct) )
+                fprintf(stderr,
+                        "[CPU%d] failed to set scaling max freq (%d - %s)\n",
+                        i, errno, strerror(errno));
+    }
+    else
+    {
+        if ( xc_set_cpufreq_para(xc_handle, cpuid, SCALING_MAX_PCT, pct) )
+            fprintf(stderr, "failed to set scaling max freq (%d - %s)\n",
+                    errno, strerror(errno));
+    }
+}
+
+void scaling_min_pct_func(int argc, char *argv[])
+{
+    int cpuid = -1, pct = -1;
+
+    parse_cpuid_and_int(argc, argv, &cpuid, &pct, "percentage");
+    pct = clamp_t(int, pct, 0, 100);
+
+    if ( cpuid < 0 )
+    {
+        int i;
+        for ( i = 0; i < max_cpu_nr; i++ )
+            if ( xc_set_cpufreq_para(xc_handle, i, SCALING_MIN_PCT, pct) )
+                fprintf(stderr,
+                        "[CPU%d] failed to set scaling max pct (%d - %s)\n",
+                        i, errno, strerror(errno));
+    }
+    else
+    {
+        if ( xc_set_cpufreq_para(xc_handle, cpuid, SCALING_MIN_PCT, pct) )
+            fprintf(stderr, "failed to set scaling min pct (%d - %s)\n",
+                    errno, strerror(errno));
+    }
+}
+
 void scaling_speed_func(int argc, char *argv[])
 {
     int cpuid = -1, speed = -1;
@@ -1133,6 +1203,8 @@ struct {
     { "get-cpufreq-para", cpufreq_para_func },
     { "set-scaling-maxfreq", scaling_max_freq_func },
     { "set-scaling-minfreq", scaling_min_freq_func },
+    { "set-scaling-max-pct", scaling_max_pct_func},
+    { "set-scaling-min-pct", scaling_min_pct_func},
     { "set-scaling-governor", scaling_governor_func },
     { "set-scaling-speed", scaling_speed_func },
     { "set-sampling-rate", scaling_sampling_rate_func },
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 1/9] x86/intel_pstate: add some calculation related support
  2015-09-14  2:32 ` [PATCH v5 1/9] x86/intel_pstate: add some calculation related support Wei Wang
@ 2015-09-17 15:19   ` Andrew Cooper
  2015-10-05 16:00     ` Jan Beulich
  0 siblings, 1 reply; 44+ messages in thread
From: Andrew Cooper @ 2015-09-17 15:19 UTC (permalink / raw)
  To: Wei Wang, jbeulich, xen-devel

On 14/09/15 03:32, Wei Wang wrote:
> The added calculation related functions will be used in the intel_pstate.c.
> They are copied from the Linux kernel(commit 2418f4f2, f3002134, eb18cba7).
>
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> ---
>  xen/arch/x86/oprofile/op_model_athlon.c |  9 ----
>  xen/include/asm-x86/div64.h             | 79 +++++++++++++++++++++++++++++++++
>  xen/include/xen/kernel.h                | 23 ++++++++++
>  3 files changed, 102 insertions(+), 9 deletions(-)
>
>  changes in v5:
>  1) add clamp(), a type checking variant of clamp_t();
>  2) remove the private copy of clamp() in op_model_athlon.c.
>
> diff --git a/xen/arch/x86/oprofile/op_model_athlon.c b/xen/arch/x86/oprofile/op_model_athlon.c
> index c0a81ed..4122eee 100644
> --- a/xen/arch/x86/oprofile/op_model_athlon.c
> +++ b/xen/arch/x86/oprofile/op_model_athlon.c
> @@ -103,15 +103,6 @@ static u64 ibs_op_ctl;
>  #define IBS_FETCH_CODE                  13
>  #define IBS_OP_CODE                     14
>  
> -#define clamp(val, min, max) ({			\
> -	typeof(val) __val = (val);		\
> -	typeof(min) __min = (min);		\
> -	typeof(max) __max = (max);		\
> -	(void) (&__val == &__min);		\
> -	(void) (&__val == &__max);		\
> -	__val = __val < __min ? __min: __val;	\
> -	__val > __max ? __max: __val; })
> -
>  /*
>   * 16-bit Linear Feedback Shift Register (LFSR)
>   */
> diff --git a/xen/include/asm-x86/div64.h b/xen/include/asm-x86/div64.h
> index dd49f64..6ba03cb 100644
> --- a/xen/include/asm-x86/div64.h
> +++ b/xen/include/asm-x86/div64.h
> @@ -11,4 +11,83 @@
>      __rem;                                      \
>  })
>  

Comment to explain the functionality?

> +static inline uint64_t div_u64_rem(uint64_t dividend, uint32_t divisor,
> +                                      uint32_t *remainder)

Alignment

> +{
> +    *remainder = do_div(dividend, divisor);
> +    return dividend;
> +}
> +
> +static inline uint64_t div_u64(uint64_t dividend, uint32_t  divisor)


> +
> +/**
>   * container_of - cast a member of a structure out to the containing structure
>   *
>   * @ptr:	the pointer to the member.

Too many spaces before "divisor"

> +{
> +    uint32_t remainder;
> +
> +    return div_u64_rem(dividend, divisor, &remainder);
> +}
> +
> +/*
> + * div64_u64 - unsigned 64bit divide with 64bit divisor
> + * @dividend:    64bit dividend
> + * @divisor:    64bit divisor

Alignment also looks wonky here.

> + *
> + * This implementation is a modified version of the algorithm proposed
> + * by the book 'Hacker's Delight'.  The original source and full proof
> + * can be found here and is available for use without restriction.
> + *
> + * 'http://www.hackersdelight.org/HDcode/newCode/divDouble.c.txt'
> + */
> +static inline uint64_t div64_u64(uint64_t dividend, uint64_t divisor)
> +{
> +    uint32_t high = divisor >> 32;
> +    uint64_t quot;
> +
> +    if ( high == 0 )
> +        quot = div_u64(dividend, divisor);
> +    else
> +    {
> +        int n = 1 + fls(high);
> +
> +        quot = div_u64(dividend >> n, divisor >> n);
> +
> +        if ( quot != 0 )
> +            quot--;
> +        if ( (dividend - quot * divisor) >= divisor )
> +            quot++;
> +    }
> +    return quot;
> +}
> +

Comment?

> +static inline int64_t div_s64_rem(int64_t dividend, int32_t divisor,
> +                                     int32_t *remainder)

Alignment.

> +{
> +    int64_t quotient;
> +
> +    if ( dividend < 0 )
> +    {
> +        quotient = div_u64_rem(-dividend, ABS(divisor),
> +                        (uint32_t *)remainder);

Alignment.

> +        *remainder = -*remainder;
> +        if ( divisor > 0 )
> +            quotient = -quotient;
> +    }
> +    else
> +    {
> +        quotient = div_u64_rem(dividend, ABS(divisor),
> +                        (uint32_t *)remainder);

Alignment.

> +        if ( divisor < 0 )
> +            quotient = -quotient;
> +    }
> +    return quotient;
> +}
> +
> +/*
> + * div_s64 - signed 64bit divide with 32bit divisor
> + */
> +static inline int64_t div_s64(int64_t dividend, int32_t divisor)
> +{
> +    int32_t remainder;
> +
> +    return div_s64_rem(dividend, divisor, &remainder);
> +}
> +
>  #endif
> diff --git a/xen/include/xen/kernel.h b/xen/include/xen/kernel.h
> index 548b64d..9812698 100644
> --- a/xen/include/xen/kernel.h
> +++ b/xen/include/xen/kernel.h
> @@ -43,6 +43,29 @@
>  #define MAX(x,y) ((x) > (y) ? (x) : (y))
>  
>  /**
> + * clamp - return a value clamped to a given range with strict typechecking
> + * @val: current value
> + * @lo: lowest allowable value
> + * @hi: highest allowable value
> + *
> + * This macro does strict typechecking of lo/hi to make sure they are of the
> + * same type as val.  See the unnecessary pointer comparisons.
> + */
> +#define clamp(val, lo, hi) min((typeof(val))max(val, lo), hi)

This is a change of behaviour from the clamp() you removed, as this now
evaluates its arguments multiple times.

Please use a ({ }) style macro to avoid evaluating the arguments
multiple times.

> +
> +/*
> + * clamp_t - return a value clamped to a given range using a given type
> + * @type: the type of variable to use
> + * @val: current value
> + * @lo: minimum allowable value
> + * @hi: maximum allowable value
> + *
> + * This macro does no typechecking and uses temporary variables of type
> + * 'type' to make all the comparisons.
> + */
> +#define clamp_t(type, val, lo, hi) min_t(type, max_t(type, val, lo), hi)

And here as well please.

~Andrew

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 2/9] x86/intel_pstate: APERF/MPERF feature detect
  2015-09-14  2:32 ` [PATCH v5 2/9] x86/intel_pstate: APERF/MPERF feature detect Wei Wang
@ 2015-09-17 15:26   ` Andrew Cooper
  2015-10-05 16:14   ` Jan Beulich
  1 sibling, 0 replies; 44+ messages in thread
From: Andrew Cooper @ 2015-09-17 15:26 UTC (permalink / raw)
  To: Wei Wang, jbeulich, xen-devel

On 14/09/15 03:32, Wei Wang wrote:
> Add support to detect the APERF/MPERF feature. Also, remove the identical
> code in cpufreq.c and powernow.c. This patch is independent of the
> earlier patches.
>
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 3/9] x86/intel_pstate: add a new driver interface, setpolicy()
  2015-09-14  2:32 ` [PATCH v5 3/9] x86/intel_pstate: add a new driver interface, setpolicy() Wei Wang
@ 2015-09-17 15:34   ` Andrew Cooper
  2015-10-06 15:37   ` Jan Beulich
  1 sibling, 0 replies; 44+ messages in thread
From: Andrew Cooper @ 2015-09-17 15:34 UTC (permalink / raw)
  To: Wei Wang, jbeulich, xen-devel

On 14/09/15 03:32, Wei Wang wrote:
> In order to better support future Intel processors, intel_pstate
> changes to use percentage values to tune P-states. The setpolicy
> driver interface is used to configure the intel_pstate internal
> policy. The __cpufreq_set_policy needs to be intercepted to use
> the setpolicy driver if it exists.
>
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 4/9] x86/intel_pstate: relocate the driver register function
  2015-09-14  2:32 ` [PATCH v5 4/9] x86/intel_pstate: relocate the driver register function Wei Wang
@ 2015-09-17 15:38   ` Andrew Cooper
  2015-09-21 13:13     ` Jan Beulich
  2015-10-07 15:08   ` Jan Beulich
  1 sibling, 1 reply; 44+ messages in thread
From: Andrew Cooper @ 2015-09-17 15:38 UTC (permalink / raw)
  To: Wei Wang, jbeulich, xen-devel

On 14/09/15 03:32, Wei Wang wrote:
> Move the driver register function to
> the cpufreq.c.
>
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> ---
>  xen/drivers/cpufreq/cpufreq.c      | 15 +++++++++++++++
>  xen/include/acpi/cpufreq/cpufreq.h | 27 +--------------------------
>  2 files changed, 16 insertions(+), 26 deletions(-)
>
>  changes in v5:
>  1) keep cpufreq_presmp_init() intact.
>
> diff --git a/xen/drivers/cpufreq/cpufreq.c b/xen/drivers/cpufreq/cpufreq.c
> index 567e9e9..0c437d4 100644
> --- a/xen/drivers/cpufreq/cpufreq.c
> +++ b/xen/drivers/cpufreq/cpufreq.c
> @@ -638,3 +638,18 @@ static int __init cpufreq_presmp_init(void)
>  }
>  presmp_initcall(cpufreq_presmp_init);
>  
> +int cpufreq_register_driver(struct cpufreq_driver *driver_data)
> +{
> +   if ( !driver_data || !driver_data->init ||
> +        !driver_data->verify || !driver_data->exit ||
> +        (!driver_data->target == !driver_data->setpolicy) )

This line will incur the wrath of newer GCC's which have warnings
against such logic.

Either bracket the (!driver_data->$X) or alter the logic itself.

Also, the commit message needs to be corrected to state that you are
also introducing a new check to the registration function.

~Andrew

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 5/9] x86/intel_pstate: changes in cpufreq_del_cpu for CPU offline
  2015-09-14  2:32 ` [PATCH v5 5/9] x86/intel_pstate: changes in cpufreq_del_cpu for CPU offline Wei Wang
@ 2015-09-17 15:43   ` Andrew Cooper
  2015-10-07 15:28   ` Jan Beulich
  1 sibling, 0 replies; 44+ messages in thread
From: Andrew Cooper @ 2015-09-17 15:43 UTC (permalink / raw)
  To: Wei Wang, jbeulich, xen-devel

On 14/09/15 03:32, Wei Wang wrote:
> We change to NULL the cpufreq_cpu_policy pointer after the call of
> cpufreq_driver->exit, because the pointer is still needed in
> intel_pstate_set_pstate().
>
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> ---
>  xen/drivers/cpufreq/cpufreq.c      | 6 +++---
>  xen/include/acpi/cpufreq/cpufreq.h | 7 +++++++
>  2 files changed, 10 insertions(+), 3 deletions(-)
>
>  changes in v5:
>  1) put this patch prior to the "main body of intel pstate driver", which is 
>     one of the acceptable options suggested by the Jan.
>
> diff --git a/xen/drivers/cpufreq/cpufreq.c b/xen/drivers/cpufreq/cpufreq.c
> index 0c437d4..5485944 100644
> --- a/xen/drivers/cpufreq/cpufreq.c
> +++ b/xen/drivers/cpufreq/cpufreq.c
> @@ -334,12 +334,11 @@ int cpufreq_del_cpu(unsigned int cpu)
>  
>      /* for HW_ALL, stop gov for each core of the _PSD domain */
>      /* for SW_ALL & SW_ANY, stop gov for the 1st core of the _PSD domain */
> -    if (hw_all || (cpumask_weight(cpufreq_dom->map) ==
> -                   perf->domain_info.num_processors))
> +    if (!policy->internal_gov && (hw_all || (cpumask_weight(cpufreq_dom->map) ==
> +                   perf->domain_info.num_processors)))

This will be easier to read if you change the location of the linebreak
as such:

perf->domain_info.num_processors)))


if (!policy->internal_gov &&
    (hw_all || (cpumask_weight(cpufreq_dom->map) ==
                   perf->domain_info.num_processors)))

>          __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
>  
>      cpufreq_statistic_exit(cpu);
> -    per_cpu(cpufreq_cpu_policy, cpu) = NULL;
>      cpumask_clear_cpu(cpu, policy->cpus);
>      cpumask_clear_cpu(cpu, cpufreq_dom->map);
>  
> @@ -348,6 +347,7 @@ int cpufreq_del_cpu(unsigned int cpu)
>          free_cpumask_var(policy->cpus);
>          xfree(policy);
>      }
> +    per_cpu(cpufreq_cpu_policy, cpu) = NULL;
>  
>      /* for the last cpu of the domain, clean room */
>      /* It's safe here to free freq_table, drv_data and policy */
> diff --git a/xen/include/acpi/cpufreq/cpufreq.h b/xen/include/acpi/cpufreq/cpufreq.h
> index c6976d0..48bd94d 100644
> --- a/xen/include/acpi/cpufreq/cpufreq.h
> +++ b/xen/include/acpi/cpufreq/cpufreq.h
> @@ -53,6 +53,12 @@ struct perf_limits {
>      uint32_t min_policy_pct;
>  };
>  
> +struct internal_governor {
> +    char *avail_gov;
> +    uint32_t gov_num;
> +    uint32_t cur_gov;
> +};
> +

Adding this structure needs a mention in the commit message.

~Andrew

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 6/9] x86/intel_pstate: the main boby of the intel_pstate driver
  2015-09-14  2:32 ` [PATCH v5 6/9] x86/intel_pstate: the main boby of the intel_pstate driver Wei Wang
@ 2015-09-17 15:51   ` Andrew Cooper
  2015-10-07 15:39   ` Jan Beulich
  1 sibling, 0 replies; 44+ messages in thread
From: Andrew Cooper @ 2015-09-17 15:51 UTC (permalink / raw)
  To: Wei Wang, jbeulich, xen-devel

On 14/09/15 03:32, Wei Wang wrote:
> We simply grab the fundamental logic of the intel_pstate driver
> from Linux kernel, and customize it to Xen style.

How much customisation?  Is it just style, or other additions as well? 
(Deletions are less interesting)

For files we import directly from Linux, we keep Linux style to simplify
taking new patches in the future.  Whether it is sensible to do so here
depends on how modified it is from Linux.

> diff --git a/xen/include/asm-x86/cpufreq.h b/xen/include/asm-x86/cpufreq.h
> new file mode 100644
> index 0000000..94410f8
> --- /dev/null
> +++ b/xen/include/asm-x86/cpufreq.h
> @@ -0,0 +1,34 @@
> +#ifndef _ASM_X86_CPUFREQ_H
> +#define _ASM_X86_CPUFREQ_H
> +
> +/*
> + *  Copyright (C) 2015 Wei Wang <wei.w.wang@intel.com>
> + *
> + * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> + *
> + *  This program is free software; you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation; either version 2 of the License, or
> + *  (at your option) any later version.
> + *
> + *  This program is distributed in the hope that it will be useful,
> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + *  GNU General Public License for more details.
> + *
> + *  You should have received a copy of the GNU General Public License
> + *  along with this program; if not, write to the Free Software
> + *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA

Per c/s 443701e "Replace FSF street address with canonical URL", please
correct this GPL header.

~Andrew

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 7/9] x86/intel_pstate: add a booting param to select the driver to load
  2015-09-14  2:32 ` [PATCH v5 7/9] x86/intel_pstate: add a booting param to select the driver to load Wei Wang
@ 2015-09-17 16:08   ` Andrew Cooper
  2015-09-21 13:15     ` Jan Beulich
  2015-10-07 15:46   ` Jan Beulich
  1 sibling, 1 reply; 44+ messages in thread
From: Andrew Cooper @ 2015-09-17 16:08 UTC (permalink / raw)
  To: Wei Wang, jbeulich, xen-devel

On 14/09/15 03:32, Wei Wang wrote:
> By default, the old P-state driver (acpi-freq) is used. Adding
> "intel_pstate" to the Xen booting param list to enable the
> use of intel_pstate. However, if intel_pstate is enabled on a
> machine which does not support the driver (e.g. Nehalem), the
> old P-state driver will be loaded due to the failure loading of
> intel_pstate.
>
> Also, adding the intel_pstate booting parameter to
> xen-command-line.markdown.
>
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> ---
>  docs/misc/xen-command-line.markdown      | 7 +++++++
>  xen/arch/x86/acpi/cpufreq/cpufreq.c      | 9 ++++++---
>  xen/arch/x86/acpi/cpufreq/intel_pstate.c | 4 ++++
>  3 files changed, 17 insertions(+), 3 deletions(-)
>
>  changes in v5:
>  1) move the booting parameter into the intel_pstate_init() function - have
>     it be a local variable;
>  2) rename "intel_pstate_load" to "load".
>
> diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
> index a2e427c..2d70137 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -849,6 +849,13 @@ debug hypervisor only).
>  ### idle\_latency\_factor
>  > `= <integer>`
>  
> +### intel\_pstate
> +> `= <boolean>`
> +
> +> Default: `false`
> +
> +Enable the loading of the intel pstate driver.
> +
>  ### ioapic\_ack
>  > `= old | new`
>  
> diff --git a/xen/arch/x86/acpi/cpufreq/cpufreq.c b/xen/arch/x86/acpi/cpufreq/cpufreq.c
> index 8494fa0..7e517b9 100644
> --- a/xen/arch/x86/acpi/cpufreq/cpufreq.c
> +++ b/xen/arch/x86/acpi/cpufreq/cpufreq.c
> @@ -40,6 +40,7 @@
>  #include <asm/processor.h>
>  #include <asm/percpu.h>
>  #include <asm/cpufeature.h>
> +#include <asm/cpufreq.h>
>  #include <acpi/acpi.h>
>  #include <acpi/cpufreq/cpufreq.h>
>  
> @@ -647,9 +648,11 @@ static int __init cpufreq_driver_init(void)
>      int ret = 0;
>  
>      if ((cpufreq_controller == FREQCTL_xen) &&
> -        (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL))
> -        ret = cpufreq_register_driver(&acpi_cpufreq_driver);
> -    else if ((cpufreq_controller == FREQCTL_xen) &&
> +        (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)) {
> +            ret = intel_pstate_init();
> +            if (ret)
> +                ret = cpufreq_register_driver(&acpi_cpufreq_driver);

This looks like a mess (although not your fault to start with).

We shouldn't have multiple different top level command line options.  In
particular, having "mwait-idle" and "intel_pstate" seems wrong, given a
perfectly good "cpufreq=" option.

Is the driver in use available to change at runtime from `xenpm` ?

> +    } else if ((cpufreq_controller == FREQCTL_xen) &&
>          (boot_cpu_data.x86_vendor == X86_VENDOR_AMD))
>          ret = powernow_register_driver();
>  
> diff --git a/xen/arch/x86/acpi/cpufreq/intel_pstate.c b/xen/arch/x86/acpi/cpufreq/intel_pstate.c
> index 3292bcd..4ebd9c7 100644
> --- a/xen/arch/x86/acpi/cpufreq/intel_pstate.c
> +++ b/xen/arch/x86/acpi/cpufreq/intel_pstate.c
> @@ -843,7 +843,11 @@ int __init intel_pstate_init(void)
>      int cpu, rc = 0;
>      const struct x86_cpu_id *id;
>      struct cpu_defaults *cpu_info;
> +    static bool_t __read_mostly load;

__initdata

~Andrew

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 4/9] x86/intel_pstate: relocate the driver register function
  2015-09-17 15:38   ` Andrew Cooper
@ 2015-09-21 13:13     ` Jan Beulich
  0 siblings, 0 replies; 44+ messages in thread
From: Jan Beulich @ 2015-09-21 13:13 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Wang, xen-devel

>>> On 17.09.15 at 17:38, <andrew.cooper3@citrix.com> wrote:
> On 14/09/15 03:32, Wei Wang wrote:
>> Move the driver register function to
>> the cpufreq.c.
>>
>> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
>> ---
>>  xen/drivers/cpufreq/cpufreq.c      | 15 +++++++++++++++
>>  xen/include/acpi/cpufreq/cpufreq.h | 27 +--------------------------
>>  2 files changed, 16 insertions(+), 26 deletions(-)
>>
>>  changes in v5:
>>  1) keep cpufreq_presmp_init() intact.
>>
>> diff --git a/xen/drivers/cpufreq/cpufreq.c b/xen/drivers/cpufreq/cpufreq.c
>> index 567e9e9..0c437d4 100644
>> --- a/xen/drivers/cpufreq/cpufreq.c
>> +++ b/xen/drivers/cpufreq/cpufreq.c
>> @@ -638,3 +638,18 @@ static int __init cpufreq_presmp_init(void)
>>  }
>>  presmp_initcall(cpufreq_presmp_init);
>>  
>> +int cpufreq_register_driver(struct cpufreq_driver *driver_data)
>> +{
>> +   if ( !driver_data || !driver_data->init ||
>> +        !driver_data->verify || !driver_data->exit ||
>> +        (!driver_data->target == !driver_data->setpolicy) )
> 
> This line will incur the wrath of newer GCC's which have warnings
> against such logic.

Hmm, I think we have other instances of such, without gcc
complaining. Iirc there was a bug in an early 5.0-rc which got
fixed in the final 5.1.0.

> Either bracket the (!driver_data->$X) or alter the logic itself.

I'd prefer to avoid either change.

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 7/9] x86/intel_pstate: add a booting param to select the driver to load
  2015-09-17 16:08   ` Andrew Cooper
@ 2015-09-21 13:15     ` Jan Beulich
  0 siblings, 0 replies; 44+ messages in thread
From: Jan Beulich @ 2015-09-21 13:15 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: Wei Wang, xen-devel

>>> On 17.09.15 at 18:08, <andrew.cooper3@citrix.com> wrote:
> We shouldn't have multiple different top level command line options.  In
> particular, having "mwait-idle" and "intel_pstate" seems wrong, given a
> perfectly good "cpufreq=" option.

"mwait-idle" is C-state related.

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 1/9] x86/intel_pstate: add some calculation related support
  2015-09-17 15:19   ` Andrew Cooper
@ 2015-10-05 16:00     ` Jan Beulich
  0 siblings, 0 replies; 44+ messages in thread
From: Jan Beulich @ 2015-10-05 16:00 UTC (permalink / raw)
  To: Andrew Cooper, Wei Wang; +Cc: xen-devel

>>> On 17.09.15 at 17:19, <andrew.cooper3@citrix.com> wrote:
> On 14/09/15 03:32, Wei Wang wrote:
>> --- a/xen/include/xen/kernel.h
>> +++ b/xen/include/xen/kernel.h
>> @@ -43,6 +43,29 @@
>>  #define MAX(x,y) ((x) > (y) ? (x) : (y))
>>  
>>  /**
>> + * clamp - return a value clamped to a given range with strict typechecking
>> + * @val: current value
>> + * @lo: lowest allowable value
>> + * @hi: highest allowable value
>> + *
>> + * This macro does strict typechecking of lo/hi to make sure they are of the
>> + * same type as val.  See the unnecessary pointer comparisons.
>> + */
>> +#define clamp(val, lo, hi) min((typeof(val))max(val, lo), hi)
> 
> This is a change of behaviour from the clamp() you removed, as this now
> evaluates its arguments multiple times.
> 
> Please use a ({ }) style macro to avoid evaluating the arguments
> multiple times.

So where do you see the multiple argument evaluation? The only
odd thing I can spot is the type cast on the result of max(), which
ought to be superfluous (due to the strict type handling of both
max() and min()).

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 2/9] x86/intel_pstate: APERF/MPERF feature detect
  2015-09-14  2:32 ` [PATCH v5 2/9] x86/intel_pstate: APERF/MPERF feature detect Wei Wang
  2015-09-17 15:26   ` Andrew Cooper
@ 2015-10-05 16:14   ` Jan Beulich
  1 sibling, 0 replies; 44+ messages in thread
From: Jan Beulich @ 2015-10-05 16:14 UTC (permalink / raw)
  To: Wei Wang; +Cc: andrew.cooper3, xen-devel

>>> On 14.09.15 at 04:32, <wei.w.wang@intel.com> wrote:
>  changes in v5:
>  1) define macros for 0x1 and CPUID leaf5;
>  2) add a statement stating that this patch is independent of the
>     previous ones.

This statement doesn't belong in the commit message, i.e. should go
after the first ---.

> --- a/xen/arch/x86/acpi/cpufreq/cpufreq.c
> +++ b/xen/arch/x86/acpi/cpufreq/cpufreq.c
> @@ -50,7 +50,6 @@ enum {
>  };
>  
>  #define INTEL_MSR_RANGE         (0xffffull)
> -#define CPUID_6_ECX_APERFMPERF_CAPABILITY       (0x1)
>  
>  struct acpi_cpufreq_data *cpufreq_drv_data[NR_CPUS];
>  
> @@ -351,10 +350,9 @@ static unsigned int get_cur_freq_on_cpu(unsigned int cpu)
>  static void feature_detect(void *info)
>  {
>      struct cpufreq_policy *policy = info;
> -    unsigned int eax, ecx;
> +    unsigned int eax;
>  
> -    ecx = cpuid_ecx(6);
> -    if (ecx & CPUID_6_ECX_APERFMPERF_CAPABILITY) {
> +    if ( cpu_has_aperfmperf ) {

You should have fully fixed coding style here.

> --- a/xen/arch/x86/cpu/common.c
> +++ b/xen/arch/x86/cpu/common.c
> @@ -239,6 +239,10 @@ static void __cpuinit generic_identify(struct cpuinfo_x86 *c)
>  	if ( cpu_has(c, X86_FEATURE_CLFLSH) )
>  		c->x86_clflush_size = ((ebx >> 8) & 0xff) * 8;
>  
> +	if ( (c->cpuid_level > CPUID_PM_LEAF) &&
> +		(cpuid_ecx(CPUID_PM_LEAF) & CPUID6_ECX_APERFMPERF_CAPABILITY) )

Indentation.

> --- a/xen/include/asm-x86/cpufeature.h
> +++ b/xen/include/asm-x86/cpufeature.h
> @@ -69,6 +69,7 @@
>  #define X86_FEATURE_XTOPOLOGY    (3*32+13) /* cpu topology enum extensions */
>  #define X86_FEATURE_CPUID_FAULTING (3*32+14) /* cpuid faulting */
>  #define X86_FEATURE_CLFLUSH_MONITOR (3*32+15) /* clflush reqd with monitor */
> +#define X86_FEATURE_APERFMPERF (3*32+28) /* APERFMPERF */

Why 28 instead of the consecutive 16?

> @@ -190,6 +194,7 @@
>  #define cpu_has_page1gb		boot_cpu_has(X86_FEATURE_PAGE1GB)
>  #define cpu_has_efer		1
>  #define cpu_has_fsgsbase	boot_cpu_has(X86_FEATURE_FSGSBASE)
> +#define cpu_has_aperfmperf      boot_cpu_has(X86_FEATURE_APERFMPERF)

Should use tab for aligning the right side to match the lines right
above.

I've taken care of all of these for you, but I would really
appreciate you taking care of such going forward.

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 3/9] x86/intel_pstate: add a new driver interface, setpolicy()
  2015-09-14  2:32 ` [PATCH v5 3/9] x86/intel_pstate: add a new driver interface, setpolicy() Wei Wang
  2015-09-17 15:34   ` Andrew Cooper
@ 2015-10-06 15:37   ` Jan Beulich
  1 sibling, 0 replies; 44+ messages in thread
From: Jan Beulich @ 2015-10-06 15:37 UTC (permalink / raw)
  To: andrew.cooper3, Wei Wang; +Cc: xen-devel

>>> On 14.09.15 at 04:32, <wei.w.wang@intel.com> wrote:
> In order to better support future Intel processors, intel_pstate
> changes to use percentage values to tune P-states. The setpolicy
> driver interface is used to configure the intel_pstate internal
> policy. The __cpufreq_set_policy needs to be intercepted to use
> the setpolicy driver if it exists.
> 
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>

Andrew having given his R-b I'm going to apply this as is, but I don't
see how ...

> --- a/xen/drivers/cpufreq/utility.c
> +++ b/xen/drivers/cpufreq/utility.c
> @@ -456,6 +456,9 @@ int __cpufreq_set_policy(struct cpufreq_policy *data,
>  
>      data->min = policy->min;
>      data->max = policy->max;
> +    data->limits = policy->limits;

... this and ...

> --- a/xen/include/acpi/cpufreq/cpufreq.h
> +++ b/xen/include/acpi/cpufreq/cpufreq.h
> @@ -41,6 +41,18 @@ struct cpufreq_cpuinfo {
>      unsigned int        transition_latency; /* in 10^(-9) s = nanoseconds */
>  };
>  
> +struct perf_limits {
> +    bool_t no_turbo;
> +    bool_t turbo_disabled;
> +    uint32_t turbo_pct;
> +    uint32_t max_perf_pct; /* max performance in percentage */
> +    uint32_t min_perf_pct; /* min performance in percentage */
> +    uint32_t max_perf;
> +    uint32_t min_perf;
> +    uint32_t max_policy_pct;
> +    uint32_t min_policy_pct;
> +};

.. this are related to the patch subject. In particular it is impossible
to tell at this point whether and how all of the fields above are
going to be used. Please try to split patch series into _logical_
pieces.

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 4/9] x86/intel_pstate: relocate the driver register function
  2015-09-14  2:32 ` [PATCH v5 4/9] x86/intel_pstate: relocate the driver register function Wei Wang
  2015-09-17 15:38   ` Andrew Cooper
@ 2015-10-07 15:08   ` Jan Beulich
  1 sibling, 0 replies; 44+ messages in thread
From: Jan Beulich @ 2015-10-07 15:08 UTC (permalink / raw)
  To: Wei Wang; +Cc: andrew.cooper3, xen-devel

>>> On 14.09.15 at 04:32, <wei.w.wang@intel.com> wrote:
> Move the driver register function to
> the cpufreq.c.

... and remove the (unused) de-registration one.

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 5/9] x86/intel_pstate: changes in cpufreq_del_cpu for CPU offline
  2015-09-14  2:32 ` [PATCH v5 5/9] x86/intel_pstate: changes in cpufreq_del_cpu for CPU offline Wei Wang
  2015-09-17 15:43   ` Andrew Cooper
@ 2015-10-07 15:28   ` Jan Beulich
  2015-10-11  2:19     ` Wang, Wei W
  1 sibling, 1 reply; 44+ messages in thread
From: Jan Beulich @ 2015-10-07 15:28 UTC (permalink / raw)
  To: Wei Wang; +Cc: andrew.cooper3, xen-devel

>>> On 14.09.15 at 04:32, <wei.w.wang@intel.com> wrote:
> We change to NULL the cpufreq_cpu_policy pointer after the call of
> cpufreq_driver->exit, because the pointer is still needed in
> intel_pstate_set_pstate().
> 
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> ---
>  xen/drivers/cpufreq/cpufreq.c      | 6 +++---
>  xen/include/acpi/cpufreq/cpufreq.h | 7 +++++++
>  2 files changed, 10 insertions(+), 3 deletions(-)
> 
>  changes in v5:
>  1) put this patch prior to the "main body of intel pstate driver", which is 
>     one of the acceptable options suggested by the Jan.

Did I? With the description above, I still don't see why this would be
needed (here or later):

> --- a/xen/drivers/cpufreq/cpufreq.c
> +++ b/xen/drivers/cpufreq/cpufreq.c
> @@ -334,12 +334,11 @@ int cpufreq_del_cpu(unsigned int cpu)
>  
>      /* for HW_ALL, stop gov for each core of the _PSD domain */
>      /* for SW_ALL & SW_ANY, stop gov for the 1st core of the _PSD domain */
> -    if (hw_all || (cpumask_weight(cpufreq_dom->map) ==
> -                   perf->domain_info.num_processors))
> +    if (!policy->internal_gov && (hw_all || (cpumask_weight(cpufreq_dom->map) ==
> +                   perf->domain_info.num_processors)))
>          __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
>  
>      cpufreq_statistic_exit(cpu);
> -    per_cpu(cpufreq_cpu_policy, cpu) = NULL;

Prior to this line we still have policy == per_cpu(cpufreq_cpu_policy, cpu),
i.e. the data needed is being made available to the ->exit() handler
called a few lines down. I.e. if you want to convince me or anyone
else that the change is needed, you will need to say in the description
why that is not enough.

> --- a/xen/include/acpi/cpufreq/cpufreq.h
> +++ b/xen/include/acpi/cpufreq/cpufreq.h
> @@ -53,6 +53,12 @@ struct perf_limits {
>      uint32_t min_policy_pct;
>  };
>  
> +struct internal_governor {
> +    char *avail_gov;
> +    uint32_t gov_num;
> +    uint32_t cur_gov;
> +};
> +
>  struct cpufreq_policy {
>      cpumask_var_t       cpus;          /* affected CPUs */
>      unsigned int        shared_type;   /* ANY or ALL affected CPUs
> @@ -66,6 +72,7 @@ struct cpufreq_policy {
>                                   * governors are used */
>      struct perf_limits  limits;
>      struct cpufreq_governor     *governor;
> +    struct internal_governor    *internal_gov;

Both of the changes to this file look unrelated to the purpose of the
patch, as does the one consumer of the new pointer.

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 6/9] x86/intel_pstate: the main boby of the intel_pstate driver
  2015-09-14  2:32 ` [PATCH v5 6/9] x86/intel_pstate: the main boby of the intel_pstate driver Wei Wang
  2015-09-17 15:51   ` Andrew Cooper
@ 2015-10-07 15:39   ` Jan Beulich
  1 sibling, 0 replies; 44+ messages in thread
From: Jan Beulich @ 2015-10-07 15:39 UTC (permalink / raw)
  To: Wei Wang; +Cc: andrew.cooper3, xen-devel

>>> On 14.09.15 at 04:32, <wei.w.wang@intel.com> wrote:
> We simply grab the fundamental logic of the intel_pstate driver
> from Linux kernel, and customize it to Xen style. In the kernel,
> a user can adjust the limits via sysfs
> (limits.min_sysfs_pct/max_sysfs_pct). In Xen, the
> policy->limits.min_perf_pct/max_perf_pct acts as the transit
> station. A user interacts with it via xenpm.
> 
> The new xen/include/asm-x86/cpufreq.h header file is added.
> 
> Signed-off-by: Wei Wang <wei.w.wang@intel.com>

I went back to the replies I sent on earlier versions (namely v2), and
finding several issues un-addressed or only partly addressed I'm not
going to re-review this patch in detail, NAK-ing it in its current shape.

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 7/9] x86/intel_pstate: add a booting param to select the driver to load
  2015-09-14  2:32 ` [PATCH v5 7/9] x86/intel_pstate: add a booting param to select the driver to load Wei Wang
  2015-09-17 16:08   ` Andrew Cooper
@ 2015-10-07 15:46   ` Jan Beulich
  2015-10-23  8:09     ` Wang, Wei W
                       ` (2 more replies)
  1 sibling, 3 replies; 44+ messages in thread
From: Jan Beulich @ 2015-10-07 15:46 UTC (permalink / raw)
  To: Wei Wang; +Cc: andrew.cooper3, xen-devel

>>> On 14.09.15 at 04:32, <wei.w.wang@intel.com> wrote:
> --- a/xen/arch/x86/acpi/cpufreq/cpufreq.c
> +++ b/xen/arch/x86/acpi/cpufreq/cpufreq.c
> @@ -40,6 +40,7 @@
>  #include <asm/processor.h>
>  #include <asm/percpu.h>
>  #include <asm/cpufeature.h>
> +#include <asm/cpufreq.h>

I think to make clear why this include is needed here it would be better
if you added the declaration of intel_pstate_init() also in this patch
instead of in the previous one.

> @@ -647,9 +648,11 @@ static int __init cpufreq_driver_init(void)
>      int ret = 0;
>  
>      if ((cpufreq_controller == FREQCTL_xen) &&
> -        (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL))
> -        ret = cpufreq_register_driver(&acpi_cpufreq_driver);
> -    else if ((cpufreq_controller == FREQCTL_xen) &&
> +        (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)) {
> +            ret = intel_pstate_init();
> +            if (ret)
> +                ret = cpufreq_register_driver(&acpi_cpufreq_driver);
> +    } else if ((cpufreq_controller == FREQCTL_xen) &&
>          (boot_cpu_data.x86_vendor == X86_VENDOR_AMD))
>          ret = powernow_register_driver();

Since you're basically modifying the entire body of the function,
please gets its coding style corrected as you fiddle with it.

> --- a/xen/arch/x86/acpi/cpufreq/intel_pstate.c
> +++ b/xen/arch/x86/acpi/cpufreq/intel_pstate.c
> @@ -843,7 +843,11 @@ int __init intel_pstate_init(void)
>      int cpu, rc = 0;
>      const struct x86_cpu_id *id;
>      struct cpu_defaults *cpu_info;
> +    static bool_t __read_mostly load;

Why __read_mostly when the function is __init?

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 8/9] x86/intel_pstate: support the use of intel_pstate in pmstat.c
  2015-09-14  2:32 ` [PATCH v5 8/9] x86/intel_pstate: support the use of intel_pstate in pmstat.c Wei Wang
@ 2015-10-07 16:10   ` Jan Beulich
  2015-10-26  6:26     ` Wang, Wei W
  0 siblings, 1 reply; 44+ messages in thread
From: Jan Beulich @ 2015-10-07 16:10 UTC (permalink / raw)
  To: Wei Wang; +Cc: andrew.cooper3, xen-devel

>>> On 14.09.15 at 04:32, <wei.w.wang@intel.com> wrote:
> --- a/tools/libxc/xc_pm.c
> +++ b/tools/libxc/xc_pm.c
> @@ -260,13 +260,14 @@ int xc_get_cpufreq_para(xc_interface *xch, int cpuid,
>      }
>      else
>      {
> -        user_para->cpuinfo_cur_freq = sys_para->cpuinfo_cur_freq;
> -        user_para->cpuinfo_max_freq = sys_para->cpuinfo_max_freq;
> -        user_para->cpuinfo_min_freq = sys_para->cpuinfo_min_freq;
> -        user_para->scaling_cur_freq = sys_para->scaling_cur_freq;
> -        user_para->scaling_max_freq = sys_para->scaling_max_freq;
> -        user_para->scaling_min_freq = sys_para->scaling_min_freq;
> -        user_para->turbo_enabled    = sys_para->turbo_enabled;
> +        user_para->cpuinfo_cur_freq  = sys_para->cpuinfo_cur_freq;
> +        user_para->cpuinfo_max_freq  = sys_para->cpuinfo_max_freq;
> +        user_para->cpuinfo_min_freq  = sys_para->cpuinfo_min_freq;
> +        user_para->scaling_cur_freq  = sys_para->scaling_cur_freq;
> +        user_para->scaling_max.pct   = sys_para->scaling_max_perf;
> +        user_para->scaling_min.pct   = sys_para->scaling_min_perf;
> +        user_para->scaling_turbo_pct = sys_para->scaling_turbo_pct;
> +        user_para->turbo_enabled     = sys_para->turbo_enabled;

You fail to communicate perf_alias here. How will the caller know?
It's also odd that you initialize .pct fields from _perf ones - can't the
naming and layout be arranged to match as much as possible?

> --- a/xen/drivers/acpi/pmstat.c
> +++ b/xen/drivers/acpi/pmstat.c
> @@ -191,7 +191,9 @@ static int get_cpufreq_para(struct xen_sysctl_pm_op *op)
>      uint32_t ret = 0;
>      const struct processor_pminfo *pmpt;
>      struct cpufreq_policy *policy;
> -    uint32_t gov_num = 0;
> +    struct perf_limits *limits;
> +    struct internal_governor *internal_gov;
> +    uint32_t cur_gov, gov_num = 0;
>      uint32_t *affected_cpus;
>      uint32_t *scaling_available_frequencies;
>      char     *scaling_available_governors;
> @@ -200,13 +202,21 @@ static int get_cpufreq_para(struct xen_sysctl_pm_op *op)
>  
>      pmpt = processor_pminfo[op->cpuid];
>      policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
> +    limits = &policy->limits;
> +    internal_gov = policy->internal_gov;
> +    cur_gov = internal_gov ? internal_gov->cur_gov : 0;

Bad literal 0 (see the uses of cur_gov below).

>      if ( !pmpt || !pmpt->perf.states ||
> -         !policy || !policy->governor )
> +         !policy || (!policy->governor && !policy->internal_gov) )

Either you think you need the local variable internal_gov - then use
it everywhere, or drop it.

>          return -EINVAL;
>  
> -    list_for_each(pos, &cpufreq_governor_list)
> -        gov_num++;
> +    if ( internal_gov )
> +        gov_num = internal_gov->gov_num;

Since you don't need cur_gov ahead of this, please limit the number
of conditionals by moving its initialization here.

>      op->u.get_para.cpuinfo_cur_freq =
>          cpufreq_driver->get ? cpufreq_driver->get(op->cpuid) : policy->cur;
>      op->u.get_para.cpuinfo_max_freq = policy->cpuinfo.max_freq;
>      op->u.get_para.cpuinfo_min_freq = policy->cpuinfo.min_freq;
>      op->u.get_para.scaling_cur_freq = policy->cur;
> -    op->u.get_para.scaling_max_freq = policy->max;
> -    op->u.get_para.scaling_min_freq = policy->min;
> +    if ( internal_gov )
> +    {
> +        op->u.get_para.scaling_max_perf = limits->max_perf_pct;
> +        op->u.get_para.scaling_min_perf = limits->min_perf_pct;
> +        op->u.get_para.scaling_turbo_pct = limits->turbo_pct;
> +    }
> +    else
> +    {
> +        op->u.get_para.scaling_max_perf = policy->max;
> +        op->u.get_para.scaling_min_perf = policy->min;
> +    }
>  
>      if ( cpufreq_driver->name[0] )
> +    {
>          strlcpy(op->u.get_para.scaling_driver, 
>              cpufreq_driver->name, CPUFREQ_NAME_LEN);
> +        if ( !strncmp(cpufreq_driver->name,
> +                     "intel_pstate", CPUFREQ_NAME_LEN) )
> +            op->u.get_para.perf_alias = PERCENTAGE;
> +        else
> +            op->u.get_para.perf_alias = FREQUENCY;

This should be done together with the other things in the if/else
right above.

> @@ -299,16 +357,36 @@ static int get_cpufreq_para(struct xen_sysctl_pm_op *op)
>  static int set_cpufreq_gov(struct xen_sysctl_pm_op *op)
>  {
>      struct cpufreq_policy new_policy, *old_policy;
> +    struct internal_governor *internal_gov;
>  
>      old_policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
>      if ( !old_policy )
>          return -EINVAL;
> +    internal_gov = old_policy->internal_gov;
>  
>      memcpy(&new_policy, old_policy, sizeof(struct cpufreq_policy));
>  
> -    new_policy.governor = __find_governor(op->u.set_gov.scaling_governor);
> -    if (new_policy.governor == NULL)
> -        return -EINVAL;
> +    if ( internal_gov && internal_gov->cur_gov )
> +    {
> +        if ( !strnicmp(op->u.set_gov.scaling_governor,
> +                       "performance", CPUFREQ_NAME_LEN) )
> +            internal_gov->cur_gov = INTERNAL_GOV_PERFORMANCE;
> +        else if ( !strnicmp(op->u.set_gov.scaling_governor,
> +                       "powersave", CPUFREQ_NAME_LEN) )
> +            internal_gov->cur_gov = INTERNAL_GOV_POWERSAVE;
> +        else if ( !strnicmp(op->u.set_gov.scaling_governor,
> +                       "userspace", CPUFREQ_NAME_LEN) )
> +            internal_gov->cur_gov = INTERNAL_GOV_USERSPACE;
> +        else if ( !strnicmp(op->u.set_gov.scaling_governor,
> +                       "ondemand", CPUFREQ_NAME_LEN) )
> +            internal_gov->cur_gov = INTERNAL_GOV_ONDEMAND;

Wouldn't this better be done by the internal governor's code, so
it can also for itself decide which of the kinds it may not want to
support?

> @@ -340,6 +423,9 @@ static int set_cpufreq_para(struct xen_sysctl_pm_op *op)
>      {
>          struct cpufreq_policy new_policy;
>  
> +	if ( !policy->governor || internal_gov )

Hard tab.

> @@ -347,10 +433,43 @@ static int set_cpufreq_para(struct xen_sysctl_pm_op *op)
>          break;
>      }
>  
> +    case SCALING_MAX_PCT:
> +    {
> +        struct cpufreq_policy new_policy;
> +        struct perf_limits *limits = &new_policy.limits;
> +
> +	if ( policy->governor || !internal_gov )
> +            return -EINVAL;
> +
> +        new_policy = *policy;
> +        limits->max_perf_pct = clamp_t(uint32_t, op->u.set_para.ctrl_value,
> +			limits->min_policy_pct, limits->max_policy_pct);

Afaict all three values are of the same type - why clamp_t() then
instead of clamp()?

> --- a/xen/include/public/sysctl.h
> +++ b/xen/include/public/sysctl.h
> @@ -297,11 +297,28 @@ typedef struct xen_ondemand xen_ondemand_t;
>   * same as sysfs file name of native linux
>   */
>  #define CPUFREQ_NAME_LEN 16
> +
> +enum perf_alias {

xen_ prefix missing.

> +    FREQUENCY  = 0,
> +    PERCENTAGE = 1
> +};
> +
>  struct xen_get_cpufreq_para {
>      /* IN/OUT variable */
>      uint32_t cpu_num;
>      uint32_t freq_num;
>      uint32_t gov_num;
> +    int32_t turbo_enabled;
> +
> +    uint32_t cpuinfo_cur_freq;
> +    uint32_t cpuinfo_max_freq;
> +    uint32_t cpuinfo_min_freq;
> +    uint32_t scaling_cur_freq;
> +
> +    uint32_t scaling_turbo_pct;
> +    uint32_t scaling_max_perf;
> +    uint32_t scaling_min_perf;
> +    enum perf_alias perf_alias;
>  
>      /* for all governors */
>      /* OUT variable */
> @@ -309,23 +326,13 @@ struct xen_get_cpufreq_para {
>      XEN_GUEST_HANDLE_64(uint32) scaling_available_frequencies;
>      XEN_GUEST_HANDLE_64(char)   scaling_available_governors;
>      char scaling_driver[CPUFREQ_NAME_LEN];
> -
> -    uint32_t cpuinfo_cur_freq;
> -    uint32_t cpuinfo_max_freq;
> -    uint32_t cpuinfo_min_freq;
> -    uint32_t scaling_cur_freq;
> -
>      char scaling_governor[CPUFREQ_NAME_LEN];
> -    uint32_t scaling_max_freq;
> -    uint32_t scaling_min_freq;
>  
>      /* for specific governor */
>      union {
>          struct  xen_userspace userspace;
>          struct  xen_ondemand ondemand;
>      } u;
> -
> -    int32_t turbo_enabled;
>  };

Is all of this re-arrangement really needed? Also can't turbo_enabled
and scaling_turbo_pct be combined into a single field?

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 5/9] x86/intel_pstate: changes in cpufreq_del_cpu for CPU offline
  2015-10-07 15:28   ` Jan Beulich
@ 2015-10-11  2:19     ` Wang, Wei W
  0 siblings, 0 replies; 44+ messages in thread
From: Wang, Wei W @ 2015-10-11  2:19 UTC (permalink / raw)
  To: Jan Beulich; +Cc: andrew.cooper3@citrix.com, xen-devel@lists.xen.org

On 07/10/2015 23:29,  Jan Beulich wrote:
> >>> On 14.09.15 at 04:32, <wei.w.wang@intel.com> wrote:
> > We change to NULL the cpufreq_cpu_policy pointer after the call of
> > cpufreq_driver->exit, because the pointer is still needed in
> > intel_pstate_set_pstate().
> >
> > Signed-off-by: Wei Wang <wei.w.wang@intel.com>
> > ---
> >  xen/drivers/cpufreq/cpufreq.c      | 6 +++---
> >  xen/include/acpi/cpufreq/cpufreq.h | 7 +++++++
> >  2 files changed, 10 insertions(+), 3 deletions(-)
> >
> > --- a/xen/drivers/cpufreq/cpufreq.c
> > +++ b/xen/drivers/cpufreq/cpufreq.c
> > @@ -334,12 +334,11 @@ int cpufreq_del_cpu(unsigned int cpu)
> >
> >      /* for HW_ALL, stop gov for each core of the _PSD domain */
> >      /* for SW_ALL & SW_ANY, stop gov for the 1st core of the _PSD domain */
> > -    if (hw_all || (cpumask_weight(cpufreq_dom->map) ==
> > -                   perf->domain_info.num_processors))
> > +    if (!policy->internal_gov && (hw_all || (cpumask_weight(cpufreq_dom-
> >map) ==
> > +                   perf->domain_info.num_processors)))
> >          __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
> >
> >      cpufreq_statistic_exit(cpu);
> > -    per_cpu(cpufreq_cpu_policy, cpu) = NULL;
> 
> Prior to this line we still have policy == per_cpu(cpufreq_cpu_policy, cpu), i.e.
> the data needed is being made available to the ->exit() handler called a few lines
> down. I.e. if you want to convince me or anyone else that the change is needed,
> you will need to say in the description why that is not enough.
> 

OK. Without this patch, we need to change intel_pstate_set_pstate() to receive the "policy" parameter from ->exit(), instead of referencing the global per_cpu() variable. The cumbersome part is that we need to change all other callers of intel_pstate_set_pstate() to get the "policy" parameter as well and pass it to the intel_pstate_set_pstate() function. I will include this change in the next version to see if this way is better.

Best,
Wei

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 7/9] x86/intel_pstate: add a booting param to select the driver to load
  2015-10-07 15:46   ` Jan Beulich
@ 2015-10-23  8:09     ` Wang, Wei W
  2015-10-23  8:16     ` Wang, Wei W
  2015-10-23  8:18     ` Wang, Wei W
  2 siblings, 0 replies; 44+ messages in thread
From: Wang, Wei W @ 2015-10-23  8:09 UTC (permalink / raw)
  To: Jan Beulich; +Cc: andrew.cooper3@citrix.com, xen-devel@lists.xen.org

On 07/10/2015 11:46,  Jan Beulich wrote:
> >>> On 14.09.15 at 04:32, <wei.w.wang@intel.com> wrote:
> > --- a/xen/arch/x86/acpi/cpufreq/cpufreq.c
> > +++ b/xen/arch/x86/acpi/cpufreq/cpufreq.c
> > @@ -647,9 +648,11 @@ static int __init cpufreq_driver_init(void)
> >      int ret = 0;
> >
> >      if ((cpufreq_controller == FREQCTL_xen) &&
> > -        (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL))
> > -        ret = cpufreq_register_driver(&acpi_cpufreq_driver);
> > -    else if ((cpufreq_controller == FREQCTL_xen) &&
> > +        (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)) {
> > +            ret = intel_pstate_init();
> > +            if (ret)
> > +                ret = cpufreq_register_driver(&acpi_cpufreq_driver);
> > +    } else if ((cpufreq_controller == FREQCTL_xen) &&
> >          (boot_cpu_data.x86_vendor == X86_VENDOR_AMD))
> >          ret = powernow_register_driver();
> 
> Since you're basically modifying the entire body of the function, please gets its
> coding style corrected as you fiddle with it.
> 

The coding style here is aligned with the existing piece of code in this file. If it's also good to make the file have two coding styles, please let me know. 

Best,
Wei

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 7/9] x86/intel_pstate: add a booting param to select the driver to load
  2015-10-07 15:46   ` Jan Beulich
  2015-10-23  8:09     ` Wang, Wei W
@ 2015-10-23  8:16     ` Wang, Wei W
  2015-10-23  8:18     ` Wang, Wei W
  2 siblings, 0 replies; 44+ messages in thread
From: Wang, Wei W @ 2015-10-23  8:16 UTC (permalink / raw)
  To: Jan Beulich; +Cc: andrew.cooper3@citrix.com, xen-devel@lists.xen.org

On 07/10/2015 23:46,  Jan Beulich wrote:
> >>> On 14.09.15 at 04:32, <wei.w.wang@intel.com> wrote:
> > --- a/xen/arch/x86/acpi/cpufreq/cpufreq.c
> > +++ b/xen/arch/x86/acpi/cpufreq/cpufreq.c
> > @@ -647,9 +648,11 @@ static int __init cpufreq_driver_init(void)
> >      int ret = 0;
> >
> >      if ((cpufreq_controller == FREQCTL_xen) &&
> > -        (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL))
> > -        ret = cpufreq_register_driver(&acpi_cpufreq_driver);
> > -    else if ((cpufreq_controller == FREQCTL_xen) &&
> > +        (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)) {
> > +            ret = intel_pstate_init();
> > +            if (ret)
> > +                ret = cpufreq_register_driver(&acpi_cpufreq_driver);
> > +    } else if ((cpufreq_controller == FREQCTL_xen) &&
> >          (boot_cpu_data.x86_vendor == X86_VENDOR_AMD))
> >          ret = powernow_register_driver();
> 
> Since you're basically modifying the entire body of the function, please gets its
> coding style corrected as you fiddle with it.

OK, I guess you was probably referring to the  this line - (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)).
Will align that as well.

Best,
Wei

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 7/9] x86/intel_pstate: add a booting param to select the driver to load
  2015-10-07 15:46   ` Jan Beulich
  2015-10-23  8:09     ` Wang, Wei W
  2015-10-23  8:16     ` Wang, Wei W
@ 2015-10-23  8:18     ` Wang, Wei W
  2015-10-23  8:35       ` Jan Beulich
  2 siblings, 1 reply; 44+ messages in thread
From: Wang, Wei W @ 2015-10-23  8:18 UTC (permalink / raw)
  To: Jan Beulich; +Cc: andrew.cooper3@citrix.com, xen-devel@lists.xen.org

On 07/10/2015 23:46,  Jan Beulich wrote:
> >>> On 14.09.15 at 04:32, <wei.w.wang@intel.com> wrote:
> > --- a/xen/arch/x86/acpi/cpufreq/cpufreq.c
> > +++ b/xen/arch/x86/acpi/cpufreq/cpufreq.c
> > @@ -647,9 +648,11 @@ static int __init cpufreq_driver_init(void)
> >      int ret = 0;
> >
> >      if ((cpufreq_controller == FREQCTL_xen) &&
> > -        (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL))
> > -        ret = cpufreq_register_driver(&acpi_cpufreq_driver);
> > -    else if ((cpufreq_controller == FREQCTL_xen) &&
> > +        (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)) {
> > +            ret = intel_pstate_init();
> > +            if (ret)
> > +                ret = cpufreq_register_driver(&acpi_cpufreq_driver);
> > +    } else if ((cpufreq_controller == FREQCTL_xen) &&
> >          (boot_cpu_data.x86_vendor == X86_VENDOR_AMD))
> >          ret = powernow_register_driver();
> 
> Since you're basically modifying the entire body of the function, please gets its
> coding style corrected as you fiddle with it.

Ok, I guess you was probably referring to the remaining lines in the function - "(boot_cpu_data.x86_vendor == X86_VENDOR_AMD))".. Will align them as well.

Best,
Wei

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 7/9] x86/intel_pstate: add a booting param to select the driver to load
  2015-10-23  8:18     ` Wang, Wei W
@ 2015-10-23  8:35       ` Jan Beulich
  2015-10-23  8:48         ` Wang, Wei W
  0 siblings, 1 reply; 44+ messages in thread
From: Jan Beulich @ 2015-10-23  8:35 UTC (permalink / raw)
  To: Wei W Wang; +Cc: andrew.cooper3@citrix.com, xen-devel@lists.xen.org

>>> On 23.10.15 at 10:18, <wei.w.wang@intel.com> wrote:
> On 07/10/2015 23:46,  Jan Beulich wrote:
>> >>> On 14.09.15 at 04:32, <wei.w.wang@intel.com> wrote:
>> > --- a/xen/arch/x86/acpi/cpufreq/cpufreq.c
>> > +++ b/xen/arch/x86/acpi/cpufreq/cpufreq.c
>> > @@ -647,9 +648,11 @@ static int __init cpufreq_driver_init(void)
>> >      int ret = 0;
>> >
>> >      if ((cpufreq_controller == FREQCTL_xen) &&
>> > -        (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL))
>> > -        ret = cpufreq_register_driver(&acpi_cpufreq_driver);
>> > -    else if ((cpufreq_controller == FREQCTL_xen) &&
>> > +        (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)) {
>> > +            ret = intel_pstate_init();
>> > +            if (ret)
>> > +                ret = cpufreq_register_driver(&acpi_cpufreq_driver);
>> > +    } else if ((cpufreq_controller == FREQCTL_xen) &&
>> >          (boot_cpu_data.x86_vendor == X86_VENDOR_AMD))
>> >          ret = powernow_register_driver();
>> 
>> Since you're basically modifying the entire body of the function, please 
> gets its
>> coding style corrected as you fiddle with it.
> 
> Ok, I guess you was probably referring to the remaining lines in the 
> function - "(boot_cpu_data.x86_vendor == X86_VENDOR_AMD))".. Will align them 
> as well.

No, I'm not just talking about alignment. And the coding style of the
file is mixed already (see e.g. the following function, which admittedly
has even more blanks than needed), so getting this function into
proper shape since you modify it in its entirety is a step in the right
direction.

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 7/9] x86/intel_pstate: add a booting param to select the driver to load
  2015-10-23  8:35       ` Jan Beulich
@ 2015-10-23  8:48         ` Wang, Wei W
  0 siblings, 0 replies; 44+ messages in thread
From: Wang, Wei W @ 2015-10-23  8:48 UTC (permalink / raw)
  To: Jan Beulich; +Cc: andrew.cooper3@citrix.com, xen-devel@lists.xen.org

On 23/10/2015 16:36,  Jan Beulich wrote:
> >>> On 23.10.15 at 10:18, <wei.w.wang@intel.com> wrote:
> > On 07/10/2015 23:46,  Jan Beulich wrote:
> >> >>> On 14.09.15 at 04:32, <wei.w.wang@intel.com> wrote:
> >> > --- a/xen/arch/x86/acpi/cpufreq/cpufreq.c
> >> > +++ b/xen/arch/x86/acpi/cpufreq/cpufreq.c
> >> > @@ -647,9 +648,11 @@ static int __init cpufreq_driver_init(void)
> >> >      int ret = 0;
> >> >
> >> >      if ((cpufreq_controller == FREQCTL_xen) &&
> >> > -        (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL))
> >> > -        ret = cpufreq_register_driver(&acpi_cpufreq_driver);
> >> > -    else if ((cpufreq_controller == FREQCTL_xen) &&
> >> > +        (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)) {
> >> > +            ret = intel_pstate_init();
> >> > +            if (ret)
> >> > +                ret = cpufreq_register_driver(&acpi_cpufreq_driver);
> >> > +    } else if ((cpufreq_controller == FREQCTL_xen) &&
> >> >          (boot_cpu_data.x86_vendor == X86_VENDOR_AMD))
> >> >          ret = powernow_register_driver();
> >>
> >> Since you're basically modifying the entire body of the function,
> >> please
> > gets its
> >> coding style corrected as you fiddle with it.
> >
> > Ok, I guess you was probably referring to the remaining lines in the
> > function - "(boot_cpu_data.x86_vendor == X86_VENDOR_AMD))".. Will
> > align them as well.
> 
> No, I'm not just talking about alignment. And the coding style of the file is mixed
> already (see e.g. the following function, which admittedly has even more blanks
> than needed), so getting this function into proper shape since you modify it in its
> entirety is a step in the right direction.
> 

Ok. I will use Xen style brackets for this function.

Best,
Wei

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 8/9] x86/intel_pstate: support the use of intel_pstate in pmstat.c
  2015-10-07 16:10   ` Jan Beulich
@ 2015-10-26  6:26     ` Wang, Wei W
  2015-10-26  7:02       ` Jan Beulich
  0 siblings, 1 reply; 44+ messages in thread
From: Wang, Wei W @ 2015-10-26  6:26 UTC (permalink / raw)
  To: Jan Beulich; +Cc: andrew.cooper3@citrix.com, xen-devel@lists.xen.org

On 08/10/2015 12:11,  Jan Beulich wrote:
> >>> On 14.09.15 at 04:32, <wei.w.wang@intel.com> wrote:
> > --- a/tools/libxc/xc_pm.c
> > +++ b/tools/libxc/xc_pm.c
> > @@ -260,13 +260,14 @@ int xc_get_cpufreq_para(xc_interface *xch, int
> cpuid,
> >      }
> >      else
> >      {
> > -        user_para->cpuinfo_cur_freq = sys_para->cpuinfo_cur_freq;
> > -        user_para->cpuinfo_max_freq = sys_para->cpuinfo_max_freq;
> > -        user_para->cpuinfo_min_freq = sys_para->cpuinfo_min_freq;
> > -        user_para->scaling_cur_freq = sys_para->scaling_cur_freq;
> > -        user_para->scaling_max_freq = sys_para->scaling_max_freq;
> > -        user_para->scaling_min_freq = sys_para->scaling_min_freq;
> > -        user_para->turbo_enabled    = sys_para->turbo_enabled;
> > +        user_para->cpuinfo_cur_freq  = sys_para->cpuinfo_cur_freq;
> > +        user_para->cpuinfo_max_freq  = sys_para->cpuinfo_max_freq;
> > +        user_para->cpuinfo_min_freq  = sys_para->cpuinfo_min_freq;
> > +        user_para->scaling_cur_freq  = sys_para->scaling_cur_freq;
> > +        user_para->scaling_max.pct   = sys_para->scaling_max_perf;
> > +        user_para->scaling_min.pct   = sys_para->scaling_min_perf;
> > +        user_para->scaling_turbo_pct = sys_para->scaling_turbo_pct;
> > +        user_para->turbo_enabled     = sys_para->turbo_enabled;
> 
> You fail to communicate perf_alias here. How will the caller know?

"user_para->perf_alias = sys_para->perf_alias" is added in the next patch "tools:...", because the the "perf_alias " field is added to xc_get_cpufreq_para there.

> >      op->u.get_para.cpuinfo_cur_freq =
> >          cpufreq_driver->get ? cpufreq_driver->get(op->cpuid) : policy->cur;
> >      op->u.get_para.cpuinfo_max_freq = policy->cpuinfo.max_freq;
> >      op->u.get_para.cpuinfo_min_freq = policy->cpuinfo.min_freq;
> >      op->u.get_para.scaling_cur_freq = policy->cur;
> > -    op->u.get_para.scaling_max_freq = policy->max;
> > -    op->u.get_para.scaling_min_freq = policy->min;
> > +    if ( internal_gov )
> > +    {
> > +        op->u.get_para.scaling_max_perf = limits->max_perf_pct;
> > +        op->u.get_para.scaling_min_perf = limits->min_perf_pct;
> > +        op->u.get_para.scaling_turbo_pct = limits->turbo_pct;
> > +    }
> > +    else
> > +    {
> > +        op->u.get_para.scaling_max_perf = policy->max;
> > +        op->u.get_para.scaling_min_perf = policy->min;
> > +    }
> >
> >      if ( cpufreq_driver->name[0] )
> > +    {
> >          strlcpy(op->u.get_para.scaling_driver,
> >              cpufreq_driver->name, CPUFREQ_NAME_LEN);
> > +        if ( !strncmp(cpufreq_driver->name,
> > +                     "intel_pstate", CPUFREQ_NAME_LEN) )
> > +            op->u.get_para.perf_alias = PERCENTAGE;
> > +        else
> > +            op->u.get_para.perf_alias = FREQUENCY;
> 
> This should be done together with the other things in the if/else right above.

The code here is in a different file from the if/else above. Here in pmstat.c , we set the perf_alias ("meaning of data"). In the above if/else(xc_pm.c), we pass the value from one end to another.
 
> > @@ -299,16 +357,36 @@ static int get_cpufreq_para(struct
> > xen_sysctl_pm_op *op)  static int set_cpufreq_gov(struct
> > xen_sysctl_pm_op *op)  {
> >      struct cpufreq_policy new_policy, *old_policy;
> > +    struct internal_governor *internal_gov;
> >
> >      old_policy = per_cpu(cpufreq_cpu_policy, op->cpuid);
> >      if ( !old_policy )
> >          return -EINVAL;
> > +    internal_gov = old_policy->internal_gov;
> >
> >      memcpy(&new_policy, old_policy, sizeof(struct cpufreq_policy));
> >
> > -    new_policy.governor = __find_governor(op->u.set_gov.scaling_governor);
> > -    if (new_policy.governor == NULL)
> > -        return -EINVAL;
> > +    if ( internal_gov && internal_gov->cur_gov )
> > +    {
> > +        if ( !strnicmp(op->u.set_gov.scaling_governor,
> > +                       "performance", CPUFREQ_NAME_LEN) )
> > +            internal_gov->cur_gov = INTERNAL_GOV_PERFORMANCE;
> > +        else if ( !strnicmp(op->u.set_gov.scaling_governor,
> > +                       "powersave", CPUFREQ_NAME_LEN) )
> > +            internal_gov->cur_gov = INTERNAL_GOV_POWERSAVE;
> > +        else if ( !strnicmp(op->u.set_gov.scaling_governor,
> > +                       "userspace", CPUFREQ_NAME_LEN) )
> > +            internal_gov->cur_gov = INTERNAL_GOV_USERSPACE;
> > +        else if ( !strnicmp(op->u.set_gov.scaling_governor,
> > +                       "ondemand", CPUFREQ_NAME_LEN) )
> > +            internal_gov->cur_gov = INTERNAL_GOV_ONDEMAND;
> 
> Wouldn't this better be done by the internal governor's code, so it can also for
> itself decide which of the kinds it may not want to support?

I think it should be pmstat.c-'s job here to set " internal_gov->cur_gov", which is later passed to the internal governor's implementation code, who then decides how to deal with the requested governor in "internal_gov->cur_gov".

> > --- a/xen/include/public/sysctl.h
> > +++ b/xen/include/public/sysctl.h
> > @@ -297,11 +297,28 @@ typedef struct xen_ondemand xen_ondemand_t;
> >   * same as sysfs file name of native linux
> >   */
> >  #define CPUFREQ_NAME_LEN 16
> > +
> > +enum perf_alias {
> 
> xen_ prefix missing.

This enum will also be used in xc_get_cpufreq_para, should we still add xen_ prefix here?

> > +    FREQUENCY  = 0,
> > +    PERCENTAGE = 1
> > +};
> > +
> >  struct xen_get_cpufreq_para {
> >      /* IN/OUT variable */
> >      uint32_t cpu_num;
> >      uint32_t freq_num;
> >      uint32_t gov_num;
> > +    int32_t turbo_enabled;
> > +
> > +    uint32_t cpuinfo_cur_freq;
> > +    uint32_t cpuinfo_max_freq;
> > +    uint32_t cpuinfo_min_freq;
> > +    uint32_t scaling_cur_freq;
> > +
> > +    uint32_t scaling_turbo_pct;
> > +    uint32_t scaling_max_perf;
> > +    uint32_t scaling_min_perf;
> > +    enum perf_alias perf_alias;
> >
> >      /* for all governors */
> >      /* OUT variable */
> > @@ -309,23 +326,13 @@ struct xen_get_cpufreq_para {
> >      XEN_GUEST_HANDLE_64(uint32) scaling_available_frequencies;
> >      XEN_GUEST_HANDLE_64(char)   scaling_available_governors;
> >      char scaling_driver[CPUFREQ_NAME_LEN];
> > -
> > -    uint32_t cpuinfo_cur_freq;
> > -    uint32_t cpuinfo_max_freq;
> > -    uint32_t cpuinfo_min_freq;
> > -    uint32_t scaling_cur_freq;
> > -
> >      char scaling_governor[CPUFREQ_NAME_LEN];
> > -    uint32_t scaling_max_freq;
> > -    uint32_t scaling_min_freq;
> >
> >      /* for specific governor */
> >      union {
> >          struct  xen_userspace userspace;
> >          struct  xen_ondemand ondemand;
> >      } u;
> > -
> > -    int32_t turbo_enabled;
> >  };
> 
> Is all of this re-arrangement really needed? Also can't turbo_enabled and
> scaling_turbo_pct be combined into a single field?

Personally, we should not combine the two. 
turbo_enabled is used by both the old pstate driver and intel_pstate, but scaling_turbo_pct is only used in intel_pstate. If we use "scaling_turbo_pct=0" to represent "turbo_enabled=0", and "scaling_turbo_pct>0" to represent " turbo_enabled=1", then we will need to modify the old driver to use scaling_turbo_pct, i.e. changing the old driver to be aware of the "percentage" concept, which is proposed in intel_pstate. On the other side, I think keeping turbo_enabled and scaling_turbo_pct separated makes the code easier to read.
After re-arranging the layout, we got enough space to keep the two. I think the re-arranged layout doesn't look bad.

Best,
Wei

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 8/9] x86/intel_pstate: support the use of intel_pstate in pmstat.c
  2015-10-26  6:26     ` Wang, Wei W
@ 2015-10-26  7:02       ` Jan Beulich
  2015-10-26  7:59         ` Wang, Wei W
  0 siblings, 1 reply; 44+ messages in thread
From: Jan Beulich @ 2015-10-26  7:02 UTC (permalink / raw)
  To: wei.w.wang; +Cc: andrew.cooper3, xen-devel

>>> "Wang, Wei W" <wei.w.wang@intel.com> 10/26/15 7:27 AM >>>
>On 08/10/2015 12:11,  Jan Beulich wrote:
>> >>> On 14.09.15 at 04:32, <wei.w.wang@intel.com> wrote:
>> > --- a/tools/libxc/xc_pm.c
>> > +++ b/tools/libxc/xc_pm.c
>> > @@ -260,13 +260,14 @@ int xc_get_cpufreq_para(xc_interface *xch, int
>> cpuid,
>> >      }
>> >      else
>> >      {
>> > -        user_para->cpuinfo_cur_freq = sys_para->cpuinfo_cur_freq;
>> > -        user_para->cpuinfo_max_freq = sys_para->cpuinfo_max_freq;
>> > -        user_para->cpuinfo_min_freq = sys_para->cpuinfo_min_freq;
>> > -        user_para->scaling_cur_freq = sys_para->scaling_cur_freq;
>> > -        user_para->scaling_max_freq = sys_para->scaling_max_freq;
>> > -        user_para->scaling_min_freq = sys_para->scaling_min_freq;
>> > -        user_para->turbo_enabled    = sys_para->turbo_enabled;
>> > +        user_para->cpuinfo_cur_freq  = sys_para->cpuinfo_cur_freq;
>> > +        user_para->cpuinfo_max_freq  = sys_para->cpuinfo_max_freq;
>> > +        user_para->cpuinfo_min_freq  = sys_para->cpuinfo_min_freq;
>> > +        user_para->scaling_cur_freq  = sys_para->scaling_cur_freq;
>> > +        user_para->scaling_max.pct   = sys_para->scaling_max_perf;
>> > +        user_para->scaling_min.pct   = sys_para->scaling_min_perf;
>> > +        user_para->scaling_turbo_pct = sys_para->scaling_turbo_pct;
>> > +        user_para->turbo_enabled     = sys_para->turbo_enabled;
>> 
>> You fail to communicate perf_alias here. How will the caller know?
>
>"user_para->perf_alias = sys_para->perf_alias" is added in the next patch "tools:...", because the the "perf_alias " field is added to xc_get_cpufreq_para there.

But the public interface structure field is already there at this point. I.e. at this
point in the series you're hiding information - another sign for improper breakup
of the series.

>> >      op->u.get_para.cpuinfo_cur_freq =
>> >          cpufreq_driver->get ? cpufreq_driver->get(op->cpuid) : policy->cur;
>> >      op->u.get_para.cpuinfo_max_freq = policy->cpuinfo.max_freq;
>> >      op->u.get_para.cpuinfo_min_freq = policy->cpuinfo.min_freq;
>> >      op->u.get_para.scaling_cur_freq = policy->cur;
>> > -    op->u.get_para.scaling_max_freq = policy->max;
>> > -    op->u.get_para.scaling_min_freq = policy->min;
>> > +    if ( internal_gov )
>> > +    {
>> > +        op->u.get_para.scaling_max_perf = limits->max_perf_pct;
>> > +        op->u.get_para.scaling_min_perf = limits->min_perf_pct;
>> > +        op->u.get_para.scaling_turbo_pct = limits->turbo_pct;
>> > +    }
>> > +    else
>> > +    {
>> > +        op->u.get_para.scaling_max_perf = policy->max;
>> > +        op->u.get_para.scaling_min_perf = policy->min;
>> > +    }
>> >
>> >      if ( cpufreq_driver->name[0] )
>> > +    {
>> >          strlcpy(op->u.get_para.scaling_driver,
>> >              cpufreq_driver->name, CPUFREQ_NAME_LEN);
>> > +        if ( !strncmp(cpufreq_driver->name,
>> > +                     "intel_pstate", CPUFREQ_NAME_LEN) )
>> > +            op->u.get_para.perf_alias = PERCENTAGE;
>> > +        else
>> > +            op->u.get_para.perf_alias = FREQUENCY;
>> 
>> This should be done together with the other things in the if/else right above.
>
>The code here is in a different file from the if/else above.

Huh? I don't think I trimmed the original mail this badly. From the context
above, both are directly adjacent in the same source file.

>> > -    new_policy.governor = __find_governor(op->u.set_gov.scaling_governor);
>> > -    if (new_policy.governor == NULL)
>> > -        return -EINVAL;
>> > +    if ( internal_gov && internal_gov->cur_gov )
>> > +    {
>> > +        if ( !strnicmp(op->u.set_gov.scaling_governor,
>> > +                       "performance", CPUFREQ_NAME_LEN) )
>> > +            internal_gov->cur_gov = INTERNAL_GOV_PERFORMANCE;
>> > +        else if ( !strnicmp(op->u.set_gov.scaling_governor,
>> > +                       "powersave", CPUFREQ_NAME_LEN) )
>> > +            internal_gov->cur_gov = INTERNAL_GOV_POWERSAVE;
>> > +        else if ( !strnicmp(op->u.set_gov.scaling_governor,
>> > +                       "userspace", CPUFREQ_NAME_LEN) )
>> > +            internal_gov->cur_gov = INTERNAL_GOV_USERSPACE;
>> > +        else if ( !strnicmp(op->u.set_gov.scaling_governor,
>> > +                       "ondemand", CPUFREQ_NAME_LEN) )
>> > +            internal_gov->cur_gov = INTERNAL_GOV_ONDEMAND;
>> 
>> Wouldn't this better be done by the internal governor's code, so it can also for
>> itself decide which of the kinds it may not want to support?
>
>I think it should be pmstat.c-'s job here to set " internal_gov->cur_gov", which
> is later passed to the internal governor's implementation code, who then
> decides how to deal with the requested governor in "internal_gov->cur_gov".

Which it then is able to communicate in which way? Without itself adjusting
->cur_gov again?

>> > --- a/xen/include/public/sysctl.h
>> > +++ b/xen/include/public/sysctl.h
>> > @@ -297,11 +297,28 @@ typedef struct xen_ondemand xen_ondemand_t;
>> >   * same as sysfs file name of native linux
>> >   */
>> >  #define CPUFREQ_NAME_LEN 16
>> > +
>> > +enum perf_alias {
>> 
>> xen_ prefix missing.
>
>This enum will also be used in xc_get_cpufreq_para, should we still add xen_ prefix here?

Of course: This is a public interface type, and hence should be prefixed suitably
to limit possible name clashes with non-Xen code.

>> >  struct xen_get_cpufreq_para {
>> >      /* IN/OUT variable */
>> >      uint32_t cpu_num;
>> >      uint32_t freq_num;
>> >      uint32_t gov_num;
>> > +    int32_t turbo_enabled;
>> > +
>> > +    uint32_t cpuinfo_cur_freq;
>> > +    uint32_t cpuinfo_max_freq;
>> > +    uint32_t cpuinfo_min_freq;
>> > +    uint32_t scaling_cur_freq;
>> > +
>> > +    uint32_t scaling_turbo_pct;
>> > +    uint32_t scaling_max_perf;
>> > +    uint32_t scaling_min_perf;
>> > +    enum perf_alias perf_alias;
>> >
>> >      /* for all governors */
>> >      /* OUT variable */
>> > @@ -309,23 +326,13 @@ struct xen_get_cpufreq_para {
>> >      XEN_GUEST_HANDLE_64(uint32) scaling_available_frequencies;
>> >      XEN_GUEST_HANDLE_64(char)   scaling_available_governors;
>> >      char scaling_driver[CPUFREQ_NAME_LEN];
>> > -
>> > -    uint32_t cpuinfo_cur_freq;
>> > -    uint32_t cpuinfo_max_freq;
>> > -    uint32_t cpuinfo_min_freq;
>> > -    uint32_t scaling_cur_freq;
>> > -
>> >      char scaling_governor[CPUFREQ_NAME_LEN];
>> > -    uint32_t scaling_max_freq;
>> > -    uint32_t scaling_min_freq;
>> >
>> >      /* for specific governor */
>> >      union {
>> >          struct  xen_userspace userspace;
>> >          struct  xen_ondemand ondemand;
>> >      } u;
>> > -
>> > -    int32_t turbo_enabled;
>> >  };
>> 
>> Is all of this re-arrangement really needed? Also can't turbo_enabled and
>> scaling_turbo_pct be combined into a single field?
>
>Personally, we should not combine the two. 
> turbo_enabled is used by both the old pstate driver and intel_pstate, but
> scaling_turbo_pct is only used in intel_pstate. If we use "scaling_turbo_pct=0"
> to represent "turbo_enabled=0", and "scaling_turbo_pct>0" to represent
> " turbo_enabled=1", then we will need to modify the old driver to use
> scaling_turbo_pct, i.e. changing the old driver to be aware of the "percentage"
> concept, which is proposed in intel_pstate. On the other side, I think keeping
> turbo_enabled and scaling_turbo_pct separated makes the code easier to read.

Note that "combine" doesn't necessarily mean "eliminate the old one" - they
could as well become field of a union. The basic question you should ask yourself
in such cases is: "Do both fields have a meaning at the same time?" If the answer
is yes, then of course they should remain separate. If the answer is no _and_
their purpose is reasonably similar, then combining them should at least be
considered.

>After re-arranging the layout, we got enough space to keep the two. I think the
> re-arranged layout doesn't look bad.

And I didn't say it would - I asked whether it's necessary.

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 8/9] x86/intel_pstate: support the use of intel_pstate in pmstat.c
  2015-10-26  7:02       ` Jan Beulich
@ 2015-10-26  7:59         ` Wang, Wei W
  2015-10-26  9:40           ` Jan Beulich
  0 siblings, 1 reply; 44+ messages in thread
From: Wang, Wei W @ 2015-10-26  7:59 UTC (permalink / raw)
  To: Jan Beulich; +Cc: andrew.cooper3@citrix.com, xen-devel@lists.xen.org

On 26/10/2015 15:03,  Jan Beulich wrote:
> >>> "Wang, Wei W" <wei.w.wang@intel.com> 10/26/15 7:27 AM >>>
> >On 08/10/2015 12:11,  Jan Beulich wrote:
> >> >>> On 14.09.15 at 04:32, <wei.w.wang@intel.com> wrote:
> >> > -    new_policy.governor = __find_governor(op-
> >u.set_gov.scaling_governor);
> >> > -    if (new_policy.governor == NULL)
> >> > -        return -EINVAL;
> >> > +    if ( internal_gov && internal_gov->cur_gov )
> >> > +    {
> >> > +        if ( !strnicmp(op->u.set_gov.scaling_governor,
> >> > +                       "performance", CPUFREQ_NAME_LEN) )
> >> > +            internal_gov->cur_gov = INTERNAL_GOV_PERFORMANCE;
> >> > +        else if ( !strnicmp(op->u.set_gov.scaling_governor,
> >> > +                       "powersave", CPUFREQ_NAME_LEN) )
> >> > +            internal_gov->cur_gov = INTERNAL_GOV_POWERSAVE;
> >> > +        else if ( !strnicmp(op->u.set_gov.scaling_governor,
> >> > +                       "userspace", CPUFREQ_NAME_LEN) )
> >> > +            internal_gov->cur_gov = INTERNAL_GOV_USERSPACE;
> >> > +        else if ( !strnicmp(op->u.set_gov.scaling_governor,
> >> > +                       "ondemand", CPUFREQ_NAME_LEN) )
> >> > +            internal_gov->cur_gov = INTERNAL_GOV_ONDEMAND;
> >>
> >> Wouldn't this better be done by the internal governor's code, so it
> >> can also for itself decide which of the kinds it may not want to support?
> >
> >I think it should be pmstat.c-'s job here to set "
> >internal_gov->cur_gov", which  is later passed to the internal
> >governor's implementation code, who then  decides how to deal with the
> requested governor in "internal_gov->cur_gov".
> 
> Which it then is able to communicate in which way? Without itself adjusting
> ->cur_gov again?
 
In this way: 
pmstat.c gets the string-represented governor adjusting request from the upper layer, parses it to the number-represented value (INTERNAL_GOV_), and feeds the number-represented one to the driver's internal governor implementation. If the internal governor implementation doesn't support the passed INTERNAL_GOV_XX, i.e. INTERNAL_GOV_XX goes to the "default:" section of the "switch()", which adjusts the internal_gov->cur_gov to the default one, e.g. INTERNAL_GOV_ONDEMAND.

> >> >  struct xen_get_cpufreq_para {
> >> >      /* IN/OUT variable */
> >> >      uint32_t cpu_num;
> >> >      uint32_t freq_num;
> >> >      uint32_t gov_num;
> >> > +    int32_t turbo_enabled;
> >> > +
> >> > +    uint32_t cpuinfo_cur_freq;
> >> > +    uint32_t cpuinfo_max_freq;
> >> > +    uint32_t cpuinfo_min_freq;
> >> > +    uint32_t scaling_cur_freq;
> >> > +
> >> > +    uint32_t scaling_turbo_pct;
> >> > +    uint32_t scaling_max_perf;
> >> > +    uint32_t scaling_min_perf;
> >> > +    enum perf_alias perf_alias;
> >> >
> >> >      /* for all governors */
> >> >      /* OUT variable */
> >> > @@ -309,23 +326,13 @@ struct xen_get_cpufreq_para {
> >> >      XEN_GUEST_HANDLE_64(uint32) scaling_available_frequencies;
> >> >      XEN_GUEST_HANDLE_64(char)   scaling_available_governors;
> >> >      char scaling_driver[CPUFREQ_NAME_LEN];
> >> > -
> >> > -    uint32_t cpuinfo_cur_freq;
> >> > -    uint32_t cpuinfo_max_freq;
> >> > -    uint32_t cpuinfo_min_freq;
> >> > -    uint32_t scaling_cur_freq;
> >> > -
> >> >      char scaling_governor[CPUFREQ_NAME_LEN];
> >> > -    uint32_t scaling_max_freq;
> >> > -    uint32_t scaling_min_freq;
> >> >
> >> >      /* for specific governor */
> >> >      union {
> >> >          struct  xen_userspace userspace;
> >> >          struct  xen_ondemand ondemand;
> >> >      } u;
> >> > -
> >> > -    int32_t turbo_enabled;
> >> >  };
> >>
> >> Is all of this re-arrangement really needed? Also can't turbo_enabled
> >> and scaling_turbo_pct be combined into a single field?
> >
> >Personally, we should not combine the two.
> > turbo_enabled is used by both the old pstate driver and intel_pstate,
> >but  scaling_turbo_pct is only used in intel_pstate. If we use
> "scaling_turbo_pct=0"
> > to represent "turbo_enabled=0", and "scaling_turbo_pct>0" to represent
> >" turbo_enabled=1", then we will need to modify the old driver to use
> >scaling_turbo_pct, i.e. changing the old driver to be aware of the "percentage"
> > concept, which is proposed in intel_pstate. On the other side, I think
> >keeping  turbo_enabled and scaling_turbo_pct separated makes the code
> easier to read.
> 
> Note that "combine" doesn't necessarily mean "eliminate the old one" - they
> could as well become field of a union. The basic question you should ask
> yourself in such cases is: "Do both fields have a meaning at the same time?" If
> the answer is yes, then of course they should remain separate. If the answer is
> no _and_ their purpose is reasonably similar, then combining them should at
> least be considered.

Ok. I will keep the two separated, since they do have their own meaning at the same time. 

Best,
Wei

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 8/9] x86/intel_pstate: support the use of intel_pstate in pmstat.c
  2015-10-26  7:59         ` Wang, Wei W
@ 2015-10-26  9:40           ` Jan Beulich
  2015-10-26  9:48             ` Wang, Wei W
  0 siblings, 1 reply; 44+ messages in thread
From: Jan Beulich @ 2015-10-26  9:40 UTC (permalink / raw)
  To: Wei W Wang; +Cc: andrew.cooper3@citrix.com, xen-devel@lists.xen.org

>>> On 26.10.15 at 08:59, <wei.w.wang@intel.com> wrote:
> On 26/10/2015 15:03,  Jan Beulich wrote:
>> >>> "Wang, Wei W" <wei.w.wang@intel.com> 10/26/15 7:27 AM >>>
>> >On 08/10/2015 12:11,  Jan Beulich wrote:
>> >> >>> On 14.09.15 at 04:32, <wei.w.wang@intel.com> wrote:
>> >> > -    new_policy.governor = __find_governor(op-
>> >u.set_gov.scaling_governor);
>> >> > -    if (new_policy.governor == NULL)
>> >> > -        return -EINVAL;
>> >> > +    if ( internal_gov && internal_gov->cur_gov )
>> >> > +    {
>> >> > +        if ( !strnicmp(op->u.set_gov.scaling_governor,
>> >> > +                       "performance", CPUFREQ_NAME_LEN) )
>> >> > +            internal_gov->cur_gov = INTERNAL_GOV_PERFORMANCE;
>> >> > +        else if ( !strnicmp(op->u.set_gov.scaling_governor,
>> >> > +                       "powersave", CPUFREQ_NAME_LEN) )
>> >> > +            internal_gov->cur_gov = INTERNAL_GOV_POWERSAVE;
>> >> > +        else if ( !strnicmp(op->u.set_gov.scaling_governor,
>> >> > +                       "userspace", CPUFREQ_NAME_LEN) )
>> >> > +            internal_gov->cur_gov = INTERNAL_GOV_USERSPACE;
>> >> > +        else if ( !strnicmp(op->u.set_gov.scaling_governor,
>> >> > +                       "ondemand", CPUFREQ_NAME_LEN) )
>> >> > +            internal_gov->cur_gov = INTERNAL_GOV_ONDEMAND;
>> >>
>> >> Wouldn't this better be done by the internal governor's code, so it
>> >> can also for itself decide which of the kinds it may not want to support?
>> >
>> >I think it should be pmstat.c-'s job here to set "
>> >internal_gov->cur_gov", which  is later passed to the internal
>> >governor's implementation code, who then  decides how to deal with the
>> requested governor in "internal_gov->cur_gov".
>> 
>> Which it then is able to communicate in which way? Without itself adjusting
>> ->cur_gov again?
>  
> In this way: 
> pmstat.c gets the string-represented governor adjusting request from the 
> upper layer, parses it to the number-represented value (INTERNAL_GOV_), and 
> feeds the number-represented one to the driver's internal governor 
> implementation. If the internal governor implementation doesn't support the 
> passed INTERNAL_GOV_XX, i.e. INTERNAL_GOV_XX goes to the "default:" section 
> of the "switch()", which adjusts the internal_gov->cur_gov to the default one, 
> e.g. INTERNAL_GOV_ONDEMAND.

Well, okay, I'm tired of discussing issues like this.

>> >> > @@ -309,23 +326,13 @@ struct xen_get_cpufreq_para {
>> >> >      XEN_GUEST_HANDLE_64(uint32) scaling_available_frequencies;
>> >> >      XEN_GUEST_HANDLE_64(char)   scaling_available_governors;
>> >> >      char scaling_driver[CPUFREQ_NAME_LEN];
>> >> > -
>> >> > -    uint32_t cpuinfo_cur_freq;
>> >> > -    uint32_t cpuinfo_max_freq;
>> >> > -    uint32_t cpuinfo_min_freq;
>> >> > -    uint32_t scaling_cur_freq;
>> >> > -
>> >> >      char scaling_governor[CPUFREQ_NAME_LEN];
>> >> > -    uint32_t scaling_max_freq;
>> >> > -    uint32_t scaling_min_freq;
>> >> >
>> >> >      /* for specific governor */
>> >> >      union {
>> >> >          struct  xen_userspace userspace;
>> >> >          struct  xen_ondemand ondemand;
>> >> >      } u;
>> >> > -
>> >> > -    int32_t turbo_enabled;
>> >> >  };
>> >>
>> >> Is all of this re-arrangement really needed? Also can't turbo_enabled
>> >> and scaling_turbo_pct be combined into a single field?
>> >
>> >Personally, we should not combine the two.
>> > turbo_enabled is used by both the old pstate driver and intel_pstate,
>> >but  scaling_turbo_pct is only used in intel_pstate. If we use
>> "scaling_turbo_pct=0"
>> > to represent "turbo_enabled=0", and "scaling_turbo_pct>0" to represent
>> >" turbo_enabled=1", then we will need to modify the old driver to use
>> >scaling_turbo_pct, i.e. changing the old driver to be aware of the "percentage"
>> > concept, which is proposed in intel_pstate. On the other side, I think
>> >keeping  turbo_enabled and scaling_turbo_pct separated makes the code
>> easier to read.
>> 
>> Note that "combine" doesn't necessarily mean "eliminate the old one" - they
>> could as well become field of a union. The basic question you should ask
>> yourself in such cases is: "Do both fields have a meaning at the same time?" If
>> the answer is yes, then of course they should remain separate. If the answer is
>> no _and_ their purpose is reasonably similar, then combining them should at
>> least be considered.
> 
> Ok. I will keep the two separated, since they do have their own meaning at 
> the same time. 

Being which?

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 8/9] x86/intel_pstate: support the use of intel_pstate in pmstat.c
  2015-10-26  9:40           ` Jan Beulich
@ 2015-10-26  9:48             ` Wang, Wei W
  2015-10-26  9:53               ` Jan Beulich
  0 siblings, 1 reply; 44+ messages in thread
From: Wang, Wei W @ 2015-10-26  9:48 UTC (permalink / raw)
  To: Jan Beulich; +Cc: andrew.cooper3@citrix.com, xen-devel@lists.xen.org

On 26/10/2015 17:42,  Jan Beulich wrote:
> >>> On 26.10.15 at 08:59, <wei.w.wang@intel.com> wrote:
> > On 26/10/2015 15:03,  Jan Beulich wrote:
> >> >>> "Wang, Wei W" <wei.w.wang@intel.com> 10/26/15 7:27 AM >>>
> >> >On 08/10/2015 12:11,  Jan Beulich wrote:
> >> >> >>> On 14.09.15 at 04:32, <wei.w.wang@intel.com> wrote:
> >> >> > @@ -309,23 +326,13 @@ struct xen_get_cpufreq_para {
> >> >> >      XEN_GUEST_HANDLE_64(uint32) scaling_available_frequencies;
> >> >> >      XEN_GUEST_HANDLE_64(char)   scaling_available_governors;
> >> >> >      char scaling_driver[CPUFREQ_NAME_LEN];
> >> >> > -
> >> >> > -    uint32_t cpuinfo_cur_freq;
> >> >> > -    uint32_t cpuinfo_max_freq;
> >> >> > -    uint32_t cpuinfo_min_freq;
> >> >> > -    uint32_t scaling_cur_freq;
> >> >> > -
> >> >> >      char scaling_governor[CPUFREQ_NAME_LEN];
> >> >> > -    uint32_t scaling_max_freq;
> >> >> > -    uint32_t scaling_min_freq;
> >> >> >
> >> >> >      /* for specific governor */
> >> >> >      union {
> >> >> >          struct  xen_userspace userspace;
> >> >> >          struct  xen_ondemand ondemand;
> >> >> >      } u;
> >> >> > -
> >> >> > -    int32_t turbo_enabled;
> >> >> >  };
> >> >>
> >> >> Is all of this re-arrangement really needed? Also can't
> >> >> turbo_enabled and scaling_turbo_pct be combined into a single field?
> >> >
> >> >Personally, we should not combine the two.
> >> > turbo_enabled is used by both the old pstate driver and
> >> >intel_pstate, but  scaling_turbo_pct is only used in intel_pstate.
> >> >If we use
> >> "scaling_turbo_pct=0"
> >> > to represent "turbo_enabled=0", and "scaling_turbo_pct>0" to
> >> >represent " turbo_enabled=1", then we will need to modify the old
> >> >driver to use scaling_turbo_pct, i.e. changing the old driver to be aware of
> the "percentage"
> >> > concept, which is proposed in intel_pstate. On the other side, I
> >> >think keeping  turbo_enabled and scaling_turbo_pct separated makes
> >> >the code
> >> easier to read.
> >>
> >> Note that "combine" doesn't necessarily mean "eliminate the old one"
> >> - they could as well become field of a union. The basic question you
> >> should ask yourself in such cases is: "Do both fields have a meaning
> >> at the same time?" If the answer is yes, then of course they should
> >> remain separate. If the answer is no _and_ their purpose is
> >> reasonably similar, then combining them should at least be considered.
> >
> > Ok. I will keep the two separated, since they do have their own
> > meaning at the same time.
> 
> Being which?

Keeping both of the two there. Just as what they are now - two independent fields.

Best,
Wei

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 8/9] x86/intel_pstate: support the use of intel_pstate in pmstat.c
  2015-10-26  9:48             ` Wang, Wei W
@ 2015-10-26  9:53               ` Jan Beulich
  2015-10-26 10:19                 ` Wang, Wei W
  0 siblings, 1 reply; 44+ messages in thread
From: Jan Beulich @ 2015-10-26  9:53 UTC (permalink / raw)
  To: Wei W Wang; +Cc: andrew.cooper3@citrix.com, xen-devel@lists.xen.org

>>> On 26.10.15 at 10:48, <wei.w.wang@intel.com> wrote:
> On 26/10/2015 17:42,  Jan Beulich wrote:
>> >>> On 26.10.15 at 08:59, <wei.w.wang@intel.com> wrote:
>> > On 26/10/2015 15:03,  Jan Beulich wrote:
>> >> >>> "Wang, Wei W" <wei.w.wang@intel.com> 10/26/15 7:27 AM >>>
>> >> >On 08/10/2015 12:11,  Jan Beulich wrote:
>> >> >> >>> On 14.09.15 at 04:32, <wei.w.wang@intel.com> wrote:
>> >> >> > @@ -309,23 +326,13 @@ struct xen_get_cpufreq_para {
>> >> >> >      XEN_GUEST_HANDLE_64(uint32) scaling_available_frequencies;
>> >> >> >      XEN_GUEST_HANDLE_64(char)   scaling_available_governors;
>> >> >> >      char scaling_driver[CPUFREQ_NAME_LEN];
>> >> >> > -
>> >> >> > -    uint32_t cpuinfo_cur_freq;
>> >> >> > -    uint32_t cpuinfo_max_freq;
>> >> >> > -    uint32_t cpuinfo_min_freq;
>> >> >> > -    uint32_t scaling_cur_freq;
>> >> >> > -
>> >> >> >      char scaling_governor[CPUFREQ_NAME_LEN];
>> >> >> > -    uint32_t scaling_max_freq;
>> >> >> > -    uint32_t scaling_min_freq;
>> >> >> >
>> >> >> >      /* for specific governor */
>> >> >> >      union {
>> >> >> >          struct  xen_userspace userspace;
>> >> >> >          struct  xen_ondemand ondemand;
>> >> >> >      } u;
>> >> >> > -
>> >> >> > -    int32_t turbo_enabled;
>> >> >> >  };
>> >> >>
>> >> >> Is all of this re-arrangement really needed? Also can't
>> >> >> turbo_enabled and scaling_turbo_pct be combined into a single field?
>> >> >
>> >> >Personally, we should not combine the two.
>> >> > turbo_enabled is used by both the old pstate driver and
>> >> >intel_pstate, but  scaling_turbo_pct is only used in intel_pstate.
>> >> >If we use
>> >> "scaling_turbo_pct=0"
>> >> > to represent "turbo_enabled=0", and "scaling_turbo_pct>0" to
>> >> >represent " turbo_enabled=1", then we will need to modify the old
>> >> >driver to use scaling_turbo_pct, i.e. changing the old driver to be aware 
> of
>> the "percentage"
>> >> > concept, which is proposed in intel_pstate. On the other side, I
>> >> >think keeping  turbo_enabled and scaling_turbo_pct separated makes
>> >> >the code
>> >> easier to read.
>> >>
>> >> Note that "combine" doesn't necessarily mean "eliminate the old one"
>> >> - they could as well become field of a union. The basic question you
>> >> should ask yourself in such cases is: "Do both fields have a meaning
>> >> at the same time?" If the answer is yes, then of course they should
>> >> remain separate. If the answer is no _and_ their purpose is
>> >> reasonably similar, then combining them should at least be considered.
>> >
>> > Ok. I will keep the two separated, since they do have their own
>> > meaning at the same time.
>> 
>> Being which?
> 
> Keeping both of the two there. Just as what they are now - two independent 
> fields.

That wasn't the question; I rather inquired what "meaning at the same
time" both fields have.

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 8/9] x86/intel_pstate: support the use of intel_pstate in pmstat.c
  2015-10-26  9:53               ` Jan Beulich
@ 2015-10-26 10:19                 ` Wang, Wei W
  2015-10-26 10:35                   ` Jan Beulich
  0 siblings, 1 reply; 44+ messages in thread
From: Wang, Wei W @ 2015-10-26 10:19 UTC (permalink / raw)
  To: Jan Beulich; +Cc: andrew.cooper3@citrix.com, xen-devel@lists.xen.org

On 26/10/2015 17:54,  Jan Beulich wrote:
> >>> On 26.10.15 at 10:48, <wei.w.wang@intel.com> wrote:
> > On 26/10/2015 17:42,  Jan Beulich wrote:
> >> >>> On 26.10.15 at 08:59, <wei.w.wang@intel.com> wrote:
> >> > On 26/10/2015 15:03,  Jan Beulich wrote:
> >> >> >>> "Wang, Wei W" <wei.w.wang@intel.com> 10/26/15 7:27 AM >>>
> >> >> >On 08/10/2015 12:11,  Jan Beulich wrote:
> >> >> >> >>> On 14.09.15 at 04:32, <wei.w.wang@intel.com> wrote:
> >> >> >> > @@ -309,23 +326,13 @@ struct xen_get_cpufreq_para {
> >> >> >> >      XEN_GUEST_HANDLE_64(uint32) scaling_available_frequencies;
> >> >> >> >      XEN_GUEST_HANDLE_64(char)   scaling_available_governors;
> >> >> >> >      char scaling_driver[CPUFREQ_NAME_LEN];
> >> >> >> > -
> >> >> >> > -    uint32_t cpuinfo_cur_freq;
> >> >> >> > -    uint32_t cpuinfo_max_freq;
> >> >> >> > -    uint32_t cpuinfo_min_freq;
> >> >> >> > -    uint32_t scaling_cur_freq;
> >> >> >> > -
> >> >> >> >      char scaling_governor[CPUFREQ_NAME_LEN];
> >> >> >> > -    uint32_t scaling_max_freq;
> >> >> >> > -    uint32_t scaling_min_freq;
> >> >> >> >
> >> >> >> >      /* for specific governor */
> >> >> >> >      union {
> >> >> >> >          struct  xen_userspace userspace;
> >> >> >> >          struct  xen_ondemand ondemand;
> >> >> >> >      } u;
> >> >> >> > -
> >> >> >> > -    int32_t turbo_enabled;
> >> >> >> >  };
> >> >> >>
> >> >> >> Is all of this re-arrangement really needed? Also can't
> >> >> >> turbo_enabled and scaling_turbo_pct be combined into a single field?
> >> >> >
> >> >> >Personally, we should not combine the two.
> >> >> > turbo_enabled is used by both the old pstate driver and
> >> >> >intel_pstate, but  scaling_turbo_pct is only used in intel_pstate.
> >> >> >If we use
> >> >> "scaling_turbo_pct=0"
> >> >> > to represent "turbo_enabled=0", and "scaling_turbo_pct>0" to
> >> >> >represent " turbo_enabled=1", then we will need to modify the old
> >> >> >driver to use scaling_turbo_pct, i.e. changing the old driver to
> >> >> >be aware
> > of
> >> the "percentage"
> >> >> > concept, which is proposed in intel_pstate. On the other side, I
> >> >> >think keeping  turbo_enabled and scaling_turbo_pct separated
> >> >> >makes the code
> >> >> easier to read.
> >> >>
> >> >> Note that "combine" doesn't necessarily mean "eliminate the old one"
> >> >> - they could as well become field of a union. The basic question
> >> >> you should ask yourself in such cases is: "Do both fields have a
> >> >> meaning at the same time?" If the answer is yes, then of course
> >> >> they should remain separate. If the answer is no _and_ their
> >> >> purpose is reasonably similar, then combining them should at least be
> considered.
> >> >
> >> > Ok. I will keep the two separated, since they do have their own
> >> > meaning at the same time.
> >>
> >> Being which?
> >
> > Keeping both of the two there. Just as what they are now - two
> > independent fields.
> 
> That wasn't the question; I rather inquired what "meaning at the same time"
> both fields have.

turbo_enable: indicates if turbo is enabled or not.
turbo_pct: shows the capability of turbo in percentage. For example if the CPU has the following operating frequency range:
>From [Min] to [Max]: 1.2GH, 1.3GHZ,1.4GHZ,...,2.3GHZ, so 1.2--->2.3, length = 12
Turbo frequency: 3.7GHZ, so 1.2---->3.7, length = 26
Then turbo_pct = (26 - 12) / 26 = 54%, that is turbo_pct = 54.

Best,
Wei

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 8/9] x86/intel_pstate: support the use of intel_pstate in pmstat.c
  2015-10-26 10:19                 ` Wang, Wei W
@ 2015-10-26 10:35                   ` Jan Beulich
  2015-10-26 10:45                     ` Wang, Wei W
  0 siblings, 1 reply; 44+ messages in thread
From: Jan Beulich @ 2015-10-26 10:35 UTC (permalink / raw)
  To: Wei W Wang; +Cc: andrew.cooper3@citrix.com, xen-devel@lists.xen.org

>>> On 26.10.15 at 11:19, <wei.w.wang@intel.com> wrote:
> On 26/10/2015 17:54,  Jan Beulich wrote:
>> That wasn't the question; I rather inquired what "meaning at the same time"
>> both fields have.
> 
> turbo_enable: indicates if turbo is enabled or not.
> turbo_pct: shows the capability of turbo in percentage. For example if the 
> CPU has the following operating frequency range:
> From [Min] to [Max]: 1.2GH, 1.3GHZ,1.4GHZ,...,2.3GHZ, so 1.2--->2.3, length = 12
> Turbo frequency: 3.7GHZ, so 1.2---->3.7, length = 26
> Then turbo_pct = (26 - 12) / 26 = 54%, that is turbo_pct = 54.

So what meaning does e.g. "turbo_enabled = 1" plus "turbo_pct = 0"
have? Or "turbo_enabled = 0" plus "turbo_pct > 0"?

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 8/9] x86/intel_pstate: support the use of intel_pstate in pmstat.c
  2015-10-26 10:35                   ` Jan Beulich
@ 2015-10-26 10:45                     ` Wang, Wei W
  2015-10-26 10:51                       ` Jan Beulich
  0 siblings, 1 reply; 44+ messages in thread
From: Wang, Wei W @ 2015-10-26 10:45 UTC (permalink / raw)
  To: Jan Beulich; +Cc: andrew.cooper3@citrix.com, xen-devel@lists.xen.org

On 26/10/2015 18:37,  Jan Beulich wrote:
> >>> On 26.10.15 at 11:19, <wei.w.wang@intel.com> wrote:
> > On 26/10/2015 17:54,  Jan Beulich wrote:
> >> That wasn't the question; I rather inquired what "meaning at the same time"
> >> both fields have.
> >
> > turbo_enable: indicates if turbo is enabled or not.
> > turbo_pct: shows the capability of turbo in percentage. For example if
> > the CPU has the following operating frequency range:
> > From [Min] to [Max]: 1.2GH, 1.3GHZ,1.4GHZ,...,2.3GHZ, so 1.2--->2.3,
> > length = 12 Turbo frequency: 3.7GHZ, so 1.2---->3.7, length = 26 Then
> > turbo_pct = (26 - 12) / 26 = 54%, that is turbo_pct = 54.
> 
> So what meaning does e.g. "turbo_enabled = 1" plus "turbo_pct = 0"
> have? Or "turbo_enabled = 0" plus "turbo_pct > 0"?

" turbo_pct = 54" is the property of the CPU, just like MAX=3.7GHZ. It should be something engraved on stone, not changeable. So, I think it's not a good idea to put them together.

Best,
Wei

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 8/9] x86/intel_pstate: support the use of intel_pstate in pmstat.c
  2015-10-26 10:45                     ` Wang, Wei W
@ 2015-10-26 10:51                       ` Jan Beulich
  2015-10-26 11:03                         ` Wang, Wei W
  0 siblings, 1 reply; 44+ messages in thread
From: Jan Beulich @ 2015-10-26 10:51 UTC (permalink / raw)
  To: Wei W Wang; +Cc: andrew.cooper3@citrix.com, xen-devel@lists.xen.org

>>> On 26.10.15 at 11:45, <wei.w.wang@intel.com> wrote:
> On 26/10/2015 18:37,  Jan Beulich wrote:
>> >>> On 26.10.15 at 11:19, <wei.w.wang@intel.com> wrote:
>> > On 26/10/2015 17:54,  Jan Beulich wrote:
>> >> That wasn't the question; I rather inquired what "meaning at the same time"
>> >> both fields have.
>> >
>> > turbo_enable: indicates if turbo is enabled or not.
>> > turbo_pct: shows the capability of turbo in percentage. For example if
>> > the CPU has the following operating frequency range:
>> > From [Min] to [Max]: 1.2GH, 1.3GHZ,1.4GHZ,...,2.3GHZ, so 1.2--->2.3,
>> > length = 12 Turbo frequency: 3.7GHZ, so 1.2---->3.7, length = 26 Then
>> > turbo_pct = (26 - 12) / 26 = 54%, that is turbo_pct = 54.
>> 
>> So what meaning does e.g. "turbo_enabled = 1" plus "turbo_pct = 0"
>> have? Or "turbo_enabled = 0" plus "turbo_pct > 0"?
> 
> " turbo_pct = 54" is the property of the CPU, just like MAX=3.7GHZ. It 
> should be something engraved on stone, not changeable. So, I think it's not a 
> good idea to put them together.

That's a valid statement, but not really an answer to the question(s)
raised.

Jan

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v5 8/9] x86/intel_pstate: support the use of intel_pstate in pmstat.c
  2015-10-26 10:51                       ` Jan Beulich
@ 2015-10-26 11:03                         ` Wang, Wei W
  0 siblings, 0 replies; 44+ messages in thread
From: Wang, Wei W @ 2015-10-26 11:03 UTC (permalink / raw)
  To: Jan Beulich; +Cc: andrew.cooper3@citrix.com, xen-devel@lists.xen.org

On 26/10/2015 18:52,  Jan Beulich wrote:
> >>> On 26.10.15 at 11:45, <wei.w.wang@intel.com> wrote:
> > On 26/10/2015 18:37,  Jan Beulich wrote:
> >> >>> On 26.10.15 at 11:19, <wei.w.wang@intel.com> wrote:
> >> > On 26/10/2015 17:54,  Jan Beulich wrote:
> >> >> That wasn't the question; I rather inquired what "meaning at the same
> time"
> >> >> both fields have.
> >> >
> >> > turbo_enable: indicates if turbo is enabled or not.
> >> > turbo_pct: shows the capability of turbo in percentage. For example
> >> > if the CPU has the following operating frequency range:
> >> > From [Min] to [Max]: 1.2GH, 1.3GHZ,1.4GHZ,...,2.3GHZ, so
> >> > 1.2--->2.3, length = 12 Turbo frequency: 3.7GHZ, so 1.2---->3.7,
> >> > length = 26 Then turbo_pct = (26 - 12) / 26 = 54%, that is turbo_pct = 54.
> >>
> >> So what meaning does e.g. "turbo_enabled = 1" plus "turbo_pct = 0"
> >> have? Or "turbo_enabled = 0" plus "turbo_pct > 0"?
> >
> > " turbo_pct = 54" is the property of the CPU, just like MAX=3.7GHZ. It
> > should be something engraved on stone, not changeable. So, I think
> > it's not a good idea to put them together.
> 
> That's a valid statement, but not really an answer to the question(s) raised.

My point is that we should not discuss the two together, since they are not related to each other.
The answer is obvious:
"turbo_enabled = 1" plus "turbo_pct = 0", is not a possible case, since turbo_pct always equals to 54, as the valid statement states. However, if the CPU doesn't support turbo, then turbo_enabled won't be 1.
"turbo_enabled = 0" plus "turbo_pct =54", simply means that the CPU has its turbo function switched off, though it has a turbo capability of 54%.

Best,
Wei

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2015-10-26 11:03 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-09-14  2:32 [PATCH v5 0/9] Porting the intel_pstate driver to Xen Wei Wang
2015-09-14  2:32 ` [PATCH v5 1/9] x86/intel_pstate: add some calculation related support Wei Wang
2015-09-17 15:19   ` Andrew Cooper
2015-10-05 16:00     ` Jan Beulich
2015-09-14  2:32 ` [PATCH v5 2/9] x86/intel_pstate: APERF/MPERF feature detect Wei Wang
2015-09-17 15:26   ` Andrew Cooper
2015-10-05 16:14   ` Jan Beulich
2015-09-14  2:32 ` [PATCH v5 3/9] x86/intel_pstate: add a new driver interface, setpolicy() Wei Wang
2015-09-17 15:34   ` Andrew Cooper
2015-10-06 15:37   ` Jan Beulich
2015-09-14  2:32 ` [PATCH v5 4/9] x86/intel_pstate: relocate the driver register function Wei Wang
2015-09-17 15:38   ` Andrew Cooper
2015-09-21 13:13     ` Jan Beulich
2015-10-07 15:08   ` Jan Beulich
2015-09-14  2:32 ` [PATCH v5 5/9] x86/intel_pstate: changes in cpufreq_del_cpu for CPU offline Wei Wang
2015-09-17 15:43   ` Andrew Cooper
2015-10-07 15:28   ` Jan Beulich
2015-10-11  2:19     ` Wang, Wei W
2015-09-14  2:32 ` [PATCH v5 6/9] x86/intel_pstate: the main boby of the intel_pstate driver Wei Wang
2015-09-17 15:51   ` Andrew Cooper
2015-10-07 15:39   ` Jan Beulich
2015-09-14  2:32 ` [PATCH v5 7/9] x86/intel_pstate: add a booting param to select the driver to load Wei Wang
2015-09-17 16:08   ` Andrew Cooper
2015-09-21 13:15     ` Jan Beulich
2015-10-07 15:46   ` Jan Beulich
2015-10-23  8:09     ` Wang, Wei W
2015-10-23  8:16     ` Wang, Wei W
2015-10-23  8:18     ` Wang, Wei W
2015-10-23  8:35       ` Jan Beulich
2015-10-23  8:48         ` Wang, Wei W
2015-09-14  2:32 ` [PATCH v5 8/9] x86/intel_pstate: support the use of intel_pstate in pmstat.c Wei Wang
2015-10-07 16:10   ` Jan Beulich
2015-10-26  6:26     ` Wang, Wei W
2015-10-26  7:02       ` Jan Beulich
2015-10-26  7:59         ` Wang, Wei W
2015-10-26  9:40           ` Jan Beulich
2015-10-26  9:48             ` Wang, Wei W
2015-10-26  9:53               ` Jan Beulich
2015-10-26 10:19                 ` Wang, Wei W
2015-10-26 10:35                   ` Jan Beulich
2015-10-26 10:45                     ` Wang, Wei W
2015-10-26 10:51                       ` Jan Beulich
2015-10-26 11:03                         ` Wang, Wei W
2015-09-14  2:32 ` [PATCH v5 9/9] tools: enable xenpm to control the intel_pstate driver Wei Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).