* [RFC] pvops domain0 Cx/Px info parser patch
@ 2009-05-11 7:58 Yu, Ke
2009-05-11 18:48 ` Jeremy Fitzhardinge
0 siblings, 1 reply; 3+ messages in thread
From: Yu, Ke @ 2009-05-11 7:58 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: Tian, Kevin, xen-devel@lists.xensource.com
[-- Attachment #1: Type: text/plain, Size: 3037 bytes --]
Hi, Jeremy
Since now the pv_ops domain0 is approaching, we are considering to port the dom0 power management related code, or more specifically, the Cx/Px ACPI info parsing code, to pv_ops domain0 tree, so that people can utilize Xen power management capability (Cx/PX) under pv_ops domain0. This RFC is a draft version for comments.
=== Overview ===
Requirement: Xen hypervisor need Cx/Px ACPI info to do the Cx/Px states power management. This info is provided by BIOS ACPI table. Since hypervisor has no ACPI parser, this info has to be parsed by domain0 kernel ACPI sub-system, and then passed to hypervisor by hypercall.
To make this happen, the key point is to add hook in the kernel ACPI sub-system. Fortunately, kernel already has good abstraction, and only several places need to add hook. To be more detail, there is an acpi_processor_driver (in drivers/acpi/processor_core.c) , which all the Cx/Px parsing event will go to. This driver will call its acpi processor event handler, e.g. add/remove, start/stop, notify to handle these events. These event handlers in turn will call some library functions (in drivers/acpi/processor_perflib.c), e.g. acpi_processor_ppc_has_changed, acpi_processor_ppc_has_changed, acpi_processor_cst_has_changed, to finish the acpi info parsing.
So the conclusion is: adding hooks in acpi_processor_driver and those related library functions will satisfy our requirement.
To make the added hook cleaner, we introduce an interface called external control operation (struct processor_extcntl_ops). All the hooks are encapsulated in this interface processor_extcntl_ops . Here the "external" means the acpi processor is controlled by external entity, e.g. VMM. Every kind of external entity can has its implementation of this interface. In this patch, the interface for Xen is implemented.
== Patch Set Description ==
This patch set is based the xen/dom0/hackery branch. It has three patches:
1. acpi_parser_framework.patch
This patch introduces the interface of external control operation, and adds hooks to the acpi sub-system, including the acpi_processor_driver, and the related library functions.
2. acpi_parser_xen_driver.patch
This patch implements the Xen version of the external control operation interface.
3. acpi_parser_platform_hypercall.patch
This patches implements the xen_platform_op hypercall, to pass the parsed ACPI info to hypervisor.
== Misc ==
To make this patch works correctly, there is a corner case need to consider: processor in domain0 is virtual CPU, while BIOS ACPI table provides the physical CPU info, since the vCPU and pCPU is not 1:1 mapped, this may caused unexpected result. E.g. in UP domain0, Xen hypervisor may only get one physical processor info. Current linux-2.6.18-xen.hg tree solve this issue by using acpi_id instead of processor_id. However, this code is a bit tricky and hacky. We are still trying to find a cleaner way to solve this issue.
Comments are welcome.
Best Regards
Yu Ke and Tian Kevin
[-- Attachment #2: acpi_parser_framework.patch --]
[-- Type: application/octet-stream, Size: 21644 bytes --]
Introduce the external control operation interface for domain0 ACPI parser
Signed-off-by: Yu Ke <ke.yu@intel.com>
Tian Kevin <kevin.tian@intel.com>
---
drivers/acpi/Kconfig | 5 +
drivers/acpi/Makefile | 4 +
drivers/acpi/processor_core.c | 37 ++++++
drivers/acpi/processor_extcntl.c | 233 ++++++++++++++++++++++++++++++++++++++
drivers/acpi/processor_idle.c | 31 +++++
drivers/acpi/processor_perflib.c | 19 +++
drivers/cpufreq/Kconfig | 1
include/acpi/processor.h | 137 ++++++++++++++++++++++
8 files changed, 464 insertions(+), 3 deletions(-)
create mode 100644 drivers/acpi/processor_extcntl.c
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index 8a851d0..6f4aa6c 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -310,4 +310,9 @@ config ACPI_SBS
This driver adds support for the Smart Battery System, another
type of access to battery information, found on some laptops.
+config PROCESSOR_EXTERNAL_CONTROL
+ bool
+ depends on (X86 || IA64) && XEN
+ default y
+
endif # ACPI
diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
index b130ea0..e076a2a 100644
--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -32,6 +32,10 @@ ifdef CONFIG_CPU_FREQ
processor-objs += processor_perflib.o
endif
+ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+processor-objs += processor_perflib.o processor_extcntl.o
+endif
+
obj-y += bus.o glue.o
obj-y += scan.o
# Keep EC driver first. Initialization of others depend on it.
diff --git a/drivers/acpi/processor_core.c b/drivers/acpi/processor_core.c
index 0cc2fd3..b406a1f 100644
--- a/drivers/acpi/processor_core.c
+++ b/drivers/acpi/processor_core.c
@@ -620,7 +620,13 @@ static int acpi_processor_get_info(struct acpi_device *device)
if (pr->id == -1) {
if (ACPI_FAILURE
(acpi_processor_hotadd_init(pr->handle, &pr->id))) {
+#ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+ if (!processor_cntl_external()) {
+ return -ENODEV;
+ }
+#else
return -ENODEV;
+#endif /* CONFIG_PROCESSOR_EXTERNAL_CONTROL */
}
}
@@ -670,6 +676,10 @@ static int __cpuinit acpi_processor_start(struct acpi_device *device)
struct acpi_processor *pr;
struct sys_device *sysdev;
+#ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+ processor_extcntl_init();
+#endif /* CONFIG_PROCESSOR_EXTERNAL_CONTROL */
+
pr = acpi_driver_data(device);
result = acpi_processor_get_info(device);
@@ -709,7 +719,8 @@ static int __cpuinit acpi_processor_start(struct acpi_device *device)
/* _PDC call should be done before doing anything else (if reqd.). */
arch_acpi_processor_init_pdc(pr);
acpi_processor_set_pdc(pr);
-#ifdef CONFIG_CPU_FREQ
+
+#if defined(CONFIG_CPU_FREQ) || defined(CONFIG_PROCESSOR_EXTERNAL_CONTROL)
acpi_processor_ppc_has_changed(pr);
#endif
acpi_processor_get_throttling_info(pr);
@@ -718,6 +729,12 @@ static int __cpuinit acpi_processor_start(struct acpi_device *device)
acpi_processor_power_init(pr, device);
+#ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+ result = processor_extcntl_prepare(pr);
+ if (result)
+ goto end;
+#endif /* CONFIG_PROCESSOR_EXTERNAL_CONTROL */
+
pr->cdev = thermal_cooling_device_register("Processor", device,
&processor_cooling_ops);
if (IS_ERR(pr->cdev)) {
@@ -938,6 +955,12 @@ int acpi_processor_device_add(acpi_handle handle, struct acpi_device **device)
if (!pr)
return -ENODEV;
+#ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+ if (processor_cntl_external())
+ processor_notify_external(pr,
+ PROCESSOR_HOTPLUG, HOTPLUG_TYPE_ADD);
+#endif /* CONFIG_PROCESSOR_EXTERNAL_CONTROL */
+
if ((pr->id >= 0) && (pr->id < nr_cpu_ids)) {
kobject_uevent(&(*device)->dev.kobj, KOBJ_ONLINE);
}
@@ -977,11 +1000,23 @@ static void __ref acpi_processor_hotplug_notify(acpi_handle handle,
break;
}
+#ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+ if (processor_cntl_external())
+ processor_notify_external(pr,
+ PROCESSOR_HOTPLUG, HOTPLUG_TYPE_ADD);
+#endif /* CONFIG_PROCESSOR_EXTERNAL_CONTROL */
+
if (pr->id >= 0 && (pr->id < nr_cpu_ids)) {
kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
break;
}
+#ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+ if (processor_cntl_external())
+ processor_notify_external(pr, PROCESSOR_HOTPLUG,
+ HOTPLUG_TYPE_REMOVE);
+#endif /* CONFIG_PROCESSOR_EXTERNAL_CONTROL */
+
result = acpi_processor_start(device);
if ((!result) && ((pr->id >= 0) && (pr->id < nr_cpu_ids))) {
kobject_uevent(&device->dev.kobj, KOBJ_ONLINE);
diff --git a/drivers/acpi/processor_extcntl.c b/drivers/acpi/processor_extcntl.c
new file mode 100644
index 0000000..53674b3
--- /dev/null
+++ b/drivers/acpi/processor_extcntl.c
@@ -0,0 +1,233 @@
+/*
+ * processor_extcntl.c - channel to external control logic
+ *
+ * Copyright (C) 2008, Intel corporation
+ *
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/types.h>
+#include <linux/acpi.h>
+#include <linux/pm.h>
+#include <linux/cpu.h>
+
+#include <acpi/processor.h>
+
+#define ACPI_PROCESSOR_CLASS "processor"
+#define ACPI_PROCESSOR_DRIVER_NAME "ACPI Processor Driver"
+#define _COMPONENT ACPI_PROCESSOR_COMPONENT
+ACPI_MODULE_NAME("acpi_processor")
+
+static int processor_extcntl_parse_csd(struct acpi_processor *pr);
+static int processor_extcntl_get_performance(struct acpi_processor *pr);
+/*
+ * External processor control logic may register with its own set of
+ * ops to get ACPI related notification. One example is like VMM.
+ */
+const struct processor_extcntl_ops *processor_extcntl_ops;
+EXPORT_SYMBOL(processor_extcntl_ops);
+
+static int processor_notify_smm(void)
+{
+ acpi_status status;
+ static int is_done = 0;
+
+ /* only need successfully notify BIOS once */
+ /* avoid double notification which may lead to unexpected result */
+ if (is_done)
+ return 0;
+
+ /* Can't write pstate_cnt to smi_cmd if either value is zero */
+ if ((!acpi_gbl_FADT.smi_command) || (!acpi_gbl_FADT.pstate_control)) {
+ ACPI_DEBUG_PRINT((ACPI_DB_INFO,"No SMI port or pstate_cnt\n"));
+ return 0;
+ }
+
+ ACPI_DEBUG_PRINT((ACPI_DB_INFO,
+ "Writing pstate_cnt [0x%x] to smi_cmd [0x%x]\n",
+ acpi_fadt.pstate_cnt, acpi_fadt.smi_cmd));
+
+ status = acpi_os_write_port(acpi_gbl_FADT.smi_command,
+ (u32) acpi_gbl_FADT.pstate_control, 8);
+ if (ACPI_FAILURE(status))
+ return status;
+
+ is_done = 1;
+
+ return 0;
+}
+
+int processor_notify_external(struct acpi_processor *pr, int event, int type)
+{
+ int ret = -EINVAL;
+
+ if (!processor_cntl_external())
+ return -EINVAL;
+
+ switch (event) {
+ case PROCESSOR_PM_INIT:
+ case PROCESSOR_PM_CHANGE:
+ if ((type >= PM_TYPE_MAX) ||
+ !processor_extcntl_ops->pm_ops[type])
+ break;
+
+ ret = processor_extcntl_ops->pm_ops[type](pr, event);
+ break;
+ case PROCESSOR_HOTPLUG:
+ if (processor_extcntl_ops->hotplug)
+ ret = processor_extcntl_ops->hotplug(pr, type);
+ break;
+ default:
+ printk(KERN_ERR "Unsupport processor events %d.\n", event);
+ break;
+ }
+
+ return ret;
+}
+
+/*
+ * External control logic can decide to grab full or part of physical
+ * processor control bits. Take a VMM for example, physical processors
+ * are owned by VMM and thus existence information like hotplug is
+ * always required to be notified to VMM. Similar is processor idle
+ * state which is also necessarily controlled by VMM. But for other
+ * control bits like performance/throttle states, VMM may choose to
+ * control or not upon its own policy.
+ */
+void processor_extcntl_init(void)
+{
+ if (!processor_extcntl_ops)
+ arch_acpi_processor_init_extcntl(&processor_extcntl_ops);
+}
+
+/*
+ * This is called from ACPI processor init, and targeted to hold
+ * some tricky housekeeping jobs to satisfy external control model.
+ * For example, we may put dependency parse stub here for idle
+ * and performance state. Those information may be not available
+ * if splitting from dom0 control logic like cpufreq driver.
+ */
+int processor_extcntl_prepare(struct acpi_processor *pr)
+{
+ /* parse cstate dependency information */
+ if (processor_pm_external())
+ processor_extcntl_parse_csd(pr);
+
+ /* Initialize performance states */
+ if (processor_pmperf_external())
+ processor_extcntl_get_performance(pr);
+
+ return 0;
+}
+
+/*
+ * Currently no _CSD is implemented which is why existing ACPI code
+ * doesn't parse _CSD at all. But to keep interface complete with
+ * external control logic, we put a placeholder here for future
+ * compatibility.
+ */
+static int processor_extcntl_parse_csd(struct acpi_processor *pr)
+{
+ int i;
+
+ for (i = 0; i < pr->power.count; i++) {
+ if (!pr->power.states[i].valid)
+ continue;
+
+ /* No dependency by default */
+ pr->power.states[i].domain_info = NULL;
+ pr->power.states[i].csd_count = 0;
+ }
+
+ return 0;
+}
+
+/*
+ * Existing ACPI module does parse performance states at some point,
+ * when acpi-cpufreq driver is loaded which however is something
+ * we'd like to disable to avoid confliction with external control
+ * logic. So we have to collect raw performance information here
+ * when ACPI processor object is found and started.
+ */
+static int processor_extcntl_get_performance(struct acpi_processor *pr)
+{
+ int ret;
+ struct acpi_processor_performance *perf;
+ struct acpi_psd_package *pdomain;
+
+ if (pr->performance)
+ return -EBUSY;
+
+ perf = kzalloc(sizeof(struct acpi_processor_performance), GFP_KERNEL);
+ if (!perf)
+ return -ENOMEM;
+
+ pr->performance = perf;
+ /* Get basic performance state information */
+ ret = acpi_processor_get_performance_info(pr);
+ if (ret < 0)
+ goto err_out;
+
+ /*
+ * Well, here we need retrieve performance dependency information
+ * from _PSD object. The reason why existing interface is not used
+ * is due to the reason that existing interface sticks to Linux cpu
+ * id to construct some bitmap, however we want to split ACPI
+ * processor objects from Linux cpu id logic. For example, even
+ * when Linux is configured as UP, we still want to parse all ACPI
+ * processor objects to external logic. In this case, it's preferred
+ * to use ACPI ID instead.
+ */
+ pdomain = &pr->performance->domain_info;
+ pdomain->num_processors = 0;
+ ret = acpi_processor_get_psd(pr);
+ if (ret < 0) {
+ /*
+ * _PSD is optional - assume no coordination if absent (or
+ * broken), matching native kernels' behavior.
+ */
+ pdomain->num_entries = ACPI_PSD_REV0_ENTRIES;
+ pdomain->revision = ACPI_PSD_REV0_REVISION;
+ pdomain->domain = pr->acpi_id;
+ pdomain->coord_type = DOMAIN_COORD_TYPE_SW_ALL;
+ pdomain->num_processors = 1;
+ }
+
+ /* Some sanity check */
+ if ((pdomain->revision != ACPI_PSD_REV0_REVISION) ||
+ (pdomain->num_entries != ACPI_PSD_REV0_ENTRIES) ||
+ ((pdomain->coord_type != DOMAIN_COORD_TYPE_SW_ALL) &&
+ (pdomain->coord_type != DOMAIN_COORD_TYPE_SW_ANY) &&
+ (pdomain->coord_type != DOMAIN_COORD_TYPE_HW_ALL))) {
+ ret = -EINVAL;
+ goto err_out;
+ }
+
+ /* Last step is to notify BIOS that external logic exists */
+ processor_notify_smm();
+
+ processor_notify_external(pr, PROCESSOR_PM_INIT, PM_TYPE_PERF);
+
+ return 0;
+err_out:
+ pr->performance = NULL;
+ kfree(perf);
+ return ret;
+}
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 08def2f..2452942 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -489,6 +489,12 @@ static int acpi_processor_get_power_info_cst(struct acpi_processor *pr)
cx.power = obj->integer.value;
+#ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+ /* cache control methods to notify external logic */
+ if (processor_pm_external())
+ memcpy(&cx.reg, reg, sizeof(*reg));
+#endif /* CONFIG_PROCESSOR_EXTERNAL_CONTROL */
+
current_count++;
memcpy(&(pr->power.states[current_count]), &cx, sizeof(cx));
@@ -1177,6 +1183,15 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr)
if (!pr->flags.power_setup_done)
return -ENODEV;
+#ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+ if (processor_pm_external()) {
+ acpi_processor_get_power_info(pr);
+ processor_notify_external(pr,
+ PROCESSOR_PM_CHANGE, PM_TYPE_IDLE);
+ return ret;
+ }
+#endif /* CONFIG_PROCESSOR_EXTERNAL_CONTROL */
+
cpuidle_pause_and_lock();
cpuidle_disable_device(&pr->power.dev);
acpi_processor_get_power_info(pr);
@@ -1240,9 +1255,18 @@ int __cpuinit acpi_processor_power_init(struct acpi_processor *pr,
* platforms that only support C1.
*/
if (pr->flags.power) {
+#ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+ if (!processor_pm_external())
+ {
+ acpi_processor_setup_cpuidle(pr);
+ if (cpuidle_register_device(&pr->power.dev))
+ return -EIO;
+ }
+#else
acpi_processor_setup_cpuidle(pr);
if (cpuidle_register_device(&pr->power.dev))
return -EIO;
+#endif /* CONFIG_PROCESSOR_EXTERNAL_CONTROL */
printk(KERN_INFO PREFIX "CPU%d (power states:", pr->id);
for (i = 1; i <= pr->power.count; i++)
@@ -1259,6 +1283,13 @@ int __cpuinit acpi_processor_power_init(struct acpi_processor *pr,
acpi_driver_data(device));
if (!entry)
return -EIO;
+
+#ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+ if (processor_pm_external())
+ processor_notify_external(pr,
+ PROCESSOR_PM_INIT, PM_TYPE_IDLE);
+#endif /* CONFIG_PROCESSOR_EXTERNAL_CONTROL */
+
return 0;
}
diff --git a/drivers/acpi/processor_perflib.c b/drivers/acpi/processor_perflib.c
index 68fd3d2..bccebbc 100644
--- a/drivers/acpi/processor_perflib.c
+++ b/drivers/acpi/processor_perflib.c
@@ -162,9 +162,15 @@ int acpi_processor_ppc_has_changed(struct acpi_processor *pr)
if (ret < 0)
return (ret);
else
+#ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+ return processor_notify_external(pr,
+ PROCESSOR_PM_CHANGE, PM_TYPE_PERF);
+#else
return cpufreq_update_policy(pr->id);
+#endif /* CONFIG_PROCESSOR_EXTERNAL_CONTROL */
}
+#ifdef CONFIG_CPU_FREQ
void acpi_processor_ppc_init(void)
{
if (!cpufreq_register_notifier
@@ -183,6 +189,7 @@ void acpi_processor_ppc_exit(void)
acpi_processor_ppc_status &= ~PPC_REGISTERED;
}
+#endif
static int acpi_processor_get_performance_control(struct acpi_processor *pr)
{
@@ -324,7 +331,10 @@ static int acpi_processor_get_performance_states(struct acpi_processor *pr)
return result;
}
-static int acpi_processor_get_performance_info(struct acpi_processor *pr)
+#ifndef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+static
+#endif
+int acpi_processor_get_performance_info(struct acpi_processor *pr)
{
int result = 0;
acpi_status status = AE_OK;
@@ -365,6 +375,7 @@ static int acpi_processor_get_performance_info(struct acpi_processor *pr)
return result;
}
+#ifdef CONFIG_CPU_FREQ
int acpi_processor_notify_smm(struct module *calling_module)
{
acpi_status status;
@@ -425,8 +436,12 @@ int acpi_processor_notify_smm(struct module *calling_module)
}
EXPORT_SYMBOL(acpi_processor_notify_smm);
+#endif
-static int acpi_processor_get_psd(struct acpi_processor *pr)
+#ifndef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+static
+#endif
+int acpi_processor_get_psd(struct acpi_processor *pr)
{
int result = 0;
acpi_status status = AE_OK;
diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig
index a8c8d9c..2de122f 100644
--- a/drivers/cpufreq/Kconfig
+++ b/drivers/cpufreq/Kconfig
@@ -1,5 +1,6 @@
config CPU_FREQ
bool "CPU Frequency scaling"
+ depends on !PROCESSOR_EXTERNAL_CONTROL
help
CPU Frequency scaling allows you to change the clock speed of
CPUs on the fly. This is a nice method to save power, because
diff --git a/include/acpi/processor.h b/include/acpi/processor.h
index 0574add..89ffc6a 100644
--- a/include/acpi/processor.h
+++ b/include/acpi/processor.h
@@ -42,6 +42,17 @@
struct acpi_processor_cx;
+#ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+struct acpi_csd_package {
+ acpi_integer num_entries;
+ acpi_integer revision;
+ acpi_integer domain;
+ acpi_integer coord_type;
+ acpi_integer num_processors;
+ acpi_integer index;
+} __attribute__ ((packed));
+#endif /* CONFIG_PROCESSOR_EXTERNAL_CONTROL */
+
struct acpi_power_register {
u8 descriptor;
u16 length;
@@ -74,6 +85,12 @@ struct acpi_processor_cx {
u32 power;
u32 usage;
u64 time;
+#ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+ /* Require raw information for external control logic */
+ u32 csd_count;
+ struct acpi_power_register reg;
+ struct acpi_csd_package *domain_info;
+#endif /* CONFIG_PROCESSOR_EXTERNAL_CONTROL */
struct acpi_processor_cx_policy promotion;
struct acpi_processor_cx_policy demotion;
char desc[ACPI_CX_DESC_LEN];
@@ -304,6 +321,9 @@ static inline void acpi_processor_ppc_exit(void)
{
return;
}
+#ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+int acpi_processor_ppc_has_changed(struct acpi_processor *pr);
+#else
static inline int acpi_processor_ppc_has_changed(struct acpi_processor *pr)
{
static unsigned int printout = 1;
@@ -316,6 +336,7 @@ static inline int acpi_processor_ppc_has_changed(struct acpi_processor *pr)
}
return 0;
}
+#endif /* CONFIG_PROCESSOR_EXTERNAL_CONTROL */
#endif /* CONFIG_CPU_FREQ */
/* in processor_throttling.c */
@@ -352,4 +373,120 @@ static inline void acpi_thermal_cpufreq_exit(void)
}
#endif
+/*
+ * Following are interfaces geared to external processor PM control
+ * logic like a VMM
+ */
+/* Events notified to external control logic */
+#define PROCESSOR_PM_INIT 1
+#define PROCESSOR_PM_CHANGE 2
+#define PROCESSOR_HOTPLUG 3
+
+/* Objects for the PM events */
+#define PM_TYPE_IDLE 0
+#define PM_TYPE_PERF 1
+#define PM_TYPE_THR 2
+#define PM_TYPE_MAX 3
+
+/* Processor hotplug events */
+#define HOTPLUG_TYPE_ADD 0
+#define HOTPLUG_TYPE_REMOVE 1
+
+#ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+struct processor_extcntl_ops {
+ /* Transfer processor PM events to external control logic */
+int (*pm_ops[PM_TYPE_MAX])(struct acpi_processor *pr, int event);
+ /* Notify physical processor status to external control logic */
+ int (*hotplug)(struct acpi_processor *pr, int type);
+};
+extern const struct processor_extcntl_ops *processor_extcntl_ops;
+
+static inline int processor_cntl_external(void)
+{
+ return (processor_extcntl_ops != NULL);
+}
+
+static inline int processor_pm_external(void)
+{
+ return processor_cntl_external() &&
+ (processor_extcntl_ops->pm_ops[PM_TYPE_IDLE] != NULL);
+}
+
+static inline int processor_pmperf_external(void)
+{
+ return processor_cntl_external() &&
+ (processor_extcntl_ops->pm_ops[PM_TYPE_PERF] != NULL);
+}
+
+static inline int processor_pmthr_external(void)
+{
+ return processor_cntl_external() &&
+ (processor_extcntl_ops->pm_ops[PM_TYPE_THR] != NULL);
+}
+
+extern int processor_notify_external(struct acpi_processor *pr,
+ int event, int type);
+extern void processor_extcntl_init(void);
+extern int processor_extcntl_prepare(struct acpi_processor *pr);
+extern int acpi_processor_get_performance_info(struct acpi_processor *pr);
+extern int acpi_processor_get_psd(struct acpi_processor *pr);
+void arch_acpi_processor_init_extcntl(const struct processor_extcntl_ops **);
+#else
+static inline int processor_cntl_external(void) {return 0;}
+static inline int processor_pm_external(void) {return 0;}
+static inline int processor_pmperf_external(void) {return 0;}
+static inline int processor_pmthr_external(void) {return 0;}
+static inline int processor_notify_external(struct acpi_processor *pr,
+ int event, int type)
+{
+ return 0;
+}
+static inline void processor_extcntl_init(void) {}
+static inline int processor_extcntl_prepare(struct acpi_processor *pr)
+{
+ return 0;
+}
+#endif /* CONFIG_PROCESSOR_EXTERNAL_CONTROL */
+
+#ifdef CONFIG_XEN
+static inline void xen_convert_pct_reg(struct xen_pct_register *xpct,
+ struct acpi_pct_register *apct)
+{
+ xpct->descriptor = apct->descriptor;
+ xpct->length = apct->length;
+ xpct->space_id = apct->space_id;
+ xpct->bit_width = apct->bit_width;
+ xpct->bit_offset = apct->bit_offset;
+ xpct->reserved = apct->reserved;
+ xpct->address = apct->address;
+}
+
+static inline void xen_convert_pss_states(struct xen_processor_px *xpss,
+ struct acpi_processor_px *apss, int state_count)
+{
+ int i;
+ for(i=0; i<state_count; i++) {
+ xpss->core_frequency = apss->core_frequency;
+ xpss->power = apss->power;
+ xpss->transition_latency = apss->transition_latency;
+ xpss->bus_master_latency = apss->bus_master_latency;
+ xpss->control = apss->control;
+ xpss->status = apss->status;
+ xpss++;
+ apss++;
+ }
+}
+
+static inline void xen_convert_psd_pack(struct xen_psd_package *xpsd,
+ struct acpi_psd_package *apsd)
+{
+ xpsd->num_entries = apsd->num_entries;
+ xpsd->revision = apsd->revision;
+ xpsd->domain = apsd->domain;
+ xpsd->coord_type = apsd->coord_type;
+ xpsd->num_processors = apsd->num_processors;
+}
+
+#endif /* CONFIG_XEN */
+
#endif
[-- Attachment #3: acpi_parser_platform_hypercall.patch --]
[-- Type: application/octet-stream, Size: 5831 bytes --]
Domain0 acpi parser related platform hypercall
Signed-off-by: Yu Ke <ke.yu@intel.com>
Tian Kevin <kevin.tian@intel.com>
---
arch/x86/include/asm/acpi.h | 4 +
include/xen/interface/platform.h | 114 ++++++++++++++++++++++++++++++++++++++
include/xen/interface/xen.h | 1
3 files changed, 119 insertions(+), 0 deletions(-)
diff --git a/arch/x86/include/asm/acpi.h b/arch/x86/include/asm/acpi.h
index 4518dc5..82cb49d 100644
--- a/arch/x86/include/asm/acpi.h
+++ b/arch/x86/include/asm/acpi.h
@@ -30,6 +30,10 @@
#include <asm/mmu.h>
#include <asm/mpspec.h>
+#ifdef CONFIG_XEN
+#include <xen/interface/platform.h>
+#endif /* CONFIG_XEN */
+
#define COMPILER_DEPENDENT_INT64 long long
#define COMPILER_DEPENDENT_UINT64 unsigned long long
diff --git a/include/xen/interface/platform.h b/include/xen/interface/platform.h
index da548f3..c753cc8 100644
--- a/include/xen/interface/platform.h
+++ b/include/xen/interface/platform.h
@@ -199,6 +199,119 @@ struct xenpf_getidletime {
typedef struct xenpf_getidletime xenpf_getidletime_t;
DEFINE_GUEST_HANDLE_STRUCT(xenpf_getidletime_t);
+#define XENPF_set_processor_pminfo 54
+
+/* ability bits */
+#define XEN_PROCESSOR_PM_CX 1
+#define XEN_PROCESSOR_PM_PX 2
+#define XEN_PROCESSOR_PM_TX 4
+
+/* cmd type */
+#define XEN_PM_CX 0
+#define XEN_PM_PX 1
+#define XEN_PM_TX 2
+
+/* Px sub info type */
+#define XEN_PX_PCT 1
+#define XEN_PX_PSS 2
+#define XEN_PX_PPC 4
+#define XEN_PX_PSD 8
+
+struct xen_power_register {
+ uint32_t space_id;
+ uint32_t bit_width;
+ uint32_t bit_offset;
+ uint32_t access_size;
+ uint64_t address;
+};
+
+struct xen_processor_csd {
+ uint32_t domain; /* domain number of one dependent group */
+ uint32_t coord_type; /* coordination type */
+ uint32_t num; /* number of processors in same domain */
+};
+typedef struct xen_processor_csd xen_processor_csd_t;
+DEFINE_GUEST_HANDLE_STRUCT(xen_processor_csd);
+
+struct xen_processor_cx {
+ struct xen_power_register reg; /* GAS for Cx trigger register */
+ uint8_t type; /* cstate value, c0: 0, c1: 1, ... */
+ uint32_t latency; /* worst latency (ms) to enter/exit this cstate */
+ uint32_t power; /* average power consumption(mW) */
+ uint32_t dpcnt; /* number of dependency entries */
+ GUEST_HANDLE(xen_processor_csd) dp; /* NULL if no dependency */
+};
+typedef struct xen_processor_cx xen_processor_cx_t;
+DEFINE_GUEST_HANDLE_STRUCT(xen_processor_cx);
+
+struct xen_processor_flags {
+ uint32_t bm_control:1;
+ uint32_t bm_check:1;
+ uint32_t has_cst:1;
+ uint32_t power_setup_done:1;
+ uint32_t bm_rld_set:1;
+};
+
+struct xen_processor_power {
+ uint32_t count; /* number of C state entries in array below */
+ struct xen_processor_flags flags; /* global flags of this processor */
+ GUEST_HANDLE(xen_processor_cx) states; /* supported c states */
+};
+
+struct xen_pct_register {
+ uint8_t descriptor;
+ uint16_t length;
+ uint8_t space_id;
+ uint8_t bit_width;
+ uint8_t bit_offset;
+ uint8_t reserved;
+ uint64_t address;
+};
+
+struct xen_processor_px {
+ uint64_t core_frequency; /* megahertz */
+ uint64_t power; /* milliWatts */
+ uint64_t transition_latency; /* microseconds */
+ uint64_t bus_master_latency; /* microseconds */
+ uint64_t control; /* control value */
+ uint64_t status; /* success indicator */
+};
+typedef struct xen_processor_px xen_processor_px_t;
+DEFINE_GUEST_HANDLE_STRUCT(xen_processor_px);
+
+struct xen_psd_package {
+ uint64_t num_entries;
+ uint64_t revision;
+ uint64_t domain;
+ uint64_t coord_type;
+ uint64_t num_processors;
+};
+
+struct xen_processor_performance {
+ uint32_t flags; /* flag for Px sub info type */
+ uint32_t platform_limit; /* Platform limitation on freq usage */
+ struct xen_pct_register control_register;
+ struct xen_pct_register status_register;
+ uint32_t state_count; /* total available performance states */
+ GUEST_HANDLE(xen_processor_px) states;
+ struct xen_psd_package domain_info;
+ uint32_t shared_type; /* coordination type of this processor */
+};
+typedef struct xen_processor_performance xen_processor_performance_t;
+DEFINE_GUEST_HANDLE_STRUCT(xen_processor_performance);
+
+struct xenpf_set_processor_pminfo {
+ /* IN variables */
+ uint32_t id; /* ACPI CPU ID */
+ uint32_t type; /* {XEN_PM_CX, XEN_PM_PX} */
+ union {
+ struct xen_processor_power power;/* Cx: _CST/_CSD */
+ struct xen_processor_performance perf; /* Px: _PPC/_PCT/_PSS/_PSD */
+ };
+};
+typedef struct xenpf_set_processor_pminfo xenpf_set_processor_pminfo_t;
+DEFINE_GUEST_HANDLE_STRUCT(xenpf_set_processor_pminfo);
+
struct xen_platform_op {
uint32_t cmd;
uint32_t interface_version; /* XENPF_INTERFACE_VERSION */
@@ -213,6 +326,7 @@ struct xen_platform_op {
struct xenpf_enter_acpi_sleep enter_acpi_sleep;
struct xenpf_change_freq change_freq;
struct xenpf_getidletime getidletime;
+ struct xenpf_set_processor_pminfo set_pminfo;
uint8_t pad[128];
} u;
};
diff --git a/include/xen/interface/xen.h b/include/xen/interface/xen.h
index 605769b..9edacd0 100644
--- a/include/xen/interface/xen.h
+++ b/include/xen/interface/xen.h
@@ -493,6 +493,7 @@ struct dom0_vga_console_info {
/* These flags are passed in the 'flags' field of start_info_t. */
#define SIF_PRIVILEGED (1<<0) /* Is the domain privileged? */
#define SIF_INITDOMAIN (1<<1) /* Is this the initial control domain? */
+#define SIF_PM_MASK (0xFF<<8) /* reserve 1 byte for xen-pm options */
typedef uint64_t cpumap_t;
[-- Attachment #4: acpi_parser_xen_driver.patch --]
[-- Type: application/octet-stream, Size: 7047 bytes --]
Xen implementation of the external control operation interface
Signed-off-by: Yu Ke <ke.yu@intel.com>
Tian Kevin <kevin.tian@intel.com>
---
arch/x86/kernel/acpi/Makefile | 3
arch/x86/kernel/acpi/processor_extcntl_xen.c | 211 ++++++++++++++++++++++++++
2 files changed, 214 insertions(+), 0 deletions(-)
create mode 100644 arch/x86/kernel/acpi/processor_extcntl_xen.c
diff --git a/arch/x86/kernel/acpi/Makefile b/arch/x86/kernel/acpi/Makefile
index fd5ca97..f9a7ae9 100644
--- a/arch/x86/kernel/acpi/Makefile
+++ b/arch/x86/kernel/acpi/Makefile
@@ -5,6 +5,9 @@ obj-$(CONFIG_ACPI_SLEEP) += sleep.o wakeup_rm.o wakeup_$(BITS).o
ifneq ($(CONFIG_ACPI_PROCESSOR),)
obj-y += cstate.o processor.o
+ifneq ($(CONFIG_PROCESSOR_EXTERNAL_CONTROL),)
+obj-$(CONFIG_XEN) += processor_extcntl_xen.o
+endif
endif
$(obj)/wakeup_rm.o: $(obj)/realmode/wakeup.bin
diff --git a/arch/x86/kernel/acpi/processor_extcntl_xen.c b/arch/x86/kernel/acpi/processor_extcntl_xen.c
new file mode 100644
index 0000000..938b875
--- /dev/null
+++ b/arch/x86/kernel/acpi/processor_extcntl_xen.c
@@ -0,0 +1,211 @@
+/*
+ * processor_extcntl_xen.c - interface to notify Xen
+ *
+ * Copyright (C) 2008, Intel corporation
+ *
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/types.h>
+#include <linux/acpi.h>
+#include <linux/pm.h>
+#include <linux/cpu.h>
+
+#include <linux/cpufreq.h>
+#include <acpi/processor.h>
+
+#include <asm/xen/hypercall.h>
+#include <asm/xen/hypervisor.h>
+
+static int xen_cx_notifier(struct acpi_processor *pr, int action)
+{
+ int ret, count = 0, i;
+ xen_platform_op_t op = {
+ .cmd = XENPF_set_processor_pminfo,
+ .interface_version = XENPF_INTERFACE_VERSION,
+ .u.set_pminfo.id = pr->acpi_id,
+ .u.set_pminfo.type = XEN_PM_CX,
+ };
+ struct xen_processor_cx *data, *buf;
+ struct acpi_processor_cx *cx;
+
+ if (action == PROCESSOR_PM_CHANGE)
+ return -EINVAL;
+
+ /* Convert to Xen defined structure and hypercall */
+ buf = kzalloc(pr->power.count * sizeof(struct xen_processor_cx),
+ GFP_KERNEL);
+ if (!buf)
+ return -ENOMEM;
+
+ data = buf;
+ for (i = 1; i <= pr->power.count; i++) {
+ cx = &pr->power.states[i];
+ /* Skip invalid cstate entry */
+ if (!cx->valid)
+ continue;
+
+ data->type = cx->type;
+ data->latency = cx->latency;
+ data->power = cx->power;
+ data->reg.space_id = cx->reg.space_id;
+ data->reg.bit_width = cx->reg.bit_width;
+ data->reg.bit_offset = cx->reg.bit_offset;
+ data->reg.access_size = cx->reg.reserved;
+ data->reg.address = cx->reg.address;
+
+ /* Get dependency relationships */
+ if (cx->csd_count) {
+ printk("Wow! _CSD is found. Not support for now!\n");
+ kfree(buf);
+ return -EINVAL;
+ } else {
+ data->dpcnt = 0;
+ set_xen_guest_handle(data->dp, NULL);
+ }
+
+ data++;
+ count++;
+ }
+
+ if (!count) {
+ printk("No available Cx info for cpu %d\n", pr->acpi_id);
+ kfree(buf);
+ return -EINVAL;
+ }
+
+ op.u.set_pminfo.power.count = count;
+ op.u.set_pminfo.power.flags.bm_control = pr->flags.bm_control;
+ op.u.set_pminfo.power.flags.bm_check = pr->flags.bm_check;
+ op.u.set_pminfo.power.flags.has_cst = pr->flags.has_cst;
+ op.u.set_pminfo.power.flags.power_setup_done = pr->flags.power_setup_done;
+
+ set_xen_guest_handle(op.u.set_pminfo.power.states, buf);
+ ret = HYPERVISOR_dom0_op(&op);
+ kfree(buf);
+ return ret;
+}
+
+static int xen_px_notifier(struct acpi_processor *pr, int action)
+{
+ int ret = -EINVAL;
+ xen_platform_op_t op = {
+ .cmd = XENPF_set_processor_pminfo,
+ .interface_version = XENPF_INTERFACE_VERSION,
+ .u.set_pminfo.id = pr->acpi_id,
+ .u.set_pminfo.type = XEN_PM_PX,
+ };
+ struct xen_processor_performance *perf;
+ struct xen_processor_px *states = NULL;
+ struct acpi_processor_performance *px;
+ struct acpi_psd_package *pdomain;
+
+ if (!pr)
+ return -EINVAL;
+
+ perf = &op.u.set_pminfo.perf;
+ px = pr->performance;
+
+ switch(action) {
+ case PROCESSOR_PM_CHANGE:
+ /* ppc dynamic handle */
+ perf->flags = XEN_PX_PPC;
+ perf->platform_limit = pr->performance_platform_limit;
+
+ ret = HYPERVISOR_dom0_op(&op);
+ break;
+
+ case PROCESSOR_PM_INIT:
+ /* px normal init */
+ perf->flags = XEN_PX_PPC |
+ XEN_PX_PCT |
+ XEN_PX_PSS |
+ XEN_PX_PSD;
+
+ /* ppc */
+ perf->platform_limit = pr->performance_platform_limit;
+
+ /* pct */
+ xen_convert_pct_reg(&perf->control_register, &px->control_register);
+ xen_convert_pct_reg(&perf->status_register, &px->status_register);
+
+ /* pss */
+ perf->state_count = px->state_count;
+ states = kzalloc(px->state_count*sizeof(xen_processor_px_t),GFP_KERNEL);
+ if (!states)
+ return -ENOMEM;
+ xen_convert_pss_states(states, px->states, px->state_count);
+ set_xen_guest_handle(perf->states, states);
+
+ /* psd */
+ pdomain = &px->domain_info;
+ xen_convert_psd_pack(&perf->domain_info, pdomain);
+ if (pdomain->coord_type == DOMAIN_COORD_TYPE_SW_ALL)
+ perf->shared_type = CPUFREQ_SHARED_TYPE_ALL;
+ else if (pdomain->coord_type == DOMAIN_COORD_TYPE_SW_ANY)
+ perf->shared_type = CPUFREQ_SHARED_TYPE_ANY;
+ else if (pdomain->coord_type == DOMAIN_COORD_TYPE_HW_ALL)
+ perf->shared_type = CPUFREQ_SHARED_TYPE_HW;
+ else {
+ ret = -ENODEV;
+ kfree(states);
+ break;
+ }
+
+ ret = HYPERVISOR_dom0_op(&op);
+ kfree(states);
+ break;
+
+ default:
+ break;
+ }
+
+ return ret;
+}
+
+static int xen_tx_notifier(struct acpi_processor *pr, int action)
+{
+ return -EINVAL;
+}
+static int xen_hotplug_notifier(struct acpi_processor *pr, int event)
+{
+ return -EINVAL;
+}
+
+static struct processor_extcntl_ops xen_extcntl_ops = {
+ .hotplug = xen_hotplug_notifier,
+};
+
+void arch_acpi_processor_init_extcntl(const struct processor_extcntl_ops **ops)
+{
+ unsigned int pmbits = (xen_start_info->flags & SIF_PM_MASK) >> 8;
+
+ if (!pmbits)
+ return;
+ if (pmbits & XEN_PROCESSOR_PM_CX)
+ xen_extcntl_ops.pm_ops[PM_TYPE_IDLE] = xen_cx_notifier;
+ if (pmbits & XEN_PROCESSOR_PM_PX)
+ xen_extcntl_ops.pm_ops[PM_TYPE_PERF] = xen_px_notifier;
+ if (pmbits & XEN_PROCESSOR_PM_TX)
+ xen_extcntl_ops.pm_ops[PM_TYPE_THR] = xen_tx_notifier;
+
+ *ops = &xen_extcntl_ops;
+}
+EXPORT_SYMBOL(arch_acpi_processor_init_extcntl);
[-- Attachment #5: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply related [flat|nested] 3+ messages in thread* Re: [RFC] pvops domain0 Cx/Px info parser patch
2009-05-11 7:58 [RFC] pvops domain0 Cx/Px info parser patch Yu, Ke
@ 2009-05-11 18:48 ` Jeremy Fitzhardinge
2009-05-12 1:15 ` Yu, Ke
0 siblings, 1 reply; 3+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-11 18:48 UTC (permalink / raw)
To: Yu, Ke; +Cc: Tian, Kevin, xen-devel@lists.xensource.com
[-- Attachment #1: Type: text/plain, Size: 577 bytes --]
Yu, Ke wrote:
> Hi, Jeremy
>
> Since now the pv_ops domain0 is approaching, we are considering to port the dom0 power management related code, or more specifically, the Cx/Px ACPI info parsing code, to pv_ops domain0 tree, so that people can utilize Xen power management capability (Cx/PX) under pv_ops domain0. This RFC is a draft version for comments.
>
Hi. I commented on the last patch, but saw no response. Did you get
those comments?
Oh, perhaps they never got sent. I've attached them now. I have not
looked at these patches yet. Do they differ much?
J
[-- Attachment #2: Attached Message --]
[-- Type: message/rfc822, Size: 15944 bytes --]
[-- Attachment #2.1.1: Type: text/plain, Size: 6893 bytes --]
Yu, Ke wrote:
Hi Jeremy,
Kevin and I originally write the domain0 host S3 code and Cx/Px state ACPI info parsing code in the linux-2.6.18-xen.hg tree. Since now the pv_ops domain0 is approaching, we are considering to port these code to pv_ops domain0 tree. It is very nice that you already port the host S3 code, so in this email we post the Cx/Px state ACPI info parsing code. This is a draft version intending for early comments.
Thanks for doing this. I looked at the changes in 2.6.18-xen and put them into the too-hard basket.
Some general comments about patch submission:
* please send patches one-per-mail, with [PATCH N/M] in the subject line to indicate ordering
* in each mail, include the full changelog description like you have below
* each person signing off the mail should have a Signed-off-by: line
This makes it easier to do line-by-line commentary on the patches and also process it with the tools I have on hand.
I just committed these patches to my tree, rebased onto xen-tip/dom0/core as xen-tip/dom0/acpi-parser, but I haven't even compile-tested to see if the rebase worked. But I did cut-and-paste the overview into the first patch comment so its recorded properly.
=== Overview ===
Requirement: Xen hypervisor need Cx/Px ACPI info to do the Cx/Px states power management. This info is provided by BIOS ACPI table. Since hypervisor has no ACPI parser, this info has to be parsed by domain0 kernel ACPI sub-system, and then passed to hypervisor by hypercall.
Couldn't this be done with a variant of the cpufreq_acpi driver? It needs to parse the tables, but rather than acting on them directly it passes the information to Xen?
To make this happen, the key point is to add hook in the kernel ACPI sub-system. Fortunately, kernel already has good abstraction, and only several places need to add hook. To be more detail, there is an acpi_processor_driver (in drivers/acpi/processor_core.c) , which all the Cx/Px parsing event will go to. This driver will call its acpi processor event handler, e.g. add/remove, start/stop, notify to handle these events. These event handlers in turn will call some library functions (in drivers/acpi/processor_perflib.c), e.g. acpi_processor_ppc_has_changed, acpi_processor_ppc_has_changed, acpi_processor_cst_has_changed, to finish the acpi info parsing.
So the conclusion is: adding hooks in acpi_processor_driver and those related library functions will satisfy our requirement.
To make the added hook cleaner, we introduce an interface called external control operation (struct processor_extcntl_ops). All the hooks are encapsulated in this interface processor_extcntl_ops . Here the "external" means the acpi processor is controlled by external entity, e.g. VMM. Every kind of external entity can has its implementation of this interface. In this patch, the interface for Xen is implemented.
Is the issue that the hypervisor will change the CPUs states asynchronously under the kernel, and the kernel wants to be informed about those state changes, along with some kind of mapping from pcpu to vcpu?
What use does the kernel make of that information?
I guess I don't really understand what the larger problem domain is.
Do you have any plans for other hypervisor backends? Does a kvm one make sense?
== Patch Set Description ==
This patch set is based the xen/dom0/hackery branch. It has three patches:
1. acpi_parser_framework.patch
This patch introduces the interface of external control operation, and adds hooks to the acpi sub-system, including the acpi_processor_driver, and the related library functions.
2. acpi_parser_xen_driver.patch
This patch implements the Xen version of the external control operation interface.
3. acpi_parser_platform_hypercall.patch
This patches implements the xen_platform_op hypercall, to pass the parsed ACPI info to hypervisor.
== Misc ==
To make this patch works correctly, there is a corner case need to consider: processor in domain0 is virtual CPU, while BIOS ACPI table provides the physical CPU info, since the vCPU and pCPU is not 1:1 mapped, this may caused unexpected result. E.g. in UP domain0, Xen hypervisor may only get one physical processor info. Current linux-2.6.18-xen.hg tree solve this issue by using acpi_id instead of processor_id. However, this code is a bit tricky and hacky. We are still trying to find a cleaner way to solve this issue.
Is this a corner case? I would think its the common case.
Anyway, this is draft version for your early comments, and after your reviewing, we will send it to wider scope, e.g. kernel ACPI sub-system developer in Intel.
OK, some more specific comments:
Cut down on the number of #ifdefs. If CONFIG_PROCESSOR_EXTERNAL_CONTROL is not set, define the functions to appropriate no-op definitions. For example, rather than:
+#ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+ if (!processor_cntl_external()) {
+ return -ENODEV;
+ }
+#else
return -ENODEV;
+#endif /* CONFIG_PROCESSOR_EXTERNAL_CONTROL */
just define processor_cntl_external() to always return 0. (Oh, it looks like you already do this. Why all the #ifdefs then?)
Also, rather than:
+ if (processor_cntl_external())
+ processor_notify_external(pr, PROCESSOR_HOTPLUG,
+ HOTPLUG_TYPE_REMOVE);
can't processor_notify_external() do the test itself? (In addition to the #ifdef comments.)
This seems pointless:
-static int acpi_processor_get_performance_info(struct acpi_processor *pr)
+#ifndef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+static
+#endif
+int acpi_processor_get_performance_info(struct acpi_processor *pr)
config CPU_FREQ
bool "CPU Frequency scaling"
+ depends on !PROCESSOR_EXTERNAL_CONTROL
This isn't acceptible. A single kernel image could be running on bare hardware, or be booted as a Xen dom0 kernel. It needs to be able to handle both CPUFREQ and external processor control. Unless external processor control somehow subsumes all the cpufreq driver functions? Anyway, all decisions need to be made at runtime.
Put all the Xen-specific code into a Xen-specific header. I've already added include/xen/acpi.h for the S3 changes, so perhaps that would be suitable.
+ifneq ($(CONFIG_PROCESSOR_EXTERNAL_CONTROL),)
+obj-$(CONFIG_XEN) += processor_extcntl_xen.o
+endif
No, do this in Kconfig, with something like:
config XEN_PROCESSOR_EXTERNAL_CONTROL
def_bool y
depends on XEN && PROCESSOR_EXTERNAL_CONTROL
And maybe XEN_ACPI and/or XEN_DOM0? If this is useful for non-x86 (ia64), then put it in drivers/xen.
J
[-- Attachment #2.1.2: Type: text/html, Size: 8012 bytes --]
[-- Attachment #3: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 3+ messages in thread* RE: [RFC] pvops domain0 Cx/Px info parser patch
2009-05-11 18:48 ` Jeremy Fitzhardinge
@ 2009-05-12 1:15 ` Yu, Ke
0 siblings, 0 replies; 3+ messages in thread
From: Yu, Ke @ 2009-05-12 1:15 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: Tian, Kevin, xen-devel@lists.xensource.com
Hi, Jeremy
Glad to see your the comments. I did not get the previous replying email, so just resend this RFC. There is no difference. I will reply your last email in a separate email.
Best Regards
Ke
>-----Original Message-----
>From: Jeremy Fitzhardinge [mailto:jeremy@goop.org]
>Sent: Tuesday, May 12, 2009 2:48 AM
>To: Yu, Ke
>Cc: xen-devel@lists.xensource.com; Tian, Kevin
>Subject: Re: [RFC] pvops domain0 Cx/Px info parser patch
>
>Yu, Ke wrote:
>> Hi, Jeremy
>>
>> Since now the pv_ops domain0 is approaching, we are considering to port the
>dom0 power management related code, or more specifically, the Cx/Px ACPI info
>parsing code, to pv_ops domain0 tree, so that people can utilize Xen power
>management capability (Cx/PX) under pv_ops domain0. This RFC is a draft version
>for comments.
>>
>
>Hi. I commented on the last patch, but saw no response. Did you get
>those comments?
>
>Oh, perhaps they never got sent. I've attached them now. I have not
>looked at these patches yet. Do they differ much?
>
> J
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2009-05-12 1:15 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-05-11 7:58 [RFC] pvops domain0 Cx/Px info parser patch Yu, Ke
2009-05-11 18:48 ` Jeremy Fitzhardinge
2009-05-12 1:15 ` Yu, Ke
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.