* FW: [Xen-devel] [PATCH][pvops_dom0][2/4] Introduce the external control operation interface for domain0 ACPI parser
@ 2009-07-29 2:55 Yu, Ke
2009-07-29 4:14 ` Brown, Len
0 siblings, 1 reply; 17+ messages in thread
From: Yu, Ke @ 2009-07-29 2:55 UTC (permalink / raw)
To: Brown, Len; +Cc: linux-acpi@vger.kernel.org, Jeremy Fitzhardinge, Tian, Kevin
[-- Attachment #1: Type: text/plain, Size: 20987 bytes --]
Hi Len,
This patch is the pv_ops domain0 changes to the acpi subsystem. The following has the detail description on its purpose and implementation. May I have your comments?
Best Regards
Ke
-----Original Message-----
From: xen-devel-bounces@lists.xensource.com [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Yu, Ke
Sent: Sunday, July 19, 2009 2:46 PM
To: Jeremy Fitzhardinge
Cc: Tian, Kevin; xen-devel@lists.xensource.com
Subject: [Xen-devel] [PATCH][pvops_dom0][2/4] Introduce the external control operation interface for domain0 ACPI parser
Introduce the external control operation interface for domain0 ACPI parser
From: Yu Ke <ke.yu@intel.com>
This patch introduces the interface of external control operation, and
adds hooks to the acpi sub-system, including the acpi_processor_driver,
and the related library functions.
=== Overview ===
Requirement: Xen hypervisor need Cx/Px ACPI info to do the Cx/Px states
power management. This info is provided by BIOS ACPI table. Since
hypervisor has no ACPI parser, this info has to be parsed by domain0
kernel ACPI sub-system, and then passed to hypervisor by hypercall.
To make this happen, the key point is to add hook in the kernel ACPI
sub-system. Fortunately, kernel already has good abstraction, and
only several places need to add hook. To be more detail, there is an
acpi_processor_driver (in drivers/acpi/processor_core.c) , which all the
Cx/Px parsing event will go to. This driver will call its acpi processor
event handler, e.g. add/remove, start/stop, notify to handle these
events. These event handlers in turn will call some library functions (in
drivers/acpi/processor_perflib.c), e.g. acpi_processor_ppc_has_changed,
acpi_processor_ppc_has_changed, acpi_processor_cst_has_changed, to finish
the acpi info parsing.
So the conclusion is: adding hooks in acpi_processor_driver and those
related library functions will satisfy our requirement.
To make the added hook cleaner, we introduce an interface called
external control operation (struct processor_extcntl_ops). All the hooks
are encapsulated in this interface processor_extcntl_ops . Here the
"external" means the acpi processor is controlled by external entity,
e.g. VMM. Every kind of external entity can has its implementation of
this interface. In this patch, the interface for Xen is implemented.
Signed-off-by: Yu Ke <ke.yu@intel.com>
Signed-off-by: Tian Kevin <kevin.tian@intel.com>
---
drivers/acpi/Kconfig | 5 +
drivers/acpi/Makefile | 1
drivers/acpi/processor_core.c | 16 +++
drivers/acpi/processor_extcntl.c | 208 ++++++++++++++++++++++++++++++++++++++
drivers/acpi/processor_idle.c | 24 ++++
drivers/acpi/processor_perflib.c | 9 +-
include/acpi/processor.h | 81 +++++++++++++++
7 files changed, 338 insertions(+), 6 deletions(-)
create mode 100644 drivers/acpi/processor_extcntl.c
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index 431f8b4..e932ee6 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -332,4 +332,9 @@ config ACPI_SBS
To compile this driver as a module, choose M here:
the modules will be called sbs and sbshc.
+config PROCESSOR_EXTERNAL_CONTROL
+ bool
+ depends on ACPI_PROCESSOR && CPU_FREQ
+ default y
+
endif # ACPI
diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
index 03a985b..2a42a08 100644
--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -61,3 +61,4 @@ obj-$(CONFIG_ACPI_SBS) += sbs.o
processor-y := processor_core.o processor_throttling.o
processor-y += processor_idle.o processor_thermal.o
processor-$(CONFIG_CPU_FREQ) += processor_perflib.o
+processor-$(CONFIG_PROCESSOR_EXTERNAL_CONTROL) += processor_extcntl.o
diff --git a/drivers/acpi/processor_core.c b/drivers/acpi/processor_core.c
index 45ad328..0b6facc 100644
--- a/drivers/acpi/processor_core.c
+++ b/drivers/acpi/processor_core.c
@@ -740,6 +740,10 @@ static int __cpuinit acpi_processor_start(struct acpi_device *device)
acpi_processor_power_init(pr, device);
+ result = processor_extcntl_prepare(pr);
+ if (result)
+ goto end;
+
pr->cdev = thermal_cooling_device_register("Processor", device,
&processor_cooling_ops);
if (IS_ERR(pr->cdev)) {
@@ -952,6 +956,10 @@ int acpi_processor_device_add(acpi_handle handle, struct acpi_device **device)
if (!pr)
return -ENODEV;
+ if (processor_cntl_external())
+ processor_notify_external(pr,
+ PROCESSOR_HOTPLUG, HOTPLUG_TYPE_ADD);
+
if ((pr->id >= 0) && (pr->id < nr_cpu_ids)) {
kobject_uevent(&(*device)->dev.kobj, KOBJ_ONLINE);
}
@@ -991,11 +999,19 @@ static void __ref acpi_processor_hotplug_notify(acpi_handle handle,
break;
}
+ if (processor_cntl_external())
+ processor_notify_external(pr,
+ PROCESSOR_HOTPLUG, HOTPLUG_TYPE_ADD);
+
if (pr->id >= 0 && (pr->id < nr_cpu_ids)) {
kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
break;
}
+ if (processor_cntl_external())
+ processor_notify_external(pr, PROCESSOR_HOTPLUG,
+ HOTPLUG_TYPE_REMOVE);
+
result = acpi_processor_start(device);
if ((!result) && ((pr->id >= 0) && (pr->id < nr_cpu_ids))) {
kobject_uevent(&device->dev.kobj, KOBJ_ONLINE);
diff --git a/drivers/acpi/processor_extcntl.c b/drivers/acpi/processor_extcntl.c
new file mode 100644
index 0000000..af3191f
--- /dev/null
+++ b/drivers/acpi/processor_extcntl.c
@@ -0,0 +1,208 @@
+/*
+ * processor_extcntl.c - channel to external control logic
+ *
+ * Copyright (C) 2008, Intel corporation
+ *
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/types.h>
+#include <linux/acpi.h>
+#include <linux/pm.h>
+#include <linux/cpu.h>
+
+#include <acpi/processor.h>
+
+#define ACPI_PROCESSOR_CLASS "processor"
+#define ACPI_PROCESSOR_DRIVER_NAME "ACPI Processor Driver"
+#define _COMPONENT ACPI_PROCESSOR_COMPONENT
+ACPI_MODULE_NAME("acpi_processor")
+
+static int processor_extcntl_get_performance(struct acpi_processor *pr);
+/*
+ * External processor control logic may register with its own set of
+ * ops to get ACPI related notification. One example is like VMM.
+ */
+const struct processor_extcntl_ops *processor_extcntl_ops;
+EXPORT_SYMBOL(processor_extcntl_ops);
+
+static int processor_notify_smm(void)
+{
+ acpi_status status;
+ static int is_done = 0;
+
+ /* only need successfully notify BIOS once */
+ /* avoid double notification which may lead to unexpected result */
+ if (is_done)
+ return 0;
+
+ /* Can't write pstate_cnt to smi_cmd if either value is zero */
+ if ((!acpi_gbl_FADT.smi_command) || (!acpi_gbl_FADT.pstate_control)) {
+ ACPI_DEBUG_PRINT((ACPI_DB_INFO,"No SMI port or pstate_cnt\n"));
+ return 0;
+ }
+
+ ACPI_DEBUG_PRINT((ACPI_DB_INFO,
+ "Writing pstate_cnt [0x%x] to smi_cmd [0x%x]\n",
+ acpi_gbl_FADT.pstate_control, acpi_gbl_FADT.smi_command));
+
+ status = acpi_os_write_port(acpi_gbl_FADT.smi_command,
+ (u32) acpi_gbl_FADT.pstate_control, 8);
+ if (ACPI_FAILURE(status))
+ return status;
+
+ is_done = 1;
+
+ return 0;
+}
+
+int processor_notify_external(struct acpi_processor *pr, int event, int type)
+{
+ int ret = -EINVAL;
+
+ if (!processor_cntl_external())
+ return -EINVAL;
+
+ switch (event) {
+ case PROCESSOR_PM_INIT:
+ case PROCESSOR_PM_CHANGE:
+ if ((type >= PM_TYPE_MAX) ||
+ !processor_extcntl_ops->pm_ops[type])
+ break;
+
+ ret = processor_extcntl_ops->pm_ops[type](pr, event);
+ break;
+ case PROCESSOR_HOTPLUG:
+ if (processor_extcntl_ops->hotplug)
+ ret = processor_extcntl_ops->hotplug(pr, type);
+ break;
+ default:
+ printk(KERN_ERR "Unsupport processor events %d.\n", event);
+ break;
+ }
+
+ return ret;
+}
+
+/*
+ * External control logic can decide to grab full or part of physical
+ * processor control bits. Take a VMM for example, physical processors
+ * are owned by VMM and thus existence information like hotplug is
+ * always required to be notified to VMM. Similar is processor idle
+ * state which is also necessarily controlled by VMM. But for other
+ * control bits like performance/throttle states, VMM may choose to
+ * control or not upon its own policy.
+ */
+void processor_extcntl_register(struct processor_extcntl_ops* ops)
+{
+ if (!processor_extcntl_ops)
+ processor_extcntl_ops=ops;
+}
+EXPORT_SYMBOL(processor_extcntl_register);
+
+/*
+ * This is called from ACPI processor init, and targeted to hold
+ * some tricky housekeeping jobs to satisfy external control model.
+ * For example, we may put dependency parse stub here for idle
+ * and performance state. Those information may be not available
+ * if splitting from dom0 control logic like cpufreq driver.
+ */
+int processor_extcntl_prepare(struct acpi_processor *pr)
+{
+
+ /* Initialize performance states */
+ if (processor_pmperf_external())
+ processor_extcntl_get_performance(pr);
+
+ return 0;
+}
+
+/*
+ * Existing ACPI module does parse performance states at some point,
+ * when acpi-cpufreq driver is loaded which however is something
+ * we'd like to disable to avoid confliction with external control
+ * logic. So we have to collect raw performance information here
+ * when ACPI processor object is found and started.
+ */
+static int processor_extcntl_get_performance(struct acpi_processor *pr)
+{
+ int ret;
+ struct acpi_processor_performance *perf;
+ struct acpi_psd_package *pdomain;
+
+ if (pr->performance)
+ return -EBUSY;
+
+ perf = kzalloc(sizeof(struct acpi_processor_performance), GFP_KERNEL);
+ if (!perf)
+ return -ENOMEM;
+
+ pr->performance = perf;
+ /* Get basic performance state information */
+ ret = acpi_processor_get_performance_info(pr);
+ if (ret < 0)
+ goto err_out;
+
+ /*
+ * Well, here we need retrieve performance dependency information
+ * from _PSD object. The reason why existing interface is not used
+ * is due to the reason that existing interface sticks to Linux cpu
+ * id to construct some bitmap, however we want to split ACPI
+ * processor objects from Linux cpu id logic. For example, even
+ * when Linux is configured as UP, we still want to parse all ACPI
+ * processor objects to external logic. In this case, it's preferred
+ * to use ACPI ID instead.
+ */
+ pdomain = &pr->performance->domain_info;
+ pdomain->num_processors = 0;
+ ret = acpi_processor_get_psd(pr);
+ if (ret < 0) {
+ /*
+ * _PSD is optional - assume no coordination if absent (or
+ * broken), matching native kernels' behavior.
+ */
+ pdomain->num_entries = ACPI_PSD_REV0_ENTRIES;
+ pdomain->revision = ACPI_PSD_REV0_REVISION;
+ pdomain->domain = pr->acpi_id;
+ pdomain->coord_type = DOMAIN_COORD_TYPE_SW_ALL;
+ pdomain->num_processors = 1;
+ }
+
+ /* Some sanity check */
+ if ((pdomain->revision != ACPI_PSD_REV0_REVISION) ||
+ (pdomain->num_entries != ACPI_PSD_REV0_ENTRIES) ||
+ ((pdomain->coord_type != DOMAIN_COORD_TYPE_SW_ALL) &&
+ (pdomain->coord_type != DOMAIN_COORD_TYPE_SW_ANY) &&
+ (pdomain->coord_type != DOMAIN_COORD_TYPE_HW_ALL))) {
+ ret = -EINVAL;
+ goto err_out;
+ }
+
+ /* Last step is to notify BIOS that external logic exists */
+ processor_notify_smm();
+
+ processor_notify_external(pr, PROCESSOR_PM_INIT, PM_TYPE_PERF);
+
+ return 0;
+err_out:
+ pr->performance = NULL;
+ kfree(perf);
+ return ret;
+}
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index abbe2bb..49ccb84 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -425,6 +425,12 @@ static int acpi_processor_get_power_info_cst(struct acpi_processor *pr)
cx.power = obj->integer.value;
+#ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+ /* cache control methods to notify external logic */
+ if (processor_pm_external())
+ memcpy(&cx.reg, reg, sizeof(*reg));
+#endif /* CONFIG_PROCESSOR_EXTERNAL_CONTROL */
+
current_count++;
memcpy(&(pr->power.states[current_count]), &cx, sizeof(cx));
@@ -1120,6 +1126,13 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr)
if (!pr->flags.power_setup_done)
return -ENODEV;
+ if (processor_pm_external()) {
+ acpi_processor_get_power_info(pr);
+ processor_notify_external(pr,
+ PROCESSOR_PM_CHANGE, PM_TYPE_IDLE);
+ return ret;
+ }
+
cpuidle_pause_and_lock();
cpuidle_disable_device(&pr->power.dev);
acpi_processor_get_power_info(pr);
@@ -1183,9 +1196,14 @@ int __cpuinit acpi_processor_power_init(struct acpi_processor *pr,
* platforms that only support C1.
*/
if (pr->flags.power) {
- acpi_processor_setup_cpuidle(pr);
- if (cpuidle_register_device(&pr->power.dev))
- return -EIO;
+ if (processor_pm_external())
+ processor_notify_external(pr,
+ PROCESSOR_PM_INIT, PM_TYPE_IDLE);
+ else {
+ acpi_processor_setup_cpuidle(pr);
+ if (cpuidle_register_device(&pr->power.dev))
+ return -EIO;
+ }
printk(KERN_INFO PREFIX "CPU%d (power states:", pr->id);
for (i = 1; i <= pr->power.count; i++)
diff --git a/drivers/acpi/processor_perflib.c b/drivers/acpi/processor_perflib.c
index cafb410..b222cdb 100644
--- a/drivers/acpi/processor_perflib.c
+++ b/drivers/acpi/processor_perflib.c
@@ -154,13 +154,16 @@ int acpi_processor_ppc_has_changed(struct acpi_processor *pr)
{
int ret;
- if (ignore_ppc)
+ if (ignore_ppc && !processor_pmperf_external())
return 0;
ret = acpi_processor_get_platform_limit(pr);
if (ret < 0)
return (ret);
+ else if (processor_pmperf_external())
+ return processor_notify_external(pr,
+ PROCESSOR_PM_CHANGE, PM_TYPE_PERF);
else
return cpufreq_update_policy(pr->id);
}
@@ -324,7 +327,7 @@ static int acpi_processor_get_performance_states(struct acpi_processor *pr)
return result;
}
-static int acpi_processor_get_performance_info(struct acpi_processor *pr)
+int acpi_processor_get_performance_info(struct acpi_processor *pr)
{
int result = 0;
acpi_status status = AE_OK;
@@ -426,7 +429,7 @@ int acpi_processor_notify_smm(struct module *calling_module)
EXPORT_SYMBOL(acpi_processor_notify_smm);
-static int acpi_processor_get_psd(struct acpi_processor *pr)
+int acpi_processor_get_psd(struct acpi_processor *pr)
{
int result = 0;
acpi_status status = AE_OK;
diff --git a/include/acpi/processor.h b/include/acpi/processor.h
index b09c4fd..d6bb2d2 100644
--- a/include/acpi/processor.h
+++ b/include/acpi/processor.h
@@ -77,6 +77,10 @@ struct acpi_processor_cx {
struct acpi_processor_cx_policy promotion;
struct acpi_processor_cx_policy demotion;
char desc[ACPI_CX_DESC_LEN];
+#ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+ /* Require raw information for external control logic */
+ struct acpi_power_register reg;
+#endif /* CONFIG_PROCESSOR_EXTERNAL_CONTROL */
};
struct acpi_processor_power {
@@ -295,6 +299,8 @@ static inline void acpi_processor_ffh_cstate_enter(struct acpi_processor_cx
void acpi_processor_ppc_init(void);
void acpi_processor_ppc_exit(void);
int acpi_processor_ppc_has_changed(struct acpi_processor *pr);
+int acpi_processor_get_performance_info(struct acpi_processor *pr);
+int acpi_processor_get_psd(struct acpi_processor *pr);
#else
static inline void acpi_processor_ppc_init(void)
{
@@ -352,4 +358,79 @@ static inline void acpi_thermal_cpufreq_exit(void)
}
#endif
+/*
+ * Following are interfaces geared to external processor PM control
+ * logic like a VMM
+ */
+/* Events notified to external control logic */
+#define PROCESSOR_PM_INIT 1
+#define PROCESSOR_PM_CHANGE 2
+#define PROCESSOR_HOTPLUG 3
+
+/* Objects for the PM events */
+#define PM_TYPE_IDLE 0
+#define PM_TYPE_PERF 1
+#define PM_TYPE_THR 2
+#define PM_TYPE_MAX 3
+
+/* Processor hotplug events */
+#define HOTPLUG_TYPE_ADD 0
+#define HOTPLUG_TYPE_REMOVE 1
+
+#ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+struct processor_extcntl_ops {
+ /* Transfer processor PM events to external control logic */
+int (*pm_ops[PM_TYPE_MAX])(struct acpi_processor *pr, int event);
+ /* Notify physical processor status to external control logic */
+ int (*hotplug)(struct acpi_processor *pr, int type);
+};
+extern const struct processor_extcntl_ops *processor_extcntl_ops;
+
+static inline int processor_cntl_external(void)
+{
+ return (processor_extcntl_ops != NULL);
+}
+
+static inline int processor_pm_external(void)
+{
+ return processor_cntl_external() &&
+ (processor_extcntl_ops->pm_ops[PM_TYPE_IDLE] != NULL);
+}
+
+static inline int processor_pmperf_external(void)
+{
+ return processor_cntl_external() &&
+ (processor_extcntl_ops->pm_ops[PM_TYPE_PERF] != NULL);
+}
+
+static inline int processor_pmthr_external(void)
+{
+ return processor_cntl_external() &&
+ (processor_extcntl_ops->pm_ops[PM_TYPE_THR] != NULL);
+}
+
+extern int processor_notify_external(struct acpi_processor *pr,
+ int event, int type);
+extern void processor_extcntl_register(struct processor_extcntl_ops* ops);
+extern int processor_extcntl_prepare(struct acpi_processor *pr);
+#else
+static inline int processor_cntl_external(void) {return 0;}
+static inline int processor_pm_external(void) {return 0;}
+static inline int processor_pmperf_external(void) {return 0;}
+static inline int processor_pmthr_external(void) {return 0;}
+static inline int processor_notify_external(struct acpi_processor *pr,
+ int event, int type)
+{
+ return 0;
+}
+static inline void processor_extcntl_register(struct processor_extcntl_ops* ops)
+{
+ return 0;
+}
+static inline int processor_extcntl_prepare(struct acpi_processor *pr)
+{
+ return 0;
+}
+#endif /* CONFIG_PROCESSOR_EXTERNAL_CONTROL */
+
#endif
[-- Attachment #2: external-control-framework.patch --]
[-- Type: application/octet-stream, Size: 17666 bytes --]
Introduce the external control operation interface for domain0 ACPI parser
From: Yu Ke <ke.yu@intel.com>
This patch introduces the interface of external control operation, and
adds hooks to the acpi sub-system, including the acpi_processor_driver,
and the related library functions.
=== Overview ===
Requirement: Xen hypervisor need Cx/Px ACPI info to do the Cx/Px states
power management. This info is provided by BIOS ACPI table. Since
hypervisor has no ACPI parser, this info has to be parsed by domain0
kernel ACPI sub-system, and then passed to hypervisor by hypercall.
To make this happen, the key point is to add hook in the kernel ACPI
sub-system. Fortunately, kernel already has good abstraction, and
only several places need to add hook. To be more detail, there is an
acpi_processor_driver (in drivers/acpi/processor_core.c) , which all the
Cx/Px parsing event will go to. This driver will call its acpi processor
event handler, e.g. add/remove, start/stop, notify to handle these
events. These event handlers in turn will call some library functions (in
drivers/acpi/processor_perflib.c), e.g. acpi_processor_ppc_has_changed,
acpi_processor_ppc_has_changed, acpi_processor_cst_has_changed, to finish
the acpi info parsing.
So the conclusion is: adding hooks in acpi_processor_driver and those
related library functions will satisfy our requirement.
To make the added hook cleaner, we introduce an interface called
external control operation (struct processor_extcntl_ops). All the hooks
are encapsulated in this interface processor_extcntl_ops . Here the
"external" means the acpi processor is controlled by external entity,
e.g. VMM. Every kind of external entity can has its implementation of
this interface. In this patch, the interface for Xen is implemented.
Signed-off-by: Yu Ke <ke.yu@intel.com>
Signed-off-by: Tian Kevin <kevin.tian@intel.com>
---
drivers/acpi/Kconfig | 5 +
drivers/acpi/Makefile | 1
drivers/acpi/processor_core.c | 16 +++
drivers/acpi/processor_extcntl.c | 208 ++++++++++++++++++++++++++++++++++++++
drivers/acpi/processor_idle.c | 24 ++++
drivers/acpi/processor_perflib.c | 9 +-
include/acpi/processor.h | 81 +++++++++++++++
7 files changed, 338 insertions(+), 6 deletions(-)
create mode 100644 drivers/acpi/processor_extcntl.c
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index 431f8b4..e932ee6 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -332,4 +332,9 @@ config ACPI_SBS
To compile this driver as a module, choose M here:
the modules will be called sbs and sbshc.
+config PROCESSOR_EXTERNAL_CONTROL
+ bool
+ depends on ACPI_PROCESSOR && CPU_FREQ
+ default y
+
endif # ACPI
diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
index 03a985b..2a42a08 100644
--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -61,3 +61,4 @@ obj-$(CONFIG_ACPI_SBS) += sbs.o
processor-y := processor_core.o processor_throttling.o
processor-y += processor_idle.o processor_thermal.o
processor-$(CONFIG_CPU_FREQ) += processor_perflib.o
+processor-$(CONFIG_PROCESSOR_EXTERNAL_CONTROL) += processor_extcntl.o
diff --git a/drivers/acpi/processor_core.c b/drivers/acpi/processor_core.c
index 45ad328..0b6facc 100644
--- a/drivers/acpi/processor_core.c
+++ b/drivers/acpi/processor_core.c
@@ -740,6 +740,10 @@ static int __cpuinit acpi_processor_start(struct acpi_device *device)
acpi_processor_power_init(pr, device);
+ result = processor_extcntl_prepare(pr);
+ if (result)
+ goto end;
+
pr->cdev = thermal_cooling_device_register("Processor", device,
&processor_cooling_ops);
if (IS_ERR(pr->cdev)) {
@@ -952,6 +956,10 @@ int acpi_processor_device_add(acpi_handle handle, struct acpi_device **device)
if (!pr)
return -ENODEV;
+ if (processor_cntl_external())
+ processor_notify_external(pr,
+ PROCESSOR_HOTPLUG, HOTPLUG_TYPE_ADD);
+
if ((pr->id >= 0) && (pr->id < nr_cpu_ids)) {
kobject_uevent(&(*device)->dev.kobj, KOBJ_ONLINE);
}
@@ -991,11 +999,19 @@ static void __ref acpi_processor_hotplug_notify(acpi_handle handle,
break;
}
+ if (processor_cntl_external())
+ processor_notify_external(pr,
+ PROCESSOR_HOTPLUG, HOTPLUG_TYPE_ADD);
+
if (pr->id >= 0 && (pr->id < nr_cpu_ids)) {
kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
break;
}
+ if (processor_cntl_external())
+ processor_notify_external(pr, PROCESSOR_HOTPLUG,
+ HOTPLUG_TYPE_REMOVE);
+
result = acpi_processor_start(device);
if ((!result) && ((pr->id >= 0) && (pr->id < nr_cpu_ids))) {
kobject_uevent(&device->dev.kobj, KOBJ_ONLINE);
diff --git a/drivers/acpi/processor_extcntl.c b/drivers/acpi/processor_extcntl.c
new file mode 100644
index 0000000..af3191f
--- /dev/null
+++ b/drivers/acpi/processor_extcntl.c
@@ -0,0 +1,208 @@
+/*
+ * processor_extcntl.c - channel to external control logic
+ *
+ * Copyright (C) 2008, Intel corporation
+ *
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/types.h>
+#include <linux/acpi.h>
+#include <linux/pm.h>
+#include <linux/cpu.h>
+
+#include <acpi/processor.h>
+
+#define ACPI_PROCESSOR_CLASS "processor"
+#define ACPI_PROCESSOR_DRIVER_NAME "ACPI Processor Driver"
+#define _COMPONENT ACPI_PROCESSOR_COMPONENT
+ACPI_MODULE_NAME("acpi_processor")
+
+static int processor_extcntl_get_performance(struct acpi_processor *pr);
+/*
+ * External processor control logic may register with its own set of
+ * ops to get ACPI related notification. One example is like VMM.
+ */
+const struct processor_extcntl_ops *processor_extcntl_ops;
+EXPORT_SYMBOL(processor_extcntl_ops);
+
+static int processor_notify_smm(void)
+{
+ acpi_status status;
+ static int is_done = 0;
+
+ /* only need successfully notify BIOS once */
+ /* avoid double notification which may lead to unexpected result */
+ if (is_done)
+ return 0;
+
+ /* Can't write pstate_cnt to smi_cmd if either value is zero */
+ if ((!acpi_gbl_FADT.smi_command) || (!acpi_gbl_FADT.pstate_control)) {
+ ACPI_DEBUG_PRINT((ACPI_DB_INFO,"No SMI port or pstate_cnt\n"));
+ return 0;
+ }
+
+ ACPI_DEBUG_PRINT((ACPI_DB_INFO,
+ "Writing pstate_cnt [0x%x] to smi_cmd [0x%x]\n",
+ acpi_gbl_FADT.pstate_control, acpi_gbl_FADT.smi_command));
+
+ status = acpi_os_write_port(acpi_gbl_FADT.smi_command,
+ (u32) acpi_gbl_FADT.pstate_control, 8);
+ if (ACPI_FAILURE(status))
+ return status;
+
+ is_done = 1;
+
+ return 0;
+}
+
+int processor_notify_external(struct acpi_processor *pr, int event, int type)
+{
+ int ret = -EINVAL;
+
+ if (!processor_cntl_external())
+ return -EINVAL;
+
+ switch (event) {
+ case PROCESSOR_PM_INIT:
+ case PROCESSOR_PM_CHANGE:
+ if ((type >= PM_TYPE_MAX) ||
+ !processor_extcntl_ops->pm_ops[type])
+ break;
+
+ ret = processor_extcntl_ops->pm_ops[type](pr, event);
+ break;
+ case PROCESSOR_HOTPLUG:
+ if (processor_extcntl_ops->hotplug)
+ ret = processor_extcntl_ops->hotplug(pr, type);
+ break;
+ default:
+ printk(KERN_ERR "Unsupport processor events %d.\n", event);
+ break;
+ }
+
+ return ret;
+}
+
+/*
+ * External control logic can decide to grab full or part of physical
+ * processor control bits. Take a VMM for example, physical processors
+ * are owned by VMM and thus existence information like hotplug is
+ * always required to be notified to VMM. Similar is processor idle
+ * state which is also necessarily controlled by VMM. But for other
+ * control bits like performance/throttle states, VMM may choose to
+ * control or not upon its own policy.
+ */
+void processor_extcntl_register(struct processor_extcntl_ops* ops)
+{
+ if (!processor_extcntl_ops)
+ processor_extcntl_ops=ops;
+}
+EXPORT_SYMBOL(processor_extcntl_register);
+
+/*
+ * This is called from ACPI processor init, and targeted to hold
+ * some tricky housekeeping jobs to satisfy external control model.
+ * For example, we may put dependency parse stub here for idle
+ * and performance state. Those information may be not available
+ * if splitting from dom0 control logic like cpufreq driver.
+ */
+int processor_extcntl_prepare(struct acpi_processor *pr)
+{
+
+ /* Initialize performance states */
+ if (processor_pmperf_external())
+ processor_extcntl_get_performance(pr);
+
+ return 0;
+}
+
+/*
+ * Existing ACPI module does parse performance states at some point,
+ * when acpi-cpufreq driver is loaded which however is something
+ * we'd like to disable to avoid confliction with external control
+ * logic. So we have to collect raw performance information here
+ * when ACPI processor object is found and started.
+ */
+static int processor_extcntl_get_performance(struct acpi_processor *pr)
+{
+ int ret;
+ struct acpi_processor_performance *perf;
+ struct acpi_psd_package *pdomain;
+
+ if (pr->performance)
+ return -EBUSY;
+
+ perf = kzalloc(sizeof(struct acpi_processor_performance), GFP_KERNEL);
+ if (!perf)
+ return -ENOMEM;
+
+ pr->performance = perf;
+ /* Get basic performance state information */
+ ret = acpi_processor_get_performance_info(pr);
+ if (ret < 0)
+ goto err_out;
+
+ /*
+ * Well, here we need retrieve performance dependency information
+ * from _PSD object. The reason why existing interface is not used
+ * is due to the reason that existing interface sticks to Linux cpu
+ * id to construct some bitmap, however we want to split ACPI
+ * processor objects from Linux cpu id logic. For example, even
+ * when Linux is configured as UP, we still want to parse all ACPI
+ * processor objects to external logic. In this case, it's preferred
+ * to use ACPI ID instead.
+ */
+ pdomain = &pr->performance->domain_info;
+ pdomain->num_processors = 0;
+ ret = acpi_processor_get_psd(pr);
+ if (ret < 0) {
+ /*
+ * _PSD is optional - assume no coordination if absent (or
+ * broken), matching native kernels' behavior.
+ */
+ pdomain->num_entries = ACPI_PSD_REV0_ENTRIES;
+ pdomain->revision = ACPI_PSD_REV0_REVISION;
+ pdomain->domain = pr->acpi_id;
+ pdomain->coord_type = DOMAIN_COORD_TYPE_SW_ALL;
+ pdomain->num_processors = 1;
+ }
+
+ /* Some sanity check */
+ if ((pdomain->revision != ACPI_PSD_REV0_REVISION) ||
+ (pdomain->num_entries != ACPI_PSD_REV0_ENTRIES) ||
+ ((pdomain->coord_type != DOMAIN_COORD_TYPE_SW_ALL) &&
+ (pdomain->coord_type != DOMAIN_COORD_TYPE_SW_ANY) &&
+ (pdomain->coord_type != DOMAIN_COORD_TYPE_HW_ALL))) {
+ ret = -EINVAL;
+ goto err_out;
+ }
+
+ /* Last step is to notify BIOS that external logic exists */
+ processor_notify_smm();
+
+ processor_notify_external(pr, PROCESSOR_PM_INIT, PM_TYPE_PERF);
+
+ return 0;
+err_out:
+ pr->performance = NULL;
+ kfree(perf);
+ return ret;
+}
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index abbe2bb..49ccb84 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -425,6 +425,12 @@ static int acpi_processor_get_power_info_cst(struct acpi_processor *pr)
cx.power = obj->integer.value;
+#ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+ /* cache control methods to notify external logic */
+ if (processor_pm_external())
+ memcpy(&cx.reg, reg, sizeof(*reg));
+#endif /* CONFIG_PROCESSOR_EXTERNAL_CONTROL */
+
current_count++;
memcpy(&(pr->power.states[current_count]), &cx, sizeof(cx));
@@ -1120,6 +1126,13 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr)
if (!pr->flags.power_setup_done)
return -ENODEV;
+ if (processor_pm_external()) {
+ acpi_processor_get_power_info(pr);
+ processor_notify_external(pr,
+ PROCESSOR_PM_CHANGE, PM_TYPE_IDLE);
+ return ret;
+ }
+
cpuidle_pause_and_lock();
cpuidle_disable_device(&pr->power.dev);
acpi_processor_get_power_info(pr);
@@ -1183,9 +1196,14 @@ int __cpuinit acpi_processor_power_init(struct acpi_processor *pr,
* platforms that only support C1.
*/
if (pr->flags.power) {
- acpi_processor_setup_cpuidle(pr);
- if (cpuidle_register_device(&pr->power.dev))
- return -EIO;
+ if (processor_pm_external())
+ processor_notify_external(pr,
+ PROCESSOR_PM_INIT, PM_TYPE_IDLE);
+ else {
+ acpi_processor_setup_cpuidle(pr);
+ if (cpuidle_register_device(&pr->power.dev))
+ return -EIO;
+ }
printk(KERN_INFO PREFIX "CPU%d (power states:", pr->id);
for (i = 1; i <= pr->power.count; i++)
diff --git a/drivers/acpi/processor_perflib.c b/drivers/acpi/processor_perflib.c
index cafb410..b222cdb 100644
--- a/drivers/acpi/processor_perflib.c
+++ b/drivers/acpi/processor_perflib.c
@@ -154,13 +154,16 @@ int acpi_processor_ppc_has_changed(struct acpi_processor *pr)
{
int ret;
- if (ignore_ppc)
+ if (ignore_ppc && !processor_pmperf_external())
return 0;
ret = acpi_processor_get_platform_limit(pr);
if (ret < 0)
return (ret);
+ else if (processor_pmperf_external())
+ return processor_notify_external(pr,
+ PROCESSOR_PM_CHANGE, PM_TYPE_PERF);
else
return cpufreq_update_policy(pr->id);
}
@@ -324,7 +327,7 @@ static int acpi_processor_get_performance_states(struct acpi_processor *pr)
return result;
}
-static int acpi_processor_get_performance_info(struct acpi_processor *pr)
+int acpi_processor_get_performance_info(struct acpi_processor *pr)
{
int result = 0;
acpi_status status = AE_OK;
@@ -426,7 +429,7 @@ int acpi_processor_notify_smm(struct module *calling_module)
EXPORT_SYMBOL(acpi_processor_notify_smm);
-static int acpi_processor_get_psd(struct acpi_processor *pr)
+int acpi_processor_get_psd(struct acpi_processor *pr)
{
int result = 0;
acpi_status status = AE_OK;
diff --git a/include/acpi/processor.h b/include/acpi/processor.h
index b09c4fd..d6bb2d2 100644
--- a/include/acpi/processor.h
+++ b/include/acpi/processor.h
@@ -77,6 +77,10 @@ struct acpi_processor_cx {
struct acpi_processor_cx_policy promotion;
struct acpi_processor_cx_policy demotion;
char desc[ACPI_CX_DESC_LEN];
+#ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+ /* Require raw information for external control logic */
+ struct acpi_power_register reg;
+#endif /* CONFIG_PROCESSOR_EXTERNAL_CONTROL */
};
struct acpi_processor_power {
@@ -295,6 +299,8 @@ static inline void acpi_processor_ffh_cstate_enter(struct acpi_processor_cx
void acpi_processor_ppc_init(void);
void acpi_processor_ppc_exit(void);
int acpi_processor_ppc_has_changed(struct acpi_processor *pr);
+int acpi_processor_get_performance_info(struct acpi_processor *pr);
+int acpi_processor_get_psd(struct acpi_processor *pr);
#else
static inline void acpi_processor_ppc_init(void)
{
@@ -352,4 +358,79 @@ static inline void acpi_thermal_cpufreq_exit(void)
}
#endif
+/*
+ * Following are interfaces geared to external processor PM control
+ * logic like a VMM
+ */
+/* Events notified to external control logic */
+#define PROCESSOR_PM_INIT 1
+#define PROCESSOR_PM_CHANGE 2
+#define PROCESSOR_HOTPLUG 3
+
+/* Objects for the PM events */
+#define PM_TYPE_IDLE 0
+#define PM_TYPE_PERF 1
+#define PM_TYPE_THR 2
+#define PM_TYPE_MAX 3
+
+/* Processor hotplug events */
+#define HOTPLUG_TYPE_ADD 0
+#define HOTPLUG_TYPE_REMOVE 1
+
+#ifdef CONFIG_PROCESSOR_EXTERNAL_CONTROL
+struct processor_extcntl_ops {
+ /* Transfer processor PM events to external control logic */
+int (*pm_ops[PM_TYPE_MAX])(struct acpi_processor *pr, int event);
+ /* Notify physical processor status to external control logic */
+ int (*hotplug)(struct acpi_processor *pr, int type);
+};
+extern const struct processor_extcntl_ops *processor_extcntl_ops;
+
+static inline int processor_cntl_external(void)
+{
+ return (processor_extcntl_ops != NULL);
+}
+
+static inline int processor_pm_external(void)
+{
+ return processor_cntl_external() &&
+ (processor_extcntl_ops->pm_ops[PM_TYPE_IDLE] != NULL);
+}
+
+static inline int processor_pmperf_external(void)
+{
+ return processor_cntl_external() &&
+ (processor_extcntl_ops->pm_ops[PM_TYPE_PERF] != NULL);
+}
+
+static inline int processor_pmthr_external(void)
+{
+ return processor_cntl_external() &&
+ (processor_extcntl_ops->pm_ops[PM_TYPE_THR] != NULL);
+}
+
+extern int processor_notify_external(struct acpi_processor *pr,
+ int event, int type);
+extern void processor_extcntl_register(struct processor_extcntl_ops* ops);
+extern int processor_extcntl_prepare(struct acpi_processor *pr);
+#else
+static inline int processor_cntl_external(void) {return 0;}
+static inline int processor_pm_external(void) {return 0;}
+static inline int processor_pmperf_external(void) {return 0;}
+static inline int processor_pmthr_external(void) {return 0;}
+static inline int processor_notify_external(struct acpi_processor *pr,
+ int event, int type)
+{
+ return 0;
+}
+static inline void processor_extcntl_register(struct processor_extcntl_ops* ops)
+{
+ return 0;
+}
+static inline int processor_extcntl_prepare(struct acpi_processor *pr)
+{
+ return 0;
+}
+#endif /* CONFIG_PROCESSOR_EXTERNAL_CONTROL */
+
#endif
[-- Attachment #3: ATT00001.txt --]
[-- Type: text/plain, Size: 142 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
^ permalink raw reply related [flat|nested] 17+ messages in thread
* RE: [Xen-devel] [PATCH][pvops_dom0][2/4] Introduce the external control operation interface for domain0 ACPI parser
2009-07-29 2:55 FW: [Xen-devel] [PATCH][pvops_dom0][2/4] Introduce the external control operation interface for domain0 ACPI parser Yu, Ke
@ 2009-07-29 4:14 ` Brown, Len
2009-07-29 6:20 ` Yu, Ke
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: Brown, Len @ 2009-07-29 4:14 UTC (permalink / raw)
To: Yu, Ke; +Cc: linux-acpi@vger.kernel.org, Jeremy Fitzhardinge, Tian, Kevin
So the xen hypervizor is responsible for power management decisions,
but it doesn't know how to talk to the power management
controls in the platform?
Why is that a good idea?
Does somebody expect all this dom0 stuff to really live
in the upstream linux kernel source tree?
I'd like the patch better if you
s/extcntl/xen/ to make it clear why this code exists --
or is there an expected "external control" other than Xen?
-Len
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [Xen-devel] [PATCH][pvops_dom0][2/4] Introduce the external control operation interface for domain0 ACPI parser
2009-07-29 4:14 ` Brown, Len
@ 2009-07-29 6:20 ` Yu, Ke
2009-07-29 16:50 ` Jeremy Fitzhardinge
2009-07-30 15:37 ` Len Brown
2009-07-29 14:47 ` Yu, Ke
2009-07-29 16:43 ` Jeremy Fitzhardinge
2 siblings, 2 replies; 17+ messages in thread
From: Yu, Ke @ 2009-07-29 6:20 UTC (permalink / raw)
To: Brown, Len; +Cc: linux-acpi@vger.kernel.org, Jeremy Fitzhardinge, Tian, Kevin
[-- Attachment #1: Type: text/plain, Size: 1537 bytes --]
<Brown, Len> Wrote:
>
>So the xen hypervizor is responsible for power management decisions,
>but it doesn't know how to talk to the power management
>controls in the platform?
>
>Why is that a good idea?
Your question is reasonable. there is also debate here before. People discuss if it is possible to add acpica stuff to xen hypervisor and let xen control the acpi completely. Unfortunately, this will lead dilemma here, i.e. some devices are controlled by dom0 and also need acpi info, e.g. battery, thermal, etc. and pci hotplug in dom0 is another example. Tian Kevin has detail description on this issue in the attached mail.
On the other hand, the dom0 acpica approach has other benefit, i.e. current linux acpica stuff is pretty mature and has numerous bug fix. Leveraging acpica in linux kernel is more practical.
So I would like to say it is a practical idea rather than a good idea.
>
>Does somebody expect all this dom0 stuff to really live
>in the upstream linux kernel source tree?
There are lots of discussion on the dom0 stuff acceptance, and I did not see decision yet. Anyway, I would like to make this patch as clean as possible, so that it can benefit both xen and kernel.
>
>I'd like the patch better if you
>
>s/extcntl/xen/ to make it clear why this code exists --
>or is there an expected "external control" other than Xen?
>
>-Len
Ok, I can try this. BTW, besides this naming change, do you have other comment on the code? So that I can make it more clean.
Best Regards
Ke
[-- Attachment #2: RE Xen-devel Re PATCH RFC x86acpi don't ignore IO APICsjustbecause there's no local APIC.txt --]
[-- Type: text/plain, Size: 3485 bytes --]
From: xen-devel-bounces@lists.xensource.com on behalf of Tian, Kevin
[kevin.tian@intel.com]
Sent: 2009Äê6ÔÂ20ÈÕÐÇÆÚÁù 16:58
To: Eric W. Biederman; Keir Fraser
Cc: Jeremy Fitzhardinge; Xen-devel; the arch/x86 maintainers; Linux
Kernel Mailing List; Ingo Molnar; Nakajima, Jun; H. Peter Anvin;
Thomas Gleixner; Len Brown
Subject: RE: [Xen-devel] Re: [PATCH RFC] x86/acpi: don't ignore I/O APICs
just because there's no local APIC
Attachments: ATT00001.txt
>From: Eric W. Biederman
>Sent: 2009Äê6ÔÂ20ÈÕ 16:22
>
>Keir Fraser <keir.fraser@eu.citrix.com> writes:
>
>> On 20/06/2009 00:44, "Nakajima, Jun" <jun.nakajima@intel.com> wrote:
>>
>>>> I assume that putting AML into Xen has been considered, but I don't
>>>> anything about those deliberations. Keir? Jun?
>>>>
>>>
>>> Yes, it was one of the options years ago. We did not do
>that because Linux and
>>> Solaris (as dom0) already had the AML interpreter and it's
>overkill and
>>> redundant to have such a large component in the Xen
>hypervisor. Since the
>>> hypervisor does most of the power management (i.e. P, C,
>S-state, etc.)
>>> getting the info from dom0 today, we might want to
>reconsider the option.
>>
>> Yes, we could reconsider. However is there any stuff that
>dom0 remains
>> responsible for (e.g., PCI management, and therefore PCI
>hotplug) where it
>> would continue to need to be OSPM, interpreting certain AML
>objects? In
>> general how safe would it be to have two layered entities
>both playing at
>> being OSPM?
>
>Short of running the oddball acpi based drivers. I'm not familiar with
>any acpi in the pci management.
>
PCIe hotplug is defined well by its own BUS spec. But conventional
PCI hotplug is implemented all kinds of strange things. Some is
through ACPI, and thus by moving ACPI into Xen, a new 'virtual' hotplug
architecture has to be introduced into dom0 Linux. Or Xen needs to
emulate some known interface but as said there's no common standard
for PCI hotplug. What's worse is the docking station support which
contains diverse legacy devices. How Xen pass those legacy device
hotplug events into dom0 Linux become another gray area suffering from
same question like whether IOAPIC needs to be changed for Xen...
Above comes from the exclusive assumption that ACPI is removed
from dom0 by moving into Xen.
Another choice is to have two layered ACPI in both dom0 and Xen
with dom0's ACPI virtualized a bit by Xen. However it's messy as
ACPI encodes most stuff in its own AML encode as a gray box.
Many ACPI methods talk to hardware bits internally even by hard
coded I/O registers. You don't know whether one ACPI event
should be handled by Xen or not, until some AML methods have
been evaluated which then may already consume and change
some device states and not reversible. Then Xen have to emulate
those states when injecting a virtual ACPI event into dom0 as
dom0 ACPI methods need to consume same states. However
automatic generating emulation code for diverse ACPI implementations
to me is far more complex than any discussion here.
So the real trouble is ACPI , which encode all platform bits if
they're not included in any existing BUS spec, such as power,
thermal, processor, battery, PCI routing, hotplug, EC, etc. Some
are owned by dom0 and some by Xen. However ACPI's AML encoding
makes automatic division between two categories really difficult.
Thanks,
Kevin
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [Xen-devel] [PATCH][pvops_dom0][2/4] Introduce the external control operation interface for domain0 ACPI parser
2009-07-29 4:14 ` Brown, Len
2009-07-29 6:20 ` Yu, Ke
@ 2009-07-29 14:47 ` Yu, Ke
2009-07-30 16:29 ` Len Brown
2009-07-29 16:43 ` Jeremy Fitzhardinge
2 siblings, 1 reply; 17+ messages in thread
From: Yu, Ke @ 2009-07-29 14:47 UTC (permalink / raw)
To: Brown, Len; +Cc: linux-acpi@vger.kernel.org, Jeremy Fitzhardinge, Tian, Kevin
[-- Attachment #1: Type: text/plain, Size: 27165 bytes --]
>I'd like the patch better if you
>
>s/extcntl/xen/ to make it clear why this code exists --
>or is there an expected "external control" other than Xen?
>
>-Len
This attached is the revised patch per your suggestion, which move all the external control logic to xen specific file, thus reduce the modification to acpi subsystem to only several places.
== PATCH ==
Leverage domain0 ACPI parser for xen
From: Yu Ke <ke.yu@intel.com>
This patch reuse dom0 ACPI parser to get C/P state for Xen
=== Overview ===
Requirement: Xen hypervisor need Cx/Px ACPI info to do the Cx/Px states
power management. This info is provided by BIOS ACPI table. Since
hypervisor has no ACPI parser, this info has to be parsed by domain0
kernel ACPI sub-system, and then passed to hypervisor by hypercall.
To make this happen, the key point is to add hook in the kernel ACPI
sub-system. Fortunately, kernel already has good abstraction, and
only several places need to add hook. To be more detail, there is an
acpi_processor_driver (in drivers/acpi/processor_core.c) , which all the
Cx/Px parsing event will go to. This driver will call its acpi processor
event handler, e.g. add/remove, start/stop, notify to handle these
events. These event handlers in turn will call some library functions (in
drivers/acpi/processor_perflib.c), e.g. acpi_processor_ppc_has_changed,
acpi_processor_ppc_has_changed, acpi_processor_cst_has_changed, to finish
the acpi info parsing.
So this patch add the xen hook in these places to notify xen for the parsed
Cx/Px state information.
Signed-off-by: Yu Ke <ke.yu@intel.com>
Signed-off-by: Tian Kevin <kevin.tian@intel.com>
---
drivers/acpi/processor_core.c | 17 ++
drivers/acpi/processor_idle.c | 25 ++
drivers/acpi/processor_perflib.c | 10 +
drivers/xen/Kconfig | 7 +
drivers/xen/Makefile | 3
drivers/xen/processor_extcntl.c | 413 ++++++++++++++++++++++++++++++++++++++
include/acpi/processor.h | 6 +
include/xen/acpi.h | 55 +++++
8 files changed, 528 insertions(+), 8 deletions(-)
create mode 100644 drivers/xen/processor_extcntl.c
diff --git a/drivers/acpi/processor_core.c b/drivers/acpi/processor_core.c
index 84e0f3c..2707d65 100644
--- a/drivers/acpi/processor_core.c
+++ b/drivers/acpi/processor_core.c
@@ -58,6 +58,7 @@
#include <acpi/acpi_bus.h>
#include <acpi/acpi_drivers.h>
#include <acpi/processor.h>
+#include <xen/acpi.h>
#define ACPI_PROCESSOR_CLASS "processor"
#define ACPI_PROCESSOR_DEVICE_NAME "Processor"
@@ -751,6 +752,10 @@ static int __cpuinit acpi_processor_start(struct acpi_device *device)
acpi_processor_power_init(pr, device);
+ result = processor_cntl_xen_prepare(pr);
+ if (result)
+ goto end;
+
pr->cdev = thermal_cooling_device_register("Processor", device,
&processor_cooling_ops);
if (IS_ERR(pr->cdev)) {
@@ -963,6 +968,10 @@ int acpi_processor_device_add(acpi_handle handle, struct acpi_device **device)
if (!pr)
return -ENODEV;
+ if (processor_cntl_xen())
+ processor_cntl_xen_notify(pr,
+ PROCESSOR_HOTPLUG, HOTPLUG_TYPE_ADD);
+
if ((pr->id >= 0) && (pr->id < nr_cpu_ids)) {
kobject_uevent(&(*device)->dev.kobj, KOBJ_ONLINE);
}
@@ -1002,11 +1011,19 @@ static void __ref acpi_processor_hotplug_notify(acpi_handle handle,
break;
}
+ if (processor_cntl_xen())
+ processor_cntl_xen_notify(pr,
+ PROCESSOR_HOTPLUG, HOTPLUG_TYPE_ADD);
+
if (pr->id >= 0 && (pr->id < nr_cpu_ids)) {
kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
break;
}
+ if (processor_cntl_xen())
+ processor_cntl_xen_notify(pr, PROCESSOR_HOTPLUG,
+ HOTPLUG_TYPE_REMOVE);
+
result = acpi_processor_start(device);
if ((!result) && ((pr->id >= 0) && (pr->id < nr_cpu_ids))) {
kobject_uevent(&device->dev.kobj, KOBJ_ONLINE);
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 0efa59e..8994aff 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -58,6 +58,7 @@
#include <acpi/acpi_bus.h>
#include <acpi/processor.h>
+#include <xen/acpi.h>
#include <asm/processor.h>
#define ACPI_PROCESSOR_CLASS "processor"
@@ -455,6 +456,12 @@ static int acpi_processor_get_power_info_cst(struct acpi_processor *pr)
cx.power = obj->integer.value;
+#ifdef CONFIG_ACPI_PROCESSOR_XEN
+ /* cache control methods to notify xen*/
+ if (processor_cntl_xen_pm())
+ memcpy(&cx.reg, reg, sizeof(*reg));
+#endif
+
current_count++;
memcpy(&(pr->power.states[current_count]), &cx, sizeof(cx));
@@ -1141,6 +1148,13 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr)
if (!pr->flags.power_setup_done)
return -ENODEV;
+ if (processor_cntl_xen_pm()) {
+ acpi_processor_get_power_info(pr);
+ processor_cntl_xen_notify(pr,
+ PROCESSOR_PM_CHANGE, PM_TYPE_IDLE);
+ return ret;
+ }
+
cpuidle_pause_and_lock();
cpuidle_disable_device(&pr->power.dev);
acpi_processor_get_power_info(pr);
@@ -1204,9 +1218,14 @@ int __cpuinit acpi_processor_power_init(struct acpi_processor *pr,
* platforms that only support C1.
*/
if (pr->flags.power) {
- acpi_processor_setup_cpuidle(pr);
- if (cpuidle_register_device(&pr->power.dev))
- return -EIO;
+ if (processor_cntl_xen_pm())
+ processor_cntl_xen_notify(pr,
+ PROCESSOR_PM_INIT, PM_TYPE_IDLE);
+ else {
+ acpi_processor_setup_cpuidle(pr);
+ if (cpuidle_register_device(&pr->power.dev))
+ return -EIO;
+ }
printk(KERN_INFO PREFIX "CPU%d (power states:", pr->id);
for (i = 1; i <= pr->power.count; i++)
diff --git a/drivers/acpi/processor_perflib.c b/drivers/acpi/processor_perflib.c
index 60e543d..8375075 100644
--- a/drivers/acpi/processor_perflib.c
+++ b/drivers/acpi/processor_perflib.c
@@ -38,6 +38,7 @@
#include <acpi/acpi_bus.h>
#include <acpi/acpi_drivers.h>
#include <acpi/processor.h>
+#include <xen/acpi.h>
#define ACPI_PROCESSOR_CLASS "processor"
#define ACPI_PROCESSOR_FILE_PERFORMANCE "performance"
@@ -154,13 +155,16 @@ int acpi_processor_ppc_has_changed(struct acpi_processor *pr)
{
int ret;
- if (ignore_ppc)
+ if (ignore_ppc && !processor_cntl_xen_pmperf())
return 0;
ret = acpi_processor_get_platform_limit(pr);
if (ret < 0)
return (ret);
+ else if (processor_cntl_xen_pmperf())
+ return processor_cntl_xen_notify(pr,
+ PROCESSOR_PM_CHANGE, PM_TYPE_PERF);
else
return cpufreq_update_policy(pr->id);
}
@@ -330,7 +334,7 @@ static int acpi_processor_get_performance_states(struct acpi_processor *pr)
return result;
}
-static int acpi_processor_get_performance_info(struct acpi_processor *pr)
+int acpi_processor_get_performance_info(struct acpi_processor *pr)
{
int result = 0;
acpi_status status = AE_OK;
@@ -432,7 +436,7 @@ int acpi_processor_notify_smm(struct module *calling_module)
EXPORT_SYMBOL(acpi_processor_notify_smm);
-static int acpi_processor_get_psd(struct acpi_processor *pr)
+int acpi_processor_get_psd(struct acpi_processor *pr)
{
int result = 0;
acpi_status status = AE_OK;
diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index 3b1c421..d303c25 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -90,4 +90,9 @@ config XEN_XENBUS_FRONTEND
config XEN_S3
def_bool y
- depends on XEN_DOM0 && ACPI
\ No newline at end of file
+ depends on XEN_DOM0 && ACPI
+
+config ACPI_PROCESSOR_XEN
+ bool
+ depends on XEN_DOM0 && ACPI_PROCESSOR && CPU_FREQ
+ default y
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 386c775..42c9ace 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -9,4 +9,5 @@ obj-$(CONFIG_XEN_BLKDEV_BACKEND) += blkback/
obj-$(CONFIG_XEN_NETDEV_BACKEND) += netback/
obj-$(CONFIG_XENFS) += xenfs/
obj-$(CONFIG_XEN_SYS_HYPERVISOR) += sys-hypervisor.o
-obj-$(CONFIG_XEN_S3) += acpi.o
\ No newline at end of file
+obj-$(CONFIG_XEN_S3) += acpi.o
+obj-$(CONFIG_ACPI_PROCESSOR_XEN) += processor_extcntl.o
diff --git a/drivers/xen/processor_extcntl.c b/drivers/xen/processor_extcntl.c
new file mode 100644
index 0000000..7307cd8
--- /dev/null
+++ b/drivers/xen/processor_extcntl.c
@@ -0,0 +1,413 @@
+/*
+ * processor_extcntl.c - interface to notify Xen
+ *
+ * Copyright (C) 2008, Intel corporation
+ *
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/types.h>
+#include <linux/acpi.h>
+#include <linux/pm.h>
+#include <linux/cpu.h>
+
+#include <linux/cpufreq.h>
+#include <acpi/processor.h>
+#include <xen/acpi.h>
+
+#include <asm/xen/hypercall.h>
+#include <asm/xen/hypervisor.h>
+
+static int processor_cntl_xen_get_performance(struct acpi_processor *pr);
+static int xen_hotplug_notifier(struct acpi_processor *pr, int event);
+
+static struct processor_cntl_xen_ops xen_ops = {
+ .hotplug = xen_hotplug_notifier,
+};
+
+int processor_cntl_xen(void)
+{
+ return 1;
+}
+
+int processor_cntl_xen_pm(void)
+{
+ return (xen_ops.pm_ops[PM_TYPE_IDLE] != NULL);
+}
+
+int processor_cntl_xen_pmperf(void)
+{
+ return (xen_ops.pm_ops[PM_TYPE_PERF] != NULL);
+}
+
+int processor_cntl_xen_pmthr(void)
+{
+ return (xen_ops.pm_ops[PM_TYPE_THR] != NULL);
+}
+
+static int processor_notify_smm(void)
+{
+ acpi_status status;
+ static int is_done = 0;
+
+ /* only need successfully notify BIOS once */
+ /* avoid double notification which may lead to unexpected result */
+ if (is_done)
+ return 0;
+
+ /* Can't write pstate_cnt to smi_cmd if either value is zero */
+ if ((!acpi_gbl_FADT.smi_command) || (!acpi_gbl_FADT.pstate_control)) {
+ ACPI_DEBUG_PRINT((ACPI_DB_INFO,"No SMI port or pstate_cnt\n"));
+ return 0;
+ }
+
+ ACPI_DEBUG_PRINT((ACPI_DB_INFO,
+ "Writing pstate_cnt [0x%x] to smi_cmd [0x%x]\n",
+ acpi_gbl_FADT.pstate_control, acpi_gbl_FADT.smi_command));
+
+ status = acpi_os_write_port(acpi_gbl_FADT.smi_command,
+ (u32) acpi_gbl_FADT.pstate_control, 8);
+ if (ACPI_FAILURE(status))
+ return status;
+
+ is_done = 1;
+
+ return 0;
+}
+
+int processor_cntl_xen_notify(struct acpi_processor *pr, int event, int type)
+{
+ int ret = -EINVAL;
+
+ switch (event) {
+ case PROCESSOR_PM_INIT:
+ case PROCESSOR_PM_CHANGE:
+ if ((type >= PM_TYPE_MAX) ||
+ !xen_ops.pm_ops[type])
+ break;
+
+ ret = xen_ops.pm_ops[type](pr, event);
+ break;
+ case PROCESSOR_HOTPLUG:
+ if (xen_ops.hotplug)
+ ret = xen_ops.hotplug(pr, type);
+ break;
+ default:
+ printk(KERN_ERR "Unsupport processor events %d.\n", event);
+ break;
+ }
+
+ return ret;
+}
+
+/*
+ * This is called from ACPI processor init, and targeted to hold
+ * some tricky housekeeping jobs to satisfy xen.
+ * For example, we may put dependency parse stub here for idle
+ * and performance state. Those information may be not available
+ * if splitting from dom0 control logic like cpufreq driver.
+ */
+int processor_cntl_xen_prepare(struct acpi_processor *pr)
+{
+
+ /* Initialize performance states */
+ if (processor_cntl_xen_pmperf())
+ processor_cntl_xen_get_performance(pr);
+
+ return 0;
+}
+
+/*
+ * Existing ACPI module does parse performance states at some point,
+ * when acpi-cpufreq driver is loaded which however is something
+ * we'd like to disable to avoid confliction with xen PM
+ * logic. So we have to collect raw performance information here
+ * when ACPI processor object is found and started.
+ */
+static int processor_cntl_xen_get_performance(struct acpi_processor *pr)
+{
+ int ret;
+ struct acpi_processor_performance *perf;
+ struct acpi_psd_package *pdomain;
+
+ if (pr->performance)
+ return -EBUSY;
+
+ perf = kzalloc(sizeof(struct acpi_processor_performance), GFP_KERNEL);
+ if (!perf)
+ return -ENOMEM;
+
+ pr->performance = perf;
+ /* Get basic performance state information */
+ ret = acpi_processor_get_performance_info(pr);
+ if (ret < 0)
+ goto err_out;
+
+ /*
+ * Well, here we need retrieve performance dependency information
+ * from _PSD object. The reason why existing interface is not used
+ * is due to the reason that existing interface sticks to Linux cpu
+ * id to construct some bitmap, however we want to split ACPI
+ * processor objects from Linux cpu id logic. For example, even
+ * when Linux is configured as UP, we still want to parse all ACPI
+ * processor objects to xen. In this case, it's preferred
+ * to use ACPI ID instead.
+ */
+ pdomain = &pr->performance->domain_info;
+ pdomain->num_processors = 0;
+ ret = acpi_processor_get_psd(pr);
+ if (ret < 0) {
+ /*
+ * _PSD is optional - assume no coordination if absent (or
+ * broken), matching native kernels' behavior.
+ */
+ pdomain->num_entries = ACPI_PSD_REV0_ENTRIES;
+ pdomain->revision = ACPI_PSD_REV0_REVISION;
+ pdomain->domain = pr->acpi_id;
+ pdomain->coord_type = DOMAIN_COORD_TYPE_SW_ALL;
+ pdomain->num_processors = 1;
+ }
+
+ /* Some sanity check */
+ if ((pdomain->revision != ACPI_PSD_REV0_REVISION) ||
+ (pdomain->num_entries != ACPI_PSD_REV0_ENTRIES) ||
+ ((pdomain->coord_type != DOMAIN_COORD_TYPE_SW_ALL) &&
+ (pdomain->coord_type != DOMAIN_COORD_TYPE_SW_ANY) &&
+ (pdomain->coord_type != DOMAIN_COORD_TYPE_HW_ALL))) {
+ ret = -EINVAL;
+ goto err_out;
+ }
+
+ /* Last step is to notify BIOS that xen exists */
+ processor_notify_smm();
+
+ processor_cntl_xen_notify(pr, PROCESSOR_PM_INIT, PM_TYPE_PERF);
+
+ return 0;
+err_out:
+ pr->performance = NULL;
+ kfree(perf);
+ return ret;
+}
+
+static inline void xen_convert_pct_reg(struct xen_pct_register *xpct,
+ struct acpi_pct_register *apct)
+{
+ xpct->descriptor = apct->descriptor;
+ xpct->length = apct->length;
+ xpct->space_id = apct->space_id;
+ xpct->bit_width = apct->bit_width;
+ xpct->bit_offset = apct->bit_offset;
+ xpct->reserved = apct->reserved;
+ xpct->address = apct->address;
+}
+
+static inline void xen_convert_pss_states(struct xen_processor_px *xpss,
+ struct acpi_processor_px *apss, int state_count)
+{
+ int i;
+ for(i=0; i<state_count; i++) {
+ xpss->core_frequency = apss->core_frequency;
+ xpss->power = apss->power;
+ xpss->transition_latency = apss->transition_latency;
+ xpss->bus_master_latency = apss->bus_master_latency;
+ xpss->control = apss->control;
+ xpss->status = apss->status;
+ xpss++;
+ apss++;
+ }
+}
+
+static inline void xen_convert_psd_pack(struct xen_psd_package *xpsd,
+ struct acpi_psd_package *apsd)
+{
+ xpsd->num_entries = apsd->num_entries;
+ xpsd->revision = apsd->revision;
+ xpsd->domain = apsd->domain;
+ xpsd->coord_type = apsd->coord_type;
+ xpsd->num_processors = apsd->num_processors;
+}
+
+static int xen_cx_notifier(struct acpi_processor *pr, int action)
+{
+ int ret, count = 0, i;
+ xen_platform_op_t op = {
+ .cmd = XENPF_set_processor_pminfo,
+ .interface_version = XENPF_INTERFACE_VERSION,
+ .u.set_pminfo.id = pr->acpi_id,
+ .u.set_pminfo.type = XEN_PM_CX,
+ };
+ struct xen_processor_cx *data, *buf;
+ struct acpi_processor_cx *cx;
+
+ if (action == PROCESSOR_PM_CHANGE)
+ return -EINVAL;
+
+ /* Convert to Xen defined structure and hypercall */
+ buf = kzalloc(pr->power.count * sizeof(struct xen_processor_cx),
+ GFP_KERNEL);
+ if (!buf)
+ return -ENOMEM;
+
+ data = buf;
+ for (i = 1; i <= pr->power.count; i++) {
+ cx = &pr->power.states[i];
+ /* Skip invalid cstate entry */
+ if (!cx->valid)
+ continue;
+
+ data->type = cx->type;
+ data->latency = cx->latency;
+ data->power = cx->power;
+ data->reg.space_id = cx->reg.space_id;
+ data->reg.bit_width = cx->reg.bit_width;
+ data->reg.bit_offset = cx->reg.bit_offset;
+ data->reg.access_size = cx->reg.reserved;
+ data->reg.address = cx->reg.address;
+
+ /* Get dependency relationships, _CSD is not supported yet */
+ data->dpcnt = 0;
+ set_xen_guest_handle(data->dp, NULL);
+
+ data++;
+ count++;
+ }
+
+ if (!count) {
+ printk("No available Cx info for cpu %d\n", pr->acpi_id);
+ kfree(buf);
+ return -EINVAL;
+ }
+
+ op.u.set_pminfo.power.count = count;
+ op.u.set_pminfo.power.flags.bm_control = pr->flags.bm_control;
+ op.u.set_pminfo.power.flags.bm_check = pr->flags.bm_check;
+ op.u.set_pminfo.power.flags.has_cst = pr->flags.has_cst;
+ op.u.set_pminfo.power.flags.power_setup_done = pr->flags.power_setup_done;
+
+ set_xen_guest_handle(op.u.set_pminfo.power.states, buf);
+ ret = HYPERVISOR_dom0_op(&op);
+ kfree(buf);
+ return ret;
+}
+
+static int xen_px_notifier(struct acpi_processor *pr, int action)
+{
+ int ret = -EINVAL;
+ xen_platform_op_t op = {
+ .cmd = XENPF_set_processor_pminfo,
+ .interface_version = XENPF_INTERFACE_VERSION,
+ .u.set_pminfo.id = pr->acpi_id,
+ .u.set_pminfo.type = XEN_PM_PX,
+ };
+ struct xen_processor_performance *perf;
+ struct xen_processor_px *states = NULL;
+ struct acpi_processor_performance *px;
+ struct acpi_psd_package *pdomain;
+
+ if (!pr)
+ return -EINVAL;
+
+ perf = &op.u.set_pminfo.perf;
+ px = pr->performance;
+
+ switch(action) {
+ case PROCESSOR_PM_CHANGE:
+ /* ppc dynamic handle */
+ perf->flags = XEN_PX_PPC;
+ perf->platform_limit = pr->performance_platform_limit;
+
+ ret = HYPERVISOR_dom0_op(&op);
+ break;
+
+ case PROCESSOR_PM_INIT:
+ /* px normal init */
+ perf->flags = XEN_PX_PPC |
+ XEN_PX_PCT |
+ XEN_PX_PSS |
+ XEN_PX_PSD;
+
+ /* ppc */
+ perf->platform_limit = pr->performance_platform_limit;
+
+ /* pct */
+ xen_convert_pct_reg(&perf->control_register, &px->control_register);
+ xen_convert_pct_reg(&perf->status_register, &px->status_register);
+
+ /* pss */
+ perf->state_count = px->state_count;
+ states = kzalloc(px->state_count*sizeof(xen_processor_px_t),GFP_KERNEL);
+ if (!states)
+ return -ENOMEM;
+ xen_convert_pss_states(states, px->states, px->state_count);
+ set_xen_guest_handle(perf->states, states);
+
+ /* psd */
+ pdomain = &px->domain_info;
+ xen_convert_psd_pack(&perf->domain_info, pdomain);
+ if (pdomain->coord_type == DOMAIN_COORD_TYPE_SW_ALL)
+ perf->shared_type = CPUFREQ_SHARED_TYPE_ALL;
+ else if (pdomain->coord_type == DOMAIN_COORD_TYPE_SW_ANY)
+ perf->shared_type = CPUFREQ_SHARED_TYPE_ANY;
+ else if (pdomain->coord_type == DOMAIN_COORD_TYPE_HW_ALL)
+ perf->shared_type = CPUFREQ_SHARED_TYPE_HW;
+ else {
+ ret = -ENODEV;
+ kfree(states);
+ break;
+ }
+
+ ret = HYPERVISOR_dom0_op(&op);
+ kfree(states);
+ break;
+
+ default:
+ break;
+ }
+
+ return ret;
+}
+
+static int xen_tx_notifier(struct acpi_processor *pr, int action)
+{
+ return -EINVAL;
+}
+static int xen_hotplug_notifier(struct acpi_processor *pr, int event)
+{
+ return -EINVAL;
+}
+
+static int __init xen_acpi_processor_extcntl_init(void)
+{
+ unsigned int pmbits = (xen_start_info->flags & SIF_PM_MASK) >> 8;
+
+ if (!pmbits)
+ return 0;
+ if (pmbits & XEN_PROCESSOR_PM_CX)
+ xen_ops.pm_ops[PM_TYPE_IDLE] = xen_cx_notifier;
+ if (pmbits & XEN_PROCESSOR_PM_PX)
+ xen_ops.pm_ops[PM_TYPE_PERF] = xen_px_notifier;
+ if (pmbits & XEN_PROCESSOR_PM_TX)
+ xen_ops.pm_ops[PM_TYPE_THR] = xen_tx_notifier;
+
+ return 0;
+}
+
+subsys_initcall(xen_acpi_processor_extcntl_init);
diff --git a/include/acpi/processor.h b/include/acpi/processor.h
index baf1e0a..14c7e4c 100644
--- a/include/acpi/processor.h
+++ b/include/acpi/processor.h
@@ -77,6 +77,10 @@ struct acpi_processor_cx {
struct acpi_processor_cx_policy promotion;
struct acpi_processor_cx_policy demotion;
char desc[ACPI_CX_DESC_LEN];
+#ifdef CONFIG_ACPI_PROCESSOR_XEN
+ /* Require raw information for xen*/
+ struct acpi_power_register reg;
+#endif /* CONFIG_ACPI_PROCESSOE_XEN */
};
struct acpi_processor_power {
@@ -295,6 +299,8 @@ static inline void acpi_processor_ffh_cstate_enter(struct acpi_processor_cx
void acpi_processor_ppc_init(void);
void acpi_processor_ppc_exit(void);
int acpi_processor_ppc_has_changed(struct acpi_processor *pr);
+int acpi_processor_get_performance_info(struct acpi_processor *pr);
+int acpi_processor_get_psd(struct acpi_processor *pr);
#else
static inline void acpi_processor_ppc_init(void)
{
diff --git a/include/xen/acpi.h b/include/xen/acpi.h
index fea4cfb..636c3e6 100644
--- a/include/xen/acpi.h
+++ b/include/xen/acpi.h
@@ -20,4 +20,59 @@ static inline bool xen_pv_acpi(void)
int acpi_notify_hypervisor_state(u8 sleep_state,
u32 pm1a_cnt, u32 pm1b_cnd);
+/*
+ * Following are interfaces for xen acpi processor control
+ */
+
+/* Events notified to xen */
+#define PROCESSOR_PM_INIT 1
+#define PROCESSOR_PM_CHANGE 2
+#define PROCESSOR_HOTPLUG 3
+
+/* Objects for the PM events */
+#define PM_TYPE_IDLE 0
+#define PM_TYPE_PERF 1
+#define PM_TYPE_THR 2
+#define PM_TYPE_MAX 3
+
+/* Processor hotplug events */
+#define HOTPLUG_TYPE_ADD 0
+#define HOTPLUG_TYPE_REMOVE 1
+
+#ifdef CONFIG_ACPI_PROCESSOR_XEN
+#include <acpi/acpi_drivers.h>
+#include <acpi/processor.h>
+
+struct processor_cntl_xen_ops {
+ /* Transfer processor PM events to xen */
+int (*pm_ops[PM_TYPE_MAX])(struct acpi_processor *pr, int event);
+ /* Notify physical processor status to xen */
+ int (*hotplug)(struct acpi_processor *pr, int type);
+};
+
+extern int processor_cntl_xen(void);
+extern int processor_cntl_xen_pm(void);
+extern int processor_cntl_xen_pmperf(void);
+extern int processor_cntl_xen_pmthr(void);
+extern int processor_cntl_xen_prepare(struct acpi_processor *pr);
+extern int processor_cntl_xen_notify(struct acpi_processor *pr,
+ int event, int type);
+
+#else
+
+static inline int processor_cntl_xen(void) {return 0;}
+static inline int processor_cntl_xen_pm(void) {return 0;}
+static inline int processor_cntl_xen_pmperf(void) {return 0;}
+static inline int processor_cntl_xen_pmthr(void) {return 0;}
+static inline int processor_cntl_xen_notify(struct acpi_processor *pr,
+ int event, int type)
+{
+ return 0;
+}
+static inline int processor_cntl_xen_prepare(struct acpi_processor *pr)
+{
+ return 0;
+}
+#endif /* CONFIG_ACPI_PROCESSOR_XEN */
+
#endif /* _XEN_ACPI_H */
[-- Attachment #2: 03-introduce-the-external-control.patch.patch --]
[-- Type: application/octet-stream, Size: 22203 bytes --]
Leverage domain0 ACPI parser for xen
From: Yu Ke <ke.yu@intel.com>
This patch reuse dom0 ACPI parser to get C/P state for Xen
=== Overview ===
Requirement: Xen hypervisor need Cx/Px ACPI info to do the Cx/Px states
power management. This info is provided by BIOS ACPI table. Since
hypervisor has no ACPI parser, this info has to be parsed by domain0
kernel ACPI sub-system, and then passed to hypervisor by hypercall.
To make this happen, the key point is to add hook in the kernel ACPI
sub-system. Fortunately, kernel already has good abstraction, and
only several places need to add hook. To be more detail, there is an
acpi_processor_driver (in drivers/acpi/processor_core.c) , which all the
Cx/Px parsing event will go to. This driver will call its acpi processor
event handler, e.g. add/remove, start/stop, notify to handle these
events. These event handlers in turn will call some library functions (in
drivers/acpi/processor_perflib.c), e.g. acpi_processor_ppc_has_changed,
acpi_processor_ppc_has_changed, acpi_processor_cst_has_changed, to finish
the acpi info parsing.
So this patch add the xen hook in these places to notify xen for the parsed
Cx/Px state information.
Signed-off-by: Yu Ke <ke.yu@intel.com>
Signed-off-by: Tian Kevin <kevin.tian@intel.com>
---
drivers/acpi/processor_core.c | 17 ++
drivers/acpi/processor_idle.c | 25 ++
drivers/acpi/processor_perflib.c | 10 +
drivers/xen/Kconfig | 7 +
drivers/xen/Makefile | 3
drivers/xen/processor_extcntl.c | 413 ++++++++++++++++++++++++++++++++++++++
include/acpi/processor.h | 6 +
include/xen/acpi.h | 55 +++++
8 files changed, 528 insertions(+), 8 deletions(-)
create mode 100644 drivers/xen/processor_extcntl.c
diff --git a/drivers/acpi/processor_core.c b/drivers/acpi/processor_core.c
index 84e0f3c..2707d65 100644
--- a/drivers/acpi/processor_core.c
+++ b/drivers/acpi/processor_core.c
@@ -58,6 +58,7 @@
#include <acpi/acpi_bus.h>
#include <acpi/acpi_drivers.h>
#include <acpi/processor.h>
+#include <xen/acpi.h>
#define ACPI_PROCESSOR_CLASS "processor"
#define ACPI_PROCESSOR_DEVICE_NAME "Processor"
@@ -751,6 +752,10 @@ static int __cpuinit acpi_processor_start(struct acpi_device *device)
acpi_processor_power_init(pr, device);
+ result = processor_cntl_xen_prepare(pr);
+ if (result)
+ goto end;
+
pr->cdev = thermal_cooling_device_register("Processor", device,
&processor_cooling_ops);
if (IS_ERR(pr->cdev)) {
@@ -963,6 +968,10 @@ int acpi_processor_device_add(acpi_handle handle, struct acpi_device **device)
if (!pr)
return -ENODEV;
+ if (processor_cntl_xen())
+ processor_cntl_xen_notify(pr,
+ PROCESSOR_HOTPLUG, HOTPLUG_TYPE_ADD);
+
if ((pr->id >= 0) && (pr->id < nr_cpu_ids)) {
kobject_uevent(&(*device)->dev.kobj, KOBJ_ONLINE);
}
@@ -1002,11 +1011,19 @@ static void __ref acpi_processor_hotplug_notify(acpi_handle handle,
break;
}
+ if (processor_cntl_xen())
+ processor_cntl_xen_notify(pr,
+ PROCESSOR_HOTPLUG, HOTPLUG_TYPE_ADD);
+
if (pr->id >= 0 && (pr->id < nr_cpu_ids)) {
kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
break;
}
+ if (processor_cntl_xen())
+ processor_cntl_xen_notify(pr, PROCESSOR_HOTPLUG,
+ HOTPLUG_TYPE_REMOVE);
+
result = acpi_processor_start(device);
if ((!result) && ((pr->id >= 0) && (pr->id < nr_cpu_ids))) {
kobject_uevent(&device->dev.kobj, KOBJ_ONLINE);
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 0efa59e..8994aff 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -58,6 +58,7 @@
#include <acpi/acpi_bus.h>
#include <acpi/processor.h>
+#include <xen/acpi.h>
#include <asm/processor.h>
#define ACPI_PROCESSOR_CLASS "processor"
@@ -455,6 +456,12 @@ static int acpi_processor_get_power_info_cst(struct acpi_processor *pr)
cx.power = obj->integer.value;
+#ifdef CONFIG_ACPI_PROCESSOR_XEN
+ /* cache control methods to notify xen*/
+ if (processor_cntl_xen_pm())
+ memcpy(&cx.reg, reg, sizeof(*reg));
+#endif
+
current_count++;
memcpy(&(pr->power.states[current_count]), &cx, sizeof(cx));
@@ -1141,6 +1148,13 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr)
if (!pr->flags.power_setup_done)
return -ENODEV;
+ if (processor_cntl_xen_pm()) {
+ acpi_processor_get_power_info(pr);
+ processor_cntl_xen_notify(pr,
+ PROCESSOR_PM_CHANGE, PM_TYPE_IDLE);
+ return ret;
+ }
+
cpuidle_pause_and_lock();
cpuidle_disable_device(&pr->power.dev);
acpi_processor_get_power_info(pr);
@@ -1204,9 +1218,14 @@ int __cpuinit acpi_processor_power_init(struct acpi_processor *pr,
* platforms that only support C1.
*/
if (pr->flags.power) {
- acpi_processor_setup_cpuidle(pr);
- if (cpuidle_register_device(&pr->power.dev))
- return -EIO;
+ if (processor_cntl_xen_pm())
+ processor_cntl_xen_notify(pr,
+ PROCESSOR_PM_INIT, PM_TYPE_IDLE);
+ else {
+ acpi_processor_setup_cpuidle(pr);
+ if (cpuidle_register_device(&pr->power.dev))
+ return -EIO;
+ }
printk(KERN_INFO PREFIX "CPU%d (power states:", pr->id);
for (i = 1; i <= pr->power.count; i++)
diff --git a/drivers/acpi/processor_perflib.c b/drivers/acpi/processor_perflib.c
index 60e543d..8375075 100644
--- a/drivers/acpi/processor_perflib.c
+++ b/drivers/acpi/processor_perflib.c
@@ -38,6 +38,7 @@
#include <acpi/acpi_bus.h>
#include <acpi/acpi_drivers.h>
#include <acpi/processor.h>
+#include <xen/acpi.h>
#define ACPI_PROCESSOR_CLASS "processor"
#define ACPI_PROCESSOR_FILE_PERFORMANCE "performance"
@@ -154,13 +155,16 @@ int acpi_processor_ppc_has_changed(struct acpi_processor *pr)
{
int ret;
- if (ignore_ppc)
+ if (ignore_ppc && !processor_cntl_xen_pmperf())
return 0;
ret = acpi_processor_get_platform_limit(pr);
if (ret < 0)
return (ret);
+ else if (processor_cntl_xen_pmperf())
+ return processor_cntl_xen_notify(pr,
+ PROCESSOR_PM_CHANGE, PM_TYPE_PERF);
else
return cpufreq_update_policy(pr->id);
}
@@ -330,7 +334,7 @@ static int acpi_processor_get_performance_states(struct acpi_processor *pr)
return result;
}
-static int acpi_processor_get_performance_info(struct acpi_processor *pr)
+int acpi_processor_get_performance_info(struct acpi_processor *pr)
{
int result = 0;
acpi_status status = AE_OK;
@@ -432,7 +436,7 @@ int acpi_processor_notify_smm(struct module *calling_module)
EXPORT_SYMBOL(acpi_processor_notify_smm);
-static int acpi_processor_get_psd(struct acpi_processor *pr)
+int acpi_processor_get_psd(struct acpi_processor *pr)
{
int result = 0;
acpi_status status = AE_OK;
diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index 3b1c421..d303c25 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -90,4 +90,9 @@ config XEN_XENBUS_FRONTEND
config XEN_S3
def_bool y
- depends on XEN_DOM0 && ACPI
\ No newline at end of file
+ depends on XEN_DOM0 && ACPI
+
+config ACPI_PROCESSOR_XEN
+ bool
+ depends on XEN_DOM0 && ACPI_PROCESSOR && CPU_FREQ
+ default y
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 386c775..42c9ace 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -9,4 +9,5 @@ obj-$(CONFIG_XEN_BLKDEV_BACKEND) += blkback/
obj-$(CONFIG_XEN_NETDEV_BACKEND) += netback/
obj-$(CONFIG_XENFS) += xenfs/
obj-$(CONFIG_XEN_SYS_HYPERVISOR) += sys-hypervisor.o
-obj-$(CONFIG_XEN_S3) += acpi.o
\ No newline at end of file
+obj-$(CONFIG_XEN_S3) += acpi.o
+obj-$(CONFIG_ACPI_PROCESSOR_XEN) += processor_extcntl.o
diff --git a/drivers/xen/processor_extcntl.c b/drivers/xen/processor_extcntl.c
new file mode 100644
index 0000000..7307cd8
--- /dev/null
+++ b/drivers/xen/processor_extcntl.c
@@ -0,0 +1,413 @@
+/*
+ * processor_extcntl.c - interface to notify Xen
+ *
+ * Copyright (C) 2008, Intel corporation
+ *
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/types.h>
+#include <linux/acpi.h>
+#include <linux/pm.h>
+#include <linux/cpu.h>
+
+#include <linux/cpufreq.h>
+#include <acpi/processor.h>
+#include <xen/acpi.h>
+
+#include <asm/xen/hypercall.h>
+#include <asm/xen/hypervisor.h>
+
+static int processor_cntl_xen_get_performance(struct acpi_processor *pr);
+static int xen_hotplug_notifier(struct acpi_processor *pr, int event);
+
+static struct processor_cntl_xen_ops xen_ops = {
+ .hotplug = xen_hotplug_notifier,
+};
+
+int processor_cntl_xen(void)
+{
+ return 1;
+}
+
+int processor_cntl_xen_pm(void)
+{
+ return (xen_ops.pm_ops[PM_TYPE_IDLE] != NULL);
+}
+
+int processor_cntl_xen_pmperf(void)
+{
+ return (xen_ops.pm_ops[PM_TYPE_PERF] != NULL);
+}
+
+int processor_cntl_xen_pmthr(void)
+{
+ return (xen_ops.pm_ops[PM_TYPE_THR] != NULL);
+}
+
+static int processor_notify_smm(void)
+{
+ acpi_status status;
+ static int is_done = 0;
+
+ /* only need successfully notify BIOS once */
+ /* avoid double notification which may lead to unexpected result */
+ if (is_done)
+ return 0;
+
+ /* Can't write pstate_cnt to smi_cmd if either value is zero */
+ if ((!acpi_gbl_FADT.smi_command) || (!acpi_gbl_FADT.pstate_control)) {
+ ACPI_DEBUG_PRINT((ACPI_DB_INFO,"No SMI port or pstate_cnt\n"));
+ return 0;
+ }
+
+ ACPI_DEBUG_PRINT((ACPI_DB_INFO,
+ "Writing pstate_cnt [0x%x] to smi_cmd [0x%x]\n",
+ acpi_gbl_FADT.pstate_control, acpi_gbl_FADT.smi_command));
+
+ status = acpi_os_write_port(acpi_gbl_FADT.smi_command,
+ (u32) acpi_gbl_FADT.pstate_control, 8);
+ if (ACPI_FAILURE(status))
+ return status;
+
+ is_done = 1;
+
+ return 0;
+}
+
+int processor_cntl_xen_notify(struct acpi_processor *pr, int event, int type)
+{
+ int ret = -EINVAL;
+
+ switch (event) {
+ case PROCESSOR_PM_INIT:
+ case PROCESSOR_PM_CHANGE:
+ if ((type >= PM_TYPE_MAX) ||
+ !xen_ops.pm_ops[type])
+ break;
+
+ ret = xen_ops.pm_ops[type](pr, event);
+ break;
+ case PROCESSOR_HOTPLUG:
+ if (xen_ops.hotplug)
+ ret = xen_ops.hotplug(pr, type);
+ break;
+ default:
+ printk(KERN_ERR "Unsupport processor events %d.\n", event);
+ break;
+ }
+
+ return ret;
+}
+
+/*
+ * This is called from ACPI processor init, and targeted to hold
+ * some tricky housekeeping jobs to satisfy xen.
+ * For example, we may put dependency parse stub here for idle
+ * and performance state. Those information may be not available
+ * if splitting from dom0 control logic like cpufreq driver.
+ */
+int processor_cntl_xen_prepare(struct acpi_processor *pr)
+{
+
+ /* Initialize performance states */
+ if (processor_cntl_xen_pmperf())
+ processor_cntl_xen_get_performance(pr);
+
+ return 0;
+}
+
+/*
+ * Existing ACPI module does parse performance states at some point,
+ * when acpi-cpufreq driver is loaded which however is something
+ * we'd like to disable to avoid confliction with xen PM
+ * logic. So we have to collect raw performance information here
+ * when ACPI processor object is found and started.
+ */
+static int processor_cntl_xen_get_performance(struct acpi_processor *pr)
+{
+ int ret;
+ struct acpi_processor_performance *perf;
+ struct acpi_psd_package *pdomain;
+
+ if (pr->performance)
+ return -EBUSY;
+
+ perf = kzalloc(sizeof(struct acpi_processor_performance), GFP_KERNEL);
+ if (!perf)
+ return -ENOMEM;
+
+ pr->performance = perf;
+ /* Get basic performance state information */
+ ret = acpi_processor_get_performance_info(pr);
+ if (ret < 0)
+ goto err_out;
+
+ /*
+ * Well, here we need retrieve performance dependency information
+ * from _PSD object. The reason why existing interface is not used
+ * is due to the reason that existing interface sticks to Linux cpu
+ * id to construct some bitmap, however we want to split ACPI
+ * processor objects from Linux cpu id logic. For example, even
+ * when Linux is configured as UP, we still want to parse all ACPI
+ * processor objects to xen. In this case, it's preferred
+ * to use ACPI ID instead.
+ */
+ pdomain = &pr->performance->domain_info;
+ pdomain->num_processors = 0;
+ ret = acpi_processor_get_psd(pr);
+ if (ret < 0) {
+ /*
+ * _PSD is optional - assume no coordination if absent (or
+ * broken), matching native kernels' behavior.
+ */
+ pdomain->num_entries = ACPI_PSD_REV0_ENTRIES;
+ pdomain->revision = ACPI_PSD_REV0_REVISION;
+ pdomain->domain = pr->acpi_id;
+ pdomain->coord_type = DOMAIN_COORD_TYPE_SW_ALL;
+ pdomain->num_processors = 1;
+ }
+
+ /* Some sanity check */
+ if ((pdomain->revision != ACPI_PSD_REV0_REVISION) ||
+ (pdomain->num_entries != ACPI_PSD_REV0_ENTRIES) ||
+ ((pdomain->coord_type != DOMAIN_COORD_TYPE_SW_ALL) &&
+ (pdomain->coord_type != DOMAIN_COORD_TYPE_SW_ANY) &&
+ (pdomain->coord_type != DOMAIN_COORD_TYPE_HW_ALL))) {
+ ret = -EINVAL;
+ goto err_out;
+ }
+
+ /* Last step is to notify BIOS that xen exists */
+ processor_notify_smm();
+
+ processor_cntl_xen_notify(pr, PROCESSOR_PM_INIT, PM_TYPE_PERF);
+
+ return 0;
+err_out:
+ pr->performance = NULL;
+ kfree(perf);
+ return ret;
+}
+
+static inline void xen_convert_pct_reg(struct xen_pct_register *xpct,
+ struct acpi_pct_register *apct)
+{
+ xpct->descriptor = apct->descriptor;
+ xpct->length = apct->length;
+ xpct->space_id = apct->space_id;
+ xpct->bit_width = apct->bit_width;
+ xpct->bit_offset = apct->bit_offset;
+ xpct->reserved = apct->reserved;
+ xpct->address = apct->address;
+}
+
+static inline void xen_convert_pss_states(struct xen_processor_px *xpss,
+ struct acpi_processor_px *apss, int state_count)
+{
+ int i;
+ for(i=0; i<state_count; i++) {
+ xpss->core_frequency = apss->core_frequency;
+ xpss->power = apss->power;
+ xpss->transition_latency = apss->transition_latency;
+ xpss->bus_master_latency = apss->bus_master_latency;
+ xpss->control = apss->control;
+ xpss->status = apss->status;
+ xpss++;
+ apss++;
+ }
+}
+
+static inline void xen_convert_psd_pack(struct xen_psd_package *xpsd,
+ struct acpi_psd_package *apsd)
+{
+ xpsd->num_entries = apsd->num_entries;
+ xpsd->revision = apsd->revision;
+ xpsd->domain = apsd->domain;
+ xpsd->coord_type = apsd->coord_type;
+ xpsd->num_processors = apsd->num_processors;
+}
+
+static int xen_cx_notifier(struct acpi_processor *pr, int action)
+{
+ int ret, count = 0, i;
+ xen_platform_op_t op = {
+ .cmd = XENPF_set_processor_pminfo,
+ .interface_version = XENPF_INTERFACE_VERSION,
+ .u.set_pminfo.id = pr->acpi_id,
+ .u.set_pminfo.type = XEN_PM_CX,
+ };
+ struct xen_processor_cx *data, *buf;
+ struct acpi_processor_cx *cx;
+
+ if (action == PROCESSOR_PM_CHANGE)
+ return -EINVAL;
+
+ /* Convert to Xen defined structure and hypercall */
+ buf = kzalloc(pr->power.count * sizeof(struct xen_processor_cx),
+ GFP_KERNEL);
+ if (!buf)
+ return -ENOMEM;
+
+ data = buf;
+ for (i = 1; i <= pr->power.count; i++) {
+ cx = &pr->power.states[i];
+ /* Skip invalid cstate entry */
+ if (!cx->valid)
+ continue;
+
+ data->type = cx->type;
+ data->latency = cx->latency;
+ data->power = cx->power;
+ data->reg.space_id = cx->reg.space_id;
+ data->reg.bit_width = cx->reg.bit_width;
+ data->reg.bit_offset = cx->reg.bit_offset;
+ data->reg.access_size = cx->reg.reserved;
+ data->reg.address = cx->reg.address;
+
+ /* Get dependency relationships, _CSD is not supported yet */
+ data->dpcnt = 0;
+ set_xen_guest_handle(data->dp, NULL);
+
+ data++;
+ count++;
+ }
+
+ if (!count) {
+ printk("No available Cx info for cpu %d\n", pr->acpi_id);
+ kfree(buf);
+ return -EINVAL;
+ }
+
+ op.u.set_pminfo.power.count = count;
+ op.u.set_pminfo.power.flags.bm_control = pr->flags.bm_control;
+ op.u.set_pminfo.power.flags.bm_check = pr->flags.bm_check;
+ op.u.set_pminfo.power.flags.has_cst = pr->flags.has_cst;
+ op.u.set_pminfo.power.flags.power_setup_done = pr->flags.power_setup_done;
+
+ set_xen_guest_handle(op.u.set_pminfo.power.states, buf);
+ ret = HYPERVISOR_dom0_op(&op);
+ kfree(buf);
+ return ret;
+}
+
+static int xen_px_notifier(struct acpi_processor *pr, int action)
+{
+ int ret = -EINVAL;
+ xen_platform_op_t op = {
+ .cmd = XENPF_set_processor_pminfo,
+ .interface_version = XENPF_INTERFACE_VERSION,
+ .u.set_pminfo.id = pr->acpi_id,
+ .u.set_pminfo.type = XEN_PM_PX,
+ };
+ struct xen_processor_performance *perf;
+ struct xen_processor_px *states = NULL;
+ struct acpi_processor_performance *px;
+ struct acpi_psd_package *pdomain;
+
+ if (!pr)
+ return -EINVAL;
+
+ perf = &op.u.set_pminfo.perf;
+ px = pr->performance;
+
+ switch(action) {
+ case PROCESSOR_PM_CHANGE:
+ /* ppc dynamic handle */
+ perf->flags = XEN_PX_PPC;
+ perf->platform_limit = pr->performance_platform_limit;
+
+ ret = HYPERVISOR_dom0_op(&op);
+ break;
+
+ case PROCESSOR_PM_INIT:
+ /* px normal init */
+ perf->flags = XEN_PX_PPC |
+ XEN_PX_PCT |
+ XEN_PX_PSS |
+ XEN_PX_PSD;
+
+ /* ppc */
+ perf->platform_limit = pr->performance_platform_limit;
+
+ /* pct */
+ xen_convert_pct_reg(&perf->control_register, &px->control_register);
+ xen_convert_pct_reg(&perf->status_register, &px->status_register);
+
+ /* pss */
+ perf->state_count = px->state_count;
+ states = kzalloc(px->state_count*sizeof(xen_processor_px_t),GFP_KERNEL);
+ if (!states)
+ return -ENOMEM;
+ xen_convert_pss_states(states, px->states, px->state_count);
+ set_xen_guest_handle(perf->states, states);
+
+ /* psd */
+ pdomain = &px->domain_info;
+ xen_convert_psd_pack(&perf->domain_info, pdomain);
+ if (pdomain->coord_type == DOMAIN_COORD_TYPE_SW_ALL)
+ perf->shared_type = CPUFREQ_SHARED_TYPE_ALL;
+ else if (pdomain->coord_type == DOMAIN_COORD_TYPE_SW_ANY)
+ perf->shared_type = CPUFREQ_SHARED_TYPE_ANY;
+ else if (pdomain->coord_type == DOMAIN_COORD_TYPE_HW_ALL)
+ perf->shared_type = CPUFREQ_SHARED_TYPE_HW;
+ else {
+ ret = -ENODEV;
+ kfree(states);
+ break;
+ }
+
+ ret = HYPERVISOR_dom0_op(&op);
+ kfree(states);
+ break;
+
+ default:
+ break;
+ }
+
+ return ret;
+}
+
+static int xen_tx_notifier(struct acpi_processor *pr, int action)
+{
+ return -EINVAL;
+}
+static int xen_hotplug_notifier(struct acpi_processor *pr, int event)
+{
+ return -EINVAL;
+}
+
+static int __init xen_acpi_processor_extcntl_init(void)
+{
+ unsigned int pmbits = (xen_start_info->flags & SIF_PM_MASK) >> 8;
+
+ if (!pmbits)
+ return 0;
+ if (pmbits & XEN_PROCESSOR_PM_CX)
+ xen_ops.pm_ops[PM_TYPE_IDLE] = xen_cx_notifier;
+ if (pmbits & XEN_PROCESSOR_PM_PX)
+ xen_ops.pm_ops[PM_TYPE_PERF] = xen_px_notifier;
+ if (pmbits & XEN_PROCESSOR_PM_TX)
+ xen_ops.pm_ops[PM_TYPE_THR] = xen_tx_notifier;
+
+ return 0;
+}
+
+subsys_initcall(xen_acpi_processor_extcntl_init);
diff --git a/include/acpi/processor.h b/include/acpi/processor.h
index baf1e0a..14c7e4c 100644
--- a/include/acpi/processor.h
+++ b/include/acpi/processor.h
@@ -77,6 +77,10 @@ struct acpi_processor_cx {
struct acpi_processor_cx_policy promotion;
struct acpi_processor_cx_policy demotion;
char desc[ACPI_CX_DESC_LEN];
+#ifdef CONFIG_ACPI_PROCESSOR_XEN
+ /* Require raw information for xen*/
+ struct acpi_power_register reg;
+#endif /* CONFIG_ACPI_PROCESSOE_XEN */
};
struct acpi_processor_power {
@@ -295,6 +299,8 @@ static inline void acpi_processor_ffh_cstate_enter(struct acpi_processor_cx
void acpi_processor_ppc_init(void);
void acpi_processor_ppc_exit(void);
int acpi_processor_ppc_has_changed(struct acpi_processor *pr);
+int acpi_processor_get_performance_info(struct acpi_processor *pr);
+int acpi_processor_get_psd(struct acpi_processor *pr);
#else
static inline void acpi_processor_ppc_init(void)
{
diff --git a/include/xen/acpi.h b/include/xen/acpi.h
index fea4cfb..636c3e6 100644
--- a/include/xen/acpi.h
+++ b/include/xen/acpi.h
@@ -20,4 +20,59 @@ static inline bool xen_pv_acpi(void)
int acpi_notify_hypervisor_state(u8 sleep_state,
u32 pm1a_cnt, u32 pm1b_cnd);
+/*
+ * Following are interfaces for xen acpi processor control
+ */
+
+/* Events notified to xen */
+#define PROCESSOR_PM_INIT 1
+#define PROCESSOR_PM_CHANGE 2
+#define PROCESSOR_HOTPLUG 3
+
+/* Objects for the PM events */
+#define PM_TYPE_IDLE 0
+#define PM_TYPE_PERF 1
+#define PM_TYPE_THR 2
+#define PM_TYPE_MAX 3
+
+/* Processor hotplug events */
+#define HOTPLUG_TYPE_ADD 0
+#define HOTPLUG_TYPE_REMOVE 1
+
+#ifdef CONFIG_ACPI_PROCESSOR_XEN
+#include <acpi/acpi_drivers.h>
+#include <acpi/processor.h>
+
+struct processor_cntl_xen_ops {
+ /* Transfer processor PM events to xen */
+int (*pm_ops[PM_TYPE_MAX])(struct acpi_processor *pr, int event);
+ /* Notify physical processor status to xen */
+ int (*hotplug)(struct acpi_processor *pr, int type);
+};
+
+extern int processor_cntl_xen(void);
+extern int processor_cntl_xen_pm(void);
+extern int processor_cntl_xen_pmperf(void);
+extern int processor_cntl_xen_pmthr(void);
+extern int processor_cntl_xen_prepare(struct acpi_processor *pr);
+extern int processor_cntl_xen_notify(struct acpi_processor *pr,
+ int event, int type);
+
+#else
+
+static inline int processor_cntl_xen(void) {return 0;}
+static inline int processor_cntl_xen_pm(void) {return 0;}
+static inline int processor_cntl_xen_pmperf(void) {return 0;}
+static inline int processor_cntl_xen_pmthr(void) {return 0;}
+static inline int processor_cntl_xen_notify(struct acpi_processor *pr,
+ int event, int type)
+{
+ return 0;
+}
+static inline int processor_cntl_xen_prepare(struct acpi_processor *pr)
+{
+ return 0;
+}
+#endif /* CONFIG_ACPI_PROCESSOR_XEN */
+
#endif /* _XEN_ACPI_H */
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [Xen-devel] [PATCH][pvops_dom0][2/4] Introduce the external control operation interface for domain0 ACPI parser
2009-07-29 4:14 ` Brown, Len
2009-07-29 6:20 ` Yu, Ke
2009-07-29 14:47 ` Yu, Ke
@ 2009-07-29 16:43 ` Jeremy Fitzhardinge
2009-07-30 8:59 ` Yu, Ke
2 siblings, 1 reply; 17+ messages in thread
From: Jeremy Fitzhardinge @ 2009-07-29 16:43 UTC (permalink / raw)
To: Brown, Len; +Cc: Yu, Ke, linux-acpi@vger.kernel.org, Tian, Kevin
On 07/28/09 21:14, Brown, Len wrote:
> Does somebody expect all this dom0 stuff to really live
> in the upstream linux kernel source tree?
>
The control domain (dom0) patches are held up by a very specific set of
concerns that I'm working on now. I don't think there's any fundamental
reason it won't make it in at some point, and there's no reason not to
lay the groundwork in the meantime.
These acpi patches are very useful, but not essential functionality.
I'd like to see them evolve along lines that are acceptable to everyone
before making a serious push upstream.
> s/extcntl/xen/ to make it clear why this code exists --
> or is there an expected "external control" other than Xen?
>
I dislike making Xen-specific changes. Ideally we can find a way to fit
these changes into some other abstraction which already exists, or if
added would be useful to more than one user.
J
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Xen-devel] [PATCH][pvops_dom0][2/4] Introduce the external control operation interface for domain0 ACPI parser
2009-07-29 6:20 ` Yu, Ke
@ 2009-07-29 16:50 ` Jeremy Fitzhardinge
2009-07-30 9:18 ` Yu, Ke
2009-07-30 15:37 ` Len Brown
1 sibling, 1 reply; 17+ messages in thread
From: Jeremy Fitzhardinge @ 2009-07-29 16:50 UTC (permalink / raw)
To: Yu, Ke; +Cc: Brown, Len, linux-acpi@vger.kernel.org, Tian, Kevin
On 07/28/09 23:20, Yu, Ke wrote:
> Your question is reasonable. there is also debate here before. People discuss if it is possible to add acpica stuff to xen hypervisor and let xen control the acpi completely. Unfortunately, this will lead dilemma here, i.e. some devices are controlled by dom0 and also need acpi info, e.g. battery, thermal, etc. and pci hotplug in dom0 is another example. Tian Kevin has detail description on this issue in the attached mail.
>
What would happen if we special-cased dom0 VCPUs to be bound 1:1 to
PCPUs. Would that simplify this stuff? Wouldn't that make CPU control
equivalent to battery, fan, etc?
> On the other hand, the dom0 acpica approach has other benefit, i.e. current linux acpica stuff is pretty mature and has numerous bug fix. Leveraging acpica in linux kernel is more practical.
>
My understanding is that all that code is supposed to be
kernel-independent. We could lift all that code as-is, write some new
kernel interface shims and shove it all into Xen. But I don't know if
that solves any more problems than it causes.
>> s/extcntl/xen/ to make it clear why this code exists --
>> or is there an expected "external control" other than Xen?
>>
>> -Len
>>
>
> Ok, I can try this. BTW, besides this naming change, do you have other comment on the code? So that I can make it more clean.
>
I don't think that's a good idea. If we need to do that, then there's a
deeper design problem we need to address. (And, if nothing else, people
have become hypersensitive to naked Xen-specific code references in
non-Xen files, and I'm tired of having those arguments.)
J
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [Xen-devel] [PATCH][pvops_dom0][2/4] Introduce the external control operation interface for domain0 ACPI parser
2009-07-29 16:43 ` Jeremy Fitzhardinge
@ 2009-07-30 8:59 ` Yu, Ke
2009-07-30 15:03 ` Brown, Len
0 siblings, 1 reply; 17+ messages in thread
From: Yu, Ke @ 2009-07-30 8:59 UTC (permalink / raw)
To: Jeremy Fitzhardinge, Brown, Len; +Cc: linux-acpi@vger.kernel.org, Tian, Kevin
>> s/extcntl/xen/ to make it clear why this code exists --
>> or is there an expected "external control" other than Xen?
>>
>
>I dislike making Xen-specific changes. Ideally we can find a way to fit
>these changes into some other abstraction which already exists, or if
>added would be useful to more than one user.
>
> J
The "external logic" is the way we trying to make it a generic abstraction, but right now I cannot told who would be the external entity except xen. Maybe in the future if other software want to leverage linux acpica, they would also use it.
If people has other idea on the abstraction, I would be happy to hear that.
BTW, Len, do you think the hook this patch adds to the acpi subsystem is OK, regardless its name being "xen" or "extcntl"?
Best Regards
Ke
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [Xen-devel] [PATCH][pvops_dom0][2/4] Introduce the external control operation interface for domain0 ACPI parser
2009-07-29 16:50 ` Jeremy Fitzhardinge
@ 2009-07-30 9:18 ` Yu, Ke
2009-07-30 16:00 ` Len Brown
2009-07-30 17:23 ` Jeremy Fitzhardinge
0 siblings, 2 replies; 17+ messages in thread
From: Yu, Ke @ 2009-07-30 9:18 UTC (permalink / raw)
To: Jeremy Fitzhardinge; +Cc: Brown, Len, linux-acpi@vger.kernel.org, Tian, Kevin
>-----Original Message-----
>From: Jeremy Fitzhardinge [mailto:jeremy@goop.org]
>Sent: Thursday, July 30, 2009 12:50 AM
>To: Yu, Ke
>Cc: Brown, Len; linux-acpi@vger.kernel.org; Tian, Kevin
>Subject: Re: [Xen-devel] [PATCH][pvops_dom0][2/4] Introduce the external
>control operation interface for domain0 ACPI parser
>
>On 07/28/09 23:20, Yu, Ke wrote:
>> Your question is reasonable. there is also debate here before. People discuss if
>it is possible to add acpica stuff to xen hypervisor and let xen control the acpi
>completely. Unfortunately, this will lead dilemma here, i.e. some devices are
>controlled by dom0 and also need acpi info, e.g. battery, thermal, etc. and pci
>hotplug in dom0 is another example. Tian Kevin has detail description on this
>issue in the attached mail.
>>
>
>What would happen if we special-cased dom0 VCPUs to be bound 1:1 to
>PCPUs. Would that simplify this stuff? Wouldn't that make CPU control
>equivalent to battery, fan, etc?
It helps in current approach, i.e. acpi dsdt table is owned and parsed by dom0. Actually, Tx state support in current approach need 1:1 vcpu-pcpu binding.
Meanwhile, in current approach, we still need to let dom0 passing acpi info to xen, which is what this patch intend to do.
>
>> On the other hand, the dom0 acpica approach has other benefit, i.e. current
>linux acpica stuff is pretty mature and has numerous bug fix. Leveraging acpica in
>linux kernel is more practical.
>>
>
>My understanding is that all that code is supposed to be
>kernel-independent. We could lift all that code as-is, write some new
>kernel interface shims and shove it all into Xen. But I don't know if
>that solves any more problems than it causes.
>
>>> s/extcntl/xen/ to make it clear why this code exists --
>>> or is there an expected "external control" other than Xen?
>>>
>>> -Len
>>>
>>
>> Ok, I can try this. BTW, besides this naming change, do you have other
>comment on the code? So that I can make it more clean.
>>
>
>I don't think that's a good idea. If we need to do that, then there's a
>deeper design problem we need to address. (And, if nothing else, people
>have become hypersensitive to naked Xen-specific code references in
>non-Xen files, and I'm tired of having those arguments.)
>
> J
Can you explain what the deeper design problem is?
Best Regards
Ke
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [Xen-devel] [PATCH][pvops_dom0][2/4] Introduce the external control operation interface for domain0 ACPI parser
2009-07-30 8:59 ` Yu, Ke
@ 2009-07-30 15:03 ` Brown, Len
0 siblings, 0 replies; 17+ messages in thread
From: Brown, Len @ 2009-07-30 15:03 UTC (permalink / raw)
To: Yu, Ke, Jeremy Fitzhardinge; +Cc: linux-acpi@vger.kernel.org, Tian, Kevin
>>> s/extcntl/xen/ to make it clear why this code exists --
>>> or is there an expected "external control" other than Xen?
>>>
>>
>>I dislike making Xen-specific changes. Ideally we can find a
>way to fit
>>these changes into some other abstraction which already exists, or if
>>added would be useful to more than one user.
>
>
>The "external logic" is the way we trying to make it a generic
>abstraction, but right now I cannot told who would be the
>external entity except xen. Maybe in the future if other
>software want to leverage linux acpica, they would also use it.
>
>If people has other idea on the abstraction, I would be happy
>to hear that.
>
>BTW, Len, do you think the hook this patch adds to the acpi
>subsystem is OK, regardless its name being "xen" or "extcntl"?
Call it xen when xen is the only user.
When you have a 2nd user, call it something more general.
-Len
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [Xen-devel] [PATCH][pvops_dom0][2/4] Introduce the external control operation interface for domain0 ACPI parser
2009-07-29 6:20 ` Yu, Ke
2009-07-29 16:50 ` Jeremy Fitzhardinge
@ 2009-07-30 15:37 ` Len Brown
2009-07-30 20:52 ` Jeremy Fitzhardinge
1 sibling, 1 reply; 17+ messages in thread
From: Len Brown @ 2009-07-30 15:37 UTC (permalink / raw)
To: Yu, Ke; +Cc: linux-acpi@vger.kernel.org, Jeremy Fitzhardinge, Tian, Kevin
> >So the xen hypervizor is responsible for power management decisions,
> >but it doesn't know how to talk to the power management
> >controls in the platform?
> >
> >Why is that a good idea?
>
> Your question is reasonable. there is also debate here before. People discuss if it is possible to add acpica stuff to xen hypervisor and let xen control the acpi completely. Unfortunately, this will lead dilemma here, i.e. some devices are controlled by dom0 and also need acpi info, e.g. battery, thermal, etc. and pci hotplug in dom0 is another example. Tian Kevin has detail description on this issue in the attached mail.
>
> On the other hand, the dom0 acpica approach has other benefit, i.e. current linux acpica stuff is pretty mature and has numerous bug fix. Leveraging acpica in linux kernel is more practical.
I agree with Kevin that it would be a mistake to put ACPI both into
both dom0 and the hypervisor. Frankly, on many levels, ACPI was
designed with Windows in mind, and the further an OS strays from
how Windows does things, the less likely you'll run well on many
systems. Obviously, Xen looks nothing like windows, or any other OS,
for it seems to have not one division between implementation and
policy, but multiple...
So I have a fundamental lack of understanding of the logic
behind the partitioning behind the hypervisor and dom0.
Maybe somebody explain it to me in terms that I'll understand?
It reminds me of the partitioning between the Mach microkernel
and the "user-space OS personality, eg Unix". This looked really neat
in proposals for funding and academic papers, but in reality it turned
out to have little value other than employing programmers to
re-invent the wheel, only to discover that the original round
wheel was better than the square one that they produced...
-Len Brown, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [Xen-devel] [PATCH][pvops_dom0][2/4] Introduce the external control operation interface for domain0 ACPI parser
2009-07-30 9:18 ` Yu, Ke
@ 2009-07-30 16:00 ` Len Brown
2009-07-30 20:36 ` Jeremy Fitzhardinge
2009-07-30 17:23 ` Jeremy Fitzhardinge
1 sibling, 1 reply; 17+ messages in thread
From: Len Brown @ 2009-07-30 16:00 UTC (permalink / raw)
To: Yu, Ke; +Cc: Jeremy Fitzhardinge, linux-acpi@vger.kernel.org, Tian, Kevin
> >My understanding is that all that code is supposed to be
> >kernel-independent. We could lift all that code as-is, write some new
> >kernel interface shims and shove it all into Xen. But I don't know if
> >that solves any more problems than it causes.
Many have incorporated ACPICA into their OS, from BeOS to BSD,
Solaris to Linux. And I'm not going to tell you that you can't
do the same and make the xen hypervisor into an OS that knows
about both policy and the hardware it is running on.
However, ACPI != all the ACPI code in Linux. ACPICA is the
stuff in the drivers/acpi/acpica directory, and nothing else
(asside from a few header files outside that directory)
Also, I'm not sure the OS you build will be competitive with
other OS's when you're done.
> >>> s/extcntl/xen/ to make it clear why this code exists --
> >>> or is there an expected "external control" other than Xen?
...
> >I don't think that's a good idea. If we need to do that, then there's a
> >deeper design problem we need to address. (And, if nothing else, people
> >have become hypersensitive to naked Xen-specific code references in
> >non-Xen files, and I'm tired of having those arguments.)
>
> Can you explain what the deeper design problem is?
Obfuscating the code by calling a Xen-specific abstraction
"extcntl" instead of "xen" will not fly.
Jeremy's desire to create and use generic abstractions that have multliple
users is a good thing. It simply does not apply here.
I have no objection to having a xen-specific hook being called xen_*
To do otherwise would be a dis-service to future maintainers of the code.
thanks,
Len Brown, Intel Open Source Technology Center.
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [Xen-devel] [PATCH][pvops_dom0][2/4] Introduce the external control operation interface for domain0 ACPI parser
2009-07-29 14:47 ` Yu, Ke
@ 2009-07-30 16:29 ` Len Brown
2009-07-30 22:04 ` Jeremy Fitzhardinge
2009-07-31 8:05 ` Yu, Ke
0 siblings, 2 replies; 17+ messages in thread
From: Len Brown @ 2009-07-30 16:29 UTC (permalink / raw)
To: Yu, Ke; +Cc: linux-acpi@vger.kernel.org, Jeremy Fitzhardinge, Tian, Kevin
Unclear that the power management partitioning between xen hypervisor
and dom0 is fully baked.
Uncear (to me) what xen is doing internally with these power management
objects, and how that differs from what Linux would do.
While patches to the Linux kernel may be a good RFE, prototype, or base
for discussion, the unknowns above need to be addressed to before it
makes much sense to spent a large amount of time on the source.
some things did jump out of the patch, however...
I do not recommend believing _PSD.
Our experience is that 50% of the time it is crap.
Why does xen_processor_px exists when it is the same as acpi_processor_px?
ditto for acpi_processor_cx and xen_processor_cx
Lose the ifdefs.
Lose the tests for xen on the inline code when
they can be inside the called routines.
This doesn't look like an abstraction layer,
it looks more like a simple conduit.
The thing at the other end (xen) will need to know
just as much about these data structures as the
thing that sent them (linux)
the patch doesn't apply to the upstream kernel --
even the ACPI specific parts.
checkpatch.pl:
total: 168 errors, 42 warnings, 643 lines checked
Don't send me another patch that can't pass checkpatch.pl,
even if just an RFC.
thanks,
Len Brown, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Xen-devel] [PATCH][pvops_dom0][2/4] Introduce the external control operation interface for domain0 ACPI parser
2009-07-30 9:18 ` Yu, Ke
2009-07-30 16:00 ` Len Brown
@ 2009-07-30 17:23 ` Jeremy Fitzhardinge
1 sibling, 0 replies; 17+ messages in thread
From: Jeremy Fitzhardinge @ 2009-07-30 17:23 UTC (permalink / raw)
To: Yu, Ke; +Cc: Brown, Len, linux-acpi@vger.kernel.org, Tian, Kevin
On 07/30/09 02:18, Yu, Ke wrote:
> Can you explain what the deeper design problem is?
>
I don't know. It may be that ACPI and Xen-style virtualization
fundimentally have designs which can't easily co-exist, and it is
unavoidable. I'm simply saying there appears to be a problem because
the need to put a Xen-specific hook in a piece of generic code is
generally a symptom of a design problem.
J
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Xen-devel] [PATCH][pvops_dom0][2/4] Introduce the external control operation interface for domain0 ACPI parser
2009-07-30 16:00 ` Len Brown
@ 2009-07-30 20:36 ` Jeremy Fitzhardinge
0 siblings, 0 replies; 17+ messages in thread
From: Jeremy Fitzhardinge @ 2009-07-30 20:36 UTC (permalink / raw)
To: Len Brown; +Cc: Yu, Ke, linux-acpi@vger.kernel.org, Tian, Kevin
On 07/30/09 09:00, Len Brown wrote:
>>> My understanding is that all that code is supposed to be
>>> kernel-independent. We could lift all that code as-is, write some new
>>> kernel interface shims and shove it all into Xen. But I don't know if
>>> that solves any more problems than it causes.
>>>
>
> Many have incorporated ACPICA into their OS, from BeOS to BSD,
> Solaris to Linux. And I'm not going to tell you that you can't
> do the same and make the xen hypervisor into an OS that knows
> about both policy and the hardware it is running on.
>
> However, ACPI != all the ACPI code in Linux. ACPICA is the
> stuff in the drivers/acpi/acpica directory, and nothing else
> (asside from a few header files outside that directory)
>
> Also, I'm not sure the OS you build will be competitive with
> other OS's when you're done.
>
Yes. The general intent is that Xen tries to avoid having to
reimplement stuff that the guest operating systems are already much
better at. However, it seems that ACPI's
>
>>>>> s/extcntl/xen/ to make it clear why this code exists --
>>>>> or is there an expected "external control" other than Xen?
>>>>>
> ...
>
>>> I don't think that's a good idea. If we need to do that, then there's a
>>> deeper design problem we need to address. (And, if nothing else, people
>>> have become hypersensitive to naked Xen-specific code references in
>>> non-Xen files, and I'm tired of having those arguments.)
>>>
>> Can you explain what the deeper design problem is?
>>
>
> Obfuscating the code by calling a Xen-specific abstraction
> "extcntl" instead of "xen" will not fly.
>
> Jeremy's desire to create and use generic abstractions that have multliple
> users is a good thing. It simply does not apply here.
>
OK. I don't really understand ACPIs overall design or philosophy,
beyond a rough idea of what kinds of things it gets used for. If there
really is no scope for refactoring things to make Ke's changes fit in
more naturally, then I guess we can live with some explicit hooks.
> I have no objection to having a xen-specific hook being called xen_*
> To do otherwise would be a dis-service to future maintainers of the code.
>
I agree completely. I have no interest in extra layers of
indirection/"abstraction" which are just disguises rather than having
some inherent value.
J
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Xen-devel] [PATCH][pvops_dom0][2/4] Introduce the external control operation interface for domain0 ACPI parser
2009-07-30 15:37 ` Len Brown
@ 2009-07-30 20:52 ` Jeremy Fitzhardinge
0 siblings, 0 replies; 17+ messages in thread
From: Jeremy Fitzhardinge @ 2009-07-30 20:52 UTC (permalink / raw)
To: Len Brown; +Cc: Yu, Ke, linux-acpi@vger.kernel.org, Tian, Kevin, Xen-devel
On 07/30/09 08:37, Len Brown wrote:
> I agree with Kevin that it would be a mistake to put ACPI both into
> both dom0 and the hypervisor. Frankly, on many levels, ACPI was
> designed with Windows in mind, and the further an OS strays from
> how Windows does things, the less likely you'll run well on many
> systems. Obviously, Xen looks nothing like windows, or any other OS,
> for it seems to have not one division between implementation and
> policy, but multiple...
>
> So I have a fundamental lack of understanding of the logic
> behind the partitioning behind the hypervisor and dom0.
> Maybe somebody explain it to me in terms that I'll understand?
>
The basic idea is that Xen controls the things it must, and leaves
everything else to guest kernels. At heart that means it controls the
physical CPUs (which includes things like local APICs) and memory (by
maintaining control over the CPU's paging hardware via the pagetables).
Everything beyond that is left to guest domains which handle various
responsibilities; informally we refer to "dom0" which is "the"
privileged domain which handles things like hardware discovery, device
drivers, domain creation, and a number of other services. However
there's no inherent reason why all these jobs must be aggregated
together. For example, there's active work on having specific "driver
domains" which have responsibility for a specific piece of hardware or
class of hardware, but don't (and can't) do any of the other
"privileged" jobs.
> It reminds me of the partitioning between the Mach microkernel
> and the "user-space OS personality, eg Unix". This looked really neat
> in proposals for funding and academic papers, but in reality it turned
> out to have little value other than employing programmers to
> re-invent the wheel, only to discover that the original round
> wheel was better than the square one that they produced...
>
There are some parallels, and there's even a paper with a title
something like "hypervisors: microkernels done right". I think the
rough sketch of the argument is that a Mach-like system with lots of
server processes fails in practise because its a poor match for the APIs
that people actually want to use and are used to, and if you want
Unix-like functionality then you just need to put a Unix in there.
Xen's breakdown is more along the lines of the hardware architecture
itself, and therefore is in some sense more "natural": kernels which run
processes are real kernels, with the full programming interfaces
everyone wants to use; shared resources like CPU and memory can be
multiplexed fairly easily using well-understood techniques; with some
extra work, you can even let guest domains have direct hardware access
using their normal device drivers so that the hypervisor doesn't need to
deal with them.
ACPI - which wears many hats affecting many aspects of the system -
doesn't have any neat dividing lines, and doesn't follow the contours of
the underlying architecture, and so appears to be a poor match for a
Xen-like model. If it were possible to partition ACPI into broad
function groupings, then maybe it would be possible to get a better fit,
but I'm not sure whether that's possible.
In general Xen doesn't have much use for ACPI; aside from this specific
case of needing to know how to control the CPUs for power management, it
doesn't really care about ACPI's services. It doesn't do interrupt
routing, it doesn't really need to know about thermal management, any
anything it does need to know it can be told by the kernel which does
have a much greater interest in ACPI. S3 suspend/resume needs to go via
Xen for the final stage, but aside from that it can all be handled in Linux.
J
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [Xen-devel] [PATCH][pvops_dom0][2/4] Introduce the external control operation interface for domain0 ACPI parser
2009-07-30 16:29 ` Len Brown
@ 2009-07-30 22:04 ` Jeremy Fitzhardinge
2009-07-31 8:05 ` Yu, Ke
1 sibling, 0 replies; 17+ messages in thread
From: Jeremy Fitzhardinge @ 2009-07-30 22:04 UTC (permalink / raw)
To: Len Brown; +Cc: Yu, Ke, linux-acpi@vger.kernel.org, Tian, Kevin
On 07/30/09 09:29, Len Brown wrote:
> Unclear that the power management partitioning between xen hypervisor
> and dom0 is fully baked.
>
> Uncear (to me) what xen is doing internally with these power management
> objects, and how that differs from what Linux would do.
>
Yes. The key thing is that Xen is the only entity which really knows
about physical CPUs and how they're being used, and so is the only thing
which can correctly apply the chosen policy. If any particular guest
domain did it, it would only take into account that particular domain's
CPU use, and ignore everyone else (or have the extra complexity of
extracting system-wide usage from Xen then applying that to its own policy).
If I understand correctly, the code currently relies on Linux running
the _PSD method with its AML interpreter, and then feeding the results
to Xen as it doesn't have an AML interpreter. And putting AML into Xen
would be an all-or-nothing proposition, because the entity which runs
AML maintains a lot of state which can't be separated between Xen and
Linux, and can't be shared.
J
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [Xen-devel] [PATCH][pvops_dom0][2/4] Introduce the external control operation interface for domain0 ACPI parser
2009-07-30 16:29 ` Len Brown
2009-07-30 22:04 ` Jeremy Fitzhardinge
@ 2009-07-31 8:05 ` Yu, Ke
1 sibling, 0 replies; 17+ messages in thread
From: Yu, Ke @ 2009-07-31 8:05 UTC (permalink / raw)
To: Len Brown; +Cc: linux-acpi@vger.kernel.org, Jeremy Fitzhardinge, Tian, Kevin
[-- Attachment #1: Type: text/plain, Size: 4086 bytes --]
>-----Original Message-----
>From: Len Brown [mailto:lenb@kernel.org]
>Sent: Friday, July 31, 2009 12:29 AM
>To: Yu, Ke
>Cc: linux-acpi@vger.kernel.org; Jeremy Fitzhardinge; Tian, Kevin
>Subject: RE: [Xen-devel] [PATCH][pvops_dom0][2/4] Introduce the external
>control operation interface for domain0 ACPI parser
>
>Unclear that the power management partitioning between xen hypervisor
>and dom0 is fully baked.
>
>Uncear (to me) what xen is doing internally with these power management
>objects, and how that differs from what Linux would do.
>
>While patches to the Linux kernel may be a good RFE, prototype, or base
>for discussion, the unknowns above need to be addressed to before it
>makes much sense to spent a large amount of time on the source.
Oh yes, I would like to add the background. In xen architecture, only hypervisor can control the physical CPU, so all the physical CPU Cx/Px power management is done in hypervisor side. And the cpuidle/cpufreq driver in dom0 is disabled. the Xen PM algorithm is ported from linux side, so it is similar as linux. For cpuidle part, when idle vcpu is scheduled (similar as idle process in linux), it will invoke cpuidle routine. cpuidle routine firstly decide which C state to go by calling cpuidle governor (e.g. menu governor), then cpuidle routine will enter C state using the ACPI method (e.g. I/O port, mwait described in _CST).
For cpufreq part, cpufreq driver will register the governor (e.g. ondemand) and cpu driver (cpufreq-acpi driver). once the cpufreq start on certain CPU, the governor will change the frequency by calling the corresponding cpu driver. http://wiki.xensource.com/xenwiki/xenpm has one picture on the cpufreq architecture.
All the necessary ACPI information in the above process is parsed by dom0 and pass to hypervisor.
>
>some things did jump out of the patch, however...
>
>I do not recommend believing _PSD.
>Our experience is that 50% of the time it is crap.
That is true, we also meet crap _PSD in especially the new platform. We will add more check on the _PSD data.
>
>Why does xen_processor_px exists when it is the same as acpi_processor_px?
>ditto for acpi_processor_cx and xen_processor_cx
xen_processor_px is part of the hypercall interface, and mainly used in hypervisor. since hypervisor has no acpi_processor_px as kernel has, so we have to add the same data structure for information passing purpose. similar for xen_processor_cx.
>
>Lose the ifdefs
Can you elaborate which part lose the ifdef? for the drivers/xen/processor_extcntl.c, since it has "obj-$(CONFIG_ACPI_PROCESSOR_XEN) += processor_extcntl.o" in Makefile, it implies #ifdef CONFIG_ACPI_PROCESSOR_XEN.
>Lose the tests for xen on the inline code when
>they can be inside the called routines.
Can you elaborate which part?
>
>This doesn't look like an abstraction layer,
>it looks more like a simple conduit.
>The thing at the other end (xen) will need to know
>just as much about these data structures as the
>thing that sent them (linux)
The reason is that xen and linux use the similar power management algorithm, so the data structure xen need is also similar as Linux side. This actually make life easier, in that we don't need to invent an abstraction layer here to handle the difference, since there is actually no difference.
>
>the patch doesn't apply to the upstream kernel --
>even the ACPI specific parts.
That is strange. I can successfully apply the ACPI specific parts to the upstream (linus 2.6 tree, parent commit b592972).
For the xen part, it is based on Jeremy's git tree (git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git) rebase/master branch, so it cannot be cleanly applied to upstream.
>
>checkpatch.pl:
>total: 168 errors, 42 warnings, 643 lines checked
>
>Don't send me another patch that can't pass checkpatch.pl,
>even if just an RFC.
Sorry for that. I have fix the code style issue and the attached updated version can pass the checkpatch.pl
BTW, thanks for your comments.
Best Regards
Ke
[-- Attachment #2: 03.patch --]
[-- Type: application/octet-stream, Size: 21948 bytes --]
commit 60864023151926a582d403a899e69dd53421cb34
Author: Yu Ke <ke.yu@intel.com>
Date: Fri Jul 31 11:00:15 2009 +0800
Leverage domain0 ACPI parser for xen
This patch reuse dom0 ACPI parser to get C/P state for Xen
=== Overview ===
Requirement: Xen hypervisor need Cx/Px ACPI info to do the Cx/Px states
power management. This info is provided by BIOS ACPI table. Since
hypervisor has no ACPI parser, this info has to be parsed by domain0
kernel ACPI sub-system, and then passed to hypervisor by hypercall.
To make this happen, the key point is to add hook in the kernel ACPI
sub-system. Fortunately, kernel already has good abstraction, and
only several places need to add hook. To be more detail, there is an
acpi_processor_driver (in drivers/acpi/processor_core.c) , which all the
Cx/Px parsing event will go to. This driver will call its acpi processor
event handler, e.g. add/remove, start/stop, notify to handle these
events. These event handlers in turn will call some library functions (in
drivers/acpi/processor_perflib.c), e.g. acpi_processor_ppc_has_changed,
acpi_processor_ppc_has_changed, acpi_processor_cst_has_changed, to finish
the acpi info parsing.
So this patch add the xen hook in these places to notify xen for the parsed
Cx/Px state information.
Signed-off-by: Yu Ke <ke.yu@intel.com>
Signed-off-by: Tian Kevin <kevin.tian@intel.com>
diff --git a/drivers/acpi/processor_core.c b/drivers/acpi/processor_core.c
index 84e0f3c..2707d65 100644
--- a/drivers/acpi/processor_core.c
+++ b/drivers/acpi/processor_core.c
@@ -58,6 +58,7 @@
#include <acpi/acpi_bus.h>
#include <acpi/acpi_drivers.h>
#include <acpi/processor.h>
+#include <xen/acpi.h>
#define ACPI_PROCESSOR_CLASS "processor"
#define ACPI_PROCESSOR_DEVICE_NAME "Processor"
@@ -751,6 +752,10 @@ static int __cpuinit acpi_processor_start(struct acpi_device *device)
acpi_processor_power_init(pr, device);
+ result = processor_cntl_xen_prepare(pr);
+ if (result)
+ goto end;
+
pr->cdev = thermal_cooling_device_register("Processor", device,
&processor_cooling_ops);
if (IS_ERR(pr->cdev)) {
@@ -963,6 +968,10 @@ int acpi_processor_device_add(acpi_handle handle, struct acpi_device **device)
if (!pr)
return -ENODEV;
+ if (processor_cntl_xen())
+ processor_cntl_xen_notify(pr,
+ PROCESSOR_HOTPLUG, HOTPLUG_TYPE_ADD);
+
if ((pr->id >= 0) && (pr->id < nr_cpu_ids)) {
kobject_uevent(&(*device)->dev.kobj, KOBJ_ONLINE);
}
@@ -1002,11 +1011,19 @@ static void __ref acpi_processor_hotplug_notify(acpi_handle handle,
break;
}
+ if (processor_cntl_xen())
+ processor_cntl_xen_notify(pr,
+ PROCESSOR_HOTPLUG, HOTPLUG_TYPE_ADD);
+
if (pr->id >= 0 && (pr->id < nr_cpu_ids)) {
kobject_uevent(&device->dev.kobj, KOBJ_OFFLINE);
break;
}
+ if (processor_cntl_xen())
+ processor_cntl_xen_notify(pr, PROCESSOR_HOTPLUG,
+ HOTPLUG_TYPE_REMOVE);
+
result = acpi_processor_start(device);
if ((!result) && ((pr->id >= 0) && (pr->id < nr_cpu_ids))) {
kobject_uevent(&device->dev.kobj, KOBJ_ONLINE);
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 0efa59e..8994aff 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -58,6 +58,7 @@
#include <acpi/acpi_bus.h>
#include <acpi/processor.h>
+#include <xen/acpi.h>
#include <asm/processor.h>
#define ACPI_PROCESSOR_CLASS "processor"
@@ -455,6 +456,12 @@ static int acpi_processor_get_power_info_cst(struct acpi_processor *pr)
cx.power = obj->integer.value;
+#ifdef CONFIG_ACPI_PROCESSOR_XEN
+ /* cache control methods to notify xen*/
+ if (processor_cntl_xen_pm())
+ memcpy(&cx.reg, reg, sizeof(*reg));
+#endif
+
current_count++;
memcpy(&(pr->power.states[current_count]), &cx, sizeof(cx));
@@ -1141,6 +1148,13 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr)
if (!pr->flags.power_setup_done)
return -ENODEV;
+ if (processor_cntl_xen_pm()) {
+ acpi_processor_get_power_info(pr);
+ processor_cntl_xen_notify(pr,
+ PROCESSOR_PM_CHANGE, PM_TYPE_IDLE);
+ return ret;
+ }
+
cpuidle_pause_and_lock();
cpuidle_disable_device(&pr->power.dev);
acpi_processor_get_power_info(pr);
@@ -1204,9 +1218,14 @@ int __cpuinit acpi_processor_power_init(struct acpi_processor *pr,
* platforms that only support C1.
*/
if (pr->flags.power) {
- acpi_processor_setup_cpuidle(pr);
- if (cpuidle_register_device(&pr->power.dev))
- return -EIO;
+ if (processor_cntl_xen_pm())
+ processor_cntl_xen_notify(pr,
+ PROCESSOR_PM_INIT, PM_TYPE_IDLE);
+ else {
+ acpi_processor_setup_cpuidle(pr);
+ if (cpuidle_register_device(&pr->power.dev))
+ return -EIO;
+ }
printk(KERN_INFO PREFIX "CPU%d (power states:", pr->id);
for (i = 1; i <= pr->power.count; i++)
diff --git a/drivers/acpi/processor_perflib.c b/drivers/acpi/processor_perflib.c
index 60e543d..8375075 100644
--- a/drivers/acpi/processor_perflib.c
+++ b/drivers/acpi/processor_perflib.c
@@ -38,6 +38,7 @@
#include <acpi/acpi_bus.h>
#include <acpi/acpi_drivers.h>
#include <acpi/processor.h>
+#include <xen/acpi.h>
#define ACPI_PROCESSOR_CLASS "processor"
#define ACPI_PROCESSOR_FILE_PERFORMANCE "performance"
@@ -154,13 +155,16 @@ int acpi_processor_ppc_has_changed(struct acpi_processor *pr)
{
int ret;
- if (ignore_ppc)
+ if (ignore_ppc && !processor_cntl_xen_pmperf())
return 0;
ret = acpi_processor_get_platform_limit(pr);
if (ret < 0)
return (ret);
+ else if (processor_cntl_xen_pmperf())
+ return processor_cntl_xen_notify(pr,
+ PROCESSOR_PM_CHANGE, PM_TYPE_PERF);
else
return cpufreq_update_policy(pr->id);
}
@@ -330,7 +334,7 @@ static int acpi_processor_get_performance_states(struct acpi_processor *pr)
return result;
}
-static int acpi_processor_get_performance_info(struct acpi_processor *pr)
+int acpi_processor_get_performance_info(struct acpi_processor *pr)
{
int result = 0;
acpi_status status = AE_OK;
@@ -432,7 +436,7 @@ int acpi_processor_notify_smm(struct module *calling_module)
EXPORT_SYMBOL(acpi_processor_notify_smm);
-static int acpi_processor_get_psd(struct acpi_processor *pr)
+int acpi_processor_get_psd(struct acpi_processor *pr)
{
int result = 0;
acpi_status status = AE_OK;
diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index 3b1c421..d303c25 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -90,4 +90,9 @@ config XEN_XENBUS_FRONTEND
config XEN_S3
def_bool y
- depends on XEN_DOM0 && ACPI
\ No newline at end of file
+ depends on XEN_DOM0 && ACPI
+
+config ACPI_PROCESSOR_XEN
+ bool
+ depends on XEN_DOM0 && ACPI_PROCESSOR && CPU_FREQ
+ default y
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index 386c775..42c9ace 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -9,4 +9,5 @@ obj-$(CONFIG_XEN_BLKDEV_BACKEND) += blkback/
obj-$(CONFIG_XEN_NETDEV_BACKEND) += netback/
obj-$(CONFIG_XENFS) += xenfs/
obj-$(CONFIG_XEN_SYS_HYPERVISOR) += sys-hypervisor.o
-obj-$(CONFIG_XEN_S3) += acpi.o
\ No newline at end of file
+obj-$(CONFIG_XEN_S3) += acpi.o
+obj-$(CONFIG_ACPI_PROCESSOR_XEN) += processor_extcntl.o
diff --git a/drivers/xen/processor_extcntl.c b/drivers/xen/processor_extcntl.c
new file mode 100644
index 0000000..85d708c
--- /dev/null
+++ b/drivers/xen/processor_extcntl.c
@@ -0,0 +1,418 @@
+/*
+ * processor_extcntl.c - interface to notify Xen
+ *
+ * Copyright (C) 2008, Intel corporation
+ *
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/types.h>
+#include <linux/acpi.h>
+#include <linux/pm.h>
+#include <linux/cpu.h>
+
+#include <linux/cpufreq.h>
+#include <acpi/processor.h>
+#include <xen/acpi.h>
+
+#include <asm/xen/hypercall.h>
+#include <asm/xen/hypervisor.h>
+
+static int processor_cntl_xen_get_performance(struct acpi_processor *pr);
+static int xen_hotplug_notifier(struct acpi_processor *pr, int event);
+
+static struct processor_cntl_xen_ops xen_ops = {
+ .hotplug = xen_hotplug_notifier,
+};
+
+int processor_cntl_xen(void)
+{
+ return 1;
+}
+
+int processor_cntl_xen_pm(void)
+{
+ return (xen_ops.pm_ops[PM_TYPE_IDLE] != NULL);
+}
+
+int processor_cntl_xen_pmperf(void)
+{
+ return (xen_ops.pm_ops[PM_TYPE_PERF] != NULL);
+}
+
+int processor_cntl_xen_pmthr(void)
+{
+ return (xen_ops.pm_ops[PM_TYPE_THR] != NULL);
+}
+
+static int processor_notify_smm(void)
+{
+ acpi_status status;
+ static int is_done;
+
+ /* only need successfully notify BIOS once */
+ /* avoid double notification which may lead to unexpected result */
+ if (is_done)
+ return 0;
+
+ /* Can't write pstate_cnt to smi_cmd if either value is zero */
+ if ((!acpi_gbl_FADT.smi_command) || (!acpi_gbl_FADT.pstate_control)) {
+ ACPI_DEBUG_PRINT((ACPI_DB_INFO, "No SMI port or pstate_cnt\n"));
+ return 0;
+ }
+
+ ACPI_DEBUG_PRINT((ACPI_DB_INFO,
+ "Writing pstate_cnt [0x%x] to smi_cmd [0x%x]\n",
+ acpi_gbl_FADT.pstate_control, acpi_gbl_FADT.smi_command));
+
+ status = acpi_os_write_port(acpi_gbl_FADT.smi_command,
+ (u32) acpi_gbl_FADT.pstate_control, 8);
+ if (ACPI_FAILURE(status))
+ return status;
+
+ is_done = 1;
+
+ return 0;
+}
+
+int processor_cntl_xen_notify(struct acpi_processor *pr, int event, int type)
+{
+ int ret = -EINVAL;
+
+ switch (event) {
+ case PROCESSOR_PM_INIT:
+ case PROCESSOR_PM_CHANGE:
+ if ((type >= PM_TYPE_MAX) ||
+ !xen_ops.pm_ops[type])
+ break;
+
+ ret = xen_ops.pm_ops[type](pr, event);
+ break;
+ case PROCESSOR_HOTPLUG:
+ if (xen_ops.hotplug)
+ ret = xen_ops.hotplug(pr, type);
+ break;
+ default:
+ printk(KERN_ERR "Unsupport processor events %d.\n", event);
+ break;
+ }
+
+ return ret;
+}
+
+/*
+ * This is called from ACPI processor init, and targeted to hold
+ * some tricky housekeeping jobs to satisfy xen.
+ * For example, we may put dependency parse stub here for idle
+ * and performance state. Those information may be not available
+ * if splitting from dom0 control logic like cpufreq driver.
+ */
+int processor_cntl_xen_prepare(struct acpi_processor *pr)
+{
+
+ /* Initialize performance states */
+ if (processor_cntl_xen_pmperf())
+ processor_cntl_xen_get_performance(pr);
+
+ return 0;
+}
+
+/*
+ * Existing ACPI module does parse performance states at some point,
+ * when acpi-cpufreq driver is loaded which however is something
+ * we'd like to disable to avoid confliction with xen PM
+ * logic. So we have to collect raw performance information here
+ * when ACPI processor object is found and started.
+ */
+static int processor_cntl_xen_get_performance(struct acpi_processor *pr)
+{
+ int ret;
+ struct acpi_processor_performance *perf;
+ struct acpi_psd_package *pdomain;
+
+ if (pr->performance)
+ return -EBUSY;
+
+ perf = kzalloc(sizeof(struct acpi_processor_performance), GFP_KERNEL);
+ if (!perf)
+ return -ENOMEM;
+
+ pr->performance = perf;
+ /* Get basic performance state information */
+ ret = acpi_processor_get_performance_info(pr);
+ if (ret < 0)
+ goto err_out;
+
+ /*
+ * Well, here we need retrieve performance dependency information
+ * from _PSD object. The reason why existing interface is not used
+ * is due to the reason that existing interface sticks to Linux cpu
+ * id to construct some bitmap, however we want to split ACPI
+ * processor objects from Linux cpu id logic. For example, even
+ * when Linux is configured as UP, we still want to parse all ACPI
+ * processor objects to xen. In this case, it's preferred
+ * to use ACPI ID instead.
+ */
+ pdomain = &pr->performance->domain_info;
+ pdomain->num_processors = 0;
+ ret = acpi_processor_get_psd(pr);
+ if (ret < 0) {
+ /*
+ * _PSD is optional - assume no coordination if absent (or
+ * broken), matching native kernels' behavior.
+ */
+ pdomain->num_entries = ACPI_PSD_REV0_ENTRIES;
+ pdomain->revision = ACPI_PSD_REV0_REVISION;
+ pdomain->domain = pr->acpi_id;
+ pdomain->coord_type = DOMAIN_COORD_TYPE_SW_ALL;
+ pdomain->num_processors = 1;
+ }
+
+ /* Some sanity check */
+ if ((pdomain->revision != ACPI_PSD_REV0_REVISION) ||
+ (pdomain->num_entries != ACPI_PSD_REV0_ENTRIES) ||
+ ((pdomain->coord_type != DOMAIN_COORD_TYPE_SW_ALL) &&
+ (pdomain->coord_type != DOMAIN_COORD_TYPE_SW_ANY) &&
+ (pdomain->coord_type != DOMAIN_COORD_TYPE_HW_ALL))) {
+ ret = -EINVAL;
+ goto err_out;
+ }
+
+ /* Last step is to notify BIOS that xen exists */
+ processor_notify_smm();
+
+ processor_cntl_xen_notify(pr, PROCESSOR_PM_INIT, PM_TYPE_PERF);
+
+ return 0;
+err_out:
+ pr->performance = NULL;
+ kfree(perf);
+ return ret;
+}
+
+static inline void xen_convert_pct_reg(struct xen_pct_register *xpct,
+ struct acpi_pct_register *apct)
+{
+ xpct->descriptor = apct->descriptor;
+ xpct->length = apct->length;
+ xpct->space_id = apct->space_id;
+ xpct->bit_width = apct->bit_width;
+ xpct->bit_offset = apct->bit_offset;
+ xpct->reserved = apct->reserved;
+ xpct->address = apct->address;
+}
+
+static inline void xen_convert_pss_states(struct xen_processor_px *xpss,
+ struct acpi_processor_px *apss, int state_count)
+{
+ int i;
+ for (i = 0; i < state_count; i++) {
+ xpss->core_frequency = apss->core_frequency;
+ xpss->power = apss->power;
+ xpss->transition_latency = apss->transition_latency;
+ xpss->bus_master_latency = apss->bus_master_latency;
+ xpss->control = apss->control;
+ xpss->status = apss->status;
+ xpss++;
+ apss++;
+ }
+}
+
+static inline void xen_convert_psd_pack(struct xen_psd_package *xpsd,
+ struct acpi_psd_package *apsd)
+{
+ xpsd->num_entries = apsd->num_entries;
+ xpsd->revision = apsd->revision;
+ xpsd->domain = apsd->domain;
+ xpsd->coord_type = apsd->coord_type;
+ xpsd->num_processors = apsd->num_processors;
+}
+
+static int xen_cx_notifier(struct acpi_processor *pr, int action)
+{
+ int ret, count = 0, i;
+ xen_platform_op_t op = {
+ .cmd = XENPF_set_processor_pminfo,
+ .interface_version = XENPF_INTERFACE_VERSION,
+ .u.set_pminfo.id = pr->acpi_id,
+ .u.set_pminfo.type = XEN_PM_CX,
+ };
+ struct xen_processor_cx *data, *buf;
+ struct acpi_processor_cx *cx;
+
+ if (action == PROCESSOR_PM_CHANGE)
+ return -EINVAL;
+
+ /* Convert to Xen defined structure and hypercall */
+ buf = kzalloc(pr->power.count * sizeof(struct xen_processor_cx),
+ GFP_KERNEL);
+ if (!buf)
+ return -ENOMEM;
+
+ data = buf;
+ for (i = 1; i <= pr->power.count; i++) {
+ cx = &pr->power.states[i];
+ /* Skip invalid cstate entry */
+ if (!cx->valid)
+ continue;
+
+ data->type = cx->type;
+ data->latency = cx->latency;
+ data->power = cx->power;
+ data->reg.space_id = cx->reg.space_id;
+ data->reg.bit_width = cx->reg.bit_width;
+ data->reg.bit_offset = cx->reg.bit_offset;
+ data->reg.access_size = cx->reg.reserved;
+ data->reg.address = cx->reg.address;
+
+ /* Get dependency relationships, _CSD is not supported yet */
+ data->dpcnt = 0;
+ set_xen_guest_handle(data->dp, NULL);
+
+ data++;
+ count++;
+ }
+
+ if (!count) {
+ printk(KERN_ERR "No available Cx info for cpu %d\n",
+ pr->acpi_id);
+ kfree(buf);
+ return -EINVAL;
+ }
+
+ op.u.set_pminfo.power.count = count;
+ op.u.set_pminfo.power.flags.bm_control = pr->flags.bm_control;
+ op.u.set_pminfo.power.flags.bm_check = pr->flags.bm_check;
+ op.u.set_pminfo.power.flags.has_cst = pr->flags.has_cst;
+ op.u.set_pminfo.power.flags.power_setup_done =
+ pr->flags.power_setup_done;
+
+ set_xen_guest_handle(op.u.set_pminfo.power.states, buf);
+ ret = HYPERVISOR_dom0_op(&op);
+ kfree(buf);
+ return ret;
+}
+
+static int xen_px_notifier(struct acpi_processor *pr, int action)
+{
+ int ret = -EINVAL;
+ xen_platform_op_t op = {
+ .cmd = XENPF_set_processor_pminfo,
+ .interface_version = XENPF_INTERFACE_VERSION,
+ .u.set_pminfo.id = pr->acpi_id,
+ .u.set_pminfo.type = XEN_PM_PX,
+ };
+ struct xen_processor_performance *perf;
+ struct xen_processor_px *states = NULL;
+ struct acpi_processor_performance *px;
+ struct acpi_psd_package *pdomain;
+
+ if (!pr)
+ return -EINVAL;
+
+ perf = &op.u.set_pminfo.perf;
+ px = pr->performance;
+
+ switch (action) {
+ case PROCESSOR_PM_CHANGE:
+ /* ppc dynamic handle */
+ perf->flags = XEN_PX_PPC;
+ perf->platform_limit = pr->performance_platform_limit;
+
+ ret = HYPERVISOR_dom0_op(&op);
+ break;
+
+ case PROCESSOR_PM_INIT:
+ /* px normal init */
+ perf->flags = XEN_PX_PPC |
+ XEN_PX_PCT |
+ XEN_PX_PSS |
+ XEN_PX_PSD;
+
+ /* ppc */
+ perf->platform_limit = pr->performance_platform_limit;
+
+ /* pct */
+ xen_convert_pct_reg(&perf->control_register,
+ &px->control_register);
+ xen_convert_pct_reg(&perf->status_register,
+ &px->status_register);
+
+ /* pss */
+ perf->state_count = px->state_count;
+ states = kzalloc(px->state_count*sizeof(xen_processor_px_t),
+ GFP_KERNEL);
+ if (!states)
+ return -ENOMEM;
+ xen_convert_pss_states(states, px->states, px->state_count);
+ set_xen_guest_handle(perf->states, states);
+
+ /* psd */
+ pdomain = &px->domain_info;
+ xen_convert_psd_pack(&perf->domain_info, pdomain);
+ if (pdomain->coord_type == DOMAIN_COORD_TYPE_SW_ALL)
+ perf->shared_type = CPUFREQ_SHARED_TYPE_ALL;
+ else if (pdomain->coord_type == DOMAIN_COORD_TYPE_SW_ANY)
+ perf->shared_type = CPUFREQ_SHARED_TYPE_ANY;
+ else if (pdomain->coord_type == DOMAIN_COORD_TYPE_HW_ALL)
+ perf->shared_type = CPUFREQ_SHARED_TYPE_HW;
+ else {
+ ret = -ENODEV;
+ kfree(states);
+ break;
+ }
+
+ ret = HYPERVISOR_dom0_op(&op);
+ kfree(states);
+ break;
+
+ default:
+ break;
+ }
+
+ return ret;
+}
+
+static int xen_tx_notifier(struct acpi_processor *pr, int action)
+{
+ return -EINVAL;
+}
+static int xen_hotplug_notifier(struct acpi_processor *pr, int event)
+{
+ return -EINVAL;
+}
+
+static int __init xen_acpi_processor_extcntl_init(void)
+{
+ unsigned int pmbits = (xen_start_info->flags & SIF_PM_MASK) >> 8;
+
+ if (!pmbits)
+ return 0;
+ if (pmbits & XEN_PROCESSOR_PM_CX)
+ xen_ops.pm_ops[PM_TYPE_IDLE] = xen_cx_notifier;
+ if (pmbits & XEN_PROCESSOR_PM_PX)
+ xen_ops.pm_ops[PM_TYPE_PERF] = xen_px_notifier;
+ if (pmbits & XEN_PROCESSOR_PM_TX)
+ xen_ops.pm_ops[PM_TYPE_THR] = xen_tx_notifier;
+
+ return 0;
+}
+
+subsys_initcall(xen_acpi_processor_extcntl_init);
diff --git a/include/acpi/processor.h b/include/acpi/processor.h
index baf1e0a..14c7e4c 100644
--- a/include/acpi/processor.h
+++ b/include/acpi/processor.h
@@ -77,6 +77,10 @@ struct acpi_processor_cx {
struct acpi_processor_cx_policy promotion;
struct acpi_processor_cx_policy demotion;
char desc[ACPI_CX_DESC_LEN];
+#ifdef CONFIG_ACPI_PROCESSOR_XEN
+ /* Require raw information for xen*/
+ struct acpi_power_register reg;
+#endif /* CONFIG_ACPI_PROCESSOE_XEN */
};
struct acpi_processor_power {
@@ -295,6 +299,8 @@ static inline void acpi_processor_ffh_cstate_enter(struct acpi_processor_cx
void acpi_processor_ppc_init(void);
void acpi_processor_ppc_exit(void);
int acpi_processor_ppc_has_changed(struct acpi_processor *pr);
+int acpi_processor_get_performance_info(struct acpi_processor *pr);
+int acpi_processor_get_psd(struct acpi_processor *pr);
#else
static inline void acpi_processor_ppc_init(void)
{
diff --git a/include/xen/acpi.h b/include/xen/acpi.h
index fea4cfb..6ba2f5b 100644
--- a/include/xen/acpi.h
+++ b/include/xen/acpi.h
@@ -20,4 +20,59 @@ static inline bool xen_pv_acpi(void)
int acpi_notify_hypervisor_state(u8 sleep_state,
u32 pm1a_cnt, u32 pm1b_cnd);
+/*
+ * Following are interfaces for xen acpi processor control
+ */
+
+/* Events notified to xen */
+#define PROCESSOR_PM_INIT 1
+#define PROCESSOR_PM_CHANGE 2
+#define PROCESSOR_HOTPLUG 3
+
+/* Objects for the PM events */
+#define PM_TYPE_IDLE 0
+#define PM_TYPE_PERF 1
+#define PM_TYPE_THR 2
+#define PM_TYPE_MAX 3
+
+/* Processor hotplug events */
+#define HOTPLUG_TYPE_ADD 0
+#define HOTPLUG_TYPE_REMOVE 1
+
+#ifdef CONFIG_ACPI_PROCESSOR_XEN
+#include <acpi/acpi_drivers.h>
+#include <acpi/processor.h>
+
+struct processor_cntl_xen_ops {
+ /* Transfer processor PM events to xen */
+int (*pm_ops[PM_TYPE_MAX])(struct acpi_processor *pr, int event);
+ /* Notify physical processor status to xen */
+ int (*hotplug)(struct acpi_processor *pr, int type);
+};
+
+extern int processor_cntl_xen(void);
+extern int processor_cntl_xen_pm(void);
+extern int processor_cntl_xen_pmperf(void);
+extern int processor_cntl_xen_pmthr(void);
+extern int processor_cntl_xen_prepare(struct acpi_processor *pr);
+extern int processor_cntl_xen_notify(struct acpi_processor *pr,
+ int event, int type);
+
+#else
+
+static inline int processor_cntl_xen(void) { return 0; }
+static inline int processor_cntl_xen_pm(void) { return 0; }
+static inline int processor_cntl_xen_pmperf(void) { return 0; }
+static inline int processor_cntl_xen_pmthr(void) { return 0; }
+static inline int processor_cntl_xen_notify(struct acpi_processor *pr,
+ int event, int type)
+{
+ return 0;
+}
+static inline int processor_cntl_xen_prepare(struct acpi_processor *pr)
+{
+ return 0;
+}
+#endif /* CONFIG_ACPI_PROCESSOR_XEN */
+
#endif /* _XEN_ACPI_H */
^ permalink raw reply related [flat|nested] 17+ messages in thread
end of thread, other threads:[~2009-07-31 8:06 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-07-29 2:55 FW: [Xen-devel] [PATCH][pvops_dom0][2/4] Introduce the external control operation interface for domain0 ACPI parser Yu, Ke
2009-07-29 4:14 ` Brown, Len
2009-07-29 6:20 ` Yu, Ke
2009-07-29 16:50 ` Jeremy Fitzhardinge
2009-07-30 9:18 ` Yu, Ke
2009-07-30 16:00 ` Len Brown
2009-07-30 20:36 ` Jeremy Fitzhardinge
2009-07-30 17:23 ` Jeremy Fitzhardinge
2009-07-30 15:37 ` Len Brown
2009-07-30 20:52 ` Jeremy Fitzhardinge
2009-07-29 14:47 ` Yu, Ke
2009-07-30 16:29 ` Len Brown
2009-07-30 22:04 ` Jeremy Fitzhardinge
2009-07-31 8:05 ` Yu, Ke
2009-07-29 16:43 ` Jeremy Fitzhardinge
2009-07-30 8:59 ` Yu, Ke
2009-07-30 15:03 ` Brown, Len
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox