public inbox for linux-acpi@vger.kernel.org
 help / color / mirror / Atom feed
* Paravirtualizing bits of acpi access
@ 2009-03-21  6:09 Jeremy Fitzhardinge
  2009-03-21 17:10 ` Rafael J. Wysocki
  0 siblings, 1 reply; 32+ messages in thread
From: Jeremy Fitzhardinge @ 2009-03-21  6:09 UTC (permalink / raw)
  To: Brown, Len; +Cc: linux-acpi, Xen-devel, the arch/x86 maintainers

[-- Attachment #1: Type: text/plain, Size: 2371 bytes --]

Hi Len,

I have a patch here ported from the linux2.6.18-xen tree to make host S3 
suspend work under Xen (attached).

The salient part is this:

--- a/drivers/acpi/acpica/hwsleep.c
+++ b/drivers/acpi/acpica/hwsleep.c
@@ -46,6 +46,9 @@
 #include "accommon.h"
 #include "actables.h"
 
+#include <xen/acpi.h>
+#include <asm/xen/hypervisor.h>
+
 #define _COMPONENT          ACPI_HARDWARE
 ACPI_MODULE_NAME("hwsleep")
 
@@ -337,14 +340,19 @@ acpi_status asmlinkage acpi_enter_sleep_state(u8 sleep_state)
 
 	ACPI_FLUSH_CPU_CACHE();
 
-	status = acpi_hw_register_write(ACPI_REGISTER_PM1A_CONTROL,
-					PM1Acontrol);
-	if (ACPI_FAILURE(status)) {
-		return_ACPI_STATUS(status);
-	}
+	if (!xen_pv_domain()) {
+		status = acpi_hw_register_write(ACPI_REGISTER_PM1A_CONTROL,
+						PM1Acontrol);
+		if (ACPI_FAILURE(status)) {
+			return_ACPI_STATUS(status);
+		}
+
+		status = acpi_hw_register_write(ACPI_REGISTER_PM1B_CONTROL,
+						PM1Bcontrol);
+	} else
+		status = acpi_notify_hypervisor_state(sleep_state,
+						      PM1Acontrol, PM1Bcontrol);
 
-	status = acpi_hw_register_write(ACPI_REGISTER_PM1B_CONTROL,
-					PM1Bcontrol);
 	if (ACPI_FAILURE(status)) {
 		return_ACPI_STATUS(status);
 	}


where acpi_notify_hypervisor_state() more or less maps directly onto a 
Xen hypercall, which in turn performs the corresponding acpi operation 
from within Xen.

I'm guessing you won't find this patch appealing as-is because it sticks 
a great big if (xen) into an otherwise arch (and OS?) independent piece 
of code.  In this particular instance, its fairly easy to envisage 
encapsulating these two register writes into their own function which 
architectures can override, which makes it fairly easy for me to put a 
Xen hook in somewhere on the arch/x86 side of the fence.

But because Xen ends up doing the low-level cpu state save/restore, 
several other parts of the S3 suspend path can be skipped on the Linux 
side.  I'm wondering if you have any thoughts about how that can be 
done, or if putting the Xen code in as-is is acceptable?

(BTW, xen_pv_domain() expands to a constant 0 when Xen isn't enabled in 
the config, so the Xen-dependent bits will collapse down to nothing.  
But it is defined in asm/xen/hypervisor.h, which is only present on x86 
and ia64 architectures; on the other hand, believe those are the only 
architectures using acpi.)

Thanks,
    J

[-- Attachment #2: xen-s3-sleep.patch --]
[-- Type: text/x-patch, Size: 5123 bytes --]

>From 7bdec2cce7de8fb41207238c61f3075220c70a55 Mon Sep 17 00:00:00 2001
From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Date: Mon, 23 Feb 2009 00:06:06 -0800
Subject: [PATCH] xen: Enable ACPI sleep in Xen

Open CONFIG_ACPI_SLEEP in xenlinux, to enable ACPI based
power management. Basically, user can trigger power event
now by "echo *** > /sys/power/state". Also gear to pm
interface defined between xenlinux and Xen.

Also sync to xen interface headers consequently

[ From http://xenbits.xensource.com/linux-2.6.18-xen.hg
 change c68699484a65 ]

Signed-off-by Ke Yu <ke.yu@intel.com>
Signed-off-by Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c
index 7c243a2..a89de8d 100644
--- a/arch/x86/kernel/acpi/sleep.c
+++ b/arch/x86/kernel/acpi/sleep.c
@@ -12,6 +12,8 @@
 #include <asm/segment.h>
 #include <asm/desc.h>
 
+#include <asm/xen/hypervisor.h>
+
 #include "realmode/wakeup.h"
 #include "sleep.h"
 
@@ -37,6 +39,9 @@ int acpi_save_state_mem(void)
 {
 	struct wakeup_header *header;
 
+	if (xen_pv_domain())
+		return 0;	/* Xen will do this for us */
+
 	if (!acpi_realmode) {
 		printk(KERN_ERR "Could not allocate memory during boot, "
 		       "S3 disabled\n");
diff --git a/drivers/acpi/acpica/hwsleep.c b/drivers/acpi/acpica/hwsleep.c
index a2af2a4..bf719f1 100644
--- a/drivers/acpi/acpica/hwsleep.c
+++ b/drivers/acpi/acpica/hwsleep.c
@@ -46,6 +46,9 @@
 #include "accommon.h"
 #include "actables.h"
 
+#include <xen/acpi.h>
+#include <asm/xen/hypervisor.h>
+
 #define _COMPONENT          ACPI_HARDWARE
 ACPI_MODULE_NAME("hwsleep")
 
@@ -337,14 +340,19 @@ acpi_status asmlinkage acpi_enter_sleep_state(u8 sleep_state)
 
 	ACPI_FLUSH_CPU_CACHE();
 
-	status = acpi_hw_register_write(ACPI_REGISTER_PM1A_CONTROL,
-					PM1Acontrol);
-	if (ACPI_FAILURE(status)) {
-		return_ACPI_STATUS(status);
-	}
+	if (!xen_pv_domain()) {
+		status = acpi_hw_register_write(ACPI_REGISTER_PM1A_CONTROL,
+						PM1Acontrol);
+		if (ACPI_FAILURE(status)) {
+			return_ACPI_STATUS(status);
+		}
+
+		status = acpi_hw_register_write(ACPI_REGISTER_PM1B_CONTROL,
+						PM1Bcontrol);
+	} else
+		status = acpi_notify_hypervisor_state(sleep_state,
+						      PM1Acontrol, PM1Bcontrol);
 
-	status = acpi_hw_register_write(ACPI_REGISTER_PM1B_CONTROL,
-					PM1Bcontrol);
 	if (ACPI_FAILURE(status)) {
 		return_ACPI_STATUS(status);
 	}
diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c
index 5192666..c855dec 100644
--- a/drivers/acpi/sleep.c
+++ b/drivers/acpi/sleep.c
@@ -59,7 +59,7 @@ static int acpi_sleep_prepare(u32 acpi_state)
 {
 #ifdef CONFIG_ACPI_SLEEP
 	/* do we have a wakeup address for S2 and S3? */
-	if (acpi_state == ACPI_STATE_S3) {
+	if (!xen_pv_domain() && acpi_state == ACPI_STATE_S3) {
 		if (!acpi_wakeup_address) {
 			return -EFAULT;
 		}
@@ -242,7 +242,16 @@ static int acpi_suspend_enter(suspend_state_t pm_state)
 		break;
 
 	case ACPI_STATE_S3:
-		do_suspend_lowlevel();
+		if (!xen_pv_domain())
+			do_suspend_lowlevel();
+		else {
+			/*
+			 * Xen will save and restore CPU context, so
+			 * we can skip that and just go straight to
+			 * the suspend.
+			 */
+			acpi_enter_sleep_state(acpi_state);
+		}
 		break;
 	}
 
diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index 51cbaa5..0138113 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -76,3 +76,7 @@ config XEN_COMPAT_XENFS
 
 config XEN_XENBUS_FRONTEND
        tristate
+
+config XEN_S3
+       def_bool y
+       depends on XEN_DOM0 && ACPI
\ No newline at end of file
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index bb88673..4b01fc8 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -7,4 +7,6 @@ obj-$(CONFIG_XEN_BALLOON)		+= balloon.o
 obj-$(CONFIG_XEN_DEV_EVTCHN)	+= evtchn.o
 obj-$(CONFIG_XEN_BLKDEV_BACKEND)	+= blkback/
 obj-$(CONFIG_XEN_NETDEV_BACKEND)	+= netback/
-obj-$(CONFIG_XENFS)			+= xenfs/
\ No newline at end of file
+obj-$(CONFIG_XENFS)			+= xenfs/
+
+obj-$(CONFIG_XEN_S3)			+= acpi.o
\ No newline at end of file
diff --git a/drivers/xen/acpi.c b/drivers/xen/acpi.c
new file mode 100644
index 0000000..e6d3d0e
--- /dev/null
+++ b/drivers/xen/acpi.c
@@ -0,0 +1,23 @@
+#include <xen/acpi.h>
+
+#include <xen/interface/platform.h>
+#include <asm/xen/hypercall.h>
+#include <asm/xen/hypervisor.h>
+
+int acpi_notify_hypervisor_state(u8 sleep_state,
+				 u32 pm1a_cnt, u32 pm1b_cnt)
+{
+	struct xen_platform_op op = {
+		.cmd = XENPF_enter_acpi_sleep,
+		.interface_version = XENPF_INTERFACE_VERSION,
+		.u = {
+			.enter_acpi_sleep = {
+				.pm1a_cnt_val = (u16)pm1a_cnt,
+				.pm1b_cnt_val = (u16)pm1b_cnt,
+				.sleep_state = sleep_state,
+			},
+		},
+	};
+
+	return HYPERVISOR_dom0_op(&op);
+}
diff --git a/include/xen/acpi.h b/include/xen/acpi.h
new file mode 100644
index 0000000..0d1e462
--- /dev/null
+++ b/include/xen/acpi.h
@@ -0,0 +1,9 @@
+#ifndef _XEN_ACPI_H
+#define _XEN_ACPI_H
+
+#include <linux/types.h>
+
+int acpi_notify_hypervisor_state(u8 sleep_state,
+				 u32 pm1a_cnt, u32 pm1b_cnd);
+
+#endif	/* _XEN_ACPI_H */

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: Paravirtualizing bits of acpi access
  2009-03-21  6:09 Paravirtualizing bits of acpi access Jeremy Fitzhardinge
@ 2009-03-21 17:10 ` Rafael J. Wysocki
  2009-03-22  4:26   ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 32+ messages in thread
From: Rafael J. Wysocki @ 2009-03-21 17:10 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Brown, Len, linux-acpi, Xen-devel, the arch/x86 maintainers

On Saturday 21 March 2009, Jeremy Fitzhardinge wrote:
> Hi Len,
> 
> I have a patch here ported from the linux2.6.18-xen tree to make host S3 
> suspend work under Xen (attached).
> 
> The salient part is this:
> 
> --- a/drivers/acpi/acpica/hwsleep.c
> +++ b/drivers/acpi/acpica/hwsleep.c
> @@ -46,6 +46,9 @@
>  #include "accommon.h"
>  #include "actables.h"
>  
> +#include <xen/acpi.h>
> +#include <asm/xen/hypervisor.h>
> +
>  #define _COMPONENT          ACPI_HARDWARE
>  ACPI_MODULE_NAME("hwsleep")
>  
> @@ -337,14 +340,19 @@ acpi_status asmlinkage acpi_enter_sleep_state(u8 sleep_state)
>  
>  	ACPI_FLUSH_CPU_CACHE();
>  
> -	status = acpi_hw_register_write(ACPI_REGISTER_PM1A_CONTROL,
> -					PM1Acontrol);
> -	if (ACPI_FAILURE(status)) {
> -		return_ACPI_STATUS(status);
> -	}
> +	if (!xen_pv_domain()) {
> +		status = acpi_hw_register_write(ACPI_REGISTER_PM1A_CONTROL,
> +						PM1Acontrol);
> +		if (ACPI_FAILURE(status)) {
> +			return_ACPI_STATUS(status);
> +		}
> +
> +		status = acpi_hw_register_write(ACPI_REGISTER_PM1B_CONTROL,
> +						PM1Bcontrol);
> +	} else
> +		status = acpi_notify_hypervisor_state(sleep_state,
> +						      PM1Acontrol, PM1Bcontrol);
>  
> -	status = acpi_hw_register_write(ACPI_REGISTER_PM1B_CONTROL,
> -					PM1Bcontrol);
>  	if (ACPI_FAILURE(status)) {
>  		return_ACPI_STATUS(status);
>  	}
> 
> 
> where acpi_notify_hypervisor_state() more or less maps directly onto a 
> Xen hypercall, which in turn performs the corresponding acpi operation 
> from within Xen.
> 
> I'm guessing you won't find this patch appealing as-is because it sticks 
> a great big if (xen) into an otherwise arch (and OS?) independent piece 
> of code.  In this particular instance, its fairly easy to envisage 
> encapsulating these two register writes into their own function which 
> architectures can override, which makes it fairly easy for me to put a 
> Xen hook in somewhere on the arch/x86 side of the fence.
> 
> But because Xen ends up doing the low-level cpu state save/restore, 
> several other parts of the S3 suspend path can be skipped on the Linux 
> side.  I'm wondering if you have any thoughts about how that can be 
> done, or if putting the Xen code in as-is is acceptable?
> 
> (BTW, xen_pv_domain() expands to a constant 0 when Xen isn't enabled in 
> the config, so the Xen-dependent bits will collapse down to nothing.  
> But it is defined in asm/xen/hypervisor.h, which is only present on x86 
> and ia64 architectures; on the other hand, believe those are the only 
> architectures using acpi.)

Well, why don't you implement the platform suspend operations for Xen?
I guess you don't want ACPI _PTS to be executed during suspend as well.

Rafael

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Paravirtualizing bits of acpi access
  2009-03-21 17:10 ` Rafael J. Wysocki
@ 2009-03-22  4:26   ` Jeremy Fitzhardinge
  2009-03-22 11:28     ` Rafael J. Wysocki
  0 siblings, 1 reply; 32+ messages in thread
From: Jeremy Fitzhardinge @ 2009-03-22  4:26 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Brown, Len, linux-acpi, Xen-devel, the arch/x86 maintainers

Rafael J. Wysocki wrote:
> Well, why don't you implement the platform suspend operations for Xen?
> I guess you don't want ACPI _PTS to be executed during suspend as well.
>   

I don't know.  What's _PTS?

I think for the most part we want Linux to do most of the acpi work of 
bringing the machine into an idle state.  Its just that Xen is 
responsible for the very low level cpu context save/restore, because the 
Linux kernel is still running on vcpus rather than the physical cpus.

    J

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Paravirtualizing bits of acpi access
  2009-03-22  4:26   ` Jeremy Fitzhardinge
@ 2009-03-22 11:28     ` Rafael J. Wysocki
  2009-03-22 13:14       ` Ingo Molnar
  2009-03-22 17:07       ` Jeremy Fitzhardinge
  0 siblings, 2 replies; 32+ messages in thread
From: Rafael J. Wysocki @ 2009-03-22 11:28 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Brown, Len, linux-acpi, Xen-devel, the arch/x86 maintainers

On Sunday 22 March 2009, Jeremy Fitzhardinge wrote:
> Rafael J. Wysocki wrote:
> > Well, why don't you implement the platform suspend operations for Xen?
> > I guess you don't want ACPI _PTS to be executed during suspend as well.
> >   
> 
> I don't know.  What's _PTS?

It's an ACPI method called to prepare the platform to enter the sleep state
(the name stands for "prepare to sleep").  Executing it may affect the
hardware.

> I think for the most part we want Linux to do most of the acpi work of 
> bringing the machine into an idle state.  Its just that Xen is 
> responsible for the very low level cpu context save/restore, because the 
> Linux kernel is still running on vcpus rather than the physical cpus.

I think you really should not execute any global ACPI methods to suspend a
guest, because that may affect the host.  That's why I think it's better to
regard Xen as a platform and implement a separate set of suspend operations for
it.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Paravirtualizing bits of acpi access
  2009-03-22 11:28     ` Rafael J. Wysocki
@ 2009-03-22 13:14       ` Ingo Molnar
  2009-03-22 13:17         ` Rafael J. Wysocki
  2009-03-22 17:07       ` Jeremy Fitzhardinge
  1 sibling, 1 reply; 32+ messages in thread
From: Ingo Molnar @ 2009-03-22 13:14 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Brown, Len, Jeremy Fitzhardinge, Xen-devel,
	the arch/x86 maintainers, linux-acpi


* Rafael J. Wysocki <rjw@sisk.pl> wrote:

> On Sunday 22 March 2009, Jeremy Fitzhardinge wrote:
> > Rafael J. Wysocki wrote:
> > > Well, why don't you implement the platform suspend operations for Xen?
> > > I guess you don't want ACPI _PTS to be executed during suspend as well.
> > >   
> > 
> > I don't know.  What's _PTS?
> 
> It's an ACPI method called to prepare the platform to enter the sleep state
> (the name stands for "prepare to sleep").  Executing it may affect the
> hardware.
> 
> > I think for the most part we want Linux to do most of the acpi 
> > work of bringing the machine into an idle state.  Its just that 
> > Xen is responsible for the very low level cpu context 
> > save/restore, because the Linux kernel is still running on vcpus 
> > rather than the physical cpus.
> 
> I think you really should not execute any global ACPI methods to 
> suspend a guest, because that may affect the host.  That's why I 
> think it's better to regard Xen as a platform and implement a 
> separate set of suspend operations for it.

I'd agree with that. That also allows the reuse of existing 
callbacks, right?

	Ingo

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Paravirtualizing bits of acpi access
  2009-03-22 13:14       ` Ingo Molnar
@ 2009-03-22 13:17         ` Rafael J. Wysocki
  0 siblings, 0 replies; 32+ messages in thread
From: Rafael J. Wysocki @ 2009-03-22 13:17 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jeremy Fitzhardinge, Brown, Len, linux-acpi, Xen-devel,
	the arch/x86 maintainers

On Sunday 22 March 2009, Ingo Molnar wrote:
> 
> * Rafael J. Wysocki <rjw@sisk.pl> wrote:
> 
> > On Sunday 22 March 2009, Jeremy Fitzhardinge wrote:
> > > Rafael J. Wysocki wrote:
> > > > Well, why don't you implement the platform suspend operations for Xen?
> > > > I guess you don't want ACPI _PTS to be executed during suspend as well.
> > > >   
> > > 
> > > I don't know.  What's _PTS?
> > 
> > It's an ACPI method called to prepare the platform to enter the sleep state
> > (the name stands for "prepare to sleep").  Executing it may affect the
> > hardware.
> > 
> > > I think for the most part we want Linux to do most of the acpi 
> > > work of bringing the machine into an idle state.  Its just that 
> > > Xen is responsible for the very low level cpu context 
> > > save/restore, because the Linux kernel is still running on vcpus 
> > > rather than the physical cpus.
> > 
> > I think you really should not execute any global ACPI methods to 
> > suspend a guest, because that may affect the host.  That's why I 
> > think it's better to regard Xen as a platform and implement a 
> > separate set of suspend operations for it.
> 
> I'd agree with that. That also allows the reuse of existing 
> callbacks, right?

Yes, it does.

Rafael

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Paravirtualizing bits of acpi access
  2009-03-22 11:28     ` Rafael J. Wysocki
  2009-03-22 13:14       ` Ingo Molnar
@ 2009-03-22 17:07       ` Jeremy Fitzhardinge
  2009-03-22 18:00         ` Rafael J. Wysocki
  2009-03-23  3:29         ` Tian, Kevin
  1 sibling, 2 replies; 32+ messages in thread
From: Jeremy Fitzhardinge @ 2009-03-22 17:07 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Brown, Len, linux-acpi, Xen-devel, the arch/x86 maintainers

Rafael J. Wysocki wrote:
> On Sunday 22 March 2009, Jeremy Fitzhardinge wrote:
>   
>> Rafael J. Wysocki wrote:
>>     
>>> Well, why don't you implement the platform suspend operations for Xen?
>>> I guess you don't want ACPI _PTS to be executed during suspend as well.
>>>   
>>>       
>> I don't know.  What's _PTS?
>>     
>
> It's an ACPI method called to prepare the platform to enter the sleep state
> (the name stands for "prepare to sleep").  Executing it may affect the
> hardware.
>   

OK, that's what we want.  Dom0 is the control domain which is 
responsible for the bulk of the hardware; Xen itself has very little 
hardware knowledge.

> I think you really should not execute any global ACPI methods to suspend a
> guest, because that may affect the host.  That's why I think it's better to
> regard Xen as a platform and implement a separate set of suspend operations for
> it.
>   

In this case we're talking about the special privileged domain which can 
be considered to be on the "host" side of the line. 

That said, I'd be interested in looking at a suspend operations-based 
approach if you think its the right way to go.  But I'm concerned that 
we'd end up with a big set of very similar-looking parallel functions 
just to deal with some difference in detail near the bottom.  Can you 
give me a pointer to where this gets put together for acpi?

Thanks,
    J

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Paravirtualizing bits of acpi access
  2009-03-22 17:07       ` Jeremy Fitzhardinge
@ 2009-03-22 18:00         ` Rafael J. Wysocki
  2009-03-23  3:29         ` Tian, Kevin
  1 sibling, 0 replies; 32+ messages in thread
From: Rafael J. Wysocki @ 2009-03-22 18:00 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Brown, Len, linux-acpi, Xen-devel, the arch/x86 maintainers

On Sunday 22 March 2009, Jeremy Fitzhardinge wrote:
> Rafael J. Wysocki wrote:
> > On Sunday 22 March 2009, Jeremy Fitzhardinge wrote:
> >   
> >> Rafael J. Wysocki wrote:
> >>     
> >>> Well, why don't you implement the platform suspend operations for Xen?
> >>> I guess you don't want ACPI _PTS to be executed during suspend as well.
> >>>   
> >>>       
> >> I don't know.  What's _PTS?
> >>     
> >
> > It's an ACPI method called to prepare the platform to enter the sleep state
> > (the name stands for "prepare to sleep").  Executing it may affect the
> > hardware.
> >   
> 
> OK, that's what we want.  Dom0 is the control domain which is 
> responsible for the bulk of the hardware; Xen itself has very little 
> hardware knowledge.
> 
> > I think you really should not execute any global ACPI methods to suspend a
> > guest, because that may affect the host.  That's why I think it's better to
> > regard Xen as a platform and implement a separate set of suspend operations for
> > it.
> >   
> 
> In this case we're talking about the special privileged domain which can 
> be considered to be on the "host" side of the line. 
> 
> That said, I'd be interested in looking at a suspend operations-based 
> approach if you think its the right way to go.  But I'm concerned that 
> we'd end up with a big set of very similar-looking parallel functions 
> just to deal with some difference in detail near the bottom.  Can you 
> give me a pointer to where this gets put together for acpi?

Please have a look at include/linux/suspend.h for the prototypes and
drivers/acpi/sleep.c contains the ACPI implementation.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: Re: Paravirtualizing bits of acpi access
  2009-03-22 17:07       ` Jeremy Fitzhardinge
  2009-03-22 18:00         ` Rafael J. Wysocki
@ 2009-03-23  3:29         ` Tian, Kevin
  2009-03-23 18:20           ` [Xen-devel] " Rafael J. Wysocki
  1 sibling, 1 reply; 32+ messages in thread
From: Tian, Kevin @ 2009-03-23  3:29 UTC (permalink / raw)
  To: 'Jeremy Fitzhardinge', Rafael J. Wysocki
  Cc: Brown, Len, Cihula, Joseph, Xen-devel, the arch/x86 maintainers,
	linux-acpi@vger.kernel.org

>From: Jeremy Fitzhardinge
>Sent: Monday, March 23, 2009 1:08 AM
>
>Rafael J. Wysocki wrote:
>> On Sunday 22 March 2009, Jeremy Fitzhardinge wrote:
>>   
>>> Rafael J. Wysocki wrote:
>>>     
>>>> Well, why don't you implement the platform suspend 
>operations for Xen?
>>>> I guess you don't want ACPI _PTS to be executed during 
>suspend as well.
>>>>   
>>>>       
>>> I don't know.  What's _PTS?
>>>     
>>
>> It's an ACPI method called to prepare the platform to enter 
>the sleep state
>> (the name stands for "prepare to sleep").  Executing it may 
>affect the
>> hardware.
>>   
>
>OK, that's what we want.  Dom0 is the control domain which is 
>responsible for the bulk of the hardware; Xen itself has very little 
>hardware knowledge.

yes, Xen only takes care of several core system devices (pic, 
ioapic, serial, timer sources) and cpus, and let dom0 control all
the rest. _PTS, imo, will not affect Xen controlled devices as 
even on native those devices are suspended after _PTS. Also
Xen doesn't incorporate ACPI parser and thus can't evaluate any
ACPI method on its own.

>
>> I think you really should not execute any global ACPI 
>methods to suspend a
>> guest, because that may affect the host.  That's why I think 
>it's better to
>> regard Xen as a platform and implement a separate set of 
>suspend operations for
>> it.
>>   
>
>In this case we're talking about the special privileged domain 
>which can 
>be considered to be on the "host" side of the line. 
>
>That said, I'd be interested in looking at a suspend operations-based 
>approach if you think its the right way to go.  But I'm concerned that 
>we'd end up with a big set of very similar-looking parallel functions 
>just to deal with some difference in detail near the bottom.  Can you 
>give me a pointer to where this gets put together for acpi?
>

As Jeremy pointed out, dom0 is a special domain which cooperate
with Xen to fulfill the whole S3 sequence, i.e. dom0 still carries 99%
existing ACPI S3 flow, with several exceptions as below:

a) No need to prepare wakeup stub (since it's Xen to be first waken
up after resume), and no expectation that execution flow will be 
resumed from its wakeup stub (context is resumed at hypercall
return)

b) No need to save/restore bsp context and gear to Xen by hypercall
at last step where originally hardware reg bits are written

And then Xen jumps in to finish remaining steps. From this angle,
Xen is not a completely new platform and, well, S3 is more like a
'S1' type from dom0's p.o.v with a different trigger method. Then is
it overkilled to introduce a new set of ops with 99% content 
duplicated?

Thanks,
Kevin 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Xen-devel] Re: Paravirtualizing bits of acpi access
  2009-03-23  3:29         ` Tian, Kevin
@ 2009-03-23 18:20           ` Rafael J. Wysocki
  2009-03-23 19:07             ` Jeremy Fitzhardinge
  2009-03-23 19:52             ` [Xen-devel] " Matthew Garrett
  0 siblings, 2 replies; 32+ messages in thread
From: Rafael J. Wysocki @ 2009-03-23 18:20 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: 'Jeremy Fitzhardinge', Brown, Len, Xen-devel,
	the arch/x86 maintainers, linux-acpi@vger.kernel.org,
	Cihula, Joseph

On Monday 23 March 2009, Tian, Kevin wrote:
> >From: Jeremy Fitzhardinge
> >Sent: Monday, March 23, 2009 1:08 AM
> >
> >Rafael J. Wysocki wrote:
> >> On Sunday 22 March 2009, Jeremy Fitzhardinge wrote:
> >>   
> >>> Rafael J. Wysocki wrote:
> >>>     
> >>>> Well, why don't you implement the platform suspend 
> >operations for Xen?
> >>>> I guess you don't want ACPI _PTS to be executed during 
> >suspend as well.
> >>>>   
> >>>>       
> >>> I don't know.  What's _PTS?
> >>>     
> >>
> >> It's an ACPI method called to prepare the platform to enter 
> >the sleep state
> >> (the name stands for "prepare to sleep").  Executing it may 
> >affect the
> >> hardware.
> >>   
> >
> >OK, that's what we want.  Dom0 is the control domain which is 
> >responsible for the bulk of the hardware; Xen itself has very little 
> >hardware knowledge.
> 
> yes, Xen only takes care of several core system devices (pic, 
> ioapic, serial, timer sources) and cpus, and let dom0 control all
> the rest. _PTS, imo, will not affect Xen controlled devices as 
> even on native those devices are suspended after _PTS. Also
> Xen doesn't incorporate ACPI parser and thus can't evaluate any
> ACPI method on its own.
> 
> >
> >> I think you really should not execute any global ACPI 
> >methods to suspend a
> >> guest, because that may affect the host.  That's why I think 
> >it's better to
> >> regard Xen as a platform and implement a separate set of 
> >suspend operations for
> >> it.
> >>   
> >
> >In this case we're talking about the special privileged domain 
> >which can 
> >be considered to be on the "host" side of the line. 
> >
> >That said, I'd be interested in looking at a suspend operations-based 
> >approach if you think its the right way to go.  But I'm concerned that 
> >we'd end up with a big set of very similar-looking parallel functions 
> >just to deal with some difference in detail near the bottom.  Can you 
> >give me a pointer to where this gets put together for acpi?
> >
> 
> As Jeremy pointed out, dom0 is a special domain which cooperate
> with Xen to fulfill the whole S3 sequence, i.e. dom0 still carries 99%
> existing ACPI S3 flow, with several exceptions as below:
> 
> a) No need to prepare wakeup stub (since it's Xen to be first waken
> up after resume), and no expectation that execution flow will be 
> resumed from its wakeup stub (context is resumed at hypercall
> return)
> 
> b) No need to save/restore bsp context and gear to Xen by hypercall
> at last step where originally hardware reg bits are written
> 
> And then Xen jumps in to finish remaining steps. From this angle,
> Xen is not a completely new platform and, well, S3 is more like a
> 'S1' type from dom0's p.o.v with a different trigger method. Then is
> it overkilled to introduce a new set of ops with 99% content 
> duplicated?

IMO, no, it isn't.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Xen-devel] Re: Paravirtualizing bits of acpi access
  2009-03-23 18:20           ` [Xen-devel] " Rafael J. Wysocki
@ 2009-03-23 19:07             ` Jeremy Fitzhardinge
  2009-03-23 20:27               ` Rafael J. Wysocki
  2009-03-23 19:52             ` [Xen-devel] " Matthew Garrett
  1 sibling, 1 reply; 32+ messages in thread
From: Jeremy Fitzhardinge @ 2009-03-23 19:07 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tian, Kevin, Brown, Len, Xen-devel, the arch/x86 maintainers,
	linux-acpi@vger.kernel.org, Cihula, Joseph

Rafael J. Wysocki wrote:
> On Monday 23 March 2009, Tian, Kevin wrote:
>   
>> And then Xen jumps in to finish remaining steps. From this angle,
>> Xen is not a completely new platform and, well, S3 is more like a
>> 'S1' type from dom0's p.o.v with a different trigger method. Then is
>> it overkilled to introduce a new set of ops with 99% content 
>> duplicated?
>>     
>
> IMO, no, it isn't.

Hm.  Well, lets take acpi_suspend_enter() as a specific example.  The 
Xen change here is:

@@ -240,11 +240,20 @@ static int acpi_suspend_enter(suspend_state_t pm_state)
 		barrier();
 		status = acpi_enter_sleep_state(acpi_state);
 		break;
 
 	case ACPI_STATE_S3:
-		do_suspend_lowlevel();
+		if (!xen_pv_domain())
+			do_suspend_lowlevel();
+		else {
+			/*
+			 * Xen will save and restore CPU context, so
+			 * we can skip that and just go straight to
+			 * the suspend.
+			 */
+			acpi_enter_sleep_state(acpi_state);
+		}
 		break;
 	}
 
 	/* If ACPI is not enabled by the BIOS, we need to enable it here. */
 	if (set_sci_en_on_resume)


Which is, functionally, adding one if() and a new line of code, in the 
middle of a ~70 line function.

Are you suggesting that it would be best to copy this whole function so 
that I can put one line of Xen-specific code in the middle, rather than 
just making this change?

Some other functions, the Xen vs. non-Xen changes are larger; 
acpi_sleep_prepare() could reasonably have a Xen-specific variant 
because a big chunk of it is setting up the wakeup vector (which is 
unnecessary under Xen), and the rest can be easily pulled into common 
code.  But unfortunately acpi_sleep_prepare isn't itself an operation, 
and is only called at the bottom of 2-3 level deep callchains.

I think that rather than having a separate xen-acpi 
platform_suspend_ops, it would make more sense to have a acpi_ops within 
acpi/sleep.c and handle the differences that way.  I'll see how it turns 
out.

    J

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Xen-devel] Re: Paravirtualizing bits of acpi access
  2009-03-23 18:20           ` [Xen-devel] " Rafael J. Wysocki
  2009-03-23 19:07             ` Jeremy Fitzhardinge
@ 2009-03-23 19:52             ` Matthew Garrett
  2009-03-23 20:22               ` Rafael J. Wysocki
  1 sibling, 1 reply; 32+ messages in thread
From: Matthew Garrett @ 2009-03-23 19:52 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tian, Kevin, 'Jeremy Fitzhardinge', Brown, Len, Xen-devel,
	the arch/x86 maintainers, linux-acpi@vger.kernel.org,
	Cihula, Joseph

On Mon, Mar 23, 2009 at 07:20:12PM +0100, Rafael J. Wysocki wrote:
> On Monday 23 March 2009, Tian, Kevin wrote:
> > And then Xen jumps in to finish remaining steps. From this angle,
> > Xen is not a completely new platform and, well, S3 is more like a
> > 'S1' type from dom0's p.o.v with a different trigger method. Then is
> > it overkilled to introduce a new set of ops with 99% content 
> > duplicated?
> 
> IMO, no, it isn't.

I'd disagree. Duplicating the code means remembering to apply bugfixes 
to both parts.

-- 
Matthew Garrett | mjg59@srcf.ucam.org

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Xen-devel] Re: Paravirtualizing bits of acpi access
  2009-03-23 19:52             ` [Xen-devel] " Matthew Garrett
@ 2009-03-23 20:22               ` Rafael J. Wysocki
  0 siblings, 0 replies; 32+ messages in thread
From: Rafael J. Wysocki @ 2009-03-23 20:22 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: Tian, Kevin, 'Jeremy Fitzhardinge', Brown, Len, Xen-devel,
	the arch/x86 maintainers, linux-acpi@vger.kernel.org,
	Cihula, Joseph

On Monday 23 March 2009, Matthew Garrett wrote:
> On Mon, Mar 23, 2009 at 07:20:12PM +0100, Rafael J. Wysocki wrote:
> > On Monday 23 March 2009, Tian, Kevin wrote:
> > > And then Xen jumps in to finish remaining steps. From this angle,
> > > Xen is not a completely new platform and, well, S3 is more like a
> > > 'S1' type from dom0's p.o.v with a different trigger method. Then is
> > > it overkilled to introduce a new set of ops with 99% content 
> > > duplicated?
> > 
> > IMO, no, it isn't.
> 
> I'd disagree. Duplicating the code means remembering to apply bugfixes 
> to both parts.

What code are you referring to exactly?  The callbacks that are literally the
same can be pointed to by the new set of operations just fine.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Xen-devel] Re: Paravirtualizing bits of acpi access
  2009-03-23 19:07             ` Jeremy Fitzhardinge
@ 2009-03-23 20:27               ` Rafael J. Wysocki
  2009-03-23 20:42                 ` Jeremy Fitzhardinge
  2009-03-24  5:14                 ` Jeremy Fitzhardinge
  0 siblings, 2 replies; 32+ messages in thread
From: Rafael J. Wysocki @ 2009-03-23 20:27 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Tian, Kevin, Brown, Len, Xen-devel, the arch/x86 maintainers,
	linux-acpi@vger.kernel.org, Cihula, Joseph

On Monday 23 March 2009, Jeremy Fitzhardinge wrote:
> Rafael J. Wysocki wrote:
> > On Monday 23 March 2009, Tian, Kevin wrote:
> >   
> >> And then Xen jumps in to finish remaining steps. From this angle,
> >> Xen is not a completely new platform and, well, S3 is more like a
> >> 'S1' type from dom0's p.o.v with a different trigger method. Then is
> >> it overkilled to introduce a new set of ops with 99% content 
> >> duplicated?
> >>     
> >
> > IMO, no, it isn't.
> 
> Hm.  Well, lets take acpi_suspend_enter() as a specific example.  The 
> Xen change here is:
> 
> @@ -240,11 +240,20 @@ static int acpi_suspend_enter(suspend_state_t pm_state)
>  		barrier();
>  		status = acpi_enter_sleep_state(acpi_state);
>  		break;
>  
>  	case ACPI_STATE_S3:
> -		do_suspend_lowlevel();
> +		if (!xen_pv_domain())
> +			do_suspend_lowlevel();
> +		else {
> +			/*
> +			 * Xen will save and restore CPU context, so
> +			 * we can skip that and just go straight to
> +			 * the suspend.
> +			 */
> +			acpi_enter_sleep_state(acpi_state);
> +		}
>  		break;
>  	}

Hmm, in that case it may be more appropriate to modify
do_suspend_lowlevel().  Have you considered doing that?

>  	/* If ACPI is not enabled by the BIOS, we need to enable it here. */
>  	if (set_sci_en_on_resume)
> 
> 
> Which is, functionally, adding one if() and a new line of code, in the 
> middle of a ~70 line function.
> 
> Are you suggesting that it would be best to copy this whole function so 
> that I can put one line of Xen-specific code in the middle, rather than 
> just making this change?

No.

> Some other functions, the Xen vs. non-Xen changes are larger; 
> acpi_sleep_prepare() could reasonably have a Xen-specific variant 
> because a big chunk of it is setting up the wakeup vector (which is 
> unnecessary under Xen), and the rest can be easily pulled into common 
> code.  But unfortunately acpi_sleep_prepare isn't itself an operation, 
> and is only called at the bottom of 2-3 level deep callchains.
> 
> I think that rather than having a separate xen-acpi 
> platform_suspend_ops, it would make more sense to have a acpi_ops within 
> acpi/sleep.c and handle the differences that way.  I'll see how it turns 
> out.

Yes, that may be better.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Xen-devel] Re: Paravirtualizing bits of acpi access
  2009-03-23 20:27               ` Rafael J. Wysocki
@ 2009-03-23 20:42                 ` Jeremy Fitzhardinge
  2009-03-24  5:14                 ` Jeremy Fitzhardinge
  1 sibling, 0 replies; 32+ messages in thread
From: Jeremy Fitzhardinge @ 2009-03-23 20:42 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tian, Kevin, Brown, Len, Xen-devel, the arch/x86 maintainers,
	linux-acpi@vger.kernel.org, Cihula, Joseph

Rafael J. Wysocki wrote:
> Hmm, in that case it may be more appropriate to modify
> do_suspend_lowlevel().  Have you considered doing that?
>   

do_suspend_lowlevel is in asm with 32 and 64-bit variants, so it is a 
little awkward to deal with.  But, yes, I was thinking of adding a 
do_suspend() with this logic in it, which calls do_suspend_lowlevel() as 
appropriate.

    J

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Xen-devel] Re: Paravirtualizing bits of acpi access
  2009-03-23 20:27               ` Rafael J. Wysocki
  2009-03-23 20:42                 ` Jeremy Fitzhardinge
@ 2009-03-24  5:14                 ` Jeremy Fitzhardinge
  2009-03-24  5:33                   ` Tian, Kevin
  1 sibling, 1 reply; 32+ messages in thread
From: Jeremy Fitzhardinge @ 2009-03-24  5:14 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tian, Kevin, Brown, Len, Xen-devel, the arch/x86 maintainers,
	linux-acpi@vger.kernel.org, Cihula, Joseph

Rafael J. Wysocki wrote:
> Yes, that may be better.
>   

How does this look now?  I had a second look at the Xen-hooks, and found 
that they were mostly unnecessary (they were preventing useless code 
from running, but there's no harm in letting it run).

The result is below.  Does this look reasonable?

Thanks,
    J

diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c
index 7c243a2..8ebda00 100644
--- a/arch/x86/kernel/acpi/sleep.c
+++ b/arch/x86/kernel/acpi/sleep.c
@@ -12,6 +12,9 @@
 #include <asm/segment.h>
 #include <asm/desc.h>
 
+#include <xen/acpi.h>
+#include <asm/xen/hypervisor.h>
+
 #include "realmode/wakeup.h"
 #include "sleep.h"
 
@@ -25,6 +28,34 @@ static unsigned long acpi_realmode;
 static char temp_stack[4096];
 #endif
 
+/* 
+ * Override final register write when entering sleep state so we can
+ * direct it to a hypercall in the Xen case.
+ */
+acpi_status arch_acpi_enter_sleep_state(u8 sleep_state, u32
+					PM1Acontrol, u32 PM1Bcontrol)
+{
+	acpi_status ret;
+
+	if (xen_pv_acpi()) {
+		int err;
+
+		err = acpi_notify_hypervisor_state(sleep_state,
+						   PM1Acontrol, PM1Bcontrol);
+
+		ret = AE_OK;
+		if (err) {
+			ACPI_DEBUG_PRINT((ACPI_DB_INIT,
+					  "Hypervisor failure [%d]\n", err));
+			ret = AE_ERROR;
+		}
+	} else
+		ret = default_acpi_enter_sleep_state(sleep_state,
+						     PM1Acontrol, PM1Bcontrol);
+
+	return ret;
+}
+
 /**
  * acpi_save_state_mem - save kernel state
  *
diff --git a/drivers/acpi/acpica/hwsleep.c b/drivers/acpi/acpica/hwsleep.c
index a2af2a4..6196a7e 100644
--- a/drivers/acpi/acpica/hwsleep.c
+++ b/drivers/acpi/acpica/hwsleep.c
@@ -209,6 +209,83 @@ acpi_status acpi_enter_sleep_state_prep(u8 sleep_state)
 
 ACPI_EXPORT_SYMBOL(acpi_enter_sleep_state_prep)
 
+/*
+ * Default implementation of final register write to enter sleep
+ * state, available for architecture versions to call if necessary.
+ */
+acpi_status default_acpi_enter_sleep_state(u8 sleep_state, u32
+					   PM1Acontrol, u32 PM1Bcontrol)
+{
+	acpi_status status;
+	u32 in_value;
+
+	status = acpi_hw_register_write(ACPI_REGISTER_PM1A_CONTROL,
+					PM1Acontrol);
+	if (ACPI_FAILURE(status)) {
+		return_ACPI_STATUS(status);
+	}
+
+	status = acpi_hw_register_write(ACPI_REGISTER_PM1B_CONTROL,
+					PM1Bcontrol);
+
+	if (ACPI_FAILURE(status)) {
+		return_ACPI_STATUS(status);
+	}
+
+	if (sleep_state > ACPI_STATE_S3) {
+		struct acpi_bit_register_info *sleep_enable_reg_info;
+
+		/*
+		 * We wanted to sleep > S3, but it didn't happen (by virtue of the
+		 * fact that we are still executing!)
+		 *
+		 * Wait ten seconds, then try again. This is to get S4/S5 to work on
+		 * all machines.
+		 *
+		 * We wait so long to allow chipsets that poll this reg very slowly to
+		 * still read the right value. Ideally, this block would go
+		 * away entirely.
+		 */
+		acpi_os_stall(10000000);
+
+		sleep_enable_reg_info =
+			acpi_hw_get_bit_register_info(ACPI_BITREG_SLEEP_ENABLE);
+
+		status = acpi_hw_register_write(ACPI_REGISTER_PM1_CONTROL,
+						sleep_enable_reg_info->
+						access_bit_mask);
+		if (ACPI_FAILURE(status)) {
+			return_ACPI_STATUS(status);
+		}
+	}
+
+	/* Wait until we enter sleep state */
+
+	do {
+		status = acpi_get_register_unlocked(ACPI_BITREG_WAKE_STATUS,
+						    &in_value);
+		if (ACPI_FAILURE(status)) {
+			return_ACPI_STATUS(status);
+		}
+
+		/* Spin until we wake */
+
+	} while (!in_value);
+
+	return_ACPI_STATUS(AE_OK);
+}
+
+/* 
+ * Weak version of final write to enter sleep state, so that
+ * architectures can override it.
+ */
+acpi_status __weak arch_acpi_enter_sleep_state(u8 sleep_state,
+					    u32 PM1Acontrol, u32 PM1Bcontrol)
+{
+	return default_acpi_enter_sleep_state(sleep_state,
+					      PM1Acontrol, PM1Bcontrol);
+}
+
 /*******************************************************************************
  *
  * FUNCTION:    acpi_enter_sleep_state
@@ -227,7 +304,6 @@ acpi_status asmlinkage acpi_enter_sleep_state(u8 sleep_state)
 	u32 PM1Bcontrol;
 	struct acpi_bit_register_info *sleep_type_reg_info;
 	struct acpi_bit_register_info *sleep_enable_reg_info;
-	u32 in_value;
 	struct acpi_object_list arg_list;
 	union acpi_object arg;
 	acpi_status status;
@@ -337,54 +413,7 @@ acpi_status asmlinkage acpi_enter_sleep_state(u8 sleep_state)
 
 	ACPI_FLUSH_CPU_CACHE();
 
-	status = acpi_hw_register_write(ACPI_REGISTER_PM1A_CONTROL,
-					PM1Acontrol);
-	if (ACPI_FAILURE(status)) {
-		return_ACPI_STATUS(status);
-	}
-
-	status = acpi_hw_register_write(ACPI_REGISTER_PM1B_CONTROL,
-					PM1Bcontrol);
-	if (ACPI_FAILURE(status)) {
-		return_ACPI_STATUS(status);
-	}
-
-	if (sleep_state > ACPI_STATE_S3) {
-		/*
-		 * We wanted to sleep > S3, but it didn't happen (by virtue of the
-		 * fact that we are still executing!)
-		 *
-		 * Wait ten seconds, then try again. This is to get S4/S5 to work on
-		 * all machines.
-		 *
-		 * We wait so long to allow chipsets that poll this reg very slowly to
-		 * still read the right value. Ideally, this block would go
-		 * away entirely.
-		 */
-		acpi_os_stall(10000000);
-
-		status = acpi_hw_register_write(ACPI_REGISTER_PM1_CONTROL,
-						sleep_enable_reg_info->
-						access_bit_mask);
-		if (ACPI_FAILURE(status)) {
-			return_ACPI_STATUS(status);
-		}
-	}
-
-	/* Wait until we enter sleep state */
-
-	do {
-		status = acpi_get_register_unlocked(ACPI_BITREG_WAKE_STATUS,
-						    &in_value);
-		if (ACPI_FAILURE(status)) {
-			return_ACPI_STATUS(status);
-		}
-
-		/* Spin until we wake */
-
-	} while (!in_value);
-
-	return_ACPI_STATUS(AE_OK);
+	return arch_acpi_enter_sleep_state(sleep_state, PM1Acontrol, PM1Bcontrol);
 }
 
 ACPI_EXPORT_SYMBOL(acpi_enter_sleep_state)
diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c
index 5192666..bd1cc1a 100644
--- a/drivers/acpi/sleep.c
+++ b/drivers/acpi/sleep.c
@@ -19,6 +19,8 @@
 
 #include <asm/io.h>
 
+#include <xen/acpi.h>
+
 #include <acpi/acpi_bus.h>
 #include <acpi/acpi_drivers.h>
 #include "sleep.h"
@@ -209,6 +211,21 @@ static int acpi_suspend_begin(suspend_state_t pm_state)
 	return error;
 }
 
+static void do_suspend(void)
+{
+	if (!xen_pv_acpi()) {
+		do_suspend_lowlevel();
+		return;
+	}
+
+	/*
+	 * Xen will save and restore CPU context, so
+	 * we can skip that and just go straight to
+	 * the suspend.
+	 */
+	acpi_enter_sleep_state(ACPI_STATE_S3);
+}
+
 /**
  *	acpi_suspend_enter - Actually enter a sleep state.
  *	@pm_state: ignored
@@ -242,7 +259,7 @@ static int acpi_suspend_enter(suspend_state_t pm_state)
 		break;
 
 	case ACPI_STATE_S3:
-		do_suspend_lowlevel();
+		do_suspend();
 		break;
 	}
 
diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index 51cbaa5..0138113 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -76,3 +76,7 @@ config XEN_COMPAT_XENFS
 
 config XEN_XENBUS_FRONTEND
        tristate
+
+config XEN_S3
+       def_bool y
+       depends on XEN_DOM0 && ACPI
\ No newline at end of file
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index bb88673..4b01fc8 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -7,4 +7,6 @@ obj-$(CONFIG_XEN_BALLOON)		+= balloon.o
 obj-$(CONFIG_XEN_DEV_EVTCHN)	+= evtchn.o
 obj-$(CONFIG_XEN_BLKDEV_BACKEND)	+= blkback/
 obj-$(CONFIG_XEN_NETDEV_BACKEND)	+= netback/
-obj-$(CONFIG_XENFS)			+= xenfs/
\ No newline at end of file
+obj-$(CONFIG_XENFS)			+= xenfs/
+
+obj-$(CONFIG_XEN_S3)			+= acpi.o
\ No newline at end of file
diff --git a/drivers/xen/acpi.c b/drivers/xen/acpi.c
new file mode 100644
index 0000000..e6d3d0e
--- /dev/null
+++ b/drivers/xen/acpi.c
@@ -0,0 +1,23 @@
+#include <xen/acpi.h>
+
+#include <xen/interface/platform.h>
+#include <asm/xen/hypercall.h>
+#include <asm/xen/hypervisor.h>
+
+int acpi_notify_hypervisor_state(u8 sleep_state,
+				 u32 pm1a_cnt, u32 pm1b_cnt)
+{
+	struct xen_platform_op op = {
+		.cmd = XENPF_enter_acpi_sleep,
+		.interface_version = XENPF_INTERFACE_VERSION,
+		.u = {
+			.enter_acpi_sleep = {
+				.pm1a_cnt_val = (u16)pm1a_cnt,
+				.pm1b_cnt_val = (u16)pm1b_cnt,
+				.sleep_state = sleep_state,
+			},
+		},
+	};
+
+	return HYPERVISOR_dom0_op(&op);
+}
diff --git a/include/acpi/acpixf.h b/include/acpi/acpixf.h
index cc40102..f39f396 100644
--- a/include/acpi/acpixf.h
+++ b/include/acpi/acpixf.h
@@ -372,6 +372,12 @@ acpi_status acpi_enter_sleep_state_prep(u8 sleep_state);
 
 acpi_status asmlinkage acpi_enter_sleep_state(u8 sleep_state);
 
+acpi_status __weak arch_acpi_enter_sleep_state(u8 sleep_state,
+					       u32 PM1Acontrol, u32 PM1Bcontrol);
+
+acpi_status default_acpi_enter_sleep_state(u8 sleep_state,
+					   u32 PM1Acontrol, u32 PM1Bcontrol);
+
 acpi_status asmlinkage acpi_enter_sleep_state_s4bios(void);
 
 acpi_status acpi_leave_sleep_state_prep(u8 sleep_state);
diff --git a/include/xen/acpi.h b/include/xen/acpi.h
new file mode 100644
index 0000000..fea4cfb
--- /dev/null
+++ b/include/xen/acpi.h
@@ -0,0 +1,23 @@
+#ifndef _XEN_ACPI_H
+#define _XEN_ACPI_H
+
+#include <linux/types.h>
+
+#ifdef CONFIG_XEN_S3
+#include <asm/xen/hypervisor.h>
+
+static inline bool xen_pv_acpi(void)
+{
+	return xen_pv_domain();
+}
+#else
+static inline bool xen_pv_acpi(void)
+{
+	return false;
+}
+#endif
+
+int acpi_notify_hypervisor_state(u8 sleep_state,
+				 u32 pm1a_cnt, u32 pm1b_cnd);
+
+#endif	/* _XEN_ACPI_H */



^ permalink raw reply related	[flat|nested] 32+ messages in thread

* RE: [Xen-devel] Re: Paravirtualizing bits of acpi access
  2009-03-24  5:14                 ` Jeremy Fitzhardinge
@ 2009-03-24  5:33                   ` Tian, Kevin
  2009-03-24  5:42                     ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 32+ messages in thread
From: Tian, Kevin @ 2009-03-24  5:33 UTC (permalink / raw)
  To: 'Jeremy Fitzhardinge', Rafael J. Wysocki
  Cc: Brown, Len, Xen-devel, the arch/x86 maintainers,
	linux-acpi@vger.kernel.org, Cihula, Joseph

>From: Jeremy Fitzhardinge [mailto:jeremy@goop.org] 
>Sent: Tuesday, March 24, 2009 1:15 PM
>
>Rafael J. Wysocki wrote:
>> Yes, that may be better.
>>   
>
>How does this look now?  I had a second look at the Xen-hooks, 
>and found 
>that they were mostly unnecessary (they were preventing useless code 
>from running, but there's no harm in letting it run).

The reason why those useless code are prevented, is in case of
clobberring 0-1M area which if contains valid Xen content. in
acpi_save_state_mem, dom0's wakeup code is copied to area
allocated in acpi_reserve_bootmem. If every Xen's usage on 0-1M
is based on copy-on-use style, such as trampoline code for AP
boot, it's ok. But I'm not sure whether Xen puts some persistent
content there. IIRC, that boot time allocation usually returns sth
like 0x90000 since wakeup stub is first run in real mode. But if
for dom0 alloc_bootmem_low never gives <1M page, then this
prevention could be skipped as your thought.

>
>The result is below.  Does this look reasonable?
>
>Thanks,
>    J
>
>diff --git a/arch/x86/kernel/acpi/sleep.c 
>b/arch/x86/kernel/acpi/sleep.c
>index 7c243a2..8ebda00 100644
>--- a/arch/x86/kernel/acpi/sleep.c
>+++ b/arch/x86/kernel/acpi/sleep.c
>@@ -12,6 +12,9 @@
> #include <asm/segment.h>
> #include <asm/desc.h>
> 
>+#include <xen/acpi.h>
>+#include <asm/xen/hypervisor.h>
>+
> #include "realmode/wakeup.h"
> #include "sleep.h"
> 
>@@ -25,6 +28,34 @@ static unsigned long acpi_realmode;
> static char temp_stack[4096];
> #endif
> 
>+/* 
>+ * Override final register write when entering sleep state so we can
>+ * direct it to a hypercall in the Xen case.
>+ */
>+acpi_status arch_acpi_enter_sleep_state(u8 sleep_state, u32
>+					PM1Acontrol, u32 PM1Bcontrol)
>+{
>+	acpi_status ret;
>+
>+	if (xen_pv_acpi()) {
>+		int err;
>+
>+		err = acpi_notify_hypervisor_state(sleep_state,
>+						   PM1Acontrol, 
>PM1Bcontrol);
>+
>+		ret = AE_OK;
>+		if (err) {
>+			ACPI_DEBUG_PRINT((ACPI_DB_INIT,
>+					  "Hypervisor failure 
>[%d]\n", err));
>+			ret = AE_ERROR;
>+		}
>+	} else
>+		ret = default_acpi_enter_sleep_state(sleep_state,
>+						     
>PM1Acontrol, PM1Bcontrol);
>+
>+	return ret;
>+}
>+
> /**
>  * acpi_save_state_mem - save kernel state
>  *
>diff --git a/drivers/acpi/acpica/hwsleep.c 
>b/drivers/acpi/acpica/hwsleep.c
>index a2af2a4..6196a7e 100644
>--- a/drivers/acpi/acpica/hwsleep.c
>+++ b/drivers/acpi/acpica/hwsleep.c
>@@ -209,6 +209,83 @@ acpi_status 
>acpi_enter_sleep_state_prep(u8 sleep_state)
> 
> ACPI_EXPORT_SYMBOL(acpi_enter_sleep_state_prep)
> 
>+/*
>+ * Default implementation of final register write to enter sleep
>+ * state, available for architecture versions to call if necessary.
>+ */
>+acpi_status default_acpi_enter_sleep_state(u8 sleep_state, u32
>+					   PM1Acontrol, u32 PM1Bcontrol)
>+{
>+	acpi_status status;
>+	u32 in_value;
>+
>+	status = acpi_hw_register_write(ACPI_REGISTER_PM1A_CONTROL,
>+					PM1Acontrol);
>+	if (ACPI_FAILURE(status)) {
>+		return_ACPI_STATUS(status);
>+	}
>+
>+	status = acpi_hw_register_write(ACPI_REGISTER_PM1B_CONTROL,
>+					PM1Bcontrol);
>+
>+	if (ACPI_FAILURE(status)) {
>+		return_ACPI_STATUS(status);
>+	}
>+
>+	if (sleep_state > ACPI_STATE_S3) {
>+		struct acpi_bit_register_info *sleep_enable_reg_info;
>+
>+		/*
>+		 * We wanted to sleep > S3, but it didn't 
>happen (by virtue of the
>+		 * fact that we are still executing!)
>+		 *
>+		 * Wait ten seconds, then try again. This is to 
>get S4/S5 to work on
>+		 * all machines.
>+		 *
>+		 * We wait so long to allow chipsets that poll 
>this reg very slowly to
>+		 * still read the right value. Ideally, this 
>block would go
>+		 * away entirely.
>+		 */
>+		acpi_os_stall(10000000);
>+
>+		sleep_enable_reg_info =
>+			
>acpi_hw_get_bit_register_info(ACPI_BITREG_SLEEP_ENABLE);
>+
>+		status = 
>acpi_hw_register_write(ACPI_REGISTER_PM1_CONTROL,
>+						sleep_enable_reg_info->
>+						access_bit_mask);
>+		if (ACPI_FAILURE(status)) {
>+			return_ACPI_STATUS(status);
>+		}
>+	}
>+
>+	/* Wait until we enter sleep state */
>+
>+	do {
>+		status = 
>acpi_get_register_unlocked(ACPI_BITREG_WAKE_STATUS,
>+						    &in_value);
>+		if (ACPI_FAILURE(status)) {
>+			return_ACPI_STATUS(status);
>+		}
>+
>+		/* Spin until we wake */
>+
>+	} while (!in_value);
>+
>+	return_ACPI_STATUS(AE_OK);
>+}
>+
>+/* 
>+ * Weak version of final write to enter sleep state, so that
>+ * architectures can override it.
>+ */
>+acpi_status __weak arch_acpi_enter_sleep_state(u8 sleep_state,
>+					    u32 PM1Acontrol, 
>u32 PM1Bcontrol)
>+{
>+	return default_acpi_enter_sleep_state(sleep_state,
>+					      PM1Acontrol, PM1Bcontrol);
>+}
>+
> 
>/**************************************************************
>*****************
>  *
>  * FUNCTION:    acpi_enter_sleep_state
>@@ -227,7 +304,6 @@ acpi_status asmlinkage 
>acpi_enter_sleep_state(u8 sleep_state)
> 	u32 PM1Bcontrol;
> 	struct acpi_bit_register_info *sleep_type_reg_info;
> 	struct acpi_bit_register_info *sleep_enable_reg_info;
>-	u32 in_value;
> 	struct acpi_object_list arg_list;
> 	union acpi_object arg;
> 	acpi_status status;
>@@ -337,54 +413,7 @@ acpi_status asmlinkage 
>acpi_enter_sleep_state(u8 sleep_state)
> 
> 	ACPI_FLUSH_CPU_CACHE();
> 
>-	status = acpi_hw_register_write(ACPI_REGISTER_PM1A_CONTROL,
>-					PM1Acontrol);
>-	if (ACPI_FAILURE(status)) {
>-		return_ACPI_STATUS(status);
>-	}
>-
>-	status = acpi_hw_register_write(ACPI_REGISTER_PM1B_CONTROL,
>-					PM1Bcontrol);
>-	if (ACPI_FAILURE(status)) {
>-		return_ACPI_STATUS(status);
>-	}
>-
>-	if (sleep_state > ACPI_STATE_S3) {
>-		/*
>-		 * We wanted to sleep > S3, but it didn't 
>happen (by virtue of the
>-		 * fact that we are still executing!)
>-		 *
>-		 * Wait ten seconds, then try again. This is to 
>get S4/S5 to work on
>-		 * all machines.
>-		 *
>-		 * We wait so long to allow chipsets that poll 
>this reg very slowly to
>-		 * still read the right value. Ideally, this 
>block would go
>-		 * away entirely.
>-		 */
>-		acpi_os_stall(10000000);
>-
>-		status = 
>acpi_hw_register_write(ACPI_REGISTER_PM1_CONTROL,
>-						sleep_enable_reg_info->
>-						access_bit_mask);
>-		if (ACPI_FAILURE(status)) {
>-			return_ACPI_STATUS(status);
>-		}
>-	}
>-
>-	/* Wait until we enter sleep state */
>-
>-	do {
>-		status = 
>acpi_get_register_unlocked(ACPI_BITREG_WAKE_STATUS,
>-						    &in_value);
>-		if (ACPI_FAILURE(status)) {
>-			return_ACPI_STATUS(status);
>-		}
>-
>-		/* Spin until we wake */
>-
>-	} while (!in_value);
>-
>-	return_ACPI_STATUS(AE_OK);
>+	return arch_acpi_enter_sleep_state(sleep_state, 
>PM1Acontrol, PM1Bcontrol);
> }
> 
> ACPI_EXPORT_SYMBOL(acpi_enter_sleep_state)
>diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c
>index 5192666..bd1cc1a 100644
>--- a/drivers/acpi/sleep.c
>+++ b/drivers/acpi/sleep.c
>@@ -19,6 +19,8 @@
> 
> #include <asm/io.h>
> 
>+#include <xen/acpi.h>
>+
> #include <acpi/acpi_bus.h>
> #include <acpi/acpi_drivers.h>
> #include "sleep.h"
>@@ -209,6 +211,21 @@ static int 
>acpi_suspend_begin(suspend_state_t pm_state)
> 	return error;
> }
> 
>+static void do_suspend(void)
>+{
>+	if (!xen_pv_acpi()) {
>+		do_suspend_lowlevel();
>+		return;
>+	}
>+
>+	/*
>+	 * Xen will save and restore CPU context, so
>+	 * we can skip that and just go straight to
>+	 * the suspend.
>+	 */
>+	acpi_enter_sleep_state(ACPI_STATE_S3);
>+}
>+
> /**
>  *	acpi_suspend_enter - Actually enter a sleep state.
>  *	@pm_state: ignored
>@@ -242,7 +259,7 @@ static int 
>acpi_suspend_enter(suspend_state_t pm_state)
> 		break;
> 
> 	case ACPI_STATE_S3:
>-		do_suspend_lowlevel();
>+		do_suspend();
> 		break;
> 	}
> 
>diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
>index 51cbaa5..0138113 100644
>--- a/drivers/xen/Kconfig
>+++ b/drivers/xen/Kconfig
>@@ -76,3 +76,7 @@ config XEN_COMPAT_XENFS
> 
> config XEN_XENBUS_FRONTEND
>        tristate
>+
>+config XEN_S3
>+       def_bool y
>+       depends on XEN_DOM0 && ACPI
>\ No newline at end of file
>diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
>index bb88673..4b01fc8 100644
>--- a/drivers/xen/Makefile
>+++ b/drivers/xen/Makefile
>@@ -7,4 +7,6 @@ obj-$(CONFIG_XEN_BALLOON)		+= balloon.o
> obj-$(CONFIG_XEN_DEV_EVTCHN)	+= evtchn.o
> obj-$(CONFIG_XEN_BLKDEV_BACKEND)	+= blkback/
> obj-$(CONFIG_XEN_NETDEV_BACKEND)	+= netback/
>-obj-$(CONFIG_XENFS)			+= xenfs/
>\ No newline at end of file
>+obj-$(CONFIG_XENFS)			+= xenfs/
>+
>+obj-$(CONFIG_XEN_S3)			+= acpi.o
>\ No newline at end of file
>diff --git a/drivers/xen/acpi.c b/drivers/xen/acpi.c
>new file mode 100644
>index 0000000..e6d3d0e
>--- /dev/null
>+++ b/drivers/xen/acpi.c
>@@ -0,0 +1,23 @@
>+#include <xen/acpi.h>
>+
>+#include <xen/interface/platform.h>
>+#include <asm/xen/hypercall.h>
>+#include <asm/xen/hypervisor.h>
>+
>+int acpi_notify_hypervisor_state(u8 sleep_state,
>+				 u32 pm1a_cnt, u32 pm1b_cnt)
>+{
>+	struct xen_platform_op op = {
>+		.cmd = XENPF_enter_acpi_sleep,
>+		.interface_version = XENPF_INTERFACE_VERSION,
>+		.u = {
>+			.enter_acpi_sleep = {
>+				.pm1a_cnt_val = (u16)pm1a_cnt,
>+				.pm1b_cnt_val = (u16)pm1b_cnt,
>+				.sleep_state = sleep_state,
>+			},
>+		},
>+	};
>+
>+	return HYPERVISOR_dom0_op(&op);
>+}
>diff --git a/include/acpi/acpixf.h b/include/acpi/acpixf.h
>index cc40102..f39f396 100644
>--- a/include/acpi/acpixf.h
>+++ b/include/acpi/acpixf.h
>@@ -372,6 +372,12 @@ acpi_status 
>acpi_enter_sleep_state_prep(u8 sleep_state);
> 
> acpi_status asmlinkage acpi_enter_sleep_state(u8 sleep_state);
> 
>+acpi_status __weak arch_acpi_enter_sleep_state(u8 sleep_state,
>+					       u32 PM1Acontrol, 
>u32 PM1Bcontrol);
>+
>+acpi_status default_acpi_enter_sleep_state(u8 sleep_state,
>+					   u32 PM1Acontrol, u32 
>PM1Bcontrol);
>+
> acpi_status asmlinkage acpi_enter_sleep_state_s4bios(void);
> 
> acpi_status acpi_leave_sleep_state_prep(u8 sleep_state);
>diff --git a/include/xen/acpi.h b/include/xen/acpi.h
>new file mode 100644
>index 0000000..fea4cfb
>--- /dev/null
>+++ b/include/xen/acpi.h
>@@ -0,0 +1,23 @@
>+#ifndef _XEN_ACPI_H
>+#define _XEN_ACPI_H
>+
>+#include <linux/types.h>
>+
>+#ifdef CONFIG_XEN_S3
>+#include <asm/xen/hypervisor.h>
>+
>+static inline bool xen_pv_acpi(void)
>+{
>+	return xen_pv_domain();
>+}
>+#else
>+static inline bool xen_pv_acpi(void)
>+{
>+	return false;
>+}
>+#endif
>+
>+int acpi_notify_hypervisor_state(u8 sleep_state,
>+				 u32 pm1a_cnt, u32 pm1b_cnd);
>+
>+#endif	/* _XEN_ACPI_H */
>
>
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Xen-devel] Re: Paravirtualizing bits of acpi access
  2009-03-24  5:33                   ` Tian, Kevin
@ 2009-03-24  5:42                     ` Jeremy Fitzhardinge
  2009-03-24  5:45                       ` Tian, Kevin
  0 siblings, 1 reply; 32+ messages in thread
From: Jeremy Fitzhardinge @ 2009-03-24  5:42 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Rafael J. Wysocki, Brown, Len, Xen-devel,
	the arch/x86 maintainers, linux-acpi@vger.kernel.org,
	Cihula, Joseph

Tian, Kevin wrote:
> The reason why those useless code are prevented, is in case of
> clobberring 0-1M area which if contains valid Xen content. in
> acpi_save_state_mem, dom0's wakeup code is copied to area
> allocated in acpi_reserve_bootmem. If every Xen's usage on 0-1M
> is based on copy-on-use style, such as trampoline code for AP
> boot, it's ok. But I'm not sure whether Xen puts some persistent
> content there. IIRC, that boot time allocation usually returns sth
> like 0x90000 since wakeup stub is first run in real mode. But if
> for dom0 alloc_bootmem_low never gives <1M page, then this
> prevention could be skipped as your thought.
>   

Yes, but the memory allocated is only in 0-1M in dom0's pseudo-physical 
space, not in Xen's machine space.  It allocates the memory and does a 
virt_to_phys on it, but there's no special effort to remap it as mfns or 
anything.  There should be no possible conflict with Xen's use of the 
real 0-1M region.

Not that I've actually tried to execute that patch yet, so I could well 
be overlooking something...

    J

>   
>> The result is below.  Does this look reasonable?
>>
>> Thanks,
>>    J
>>
>> diff --git a/arch/x86/kernel/acpi/sleep.c 
>> b/arch/x86/kernel/acpi/sleep.c
>> index 7c243a2..8ebda00 100644
>> --- a/arch/x86/kernel/acpi/sleep.c
>> +++ b/arch/x86/kernel/acpi/sleep.c
>> @@ -12,6 +12,9 @@
>> #include <asm/segment.h>
>> #include <asm/desc.h>
>>
>> +#include <xen/acpi.h>
>> +#include <asm/xen/hypervisor.h>
>> +
>> #include "realmode/wakeup.h"
>> #include "sleep.h"
>>
>> @@ -25,6 +28,34 @@ static unsigned long acpi_realmode;
>> static char temp_stack[4096];
>> #endif
>>
>> +/* 
>> + * Override final register write when entering sleep state so we can
>> + * direct it to a hypercall in the Xen case.
>> + */
>> +acpi_status arch_acpi_enter_sleep_state(u8 sleep_state, u32
>> +					PM1Acontrol, u32 PM1Bcontrol)
>> +{
>> +	acpi_status ret;
>> +
>> +	if (xen_pv_acpi()) {
>> +		int err;
>> +
>> +		err = acpi_notify_hypervisor_state(sleep_state,
>> +						   PM1Acontrol, 
>> PM1Bcontrol);
>> +
>> +		ret = AE_OK;
>> +		if (err) {
>> +			ACPI_DEBUG_PRINT((ACPI_DB_INIT,
>> +					  "Hypervisor failure 
>> [%d]\n", err));
>> +			ret = AE_ERROR;
>> +		}
>> +	} else
>> +		ret = default_acpi_enter_sleep_state(sleep_state,
>> +						     
>> PM1Acontrol, PM1Bcontrol);
>> +
>> +	return ret;
>> +}
>> +
>> /**
>>  * acpi_save_state_mem - save kernel state
>>  *
>> diff --git a/drivers/acpi/acpica/hwsleep.c 
>> b/drivers/acpi/acpica/hwsleep.c
>> index a2af2a4..6196a7e 100644
>> --- a/drivers/acpi/acpica/hwsleep.c
>> +++ b/drivers/acpi/acpica/hwsleep.c
>> @@ -209,6 +209,83 @@ acpi_status 
>> acpi_enter_sleep_state_prep(u8 sleep_state)
>>
>> ACPI_EXPORT_SYMBOL(acpi_enter_sleep_state_prep)
>>
>> +/*
>> + * Default implementation of final register write to enter sleep
>> + * state, available for architecture versions to call if necessary.
>> + */
>> +acpi_status default_acpi_enter_sleep_state(u8 sleep_state, u32
>> +					   PM1Acontrol, u32 PM1Bcontrol)
>> +{
>> +	acpi_status status;
>> +	u32 in_value;
>> +
>> +	status = acpi_hw_register_write(ACPI_REGISTER_PM1A_CONTROL,
>> +					PM1Acontrol);
>> +	if (ACPI_FAILURE(status)) {
>> +		return_ACPI_STATUS(status);
>> +	}
>> +
>> +	status = acpi_hw_register_write(ACPI_REGISTER_PM1B_CONTROL,
>> +					PM1Bcontrol);
>> +
>> +	if (ACPI_FAILURE(status)) {
>> +		return_ACPI_STATUS(status);
>> +	}
>> +
>> +	if (sleep_state > ACPI_STATE_S3) {
>> +		struct acpi_bit_register_info *sleep_enable_reg_info;
>> +
>> +		/*
>> +		 * We wanted to sleep > S3, but it didn't 
>> happen (by virtue of the
>> +		 * fact that we are still executing!)
>> +		 *
>> +		 * Wait ten seconds, then try again. This is to 
>> get S4/S5 to work on
>> +		 * all machines.
>> +		 *
>> +		 * We wait so long to allow chipsets that poll 
>> this reg very slowly to
>> +		 * still read the right value. Ideally, this 
>> block would go
>> +		 * away entirely.
>> +		 */
>> +		acpi_os_stall(10000000);
>> +
>> +		sleep_enable_reg_info =
>> +			
>> acpi_hw_get_bit_register_info(ACPI_BITREG_SLEEP_ENABLE);
>> +
>> +		status = 
>> acpi_hw_register_write(ACPI_REGISTER_PM1_CONTROL,
>> +						sleep_enable_reg_info->
>> +						access_bit_mask);
>> +		if (ACPI_FAILURE(status)) {
>> +			return_ACPI_STATUS(status);
>> +		}
>> +	}
>> +
>> +	/* Wait until we enter sleep state */
>> +
>> +	do {
>> +		status = 
>> acpi_get_register_unlocked(ACPI_BITREG_WAKE_STATUS,
>> +						    &in_value);
>> +		if (ACPI_FAILURE(status)) {
>> +			return_ACPI_STATUS(status);
>> +		}
>> +
>> +		/* Spin until we wake */
>> +
>> +	} while (!in_value);
>> +
>> +	return_ACPI_STATUS(AE_OK);
>> +}
>> +
>> +/* 
>> + * Weak version of final write to enter sleep state, so that
>> + * architectures can override it.
>> + */
>> +acpi_status __weak arch_acpi_enter_sleep_state(u8 sleep_state,
>> +					    u32 PM1Acontrol, 
>> u32 PM1Bcontrol)
>> +{
>> +	return default_acpi_enter_sleep_state(sleep_state,
>> +					      PM1Acontrol, PM1Bcontrol);
>> +}
>> +
>>
>> /**************************************************************
>> *****************
>>  *
>>  * FUNCTION:    acpi_enter_sleep_state
>> @@ -227,7 +304,6 @@ acpi_status asmlinkage 
>> acpi_enter_sleep_state(u8 sleep_state)
>> 	u32 PM1Bcontrol;
>> 	struct acpi_bit_register_info *sleep_type_reg_info;
>> 	struct acpi_bit_register_info *sleep_enable_reg_info;
>> -	u32 in_value;
>> 	struct acpi_object_list arg_list;
>> 	union acpi_object arg;
>> 	acpi_status status;
>> @@ -337,54 +413,7 @@ acpi_status asmlinkage 
>> acpi_enter_sleep_state(u8 sleep_state)
>>
>> 	ACPI_FLUSH_CPU_CACHE();
>>
>> -	status = acpi_hw_register_write(ACPI_REGISTER_PM1A_CONTROL,
>> -					PM1Acontrol);
>> -	if (ACPI_FAILURE(status)) {
>> -		return_ACPI_STATUS(status);
>> -	}
>> -
>> -	status = acpi_hw_register_write(ACPI_REGISTER_PM1B_CONTROL,
>> -					PM1Bcontrol);
>> -	if (ACPI_FAILURE(status)) {
>> -		return_ACPI_STATUS(status);
>> -	}
>> -
>> -	if (sleep_state > ACPI_STATE_S3) {
>> -		/*
>> -		 * We wanted to sleep > S3, but it didn't 
>> happen (by virtue of the
>> -		 * fact that we are still executing!)
>> -		 *
>> -		 * Wait ten seconds, then try again. This is to 
>> get S4/S5 to work on
>> -		 * all machines.
>> -		 *
>> -		 * We wait so long to allow chipsets that poll 
>> this reg very slowly to
>> -		 * still read the right value. Ideally, this 
>> block would go
>> -		 * away entirely.
>> -		 */
>> -		acpi_os_stall(10000000);
>> -
>> -		status = 
>> acpi_hw_register_write(ACPI_REGISTER_PM1_CONTROL,
>> -						sleep_enable_reg_info->
>> -						access_bit_mask);
>> -		if (ACPI_FAILURE(status)) {
>> -			return_ACPI_STATUS(status);
>> -		}
>> -	}
>> -
>> -	/* Wait until we enter sleep state */
>> -
>> -	do {
>> -		status = 
>> acpi_get_register_unlocked(ACPI_BITREG_WAKE_STATUS,
>> -						    &in_value);
>> -		if (ACPI_FAILURE(status)) {
>> -			return_ACPI_STATUS(status);
>> -		}
>> -
>> -		/* Spin until we wake */
>> -
>> -	} while (!in_value);
>> -
>> -	return_ACPI_STATUS(AE_OK);
>> +	return arch_acpi_enter_sleep_state(sleep_state, 
>> PM1Acontrol, PM1Bcontrol);
>> }
>>
>> ACPI_EXPORT_SYMBOL(acpi_enter_sleep_state)
>> diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c
>> index 5192666..bd1cc1a 100644
>> --- a/drivers/acpi/sleep.c
>> +++ b/drivers/acpi/sleep.c
>> @@ -19,6 +19,8 @@
>>
>> #include <asm/io.h>
>>
>> +#include <xen/acpi.h>
>> +
>> #include <acpi/acpi_bus.h>
>> #include <acpi/acpi_drivers.h>
>> #include "sleep.h"
>> @@ -209,6 +211,21 @@ static int 
>> acpi_suspend_begin(suspend_state_t pm_state)
>> 	return error;
>> }
>>
>> +static void do_suspend(void)
>> +{
>> +	if (!xen_pv_acpi()) {
>> +		do_suspend_lowlevel();
>> +		return;
>> +	}
>> +
>> +	/*
>> +	 * Xen will save and restore CPU context, so
>> +	 * we can skip that and just go straight to
>> +	 * the suspend.
>> +	 */
>> +	acpi_enter_sleep_state(ACPI_STATE_S3);
>> +}
>> +
>> /**
>>  *	acpi_suspend_enter - Actually enter a sleep state.
>>  *	@pm_state: ignored
>> @@ -242,7 +259,7 @@ static int 
>> acpi_suspend_enter(suspend_state_t pm_state)
>> 		break;
>>
>> 	case ACPI_STATE_S3:
>> -		do_suspend_lowlevel();
>> +		do_suspend();
>> 		break;
>> 	}
>>
>> diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
>> index 51cbaa5..0138113 100644
>> --- a/drivers/xen/Kconfig
>> +++ b/drivers/xen/Kconfig
>> @@ -76,3 +76,7 @@ config XEN_COMPAT_XENFS
>>
>> config XEN_XENBUS_FRONTEND
>>        tristate
>> +
>> +config XEN_S3
>> +       def_bool y
>> +       depends on XEN_DOM0 && ACPI
>> \ No newline at end of file
>> diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
>> index bb88673..4b01fc8 100644
>> --- a/drivers/xen/Makefile
>> +++ b/drivers/xen/Makefile
>> @@ -7,4 +7,6 @@ obj-$(CONFIG_XEN_BALLOON)		+= balloon.o
>> obj-$(CONFIG_XEN_DEV_EVTCHN)	+= evtchn.o
>> obj-$(CONFIG_XEN_BLKDEV_BACKEND)	+= blkback/
>> obj-$(CONFIG_XEN_NETDEV_BACKEND)	+= netback/
>> -obj-$(CONFIG_XENFS)			+= xenfs/
>> \ No newline at end of file
>> +obj-$(CONFIG_XENFS)			+= xenfs/
>> +
>> +obj-$(CONFIG_XEN_S3)			+= acpi.o
>> \ No newline at end of file
>> diff --git a/drivers/xen/acpi.c b/drivers/xen/acpi.c
>> new file mode 100644
>> index 0000000..e6d3d0e
>> --- /dev/null
>> +++ b/drivers/xen/acpi.c
>> @@ -0,0 +1,23 @@
>> +#include <xen/acpi.h>
>> +
>> +#include <xen/interface/platform.h>
>> +#include <asm/xen/hypercall.h>
>> +#include <asm/xen/hypervisor.h>
>> +
>> +int acpi_notify_hypervisor_state(u8 sleep_state,
>> +				 u32 pm1a_cnt, u32 pm1b_cnt)
>> +{
>> +	struct xen_platform_op op = {
>> +		.cmd = XENPF_enter_acpi_sleep,
>> +		.interface_version = XENPF_INTERFACE_VERSION,
>> +		.u = {
>> +			.enter_acpi_sleep = {
>> +				.pm1a_cnt_val = (u16)pm1a_cnt,
>> +				.pm1b_cnt_val = (u16)pm1b_cnt,
>> +				.sleep_state = sleep_state,
>> +			},
>> +		},
>> +	};
>> +
>> +	return HYPERVISOR_dom0_op(&op);
>> +}
>> diff --git a/include/acpi/acpixf.h b/include/acpi/acpixf.h
>> index cc40102..f39f396 100644
>> --- a/include/acpi/acpixf.h
>> +++ b/include/acpi/acpixf.h
>> @@ -372,6 +372,12 @@ acpi_status 
>> acpi_enter_sleep_state_prep(u8 sleep_state);
>>
>> acpi_status asmlinkage acpi_enter_sleep_state(u8 sleep_state);
>>
>> +acpi_status __weak arch_acpi_enter_sleep_state(u8 sleep_state,
>> +					       u32 PM1Acontrol, 
>> u32 PM1Bcontrol);
>> +
>> +acpi_status default_acpi_enter_sleep_state(u8 sleep_state,
>> +					   u32 PM1Acontrol, u32 
>> PM1Bcontrol);
>> +
>> acpi_status asmlinkage acpi_enter_sleep_state_s4bios(void);
>>
>> acpi_status acpi_leave_sleep_state_prep(u8 sleep_state);
>> diff --git a/include/xen/acpi.h b/include/xen/acpi.h
>> new file mode 100644
>> index 0000000..fea4cfb
>> --- /dev/null
>> +++ b/include/xen/acpi.h
>> @@ -0,0 +1,23 @@
>> +#ifndef _XEN_ACPI_H
>> +#define _XEN_ACPI_H
>> +
>> +#include <linux/types.h>
>> +
>> +#ifdef CONFIG_XEN_S3
>> +#include <asm/xen/hypervisor.h>
>> +
>> +static inline bool xen_pv_acpi(void)
>> +{
>> +	return xen_pv_domain();
>> +}
>> +#else
>> +static inline bool xen_pv_acpi(void)
>> +{
>> +	return false;
>> +}
>> +#endif
>> +
>> +int acpi_notify_hypervisor_state(u8 sleep_state,
>> +				 u32 pm1a_cnt, u32 pm1b_cnd);
>> +
>> +#endif	/* _XEN_ACPI_H */
>>
>>
>>     
> >


^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [Xen-devel] Re: Paravirtualizing bits of acpi access
  2009-03-24  5:42                     ` Jeremy Fitzhardinge
@ 2009-03-24  5:45                       ` Tian, Kevin
  2009-03-24  7:05                         ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 32+ messages in thread
From: Tian, Kevin @ 2009-03-24  5:45 UTC (permalink / raw)
  To: 'Jeremy Fitzhardinge'
  Cc: Rafael J. Wysocki, Brown, Len, Xen-devel,
	the arch/x86 maintainers, linux-acpi@vger.kernel.org,
	Cihula, Joseph

>From: Jeremy Fitzhardinge [mailto:jeremy@goop.org] 
>Sent: Tuesday, March 24, 2009 1:43 PM
>
>Tian, Kevin wrote:
>> The reason why those useless code are prevented, is in case of
>> clobberring 0-1M area which if contains valid Xen content. in
>> acpi_save_state_mem, dom0's wakeup code is copied to area
>> allocated in acpi_reserve_bootmem. If every Xen's usage on 0-1M
>> is based on copy-on-use style, such as trampoline code for AP
>> boot, it's ok. But I'm not sure whether Xen puts some persistent
>> content there. IIRC, that boot time allocation usually returns sth
>> like 0x90000 since wakeup stub is first run in real mode. But if
>> for dom0 alloc_bootmem_low never gives <1M page, then this
>> prevention could be skipped as your thought.
>>   
>
>Yes, but the memory allocated is only in 0-1M in dom0's 
>pseudo-physical 
>space, not in Xen's machine space.  It allocates the memory and does a 
>virt_to_phys on it, but there's no special effort to remap it 
>as mfns or 
>anything.  There should be no possible conflict with Xen's use of the 
>real 0-1M region.

OK, then it's safe to avoid that change. I had thought that dom0's 0-1M
is identity-mapped to machine 0-1M... :-)

Thanks,
Kevin

>
>Not that I've actually tried to execute that patch yet, so I 
>could well 
>be overlooking something...
>
>    J
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Xen-devel] Re: Paravirtualizing bits of acpi access
  2009-03-24  5:45                       ` Tian, Kevin
@ 2009-03-24  7:05                         ` Jeremy Fitzhardinge
  2009-03-24 16:45                           ` Bjorn Helgaas
  0 siblings, 1 reply; 32+ messages in thread
From: Jeremy Fitzhardinge @ 2009-03-24  7:05 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Rafael J. Wysocki, Brown, Len, Xen-devel,
	the arch/x86 maintainers, linux-acpi@vger.kernel.org,
	Cihula, Joseph

Tian, Kevin wrote:
> OK, then it's safe to avoid that change. I had thought that dom0's 0-1M
> is identity-mapped to machine 0-1M... :-)

No, only the ISA 640k-1M region.

    J


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Xen-devel] Re: Paravirtualizing bits of acpi access
  2009-03-24  7:05                         ` Jeremy Fitzhardinge
@ 2009-03-24 16:45                           ` Bjorn Helgaas
  2009-03-24 17:28                             ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 32+ messages in thread
From: Bjorn Helgaas @ 2009-03-24 16:45 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Tian, Kevin, Rafael J. Wysocki, Brown, Len, Xen-devel,
	the arch/x86 maintainers, linux-acpi@vger.kernel.org,
	Cihula, Joseph

On Tuesday 24 March 2009 01:05:43 am Jeremy Fitzhardinge wrote:
> Tian, Kevin wrote:
> > OK, then it's safe to avoid that change. I had thought that dom0's 0-1M
> > is identity-mapped to machine 0-1M... :-)
> 
> No, only the ISA 640k-1M region.

I'm speaking out of turn here because I don't work on Xen or
suspend/resume.  However, I do try to clean up random bits of
ACPI, and I have to say this patch looks like a pain in the
maintenance department.  Having tests for a specific hypervisor
is unpleasant.  We don't want to end up with tests for a collection
of hypervisors.  It looks like suspend becomes a weird hybrid of
ACPI and Xen, which makes it harder to reason about future suspend
changes.  And all this discussion about 640k-1M and dom0 identity
mapping and "there's no special effort to remap it" and whether
there are conflicts makes me nervous.  There's no way all those
assumptions can be remembered or verified five years down the road.

Bjorn

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Re: Paravirtualizing bits of acpi access
  2009-03-24 16:45                           ` Bjorn Helgaas
@ 2009-03-24 17:28                             ` Jeremy Fitzhardinge
  2009-03-24 17:51                               ` [Xen-devel] " Cihula, Joseph
                                                 ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: Jeremy Fitzhardinge @ 2009-03-24 17:28 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Brown, Len, Tian, Kevin, Xen-devel, the arch/x86 maintainers,
	Rafael J. Wysocki, linux-acpi@vger.kernel.org, Cihula, Joseph

Bjorn Helgaas wrote:
> On Tuesday 24 March 2009 01:05:43 am Jeremy Fitzhardinge wrote:
>   
>> Tian, Kevin wrote:
>>     
>>> OK, then it's safe to avoid that change. I had thought that dom0's 0-1M
>>> is identity-mapped to machine 0-1M... :-)
>>>       
>> No, only the ISA 640k-1M region.
>>     
>
> I'm speaking out of turn here because I don't work on Xen or
> suspend/resume.  However, I do try to clean up random bits of
> ACPI, and I have to say this patch looks like a pain in the
> maintenance department.  Having tests for a specific hypervisor
> is unpleasant.  We don't want to end up with tests for a collection
> of hypervisors.

I agree.  If we start to see other hypervisor-specific changes in this 
area, we'd need to rethink this approach.  But I'm not inclined to add a 
layer of infrastructure to just deal with Xen. (IOW, abstract only when 
there's something to abstract.)

>   It looks like suspend becomes a weird hybrid of
> ACPI and Xen, which makes it harder to reason about future suspend
> changes.  And all this discussion about 640k-1M and dom0 identity
> mapping and "there's no special effort to remap it" and whether
> there are conflicts makes me nervous.  There's no way all those
> assumptions can be remembered or verified five years down the road.
>   

Well, I think Kevin was over-complicating things in his own mind.  The 
upshot is that the normal "running on bare metal" code can do its normal 
thing, and if we happen to be running under Xen we can make it a no-op.  
In other words, the acpi developers don't really need to worry about the 
"running under Xen case", for the most part.

The two changes this patch makes are, unfortunately, unavoidable:

   1. Turn the final "enter sleep" into a hypercall, so that Xen can do
      all the low-level context save/restore.  The normal baremetal case
      is still localized in acpica/hwsleep.c, but it (may) make an
      excursion into arch code to see if it should do something else, and
   2. Directly enter the sleep state, rather than save cpu context
      (which Xen does)

Though, come to think of it, perhaps there's no harm in letting the 
kernel do its own state-saving.  I'll check.

    J

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [Xen-devel] Re: Paravirtualizing bits of acpi access
  2009-03-24 17:28                             ` Jeremy Fitzhardinge
@ 2009-03-24 17:51                               ` Cihula, Joseph
  2009-03-27 21:57                                 ` Len Brown
  2009-03-24 23:40                               ` Tian, Kevin
  2009-03-24 23:51                               ` Tian, Kevin
  2 siblings, 1 reply; 32+ messages in thread
From: Cihula, Joseph @ 2009-03-24 17:51 UTC (permalink / raw)
  To: Jeremy Fitzhardinge, Bjorn Helgaas
  Cc: Tian, Kevin, Rafael J. Wysocki, Brown, Len, Xen-devel,
	the arch/x86 maintainers, linux-acpi@vger.kernel.org, Wang, Shane

> From: Jeremy Fitzhardinge [mailto:jeremy@goop.org]
> Sent: Tuesday, March 24, 2009 10:28 AM
>
> Bjorn Helgaas wrote:
> > On Tuesday 24 March 2009 01:05:43 am Jeremy Fitzhardinge wrote:
> >
> >> Tian, Kevin wrote:
> >>
> >>> OK, then it's safe to avoid that change. I had thought that dom0's 0-1M
> >>> is identity-mapped to machine 0-1M... :-)
> >>>
> >> No, only the ISA 640k-1M region.
> >>
> >
> > I'm speaking out of turn here because I don't work on Xen or
> > suspend/resume.  However, I do try to clean up random bits of
> > ACPI, and I have to say this patch looks like a pain in the
> > maintenance department.  Having tests for a specific hypervisor
> > is unpleasant.  We don't want to end up with tests for a collection
> > of hypervisors.
>
> I agree.  If we start to see other hypervisor-specific changes in this
> area, we'd need to rethink this approach.  But I'm not inclined to add a
> layer of infrastructure to just deal with Xen. (IOW, abstract only when
> there's something to abstract.)
>
> >   It looks like suspend becomes a weird hybrid of
> > ACPI and Xen, which makes it harder to reason about future suspend
> > changes.  And all this discussion about 640k-1M and dom0 identity
> > mapping and "there's no special effort to remap it" and whether
> > there are conflicts makes me nervous.  There's no way all those
> > assumptions can be remembered or verified five years down the road.
> >
>
> Well, I think Kevin was over-complicating things in his own mind.  The
> upshot is that the normal "running on bare metal" code can do its normal
> thing, and if we happen to be running under Xen we can make it a no-op.
> In other words, the acpi developers don't really need to worry about the
> "running under Xen case", for the most part.
>
> The two changes this patch makes are, unfortunately, unavoidable:
>
>    1. Turn the final "enter sleep" into a hypercall, so that Xen can do
>       all the low-level context save/restore.  The normal baremetal case
>       is still localized in acpica/hwsleep.c, but it (may) make an
>       excursion into arch code to see if it should do something else, and
>    2. Directly enter the sleep state, rather than save cpu context
>       (which Xen does)
>
> Though, come to think of it, perhaps there's no harm in letting the
> kernel do its own state-saving.  I'll check.
>
>     J

The Intel(R) TXT patch that I'm getting ready to post seems like it is logically very similar to Xen in its handling of shutdown (from the perspective of the kernel).  In the TXT case, the kernel performs the various HW and kernel preparation but the final platform entry into Sx is done by the tboot code (after some other work).  Below are the relevant changes from the TXT patch, where tboot_sleep() just translates the ACPI sleep param to a tboot-specific value and then calls tboot_shutdown(), and tboot_shutdown() simply calls into the tboot code to perform the actual platform sleep.

diff -r 855cb34ca992 arch/x86/kernel/reboot.c
--- a/arch/x86/kernel/reboot.c  Tue Mar 17 19:53:17 2009 -0400
+++ b/arch/x86/kernel/reboot.c  Tue Mar 24 09:37:22 2009 -0700
@@ -22,6 +22,8 @@
 #else
 # include <asm/iommu.h>
 #endif
+
+#include <asm/tboot.h>

 #include <mach_ipi.h>

@@ -436,6 +438,8 @@ static void native_machine_emergency_res
        if (reboot_emergency)
                emergency_vmx_disable_all();

+       tboot_shutdown(TB_SHUTDOWN_REBOOT);
+
        /* Tell the BIOS if we want cold or warm reboot */
        *((unsigned short *)__va(0x472)) = reboot_mode;

@@ -501,11 +505,19 @@ static void native_machine_emergency_res

 void native_machine_shutdown(void)
 {
+#ifdef CONFIG_SMP
+       /* The boot cpu is always logical cpu 0 */
+       int reboot_cpu_id = 0;
+#endif
+
+       /* TXT requires VMX to be off for all shutdowns */
+       if (tboot_in_measured_env()) {
+               emergency_vmx_disable_all();
+               local_irq_enable();
+       }
+
        /* Stop the cpus and apics */
 #ifdef CONFIG_SMP
-
-       /* The boot cpu is always logical cpu 0 */
-       int reboot_cpu_id = 0;

 #ifdef CONFIG_X86_32
        /* See if there has been given a command line override */
@@ -562,6 +574,8 @@ static void native_machine_halt(void)
        /* stop other cpus and apics */
        machine_shutdown();

+       tboot_shutdown(TB_SHUTDOWN_HALT);
+
        /* stop this cpu */
        stop_this_cpu(NULL);
 }
@@ -573,6 +587,8 @@ static void native_machine_power_off(voi
                        machine_shutdown();
                pm_power_off();
        }
+       /* a fallback in case there is no PM info available */
+       tboot_shutdown(TB_SHUTDOWN_HALT);
 }

 struct machine_ops machine_ops = {
diff -r 855cb34ca992 arch/x86/kernel/smpboot.c
--- a/arch/x86/kernel/smpboot.c Tue Mar 17 19:53:17 2009 -0400
+++ b/arch/x86/kernel/smpboot.c Tue Mar 24 09:37:22 2009 -0700
@@ -63,6 +63,7 @@
 #include <asm/vmi.h>
 #include <asm/genapic.h>
 #include <asm/setup.h>
+#include <asm/tboot.h>
 #include <linux/mc146818rtc.h>

 #include <mach_apic.h>
@@ -1436,7 +1437,10 @@ void native_play_dead(void)
 void native_play_dead(void)
 {
        play_dead_common();
-       wbinvd_halt();
+       if (tboot_in_measured_env())
+               tboot_shutdown(TB_SHUTDOWN_WFS);
+       else
+               wbinvd_halt();
 }

 #else /* ... !CONFIG_HOTPLUG_CPU */
diff -r 855cb34ca992 drivers/acpi/acpica/hwsleep.c
--- a/drivers/acpi/acpica/hwsleep.c     Tue Mar 17 19:53:17 2009 -0400
+++ b/drivers/acpi/acpica/hwsleep.c     Tue Mar 24 09:37:22 2009 -0700
@@ -45,6 +45,7 @@
 #include <acpi/acpi.h>
 #include "accommon.h"
 #include "actables.h"
+#include <asm/tboot.h>

 #define _COMPONENT          ACPI_HARDWARE
 ACPI_MODULE_NAME("hwsleep")
@@ -332,6 +333,39 @@ acpi_status asmlinkage acpi_enter_sleep_

        PM1Acontrol |= sleep_enable_reg_info->access_bit_mask;
        PM1Bcontrol |= sleep_enable_reg_info->access_bit_mask;
+
+#ifdef CONFIG_TXT
+#define TB_COPY_GAS(tbg, g)                 \
+       tbg.space_id = g.space_id;          \
+       tbg.bit_width = g.bit_width;        \
+       tbg.bit_offset = g.bit_offset;      \
+       tbg.access_width = g.access_width;  \
+       tbg.address = g.address;
+
+       if (tboot_in_measured_env()) {
+               TB_COPY_GAS(tboot_shared->acpi_sinfo.pm1a_cnt_blk,
+                           acpi_gbl_FADT.xpm1a_control_block);
+               TB_COPY_GAS(tboot_shared->acpi_sinfo.pm1b_cnt_blk,
+                           acpi_gbl_FADT.xpm1b_control_block);
+               TB_COPY_GAS(tboot_shared->acpi_sinfo.pm1a_evt_blk,
+                           acpi_gbl_FADT.xpm1a_event_block);
+               TB_COPY_GAS(tboot_shared->acpi_sinfo.pm1b_evt_blk,
+                           acpi_gbl_FADT.xpm1b_event_block);
+               tboot_shared->acpi_sinfo.pm1a_cnt_val = PM1Acontrol;
+               tboot_shared->acpi_sinfo.pm1b_cnt_val = PM1Bcontrol;
+               /* we need phys addr of waking vector, but can't use
+                  virt_to_phys() on &acpi_gbl_FACS because it is ioremap'ed,
+                  so calc from FACS phys addr */
+               tboot_shared->acpi_sinfo.wakeup_vector = acpi_gbl_FADT.facs +
+                       ((void *)&acpi_gbl_FACS->firmware_waking_vector -
+                        (void *)acpi_gbl_FACS);
+               tboot_shared->acpi_sinfo.vector_width = 32;
+               tboot_shared->acpi_sinfo.kernel_s3_resume_vector =
+                               acpi_wakeup_address;
+
+               tboot_sleep(sleep_state);
+       }
+#endif

        /* Write #2: SLP_TYP + SLP_EN */



^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [Xen-devel] Re: Paravirtualizing bits of acpi access
  2009-03-24 17:28                             ` Jeremy Fitzhardinge
  2009-03-24 17:51                               ` [Xen-devel] " Cihula, Joseph
@ 2009-03-24 23:40                               ` Tian, Kevin
  2009-03-24 23:51                               ` Tian, Kevin
  2 siblings, 0 replies; 32+ messages in thread
From: Tian, Kevin @ 2009-03-24 23:40 UTC (permalink / raw)
  To: 'Jeremy Fitzhardinge', Bjorn Helgaas
  Cc: Rafael J. Wysocki, Brown, Len, Xen-devel,
	the arch/x86 maintainers, linux-acpi@vger.kernel.org,
	Cihula, Joseph

>From: Jeremy Fitzhardinge [mailto:jeremy@goop.org] 
>Sent: Wednesday, March 25, 2009 1:28 AM
>
>
>>   It looks like suspend becomes a weird hybrid of
>> ACPI and Xen, which makes it harder to reason about future suspend
>> changes.  And all this discussion about 640k-1M and dom0 identity
>> mapping and "there's no special effort to remap it" and whether
>> there are conflicts makes me nervous.  There's no way all those
>> assumptions can be remembered or verified five years down the road.
>>   
>
>Well, I think Kevin was over-complicating things in his own mind.  The 
>upshot is that the normal "running on bare metal" code can do 
>its normal 
>thing, and if we happen to be running under Xen we can make it 
>a no-op.  
>In other words, the acpi developers don't really need to worry 
>about the 
>"running under Xen case", for the most part.

Yes, I'm just trying to think about corner case which is however
not true per Jeremy's expanation. There's nothing to affect bare
metal running. :-)

Thanks
Kevin

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [Xen-devel] Re: Paravirtualizing bits of acpi access
  2009-03-24 17:28                             ` Jeremy Fitzhardinge
  2009-03-24 17:51                               ` [Xen-devel] " Cihula, Joseph
  2009-03-24 23:40                               ` Tian, Kevin
@ 2009-03-24 23:51                               ` Tian, Kevin
  2009-03-25  0:45                                 ` Jeremy Fitzhardinge
  2 siblings, 1 reply; 32+ messages in thread
From: Tian, Kevin @ 2009-03-24 23:51 UTC (permalink / raw)
  To: 'Jeremy Fitzhardinge', Bjorn Helgaas
  Cc: Rafael J. Wysocki, Brown, Len, Xen-devel,
	the arch/x86 maintainers, linux-acpi@vger.kernel.org,
	Cihula, Joseph

>From: Jeremy Fitzhardinge [mailto:jeremy@goop.org] 
>Sent: Wednesday, March 25, 2009 1:28 AM
>
>Bjorn Helgaas wrote:
>> On Tuesday 24 March 2009 01:05:43 am Jeremy Fitzhardinge wrote:
>>   
>>> Tian, Kevin wrote:
>>>     
>>>> OK, then it's safe to avoid that change. I had thought 
>that dom0's 0-1M
>>>> is identity-mapped to machine 0-1M... :-)
>>>>       
>>> No, only the ISA 640k-1M region.
>>>     
>>
>> I'm speaking out of turn here because I don't work on Xen or
>> suspend/resume.  However, I do try to clean up random bits of
>> ACPI, and I have to say this patch looks like a pain in the
>> maintenance department.  Having tests for a specific hypervisor
>> is unpleasant.  We don't want to end up with tests for a collection
>> of hypervisors.
>
>I agree.  If we start to see other hypervisor-specific changes in this 
>area, we'd need to rethink this approach.  But I'm not 
>inclined to add a 
>layer of infrastructure to just deal with Xen. (IOW, abstract 
>only when 
>there's something to abstract.)
>
>>   It looks like suspend becomes a weird hybrid of
>> ACPI and Xen, which makes it harder to reason about future suspend
>> changes.  And all this discussion about 640k-1M and dom0 identity
>> mapping and "there's no special effort to remap it" and whether
>> there are conflicts makes me nervous.  There's no way all those
>> assumptions can be remembered or verified five years down the road.
>>   
>
>Well, I think Kevin was over-complicating things in his own mind.  The 
>upshot is that the normal "running on bare metal" code can do 
>its normal 
>thing, and if we happen to be running under Xen we can make it 
>a no-op.  
>In other words, the acpi developers don't really need to worry 
>about the 
>"running under Xen case", for the most part.
>
>The two changes this patch makes are, unfortunately, unavoidable:
>
>   1. Turn the final "enter sleep" into a hypercall, so that Xen can do
>      all the low-level context save/restore.  The normal 
>baremetal case
>      is still localized in acpica/hwsleep.c, but it (may) make an
>      excursion into arch code to see if it should do 
>something else, and
>   2. Directly enter the sleep state, rather than save cpu context
>      (which Xen does)
>
>Though, come to think of it, perhaps there's no harm in letting the 
>kernel do its own state-saving.  I'll check.
>

Well, I guess it's doable, since do_suspend_lowlevel also needs
to restore processor context upon S3 failure (function return from
acpi_enter_sleep_state instead of from wakeup stub). Then only
1st change remains which is the minimal change same to what
TXT S3 requires.

Thanks,
Kevin

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Re: Paravirtualizing bits of acpi access
  2009-03-24 23:51                               ` Tian, Kevin
@ 2009-03-25  0:45                                 ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 32+ messages in thread
From: Jeremy Fitzhardinge @ 2009-03-25  0:45 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Brown, Len, Xen-devel, Cihula, Joseph, the arch/x86 maintainers,
	Rafael J. Wysocki, linux-acpi@vger.kernel.org, Bjorn Helgaas

Tian, Kevin wrote:
>> Though, come to think of it, perhaps there's no harm in letting the 
>> kernel do its own state-saving.  I'll check.
>>
>>     
>
> Well, I guess it's doable, since do_suspend_lowlevel also needs
> to restore processor context upon S3 failure (function return from
> acpi_enter_sleep_state instead of from wakeup stub). 

 From a quick look, it seems that all the instructions in the restore 
code are ones that Xen will trap and emulate; but with this kind of 
thing, its all in the testing...

    J

^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [Xen-devel] Re: Paravirtualizing bits of acpi access
  2009-03-24 17:51                               ` [Xen-devel] " Cihula, Joseph
@ 2009-03-27 21:57                                 ` Len Brown
  2009-03-27 23:20                                   ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 32+ messages in thread
From: Len Brown @ 2009-03-27 21:57 UTC (permalink / raw)
  To: Cihula, Joseph
  Cc: Jeremy Fitzhardinge, Bjorn Helgaas, Tian, Kevin,
	Rafael J. Wysocki, Brown, Len, Xen-devel,
	the arch/x86 maintainers, linux-acpi@vger.kernel.org, Wang, Shane


> > The two changes this patch makes are, unfortunately, unavoidable:
> >
> >    1. Turn the final "enter sleep" into a hypercall, so that Xen can do
> >       all the low-level context save/restore.  The normal baremetal case
> >       is still localized in acpica/hwsleep.c, but it (may) make an
> >       excursion into arch code to see if it should do something else, and
> >    2. Directly enter the sleep state, rather than save cpu context
> >       (which Xen does)
> >
> > Though, come to think of it, perhaps there's no harm in letting the
> > kernel do its own state-saving.  I'll check.

Certainly the less different than bare metal you can be, the better.

The last thing we need is to complicate something that in Linux
has only been sort of working for 


> diff -r 855cb34ca992 drivers/acpi/acpica/hwsleep.c
> --- a/drivers/acpi/acpica/hwsleep.c     Tue Mar 17 19:53:17 2009 -0400
> +++ b/drivers/acpi/acpica/hwsleep.c     Tue Mar 24 09:37:22 2009 -0700
> @@ -45,6 +45,7 @@
>  #include <acpi/acpi.h>
>  #include "accommon.h"
>  #include "actables.h"
> +#include <asm/tboot.h>
> 
>  #define _COMPONENT          ACPI_HARDWARE
>  ACPI_MODULE_NAME("hwsleep")
> @@ -332,6 +333,39 @@ acpi_status asmlinkage acpi_enter_sleep_
> 
>         PM1Acontrol |= sleep_enable_reg_info->access_bit_mask;
>         PM1Bcontrol |= sleep_enable_reg_info->access_bit_mask;
> +
> +#ifdef CONFIG_TXT
> +#define TB_COPY_GAS(tbg, g)                 \
> +       tbg.space_id = g.space_id;          \
> +       tbg.bit_width = g.bit_width;        \
> +       tbg.bit_offset = g.bit_offset;      \
> +       tbg.access_width = g.access_width;  \
> +       tbg.address = g.address;
> +
> +       if (tboot_in_measured_env()) {
> +               TB_COPY_GAS(tboot_shared->acpi_sinfo.pm1a_cnt_blk,
> +                           acpi_gbl_FADT.xpm1a_control_block);
> +               TB_COPY_GAS(tboot_shared->acpi_sinfo.pm1b_cnt_blk,
> +                           acpi_gbl_FADT.xpm1b_control_block);

Who'd a thunk that suddently everybody would want to scribble
on acpi_enter_sleep_state()?

Note that acpica/hwsleep.c is a file from ACPICA that we share
with BSD etc.  Yes, we manage local changes in Linux, but we
try to reduce them to zero over time, else we create a big
maintenace headache.

perhaps tboot_in_measured_env() could compile in as 0
for !CONFIG_TXT and you can get rid of the #ifdefs?

Jeremy, I'm not excited about a proposed change to acpixf.h --
this is the API to ACPICA...

thanks,
-Len Brown, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Xen-devel] Re: Paravirtualizing bits of acpi access
  2009-03-27 21:57                                 ` Len Brown
@ 2009-03-27 23:20                                   ` Jeremy Fitzhardinge
  2009-03-28  1:01                                     ` Len Brown
  0 siblings, 1 reply; 32+ messages in thread
From: Jeremy Fitzhardinge @ 2009-03-27 23:20 UTC (permalink / raw)
  To: Len Brown
  Cc: Cihula, Joseph, Bjorn Helgaas, Tian, Kevin, Rafael J. Wysocki,
	Brown, Len, Xen-devel, the arch/x86 maintainers,
	linux-acpi@vger.kernel.org, Wang, Shane

Len Brown wrote:
>> diff -r 855cb34ca992 drivers/acpi/acpica/hwsleep.c
>> --- a/drivers/acpi/acpica/hwsleep.c     Tue Mar 17 19:53:17 2009 -0400
>> +++ b/drivers/acpi/acpica/hwsleep.c     Tue Mar 24 09:37:22 2009 -0700
>> @@ -45,6 +45,7 @@
>>  #include <acpi/acpi.h>
>>  #include "accommon.h"
>>  #include "actables.h"
>> +#include <asm/tboot.h>
>>
>>  #define _COMPONENT          ACPI_HARDWARE
>>  ACPI_MODULE_NAME("hwsleep")
>> @@ -332,6 +333,39 @@ acpi_status asmlinkage acpi_enter_sleep_
>>
>>         PM1Acontrol |= sleep_enable_reg_info->access_bit_mask;
>>         PM1Bcontrol |= sleep_enable_reg_info->access_bit_mask;
>> +
>> +#ifdef CONFIG_TXT
>> +#define TB_COPY_GAS(tbg, g)                 \
>> +       tbg.space_id = g.space_id;          \
>> +       tbg.bit_width = g.bit_width;        \
>> +       tbg.bit_offset = g.bit_offset;      \
>> +       tbg.access_width = g.access_width;  \
>> +       tbg.address = g.address;
>> +
>> +       if (tboot_in_measured_env()) {
>> +               TB_COPY_GAS(tboot_shared->acpi_sinfo.pm1a_cnt_blk,
>> +                           acpi_gbl_FADT.xpm1a_control_block);
>> +               TB_COPY_GAS(tboot_shared->acpi_sinfo.pm1b_cnt_blk,
>> +                           acpi_gbl_FADT.xpm1b_control_block);
>>     
>
> Who'd a thunk that suddently everybody would want to scribble
> on acpi_enter_sleep_state()?
>
> Note that acpica/hwsleep.c is a file from ACPICA that we share
> with BSD etc.  Yes, we manage local changes in Linux, but we
> try to reduce them to zero over time, else we create a big
> maintenace headache.
>
> perhaps tboot_in_measured_env() could compile in as 0
> for !CONFIG_TXT and you can get rid of the #ifdefs?
>
> Jeremy, I'm not excited about a proposed change to acpixf.h --
> this is the API to ACPICA...
>   
Do you have an issue with the mechanism (using weak function, etc), or 
just the placement of the prototypes in that header?  Would there be a 
better header to put them in?  Or would you prefer some other mechanism?

It certainly seems like Xen and tboot should be able to share the same 
hook, given that they're doing similar things for similar reasons.

(I don't really understand the structure of all the acpi stuff; I'm just 
wading in and making a mess of things until I can close the lid of 
laptop successfully.)

    J

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Xen-devel] Re: Paravirtualizing bits of acpi access
  2009-03-27 23:20                                   ` Jeremy Fitzhardinge
@ 2009-03-28  1:01                                     ` Len Brown
  2009-03-28  2:19                                       ` Tian, Kevin
  2009-03-28  3:19                                       ` Jeremy Fitzhardinge
  0 siblings, 2 replies; 32+ messages in thread
From: Len Brown @ 2009-03-28  1:01 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Cihula, Joseph, Bjorn Helgaas, Tian, Kevin, Rafael J. Wysocki,
	Brown, Len, Xen-devel, the arch/x86 maintainers,
	linux-acpi@vger.kernel.org, Wang, Shane


> > Jeremy, I'm not excited about a proposed change to acpixf.h --
> > this is the API to ACPICA...
> >   
> Do you have an issue with the mechanism (using weak function, etc), or just
> the placement of the prototypes in that header?  Would there be a better
> header to put them in?  Or would you prefer some other mechanism?
> 
> It certainly seems like Xen and tboot should be able to share the same hook,
> given that they're doing similar things for similar reasons.
> 
> (I don't really understand the structure of all the acpi stuff; I'm just
> wading in and making a mess of things until I can close the lid of laptop
> successfully.)

Everything in acpi/acpica/ is source code that we share with BSD
via the ACPICA project http://acpica.org/

ACPICA also supplies a couple of the headers outside that directory,
eg. acpixf.h, which is the API between the kernel and ACPICA.

We can, and do, change that API when it makes sense.
However, we want to think carefully before changing it,
for we either cause Linux to diverge, or we have to sell
the same change to several other operating systems.
So ideally we wouuld need to make no Linux-specific, or Xen-specific
changes in that directory.

One possibility is to have this called via
function pointer from ASM and scribble over the default
function pointer for the Xen case.  In that case, the Xen
version of the routine would live someplace other than acpi/acpica/ -
someplace with the word xen in the path.  If using _weak can effectively
do that at link time, then fine, if we can do it w/o changing the API --
a _weak annotation is an easy ACPICA/Linux divergencen to manage.

I don't know if Xen and TXT are exclusive, or if we'd ever need
to handle both cases at the same time.  I guess that will influence
if the TXT and Xen use the same approach or something different.

thanks,
Len Brown, Intel Open Source Technology Center


^ permalink raw reply	[flat|nested] 32+ messages in thread

* RE: [Xen-devel] Re: Paravirtualizing bits of acpi access
  2009-03-28  1:01                                     ` Len Brown
@ 2009-03-28  2:19                                       ` Tian, Kevin
  2009-03-28  3:19                                       ` Jeremy Fitzhardinge
  1 sibling, 0 replies; 32+ messages in thread
From: Tian, Kevin @ 2009-03-28  2:19 UTC (permalink / raw)
  To: Len Brown, Jeremy Fitzhardinge
  Cc: Cihula, Joseph, Bjorn Helgaas, Rafael J. Wysocki, Brown, Len,
	Xen-devel, the arch/x86 maintainers, linux-acpi@vger.kernel.org,
	Wang, Shane

>From: Len Brown [mailto:lenb@kernel.org] 
>Sent: 2009年3月28日 9:02
>> > Jeremy, I'm not excited about a proposed change to acpixf.h --
>> > this is the API to ACPICA...
>> >   
>> Do you have an issue with the mechanism (using weak 
>function, etc), or just
>> the placement of the prototypes in that header?  Would there 
>be a better
>> header to put them in?  Or would you prefer some other mechanism?
>> 
>> It certainly seems like Xen and tboot should be able to 
>share the same hook,
>> given that they're doing similar things for similar reasons.
>> 
>> (I don't really understand the structure of all the acpi 
>stuff; I'm just
>> wading in and making a mess of things until I can close the 
>lid of laptop
>> successfully.)
>
>Everything in acpi/acpica/ is source code that we share with BSD
>via the ACPICA project http://acpica.org/
>
>ACPICA also supplies a couple of the headers outside that directory,
>eg. acpixf.h, which is the API between the kernel and ACPICA.
>
>We can, and do, change that API when it makes sense.
>However, we want to think carefully before changing it,
>for we either cause Linux to diverge, or we have to sell
>the same change to several other operating systems.
>So ideally we wouuld need to make no Linux-specific, or Xen-specific
>changes in that directory.
>
>One possibility is to have this called via
>function pointer from ASM and scribble over the default
>function pointer for the Xen case.  In that case, the Xen
>version of the routine would live someplace other than acpi/acpica/ -
>someplace with the word xen in the path.  If using _weak can 
>effectively
>do that at link time, then fine, if we can do it w/o changing 
>the API --
>a _weak annotation is an easy ACPICA/Linux divergencen to manage.
>
>I don't know if Xen and TXT are exclusive, or if we'd ever need
>to handle both cases at the same time.  I guess that will influence
>if the TXT and Xen use the same approach or something different.
>

When only Xen exists, S3 flow is:
dom0 S3 -> Xen S3

When only TXT is enabled, it's:
dom0 S3 -> TXT S3

When both Xen and TXT exist, TXT is not exposed to dom0 and thus
the S3 flow is:
dom0 S3 -> Xen S3 -> TXT S3

I.e, dom0 only needs to care one case at given time. Transition to
TXT is only required if system software is the lowest level on top of
hardware.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Xen-devel] Re: Paravirtualizing bits of acpi access
  2009-03-28  1:01                                     ` Len Brown
  2009-03-28  2:19                                       ` Tian, Kevin
@ 2009-03-28  3:19                                       ` Jeremy Fitzhardinge
  2009-03-28 13:56                                         ` Rafael J. Wysocki
  1 sibling, 1 reply; 32+ messages in thread
From: Jeremy Fitzhardinge @ 2009-03-28  3:19 UTC (permalink / raw)
  To: Len Brown
  Cc: Cihula, Joseph, Bjorn Helgaas, Tian, Kevin, Rafael J. Wysocki,
	Brown, Len, Xen-devel, the arch/x86 maintainers,
	linux-acpi@vger.kernel.org, Wang, Shane

Len Brown wrote:
>>> Jeremy, I'm not excited about a proposed change to acpixf.h --
>>> this is the API to ACPICA...
>>>   
>>>       
>> Do you have an issue with the mechanism (using weak function, etc), or just
>> the placement of the prototypes in that header?  Would there be a better
>> header to put them in?  Or would you prefer some other mechanism?
>>
>> It certainly seems like Xen and tboot should be able to share the same hook,
>> given that they're doing similar things for similar reasons.
>>
>> (I don't really understand the structure of all the acpi stuff; I'm just
>> wading in and making a mess of things until I can close the lid of laptop
>> successfully.)
>>     
>
> Everything in acpi/acpica/ is source code that we share with BSD
> via the ACPICA project http://acpica.org/
>
> ACPICA also supplies a couple of the headers outside that directory,
> eg. acpixf.h, which is the API between the kernel and ACPICA.
>
> We can, and do, change that API when it makes sense.
> However, we want to think carefully before changing it,
> for we either cause Linux to diverge, or we have to sell
> the same change to several other operating systems.
> So ideally we wouuld need to make no Linux-specific, or Xen-specific
> changes in that directory.
>
> One possibility is to have this called via
> function pointer from ASM and scribble over the default
> function pointer for the Xen case.  In that case, the Xen
> version of the routine would live someplace other than acpi/acpica/ -
> someplace with the word xen in the path.

Yes, that would be easy enough to do; we could overwrite it only when 
actually running under Xen.

However, we don't need to replace the whole of acpi_enter_sleep_state(); 
there are two options: we could duplicate the early part of the function 
in the Xen version of it, or break just the differing part out via 
function pointer.  The former has the disadvantage of duplicating code, 
but it does allow the same function pointer to be used by the tboot version.

>   If using _weak can effectively
> do that at link time, then fine, if we can do it w/o changing the API --
> a _weak annotation is an easy ACPICA/Linux divergencen to manage.
>   

The weak approach is what my proposed patch does:

    * the default native-hardware version of the sleep-entry register
      writes is broken out into default_acpi_enter_sleep_state()
    * it introduces a weak arch_acpi_enter_sleep_state() which just
      calls the default case, leaving the normal function unchanged
    * in arch/x86/kernel/acpi/sleep.c, it adds an override
      arch_acpi_enter_sleep_state(), which checks to see if we're
      running under Xen; if not, then it simply calls
      default_acpi_enter_sleep_state() as usual; if it does, it calls
      xen_acpi_enter_sleep_state()
    * xen_acpi_enter_sleep-state() is defined in arch/x86/xen/acpi.c

(Actually it didn't break the Xen version out separately, but it easily 
could.)

On the whole, using a function pointer would be a bit cleaner, as it 
removes the need for the weak glue functions, at the cost of some 
(possible) code duplication.

> I don't know if Xen and TXT are exclusive, or if we'd ever need
> to handle both cases at the same time.  I guess that will influence
> if the TXT and Xen use the same approach or something different.
>   

As Kevin said, they're exclusive, so they could share the same function 
pointer.

    J

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Xen-devel] Re: Paravirtualizing bits of acpi access
  2009-03-28  3:19                                       ` Jeremy Fitzhardinge
@ 2009-03-28 13:56                                         ` Rafael J. Wysocki
  0 siblings, 0 replies; 32+ messages in thread
From: Rafael J. Wysocki @ 2009-03-28 13:56 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Len Brown, Cihula, Joseph, Bjorn Helgaas, Tian, Kevin, Brown, Len,
	Xen-devel, the arch/x86 maintainers, linux-acpi@vger.kernel.org,
	Wang, Shane

On Saturday 28 March 2009, Jeremy Fitzhardinge wrote:
> Len Brown wrote:
> >>> Jeremy, I'm not excited about a proposed change to acpixf.h --
> >>> this is the API to ACPICA...
> >>>   
> >>>       
> >> Do you have an issue with the mechanism (using weak function, etc), or just
> >> the placement of the prototypes in that header?  Would there be a better
> >> header to put them in?  Or would you prefer some other mechanism?
> >>
> >> It certainly seems like Xen and tboot should be able to share the same hook,
> >> given that they're doing similar things for similar reasons.
> >>
> >> (I don't really understand the structure of all the acpi stuff; I'm just
> >> wading in and making a mess of things until I can close the lid of laptop
> >> successfully.)
> >>     
> >
> > Everything in acpi/acpica/ is source code that we share with BSD
> > via the ACPICA project http://acpica.org/
> >
> > ACPICA also supplies a couple of the headers outside that directory,
> > eg. acpixf.h, which is the API between the kernel and ACPICA.
> >
> > We can, and do, change that API when it makes sense.
> > However, we want to think carefully before changing it,
> > for we either cause Linux to diverge, or we have to sell
> > the same change to several other operating systems.
> > So ideally we wouuld need to make no Linux-specific, or Xen-specific
> > changes in that directory.
> >
> > One possibility is to have this called via
> > function pointer from ASM and scribble over the default
> > function pointer for the Xen case.  In that case, the Xen
> > version of the routine would live someplace other than acpi/acpica/ -
> > someplace with the word xen in the path.
> 
> Yes, that would be easy enough to do; we could overwrite it only when 
> actually running under Xen.
> 
> However, we don't need to replace the whole of acpi_enter_sleep_state(); 
> there are two options: we could duplicate the early part of the function 
> in the Xen version of it, or break just the differing part out via 
> function pointer.  The former has the disadvantage of duplicating code, 
> but it does allow the same function pointer to be used by the tboot version.
> 
> >   If using _weak can effectively
> > do that at link time, then fine, if we can do it w/o changing the API --
> > a _weak annotation is an easy ACPICA/Linux divergencen to manage.
> >   
> 
> The weak approach is what my proposed patch does:
> 
>     * the default native-hardware version of the sleep-entry register
>       writes is broken out into default_acpi_enter_sleep_state()
>     * it introduces a weak arch_acpi_enter_sleep_state() which just
>       calls the default case, leaving the normal function unchanged
>     * in arch/x86/kernel/acpi/sleep.c, it adds an override
>       arch_acpi_enter_sleep_state(), which checks to see if we're
>       running under Xen; if not, then it simply calls
>       default_acpi_enter_sleep_state() as usual; if it does, it calls
>       xen_acpi_enter_sleep_state()
>     * xen_acpi_enter_sleep-state() is defined in arch/x86/xen/acpi.c
> 
> (Actually it didn't break the Xen version out separately, but it easily 
> could.)
> 
> On the whole, using a function pointer would be a bit cleaner, as it 
> removes the need for the weak glue functions, at the cost of some 
> (possible) code duplication.
> 
> > I don't know if Xen and TXT are exclusive, or if we'd ever need
> > to handle both cases at the same time.  I guess that will influence
> > if the TXT and Xen use the same approach or something different.
> >   
> 
> As Kevin said, they're exclusive, so they could share the same function 
> pointer.

FWIW, if you only care about suspen to RAM (S3). I'm still thinking that
do_suspend_lowlevel() is the place to work on.  After all
acpi_enter_sleep_state() is called from there.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2009-03-28 13:56 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-03-21  6:09 Paravirtualizing bits of acpi access Jeremy Fitzhardinge
2009-03-21 17:10 ` Rafael J. Wysocki
2009-03-22  4:26   ` Jeremy Fitzhardinge
2009-03-22 11:28     ` Rafael J. Wysocki
2009-03-22 13:14       ` Ingo Molnar
2009-03-22 13:17         ` Rafael J. Wysocki
2009-03-22 17:07       ` Jeremy Fitzhardinge
2009-03-22 18:00         ` Rafael J. Wysocki
2009-03-23  3:29         ` Tian, Kevin
2009-03-23 18:20           ` [Xen-devel] " Rafael J. Wysocki
2009-03-23 19:07             ` Jeremy Fitzhardinge
2009-03-23 20:27               ` Rafael J. Wysocki
2009-03-23 20:42                 ` Jeremy Fitzhardinge
2009-03-24  5:14                 ` Jeremy Fitzhardinge
2009-03-24  5:33                   ` Tian, Kevin
2009-03-24  5:42                     ` Jeremy Fitzhardinge
2009-03-24  5:45                       ` Tian, Kevin
2009-03-24  7:05                         ` Jeremy Fitzhardinge
2009-03-24 16:45                           ` Bjorn Helgaas
2009-03-24 17:28                             ` Jeremy Fitzhardinge
2009-03-24 17:51                               ` [Xen-devel] " Cihula, Joseph
2009-03-27 21:57                                 ` Len Brown
2009-03-27 23:20                                   ` Jeremy Fitzhardinge
2009-03-28  1:01                                     ` Len Brown
2009-03-28  2:19                                       ` Tian, Kevin
2009-03-28  3:19                                       ` Jeremy Fitzhardinge
2009-03-28 13:56                                         ` Rafael J. Wysocki
2009-03-24 23:40                               ` Tian, Kevin
2009-03-24 23:51                               ` Tian, Kevin
2009-03-25  0:45                                 ` Jeremy Fitzhardinge
2009-03-23 19:52             ` [Xen-devel] " Matthew Garrett
2009-03-23 20:22               ` Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox