public inbox for linux-pm@vger.kernel.org
 help / color / mirror / Atom feed
* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
       [not found]                   ` <20070426113005.GU17387@elf.ucw.cz>
@ 2007-04-26 16:31                     ` Johannes Berg
  2007-04-26 18:40                       ` Rafael J. Wysocki
  2007-04-29 12:48                       ` [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) R. J. Wysocki
  0 siblings, 2 replies; 117+ messages in thread
From: Johannes Berg @ 2007-04-26 16:31 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Nick Piggin, Nigel Cunningham, Ingo Molnar, Mike Galbraith,
	linux-kernel, Con Kolivas, suspend2-devel, linux-pm,
	Andrew Morton, Linus Torvalds, Thomas Gleixner, Arjan van de Ven

On Thu, 2007-04-26 at 13:30 +0200, Pavel Machek wrote:

> > From looking at pm_ops which I was recently working with a lot, it seems
> > that it was designed by somebody who was reading the ACPI documentation
> > and was otherwise pretty clueless, even at that level std tries to look
> > like suspend. IMHO that is one of the first things that should be ripped
> > out, no pm_ops for STD, it's a pain to work with.
> 
> That code goes back to Patrick, AFAICT. (And yes, ACPI S3 and ACPI S4
> low-level enter is pretty similar).
> 
> Patches would be welcome

That was easier than I thought. This applies on top of a patch that
makes kernel/power/user.c optional since I had no idea how to fix it,
problems I see:
 * it surfaces kernel implementation details about pm_ops and thus makes
   the whole thing very fragile
 * it has yet another interface (yuck) to determine whether to reboot,
   shut down etc, doesn't use /sys/power/disk
 * I generally had no idea wtf it is doing in some places

Anyway, this patch is only compile tested, it
 * introduces include/linux/hibernate.h with hibernate_ops and
   a new hibernate() function to hibernate the system
 * rips apart a lot of the suspend code and puts it back together using
   the hibernate_ops
 * switches ACPI to hibernate_ops (the only user of pm_ops.pm_disk_mode)
 * might apply/compile against -mm, I have all my and some of Rafael's
   suspend/hibernate work in my tree.
 * breaks user suspend as I noted above
 * is incomplete, somewhere pm_suspend_disk() is still defined iirc

johannes
---
 Documentation/power/userland-swsusp.txt |   26 +++----
 drivers/acpi/sleep/main.c               |   89 ++++++++++++++++++++----
 drivers/acpi/sleep/proc.c               |    3 
 drivers/i2c/chips/tps65010.c            |    2 
 include/linux/hibernate.h               |   36 +++++++++
 include/linux/pm.h                      |   31 --------
 kernel/power/disk.c                     |  117 +++++++++++++++++++-------------
 kernel/power/main.c                     |   47 +++++-------
 kernel/power/power.h                    |   13 ---
 kernel/power/user.c                     |   28 +------
 kernel/sys.c                            |    3 
 11 files changed, 231 insertions(+), 164 deletions(-)

--- wireless-dev.orig/include/linux/pm.h	2007-04-26 18:15:00.440691185 +0200
+++ wireless-dev/include/linux/pm.h	2007-04-26 18:15:09.410691185 +0200
@@ -107,26 +107,11 @@ typedef int __bitwise suspend_state_t;
 #define PM_SUSPEND_ON		((__force suspend_state_t) 0)
 #define PM_SUSPEND_STANDBY	((__force suspend_state_t) 1)
 #define PM_SUSPEND_MEM		((__force suspend_state_t) 3)
-#define PM_SUSPEND_DISK		((__force suspend_state_t) 4)
-#define PM_SUSPEND_MAX		((__force suspend_state_t) 5)
-
-typedef int __bitwise suspend_disk_method_t;
-
-/* invalid must be 0 so struct pm_ops initialisers can leave it out */
-#define PM_DISK_INVALID		((__force suspend_disk_method_t) 0)
-#define	PM_DISK_PLATFORM	((__force suspend_disk_method_t) 1)
-#define	PM_DISK_SHUTDOWN	((__force suspend_disk_method_t) 2)
-#define	PM_DISK_REBOOT		((__force suspend_disk_method_t) 3)
-#define	PM_DISK_TEST		((__force suspend_disk_method_t) 4)
-#define	PM_DISK_TESTPROC	((__force suspend_disk_method_t) 5)
-#define	PM_DISK_MAX		((__force suspend_disk_method_t) 6)
+#define PM_SUSPEND_MAX		((__force suspend_state_t) 4)
 
 /**
  * struct pm_ops - Callbacks for managing platform dependent suspend states.
  * @valid: Callback to determine whether the given state can be entered.
- * 	If %CONFIG_SOFTWARE_SUSPEND is set then %PM_SUSPEND_DISK is
- *	always valid and never passed to this call. If not assigned,
- *	no suspend states are valid.
  *	Valid states are advertised in /sys/power/state but can still
  *	be rejected by prepare or enter if the conditions aren't right.
  *	There is a %pm_valid_only_mem function available that can be assigned
@@ -140,24 +125,12 @@ typedef int __bitwise suspend_disk_metho
  *
  * @finish: Called when the system has left the given state and all devices
  *	are resumed. The return value is ignored.
- *
- * @pm_disk_mode: The generic code always allows one of the shutdown methods
- *	%PM_DISK_SHUTDOWN, %PM_DISK_REBOOT, %PM_DISK_TEST and
- *	%PM_DISK_TESTPROC. If this variable is set, the mode it is set
- *	to is allowed in addition to those modes and is also made default.
- *	When this mode is sent selected, the @prepare call will be called
- *	before suspending to disk (if present), the @enter call should be
- *	present and will be called after all state has been saved and the
- *	machine is ready to be powered off; the @finish callback is called
- *	after state has been restored. All these calls are called with
- *	%PM_SUSPEND_DISK as the state.
  */
 struct pm_ops {
 	int (*valid)(suspend_state_t state);
 	int (*prepare)(suspend_state_t state);
 	int (*enter)(suspend_state_t state);
 	int (*finish)(suspend_state_t state);
-	suspend_disk_method_t pm_disk_mode;
 };
 
 /**
@@ -276,8 +249,6 @@ extern void device_power_up(void);
 extern void device_resume(void);
 
 #ifdef CONFIG_PM
-extern suspend_disk_method_t pm_disk_mode;
-
 extern int device_suspend(pm_message_t state);
 extern int device_prepare_suspend(pm_message_t state);
 
--- wireless-dev.orig/kernel/power/main.c	2007-04-26 18:15:00.790691185 +0200
+++ wireless-dev/kernel/power/main.c	2007-04-26 18:15:09.410691185 +0200
@@ -21,6 +21,7 @@
 #include <linux/resume-trace.h>
 #include <linux/freezer.h>
 #include <linux/vmstat.h>
+#include <linux/hibernate.h>
 
 #include "power.h"
 
@@ -30,7 +31,6 @@
 DEFINE_MUTEX(pm_mutex);
 
 struct pm_ops *pm_ops;
-suspend_disk_method_t pm_disk_mode = PM_DISK_SHUTDOWN;
 
 /**
  *	pm_set_ops - Set the global power method table. 
@@ -41,10 +41,6 @@ void pm_set_ops(struct pm_ops * ops)
 {
 	mutex_lock(&pm_mutex);
 	pm_ops = ops;
-	if (ops && ops->pm_disk_mode != PM_DISK_INVALID) {
-		pm_disk_mode = ops->pm_disk_mode;
-	} else
-		pm_disk_mode = PM_DISK_SHUTDOWN;
 	mutex_unlock(&pm_mutex);
 }
 
@@ -184,24 +180,12 @@ static void suspend_finish(suspend_state
 static const char * const pm_states[PM_SUSPEND_MAX] = {
 	[PM_SUSPEND_STANDBY]	= "standby",
 	[PM_SUSPEND_MEM]	= "mem",
-	[PM_SUSPEND_DISK]	= "disk",
 };
 
 static inline int valid_state(suspend_state_t state)
 {
-	/* Suspend-to-disk does not really need low-level support.
-	 * It can work with shutdown/reboot if needed. If it isn't
-	 * configured, then it cannot be supported.
-	 */
-	if (state == PM_SUSPEND_DISK)
-#ifdef CONFIG_SOFTWARE_SUSPEND
-		return 1;
-#else
-		return 0;
-#endif
-
-	/* all other states need lowlevel support and need to be
-	 * valid to the lowlevel implementation, no valid callback
+	/* All states need lowlevel support and need to be valid
+	 * to the lowlevel implementation, no valid callback
 	 * implies that none are valid. */
 	if (!pm_ops || !pm_ops->valid || !pm_ops->valid(state))
 		return 0;
@@ -229,11 +213,6 @@ static int enter_state(suspend_state_t s
 	if (!mutex_trylock(&pm_mutex))
 		return -EBUSY;
 
-	if (state == PM_SUSPEND_DISK) {
-		error = pm_suspend_disk();
-		goto Unlock;
-	}
-
 	pr_debug("PM: Preparing system for %s sleep\n", pm_states[state]);
 	if ((error = suspend_prepare(state)))
 		goto Unlock;
@@ -251,7 +230,7 @@ static int enter_state(suspend_state_t s
 
 /**
  *	pm_suspend - Externally visible function for suspending system.
- *	@state:		Enumarted value of state to enter.
+ *	@state:		Enumerated value of state to enter.
  *
  *	Determine whether or not value is within range, get state 
  *	structure, and enter (above).
@@ -283,13 +262,19 @@ decl_subsys(power,NULL,NULL);
 static ssize_t state_show(struct subsystem * subsys, char * buf)
 {
 	int i;
-	char * s = buf;
+	char *s = buf;
 
 	for (i = 0; i < PM_SUSPEND_MAX; i++) {
 		if (pm_states[i] && valid_state(i))
-			s += sprintf(s,"%s ", pm_states[i]);
+			s += sprintf(s, "%s ", pm_states[i]);
 	}
-	s += sprintf(s,"\n");
+#ifdef CONFIG_SOFTWARE_SUSPEND
+	s += sprintf(s, "%s\n", "disk");
+#else
+	if (s != buf)
+		/* convert the last space to a newline */
+		*(s-1) = "\n";
+#endif
 	return (s - buf);
 }
 
@@ -304,6 +289,12 @@ static ssize_t state_store(struct subsys
 	p = memchr(buf, '\n', n);
 	len = p ? p - buf : n;
 
+	/* first check hibernate */
+	if (strncmp(buf, "disk", len)) {
+		error = hibernate();
+		return error ? error : n;
+	}
+
 	for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++) {
 		if (*s && !strncmp(buf, *s, len))
 			break;
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ wireless-dev/include/linux/hibernate.h	2007-04-26 18:21:38.130691185 +0200
@@ -0,0 +1,36 @@
+#ifndef __LINUX_HIBERNATE
+#define __LINUX_HIBERNATE
+/*
+ * hibernate ('suspend to disk') functionality
+ */
+
+/**
+ * struct hibernate_ops - hibernate platform support
+ *
+ * The methods in this structure allow a platform to override what
+ * happens for shutting down the machine when going into hibernation.
+ *
+ * All three methods must be assigned.
+ *
+ * @prepare: prepare system for hibernation
+ * @enter: shut down system after state has been saved to disk
+ * @finish: finish/clean up after state has been reloaded
+ */
+struct hibernate_ops {
+	int (*prepare)(void);
+	int (*enter)(void);
+	void (*finish)(void);
+};
+
+/**
+ * hibernate_set_ops - set the global hibernate operations
+ * @ops: the hibernate operations to use from now on.
+ */
+void hibernate_set_ops(struct hibernate_ops *ops);
+
+/**
+ * hibernate - hibernate the system
+ */
+int hibernate(void);
+
+#endif /* __LINUX_HIBERNATE */
--- wireless-dev.orig/kernel/power/disk.c	2007-04-26 18:15:00.800691185 +0200
+++ wireless-dev/kernel/power/disk.c	2007-04-26 18:15:09.420691185 +0200
@@ -21,45 +21,72 @@
 #include <linux/console.h>
 #include <linux/cpu.h>
 #include <linux/freezer.h>
+#include <linux/hibernate.h>
 
 #include "power.h"
 
 
-static int noresume = 0;
+static int noresume;
 char resume_file[256] = CONFIG_PM_STD_PARTITION;
 dev_t swsusp_resume_device;
 sector_t swsusp_resume_block;
 
+static struct hibernate_ops *hibernate_ops;
+static int pm_disk_mode;
+
+enum {
+	PM_DISK_INVALID,
+	PM_DISK_PLATFORM,
+	PM_DISK_TEST,
+	PM_DISK_TESTPROC,
+	PM_DISK_SHUTDOWN,
+	PM_DISK_REBOOT,
+	/* keep last */
+	__PM_DISK_AFTER_LAST
+};
+#define PM_DISK_MAX (__PM_DISK_AFTER_LAST-1)
+#define PM_DISK_FIRST (PM_DISK_INVALID + 1)
+
+void hibernate_set_ops(struct hibernate_ops *ops)
+{
+	BUG_ON(!hibernate_ops->prepare);
+	BUG_ON(!hibernate_ops->enter);
+	BUG_ON(!hibernate_ops->finish);
+	mutex_lock(&pm_mutex);
+	hibernate_ops = ops;
+	mutex_unlock(&pm_mutex);
+}
+
+
 /**
- *	platform_prepare - prepare the machine for hibernation using the
- *	platform driver if so configured and return an error code if it fails
+ *	hibernate_platform_prepare - prepare the machine for hibernation using
+ *	the platform driver if so configured and return an error code if it
+ *	fails.
  */
 
-static inline int platform_prepare(void)
+int hibernate_platform_prepare(void)
 {
-	int error = 0;
-
 	switch (pm_disk_mode) {
 	case PM_DISK_TEST:
 	case PM_DISK_TESTPROC:
 	case PM_DISK_SHUTDOWN:
 	case PM_DISK_REBOOT:
 		break;
-	default:
-		if (pm_ops && pm_ops->prepare)
-			error = pm_ops->prepare(PM_SUSPEND_DISK);
+	case PM_DISK_PLATFORM:
+		if (hibernate_ops)
+			return hibernate_ops->prepare();
 	}
-	return error;
+	return 0;
 }
 
 /**
- *	power_down - Shut machine down for hibernate.
+ *	hibernate_power_down - Shut machine down for hibernate.
  *
  *	Use the platform driver, if configured so; otherwise try
  *	to power off or reboot.
  */
 
-static void power_down(void)
+static void hibernate_power_down(void)
 {
 	switch (pm_disk_mode) {
 	case PM_DISK_TEST:
@@ -70,11 +97,10 @@ static void power_down(void)
 	case PM_DISK_REBOOT:
 		kernel_restart(NULL);
 		break;
-	default:
-		if (pm_ops && pm_ops->enter) {
+	case PM_DISK_PLATFORM:
+		if (hibernate_ops) {
 			kernel_shutdown_prepare(SYSTEM_SUSPEND_DISK);
-			pm_ops->enter(PM_SUSPEND_DISK);
-			break;
+			hibernate_ops->enter();
 		}
 	}
 
@@ -85,7 +111,7 @@ static void power_down(void)
 	while(1);
 }
 
-static inline void platform_finish(void)
+void hibernate_platform_finish(void)
 {
 	switch (pm_disk_mode) {
 	case PM_DISK_TEST:
@@ -93,9 +119,9 @@ static inline void platform_finish(void)
 	case PM_DISK_SHUTDOWN:
 	case PM_DISK_REBOOT:
 		break;
-	default:
-		if (pm_ops && pm_ops->finish)
-			pm_ops->finish(PM_SUSPEND_DISK);
+	case PM_DISK_PLATFORM:
+		if (hibernate_ops)
+			hibernate_ops->finish();
 	}
 }
 
@@ -118,13 +144,13 @@ static int prepare_processes(void)
 }
 
 /**
- *	pm_suspend_disk - The granpappy of hibernation power management.
+ *	hibernate - The granpappy of hibernation power management.
  *
  *	If not, then call swsusp to do its thing, then figure out how
  *	to power down the system.
  */
 
-int pm_suspend_disk(void)
+int hibernate(void)
 {
 	int error;
 
@@ -147,7 +173,7 @@ int pm_suspend_disk(void)
 	if (error)
 		goto Finish;
 
-	error = platform_prepare();
+	error = hibernate_platform_prepare();
 	if (error)
 		goto Finish;
 
@@ -175,13 +201,13 @@ int pm_suspend_disk(void)
 
 	if (in_suspend) {
 		enable_nonboot_cpus();
-		platform_finish();
+		hibernate_platform_finish();
 		device_resume();
 		resume_console();
 		pr_debug("PM: writing image.\n");
 		error = swsusp_write();
 		if (!error)
-			power_down();
+			hibernate_power_down();
 		else {
 			swsusp_free();
 			goto Finish;
@@ -194,7 +220,7 @@ int pm_suspend_disk(void)
  Enable_cpus:
 	enable_nonboot_cpus();
  Resume_devices:
-	platform_finish();
+	hibernate_platform_finish();
 	device_resume();
 	resume_console();
  Finish:
@@ -211,7 +237,7 @@ int pm_suspend_disk(void)
  *	Called as a late_initcall (so all devices are discovered and
  *	initialized), we call swsusp to see if we have a saved image or not.
  *	If so, we quiesce devices, the restore the saved image. We will
- *	return above (in pm_suspend_disk() ) if everything goes well.
+ *	return above (in hibernate() ) if everything goes well.
  *	Otherwise, we fail gracefully and return to the normally
  *	scheduled program.
  *
@@ -311,12 +337,13 @@ static const char * const pm_disk_modes[
  *
  *	Suspend-to-disk can be handled in several ways. We have a few options
  *	for putting the system to sleep - using the platform driver (e.g. ACPI
- *	or other pm_ops), powering off the system or rebooting the system
- *	(for testing) as well as the two test modes.
+ *	or other hibernate_ops), powering off the system or rebooting the
+ *	system (for testing) as well as the two test modes.
  *
  *	The system can support 'platform', and that is known a priori (and
- *	encoded in pm_ops). However, the user may choose 'shutdown' or 'reboot'
- *	as alternatives, as well as the test modes 'test' and 'testproc'.
+ *	encoded by the presence of hibernate_ops). However, the user may choose
+ *	'shutdown' or 'reboot' as alternatives, as well as the test modes 'test'
+ *	and 'testproc'.
  *
  *	show() will display what the mode is currently set to.
  *	store() will accept one of
@@ -328,7 +355,7 @@ static const char * const pm_disk_modes[
  *	'testproc'
  *
  *	It will only change to 'platform' if the system
- *	supports it (as determined from pm_ops->pm_disk_mode).
+ *	supports it (as determined by having hibernate_ops).
  */
 
 static ssize_t disk_show(struct subsystem * subsys, char * buf)
@@ -336,7 +363,7 @@ static ssize_t disk_show(struct subsyste
 	int i;
 	char *start = buf;
 
-	for (i = PM_DISK_PLATFORM; i < PM_DISK_MAX; i++) {
+	for (i = PM_DISK_FIRST; i <= PM_DISK_MAX; i++) {
 		if (!pm_disk_modes[i])
 			continue;
 		switch (i) {
@@ -345,9 +372,8 @@ static ssize_t disk_show(struct subsyste
 		case PM_DISK_TEST:
 		case PM_DISK_TESTPROC:
 			break;
-		default:
-			if (pm_ops && pm_ops->enter &&
-			    (i == pm_ops->pm_disk_mode))
+		case PM_DISK_PLATFORM:
+			if (hibernate_ops)
 				break;
 			/* not a valid mode, continue with loop */
 			continue;
@@ -370,19 +396,19 @@ static ssize_t disk_store(struct subsyst
 	int i;
 	int len;
 	char *p;
-	suspend_disk_method_t mode = 0;
+	int mode = PM_DISK_INVALID;
 
 	p = memchr(buf, '\n', n);
 	len = p ? p - buf : n;
 
 	mutex_lock(&pm_mutex);
-	for (i = PM_DISK_PLATFORM; i < PM_DISK_MAX; i++) {
+	for (i = PM_DISK_FIRST; i < PM_DISK_MAX; i++) {
 		if (!strncmp(buf, pm_disk_modes[i], len)) {
 			mode = i;
 			break;
 		}
 	}
-	if (mode) {
+	if (mode != PM_DISK_INVALID) {
 		switch (mode) {
 		case PM_DISK_SHUTDOWN:
 		case PM_DISK_REBOOT:
@@ -390,19 +416,18 @@ static ssize_t disk_store(struct subsyst
 		case PM_DISK_TESTPROC:
 			pm_disk_mode = mode;
 			break;
-		default:
-			if (pm_ops && pm_ops->enter &&
-			    (mode == pm_ops->pm_disk_mode))
+		case PM_DISK_PLATFORM:
+			if (hibernate_ops)
 				pm_disk_mode = mode;
 			else
 				error = -EINVAL;
 		}
-	} else {
+	} else
 		error = -EINVAL;
-	}
 
-	pr_debug("PM: suspend-to-disk mode set to '%s'\n",
-		 pm_disk_modes[mode]);
+	if (!error)
+		pr_debug("PM: suspend-to-disk mode set to '%s'\n",
+			 pm_disk_modes[mode]);
 	mutex_unlock(&pm_mutex);
 	return error ? error : n;
 }
--- wireless-dev.orig/kernel/power/user.c	2007-04-26 18:15:01.130691185 +0200
+++ wireless-dev/kernel/power/user.c	2007-04-26 18:15:09.420691185 +0200
@@ -128,22 +128,6 @@ static ssize_t snapshot_write(struct fil
 	return res;
 }
 
-static inline int platform_prepare(void)
-{
-	int error = 0;
-
-	if (pm_ops && pm_ops->prepare)
-		error = pm_ops->prepare(PM_SUSPEND_DISK);
-
-	return error;
-}
-
-static inline void platform_finish(void)
-{
-	if (pm_ops && pm_ops->finish)
-		pm_ops->finish(PM_SUSPEND_DISK);
-}
-
 static inline int snapshot_suspend(int platform_suspend)
 {
 	int error;
@@ -155,7 +139,7 @@ static inline int snapshot_suspend(int p
 		goto Finish;
 
 	if (platform_suspend) {
-		error = platform_prepare();
+		error = hibernate_platform_prepare();
 		if (error)
 			goto Finish;
 	}
@@ -172,7 +156,7 @@ static inline int snapshot_suspend(int p
 	enable_nonboot_cpus();
  Resume_devices:
 	if (platform_suspend)
-		platform_finish();
+		hibernate_platform_finish();
 
 	device_resume();
 	resume_console();
@@ -188,7 +172,7 @@ static inline int snapshot_restore(int p
 	mutex_lock(&pm_mutex);
 	pm_prepare_console();
 	if (platform_suspend) {
-		error = platform_prepare();
+		error = hibernate_platform_prepare();
 		if (error)
 			goto Finish;
 	}
@@ -204,7 +188,7 @@ static inline int snapshot_restore(int p
 	enable_nonboot_cpus();
  Resume_devices:
 	if (platform_suspend)
-		platform_finish();
+		hibernate_platform_finish();
 
 	device_resume();
 	resume_console();
@@ -406,13 +390,15 @@ static int snapshot_ioctl(struct inode *
 		case PMOPS_ENTER:
 			if (data->platform_suspend) {
 				kernel_shutdown_prepare(SYSTEM_SUSPEND_DISK);
-				error = pm_ops->enter(PM_SUSPEND_DISK);
+				error = hibernate_ops->enter();
+				/* how can this possibly do the right thing? */
 				error = 0;
 			}
 			break;
 
 		case PMOPS_FINISH:
 			if (data->platform_suspend)
+				/* and why doesn't this invoke anything??? */
 				error = 0;
 
 			break;
--- wireless-dev.orig/Documentation/power/userland-swsusp.txt	2007-04-26 18:15:02.120691185 +0200
+++ wireless-dev/Documentation/power/userland-swsusp.txt	2007-04-26 18:15:09.440691185 +0200
@@ -93,21 +93,23 @@ SNAPSHOT_S2RAM - suspend to RAM; using t
 	to resume the system from RAM if there's enough battery power or restore
 	its state on the basis of the saved suspend image otherwise)
 
-SNAPSHOT_PMOPS - enable the usage of the pmops->prepare, pmops->enter and
-	pmops->finish methods (the in-kernel swsusp knows these as the "platform
-	method") which are needed on many machines to (among others) speed up
-	the resume by letting the BIOS skip some steps or to let the system
-	recognise the correct state of the hardware after the resume (in
-	particular on many machines this ensures that unplugged AC
-	adapters get correctly detected and that kacpid does not run wild after
-	the resume).  The last ioctl() argument can take one of the three
-	values, defined in kernel/power/power.h:
+SNAPSHOT_PMOPS - enable the usage of the hibernate_ops->prepare,
+	hibernate_ops->enter and hibernate_ops->finish methods (the in-kernel
+	swsusp knows these as the "platform method") which are needed on many
+	machines to (among others) speed up the resume by letting the BIOS skip
+	some steps or to let the system recognise the correct state of the
+	hardware after the resume (in particular on many machines this ensures
+	that unplugged AC adapters get correctly detected and that kacpid does
+	not run wild after the resume).  The last ioctl() argument can take one
+	of the three values, defined in kernel/power/power.h:
 	PMOPS_PREPARE - make the kernel carry out the
-		pm_ops->prepare(PM_SUSPEND_DISK) operation
+		hibernate_ops->prepare() operation
 	PMOPS_ENTER - make the kernel power off the system by calling
-		pm_ops->enter(PM_SUSPEND_DISK)
+		hibernate_ops->enter()
 	PMOPS_FINISH - make the kernel carry out the
-		pm_ops->finish(PM_SUSPEND_DISK) operation
+		hibernate_ops->finish() operation
+	Note that the actual constants are misnamed because they surface
+	internal kernel implementation details that have changed.
 
 The device's read() operation can be used to transfer the snapshot image from
 the kernel.  It has the following limitations:
--- wireless-dev.orig/drivers/i2c/chips/tps65010.c	2007-04-26 18:15:02.150691185 +0200
+++ wireless-dev/drivers/i2c/chips/tps65010.c	2007-04-26 18:15:09.440691185 +0200
@@ -354,7 +354,7 @@ static void tps65010_interrupt(struct tp
 			 * also needs to get error handling and probably
 			 * an #ifdef CONFIG_SOFTWARE_SUSPEND
 			 */
-			pm_suspend(PM_SUSPEND_DISK);
+			hibernate();
 #endif
 			poll = 1;
 		}
--- wireless-dev.orig/kernel/sys.c	2007-04-26 18:15:01.310691185 +0200
+++ wireless-dev/kernel/sys.c	2007-04-26 18:15:09.450691185 +0200
@@ -25,6 +25,7 @@
 #include <linux/security.h>
 #include <linux/dcookies.h>
 #include <linux/suspend.h>
+#include <linux/hibernate.h>
 #include <linux/tty.h>
 #include <linux/signal.h>
 #include <linux/cn_proc.h>
@@ -881,7 +882,7 @@ asmlinkage long sys_reboot(int magic1, i
 #ifdef CONFIG_SOFTWARE_SUSPEND
 	case LINUX_REBOOT_CMD_SW_SUSPEND:
 		{
-			int ret = pm_suspend(PM_SUSPEND_DISK);
+			int ret = hibernate();
 			unlock_kernel();
 			return ret;
 		}
--- wireless-dev.orig/drivers/acpi/sleep/main.c	2007-04-26 18:15:02.290691185 +0200
+++ wireless-dev/drivers/acpi/sleep/main.c	2007-04-26 18:15:09.630691185 +0200
@@ -15,6 +15,7 @@
 #include <linux/dmi.h>
 #include <linux/device.h>
 #include <linux/suspend.h>
+#include <linux/hibernate.h>
 #include <acpi/acpi_bus.h>
 #include <acpi/acpi_drivers.h>
 #include "sleep.h"
@@ -29,7 +30,6 @@ static u32 acpi_suspend_states[] = {
 	[PM_SUSPEND_ON] = ACPI_STATE_S0,
 	[PM_SUSPEND_STANDBY] = ACPI_STATE_S1,
 	[PM_SUSPEND_MEM] = ACPI_STATE_S3,
-	[PM_SUSPEND_DISK] = ACPI_STATE_S4,
 	[PM_SUSPEND_MAX] = ACPI_STATE_S5
 };
 
@@ -94,14 +94,6 @@ static int acpi_pm_enter(suspend_state_t
 		do_suspend_lowlevel();
 		break;
 
-	case PM_SUSPEND_DISK:
-		if (acpi_pm_ops.pm_disk_mode == PM_DISK_PLATFORM)
-			status = acpi_enter_sleep_state(acpi_state);
-		break;
-	case PM_SUSPEND_MAX:
-		acpi_power_off();
-		break;
-
 	default:
 		return -EINVAL;
 	}
@@ -157,12 +149,13 @@ int acpi_suspend(u32 acpi_state)
 	suspend_state_t states[] = {
 		[1] = PM_SUSPEND_STANDBY,
 		[3] = PM_SUSPEND_MEM,
-		[4] = PM_SUSPEND_DISK,
 		[5] = PM_SUSPEND_MAX
 	};
 
 	if (acpi_state < 6 && states[acpi_state])
 		return pm_suspend(states[acpi_state]);
+	if (acpi_state == 4)
+		return hibernate();
 	return -EINVAL;
 }
 
@@ -189,6 +182,71 @@ static struct pm_ops acpi_pm_ops = {
 	.finish = acpi_pm_finish,
 };
 
+#ifdef CONFIG_SOFTWARE_SUSPEND
+static int acpi_hib_prepare(void)
+{
+	return acpi_sleep_prepare(ACPI_STATE_S4);
+}
+
+static int acpi_hib_enter(void)
+{
+	acpi_status status = AE_OK;
+	unsigned long flags = 0;
+	u32 acpi_state = acpi_suspend_states[pm_state];
+
+	ACPI_FLUSH_CPU_CACHE();
+
+	/* Do arch specific saving of state. */
+	int error = acpi_save_state_mem();
+	if (error)
+		return error;
+
+	local_irq_save(flags);
+	acpi_enable_wakeup_device(acpi_state);
+	status = acpi_enter_sleep_state(acpi_state);
+
+	/* ACPI 3.0 specs (P62) says that it's the responsabilty
+	 * of the OSPM to clear the status bit [ implying that the
+	 * POWER_BUTTON event should not reach userspace ]
+	 */
+	if (ACPI_SUCCESS(status) && (acpi_state == ACPI_STATE_S3))
+		acpi_clear_event(ACPI_EVENT_POWER_BUTTON);
+
+	local_irq_restore(flags);
+	printk(KERN_DEBUG "Back to C!\n");
+
+	/* restore processor state
+	 * We should only be here if we're coming back from STR or STD.
+	 * And, in the case of the latter, the memory image should have already
+	 * been loaded from disk.
+	 */
+	acpi_restore_state_mem();
+
+	return ACPI_SUCCESS(status) ? 0 : -EFAULT;
+}
+
+static void acpi_hib_finish(void)
+{
+	acpi_leave_sleep_state(ACPI_STATE_S4);
+	acpi_disable_wakeup_device(ACPI_STATE_S4);
+
+	/* reset firmware waking vector */
+	acpi_set_firmware_waking_vector((acpi_physical_address) 0);
+
+	if (init_8259A_after_S1) {
+		printk("Broken toshiba laptop -> kicking interrupts\n");
+		init_8259A(0);
+	}
+	return 0;
+}
+
+static struct hibernate_ops acpi_hib_ops = {
+	.prepare = acpi_hib_prepare,
+	.enter = acpi_hib_enter,
+	.finish = acpi_hib_finish,
+};
+#endif /* CONFIG_SOFTWARE_SUSPEND */
+
 /*
  * Toshiba fails to preserve interrupts over S1, reinitialization
  * of 8259 is needed after S1 resume.
@@ -227,13 +285,16 @@ int __init acpi_sleep_init(void)
 			sleep_states[i] = 1;
 			printk(" S%d", i);
 		}
-		if (i == ACPI_STATE_S4) {
-			if (sleep_states[i])
-				acpi_pm_ops.pm_disk_mode = PM_DISK_PLATFORM;
-		}
 	}
 	printk(")\n");
 
+#ifdef CONFIG_SOFTWARE_SUSPEND
+	if (sleep_states[ACPI_STATE_S4])
+		hibernate_set_ops(&acpi_hib_ops);
+#else
+	sleep_states[ACPI_STATE_S4] = 0;
+#endif
+
 	pm_set_ops(&acpi_pm_ops);
 	return 0;
 }
--- wireless-dev.orig/kernel/power/power.h	2007-04-26 18:15:01.240691185 +0200
+++ wireless-dev/kernel/power/power.h	2007-04-26 18:15:09.630691185 +0200
@@ -13,16 +13,6 @@ struct swsusp_info {
 
 
 
-#ifdef CONFIG_SOFTWARE_SUSPEND
-extern int pm_suspend_disk(void);
-
-#else
-static inline int pm_suspend_disk(void)
-{
-	return -EPERM;
-}
-#endif
-
 extern struct mutex pm_mutex;
 
 #define power_attr(_name) \
@@ -179,3 +169,6 @@ extern int suspend_enter(suspend_state_t
 struct timeval;
 extern void swsusp_show_speed(struct timeval *, struct timeval *,
 				unsigned int, char *);
+
+extern int hibernate_platform_prepare(void);
+extern void hibernate_platform_finish(void);
--- wireless-dev.orig/drivers/acpi/sleep/proc.c	2007-04-26 18:15:02.720691185 +0200
+++ wireless-dev/drivers/acpi/sleep/proc.c	2007-04-26 18:15:09.630691185 +0200
@@ -1,6 +1,7 @@
 #include <linux/proc_fs.h>
 #include <linux/seq_file.h>
 #include <linux/suspend.h>
+#include <linux/hibernate.h>
 #include <linux/bcd.h>
 #include <asm/uaccess.h>
 
@@ -60,7 +61,7 @@ acpi_system_write_sleep(struct file *fil
 	state = simple_strtoul(str, NULL, 0);
 #ifdef CONFIG_SOFTWARE_SUSPEND
 	if (state == 4) {
-		error = pm_suspend(PM_SUSPEND_DISK);
+		error = hibernate();
 		goto Done;
 	}
 #endif

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 18:40                       ` Rafael J. Wysocki
@ 2007-04-26 18:40                         ` Johannes Berg
       [not found]                         ` <1177612802.6814.121.camel@johannes.berg>
  1 sibling, 0 replies; 117+ messages in thread
From: Johannes Berg @ 2007-04-26 18:40 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nick Piggin, Nigel Cunningham, Ingo Molnar, Pavel Machek,
	Mike Galbraith, linux-kernel, Con Kolivas, suspend2-devel,
	linux-pm, Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Arjan van de Ven


[-- Attachment #1.1: Type: text/plain, Size: 1240 bytes --]

On Thu, 2007-04-26 at 20:40 +0200, Rafael J. Wysocki wrote:

> >  * it surfaces kernel implementation details about pm_ops and thus makes
> >    the whole thing very fragile
> 
> Can you elaborate?

Well it tells userspace about pm_ops->enter/prepare/finish etc.
Also, it seems that it needs a "release memory now" operation instead of
just releasing it when the fd is closed?

> >  * it has yet another interface (yuck) to determine whether to reboot,
> >    shut down etc, doesn't use /sys/power/disk
> 
> Yes.  In fact it was meant as a replacement for /sys/power/disk at one point.

Heh.

> >  * I generally had no idea wtf it is doing in some places
> 
> I could have told you if you had asked. :-)

I was offline ;)

> Do we need hibernate_ops at all?  There's only one user anyway and I'm not
> sure there will be more of them in the future.

I'm pretty sure there won't be, but there's no way to do it cleanly
without pm_ops since even acpi doesn't do this all the time but only
when some set of conditions is true. Hence, it needs to be able to
determine the availability of the platform mode at run time rather than
build time (build time => we could use weak symbols, arch hooks, ...)

johannes

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
  2007-04-26 16:31                     ` suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy) Johannes Berg
@ 2007-04-26 18:40                       ` Rafael J. Wysocki
  2007-04-26 18:40                         ` Johannes Berg
       [not found]                         ` <1177612802.6814.121.camel@johannes.berg>
  2007-04-29 12:48                       ` [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) R. J. Wysocki
  1 sibling, 2 replies; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-04-26 18:40 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Nick Piggin, Nigel Cunningham, Ingo Molnar, Pavel Machek,
	Mike Galbraith, linux-kernel, Con Kolivas, suspend2-devel,
	linux-pm, Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Arjan van de Ven

On Thursday, 26 April 2007 18:31, Johannes Berg wrote:
> On Thu, 2007-04-26 at 13:30 +0200, Pavel Machek wrote:
> 
> > > From looking at pm_ops which I was recently working with a lot, it seems
> > > that it was designed by somebody who was reading the ACPI documentation
> > > and was otherwise pretty clueless, even at that level std tries to look
> > > like suspend. IMHO that is one of the first things that should be ripped
> > > out, no pm_ops for STD, it's a pain to work with.
> > 
> > That code goes back to Patrick, AFAICT. (And yes, ACPI S3 and ACPI S4
> > low-level enter is pretty similar).
> > 
> > Patches would be welcome
> 
> That was easier than I thought. This applies on top of a patch that
> makes kernel/power/user.c optional since I had no idea how to fix it,
> problems I see:
>  * it surfaces kernel implementation details about pm_ops and thus makes
>    the whole thing very fragile

Can you elaborate?

>  * it has yet another interface (yuck) to determine whether to reboot,
>    shut down etc, doesn't use /sys/power/disk

Yes.  In fact it was meant as a replacement for /sys/power/disk at one point.

>  * I generally had no idea wtf it is doing in some places

I could have told you if you had asked. :-)

> Anyway, this patch is only compile tested, it
>  * introduces include/linux/hibernate.h with hibernate_ops and
>    a new hibernate() function to hibernate the system

Do we need hibernate_ops at all?  There's only one user anyway and I'm not
sure there will be more of them in the future.

>  * rips apart a lot of the suspend code and puts it back together using
>    the hibernate_ops
>  * switches ACPI to hibernate_ops (the only user of pm_ops.pm_disk_mode)
>  * might apply/compile against -mm, I have all my and some of Rafael's
>    suspend/hibernate work in my tree.
>  * breaks user suspend as I noted above
>  * is incomplete, somewhere pm_suspend_disk() is still defined iirc

I think I can fix it up, just give me some time.

The idea is good, I think we should do someting like this.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
       [not found]                         ` <1177612802.6814.121.camel@johannes.berg>
@ 2007-04-26 19:02                           ` Rafael J. Wysocki
       [not found]                           ` <200704262102.38568.rjw@sisk.pl>
  1 sibling, 0 replies; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-04-26 19:02 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Nick Piggin, Nigel Cunningham, Ingo Molnar, Pavel Machek,
	Mike Galbraith, linux-kernel, Con Kolivas, suspend2-devel,
	linux-pm, Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Arjan van de Ven

On Thursday, 26 April 2007 20:40, Johannes Berg wrote:
> On Thu, 2007-04-26 at 20:40 +0200, Rafael J. Wysocki wrote:
> 
> > >  * it surfaces kernel implementation details about pm_ops and thus makes
> > >    the whole thing very fragile
> > 
> > Can you elaborate?
> 
> Well it tells userspace about pm_ops->enter/prepare/finish etc.
> Also, it seems that it needs a "release memory now" operation instead of
> just releasing it when the fd is closed?

Yes.  That's because we want to be able to repeat creating the image
without closing the fd in some situations.

> > >  * it has yet another interface (yuck) to determine whether to reboot,
> > >    shut down etc, doesn't use /sys/power/disk
> > 
> > Yes.  In fact it was meant as a replacement for /sys/power/disk at one point.
> 
> Heh.
> 
> > >  * I generally had no idea wtf it is doing in some places
> > 
> > I could have told you if you had asked. :-)
> 
> I was offline ;)
> 
> > Do we need hibernate_ops at all?  There's only one user anyway and I'm not
> > sure there will be more of them in the future.
> 
> I'm pretty sure there won't be, but there's no way to do it cleanly
> without pm_ops since even acpi doesn't do this all the time but only
> when some set of conditions is true. Hence, it needs to be able to
> determine the availability of the platform mode at run time rather than
> build time (build time => we could use weak symbols, arch hooks, ...)

Still, we could use a global var 'platform_hibernation' or something like this,
I think.  Then, we can do

#define platform_hibernation	0

on the architectures that don't need it and make ACPI use it instead of this
"dynamic linking".

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
       [not found]                           ` <200704262102.38568.rjw@sisk.pl>
@ 2007-04-27  9:41                             ` Johannes Berg
       [not found]                             ` <1177666915.7828.35.camel@johannes.berg>
  1 sibling, 0 replies; 117+ messages in thread
From: Johannes Berg @ 2007-04-27  9:41 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nick Piggin, Nigel Cunningham, Ingo Molnar, Pavel Machek,
	Mike Galbraith, linux-kernel, Con Kolivas, suspend2-devel,
	linux-pm, Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Arjan van de Ven


[-- Attachment #1.1: Type: text/plain, Size: 616 bytes --]

On Thu, 2007-04-26 at 21:02 +0200, Rafael J. Wysocki wrote:

> Yes.  That's because we want to be able to repeat creating the image
> without closing the fd in some situations.

Oh yeah, I just checked and it's not in fact necessary. I'm just
confused.

> Still, we could use a global var 'platform_hibernation' or something like this,
> I think.  Then, we can do
> 
> #define platform_hibernation	0
> 
> on the architectures that don't need it and make ACPI use it instead of this
> "dynamic linking".

No, because acpi doesn't know at build time whether it can actually do
S4 or not.

johannes

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
       [not found]                             ` <1177666915.7828.35.camel@johannes.berg>
@ 2007-04-27 10:09                               ` Johannes Berg
  2007-04-27 10:18                               ` Rafael J. Wysocki
       [not found]                               ` <200704271218.07120.rjw@sisk.pl>
  2 siblings, 0 replies; 117+ messages in thread
From: Johannes Berg @ 2007-04-27 10:09 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nick Piggin, Nigel Cunningham, suspend2-devel, Mike Galbraith,
	linux-kernel, Con Kolivas, Andrew Morton, Thomas Gleixner,
	Pavel Machek, Ingo Molnar, Linus Torvalds, linux-pm,
	Arjan van de Ven


[-- Attachment #1.1: Type: text/plain, Size: 415 bytes --]

On Fri, 2007-04-27 at 11:41 +0200, Johannes Berg wrote:

> No, because acpi doesn't know at build time whether it can actually do
> S4 or not.

Actually, you could probably do it by making some weak symbol for it
that only ACPI overrides, and then check in the ACPI code if S4 is
possible, otherwise somehow invoke the old symbol or copy the code or
something. Seems a bit more fragile though.

johannes

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
       [not found]                             ` <1177666915.7828.35.camel@johannes.berg>
  2007-04-27 10:09                               ` Johannes Berg
@ 2007-04-27 10:18                               ` Rafael J. Wysocki
       [not found]                               ` <200704271218.07120.rjw@sisk.pl>
  2 siblings, 0 replies; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-04-27 10:18 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Nick Piggin, Nigel Cunningham, Ingo Molnar, Pavel Machek,
	Mike Galbraith, linux-kernel, Con Kolivas, suspend2-devel,
	linux-pm, Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Arjan van de Ven

On Friday, 27 April 2007 11:41, Johannes Berg wrote:
> On Thu, 2007-04-26 at 21:02 +0200, Rafael J. Wysocki wrote:
> 
> > Yes.  That's because we want to be able to repeat creating the image
> > without closing the fd in some situations.
> 
> Oh yeah, I just checked and it's not in fact necessary. I'm just
> confused.
> 
> > Still, we could use a global var 'platform_hibernation' or something like this,
> > I think.  Then, we can do
> > 
> > #define platform_hibernation	0
> > 
> > on the architectures that don't need it and make ACPI use it instead of this
> > "dynamic linking".
> 
> No, because acpi doesn't know at build time whether it can actually do
> S4 or not.

That's not a problem, I think.

1) We define platform_hibernation if CONFIG_ACPI is set.

2) In the ACPI code we do

if (can do S4)
	platform_hibernation = 1;

3) We have functions arch_platform_prepare()/finish()/enter() that are defined
to be noops for anything but ACPI systems and for ACPI systems they are
defined like this:

int arch_platform_enter(void)
{
	if (!platform_hibernation)
		return 0;

	...
}

I think it should work.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
       [not found]                               ` <200704271218.07120.rjw@sisk.pl>
@ 2007-04-27 10:19                                 ` Johannes Berg
       [not found]                                 ` <1177669179.7828.53.camel@johannes.berg>
  1 sibling, 0 replies; 117+ messages in thread
From: Johannes Berg @ 2007-04-27 10:19 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nick Piggin, Nigel Cunningham, Ingo Molnar, Pavel Machek,
	Mike Galbraith, linux-kernel, Con Kolivas, suspend2-devel,
	linux-pm, Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Arjan van de Ven


[-- Attachment #1.1: Type: text/plain, Size: 1248 bytes --]

On Fri, 2007-04-27 at 12:18 +0200, Rafael J. Wysocki wrote:

> 1) We define platform_hibernation if CONFIG_ACPI is set.

Let's just define it always then in the common code so we don't have
even more magic bits platforms need to define even if they don't care at
all. And please don't put #ifdef CONFIG_ACPI into the common code ;)
Maybe #ifdef CONFIG_ARCH_NEEDS_HIBERNATE_HOOKS or something.

> 2) In the ACPI code we do
> 
> if (can do S4)
> 	platform_hibernation = 1;

Gotcha.

> 3) We have functions arch_platform_prepare()/finish()/enter() that are defined
> to be noops for anything but ACPI systems and for ACPI systems they are
> defined like this:
> 
> int arch_platform_enter(void)
> {
> 	if (!platform_hibernation)
> 		return 0;
> 
> 	...
> }
> 
> I think it should work.

You could reduce code churn in all other platforms by making these weak
symbols like the irq hooks I did for pm_ops. It looks like it can work
and possibly is even less intrusive than my hibernate_ops patch. Though
then again my hibernate_ops patch removed a lot of stuff that is now no
longer necessary, and also completely removed the PM_SUSPEND_DISK foo...
we probably want that regardless of how we invoke ACPI.

johannes

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
       [not found]                                   ` <200704271409.56687.rjw@sisk.pl>
@ 2007-04-27 12:07                                     ` Johannes Berg
  0 siblings, 0 replies; 117+ messages in thread
From: Johannes Berg @ 2007-04-27 12:07 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nick Piggin, Nigel Cunningham, Ingo Molnar, Pavel Machek,
	Mike Galbraith, linux-kernel, Con Kolivas, suspend2-devel,
	linux-pm, Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Arjan van de Ven


[-- Attachment #1.1: Type: text/plain, Size: 347 bytes --]

On Fri, 2007-04-27 at 14:09 +0200, Rafael J. Wysocki wrote:

> Yes.  Still, I'd like to rework your patch to deal with ACPI without
> introducing hibernate_ops .  I'm going to do this later today if you don't
> mind. :-)

Not at all :) That's why I actually sent it out instead of just saying
"well I give up it breaks user.c"

johannes

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy)
       [not found]                                 ` <1177669179.7828.53.camel@johannes.berg>
       [not found]                                   ` <200704271409.56687.rjw@sisk.pl>
@ 2007-04-27 12:09                                   ` Rafael J. Wysocki
  1 sibling, 0 replies; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-04-27 12:09 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Nick Piggin, Nigel Cunningham, Ingo Molnar, Pavel Machek,
	Mike Galbraith, linux-kernel, Con Kolivas, suspend2-devel,
	linux-pm, Andrew Morton, Linus Torvalds, Thomas Gleixner,
	Arjan van de Ven

On Friday, 27 April 2007 12:19, Johannes Berg wrote:
> On Fri, 2007-04-27 at 12:18 +0200, Rafael J. Wysocki wrote:
> 
> > 1) We define platform_hibernation if CONFIG_ACPI is set.
> 
> Let's just define it always then in the common code so we don't have
> even more magic bits platforms need to define even if they don't care at
> all. And please don't put #ifdef CONFIG_ACPI into the common code ;)
> Maybe #ifdef CONFIG_ARCH_NEEDS_HIBERNATE_HOOKS or something.
> 
> > 2) In the ACPI code we do
> > 
> > if (can do S4)
> > 	platform_hibernation = 1;
> 
> Gotcha.
> 
> > 3) We have functions arch_platform_prepare()/finish()/enter() that are defined
> > to be noops for anything but ACPI systems and for ACPI systems they are
> > defined like this:
> > 
> > int arch_platform_enter(void)
> > {
> > 	if (!platform_hibernation)
> > 		return 0;
> > 
> > 	...
> > }
> > 
> > I think it should work.
> 
> You could reduce code churn in all other platforms by making these weak
> symbols like the irq hooks I did for pm_ops. It looks like it can work
> and possibly is even less intrusive than my hibernate_ops patch. Though
> then again my hibernate_ops patch removed a lot of stuff that is now no
> longer necessary, and also completely removed the PM_SUSPEND_DISK foo...
> we probably want that regardless of how we invoke ACPI.

Yes.  Still, I'd like to rework your patch to deal with ACPI without
introducing hibernate_ops .  I'm going to do this later today if you don't
mind. :-)

Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-04-26 16:31                     ` suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy) Johannes Berg
  2007-04-26 18:40                       ` Rafael J. Wysocki
@ 2007-04-29 12:48                       ` R. J. Wysocki
  2007-04-29 12:53                         ` Rafael J. Wysocki
  2007-04-30  8:29                         ` Johannes Berg
  1 sibling, 2 replies; 117+ messages in thread
From: R. J. Wysocki @ 2007-04-29 12:48 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Pekka Enberg, linux-pm, Nigel Cunningham, Pavel Machek

[Trimmed the CC list to a reasonable minimum]

On Thursday, 26 April 2007 18:31, Johannes Berg wrote:
> On Thu, 2007-04-26 at 13:30 +0200, Pavel Machek wrote:
> 
> > > From looking at pm_ops which I was recently working with a lot, it seems
> > > that it was designed by somebody who was reading the ACPI documentation
> > > and was otherwise pretty clueless, even at that level std tries to look
> > > like suspend. IMHO that is one of the first things that should be ripped
> > > out, no pm_ops for STD, it's a pain to work with.
> > 
> > That code goes back to Patrick, AFAICT. (And yes, ACPI S3 and ACPI S4
> > low-level enter is pretty similar).
> > 
> > Patches would be welcome
> 
> That was easier than I thought. This applies on top of a patch that
> makes kernel/power/user.c optional since I had no idea how to fix it,
> problems I see:
>  * it surfaces kernel implementation details about pm_ops and thus makes
>    the whole thing very fragile
>  * it has yet another interface (yuck) to determine whether to reboot,
>    shut down etc, doesn't use /sys/power/disk
>  * I generally had no idea wtf it is doing in some places
> 
> Anyway, this patch is only compile tested, it
>  * introduces include/linux/hibernate.h with hibernate_ops and
>    a new hibernate() function to hibernate the system
>  * rips apart a lot of the suspend code and puts it back together using
>    the hibernate_ops
>  * switches ACPI to hibernate_ops (the only user of pm_ops.pm_disk_mode)
>  * might apply/compile against -mm, I have all my and some of Rafael's
>    suspend/hibernate work in my tree.
>  * breaks user suspend as I noted above
>  * is incomplete, somewhere pm_suspend_disk() is still defined iirc

OK, I reworked it a bit.

Main changes:

- IMHO 'hibernation_ops' sounds better than 'hibernate_ops', for example, so
now the new names start with 'hibernation_' (or 'HIBERNATION_')

- Moved the hibernation-related definitions to include/linux/suspend.h, since
some hibernation-specific definitions are already there.  We can introduce
hibernation.h in a separate patch (it'll have to #include suspend.h IMO).

- Changed the names starting from 'pm_disk_' (or 'PM_DISK_').

- Cleaned up the new ACPI code (it didn't compile and included some things
unrelated to hibernation).  I'm still not sure about acpi_hibernation_finish()
(is the code after acpi_disable_wakeup_device() really needed?)

- Made kernel/power/user.c compile (and hopefully work too)

It looks like we'll have to change CONFIG_SOFTWARE_SUSPEND into
CONFIG_HIBERNATION, since some pieces of code now look silly.

The appended patch is agaist 2.6.21-rc7-mm2 with two freezer patches applied
(should not affect this one).  Compilation tested on x86_64.

Greetings,
Rafael

---
 Documentation/power/userland-swsusp.txt |   26 ++--
 drivers/acpi/sleep/main.c               |   79 +++++++++++--
 drivers/acpi/sleep/proc.c               |    2 
 drivers/i2c/chips/tps65010.c            |    2 
 include/linux/pm.h                      |   31 -----
 kernel/power/disk.c                     |  184 +++++++++++++++++---------------
 kernel/power/main.c                     |   42 ++-----
 kernel/power/power.h                    |    7 -
 kernel/power/user.c                     |   13 +-
 kernel/sys.c                            |    2 
 10 files changed, 204 insertions(+), 184 deletions(-)

Index: linux-2.6.21-rc7-mm2/include/linux/pm.h
===================================================================
--- linux-2.6.21-rc7-mm2.orig/include/linux/pm.h	2007-04-29 13:39:02.000000000 +0200
+++ linux-2.6.21-rc7-mm2/include/linux/pm.h	2007-04-29 13:39:17.000000000 +0200
@@ -107,26 +107,11 @@ typedef int __bitwise suspend_state_t;
 #define PM_SUSPEND_ON		((__force suspend_state_t) 0)
 #define PM_SUSPEND_STANDBY	((__force suspend_state_t) 1)
 #define PM_SUSPEND_MEM		((__force suspend_state_t) 3)
-#define PM_SUSPEND_DISK		((__force suspend_state_t) 4)
-#define PM_SUSPEND_MAX		((__force suspend_state_t) 5)
-
-typedef int __bitwise suspend_disk_method_t;
-
-/* invalid must be 0 so struct pm_ops initialisers can leave it out */
-#define PM_DISK_INVALID		((__force suspend_disk_method_t) 0)
-#define	PM_DISK_PLATFORM	((__force suspend_disk_method_t) 1)
-#define	PM_DISK_SHUTDOWN	((__force suspend_disk_method_t) 2)
-#define	PM_DISK_REBOOT		((__force suspend_disk_method_t) 3)
-#define	PM_DISK_TEST		((__force suspend_disk_method_t) 4)
-#define	PM_DISK_TESTPROC	((__force suspend_disk_method_t) 5)
-#define	PM_DISK_MAX		((__force suspend_disk_method_t) 6)
+#define PM_SUSPEND_MAX		((__force suspend_state_t) 4)
 
 /**
  * struct pm_ops - Callbacks for managing platform dependent suspend states.
  * @valid: Callback to determine whether the given state can be entered.
- * 	If %CONFIG_SOFTWARE_SUSPEND is set then %PM_SUSPEND_DISK is
- *	always valid and never passed to this call. If not assigned,
- *	no suspend states are valid.
  *	Valid states are advertised in /sys/power/state but can still
  *	be rejected by prepare or enter if the conditions aren't right.
  *	There is a %pm_valid_only_mem function available that can be assigned
@@ -140,24 +125,12 @@ typedef int __bitwise suspend_disk_metho
  *
  * @finish: Called when the system has left the given state and all devices
  *	are resumed. The return value is ignored.
- *
- * @pm_disk_mode: The generic code always allows one of the shutdown methods
- *	%PM_DISK_SHUTDOWN, %PM_DISK_REBOOT, %PM_DISK_TEST and
- *	%PM_DISK_TESTPROC. If this variable is set, the mode it is set
- *	to is allowed in addition to those modes and is also made default.
- *	When this mode is sent selected, the @prepare call will be called
- *	before suspending to disk (if present), the @enter call should be
- *	present and will be called after all state has been saved and the
- *	machine is ready to be powered off; the @finish callback is called
- *	after state has been restored. All these calls are called with
- *	%PM_SUSPEND_DISK as the state.
  */
 struct pm_ops {
 	int (*valid)(suspend_state_t state);
 	int (*prepare)(suspend_state_t state);
 	int (*enter)(suspend_state_t state);
 	int (*finish)(suspend_state_t state);
-	suspend_disk_method_t pm_disk_mode;
 };
 
 /**
@@ -258,8 +231,6 @@ extern void device_power_up(void);
 extern void device_resume(void);
 
 #ifdef CONFIG_PM
-extern suspend_disk_method_t pm_disk_mode;
-
 extern int device_suspend(pm_message_t state);
 extern int device_prepare_suspend(pm_message_t state);
 
Index: linux-2.6.21-rc7-mm2/kernel/power/main.c
===================================================================
--- linux-2.6.21-rc7-mm2.orig/kernel/power/main.c	2007-04-29 13:39:02.000000000 +0200
+++ linux-2.6.21-rc7-mm2/kernel/power/main.c	2007-04-29 13:43:34.000000000 +0200
@@ -30,7 +30,6 @@
 DEFINE_MUTEX(pm_mutex);
 
 struct pm_ops *pm_ops;
-suspend_disk_method_t pm_disk_mode = PM_DISK_SHUTDOWN;
 
 /**
  *	pm_set_ops - Set the global power method table. 
@@ -41,10 +40,6 @@ void pm_set_ops(struct pm_ops * ops)
 {
 	mutex_lock(&pm_mutex);
 	pm_ops = ops;
-	if (ops && ops->pm_disk_mode != PM_DISK_INVALID) {
-		pm_disk_mode = ops->pm_disk_mode;
-	} else
-		pm_disk_mode = PM_DISK_SHUTDOWN;
 	mutex_unlock(&pm_mutex);
 }
 
@@ -196,24 +191,12 @@ static void suspend_finish(suspend_state
 static const char * const pm_states[PM_SUSPEND_MAX] = {
 	[PM_SUSPEND_STANDBY]	= "standby",
 	[PM_SUSPEND_MEM]	= "mem",
-	[PM_SUSPEND_DISK]	= "disk",
 };
 
 static inline int valid_state(suspend_state_t state)
 {
-	/* Suspend-to-disk does not really need low-level support.
-	 * It can work with shutdown/reboot if needed. If it isn't
-	 * configured, then it cannot be supported.
-	 */
-	if (state == PM_SUSPEND_DISK)
-#ifdef CONFIG_SOFTWARE_SUSPEND
-		return 1;
-#else
-		return 0;
-#endif
-
-	/* all other states need lowlevel support and need to be
-	 * valid to the lowlevel implementation, no valid callback
+	/* All states need lowlevel support and need to be valid
+	 * to the lowlevel implementation, no valid callback
 	 * implies that none are valid. */
 	if (!pm_ops || !pm_ops->valid || !pm_ops->valid(state))
 		return 0;
@@ -241,11 +224,6 @@ static int enter_state(suspend_state_t s
 	if (!mutex_trylock(&pm_mutex))
 		return -EBUSY;
 
-	if (state == PM_SUSPEND_DISK) {
-		error = pm_suspend_disk();
-		goto Unlock;
-	}
-
 	pr_debug("PM: Preparing system for %s sleep\n", pm_states[state]);
 	if ((error = suspend_prepare(state)))
 		goto Unlock;
@@ -263,7 +241,7 @@ static int enter_state(suspend_state_t s
 
 /**
  *	pm_suspend - Externally visible function for suspending system.
- *	@state:		Enumarted value of state to enter.
+ *	@state:		Enumerated value of state to enter.
  *
  *	Determine whether or not value is within range, get state 
  *	structure, and enter (above).
@@ -301,7 +279,13 @@ static ssize_t state_show(struct subsyst
 		if (pm_states[i] && valid_state(i))
 			s += sprintf(s,"%s ", pm_states[i]);
 	}
-	s += sprintf(s,"\n");
+#ifdef CONFIG_SOFTWARE_SUSPEND
+	s += sprintf(s, "%s\n", "disk");
+#else
+	if (s != buf)
+		/* convert the last space to a newline */
+		*(s-1) = "\n";
+#endif
 	return (s - buf);
 }
 
@@ -316,6 +300,12 @@ static ssize_t state_store(struct subsys
 	p = memchr(buf, '\n', n);
 	len = p ? p - buf : n;
 
+	/* First, check if we are requested to hibernate */
+	if (strncmp(buf, "disk", len)) {
+		error = hibernate();
+		return error ? error : n;
+	}
+
 	for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++) {
 		if (*s && !strncmp(buf, *s, len))
 			break;
Index: linux-2.6.21-rc7-mm2/kernel/power/disk.c
===================================================================
--- linux-2.6.21-rc7-mm2.orig/kernel/power/disk.c	2007-04-29 13:39:02.000000000 +0200
+++ linux-2.6.21-rc7-mm2/kernel/power/disk.c	2007-04-29 13:54:50.000000000 +0200
@@ -30,30 +30,60 @@ char resume_file[256] = CONFIG_PM_STD_PA
 dev_t swsusp_resume_device;
 sector_t swsusp_resume_block;
 
+static int hibernation_mode;
+
+enum {
+	HIBERNATION_INVALID,
+	HIBERNATION_PLATFORM,
+	HIBERNATION_TEST,
+	HIBERNATION_TESTPROC,
+	HIBERNATION_SHUTDOWN,
+	HIBERNATION_REBOOT,
+	/* keep last */
+	__HIBERNATION_AFTER_LAST
+};
+#define HIBERNATION_MAX (__HIBERNATION_AFTER_LAST-1)
+#define HIBERNATION_FIRST (HIBERNATION_INVALID + 1)
+
+struct hibernation_ops *hibernation_ops;
+
+void hibernation_set_ops(struct hibernation_ops *ops)
+{
+	mutex_lock(&pm_mutex);
+	hibernation_ops = ops;
+	mutex_unlock(&pm_mutex);
+	if (hibernation_ops) {
+		BUG_ON(!hibernation_ops->prepare);
+		BUG_ON(!hibernation_ops->enter);
+		BUG_ON(!hibernation_ops->finish);
+	}
+}
+
+
 /**
  *	platform_prepare - prepare the machine for hibernation using the
  *	platform driver if so configured and return an error code if it fails
  */
 
-static inline int platform_prepare(void)
+static int platform_prepare(void)
 {
-	int error = 0;
+	return (hibernation_mode == HIBERNATION_PLATFORM && hibernation_ops) ?
+		hibernation_ops->prepare() : 0;
+}
 
-	switch (pm_disk_mode) {
-	case PM_DISK_TEST:
-	case PM_DISK_TESTPROC:
-	case PM_DISK_SHUTDOWN:
-	case PM_DISK_REBOOT:
-		break;
-	default:
-		if (pm_ops && pm_ops->prepare)
-			error = pm_ops->prepare(PM_SUSPEND_DISK);
-	}
-	return error;
+/**
+ *	platform_finish - switch the machine to the normal mode of operation
+ *	using the platform driver (must be called after platform_prepare())
+ */
+
+static void platform_finish(void)
+{
+	if (hibernation_mode == HIBERNATION_PLATFORM && hibernation_ops)
+		hibernation_ops->finish();
 }
 
 /**
- *	power_down - Shut machine down for hibernate.
+ *	power_down - Shut the machine down for hibernation.
  *
  *	Use the platform driver, if configured so; otherwise try
  *	to power off or reboot.
@@ -61,20 +91,20 @@ static inline int platform_prepare(void)
 
 static void power_down(void)
 {
-	switch (pm_disk_mode) {
-	case PM_DISK_TEST:
-	case PM_DISK_TESTPROC:
+	switch (hibernation_mode) {
+	case HIBERNATION_TEST:
+	case HIBERNATION_TESTPROC:
 		break;
-	case PM_DISK_SHUTDOWN:
+	case HIBERNATION_SHUTDOWN:
 		kernel_power_off();
 		break;
-	case PM_DISK_REBOOT:
+	case HIBERNATION_REBOOT:
 		kernel_restart(NULL);
 		break;
-	default:
-		if (pm_ops && pm_ops->enter) {
+	case HIBERNATION_PLATFORM:
+		if (hibernation_ops) {
 			kernel_shutdown_prepare(SYSTEM_SUSPEND_DISK);
-			pm_ops->enter(PM_SUSPEND_DISK);
+			hibernation_ops->enter();
 			break;
 		}
 	}
@@ -87,20 +117,6 @@ static void power_down(void)
 	while(1);
 }
 
-static inline void platform_finish(void)
-{
-	switch (pm_disk_mode) {
-	case PM_DISK_TEST:
-	case PM_DISK_TESTPROC:
-	case PM_DISK_SHUTDOWN:
-	case PM_DISK_REBOOT:
-		break;
-	default:
-		if (pm_ops && pm_ops->finish)
-			pm_ops->finish(PM_SUSPEND_DISK);
-	}
-}
-
 static void unprepare_processes(void)
 {
 	thaw_processes();
@@ -120,13 +136,10 @@ static int prepare_processes(void)
 }
 
 /**
- *	pm_suspend_disk - The granpappy of hibernation power management.
- *
- *	If not, then call swsusp to do its thing, then figure out how
- *	to power down the system.
+ *	hibernate - The granpappy of the built-in hibernation management
  */
 
-int pm_suspend_disk(void)
+int hibernate(void)
 {
 	int error;
 
@@ -151,7 +164,7 @@ int pm_suspend_disk(void)
 	if (error)
 		goto Thaw;
 
-	if (pm_disk_mode == PM_DISK_TESTPROC) {
+	if (hibernation_mode == HIBERNATION_TESTPROC) {
 		printk("swsusp debug: Waiting for 5 seconds.\n");
 		mdelay(5000);
 		goto Thaw;
@@ -176,7 +189,7 @@ int pm_suspend_disk(void)
 	if (error)
 		goto Enable_cpus;
 
-	if (pm_disk_mode == PM_DISK_TEST) {
+	if (hibernation_mode == HIBERNATION_TEST) {
 		printk("swsusp debug: Waiting for 5 seconds.\n");
 		mdelay(5000);
 		goto Enable_cpus;
@@ -230,7 +243,7 @@ int pm_suspend_disk(void)
  *	Called as a late_initcall (so all devices are discovered and
  *	initialized), we call swsusp to see if we have a saved image or not.
  *	If so, we quiesce devices, the restore the saved image. We will
- *	return above (in pm_suspend_disk() ) if everything goes well.
+ *	return above (in hibernate() ) if everything goes well.
  *	Otherwise, we fail gracefully and return to the normally
  *	scheduled program.
  *
@@ -336,25 +349,26 @@ static int software_resume(void)
 late_initcall(software_resume);
 
 
-static const char * const pm_disk_modes[] = {
-	[PM_DISK_PLATFORM]	= "platform",
-	[PM_DISK_SHUTDOWN]	= "shutdown",
-	[PM_DISK_REBOOT]	= "reboot",
-	[PM_DISK_TEST]		= "test",
-	[PM_DISK_TESTPROC]	= "testproc",
+static const char * const hibernation_modes[] = {
+	[HIBERNATION_PLATFORM]	= "platform",
+	[HIBERNATION_SHUTDOWN]	= "shutdown",
+	[HIBERNATION_REBOOT]	= "reboot",
+	[HIBERNATION_TEST]	= "test",
+	[HIBERNATION_TESTPROC]	= "testproc",
 };
 
 /**
- *	disk - Control suspend-to-disk mode
+ *	disk - Control hibernation mode
  *
  *	Suspend-to-disk can be handled in several ways. We have a few options
  *	for putting the system to sleep - using the platform driver (e.g. ACPI
- *	or other pm_ops), powering off the system or rebooting the system
- *	(for testing) as well as the two test modes.
+ *	or other hibernation_ops), powering off the system or rebooting the
+ *	system (for testing) as well as the two test modes.
  *
  *	The system can support 'platform', and that is known a priori (and
- *	encoded in pm_ops). However, the user may choose 'shutdown' or 'reboot'
- *	as alternatives, as well as the test modes 'test' and 'testproc'.
+ *	encoded by the presence of hibernation_ops). However, the user may
+ *	choose 'shutdown' or 'reboot' as alternatives, as well as one fo the
+ *	test modes, 'test' or 'testproc'.
  *
  *	show() will display what the mode is currently set to.
  *	store() will accept one of
@@ -366,7 +380,7 @@ static const char * const pm_disk_modes[
  *	'testproc'
  *
  *	It will only change to 'platform' if the system
- *	supports it (as determined from pm_ops->pm_disk_mode).
+ *	supports it (as determined by having hibernation_ops).
  */
 
 static ssize_t disk_show(struct subsystem * subsys, char * buf)
@@ -374,27 +388,26 @@ static ssize_t disk_show(struct subsyste
 	int i;
 	char *start = buf;
 
-	for (i = PM_DISK_PLATFORM; i < PM_DISK_MAX; i++) {
-		if (!pm_disk_modes[i])
+	for (i = HIBERNATION_FIRST; i <= HIBERNATION_MAX; i++) {
+		if (!hibernation_modes[i])
 			continue;
 		switch (i) {
-		case PM_DISK_SHUTDOWN:
-		case PM_DISK_REBOOT:
-		case PM_DISK_TEST:
-		case PM_DISK_TESTPROC:
+		case HIBERNATION_SHUTDOWN:
+		case HIBERNATION_REBOOT:
+		case HIBERNATION_TEST:
+		case HIBERNATION_TESTPROC:
 			break;
-		default:
-			if (pm_ops && pm_ops->enter &&
-			    (i == pm_ops->pm_disk_mode))
+		case HIBERNATION_PLATFORM:
+			if (hibernation_ops)
 				break;
 			/* not a valid mode, continue with loop */
 			continue;
 		}
-		if (i == pm_disk_mode)
-			buf += sprintf(buf, "[%s]", pm_disk_modes[i]);
+		if (i == hibernation_mode)
+			buf += sprintf(buf, "[%s]", hibernation_modes[i]);
 		else
-			buf += sprintf(buf, "%s", pm_disk_modes[i]);
-		if (i+1 != PM_DISK_MAX)
+			buf += sprintf(buf, "%s", hibernation_modes[i]);
+		if (i+1 != HIBERNATION_MAX)
 			buf += sprintf(buf, " ");
 	}
 	buf += sprintf(buf, "\n");
@@ -408,39 +421,38 @@ static ssize_t disk_store(struct subsyst
 	int i;
 	int len;
 	char *p;
-	suspend_disk_method_t mode = 0;
+	int mode = HIBERNATION_INVALID;
 
 	p = memchr(buf, '\n', n);
 	len = p ? p - buf : n;
 
 	mutex_lock(&pm_mutex);
-	for (i = PM_DISK_PLATFORM; i < PM_DISK_MAX; i++) {
-		if (!strncmp(buf, pm_disk_modes[i], len)) {
+	for (i = HIBERNATION_FIRST; i < HIBERNATION_MAX; i++) {
+		if (!strncmp(buf, hibernation_modes[i], len)) {
 			mode = i;
 			break;
 		}
 	}
-	if (mode) {
+	if (mode != HIBERNATION_INVALID) {
 		switch (mode) {
-		case PM_DISK_SHUTDOWN:
-		case PM_DISK_REBOOT:
-		case PM_DISK_TEST:
-		case PM_DISK_TESTPROC:
-			pm_disk_mode = mode;
+		case HIBERNATION_SHUTDOWN:
+		case HIBERNATION_REBOOT:
+		case HIBERNATION_TEST:
+		case HIBERNATION_TESTPROC:
+			hibernation_mode = mode;
 			break;
-		default:
-			if (pm_ops && pm_ops->enter &&
-			    (mode == pm_ops->pm_disk_mode))
-				pm_disk_mode = mode;
+		case HIBERNATION_PLATFORM:
+			if (hibernation_ops)
+				hibernation_mode = mode;
 			else
 				error = -EINVAL;
 		}
-	} else {
+	} else
 		error = -EINVAL;
-	}
 
-	pr_debug("PM: suspend-to-disk mode set to '%s'\n",
-		 pm_disk_modes[mode]);
+	if (!error)
+		pr_debug("PM: suspend-to-disk mode set to '%s'\n",
+			 hibernation_modes[mode]);
 	mutex_unlock(&pm_mutex);
 	return error ? error : n;
 }
Index: linux-2.6.21-rc7-mm2/Documentation/power/userland-swsusp.txt
===================================================================
--- linux-2.6.21-rc7-mm2.orig/Documentation/power/userland-swsusp.txt	2007-04-29 13:39:02.000000000 +0200
+++ linux-2.6.21-rc7-mm2/Documentation/power/userland-swsusp.txt	2007-04-29 13:39:18.000000000 +0200
@@ -93,21 +93,23 @@ SNAPSHOT_S2RAM - suspend to RAM; using t
 	to resume the system from RAM if there's enough battery power or restore
 	its state on the basis of the saved suspend image otherwise)
 
-SNAPSHOT_PMOPS - enable the usage of the pmops->prepare, pmops->enter and
-	pmops->finish methods (the in-kernel swsusp knows these as the "platform
-	method") which are needed on many machines to (among others) speed up
-	the resume by letting the BIOS skip some steps or to let the system
-	recognise the correct state of the hardware after the resume (in
-	particular on many machines this ensures that unplugged AC
-	adapters get correctly detected and that kacpid does not run wild after
-	the resume).  The last ioctl() argument can take one of the three
-	values, defined in kernel/power/power.h:
+SNAPSHOT_PMOPS - enable the usage of the hibernation_ops->prepare,
+	hibernate_ops->enter and hibernation_ops->finish methods (the in-kernel
+	swsusp knows these as the "platform method") which are needed on many
+	machines to (among others) speed up the resume by letting the BIOS skip
+	some steps or to let the system recognise the correct state of the
+	hardware after the resume (in particular on many machines this ensures
+	that unplugged AC adapters get correctly detected and that kacpid does
+	not run wild after the resume).  The last ioctl() argument can take one
+	of the three values, defined in kernel/power/power.h:
 	PMOPS_PREPARE - make the kernel carry out the
-		pm_ops->prepare(PM_SUSPEND_DISK) operation
+		hibernation_ops->prepare() operation
 	PMOPS_ENTER - make the kernel power off the system by calling
-		pm_ops->enter(PM_SUSPEND_DISK)
+		hibernation_ops->enter()
 	PMOPS_FINISH - make the kernel carry out the
-		pm_ops->finish(PM_SUSPEND_DISK) operation
+		hibernation_ops->finish() operation
+	Note that the actual constants are misnamed because they surface
+	internal kernel implementation details that have changed.
 
 The device's read() operation can be used to transfer the snapshot image from
 the kernel.  It has the following limitations:
Index: linux-2.6.21-rc7-mm2/drivers/i2c/chips/tps65010.c
===================================================================
--- linux-2.6.21-rc7-mm2.orig/drivers/i2c/chips/tps65010.c	2007-04-29 13:39:02.000000000 +0200
+++ linux-2.6.21-rc7-mm2/drivers/i2c/chips/tps65010.c	2007-04-29 13:39:18.000000000 +0200
@@ -354,7 +354,7 @@ static void tps65010_interrupt(struct tp
 			 * also needs to get error handling and probably
 			 * an #ifdef CONFIG_SOFTWARE_SUSPEND
 			 */
-			pm_suspend(PM_SUSPEND_DISK);
+			hibernate();
 #endif
 			poll = 1;
 		}
Index: linux-2.6.21-rc7-mm2/kernel/sys.c
===================================================================
--- linux-2.6.21-rc7-mm2.orig/kernel/sys.c	2007-04-29 13:39:02.000000000 +0200
+++ linux-2.6.21-rc7-mm2/kernel/sys.c	2007-04-29 13:39:18.000000000 +0200
@@ -942,7 +942,7 @@ asmlinkage long sys_reboot(int magic1, i
 #ifdef CONFIG_SOFTWARE_SUSPEND
 	case LINUX_REBOOT_CMD_SW_SUSPEND:
 		{
-			int ret = pm_suspend(PM_SUSPEND_DISK);
+			int ret = hibernate();
 			unlock_kernel();
 			return ret;
 		}
Index: linux-2.6.21-rc7-mm2/drivers/acpi/sleep/main.c
===================================================================
--- linux-2.6.21-rc7-mm2.orig/drivers/acpi/sleep/main.c	2007-04-29 13:39:02.000000000 +0200
+++ linux-2.6.21-rc7-mm2/drivers/acpi/sleep/main.c	2007-04-29 14:16:30.000000000 +0200
@@ -29,7 +29,6 @@ static u32 acpi_suspend_states[] = {
 	[PM_SUSPEND_ON] = ACPI_STATE_S0,
 	[PM_SUSPEND_STANDBY] = ACPI_STATE_S1,
 	[PM_SUSPEND_MEM] = ACPI_STATE_S3,
-	[PM_SUSPEND_DISK] = ACPI_STATE_S4,
 	[PM_SUSPEND_MAX] = ACPI_STATE_S5
 };
 
@@ -94,14 +93,6 @@ static int acpi_pm_enter(suspend_state_t
 		do_suspend_lowlevel();
 		break;
 
-	case PM_SUSPEND_DISK:
-		if (acpi_pm_ops.pm_disk_mode == PM_DISK_PLATFORM)
-			status = acpi_enter_sleep_state(acpi_state);
-		break;
-	case PM_SUSPEND_MAX:
-		acpi_power_off();
-		break;
-
 	default:
 		return -EINVAL;
 	}
@@ -157,12 +148,13 @@ int acpi_suspend(u32 acpi_state)
 	suspend_state_t states[] = {
 		[1] = PM_SUSPEND_STANDBY,
 		[3] = PM_SUSPEND_MEM,
-		[4] = PM_SUSPEND_DISK,
 		[5] = PM_SUSPEND_MAX
 	};
 
 	if (acpi_state < 6 && states[acpi_state])
 		return pm_suspend(states[acpi_state]);
+	if (acpi_state == 4)
+		return hibernate();
 	return -EINVAL;
 }
 
@@ -189,6 +181,61 @@ static struct pm_ops acpi_pm_ops = {
 	.finish = acpi_pm_finish,
 };
 
+#ifdef CONFIG_SOFTWARE_SUSPEND
+static int acpi_hibernation_prepare(void)
+{
+	return acpi_sleep_prepare(ACPI_STATE_S4);
+}
+
+static int acpi_hibernation_enter(void)
+{
+	acpi_status status = AE_OK;
+	unsigned long flags = 0;
+	int error;
+
+	ACPI_FLUSH_CPU_CACHE();
+
+	/* Do arch specific saving of state. */
+	error = acpi_save_state_mem();
+	if (error)
+		return error;
+
+	local_irq_save(flags);
+	acpi_enable_wakeup_device(ACPI_STATE_S4);
+	status = acpi_enter_sleep_state(ACPI_STATE_S4);
+	local_irq_restore(flags);
+
+	/*
+	 * Restore processor state
+	 * We should only be here if we're coming back from hibernation and
+	 * the memory image should have already been loaded from disk.
+	 */
+	acpi_restore_state_mem();
+
+	return ACPI_SUCCESS(status) ? 0 : -EFAULT;
+}
+
+static void acpi_hibernation_finish(void)
+{
+	acpi_leave_sleep_state(ACPI_STATE_S4);
+	acpi_disable_wakeup_device(ACPI_STATE_S4);
+
+	/* reset firmware waking vector */
+	acpi_set_firmware_waking_vector((acpi_physical_address) 0);
+
+	if (init_8259A_after_S1) {
+		printk("Broken toshiba laptop -> kicking interrupts\n");
+		init_8259A(0);
+	}
+}
+
+static struct hibernation_ops acpi_hibernation_ops = {
+	.prepare = acpi_hibernation_prepare,
+	.enter = acpi_hibernation_enter,
+	.finish = acpi_hibernation_finish,
+};
+#endif /* CONFIG_SOFTWARE_SUSPEND */
+
 /*
  * Toshiba fails to preserve interrupts over S1, reinitialization
  * of 8259 is needed after S1 resume.
@@ -227,14 +274,18 @@ int __init acpi_sleep_init(void)
 			sleep_states[i] = 1;
 			printk(" S%d", i);
 		}
-		if (i == ACPI_STATE_S4) {
-			if (sleep_states[i])
-				acpi_pm_ops.pm_disk_mode = PM_DISK_PLATFORM;
-		}
 	}
 	printk(")\n");
 
 	pm_set_ops(&acpi_pm_ops);
+
+#ifdef CONFIG_SOFTWARE_SUSPEND
+	if (sleep_states[ACPI_STATE_S4])
+		hibernation_set_ops(&acpi_hibernation_ops);
+#else
+	sleep_states[ACPI_STATE_S4] = 0;
+#endif
+
 	return 0;
 }
 
Index: linux-2.6.21-rc7-mm2/kernel/power/power.h
===================================================================
--- linux-2.6.21-rc7-mm2.orig/kernel/power/power.h	2007-04-29 13:39:02.000000000 +0200
+++ linux-2.6.21-rc7-mm2/kernel/power/power.h	2007-04-29 13:55:55.000000000 +0200
@@ -25,12 +25,7 @@ struct swsusp_info {
  */
 #define SPARE_PAGES	((1024 * 1024) >> PAGE_SHIFT)
 
-extern int pm_suspend_disk(void);
-#else
-static inline int pm_suspend_disk(void)
-{
-	return -EPERM;
-}
+extern struct hibernation_ops *hibernation_ops;
 #endif
 
 extern int pfn_is_nosave(unsigned long);
Index: linux-2.6.21-rc7-mm2/drivers/acpi/sleep/proc.c
===================================================================
--- linux-2.6.21-rc7-mm2.orig/drivers/acpi/sleep/proc.c	2007-04-29 13:39:02.000000000 +0200
+++ linux-2.6.21-rc7-mm2/drivers/acpi/sleep/proc.c	2007-04-29 13:49:42.000000000 +0200
@@ -60,7 +60,7 @@ acpi_system_write_sleep(struct file *fil
 	state = simple_strtoul(str, NULL, 0);
 #ifdef CONFIG_SOFTWARE_SUSPEND
 	if (state == 4) {
-		error = pm_suspend(PM_SUSPEND_DISK);
+		error = hibernate();
 		goto Done;
 	}
 #endif
Index: linux-2.6.21-rc7-mm2/kernel/power/user.c
===================================================================
--- linux-2.6.21-rc7-mm2.orig/kernel/power/user.c	2007-04-29 13:43:34.000000000 +0200
+++ linux-2.6.21-rc7-mm2/kernel/power/user.c	2007-04-29 14:00:42.000000000 +0200
@@ -138,16 +138,16 @@ static inline int platform_prepare(void)
 {
 	int error = 0;
 
-	if (pm_ops && pm_ops->prepare)
-		error = pm_ops->prepare(PM_SUSPEND_DISK);
+	if (hibernation_ops)
+		error = hibernation_ops->prepare();
 
 	return error;
 }
 
 static inline void platform_finish(void)
 {
-	if (pm_ops && pm_ops->finish)
-		pm_ops->finish(PM_SUSPEND_DISK);
+	if (hibernation_ops)
+		hibernation_ops->finish();
 }
 
 static inline int snapshot_suspend(int platform_suspend)
@@ -407,7 +407,7 @@ static int snapshot_ioctl(struct inode *
 		switch (arg) {
 
 		case PMOPS_PREPARE:
-			if (pm_ops && pm_ops->enter) {
+			if (hibernation_ops) {
 				data->platform_suspend = 1;
 				error = 0;
 			} else {
@@ -418,8 +418,7 @@ static int snapshot_ioctl(struct inode *
 		case PMOPS_ENTER:
 			if (data->platform_suspend) {
 				kernel_shutdown_prepare(SYSTEM_SUSPEND_DISK);
-				error = pm_ops->enter(PM_SUSPEND_DISK);
-				error = 0;
+				error = hibernation_ops->enter();
 			}
 			break;
 




> 
> johannes
> ---
>  Documentation/power/userland-swsusp.txt |   26 +++----
>  drivers/acpi/sleep/main.c               |   89 ++++++++++++++++++++----
>  drivers/acpi/sleep/proc.c               |    3 
>  drivers/i2c/chips/tps65010.c            |    2 
>  include/linux/hibernate.h               |   36 +++++++++
>  include/linux/pm.h                      |   31 --------
>  kernel/power/disk.c                     |  117 +++++++++++++++++++-------------
>  kernel/power/main.c                     |   47 +++++-------
>  kernel/power/power.h                    |   13 ---
>  kernel/power/user.c                     |   28 +------
>  kernel/sys.c                            |    3 
>  11 files changed, 231 insertions(+), 164 deletions(-)
> 
> --- wireless-dev.orig/include/linux/pm.h	2007-04-26 18:15:00.440691185 +0200
> +++ wireless-dev/include/linux/pm.h	2007-04-26 18:15:09.410691185 +0200
> @@ -107,26 +107,11 @@ typedef int __bitwise suspend_state_t;
>  #define PM_SUSPEND_ON		((__force suspend_state_t) 0)
>  #define PM_SUSPEND_STANDBY	((__force suspend_state_t) 1)
>  #define PM_SUSPEND_MEM		((__force suspend_state_t) 3)
> -#define PM_SUSPEND_DISK		((__force suspend_state_t) 4)
> -#define PM_SUSPEND_MAX		((__force suspend_state_t) 5)
> -
> -typedef int __bitwise suspend_disk_method_t;
> -
> -/* invalid must be 0 so struct pm_ops initialisers can leave it out */
> -#define PM_DISK_INVALID		((__force suspend_disk_method_t) 0)
> -#define	PM_DISK_PLATFORM	((__force suspend_disk_method_t) 1)
> -#define	PM_DISK_SHUTDOWN	((__force suspend_disk_method_t) 2)
> -#define	PM_DISK_REBOOT		((__force suspend_disk_method_t) 3)
> -#define	PM_DISK_TEST		((__force suspend_disk_method_t) 4)
> -#define	PM_DISK_TESTPROC	((__force suspend_disk_method_t) 5)
> -#define	PM_DISK_MAX		((__force suspend_disk_method_t) 6)
> +#define PM_SUSPEND_MAX		((__force suspend_state_t) 4)
>  
>  /**
>   * struct pm_ops - Callbacks for managing platform dependent suspend states.
>   * @valid: Callback to determine whether the given state can be entered.
> - * 	If %CONFIG_SOFTWARE_SUSPEND is set then %PM_SUSPEND_DISK is
> - *	always valid and never passed to this call. If not assigned,
> - *	no suspend states are valid.
>   *	Valid states are advertised in /sys/power/state but can still
>   *	be rejected by prepare or enter if the conditions aren't right.
>   *	There is a %pm_valid_only_mem function available that can be assigned
> @@ -140,24 +125,12 @@ typedef int __bitwise suspend_disk_metho
>   *
>   * @finish: Called when the system has left the given state and all devices
>   *	are resumed. The return value is ignored.
> - *
> - * @pm_disk_mode: The generic code always allows one of the shutdown methods
> - *	%PM_DISK_SHUTDOWN, %PM_DISK_REBOOT, %PM_DISK_TEST and
> - *	%PM_DISK_TESTPROC. If this variable is set, the mode it is set
> - *	to is allowed in addition to those modes and is also made default.
> - *	When this mode is sent selected, the @prepare call will be called
> - *	before suspending to disk (if present), the @enter call should be
> - *	present and will be called after all state has been saved and the
> - *	machine is ready to be powered off; the @finish callback is called
> - *	after state has been restored. All these calls are called with
> - *	%PM_SUSPEND_DISK as the state.
>   */
>  struct pm_ops {
>  	int (*valid)(suspend_state_t state);
>  	int (*prepare)(suspend_state_t state);
>  	int (*enter)(suspend_state_t state);
>  	int (*finish)(suspend_state_t state);
> -	suspend_disk_method_t pm_disk_mode;
>  };
>  
>  /**
> @@ -276,8 +249,6 @@ extern void device_power_up(void);
>  extern void device_resume(void);
>  
>  #ifdef CONFIG_PM
> -extern suspend_disk_method_t pm_disk_mode;
> -
>  extern int device_suspend(pm_message_t state);
>  extern int device_prepare_suspend(pm_message_t state);
>  
> --- wireless-dev.orig/kernel/power/main.c	2007-04-26 18:15:00.790691185 +0200
> +++ wireless-dev/kernel/power/main.c	2007-04-26 18:15:09.410691185 +0200
> @@ -21,6 +21,7 @@
>  #include <linux/resume-trace.h>
>  #include <linux/freezer.h>
>  #include <linux/vmstat.h>
> +#include <linux/hibernate.h>
>  
>  #include "power.h"
>  
> @@ -30,7 +31,6 @@
>  DEFINE_MUTEX(pm_mutex);
>  
>  struct pm_ops *pm_ops;
> -suspend_disk_method_t pm_disk_mode = PM_DISK_SHUTDOWN;
>  
>  /**
>   *	pm_set_ops - Set the global power method table. 
> @@ -41,10 +41,6 @@ void pm_set_ops(struct pm_ops * ops)
>  {
>  	mutex_lock(&pm_mutex);
>  	pm_ops = ops;
> -	if (ops && ops->pm_disk_mode != PM_DISK_INVALID) {
> -		pm_disk_mode = ops->pm_disk_mode;
> -	} else
> -		pm_disk_mode = PM_DISK_SHUTDOWN;
>  	mutex_unlock(&pm_mutex);
>  }
>  
> @@ -184,24 +180,12 @@ static void suspend_finish(suspend_state
>  static const char * const pm_states[PM_SUSPEND_MAX] = {
>  	[PM_SUSPEND_STANDBY]	= "standby",
>  	[PM_SUSPEND_MEM]	= "mem",
> -	[PM_SUSPEND_DISK]	= "disk",
>  };
>  
>  static inline int valid_state(suspend_state_t state)
>  {
> -	/* Suspend-to-disk does not really need low-level support.
> -	 * It can work with shutdown/reboot if needed. If it isn't
> -	 * configured, then it cannot be supported.
> -	 */
> -	if (state == PM_SUSPEND_DISK)
> -#ifdef CONFIG_SOFTWARE_SUSPEND
> -		return 1;
> -#else
> -		return 0;
> -#endif
> -
> -	/* all other states need lowlevel support and need to be
> -	 * valid to the lowlevel implementation, no valid callback
> +	/* All states need lowlevel support and need to be valid
> +	 * to the lowlevel implementation, no valid callback
>  	 * implies that none are valid. */
>  	if (!pm_ops || !pm_ops->valid || !pm_ops->valid(state))
>  		return 0;
> @@ -229,11 +213,6 @@ static int enter_state(suspend_state_t s
>  	if (!mutex_trylock(&pm_mutex))
>  		return -EBUSY;
>  
> -	if (state == PM_SUSPEND_DISK) {
> -		error = pm_suspend_disk();
> -		goto Unlock;
> -	}
> -
>  	pr_debug("PM: Preparing system for %s sleep\n", pm_states[state]);
>  	if ((error = suspend_prepare(state)))
>  		goto Unlock;
> @@ -251,7 +230,7 @@ static int enter_state(suspend_state_t s
>  
>  /**
>   *	pm_suspend - Externally visible function for suspending system.
> - *	@state:		Enumarted value of state to enter.
> + *	@state:		Enumerated value of state to enter.
>   *
>   *	Determine whether or not value is within range, get state 
>   *	structure, and enter (above).
> @@ -283,13 +262,19 @@ decl_subsys(power,NULL,NULL);
>  static ssize_t state_show(struct subsystem * subsys, char * buf)
>  {
>  	int i;
> -	char * s = buf;
> +	char *s = buf;
>  
>  	for (i = 0; i < PM_SUSPEND_MAX; i++) {
>  		if (pm_states[i] && valid_state(i))
> -			s += sprintf(s,"%s ", pm_states[i]);
> +			s += sprintf(s, "%s ", pm_states[i]);
>  	}
> -	s += sprintf(s,"\n");
> +#ifdef CONFIG_SOFTWARE_SUSPEND
> +	s += sprintf(s, "%s\n", "disk");
> +#else
> +	if (s != buf)
> +		/* convert the last space to a newline */
> +		*(s-1) = "\n";
> +#endif
>  	return (s - buf);
>  }
>  
> @@ -304,6 +289,12 @@ static ssize_t state_store(struct subsys
>  	p = memchr(buf, '\n', n);
>  	len = p ? p - buf : n;
>  
> +	/* first check hibernate */
> +	if (strncmp(buf, "disk", len)) {
> +		error = hibernate();
> +		return error ? error : n;
> +	}
> +
>  	for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++) {
>  		if (*s && !strncmp(buf, *s, len))
>  			break;
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ wireless-dev/include/linux/hibernate.h	2007-04-26 18:21:38.130691185 +0200
> @@ -0,0 +1,36 @@
> +#ifndef __LINUX_HIBERNATE
> +#define __LINUX_HIBERNATE
> +/*
> + * hibernate ('suspend to disk') functionality
> + */
> +
> +/**
> + * struct hibernate_ops - hibernate platform support
> + *
> + * The methods in this structure allow a platform to override what
> + * happens for shutting down the machine when going into hibernation.
> + *
> + * All three methods must be assigned.
> + *
> + * @prepare: prepare system for hibernation
> + * @enter: shut down system after state has been saved to disk
> + * @finish: finish/clean up after state has been reloaded
> + */
> +struct hibernate_ops {
> +	int (*prepare)(void);
> +	int (*enter)(void);
> +	void (*finish)(void);
> +};
> +
> +/**
> + * hibernate_set_ops - set the global hibernate operations
> + * @ops: the hibernate operations to use from now on.
> + */
> +void hibernate_set_ops(struct hibernate_ops *ops);
> +
> +/**
> + * hibernate - hibernate the system
> + */
> +int hibernate(void);
> +
> +#endif /* __LINUX_HIBERNATE */
> --- wireless-dev.orig/kernel/power/disk.c	2007-04-26 18:15:00.800691185 +0200
> +++ wireless-dev/kernel/power/disk.c	2007-04-26 18:15:09.420691185 +0200
> @@ -21,45 +21,72 @@
>  #include <linux/console.h>
>  #include <linux/cpu.h>
>  #include <linux/freezer.h>
> +#include <linux/hibernate.h>
>  
>  #include "power.h"
>  
>  
> -static int noresume = 0;
> +static int noresume;
>  char resume_file[256] = CONFIG_PM_STD_PARTITION;
>  dev_t swsusp_resume_device;
>  sector_t swsusp_resume_block;
>  
> +static struct hibernate_ops *hibernate_ops;
> +static int pm_disk_mode;
> +
> +enum {
> +	PM_DISK_INVALID,
> +	PM_DISK_PLATFORM,
> +	PM_DISK_TEST,
> +	PM_DISK_TESTPROC,
> +	PM_DISK_SHUTDOWN,
> +	PM_DISK_REBOOT,
> +	/* keep last */
> +	__PM_DISK_AFTER_LAST
> +};
> +#define PM_DISK_MAX (__PM_DISK_AFTER_LAST-1)
> +#define PM_DISK_FIRST (PM_DISK_INVALID + 1)
> +
> +void hibernate_set_ops(struct hibernate_ops *ops)
> +{
> +	BUG_ON(!hibernate_ops->prepare);
> +	BUG_ON(!hibernate_ops->enter);
> +	BUG_ON(!hibernate_ops->finish);
> +	mutex_lock(&pm_mutex);
> +	hibernate_ops = ops;
> +	mutex_unlock(&pm_mutex);
> +}
> +
> +
>  /**
> - *	platform_prepare - prepare the machine for hibernation using the
> - *	platform driver if so configured and return an error code if it fails
> + *	hibernate_platform_prepare - prepare the machine for hibernation using
> + *	the platform driver if so configured and return an error code if it
> + *	fails.
>   */
>  
> -static inline int platform_prepare(void)
> +int hibernate_platform_prepare(void)
>  {
> -	int error = 0;
> -
>  	switch (pm_disk_mode) {
>  	case PM_DISK_TEST:
>  	case PM_DISK_TESTPROC:
>  	case PM_DISK_SHUTDOWN:
>  	case PM_DISK_REBOOT:
>  		break;
> -	default:
> -		if (pm_ops && pm_ops->prepare)
> -			error = pm_ops->prepare(PM_SUSPEND_DISK);
> +	case PM_DISK_PLATFORM:
> +		if (hibernate_ops)
> +			return hibernate_ops->prepare();
>  	}
> -	return error;
> +	return 0;
>  }
>  
>  /**
> - *	power_down - Shut machine down for hibernate.
> + *	hibernate_power_down - Shut machine down for hibernate.
>   *
>   *	Use the platform driver, if configured so; otherwise try
>   *	to power off or reboot.
>   */
>  
> -static void power_down(void)
> +static void hibernate_power_down(void)
>  {
>  	switch (pm_disk_mode) {
>  	case PM_DISK_TEST:
> @@ -70,11 +97,10 @@ static void power_down(void)
>  	case PM_DISK_REBOOT:
>  		kernel_restart(NULL);
>  		break;
> -	default:
> -		if (pm_ops && pm_ops->enter) {
> +	case PM_DISK_PLATFORM:
> +		if (hibernate_ops) {
>  			kernel_shutdown_prepare(SYSTEM_SUSPEND_DISK);
> -			pm_ops->enter(PM_SUSPEND_DISK);
> -			break;
> +			hibernate_ops->enter();
>  		}
>  	}
>  
> @@ -85,7 +111,7 @@ static void power_down(void)
>  	while(1);
>  }
>  
> -static inline void platform_finish(void)
> +void hibernate_platform_finish(void)
>  {
>  	switch (pm_disk_mode) {
>  	case PM_DISK_TEST:
> @@ -93,9 +119,9 @@ static inline void platform_finish(void)
>  	case PM_DISK_SHUTDOWN:
>  	case PM_DISK_REBOOT:
>  		break;
> -	default:
> -		if (pm_ops && pm_ops->finish)
> -			pm_ops->finish(PM_SUSPEND_DISK);
> +	case PM_DISK_PLATFORM:
> +		if (hibernate_ops)
> +			hibernate_ops->finish();
>  	}
>  }
>  
> @@ -118,13 +144,13 @@ static int prepare_processes(void)
>  }
>  
>  /**
> - *	pm_suspend_disk - The granpappy of hibernation power management.
> + *	hibernate - The granpappy of hibernation power management.
>   *
>   *	If not, then call swsusp to do its thing, then figure out how
>   *	to power down the system.
>   */
>  
> -int pm_suspend_disk(void)
> +int hibernate(void)
>  {
>  	int error;
>  
> @@ -147,7 +173,7 @@ int pm_suspend_disk(void)
>  	if (error)
>  		goto Finish;
>  
> -	error = platform_prepare();
> +	error = hibernate_platform_prepare();
>  	if (error)
>  		goto Finish;
>  
> @@ -175,13 +201,13 @@ int pm_suspend_disk(void)
>  
>  	if (in_suspend) {
>  		enable_nonboot_cpus();
> -		platform_finish();
> +		hibernate_platform_finish();
>  		device_resume();
>  		resume_console();
>  		pr_debug("PM: writing image.\n");
>  		error = swsusp_write();
>  		if (!error)
> -			power_down();
> +			hibernate_power_down();
>  		else {
>  			swsusp_free();
>  			goto Finish;
> @@ -194,7 +220,7 @@ int pm_suspend_disk(void)
>   Enable_cpus:
>  	enable_nonboot_cpus();
>   Resume_devices:
> -	platform_finish();
> +	hibernate_platform_finish();
>  	device_resume();
>  	resume_console();
>   Finish:
> @@ -211,7 +237,7 @@ int pm_suspend_disk(void)
>   *	Called as a late_initcall (so all devices are discovered and
>   *	initialized), we call swsusp to see if we have a saved image or not.
>   *	If so, we quiesce devices, the restore the saved image. We will
> - *	return above (in pm_suspend_disk() ) if everything goes well.
> + *	return above (in hibernate() ) if everything goes well.
>   *	Otherwise, we fail gracefully and return to the normally
>   *	scheduled program.
>   *
> @@ -311,12 +337,13 @@ static const char * const pm_disk_modes[
>   *
>   *	Suspend-to-disk can be handled in several ways. We have a few options
>   *	for putting the system to sleep - using the platform driver (e.g. ACPI
> - *	or other pm_ops), powering off the system or rebooting the system
> - *	(for testing) as well as the two test modes.
> + *	or other hibernate_ops), powering off the system or rebooting the
> + *	system (for testing) as well as the two test modes.
>   *
>   *	The system can support 'platform', and that is known a priori (and
> - *	encoded in pm_ops). However, the user may choose 'shutdown' or 'reboot'
> - *	as alternatives, as well as the test modes 'test' and 'testproc'.
> + *	encoded by the presence of hibernate_ops). However, the user may choose
> + *	'shutdown' or 'reboot' as alternatives, as well as the test modes 'test'
> + *	and 'testproc'.
>   *
>   *	show() will display what the mode is currently set to.
>   *	store() will accept one of
> @@ -328,7 +355,7 @@ static const char * const pm_disk_modes[
>   *	'testproc'
>   *
>   *	It will only change to 'platform' if the system
> - *	supports it (as determined from pm_ops->pm_disk_mode).
> + *	supports it (as determined by having hibernate_ops).
>   */
>  
>  static ssize_t disk_show(struct subsystem * subsys, char * buf)
> @@ -336,7 +363,7 @@ static ssize_t disk_show(struct subsyste
>  	int i;
>  	char *start = buf;
>  
> -	for (i = PM_DISK_PLATFORM; i < PM_DISK_MAX; i++) {
> +	for (i = PM_DISK_FIRST; i <= PM_DISK_MAX; i++) {
>  		if (!pm_disk_modes[i])
>  			continue;
>  		switch (i) {
> @@ -345,9 +372,8 @@ static ssize_t disk_show(struct subsyste
>  		case PM_DISK_TEST:
>  		case PM_DISK_TESTPROC:
>  			break;
> -		default:
> -			if (pm_ops && pm_ops->enter &&
> -			    (i == pm_ops->pm_disk_mode))
> +		case PM_DISK_PLATFORM:
> +			if (hibernate_ops)
>  				break;
>  			/* not a valid mode, continue with loop */
>  			continue;
> @@ -370,19 +396,19 @@ static ssize_t disk_store(struct subsyst
>  	int i;
>  	int len;
>  	char *p;
> -	suspend_disk_method_t mode = 0;
> +	int mode = PM_DISK_INVALID;
>  
>  	p = memchr(buf, '\n', n);
>  	len = p ? p - buf : n;
>  
>  	mutex_lock(&pm_mutex);
> -	for (i = PM_DISK_PLATFORM; i < PM_DISK_MAX; i++) {
> +	for (i = PM_DISK_FIRST; i < PM_DISK_MAX; i++) {
>  		if (!strncmp(buf, pm_disk_modes[i], len)) {
>  			mode = i;
>  			break;
>  		}
>  	}
> -	if (mode) {
> +	if (mode != PM_DISK_INVALID) {
>  		switch (mode) {
>  		case PM_DISK_SHUTDOWN:
>  		case PM_DISK_REBOOT:
> @@ -390,19 +416,18 @@ static ssize_t disk_store(struct subsyst
>  		case PM_DISK_TESTPROC:
>  			pm_disk_mode = mode;
>  			break;
> -		default:
> -			if (pm_ops && pm_ops->enter &&
> -			    (mode == pm_ops->pm_disk_mode))
> +		case PM_DISK_PLATFORM:
> +			if (hibernate_ops)
>  				pm_disk_mode = mode;
>  			else
>  				error = -EINVAL;
>  		}
> -	} else {
> +	} else
>  		error = -EINVAL;
> -	}
>  
> -	pr_debug("PM: suspend-to-disk mode set to '%s'\n",
> -		 pm_disk_modes[mode]);
> +	if (!error)
> +		pr_debug("PM: suspend-to-disk mode set to '%s'\n",
> +			 pm_disk_modes[mode]);
>  	mutex_unlock(&pm_mutex);
>  	return error ? error : n;
>  }
> --- wireless-dev.orig/kernel/power/user.c	2007-04-26 18:15:01.130691185 +0200
> +++ wireless-dev/kernel/power/user.c	2007-04-26 18:15:09.420691185 +0200
> @@ -128,22 +128,6 @@ static ssize_t snapshot_write(struct fil
>  	return res;
>  }
>  
> -static inline int platform_prepare(void)
> -{
> -	int error = 0;
> -
> -	if (pm_ops && pm_ops->prepare)
> -		error = pm_ops->prepare(PM_SUSPEND_DISK);
> -
> -	return error;
> -}
> -
> -static inline void platform_finish(void)
> -{
> -	if (pm_ops && pm_ops->finish)
> -		pm_ops->finish(PM_SUSPEND_DISK);
> -}
> -
>  static inline int snapshot_suspend(int platform_suspend)
>  {
>  	int error;
> @@ -155,7 +139,7 @@ static inline int snapshot_suspend(int p
>  		goto Finish;
>  
>  	if (platform_suspend) {
> -		error = platform_prepare();
> +		error = hibernate_platform_prepare();
>  		if (error)
>  			goto Finish;
>  	}
> @@ -172,7 +156,7 @@ static inline int snapshot_suspend(int p
>  	enable_nonboot_cpus();
>   Resume_devices:
>  	if (platform_suspend)
> -		platform_finish();
> +		hibernate_platform_finish();
>  
>  	device_resume();
>  	resume_console();
> @@ -188,7 +172,7 @@ static inline int snapshot_restore(int p
>  	mutex_lock(&pm_mutex);
>  	pm_prepare_console();
>  	if (platform_suspend) {
> -		error = platform_prepare();
> +		error = hibernate_platform_prepare();
>  		if (error)
>  			goto Finish;
>  	}
> @@ -204,7 +188,7 @@ static inline int snapshot_restore(int p
>  	enable_nonboot_cpus();
>   Resume_devices:
>  	if (platform_suspend)
> -		platform_finish();
> +		hibernate_platform_finish();
>  
>  	device_resume();
>  	resume_console();
> @@ -406,13 +390,15 @@ static int snapshot_ioctl(struct inode *
>  		case PMOPS_ENTER:
>  			if (data->platform_suspend) {
>  				kernel_shutdown_prepare(SYSTEM_SUSPEND_DISK);
> -				error = pm_ops->enter(PM_SUSPEND_DISK);
> +				error = hibernate_ops->enter();
> +				/* how can this possibly do the right thing? */
>  				error = 0;
>  			}
>  			break;
>  
>  		case PMOPS_FINISH:
>  			if (data->platform_suspend)
> +				/* and why doesn't this invoke anything??? */
>  				error = 0;
>  
>  			break;
> --- wireless-dev.orig/Documentation/power/userland-swsusp.txt	2007-04-26 18:15:02.120691185 +0200
> +++ wireless-dev/Documentation/power/userland-swsusp.txt	2007-04-26 18:15:09.440691185 +0200
> @@ -93,21 +93,23 @@ SNAPSHOT_S2RAM - suspend to RAM; using t
>  	to resume the system from RAM if there's enough battery power or restore
>  	its state on the basis of the saved suspend image otherwise)
>  
> -SNAPSHOT_PMOPS - enable the usage of the pmops->prepare, pmops->enter and
> -	pmops->finish methods (the in-kernel swsusp knows these as the "platform
> -	method") which are needed on many machines to (among others) speed up
> -	the resume by letting the BIOS skip some steps or to let the system
> -	recognise the correct state of the hardware after the resume (in
> -	particular on many machines this ensures that unplugged AC
> -	adapters get correctly detected and that kacpid does not run wild after
> -	the resume).  The last ioctl() argument can take one of the three
> -	values, defined in kernel/power/power.h:
> +SNAPSHOT_PMOPS - enable the usage of the hibernate_ops->prepare,
> +	hibernate_ops->enter and hibernate_ops->finish methods (the in-kernel
> +	swsusp knows these as the "platform method") which are needed on many
> +	machines to (among others) speed up the resume by letting the BIOS skip
> +	some steps or to let the system recognise the correct state of the
> +	hardware after the resume (in particular on many machines this ensures
> +	that unplugged AC adapters get correctly detected and that kacpid does
> +	not run wild after the resume).  The last ioctl() argument can take one
> +	of the three values, defined in kernel/power/power.h:
>  	PMOPS_PREPARE - make the kernel carry out the
> -		pm_ops->prepare(PM_SUSPEND_DISK) operation
> +		hibernate_ops->prepare() operation
>  	PMOPS_ENTER - make the kernel power off the system by calling
> -		pm_ops->enter(PM_SUSPEND_DISK)
> +		hibernate_ops->enter()
>  	PMOPS_FINISH - make the kernel carry out the
> -		pm_ops->finish(PM_SUSPEND_DISK) operation
> +		hibernate_ops->finish() operation
> +	Note that the actual constants are misnamed because they surface
> +	internal kernel implementation details that have changed.
>  
>  The device's read() operation can be used to transfer the snapshot image from
>  the kernel.  It has the following limitations:
> --- wireless-dev.orig/drivers/i2c/chips/tps65010.c	2007-04-26 18:15:02.150691185 +0200
> +++ wireless-dev/drivers/i2c/chips/tps65010.c	2007-04-26 18:15:09.440691185 +0200
> @@ -354,7 +354,7 @@ static void tps65010_interrupt(struct tp
>  			 * also needs to get error handling and probably
>  			 * an #ifdef CONFIG_SOFTWARE_SUSPEND
>  			 */
> -			pm_suspend(PM_SUSPEND_DISK);
> +			hibernate();
>  #endif
>  			poll = 1;
>  		}
> --- wireless-dev.orig/kernel/sys.c	2007-04-26 18:15:01.310691185 +0200
> +++ wireless-dev/kernel/sys.c	2007-04-26 18:15:09.450691185 +0200
> @@ -25,6 +25,7 @@
>  #include <linux/security.h>
>  #include <linux/dcookies.h>
>  #include <linux/suspend.h>
> +#include <linux/hibernate.h>
>  #include <linux/tty.h>
>  #include <linux/signal.h>
>  #include <linux/cn_proc.h>
> @@ -881,7 +882,7 @@ asmlinkage long sys_reboot(int magic1, i
>  #ifdef CONFIG_SOFTWARE_SUSPEND
>  	case LINUX_REBOOT_CMD_SW_SUSPEND:
>  		{
> -			int ret = pm_suspend(PM_SUSPEND_DISK);
> +			int ret = hibernate();
>  			unlock_kernel();
>  			return ret;
>  		}
> --- wireless-dev.orig/drivers/acpi/sleep/main.c	2007-04-26 18:15:02.290691185 +0200
> +++ wireless-dev/drivers/acpi/sleep/main.c	2007-04-26 18:15:09.630691185 +0200
> @@ -15,6 +15,7 @@
>  #include <linux/dmi.h>
>  #include <linux/device.h>
>  #include <linux/suspend.h>
> +#include <linux/hibernate.h>
>  #include <acpi/acpi_bus.h>
>  #include <acpi/acpi_drivers.h>
>  #include "sleep.h"
> @@ -29,7 +30,6 @@ static u32 acpi_suspend_states[] = {
>  	[PM_SUSPEND_ON] = ACPI_STATE_S0,
>  	[PM_SUSPEND_STANDBY] = ACPI_STATE_S1,
>  	[PM_SUSPEND_MEM] = ACPI_STATE_S3,
> -	[PM_SUSPEND_DISK] = ACPI_STATE_S4,
>  	[PM_SUSPEND_MAX] = ACPI_STATE_S5
>  };
>  
> @@ -94,14 +94,6 @@ static int acpi_pm_enter(suspend_state_t
>  		do_suspend_lowlevel();
>  		break;
>  
> -	case PM_SUSPEND_DISK:
> -		if (acpi_pm_ops.pm_disk_mode == PM_DISK_PLATFORM)
> -			status = acpi_enter_sleep_state(acpi_state);
> -		break;
> -	case PM_SUSPEND_MAX:
> -		acpi_power_off();
> -		break;
> -
>  	default:
>  		return -EINVAL;
>  	}
> @@ -157,12 +149,13 @@ int acpi_suspend(u32 acpi_state)
>  	suspend_state_t states[] = {
>  		[1] = PM_SUSPEND_STANDBY,
>  		[3] = PM_SUSPEND_MEM,
> -		[4] = PM_SUSPEND_DISK,
>  		[5] = PM_SUSPEND_MAX
>  	};
>  
>  	if (acpi_state < 6 && states[acpi_state])
>  		return pm_suspend(states[acpi_state]);
> +	if (acpi_state == 4)
> +		return hibernate();
>  	return -EINVAL;
>  }
>  
> @@ -189,6 +182,71 @@ static struct pm_ops acpi_pm_ops = {
>  	.finish = acpi_pm_finish,
>  };
>  
> +#ifdef CONFIG_SOFTWARE_SUSPEND
> +static int acpi_hib_prepare(void)
> +{
> +	return acpi_sleep_prepare(ACPI_STATE_S4);
> +}
> +
> +static int acpi_hib_enter(void)
> +{
> +	acpi_status status = AE_OK;
> +	unsigned long flags = 0;
> +	u32 acpi_state = acpi_suspend_states[pm_state];
> +
> +	ACPI_FLUSH_CPU_CACHE();
> +
> +	/* Do arch specific saving of state. */
> +	int error = acpi_save_state_mem();
> +	if (error)
> +		return error;
> +
> +	local_irq_save(flags);
> +	acpi_enable_wakeup_device(acpi_state);
> +	status = acpi_enter_sleep_state(acpi_state);
> +
> +	/* ACPI 3.0 specs (P62) says that it's the responsabilty
> +	 * of the OSPM to clear the status bit [ implying that the
> +	 * POWER_BUTTON event should not reach userspace ]
> +	 */
> +	if (ACPI_SUCCESS(status) && (acpi_state == ACPI_STATE_S3))
> +		acpi_clear_event(ACPI_EVENT_POWER_BUTTON);
> +
> +	local_irq_restore(flags);
> +	printk(KERN_DEBUG "Back to C!\n");
> +
> +	/* restore processor state
> +	 * We should only be here if we're coming back from STR or STD.
> +	 * And, in the case of the latter, the memory image should have already
> +	 * been loaded from disk.
> +	 */
> +	acpi_restore_state_mem();
> +
> +	return ACPI_SUCCESS(status) ? 0 : -EFAULT;
> +}
> +
> +static void acpi_hib_finish(void)
> +{
> +	acpi_leave_sleep_state(ACPI_STATE_S4);
> +	acpi_disable_wakeup_device(ACPI_STATE_S4);
> +
> +	/* reset firmware waking vector */
> +	acpi_set_firmware_waking_vector((acpi_physical_address) 0);
> +
> +	if (init_8259A_after_S1) {
> +		printk("Broken toshiba laptop -> kicking interrupts\n");
> +		init_8259A(0);
> +	}
> +	return 0;
> +}
> +
> +static struct hibernate_ops acpi_hib_ops = {
> +	.prepare = acpi_hib_prepare,
> +	.enter = acpi_hib_enter,
> +	.finish = acpi_hib_finish,
> +};
> +#endif /* CONFIG_SOFTWARE_SUSPEND */
> +
>  /*
>   * Toshiba fails to preserve interrupts over S1, reinitialization
>   * of 8259 is needed after S1 resume.
> @@ -227,13 +285,16 @@ int __init acpi_sleep_init(void)
>  			sleep_states[i] = 1;
>  			printk(" S%d", i);
>  		}
> -		if (i == ACPI_STATE_S4) {
> -			if (sleep_states[i])
> -				acpi_pm_ops.pm_disk_mode = PM_DISK_PLATFORM;
> -		}
>  	}
>  	printk(")\n");
>  
> +#ifdef CONFIG_SOFTWARE_SUSPEND
> +	if (sleep_states[ACPI_STATE_S4])
> +		hibernate_set_ops(&acpi_hib_ops);
> +#else
> +	sleep_states[ACPI_STATE_S4] = 0;
> +#endif
> +
>  	pm_set_ops(&acpi_pm_ops);
>  	return 0;
>  }
> --- wireless-dev.orig/kernel/power/power.h	2007-04-26 18:15:01.240691185 +0200
> +++ wireless-dev/kernel/power/power.h	2007-04-26 18:15:09.630691185 +0200
> @@ -13,16 +13,6 @@ struct swsusp_info {
>  
>  
>  
> -#ifdef CONFIG_SOFTWARE_SUSPEND
> -extern int pm_suspend_disk(void);
> -
> -#else
> -static inline int pm_suspend_disk(void)
> -{
> -	return -EPERM;
> -}
> -#endif
> -
>  extern struct mutex pm_mutex;
>  
>  #define power_attr(_name) \
> @@ -179,3 +169,6 @@ extern int suspend_enter(suspend_state_t
>  struct timeval;
>  extern void swsusp_show_speed(struct timeval *, struct timeval *,
>  				unsigned int, char *);
> +
> +extern int hibernate_platform_prepare(void);
> +extern void hibernate_platform_finish(void);
> --- wireless-dev.orig/drivers/acpi/sleep/proc.c	2007-04-26 18:15:02.720691185 +0200
> +++ wireless-dev/drivers/acpi/sleep/proc.c	2007-04-26 18:15:09.630691185 +0200
> @@ -1,6 +1,7 @@
>  #include <linux/proc_fs.h>
>  #include <linux/seq_file.h>
>  #include <linux/suspend.h>
> +#include <linux/hibernate.h>
>  #include <linux/bcd.h>
>  #include <asm/uaccess.h>
>  
> @@ -60,7 +61,7 @@ acpi_system_write_sleep(struct file *fil
>  	state = simple_strtoul(str, NULL, 0);
>  #ifdef CONFIG_SOFTWARE_SUSPEND
>  	if (state == 4) {
> -		error = pm_suspend(PM_SUSPEND_DISK);
> +		error = hibernate();
>  		goto Done;
>  	}
>  #endif
> 
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 
> 

-- 
Rafael J. Wysocki, Ph.D.
Institute of Theoretical Physics
Faculty of Physics of Warsaw University
ul. Hoza 69, 00-681 Warsaw
[tel: +48 22 55 32 263]
[mob: +48 60 50 53 693]
----------------------------
One should not increase, beyond what is necessary,
the number of entities required to explain anything.
			-- William of Ockham

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-04-29 12:48                       ` [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) R. J. Wysocki
@ 2007-04-29 12:53                         ` Rafael J. Wysocki
  2007-04-30  8:29                         ` Johannes Berg
  1 sibling, 0 replies; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-04-29 12:53 UTC (permalink / raw)
  To: Johannes Berg; +Cc: linux-pm, Pekka Enberg, Nigel Cunningham, Pavel Machek

On Sunday, 29 April 2007 14:48, R. J. Wysocki wrote:
> [Trimmed the CC list to a reasonable minimum]
> 
> On Thursday, 26 April 2007 18:31, Johannes Berg wrote:
> > On Thu, 2007-04-26 at 13:30 +0200, Pavel Machek wrote:
> > 
> > > > From looking at pm_ops which I was recently working with a lot, it seems
> > > > that it was designed by somebody who was reading the ACPI documentation
> > > > and was otherwise pretty clueless, even at that level std tries to look
> > > > like suspend. IMHO that is one of the first things that should be ripped
> > > > out, no pm_ops for STD, it's a pain to work with.
> > > 
> > > That code goes back to Patrick, AFAICT. (And yes, ACPI S3 and ACPI S4
> > > low-level enter is pretty similar).
> > > 
> > > Patches would be welcome
> > 
> > That was easier than I thought. This applies on top of a patch that
> > makes kernel/power/user.c optional since I had no idea how to fix it,
> > problems I see:
> >  * it surfaces kernel implementation details about pm_ops and thus makes
> >    the whole thing very fragile
> >  * it has yet another interface (yuck) to determine whether to reboot,
> >    shut down etc, doesn't use /sys/power/disk
> >  * I generally had no idea wtf it is doing in some places
> > 
> > Anyway, this patch is only compile tested, it
> >  * introduces include/linux/hibernate.h with hibernate_ops and
> >    a new hibernate() function to hibernate the system
> >  * rips apart a lot of the suspend code and puts it back together using
> >    the hibernate_ops
> >  * switches ACPI to hibernate_ops (the only user of pm_ops.pm_disk_mode)
> >  * might apply/compile against -mm, I have all my and some of Rafael's
> >    suspend/hibernate work in my tree.
> >  * breaks user suspend as I noted above
> >  * is incomplete, somewhere pm_suspend_disk() is still defined iirc
> 
> OK, I reworked it a bit.
> 
> Main changes:
> 
> - IMHO 'hibernation_ops' sounds better than 'hibernate_ops', for example, so
> now the new names start with 'hibernation_' (or 'HIBERNATION_')
> 
> - Moved the hibernation-related definitions to include/linux/suspend.h, since
> some hibernation-specific definitions are already there.  We can introduce
> hibernation.h in a separate patch (it'll have to #include suspend.h IMO).
> 
> - Changed the names starting from 'pm_disk_' (or 'PM_DISK_').
> 
> - Cleaned up the new ACPI code (it didn't compile and included some things
> unrelated to hibernation).  I'm still not sure about acpi_hibernation_finish()
> (is the code after acpi_disable_wakeup_device() really needed?)
> 
> - Made kernel/power/user.c compile (and hopefully work too)

Forgot to say that hibernation_ops is needed, IMO, because ACPI can be modular.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-04-29 12:48                       ` [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) R. J. Wysocki
  2007-04-29 12:53                         ` Rafael J. Wysocki
@ 2007-04-30  8:29                         ` Johannes Berg
  2007-04-30 14:51                           ` Rafael J. Wysocki
  1 sibling, 1 reply; 117+ messages in thread
From: Johannes Berg @ 2007-04-30  8:29 UTC (permalink / raw)
  To: R. J. Wysocki; +Cc: Pekka Enberg, linux-pm, Nigel Cunningham, Pavel Machek


[-- Attachment #1.1: Type: text/plain, Size: 787 bytes --]

On Sun, 2007-04-29 at 14:48 +0200, R. J. Wysocki wrote:

> +	status = acpi_enter_sleep_state(ACPI_STATE_S4);
> +	local_irq_restore(flags);
> +
> +	/*
> +	 * Restore processor state
> +	 * We should only be here if we're coming back from hibernation and
> +	 * the memory image should have already been loaded from disk.

That comment doesn't seem right. This is in ->enter so afaict the image
hasn't been loaded yet at this point. I don't know if you just moved
code but if you did then I don't think it was correct before.

> +	 */
> +	acpi_restore_state_mem();

Maybe that needs to be in ->finish then? Or somewhere in the deeper arch
code?

Other than that it looks good to me on a cursory look. I'll give it a
try on my G5 on Wednesday or Thursday.

johannes

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-04-30  8:29                         ` Johannes Berg
@ 2007-04-30 14:51                           ` Rafael J. Wysocki
  2007-04-30 14:59                             ` Johannes Berg
  0 siblings, 1 reply; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-04-30 14:51 UTC (permalink / raw)
  To: Johannes Berg, Pavel Machek; +Cc: Pekka Enberg, linux-pm, Nigel Cunningham

On Monday, 30 April 2007 10:29, Johannes Berg wrote:
> On Sun, 2007-04-29 at 14:48 +0200, R. J. Wysocki wrote:
> 
> > +	status = acpi_enter_sleep_state(ACPI_STATE_S4);
> > +	local_irq_restore(flags);
> > +
> > +	/*
> > +	 * Restore processor state
> > +	 * We should only be here if we're coming back from hibernation and
> > +	 * the memory image should have already been loaded from disk.
> 
> That comment doesn't seem right. This is in ->enter so afaict the image
> hasn't been loaded yet at this point. I don't know if you just moved
> code but if you did then I don't think it was correct before.

It was in your patch, so I kept it, but I don't think it's correct too.

Moreover, it seems that acpi_save_state_mem() and acpi_restore_state_mem() are
only needed by s2ram, so we can safely remove them from the hibernation code
path.  Pavel, is that correct?

> > +	 */
> > +	acpi_restore_state_mem();
> 
> Maybe that needs to be in ->finish then? Or somewhere in the deeper arch
> code?
> 
> Other than that it looks good to me on a cursory look. I'll give it a
> try on my G5 on Wednesday or Thursday.

I think I'll have an improved version till then. :-)

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-04-30 14:51                           ` Rafael J. Wysocki
@ 2007-04-30 14:59                             ` Johannes Berg
  2007-05-01 14:05                               ` Rafael J. Wysocki
  0 siblings, 1 reply; 117+ messages in thread
From: Johannes Berg @ 2007-04-30 14:59 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Pekka Enberg, linux-pm, Nigel Cunningham, Pavel Machek


[-- Attachment #1.1: Type: text/plain, Size: 736 bytes --]

On Mon, 2007-04-30 at 16:51 +0200, Rafael J. Wysocki wrote:

> > That comment doesn't seem right. This is in ->enter so afaict the image
> > hasn't been loaded yet at this point. I don't know if you just moved
> > code but if you did then I don't think it was correct before.
> 
> It was in your patch, so I kept it, but I don't think it's correct too.

If it was in my patch then it must be there in the original code, iirc I
just shuffled it a bit :)

> Moreover, it seems that acpi_save_state_mem() and acpi_restore_state_mem() are
> only needed by s2ram, so we can safely remove them from the hibernation code
> path.  Pavel, is that correct?

This I don't know. They seemed to be done on hibernate too.

johannes

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-04-30 14:59                             ` Johannes Berg
@ 2007-05-01 14:05                               ` Rafael J. Wysocki
  2007-05-01 22:02                                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-01 14:05 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Pekka Enberg, linux-pm, Nigel Cunningham, Pavel Machek

On Monday, 30 April 2007 16:59, Johannes Berg wrote:
> On Mon, 2007-04-30 at 16:51 +0200, Rafael J. Wysocki wrote:
> 
> > > That comment doesn't seem right. This is in ->enter so afaict the image
> > > hasn't been loaded yet at this point. I don't know if you just moved
> > > code but if you did then I don't think it was correct before.
> > 
> > It was in your patch, so I kept it, but I don't think it's correct too.
> 
> If it was in my patch then it must be there in the original code, iirc I
> just shuffled it a bit :)
> 
> > Moreover, it seems that acpi_save_state_mem() and acpi_restore_state_mem() are
> > only needed by s2ram, so we can safely remove them from the hibernation code
> > path.  Pavel, is that correct?
> 
> This I don't know. They seemed to be done on hibernate too.

The previous version of the patch was missing the changes in suspend.h.

Apart from this I've cleaned up some changes in disk.c and main.c to make
the sysfs interface work again and dropped some ACPI code that I think was
not necessary.

Patch appended (tested on x86_64, but not extensively), comments welcome. :-)

Greetings,
Rafael

---
This patch:
 * removes the definitions related to hibernation (aka suspend to disk) from
   include/linux/pm.h
 * introduces struct hibernation_ops and a new function to hibernate the system
   called  hibernate(), defined in include/linux/suspend.h
 * separates suspend code in kernel/power/main.c from hibernation-related code
   in kernel/power/disk.c and kernel/power/user.c (with the help of
   hibernation_ops)
 * switches ACPI (the only user of pm_ops.pm_disk_mode) to hibernation_ops
---

 Documentation/power/userland-swsusp.txt |   26 ++--
 drivers/acpi/sleep/main.c               |   67 +++++++++--
 drivers/acpi/sleep/proc.c               |    2 
 drivers/i2c/chips/tps65010.c            |    2 
 include/linux/pm.h                      |   31 -----
 include/linux/suspend.h                 |   32 +++++
 kernel/power/disk.c                     |  186 +++++++++++++++++---------------
 kernel/power/main.c                     |   42 ++-----
 kernel/power/power.h                    |    7 -
 kernel/power/user.c                     |   13 +-
 kernel/sys.c                            |    2 
 11 files changed, 225 insertions(+), 185 deletions(-)

Index: linux-2.6.21/include/linux/pm.h
===================================================================
--- linux-2.6.21.orig/include/linux/pm.h	2007-05-01 13:35:33.000000000 +0200
+++ linux-2.6.21/include/linux/pm.h	2007-05-01 13:35:33.000000000 +0200
@@ -107,26 +107,11 @@ typedef int __bitwise suspend_state_t;
 #define PM_SUSPEND_ON		((__force suspend_state_t) 0)
 #define PM_SUSPEND_STANDBY	((__force suspend_state_t) 1)
 #define PM_SUSPEND_MEM		((__force suspend_state_t) 3)
-#define PM_SUSPEND_DISK		((__force suspend_state_t) 4)
-#define PM_SUSPEND_MAX		((__force suspend_state_t) 5)
-
-typedef int __bitwise suspend_disk_method_t;
-
-/* invalid must be 0 so struct pm_ops initialisers can leave it out */
-#define PM_DISK_INVALID		((__force suspend_disk_method_t) 0)
-#define	PM_DISK_PLATFORM	((__force suspend_disk_method_t) 1)
-#define	PM_DISK_SHUTDOWN	((__force suspend_disk_method_t) 2)
-#define	PM_DISK_REBOOT		((__force suspend_disk_method_t) 3)
-#define	PM_DISK_TEST		((__force suspend_disk_method_t) 4)
-#define	PM_DISK_TESTPROC	((__force suspend_disk_method_t) 5)
-#define	PM_DISK_MAX		((__force suspend_disk_method_t) 6)
+#define PM_SUSPEND_MAX		((__force suspend_state_t) 4)
 
 /**
  * struct pm_ops - Callbacks for managing platform dependent suspend states.
  * @valid: Callback to determine whether the given state can be entered.
- * 	If %CONFIG_SOFTWARE_SUSPEND is set then %PM_SUSPEND_DISK is
- *	always valid and never passed to this call. If not assigned,
- *	no suspend states are valid.
  *	Valid states are advertised in /sys/power/state but can still
  *	be rejected by prepare or enter if the conditions aren't right.
  *	There is a %pm_valid_only_mem function available that can be assigned
@@ -140,24 +125,12 @@ typedef int __bitwise suspend_disk_metho
  *
  * @finish: Called when the system has left the given state and all devices
  *	are resumed. The return value is ignored.
- *
- * @pm_disk_mode: The generic code always allows one of the shutdown methods
- *	%PM_DISK_SHUTDOWN, %PM_DISK_REBOOT, %PM_DISK_TEST and
- *	%PM_DISK_TESTPROC. If this variable is set, the mode it is set
- *	to is allowed in addition to those modes and is also made default.
- *	When this mode is sent selected, the @prepare call will be called
- *	before suspending to disk (if present), the @enter call should be
- *	present and will be called after all state has been saved and the
- *	machine is ready to be powered off; the @finish callback is called
- *	after state has been restored. All these calls are called with
- *	%PM_SUSPEND_DISK as the state.
  */
 struct pm_ops {
 	int (*valid)(suspend_state_t state);
 	int (*prepare)(suspend_state_t state);
 	int (*enter)(suspend_state_t state);
 	int (*finish)(suspend_state_t state);
-	suspend_disk_method_t pm_disk_mode;
 };
 
 /**
@@ -276,8 +249,6 @@ extern void device_power_up(void);
 extern void device_resume(void);
 
 #ifdef CONFIG_PM
-extern suspend_disk_method_t pm_disk_mode;
-
 extern int device_suspend(pm_message_t state);
 extern int device_prepare_suspend(pm_message_t state);
 
Index: linux-2.6.21/kernel/power/main.c
===================================================================
--- linux-2.6.21.orig/kernel/power/main.c	2007-05-01 13:35:33.000000000 +0200
+++ linux-2.6.21/kernel/power/main.c	2007-05-01 15:14:00.000000000 +0200
@@ -30,7 +30,6 @@
 DEFINE_MUTEX(pm_mutex);
 
 struct pm_ops *pm_ops;
-suspend_disk_method_t pm_disk_mode = PM_DISK_SHUTDOWN;
 
 /**
  *	pm_set_ops - Set the global power method table. 
@@ -41,10 +40,6 @@ void pm_set_ops(struct pm_ops * ops)
 {
 	mutex_lock(&pm_mutex);
 	pm_ops = ops;
-	if (ops && ops->pm_disk_mode != PM_DISK_INVALID) {
-		pm_disk_mode = ops->pm_disk_mode;
-	} else
-		pm_disk_mode = PM_DISK_SHUTDOWN;
 	mutex_unlock(&pm_mutex);
 }
 
@@ -184,24 +179,12 @@ static void suspend_finish(suspend_state
 static const char * const pm_states[PM_SUSPEND_MAX] = {
 	[PM_SUSPEND_STANDBY]	= "standby",
 	[PM_SUSPEND_MEM]	= "mem",
-	[PM_SUSPEND_DISK]	= "disk",
 };
 
 static inline int valid_state(suspend_state_t state)
 {
-	/* Suspend-to-disk does not really need low-level support.
-	 * It can work with shutdown/reboot if needed. If it isn't
-	 * configured, then it cannot be supported.
-	 */
-	if (state == PM_SUSPEND_DISK)
-#ifdef CONFIG_SOFTWARE_SUSPEND
-		return 1;
-#else
-		return 0;
-#endif
-
-	/* all other states need lowlevel support and need to be
-	 * valid to the lowlevel implementation, no valid callback
+	/* All states need lowlevel support and need to be valid
+	 * to the lowlevel implementation, no valid callback
 	 * implies that none are valid. */
 	if (!pm_ops || !pm_ops->valid || !pm_ops->valid(state))
 		return 0;
@@ -229,11 +212,6 @@ static int enter_state(suspend_state_t s
 	if (!mutex_trylock(&pm_mutex))
 		return -EBUSY;
 
-	if (state == PM_SUSPEND_DISK) {
-		error = pm_suspend_disk();
-		goto Unlock;
-	}
-
 	pr_debug("PM: Preparing system for %s sleep\n", pm_states[state]);
 	if ((error = suspend_prepare(state)))
 		goto Unlock;
@@ -251,7 +229,7 @@ static int enter_state(suspend_state_t s
 
 /**
  *	pm_suspend - Externally visible function for suspending system.
- *	@state:		Enumarted value of state to enter.
+ *	@state:		Enumerated value of state to enter.
  *
  *	Determine whether or not value is within range, get state 
  *	structure, and enter (above).
@@ -289,7 +267,13 @@ static ssize_t state_show(struct subsyst
 		if (pm_states[i] && valid_state(i))
 			s += sprintf(s,"%s ", pm_states[i]);
 	}
-	s += sprintf(s,"\n");
+#ifdef CONFIG_SOFTWARE_SUSPEND
+	s += sprintf(s, "%s\n", "disk");
+#else
+	if (s != buf)
+		/* convert the last space to a newline */
+		*(s-1) = "\n";
+#endif
 	return (s - buf);
 }
 
@@ -304,6 +288,12 @@ static ssize_t state_store(struct subsys
 	p = memchr(buf, '\n', n);
 	len = p ? p - buf : n;
 
+	/* First, check if we are requested to hibernate */
+	if (!strncmp(buf, "disk", len)) {
+		error = hibernate();
+		return error ? error : n;
+	}
+
 	for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++) {
 		if (*s && !strncmp(buf, *s, len))
 			break;
Index: linux-2.6.21/kernel/power/disk.c
===================================================================
--- linux-2.6.21.orig/kernel/power/disk.c	2007-05-01 13:35:33.000000000 +0200
+++ linux-2.6.21/kernel/power/disk.c	2007-05-01 15:14:13.000000000 +0200
@@ -30,30 +30,60 @@ char resume_file[256] = CONFIG_PM_STD_PA
 dev_t swsusp_resume_device;
 sector_t swsusp_resume_block;
 
+static int hibernation_mode;
+
+enum {
+	HIBERNATION_INVALID,
+	HIBERNATION_PLATFORM,
+	HIBERNATION_TEST,
+	HIBERNATION_TESTPROC,
+	HIBERNATION_SHUTDOWN,
+	HIBERNATION_REBOOT,
+	/* keep last */
+	__HIBERNATION_AFTER_LAST
+};
+#define HIBERNATION_MAX (__HIBERNATION_AFTER_LAST-1)
+#define HIBERNATION_FIRST (HIBERNATION_INVALID + 1)
+
+struct hibernation_ops *hibernation_ops;
+
+void hibernation_set_ops(struct hibernation_ops *ops)
+{
+	if (ops && !(ops->prepare && ops->enter && ops->finish)) {
+		printk(KERN_ERR "Wrong definition of hibernation operations! "
+			"Using defaults\n");
+		return;
+	}
+	mutex_lock(&pm_mutex);
+	hibernation_ops = ops;
+	mutex_unlock(&pm_mutex);
+}
+
+
 /**
  *	platform_prepare - prepare the machine for hibernation using the
  *	platform driver if so configured and return an error code if it fails
  */
 
-static inline int platform_prepare(void)
+static int platform_prepare(void)
 {
-	int error = 0;
+	return (hibernation_mode == HIBERNATION_PLATFORM && hibernation_ops) ?
+		hibernation_ops->prepare() : 0;
+}
 
-	switch (pm_disk_mode) {
-	case PM_DISK_TEST:
-	case PM_DISK_TESTPROC:
-	case PM_DISK_SHUTDOWN:
-	case PM_DISK_REBOOT:
-		break;
-	default:
-		if (pm_ops && pm_ops->prepare)
-			error = pm_ops->prepare(PM_SUSPEND_DISK);
-	}
-	return error;
+/**
+ *	platform_finish - switch the machine to the normal mode of operation
+ *	using the platform driver (must be called after platform_prepare())
+ */
+
+static void platform_finish(void)
+{
+	if (hibernation_mode == HIBERNATION_PLATFORM && hibernation_ops)
+		hibernation_ops->finish();
 }
 
 /**
- *	power_down - Shut machine down for hibernate.
+ *	power_down - Shut the machine down for hibernation.
  *
  *	Use the platform driver, if configured so; otherwise try
  *	to power off or reboot.
@@ -61,20 +91,20 @@ static inline int platform_prepare(void)
 
 static void power_down(void)
 {
-	switch (pm_disk_mode) {
-	case PM_DISK_TEST:
-	case PM_DISK_TESTPROC:
+	switch (hibernation_mode) {
+	case HIBERNATION_TEST:
+	case HIBERNATION_TESTPROC:
 		break;
-	case PM_DISK_SHUTDOWN:
+	case HIBERNATION_SHUTDOWN:
 		kernel_power_off();
 		break;
-	case PM_DISK_REBOOT:
+	case HIBERNATION_REBOOT:
 		kernel_restart(NULL);
 		break;
-	default:
-		if (pm_ops && pm_ops->enter) {
+	case HIBERNATION_PLATFORM:
+		if (hibernation_ops) {
 			kernel_shutdown_prepare(SYSTEM_SUSPEND_DISK);
-			pm_ops->enter(PM_SUSPEND_DISK);
+			hibernation_ops->enter();
 			break;
 		}
 	}
@@ -87,20 +117,6 @@ static void power_down(void)
 	while(1);
 }
 
-static inline void platform_finish(void)
-{
-	switch (pm_disk_mode) {
-	case PM_DISK_TEST:
-	case PM_DISK_TESTPROC:
-	case PM_DISK_SHUTDOWN:
-	case PM_DISK_REBOOT:
-		break;
-	default:
-		if (pm_ops && pm_ops->finish)
-			pm_ops->finish(PM_SUSPEND_DISK);
-	}
-}
-
 static void unprepare_processes(void)
 {
 	thaw_processes();
@@ -120,13 +136,10 @@ static int prepare_processes(void)
 }
 
 /**
- *	pm_suspend_disk - The granpappy of hibernation power management.
- *
- *	If not, then call swsusp to do its thing, then figure out how
- *	to power down the system.
+ *	hibernate - The granpappy of the built-in hibernation management
  */
 
-int pm_suspend_disk(void)
+int hibernate(void)
 {
 	int error;
 
@@ -143,7 +156,8 @@ int pm_suspend_disk(void)
 	if (error)
 		goto Finish;
 
-	if (pm_disk_mode == PM_DISK_TESTPROC) {
+	mutex_lock(&pm_mutex);
+	if (hibernation_mode == HIBERNATION_TESTPROC) {
 		printk("swsusp debug: Waiting for 5 seconds.\n");
 		mdelay(5000);
 		goto Thaw;
@@ -168,7 +182,7 @@ int pm_suspend_disk(void)
 	if (error)
 		goto Enable_cpus;
 
-	if (pm_disk_mode == PM_DISK_TEST) {
+	if (hibernation_mode == HIBERNATION_TEST) {
 		printk("swsusp debug: Waiting for 5 seconds.\n");
 		mdelay(5000);
 		goto Enable_cpus;
@@ -205,6 +219,7 @@ int pm_suspend_disk(void)
 	device_resume();
 	resume_console();
  Thaw:
+	mutex_unlock(&pm_mutex);
 	unprepare_processes();
  Finish:
 	free_basic_memory_bitmaps();
@@ -220,7 +235,7 @@ int pm_suspend_disk(void)
  *	Called as a late_initcall (so all devices are discovered and
  *	initialized), we call swsusp to see if we have a saved image or not.
  *	If so, we quiesce devices, the restore the saved image. We will
- *	return above (in pm_suspend_disk() ) if everything goes well.
+ *	return above (in hibernate() ) if everything goes well.
  *	Otherwise, we fail gracefully and return to the normally
  *	scheduled program.
  *
@@ -315,25 +330,26 @@ static int software_resume(void)
 late_initcall(software_resume);
 
 
-static const char * const pm_disk_modes[] = {
-	[PM_DISK_PLATFORM]	= "platform",
-	[PM_DISK_SHUTDOWN]	= "shutdown",
-	[PM_DISK_REBOOT]	= "reboot",
-	[PM_DISK_TEST]		= "test",
-	[PM_DISK_TESTPROC]	= "testproc",
+static const char * const hibernation_modes[] = {
+	[HIBERNATION_PLATFORM]	= "platform",
+	[HIBERNATION_SHUTDOWN]	= "shutdown",
+	[HIBERNATION_REBOOT]	= "reboot",
+	[HIBERNATION_TEST]	= "test",
+	[HIBERNATION_TESTPROC]	= "testproc",
 };
 
 /**
- *	disk - Control suspend-to-disk mode
+ *	disk - Control hibernation mode
  *
  *	Suspend-to-disk can be handled in several ways. We have a few options
  *	for putting the system to sleep - using the platform driver (e.g. ACPI
- *	or other pm_ops), powering off the system or rebooting the system
- *	(for testing) as well as the two test modes.
+ *	or other hibernation_ops), powering off the system or rebooting the
+ *	system (for testing) as well as the two test modes.
  *
  *	The system can support 'platform', and that is known a priori (and
- *	encoded in pm_ops). However, the user may choose 'shutdown' or 'reboot'
- *	as alternatives, as well as the test modes 'test' and 'testproc'.
+ *	encoded by the presence of hibernation_ops). However, the user may
+ *	choose 'shutdown' or 'reboot' as alternatives, as well as one fo the
+ *	test modes, 'test' or 'testproc'.
  *
  *	show() will display what the mode is currently set to.
  *	store() will accept one of
@@ -345,7 +361,7 @@ static const char * const pm_disk_modes[
  *	'testproc'
  *
  *	It will only change to 'platform' if the system
- *	supports it (as determined from pm_ops->pm_disk_mode).
+ *	supports it (as determined by having hibernation_ops).
  */
 
 static ssize_t disk_show(struct subsystem * subsys, char * buf)
@@ -353,28 +369,25 @@ static ssize_t disk_show(struct subsyste
 	int i;
 	char *start = buf;
 
-	for (i = PM_DISK_PLATFORM; i < PM_DISK_MAX; i++) {
-		if (!pm_disk_modes[i])
+	for (i = HIBERNATION_FIRST; i <= HIBERNATION_MAX; i++) {
+		if (!hibernation_modes[i])
 			continue;
 		switch (i) {
-		case PM_DISK_SHUTDOWN:
-		case PM_DISK_REBOOT:
-		case PM_DISK_TEST:
-		case PM_DISK_TESTPROC:
+		case HIBERNATION_SHUTDOWN:
+		case HIBERNATION_REBOOT:
+		case HIBERNATION_TEST:
+		case HIBERNATION_TESTPROC:
 			break;
-		default:
-			if (pm_ops && pm_ops->enter &&
-			    (i == pm_ops->pm_disk_mode))
+		case HIBERNATION_PLATFORM:
+			if (hibernation_ops)
 				break;
 			/* not a valid mode, continue with loop */
 			continue;
 		}
-		if (i == pm_disk_mode)
-			buf += sprintf(buf, "[%s]", pm_disk_modes[i]);
+		if (i == hibernation_mode)
+			buf += sprintf(buf, "[%s] ", hibernation_modes[i]);
 		else
-			buf += sprintf(buf, "%s", pm_disk_modes[i]);
-		if (i+1 != PM_DISK_MAX)
-			buf += sprintf(buf, " ");
+			buf += sprintf(buf, "%s ", hibernation_modes[i]);
 	}
 	buf += sprintf(buf, "\n");
 	return buf-start;
@@ -387,39 +400,38 @@ static ssize_t disk_store(struct subsyst
 	int i;
 	int len;
 	char *p;
-	suspend_disk_method_t mode = 0;
+	int mode = HIBERNATION_INVALID;
 
 	p = memchr(buf, '\n', n);
 	len = p ? p - buf : n;
 
 	mutex_lock(&pm_mutex);
-	for (i = PM_DISK_PLATFORM; i < PM_DISK_MAX; i++) {
-		if (!strncmp(buf, pm_disk_modes[i], len)) {
+	for (i = HIBERNATION_FIRST; i <= HIBERNATION_MAX; i++) {
+		if (!strncmp(buf, hibernation_modes[i], len)) {
 			mode = i;
 			break;
 		}
 	}
-	if (mode) {
+	if (mode != HIBERNATION_INVALID) {
 		switch (mode) {
-		case PM_DISK_SHUTDOWN:
-		case PM_DISK_REBOOT:
-		case PM_DISK_TEST:
-		case PM_DISK_TESTPROC:
-			pm_disk_mode = mode;
+		case HIBERNATION_SHUTDOWN:
+		case HIBERNATION_REBOOT:
+		case HIBERNATION_TEST:
+		case HIBERNATION_TESTPROC:
+			hibernation_mode = mode;
 			break;
-		default:
-			if (pm_ops && pm_ops->enter &&
-			    (mode == pm_ops->pm_disk_mode))
-				pm_disk_mode = mode;
+		case HIBERNATION_PLATFORM:
+			if (hibernation_ops)
+				hibernation_mode = mode;
 			else
 				error = -EINVAL;
 		}
-	} else {
+	} else
 		error = -EINVAL;
-	}
 
-	pr_debug("PM: suspend-to-disk mode set to '%s'\n",
-		 pm_disk_modes[mode]);
+	if (!error)
+		pr_debug("PM: suspend-to-disk mode set to '%s'\n",
+			 hibernation_modes[mode]);
 	mutex_unlock(&pm_mutex);
 	return error ? error : n;
 }
Index: linux-2.6.21/Documentation/power/userland-swsusp.txt
===================================================================
--- linux-2.6.21.orig/Documentation/power/userland-swsusp.txt	2007-05-01 13:35:25.000000000 +0200
+++ linux-2.6.21/Documentation/power/userland-swsusp.txt	2007-05-01 13:35:33.000000000 +0200
@@ -93,21 +93,23 @@ SNAPSHOT_S2RAM - suspend to RAM; using t
 	to resume the system from RAM if there's enough battery power or restore
 	its state on the basis of the saved suspend image otherwise)
 
-SNAPSHOT_PMOPS - enable the usage of the pmops->prepare, pmops->enter and
-	pmops->finish methods (the in-kernel swsusp knows these as the "platform
-	method") which are needed on many machines to (among others) speed up
-	the resume by letting the BIOS skip some steps or to let the system
-	recognise the correct state of the hardware after the resume (in
-	particular on many machines this ensures that unplugged AC
-	adapters get correctly detected and that kacpid does not run wild after
-	the resume).  The last ioctl() argument can take one of the three
-	values, defined in kernel/power/power.h:
+SNAPSHOT_PMOPS - enable the usage of the hibernation_ops->prepare,
+	hibernate_ops->enter and hibernation_ops->finish methods (the in-kernel
+	swsusp knows these as the "platform method") which are needed on many
+	machines to (among others) speed up the resume by letting the BIOS skip
+	some steps or to let the system recognise the correct state of the
+	hardware after the resume (in particular on many machines this ensures
+	that unplugged AC adapters get correctly detected and that kacpid does
+	not run wild after the resume).  The last ioctl() argument can take one
+	of the three values, defined in kernel/power/power.h:
 	PMOPS_PREPARE - make the kernel carry out the
-		pm_ops->prepare(PM_SUSPEND_DISK) operation
+		hibernation_ops->prepare() operation
 	PMOPS_ENTER - make the kernel power off the system by calling
-		pm_ops->enter(PM_SUSPEND_DISK)
+		hibernation_ops->enter()
 	PMOPS_FINISH - make the kernel carry out the
-		pm_ops->finish(PM_SUSPEND_DISK) operation
+		hibernation_ops->finish() operation
+	Note that the actual constants are misnamed because they surface
+	internal kernel implementation details that have changed.
 
 The device's read() operation can be used to transfer the snapshot image from
 the kernel.  It has the following limitations:
Index: linux-2.6.21/drivers/i2c/chips/tps65010.c
===================================================================
--- linux-2.6.21.orig/drivers/i2c/chips/tps65010.c	2007-05-01 13:35:33.000000000 +0200
+++ linux-2.6.21/drivers/i2c/chips/tps65010.c	2007-05-01 13:35:33.000000000 +0200
@@ -354,7 +354,7 @@ static void tps65010_interrupt(struct tp
 			 * also needs to get error handling and probably
 			 * an #ifdef CONFIG_SOFTWARE_SUSPEND
 			 */
-			pm_suspend(PM_SUSPEND_DISK);
+			hibernate();
 #endif
 			poll = 1;
 		}
Index: linux-2.6.21/kernel/sys.c
===================================================================
--- linux-2.6.21.orig/kernel/sys.c	2007-05-01 13:35:33.000000000 +0200
+++ linux-2.6.21/kernel/sys.c	2007-05-01 13:35:33.000000000 +0200
@@ -881,7 +881,7 @@ asmlinkage long sys_reboot(int magic1, i
 #ifdef CONFIG_SOFTWARE_SUSPEND
 	case LINUX_REBOOT_CMD_SW_SUSPEND:
 		{
-			int ret = pm_suspend(PM_SUSPEND_DISK);
+			int ret = hibernate();
 			unlock_kernel();
 			return ret;
 		}
Index: linux-2.6.21/drivers/acpi/sleep/main.c
===================================================================
--- linux-2.6.21.orig/drivers/acpi/sleep/main.c	2007-05-01 13:35:33.000000000 +0200
+++ linux-2.6.21/drivers/acpi/sleep/main.c	2007-05-01 14:20:45.000000000 +0200
@@ -29,7 +29,6 @@ static u32 acpi_suspend_states[] = {
 	[PM_SUSPEND_ON] = ACPI_STATE_S0,
 	[PM_SUSPEND_STANDBY] = ACPI_STATE_S1,
 	[PM_SUSPEND_MEM] = ACPI_STATE_S3,
-	[PM_SUSPEND_DISK] = ACPI_STATE_S4,
 	[PM_SUSPEND_MAX] = ACPI_STATE_S5
 };
 
@@ -94,14 +93,6 @@ static int acpi_pm_enter(suspend_state_t
 		do_suspend_lowlevel();
 		break;
 
-	case PM_SUSPEND_DISK:
-		if (acpi_pm_ops.pm_disk_mode == PM_DISK_PLATFORM)
-			status = acpi_enter_sleep_state(acpi_state);
-		break;
-	case PM_SUSPEND_MAX:
-		acpi_power_off();
-		break;
-
 	default:
 		return -EINVAL;
 	}
@@ -157,12 +148,13 @@ int acpi_suspend(u32 acpi_state)
 	suspend_state_t states[] = {
 		[1] = PM_SUSPEND_STANDBY,
 		[3] = PM_SUSPEND_MEM,
-		[4] = PM_SUSPEND_DISK,
 		[5] = PM_SUSPEND_MAX
 	};
 
 	if (acpi_state < 6 && states[acpi_state])
 		return pm_suspend(states[acpi_state]);
+	if (acpi_state == 4)
+		return hibernate();
 	return -EINVAL;
 }
 
@@ -189,6 +181,49 @@ static struct pm_ops acpi_pm_ops = {
 	.finish = acpi_pm_finish,
 };
 
+#ifdef CONFIG_SOFTWARE_SUSPEND
+static int acpi_hibernation_prepare(void)
+{
+	return acpi_sleep_prepare(ACPI_STATE_S4);
+}
+
+static int acpi_hibernation_enter(void)
+{
+	acpi_status status = AE_OK;
+	unsigned long flags = 0;
+
+	ACPI_FLUSH_CPU_CACHE();
+
+	local_irq_save(flags);
+	acpi_enable_wakeup_device(ACPI_STATE_S4);
+	/* This shouldn't return.  If it returns, we have a problem */
+	status = acpi_enter_sleep_state(ACPI_STATE_S4);
+	local_irq_restore(flags);
+
+	return ACPI_SUCCESS(status) ? 0 : -EFAULT;
+}
+
+static void acpi_hibernation_finish(void)
+{
+	acpi_leave_sleep_state(ACPI_STATE_S4);
+	acpi_disable_wakeup_device(ACPI_STATE_S4);
+
+	/* reset firmware waking vector */
+	acpi_set_firmware_waking_vector((acpi_physical_address) 0);
+
+	if (init_8259A_after_S1) {
+		printk("Broken toshiba laptop -> kicking interrupts\n");
+		init_8259A(0);
+	}
+}
+
+static struct hibernation_ops acpi_hibernation_ops = {
+	.prepare = acpi_hibernation_prepare,
+	.enter = acpi_hibernation_enter,
+	.finish = acpi_hibernation_finish,
+};
+#endif /* CONFIG_SOFTWARE_SUSPEND */
+
 /*
  * Toshiba fails to preserve interrupts over S1, reinitialization
  * of 8259 is needed after S1 resume.
@@ -227,14 +262,18 @@ int __init acpi_sleep_init(void)
 			sleep_states[i] = 1;
 			printk(" S%d", i);
 		}
-		if (i == ACPI_STATE_S4) {
-			if (sleep_states[i])
-				acpi_pm_ops.pm_disk_mode = PM_DISK_PLATFORM;
-		}
 	}
 	printk(")\n");
 
 	pm_set_ops(&acpi_pm_ops);
+
+#ifdef CONFIG_SOFTWARE_SUSPEND
+	if (sleep_states[ACPI_STATE_S4])
+		hibernation_set_ops(&acpi_hibernation_ops);
+#else
+	sleep_states[ACPI_STATE_S4] = 0;
+#endif
+
 	return 0;
 }
 
Index: linux-2.6.21/kernel/power/power.h
===================================================================
--- linux-2.6.21.orig/kernel/power/power.h	2007-05-01 13:35:33.000000000 +0200
+++ linux-2.6.21/kernel/power/power.h	2007-05-01 13:35:33.000000000 +0200
@@ -25,12 +25,7 @@ struct swsusp_info {
  */
 #define SPARE_PAGES	((1024 * 1024) >> PAGE_SHIFT)
 
-extern int pm_suspend_disk(void);
-#else
-static inline int pm_suspend_disk(void)
-{
-	return -EPERM;
-}
+extern struct hibernation_ops *hibernation_ops;
 #endif
 
 extern struct mutex pm_mutex;
Index: linux-2.6.21/drivers/acpi/sleep/proc.c
===================================================================
--- linux-2.6.21.orig/drivers/acpi/sleep/proc.c	2007-05-01 13:35:33.000000000 +0200
+++ linux-2.6.21/drivers/acpi/sleep/proc.c	2007-05-01 13:35:33.000000000 +0200
@@ -60,7 +60,7 @@ acpi_system_write_sleep(struct file *fil
 	state = simple_strtoul(str, NULL, 0);
 #ifdef CONFIG_SOFTWARE_SUSPEND
 	if (state == 4) {
-		error = pm_suspend(PM_SUSPEND_DISK);
+		error = hibernate();
 		goto Done;
 	}
 #endif
Index: linux-2.6.21/kernel/power/user.c
===================================================================
--- linux-2.6.21.orig/kernel/power/user.c	2007-05-01 13:35:33.000000000 +0200
+++ linux-2.6.21/kernel/power/user.c	2007-05-01 13:35:33.000000000 +0200
@@ -130,16 +130,16 @@ static inline int platform_prepare(void)
 {
 	int error = 0;
 
-	if (pm_ops && pm_ops->prepare)
-		error = pm_ops->prepare(PM_SUSPEND_DISK);
+	if (hibernation_ops)
+		error = hibernation_ops->prepare();
 
 	return error;
 }
 
 static inline void platform_finish(void)
 {
-	if (pm_ops && pm_ops->finish)
-		pm_ops->finish(PM_SUSPEND_DISK);
+	if (hibernation_ops)
+		hibernation_ops->finish();
 }
 
 static inline int snapshot_suspend(int platform_suspend)
@@ -384,7 +384,7 @@ static int snapshot_ioctl(struct inode *
 		switch (arg) {
 
 		case PMOPS_PREPARE:
-			if (pm_ops && pm_ops->enter) {
+			if (hibernation_ops) {
 				data->platform_suspend = 1;
 				error = 0;
 			} else {
@@ -395,8 +395,7 @@ static int snapshot_ioctl(struct inode *
 		case PMOPS_ENTER:
 			if (data->platform_suspend) {
 				kernel_shutdown_prepare(SYSTEM_SUSPEND_DISK);
-				error = pm_ops->enter(PM_SUSPEND_DISK);
-				error = 0;
+				error = hibernation_ops->enter();
 			}
 			break;
 
Index: linux-2.6.21/include/linux/suspend.h
===================================================================
--- linux-2.6.21.orig/include/linux/suspend.h	2007-05-01 13:35:33.000000000 +0200
+++ linux-2.6.21/include/linux/suspend.h	2007-05-01 13:35:33.000000000 +0200
@@ -32,6 +32,24 @@ static inline int pm_prepare_console(voi
 static inline void pm_restore_console(void) {}
 #endif
 
+/**
+ * struct hibernation_ops - hibernation platform support
+ *
+ * The methods in this structure allow a platform to override the default
+ * mechanism of shutting down the machine during a hibernation transition.
+ *
+ * All three methods must be assigned.
+ *
+ * @prepare: prepare system for hibernation
+ * @enter: shut down system after state has been saved to disk
+ * @finish: finish/clean up after state has been reloaded
+ */
+struct hibernation_ops {
+	int (*prepare)(void);
+	int (*enter)(void);
+	void (*finish)(void);
+};
+
 #if defined(CONFIG_PM) && defined(CONFIG_SOFTWARE_SUSPEND)
 /* kernel/power/snapshot.c */
 extern void __init register_nosave_region(unsigned long, unsigned long);
@@ -39,11 +57,25 @@ extern int swsusp_page_is_forbidden(stru
 extern void swsusp_set_page_free(struct page *);
 extern void swsusp_unset_page_free(struct page *);
 extern unsigned long get_safe_page(gfp_t gfp_mask);
+
+/**
+ * hibernation_set_ops - set the global hibernate operations
+ * @ops: the hibernation operations to use in subsequent hibernation transitions
+ */
+void hibernation_set_ops(struct hibernation_ops *ops);
+
+/**
+ * hibernate - hibernate the system
+ */
+extern int hibernate(void);
 #else
 static inline void register_nosave_region(unsigned long b, unsigned long e) {}
 static inline int swsusp_page_is_forbidden(struct page *p) { return 0; }
 static inline void swsusp_set_page_free(struct page *p) {}
 static inline void swsusp_unset_page_free(struct page *p) {}
+
+static inline void hibernation_set_ops(struct hibernation_ops *ops) {}
+extern inline int hibernate(void) { return -ENOSYS; }
 #endif /* defined(CONFIG_PM) && defined(CONFIG_SOFTWARE_SUSPEND) */
 
 void save_processor_state(void);

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-01 14:05                               ` Rafael J. Wysocki
@ 2007-05-01 22:02                                 ` Rafael J. Wysocki
  2007-05-02  5:13                                   ` Alexey Starikovskiy
  2007-05-02  8:21                                   ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) Johannes Berg
  0 siblings, 2 replies; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-01 22:02 UTC (permalink / raw)
  To: Johannes Berg; +Cc: linux-pm, Pekka Enberg, Nigel Cunningham, Pavel Machek

On Tuesday, 1 May 2007 16:05, Rafael J. Wysocki wrote:
> On Monday, 30 April 2007 16:59, Johannes Berg wrote:
> > On Mon, 2007-04-30 at 16:51 +0200, Rafael J. Wysocki wrote:
> > 
> > > > That comment doesn't seem right. This is in ->enter so afaict the image
> > > > hasn't been loaded yet at this point. I don't know if you just moved
> > > > code but if you did then I don't think it was correct before.
> > > 
> > > It was in your patch, so I kept it, but I don't think it's correct too.
> > 
> > If it was in my patch then it must be there in the original code, iirc I
> > just shuffled it a bit :)
> > 
> > > Moreover, it seems that acpi_save_state_mem() and acpi_restore_state_mem() are
> > > only needed by s2ram, so we can safely remove them from the hibernation code
> > > path.  Pavel, is that correct?
> > 
> > This I don't know. They seemed to be done on hibernate too.
> 
> The previous version of the patch was missing the changes in suspend.h.
> 
> Apart from this I've cleaned up some changes in disk.c and main.c to make
> the sysfs interface work again and dropped some ACPI code that I think was
> not necessary.
> 
> Patch appended (tested on x86_64, but not extensively), comments welcome. :-)

Well, having a look on the ACPI spec I'm thinking that what we're trying to do
with this patch is actually wrong.

Instead, we should rip off all of the invocations of pm_ops->whatever() from
the hibernation code paths (with the below exceptions) and *if* the platform
method is to be used, call pm_ops to make the system go down, in the following
way:
1) call pm_ops->prepare(PM_SUSPEND_DISK)
2) suspend devices (ie. call device_suspend() etc.)
3) call pm_ops->enter(PM_SUSPEND_DISK)
and if that *fails* (ie. pm_ops->enter() returns):
4) call pm_ops->finish(PM_SUSPEND_DISK)
5) halt the system

Formally, after restoring the image, *if* the platform method was used (ie. the
above was executed as the last hibernation step), we should call
pm_ops->finish(PM_SUSPEND_DISK) before resuming devices, but
since we get the control from the "old kernel" rather than from the BIOS,
this doesn't seem to be the right thing to do.

I'll try to create a patch along these lines and see if it breaks anything on
my boxes.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-01 22:02                                 ` Rafael J. Wysocki
@ 2007-05-02  5:13                                   ` Alexey Starikovskiy
  2007-05-02 13:42                                     ` Rafael J. Wysocki
  2007-05-02  8:21                                   ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) Johannes Berg
  1 sibling, 1 reply; 117+ messages in thread
From: Alexey Starikovskiy @ 2007-05-02  5:13 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Johannes Berg, Pekka Enberg, linux-pm, Pavel Machek,
	Nigel Cunningham

Rafael,

On resume ACPI expects
boot kernel do pm_prepare().
resumed kernel do pm_finish() before device_resume().

Thanks,
Alex.

On 5/2/07, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Tuesday, 1 May 2007 16:05, Rafael J. Wysocki wrote:
> > On Monday, 30 April 2007 16:59, Johannes Berg wrote:
> > > On Mon, 2007-04-30 at 16:51 +0200, Rafael J. Wysocki wrote:
> > >
> > > > > That comment doesn't seem right. This is in ->enter so afaict the image
> > > > > hasn't been loaded yet at this point. I don't know if you just moved
> > > > > code but if you did then I don't think it was correct before.
> > > >
> > > > It was in your patch, so I kept it, but I don't think it's correct too.
> > >
> > > If it was in my patch then it must be there in the original code, iirc I
> > > just shuffled it a bit :)
> > >
> > > > Moreover, it seems that acpi_save_state_mem() and acpi_restore_state_mem() are
> > > > only needed by s2ram, so we can safely remove them from the hibernation code
> > > > path.  Pavel, is that correct?
> > >
> > > This I don't know. They seemed to be done on hibernate too.
> >
> > The previous version of the patch was missing the changes in suspend.h.
> >
> > Apart from this I've cleaned up some changes in disk.c and main.c to make
> > the sysfs interface work again and dropped some ACPI code that I think was
> > not necessary.
> >
> > Patch appended (tested on x86_64, but not extensively), comments welcome. :-)
>
> Well, having a look on the ACPI spec I'm thinking that what we're trying to do
> with this patch is actually wrong.
>
> Instead, we should rip off all of the invocations of pm_ops->whatever() from
> the hibernation code paths (with the below exceptions) and *if* the platform
> method is to be used, call pm_ops to make the system go down, in the following
> way:
> 1) call pm_ops->prepare(PM_SUSPEND_DISK)
> 2) suspend devices (ie. call device_suspend() etc.)
> 3) call pm_ops->enter(PM_SUSPEND_DISK)
> and if that *fails* (ie. pm_ops->enter() returns):
> 4) call pm_ops->finish(PM_SUSPEND_DISK)
> 5) halt the system
>
> Formally, after restoring the image, *if* the platform method was used (ie. the
> above was executed as the last hibernation step), we should call
> pm_ops->finish(PM_SUSPEND_DISK) before resuming devices, but
> since we get the control from the "old kernel" rather than from the BIOS,
> this doesn't seem to be the right thing to do.
>
> I'll try to create a patch along these lines and see if it breaks anything on
> my boxes.
>
> Greetings,
> Rafael
> _______________________________________________
> linux-pm mailing list
> linux-pm@lists.linux-foundation.org
> https://lists.linux-foundation.org/mailman/listinfo/linux-pm
>

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-01 22:02                                 ` Rafael J. Wysocki
  2007-05-02  5:13                                   ` Alexey Starikovskiy
@ 2007-05-02  8:21                                   ` Johannes Berg
  2007-05-02  9:02                                     ` Rafael J. Wysocki
  2007-05-02  9:16                                     ` Pavel Machek
  1 sibling, 2 replies; 117+ messages in thread
From: Johannes Berg @ 2007-05-02  8:21 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, Pekka Enberg, Nigel Cunningham, Pavel Machek


[-- Attachment #1.1: Type: text/plain, Size: 884 bytes --]

On Wed, 2007-05-02 at 00:02 +0200, Rafael J. Wysocki wrote:

> Well, having a look on the ACPI spec I'm thinking that what we're trying to do
> with this patch is actually wrong.

No idea :)

> Instead, we should rip off all of the invocations of pm_ops->whatever() from
> the hibernation code paths (with the below exceptions) and *if* the platform
> method is to be used, call pm_ops to make the system go down, in the following
> way:
> 1) call pm_ops->prepare(PM_SUSPEND_DISK)
> 2) suspend devices (ie. call device_suspend() etc.)
> 3) call pm_ops->enter(PM_SUSPEND_DISK)
> and if that *fails* (ie. pm_ops->enter() returns):
> 4) call pm_ops->finish(PM_SUSPEND_DISK)
> 5) halt the system

Can we still split that off to another method so we don't use pm_ops? No
matter how we invoke hibernation_ops or in what order, imho we shouldn't
use pm_ops.

johannes

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-02  8:21                                   ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) Johannes Berg
@ 2007-05-02  9:02                                     ` Rafael J. Wysocki
  2007-05-02  9:16                                     ` Pavel Machek
  1 sibling, 0 replies; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-02  9:02 UTC (permalink / raw)
  To: Johannes Berg; +Cc: linux-pm, Pekka Enberg, Nigel Cunningham, Pavel Machek

On Wednesday, 2 May 2007 10:21, Johannes Berg wrote:
> On Wed, 2007-05-02 at 00:02 +0200, Rafael J. Wysocki wrote:
> 
> > Well, having a look on the ACPI spec I'm thinking that what we're trying to do
> > with this patch is actually wrong.
> 
> No idea :)
> 
> > Instead, we should rip off all of the invocations of pm_ops->whatever() from
> > the hibernation code paths (with the below exceptions) and *if* the platform
> > method is to be used, call pm_ops to make the system go down, in the following
> > way:
> > 1) call pm_ops->prepare(PM_SUSPEND_DISK)
> > 2) suspend devices (ie. call device_suspend() etc.)
> > 3) call pm_ops->enter(PM_SUSPEND_DISK)
> > and if that *fails* (ie. pm_ops->enter() returns):
> > 4) call pm_ops->finish(PM_SUSPEND_DISK)
> > 5) halt the system
> 
> Can we still split that off to another method so we don't use pm_ops? No
> matter how we invoke hibernation_ops or in what order, imho we shouldn't
> use pm_ops.

OK, I think we can go ahead with the patch if nobody objects.  It's been tested
to some extent and seems to work.  More testing will be appreciated.

Later on we can do what I said above using hibernation_ops instead of pm_ops,
if turns out to really make sense.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-02  8:21                                   ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) Johannes Berg
  2007-05-02  9:02                                     ` Rafael J. Wysocki
@ 2007-05-02  9:16                                     ` Pavel Machek
  2007-05-02  9:25                                       ` Johannes Berg
  2007-05-02 13:43                                       ` Rafael J. Wysocki
  1 sibling, 2 replies; 117+ messages in thread
From: Pavel Machek @ 2007-05-02  9:16 UTC (permalink / raw)
  To: Johannes Berg; +Cc: linux-pm, Pekka Enberg, Nigel Cunningham

Hi!

> > Well, having a look on the ACPI spec I'm thinking that what we're trying to do
> > with this patch is actually wrong.
> 
> No idea :)
> 
> > Instead, we should rip off all of the invocations of pm_ops->whatever() from
> > the hibernation code paths (with the below exceptions) and *if* the platform
> > method is to be used, call pm_ops to make the system go down, in the following
> > way:
> > 1) call pm_ops->prepare(PM_SUSPEND_DISK)
> > 2) suspend devices (ie. call device_suspend() etc.)
> > 3) call pm_ops->enter(PM_SUSPEND_DISK)
> > and if that *fails* (ie. pm_ops->enter() returns):
> > 4) call pm_ops->finish(PM_SUSPEND_DISK)
> > 5) halt the system
> 
> Can we still split that off to another method so we don't use pm_ops? No
> matter how we invoke hibernation_ops or in what order, imho we shouldn't
> use pm_ops.

Well... the powerdown during hibernation... does not have _anything_
to do with snapshot/restore. It is really a very deep sleep; similar
to soft powerdown, but not quite.

So this usage of pm_ops seems ok.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-02  9:16                                     ` Pavel Machek
@ 2007-05-02  9:25                                       ` Johannes Berg
  2007-05-03 14:00                                         ` Alan Stern
  2007-05-02 13:43                                       ` Rafael J. Wysocki
  1 sibling, 1 reply; 117+ messages in thread
From: Johannes Berg @ 2007-05-02  9:25 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-pm, Pekka Enberg, Nigel Cunningham


[-- Attachment #1.1: Type: text/plain, Size: 657 bytes --]

On Wed, 2007-05-02 at 11:16 +0200, Pavel Machek wrote:

> Well... the powerdown during hibernation... does not have _anything_
> to do with snapshot/restore. It is really a very deep sleep; similar
> to soft powerdown, but not quite.

It's also horribly confusing to intermingle hibernation and suspend into
one operation struct when there's only a single user for it anyway. Just
look at what all the arm platforms had there, trying to veto suspend to
disk through pm_ops etc. I don't technically disagree with you, but from
a point of how to understand this whole thing I'd rather have hibernate
and suspend be totally orthogonal.

johannes

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-02  5:13                                   ` Alexey Starikovskiy
@ 2007-05-02 13:42                                     ` Rafael J. Wysocki
  2007-05-02 14:11                                       ` Alexey Starikovskiy
  0 siblings, 1 reply; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-02 13:42 UTC (permalink / raw)
  To: Alexey Starikovskiy
  Cc: Johannes Berg, Pekka Enberg, linux-pm, Pavel Machek,
	Nigel Cunningham

Hi,

On Wednesday, 2 May 2007 07:13, Alexey Starikovskiy wrote:
> Rafael,
> 
> On resume ACPI expects
> boot kernel do pm_prepare().
> resumed kernel do pm_finish() before device_resume().

Well, lets analyse what pm_prepare() actually does.  If my understading of
the code in there and the ACPI spec [1] is correct, it does the following:

(1) Sets the firmware waking vector (doesn't matter for hibernation)
(2) Prepares the wake-up devices for a state transition, by calling their _PSW
methods ("to enable wake" according to the spec)
(3) Disables the GPEs that cannot wake up the system
(4) Runs the _PTS and _GTS methods
(5) Runs the _SST method
(6) Disables all GPEs

Now, there's a couple of problems with that regardless of what it's used for,
as far as I can see:

a) The spec (in Section 7.2) says that (2) should be done *after* the _PTS
method is called

b) The spec (Section 7.3.2) says:

"This [_PTS] method is called after OSPM has notified native device drivers of
the sleep state transition and before the OSPM has had a chance to fully
prepare the system for a sleep state transition."

We don't seem to be doing this.  Moreover, Section 15.1.6 of the spec say that
"OSPM places all device drivers into their respective Dx state" *before* _PTS
is executed, so it doesn't look like _PTS should be executed before
device_suspend().

c) According to the spec (Section 15.1.6) "OSPM saves any other processor’s
context (other than the local processor) to memory" *after* executing _PTS,
but *before* _GTS is executed, but we do this after _GTS is executed.
Moreover, the waking vector should be written into FACS after the "other
processor’s context" has been saved, but *before* _GTS is executed.

d) The spec (Section 7.3.3) says literally this:

" _GTS allows ACPI system firmware to perform any required system specific
functions prior to entering a system sleep state. OSPM will set the sleep
enable (SLP_EN) bit in the PM1 control register immediately following the
execution of the _GTS control method without performing any other physical I/O
or allowing any interrupt servicing."

However, in our code _GTS is executed *waaay* before setting the SLP_EN bit
in PM1, which only happens in acpi_enter_sleep_state() called from
acpi_pm_enter(), *after* we've executed device_suspend() with IRQs enabled
and, in the hibernation case, called device_resume() and saved the image
(oh, dear).

e) It implicitly follows from d) that _SST should be executed before _GTS
and after we run device_suspend().

f) I'm not sure if the disabling of all GPEs before device_suspend() is
actually a good idea.

Next, we can consider acpi_pm_finish().  Again, if my understading of the code
in there and the ACPI spec [1] is correct, it does the following:

(7) Sets SLP_EN and SLP_TYPE to state S0
(8) Executes the _SST method (Waking)
(9) Executes the _BFS (Back From Sleep) method
(10) Executes the _WAK method
(11) Enables the runtime GPEs
(12) Enables the power button
(13) Executes the _SST method (Working)
(14) Disables the wake-up capability of devices
(15) Resets the firmware waking vector (doesn't matter for hibernation)

Now, there seems to be nothing wrong with that *if* it's executed while
resuming from RAM, for example, but it doesn't seem to be suitable for using
in such a way as we do this in the resume-during-hibernation code path.

Consider a hibernation (aka suspend to disk) transition (ie. an operation in
which we snapshot the system memory, save the image and shut the system down).

Currently, we call acpi_pm_prepare(PM_SUSPEND_DISK) and run device_suspend(),
which seems to be in many ways agaist the ACPI spec.  The spec, as I understand
it, indicates that we should run device_suspend() first and then execute the
_PTS method.  We shouldn't, however, execute either _GTS, or _SST just yet.

Next, we suspend sysdevs etc., and create the memory snapshot.  We want
to be able to save it, so w call acpi_pm_finish(), which causes _BFS and _WAK
to be executed *after* _GTS, which is clearly against the spec (might this be the
reason why (7) is sometimes necessary?).  Moreover, calling _BFS at this stage
makes no sense, IMHO, since there hasn't been any transition (the system has
not slept).  What I think we should do at this point is to execute _WAK only,
which means "power transition aborted" to the firmware, and continue with
device_resume().

Next, we save the image and now we'd like to put the system to "sleep", so
we use acpi_pm_enter(PM_SUSPEND_DISK), but we shouldn't do that, since the
power transition has been aborted by _WAK in acpi_pm_finish()!  Thus we should
start the transition again, run device_suspend(), execute _PTS, do (2) and (3),
save the "other processor's context" etc., execute _SST(S4), execute _GTS and
set SLP_EN in PM1 etc.

When we restore the system state from a hibernation image, the "boot kernel" is
first started.  It loads the image into memory, calls
device_suspend(PMSG_PRETHAW), suspends sysdevs etc., and replaces itself with
the "resumed kernel".  It doesn't call acpi_pm_prepare(), which I think is
right, because it doesn't want to start any power transition, not even a
fake one.  Now, the "resumed kernel" takes control, resumes sysdevs and calls
acpi_pm_finish(), which seems to be about OK, except that I'm not sure if _BFS
should be executed in that case (the ACPI spec seems to assume that the
hibernation image will be loaded into memory by a boot loader).

Concluding, it seems to me that the "restore" code path is correct, but the
"hibernate" code path is not and should be reworked.  Also, it seems that
acpi_pm_prepare() and acpi_pm_enter() should be fixed for the s2ram case
either (_PTS should be executed after device_suspend() and _GTS should
be executed in acpi_pm_enter(), right before the transition is completed).

Greetings,
Rafael


[1] Advanced Configuration and Power Interface Specification, Revision 3.0,
September 2, 2004

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-02  9:16                                     ` Pavel Machek
  2007-05-02  9:25                                       ` Johannes Berg
@ 2007-05-02 13:43                                       ` Rafael J. Wysocki
  1 sibling, 0 replies; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-02 13:43 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Johannes Berg, Pekka Enberg, linux-pm, Nigel Cunningham

On Wednesday, 2 May 2007 11:16, Pavel Machek wrote:
> Hi!
> 
> > > Well, having a look on the ACPI spec I'm thinking that what we're trying to do
> > > with this patch is actually wrong.
> > 
> > No idea :)
> > 
> > > Instead, we should rip off all of the invocations of pm_ops->whatever() from
> > > the hibernation code paths (with the below exceptions) and *if* the platform
> > > method is to be used, call pm_ops to make the system go down, in the following
> > > way:
> > > 1) call pm_ops->prepare(PM_SUSPEND_DISK)
> > > 2) suspend devices (ie. call device_suspend() etc.)
> > > 3) call pm_ops->enter(PM_SUSPEND_DISK)
> > > and if that *fails* (ie. pm_ops->enter() returns):
> > > 4) call pm_ops->finish(PM_SUSPEND_DISK)
> > > 5) halt the system
> > 
> > Can we still split that off to another method so we don't use pm_ops? No
> > matter how we invoke hibernation_ops or in what order, imho we shouldn't
> > use pm_ops.
> 
> Well... the powerdown during hibernation... does not have _anything_
> to do with snapshot/restore.

Agreed.

> It is really a very deep sleep; similar to soft powerdown, but not quite.

Yeah, not quite.  For example, we may want to use some devices for waking up
the system, but with the current code it's impossible, because pm_ops->finish()
disables this capability of devices.

I think we shouldn't confuse the quiescing of devices before the image creation
with a power transition.  This is not a power transition, since it's not
completed by calling pm_ops->enter().  Instead, we kinda-sorta abort it with
pm_ops->finish() which confuses the heck out of the ACPI firmware (please
see my reply to Alexey in the same thread for a detailed analysis).

> So this usage of pm_ops seems ok.

To me, it doesn't.  These are the main problems I see with it:
1) device_suspend() should be called before the _PTS method is executed (IMO
it's correct not to execute _PTS at all if we don't want to do a real power
transition)
2) The _GTS method shouldn't be executed in acpi_pm_prepare(), but instead
should be executed in acpi_pm_enter(), right before the transition is completed
3) The _BFS method shouldn't be executed in the resume-during-hibernation
code path
4) The wake-up capability of devices should be enabled before we execute
pm_ops->enter() and shouldn't be enabled before the image creation (what for?).
5) The first part of 4) requires that the transition be started over after the
image has been saved.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-02 13:42                                     ` Rafael J. Wysocki
@ 2007-05-02 14:11                                       ` Alexey Starikovskiy
  2007-05-02 19:26                                         ` ACPI code in platform mode hibernation code paths (was: Re: [PATCH] swsusp: do not use pm_ops) Rafael J. Wysocki
       [not found]                                         ` <200705022126.47897.rjw@sisk.pl>
  0 siblings, 2 replies; 117+ messages in thread
From: Alexey Starikovskiy @ 2007-05-02 14:11 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Johannes Berg, Pekka Enberg, linux-pm, Pavel Machek,
	Nigel Cunningham

Rafael,

> Concluding, it seems to me that the "restore" code path is correct, but the
> "hibernate" code path is not and should be reworked.  Also, it seems that
> acpi_pm_prepare() and acpi_pm_enter() should be fixed for the s2ram case
> either (_PTS should be executed after device_suspend() and _GTS should
> be executed in acpi_pm_enter(), right before the transition is completed).

Current implementation is not fully up-to spec, so we may try to get
it closer to, I agree.

> When we restore the system state from a hibernation image, the "boot kernel" is
> first started.  It loads the image into memory, calls
> device_suspend(PMSG_PRETHAW), suspends sysdevs etc., and replaces itself with
> the "resumed kernel".  It doesn't call acpi_pm_prepare(), which I think is
> right, because it doesn't want to start any power transition, not even a
> fake one.  Now, the "resumed kernel" takes control, resumes sysdevs and calls
Currently call to prepare() is needed to stop ACPI devices to send
GPEs to ACPI drivers.
If you remove it, Acer laptops will resume without ACPI interrupt at
all (with all problems from it).
> acpi_pm_finish(), which seems to be about OK, except that I'm not sure if _BFS
> should be executed in that case (the ACPI spec seems to assume that the
> hibernation image will be loaded into memory by a boot loader).

> Next, we suspend sysdevs etc., and create the memory snapshot.  We want
> to be able to save it, so w call acpi_pm_finish(), which causes _BFS and _WAK
> to be executed *after* _GTS, which is clearly against the spec (might this be the
> reason why (7) is sometimes necessary?).  Moreover, calling _BFS at this stage
> makes no sense, IMHO, since there hasn't been any transition (the system has
> not slept).  What I think we should do at this point is to execute _WAK only,
> which means "power transition aborted" to the firmware, and continue with
> device_resume().

But I don't get your idea about executing _finish() between _prepare()
and _enter()...
_finish is executed only if _prepare() fails, so we are rolling back,
or it is executed after we loaded the image and transfered execution
to it, so again -- we are going from _prepare() state to running
state...

Regards,
Alex.

^ permalink raw reply	[flat|nested] 117+ messages in thread

* ACPI code in platform mode hibernation code paths (was: Re: [PATCH] swsusp: do not use pm_ops)
  2007-05-02 14:11                                       ` Alexey Starikovskiy
@ 2007-05-02 19:26                                         ` Rafael J. Wysocki
       [not found]                                         ` <200705022126.47897.rjw@sisk.pl>
  1 sibling, 0 replies; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-02 19:26 UTC (permalink / raw)
  To: Alexey Starikovskiy
  Cc: Nigel Cunningham, ACPI Devel Maling List, Pekka Enberg,
	Pavel Machek, Johannes Berg, linux-pm

Hi,

[Added linux-acpi to the CC list, should be there from the start]

On Wednesday, 2 May 2007 16:11, Alexey Starikovskiy wrote:
> Rafael,
> 
> > Concluding, it seems to me that the "restore" code path is correct, but the
> > "hibernate" code path is not and should be reworked.  Also, it seems that
> > acpi_pm_prepare() and acpi_pm_enter() should be fixed for the s2ram case
> > either (_PTS should be executed after device_suspend() and _GTS should
> > be executed in acpi_pm_enter(), right before the transition is completed).
> 
> Current implementation is not fully up-to spec, so we may try to get
> it closer to, I agree.

Okay.  Since we're trying to separate the hibernation code from the
suspend code anyway, we can use the opportunity to introduce some new
callbacks for the hibernation and/or redefine the existing ones.

The spec suggests that we need the following callbacks:

(1) prepare() - called after device_suspend(), execute _PTS and disable GPEs
(2) cancel() - called at any time after prepare() if there's an error, execute
    _WAK and enable run-time GPEs
(3) enter() - called after the image has been saved, execute _GTS and do what's
    currently done in pm_enter()
(4) finish() - called after the image has been restored, do what's currently
    done in pm_finish()

[At least, the execution of _GTS in pm_prepare() seems to be dangerous at first
sight.]

We also might need a callback that will be run before device_suspend() to
invoke some ACPI-related magic needed at that point, but I have no idea what
it would have to do. :-)

> > When we restore the system state from a hibernation image, the "boot kernel" is
> > first started.  It loads the image into memory, calls
> > device_suspend(PMSG_PRETHAW), suspends sysdevs etc., and replaces itself with
> > the "resumed kernel".  It doesn't call acpi_pm_prepare(), which I think is
> > right, because it doesn't want to start any power transition, not even a
> > fake one.  Now, the "resumed kernel" takes control, resumes sysdevs and calls

> Currently call to prepare() is needed to stop ACPI devices to send
> GPEs to ACPI drivers.

Does it mean that we need to call pm_prepare() (or an equivalent function)
before device_suspend()?  If that's the case, then which part of pm_prepare()
is essential here?

> If you remove it, Acer laptops will resume without ACPI interrupt at
> all (with all problems from it).

A recent discussion on the LKML lead to the conclusion that for the
hibernation we shouldn't use .suspend() callbacks before snapshotting the
system memory.  Instead, we should use some other callbacks to quiesce devices,
create the snapshot, reactivate devices, save the image and carry out the
actual power transition after that.  Would something like this be viable from
the ACPI point of view?

> > acpi_pm_finish(), which seems to be about OK, except that I'm not sure if _BFS
> > should be executed in that case (the ACPI spec seems to assume that the
> > hibernation image will be loaded into memory by a boot loader).
> 
> > Next, we suspend sysdevs etc., and create the memory snapshot.  We want
> > to be able to save it, so w call acpi_pm_finish(), which causes _BFS and _WAK
> > to be executed *after* _GTS, which is clearly against the spec (might this be the
> > reason why (7) is sometimes necessary?).  Moreover, calling _BFS at this stage
> > makes no sense, IMHO, since there hasn't been any transition (the system has
> > not slept).  What I think we should do at this point is to execute _WAK only,
> > which means "power transition aborted" to the firmware, and continue with
> > device_resume().
> 
> But I don't get your idea about executing _finish() between _prepare()
> and _enter()...
> _finish is executed only if _prepare() fails, so we are rolling back,

Well, no.  Please have a look at the code in kernel/power/disk.c.

Should we remove it from the nonerror code paths?

> or it is executed after we loaded the image and transfered execution
> to it, so again -- we are going from _prepare() state to running
> state...

Currently that's not the case.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-02  9:25                                       ` Johannes Berg
@ 2007-05-03 14:00                                         ` Alan Stern
  2007-05-03 17:17                                           ` Rafael J. Wysocki
                                                             ` (2 more replies)
  0 siblings, 3 replies; 117+ messages in thread
From: Alan Stern @ 2007-05-03 14:00 UTC (permalink / raw)
  To: Johannes Berg; +Cc: linux-pm, Pekka Enberg, Nigel Cunningham, Pavel Machek

On Wed, 2 May 2007, Johannes Berg wrote:

> On Wed, 2007-05-02 at 11:16 +0200, Pavel Machek wrote:
> 
> > Well... the powerdown during hibernation... does not have _anything_
> > to do with snapshot/restore. It is really a very deep sleep; similar
> > to soft powerdown, but not quite.

Is this really a good idea?

For that matter, what are the differences among the various sorts of 
poweroff?

	Which devices remain minimally powered for wakeup purposes?

	Anything else?

In fact, shouldn't the poweroff at the end of a hibernate be exactly the 
same as a normal non-hibernate poweroff?  Aren't drivers required to 
assume (during the processing after the snapshot has been restored) that 
power could have been lost and devices might need to be completely 
reinitialized?

We are letting ourselves in for problems if we say that when the snapshot
is restored, devices may or may not need to be reinitialized.  Drivers
might not be able to tell which, so they would have to reinitialize
regardless, losing any advantage.  Even worse, the device may _appear_ not
to need reinitialization because the firmware (BIOS) has already
initialized it but left it in a state that's useless for the kernel's
purposes.  (That's part of the reason why PRETHAW was added.)

If the only remaining difference between poweroff for hibernate and normal 
poweroff is which wakeup devices will function, then it seems pointless.  
Why shouldn't the same devices work for wakeup from hibernate and wakeup 
from normal poweroff?

Or have I misunderstood something and is this all nonsense?

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-03 14:00                                         ` Alan Stern
@ 2007-05-03 17:17                                           ` Rafael J. Wysocki
  2007-05-03 18:33                                             ` Alan Stern
  2007-05-03 20:33                                           ` David Brownell
  2007-05-03 22:18                                           ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) Pavel Machek
  2 siblings, 1 reply; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-03 17:17 UTC (permalink / raw)
  To: Alan Stern
  Cc: linux-pm, Pekka Enberg, Johannes Berg, Pavel Machek,
	Nigel Cunningham

On Thursday, 3 May 2007 16:00, Alan Stern wrote:
> On Wed, 2 May 2007, Johannes Berg wrote:
> 
> > On Wed, 2007-05-02 at 11:16 +0200, Pavel Machek wrote:
> > 
> > > Well... the powerdown during hibernation... does not have _anything_
> > > to do with snapshot/restore. It is really a very deep sleep; similar
> > > to soft powerdown, but not quite.
> 
> Is this really a good idea?
> 
> For that matter, what are the differences among the various sorts of 
> poweroff?
> 
> 	Which devices remain minimally powered for wakeup purposes?
> 
> 	Anything else?
> 
> In fact, shouldn't the poweroff at the end of a hibernate be exactly the 
> same as a normal non-hibernate poweroff?

Not quite (see (*) below).

> Aren't drivers required to assume (during the processing after the snapshot
> has been restored) that power could have been lost and devices might need to
> be completely reinitialized?
> 
> We are letting ourselves in for problems if we say that when the snapshot
> is restored, devices may or may not need to be reinitialized.

Agreed.

> Drivers might not be able to tell which, so they would have to reinitialize
> regardless, losing any advantage.  Even worse, the device may _appear_ not
> to need reinitialization because the firmware (BIOS) has already
> initialized it but left it in a state that's useless for the kernel's
> purposes.  (That's part of the reason why PRETHAW was added.)

Yes.

> If the only remaining difference between poweroff for hibernate and normal 
> poweroff is which wakeup devices will function, then it seems pointless.

No, this is not the only difference (*).

> Why shouldn't the same devices work for wakeup from hibernate and wakeup 
> from normal poweroff?
> 
> Or have I misunderstood something and is this all nonsense?

The problem, generally speaking, is that we have to prepare devices for waking
up the system.  On an ACPI system this is done, among other things, by
executing the devices' _PSW control methods after the system-level _PTS method
has run.  For this purpose the devices must be in (low) power states from which
the wake is possible, so in particular they must not be powered off.  Later, by
making the platform enter the suspend-to-disk (ACPI S4) state we prevent it
from powering off the wake-up devices, among other things.

That's why I'm thinking that it might be a good idea to do a
suspend-before-poweroff, but it doesn't mean that device drivers would be
allowed to make any assumptions regarding the state of the device after the
resume.  IMO, if this is a resume from disk, devices should be initialized from
scratch.

(*) Another issue is that, for example, on my notebook the status of the AC
power supply (and sometimes of the battery too) is not reported correctly by
the platform after the resume if the suspend-to-disk (ACPI S4) state has not
been entered during the hibernation. I don't understand why this happens, but
I'm going to find out.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-03 17:17                                           ` Rafael J. Wysocki
@ 2007-05-03 18:33                                             ` Alan Stern
  2007-05-03 19:47                                               ` Rafael J. Wysocki
  2007-05-03 20:33                                               ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) David Brownell
  0 siblings, 2 replies; 117+ messages in thread
From: Alan Stern @ 2007-05-03 18:33 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Pekka Enberg, Johannes Berg, Pavel Machek,
	Nigel Cunningham

On Thu, 3 May 2007, Rafael J. Wysocki wrote:

> > In fact, shouldn't the poweroff at the end of a hibernate be exactly the 
> > same as a normal non-hibernate poweroff?
> 
> Not quite (see (*) below).

> The problem, generally speaking, is that we have to prepare devices for waking
> up the system.  On an ACPI system this is done, among other things, by
> executing the devices' _PSW control methods after the system-level _PTS method
> has run.  For this purpose the devices must be in (low) power states from which
> the wake is possible, so in particular they must not be powered off.  Later, by
> making the platform enter the suspend-to-disk (ACPI S4) state we prevent it
> from powering off the wake-up devices, among other things.
> 
> That's why I'm thinking that it might be a good idea to do a
> suspend-before-poweroff, but it doesn't mean that device drivers would be
> allowed to make any assumptions regarding the state of the device after the
> resume.  IMO, if this is a resume from disk, devices should be initialized from
> scratch.

I generally agree with your last sentence, but with one reservation (see 
below).

As for the rest, you missed my point.  Granted that all these special 
activities are required on ACPI systems in order to support proper 
operation of wakeup devices -- Why shouldn't these same steps also be 
followed during a normal poweroff?

There really are two orthogonal issues here:

	(1) Is this a "hibernate" poweroff (as opposed to a "normal" 
	    poweroff)? 

	(2) Should some devices remain minimally powered and be capable
	    of waking up the system?

I don't see any necessary relation between the answers to (1) and (2).  In 
particular, I don't see why a Yes answer to (1) should imply a Yes answer 
to (2).

This suggests that the poweroff methods be completely independent of
hibernation_ops (or whatever you are now calling it).  Perhaps there
should be a separate sysfs attribute controlling whether or not wakeup is
enabled.  If it is then poweroff should go through all the ACPI (or the
platform's equivalent) hoops, otherwise everything should just be turned
off completely.  Regardless of whether the poweroff is part of a
hibernate sequence.

> (*) Another issue is that, for example, on my notebook the status of the AC
> power supply (and sometimes of the battery too) is not reported correctly by
> the platform after the resume if the suspend-to-disk (ACPI S4) state has not
> been entered during the hibernation. I don't understand why this happens, but
> I'm going to find out.

Hopefully it's not directly related to the matter under discussion. :-)


There remains one issue associated with always reinitializing devices 
during resume from hibernation.  In the one area I know a lot about (USB) 
this actually does matter, at least a little.

The USB specs include the notion of a "power session", which is
essentially an uninterrupted continuous connection between the host and
the device.  As long as a power session exists, the host is guaranteed
that the device has not been unplugged or replaced with a different
device.

On most systems, hibernation breaks power sessions.  When the system wakes 
back up it sees a bunch of USB devices connected, but it is not allowed 
(by the spec!) to assume that these are the same devices as were attached 
before.  In fact, some of them might not be.

Mostly this doesn't make any difference, but for mass-storage it does.  
Memory mappings and filesystem mounts will be disrupted when the
underlying logical device goes away, even if the same physical device is
still attached to the same port.  This has caused significant headaches
for USB users in the past.

On the other hand, some systems are designed cleverly enough to maintain
power sessions across hibernation.  Not many -- the only ones I've heard
about were all PPC Macs.  The USB drivers have always tried to keep power
sessions intact across hibernation whenever the hardware and firmware
would permit, but of course reinitializing the USB controller would
destroy them.

There are a couple of reaons for not worrying about this very much.  
First, as mentioned before this issue exists on only a small number of
systems.  Second, I have submitted to Greg KH a couple of patches to
maintain persistence of USB devices even when the power sessions are lost
(they're still in his queue so you can't try them out yet).  This feature
violates the USB spec and it is potentially dangerous -- users could
easily lose data for example by changing the card in a USB card reader
while the system is hibernating -- so it is a non-default Kconfig option.  
Nevertheless, it does solve the problem.

In the end, this is a long-winded way of saying that always reinitializing
devices while resuming from a hibernation is probably the best overall
approach, even if it may not be optimal in a few cases.

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-03 18:33                                             ` Alan Stern
@ 2007-05-03 19:47                                               ` Rafael J. Wysocki
  2007-05-03 19:59                                                 ` Alan Stern
  2007-05-03 20:33                                               ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) David Brownell
  1 sibling, 1 reply; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-03 19:47 UTC (permalink / raw)
  To: Alan Stern
  Cc: linux-pm, Pekka Enberg, Johannes Berg, Pavel Machek,
	Nigel Cunningham

On Thursday, 3 May 2007 20:33, Alan Stern wrote:
> On Thu, 3 May 2007, Rafael J. Wysocki wrote:
> 
> > > In fact, shouldn't the poweroff at the end of a hibernate be exactly the 
> > > same as a normal non-hibernate poweroff?
> > 
> > Not quite (see (*) below).
> 
> > The problem, generally speaking, is that we have to prepare devices for waking
> > up the system.  On an ACPI system this is done, among other things, by
> > executing the devices' _PSW control methods after the system-level _PTS method
> > has run.  For this purpose the devices must be in (low) power states from which
> > the wake is possible, so in particular they must not be powered off.  Later, by
> > making the platform enter the suspend-to-disk (ACPI S4) state we prevent it
> > from powering off the wake-up devices, among other things.

The last sencence in the above paragraph is not actually true, sorry for the
confusion.

> > That's why I'm thinking that it might be a good idea to do a
> > suspend-before-poweroff, but it doesn't mean that device drivers would be
> > allowed to make any assumptions regarding the state of the device after the
> > resume.  IMO, if this is a resume from disk, devices should be initialized from
> > scratch.
> 
> I generally agree with your last sentence, but with one reservation (see 
> below).
> 
> As for the rest, you missed my point.  Granted that all these special 
> activities are required on ACPI systems in order to support proper 
> operation of wakeup devices -- Why shouldn't these same steps also be 
> followed during a normal poweroff?
> 
> There really are two orthogonal issues here:
> 
> 	(1) Is this a "hibernate" poweroff (as opposed to a "normal" 
> 	    poweroff)? 
> 
> 	(2) Should some devices remain minimally powered and be capable
> 	    of waking up the system?
> 
> I don't see any necessary relation between the answers to (1) and (2).  In 
> particular, I don't see why a Yes answer to (1) should imply a Yes answer 
> to (2).
> 
> This suggests that the poweroff methods be completely independent of
> hibernation_ops (or whatever you are now calling it).  Perhaps there
> should be a separate sysfs attribute controlling whether or not wakeup is
> enabled.  If it is then poweroff should go through all the ACPI (or the
> platform's equivalent) hoops, otherwise everything should just be turned
> off completely.  Regardless of whether the poweroff is part of a
> hibernate sequence.

Well, after reviewing the code once again I see that we already do it this
way on ACPI systems, since the 'normal' power off is done by entering the
ACPI S5 state.  Moreover, there shouldn't be any difference between
ACPI S4 and 'power off' with respect to the wake-up devices, so you're
absolutely right.

It seems, though, that we need to do acpi_enter_sleep_state(ACPI_STATE_S4)
to finish the hibernation in order to avoid problems like (*) and for this purpose
we need to use hibernation_ops earlier during the hibernation.

> > (*) Another issue is that, for example, on my notebook the status of the AC
> > power supply (and sometimes of the battery too) is not reported correctly by
> > the platform after the resume if the suspend-to-disk (ACPI S4) state has not
> > been entered during the hibernation. I don't understand why this happens, but
> > I'm going to find out.
> 
> Hopefully it's not directly related to the matter under discussion. :-)
> 
> 
> There remains one issue associated with always reinitializing devices 
> during resume from hibernation.  In the one area I know a lot about (USB) 
> this actually does matter, at least a little.
> 
> The USB specs include the notion of a "power session", which is
> essentially an uninterrupted continuous connection between the host and
> the device.  As long as a power session exists, the host is guaranteed
> that the device has not been unplugged or replaced with a different
> device.
> 
> On most systems, hibernation breaks power sessions.  When the system wakes 
> back up it sees a bunch of USB devices connected, but it is not allowed 
> (by the spec!) to assume that these are the same devices as were attached 
> before.  In fact, some of them might not be.
> 
> Mostly this doesn't make any difference, but for mass-storage it does.  
> Memory mappings and filesystem mounts will be disrupted when the
> underlying logical device goes away, even if the same physical device is
> still attached to the same port.  This has caused significant headaches
> for USB users in the past.
> 
> On the other hand, some systems are designed cleverly enough to maintain
> power sessions across hibernation.  Not many -- the only ones I've heard
> about were all PPC Macs.  The USB drivers have always tried to keep power
> sessions intact across hibernation whenever the hardware and firmware
> would permit, but of course reinitializing the USB controller would
> destroy them.

That seems to be one of the really rare cases in which a device driver can
actually make sure that the device is in certain state after the hibernation on
the basis of information provided by the device itself, so it doesn't need to
make any assupmtions.  In such cases it might be possible not to reinitialize
the device, but that would have to be handled with much care.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-03 19:47                                               ` Rafael J. Wysocki
@ 2007-05-03 19:59                                                 ` Alan Stern
  2007-05-03 20:21                                                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 117+ messages in thread
From: Alan Stern @ 2007-05-03 19:59 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, Pekka Enberg, Johannes Berg, Pavel Machek,
	Nigel Cunningham

On Thu, 3 May 2007, Rafael J. Wysocki wrote:

> > This suggests that the poweroff methods be completely independent of
> > hibernation_ops (or whatever you are now calling it).  Perhaps there
> > should be a separate sysfs attribute controlling whether or not wakeup is
> > enabled.  If it is then poweroff should go through all the ACPI (or the
> > platform's equivalent) hoops, otherwise everything should just be turned
> > off completely.  Regardless of whether the poweroff is part of a
> > hibernate sequence.
> 
> Well, after reviewing the code once again I see that we already do it this
> way on ACPI systems, since the 'normal' power off is done by entering the
> ACPI S5 state.  Moreover, there shouldn't be any difference between
> ACPI S4 and 'power off' with respect to the wake-up devices, so you're
> absolutely right.
> 
> It seems, though, that we need to do acpi_enter_sleep_state(ACPI_STATE_S4)
> to finish the hibernation in order to avoid problems like (*) and for this purpose
> we need to use hibernation_ops earlier during the hibernation.

But why shouldn't a "normal" poweroff enter ACPI S4?  And why shouldn't a 
"hibernate" poweroff enter ACPI S5?  The choice of which state to enter is 
independent of the reason for shutting down, right?

In other words, the choice for whether or not to call
acpi_enter_sleep_state(ACPI_STATE_S4) shouldn't depend on whether or not 
you're hibernating.  So it shouldn't affect the usage of hibernation_ops 
at all.

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-03 19:59                                                 ` Alan Stern
@ 2007-05-03 20:21                                                   ` Rafael J. Wysocki
  2007-05-04 14:40                                                     ` Alan Stern
  0 siblings, 1 reply; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-03 20:21 UTC (permalink / raw)
  To: Alan Stern
  Cc: linux-pm, Pekka Enberg, Johannes Berg, Pavel Machek,
	Nigel Cunningham

On Thursday, 3 May 2007 21:59, Alan Stern wrote:
> On Thu, 3 May 2007, Rafael J. Wysocki wrote:
> 
> > > This suggests that the poweroff methods be completely independent of
> > > hibernation_ops (or whatever you are now calling it).  Perhaps there
> > > should be a separate sysfs attribute controlling whether or not wakeup is
> > > enabled.  If it is then poweroff should go through all the ACPI (or the
> > > platform's equivalent) hoops, otherwise everything should just be turned
> > > off completely.  Regardless of whether the poweroff is part of a
> > > hibernate sequence.
> > 
> > Well, after reviewing the code once again I see that we already do it this
> > way on ACPI systems, since the 'normal' power off is done by entering the
> > ACPI S5 state.  Moreover, there shouldn't be any difference between
> > ACPI S4 and 'power off' with respect to the wake-up devices, so you're
> > absolutely right.
> > 
> > It seems, though, that we need to do acpi_enter_sleep_state(ACPI_STATE_S4)
> > to finish the hibernation in order to avoid problems like (*) and for this purpose
> > we need to use hibernation_ops earlier during the hibernation.
> 
> But why shouldn't a "normal" poweroff enter ACPI S4?  And why shouldn't a 
> "hibernate" poweroff enter ACPI S5?  The choice of which state to enter is 
> independent of the reason for shutting down, right?

Well, not exactly.

> In other words, the choice for whether or not to call
> acpi_enter_sleep_state(ACPI_STATE_S4) shouldn't depend on whether or not 
> you're hibernating.  So it shouldn't affect the usage of hibernation_ops 
> at all.

This works the other way around, I think. :-)

Granted, some boxes require us to call acpi_enter_sleep_state(ACPI_STATE_S4)
as a 'power off method' so that they work correctly after the 'return' from hibernation.
If we do acpi_enter_sleep_state(ACPI_STATE_S5) instead, some things might
not work on them (this is an experimental observation, I don't know what
exactly the reason of it is).

Now, since I have such a box, I need to do the
acpi_enter_sleep_state(ACPI_STATE_S4) thing (IOW, use the 'platform' power off
method) and not acpi_enter_sleep_state(ACPI_STATE_S5) (the 'shutdown' power
off method).  *However*, acpi_enter_sleep_state(ACPI_STATE_S4) cannot be used
without previous preparations, which are made with the help of hibernation_ops.

IOW, all hibernation_ops, including the ->enter() method that actually calls
acpi_enter_sleep_state(ACPI_STATE_S4), are just different pieces of one
(complicated) 'platform' power off method.  It doesn't make sense to use the
(other) hibernation_ops without the ->enter() method.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-03 14:00                                         ` Alan Stern
  2007-05-03 17:17                                           ` Rafael J. Wysocki
@ 2007-05-03 20:33                                           ` David Brownell
  2007-05-03 20:51                                             ` Rafael J. Wysocki
  2007-05-04 14:51                                             ` Alan Stern
  2007-05-03 22:18                                           ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) Pavel Machek
  2 siblings, 2 replies; 117+ messages in thread
From: David Brownell @ 2007-05-03 20:33 UTC (permalink / raw)
  To: linux-pm; +Cc: Johannes Berg, Pekka Enberg, Nigel Cunningham, Pavel Machek

On Thursday 03 May 2007, Alan Stern wrote:

> In fact, shouldn't the poweroff at the end of a hibernate be exactly the 
> same as a normal non-hibernate poweroff? 

No.  One of the differences between ACPI S4 (hibernate)
and S5 (poweroff) states is for example how wakeup behaves.
Look for example at /proc/acpi/wakeup and see how many
devices are listed as "can wake from S5" vs from S4 ...
most systems support some S4 events, not so for S5.

Another is that S4 can consume more power.

(Although I believe I noticed a regression there in recent
kernels ... previously I was able to trigger wakeup from
hibernation using the RTC, but not with 2.6.21 patches.)

Non-ACPI systems can make the same natural distinctions.


> We are letting ourselves in for problems if we say that when the snapshot
> is restored, devices may or may not need to be reinitialized. 

We have those problems already.  Of course, most of the
time S4/hibernate involves device re-init, while S3/STR
doesn't.


> Drivers 
> might not be able to tell which, so they would have to reinitialize
> regardless, losing any advantage.

For those specific devices.  Of course, not many drivers
are power-aware enough to notice.  Most will re-init.
On PCs the exceptions are USB and, maybe, network drivers.

Drivers for embedded platforms more often leverage the
"retention" states which don't require complete re-init,
since those systems generally don't "hibernate".


> Even worse, the device may _appear_ not 
> to need reinitialization because the firmware (BIOS) has already
> initialized it but left it in a state that's useless for the kernel's
> purposes.  (That's part of the reason why PRETHAW was added.)

That's *ALL* of the reason for PRETHAW.  I asked the
guy who did it.  ;)


> If the only remaining difference between poweroff for hibernate and normal 
> poweroff is which wakeup devices will function, then it seems pointless.

There's the additional power usage involved in enabling additional
wakeup sources, plus the additional system components that are
expected (possibly unreasonably!) to work.


> Why shouldn't the same devices work for wakeup from hibernate and wakeup 
> from normal poweroff?

You're suggesting Linux not use the S5 state, essentially.

So the question is really "why should Linux use S5 (and similar
states on non-ACPI systems), instead of disregarding the ACPI
spec?"

The short answer:  having a "true OFF" state is valuable, if
for no other reason than to cope with buggy "partial-ON" states
like S4.  Also, it's not clear that disregarding ACPI's guidance
here would be a good thing.

- Dave

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-03 18:33                                             ` Alan Stern
  2007-05-03 19:47                                               ` Rafael J. Wysocki
@ 2007-05-03 20:33                                               ` David Brownell
  1 sibling, 0 replies; 117+ messages in thread
From: David Brownell @ 2007-05-03 20:33 UTC (permalink / raw)
  To: linux-pm; +Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek, Johannes Berg

On Thursday 03 May 2007, Alan Stern wrote:

> First, as mentioned before this issue exists on only a small number of
> systems.  Second, I have submitted to Greg KH a couple of patches to
> maintain persistence of USB devices even when the power sessions are lost
> (they're still in his queue so you can't try them out yet).  This feature
> violates the USB spec and it is potentially dangerous -- users could
> easily lose data

... which is why I don't like having it as any kind of option.

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-03 20:33                                           ` David Brownell
@ 2007-05-03 20:51                                             ` Rafael J. Wysocki
  2007-05-04 14:51                                             ` Alan Stern
  1 sibling, 0 replies; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-03 20:51 UTC (permalink / raw)
  To: David Brownell
  Cc: linux-pm, Pekka Enberg, Johannes Berg, Pavel Machek,
	Nigel Cunningham

On Thursday, 3 May 2007 22:33, David Brownell wrote:
> On Thursday 03 May 2007, Alan Stern wrote:
> 
> > In fact, shouldn't the poweroff at the end of a hibernate be exactly the 
> > same as a normal non-hibernate poweroff? 
> 
> No.  One of the differences between ACPI S4 (hibernate)
> and S5 (poweroff) states is for example how wakeup behaves.
> Look for example at /proc/acpi/wakeup and see how many
> devices are listed as "can wake from S5" vs from S4 ...
> most systems support some S4 events, not so for S5.
> 
> Another is that S4 can consume more power.
> 
> (Although I believe I noticed a regression there in recent
> kernels ... previously I was able to trigger wakeup from
> hibernation using the RTC, but not with 2.6.21 patches.)

May I ask you to test a patch (appended)?

Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>

In the platform mode of hibernation swsusp calls (indirectly) the function
acpi_pm_finish() in the nonerror resume-during-hibernation code paths, which
is wrong, because this function effectively aborts the power transition and
disables the wake-up capability of devices.  Fix it.

Remove references to the platform functions from the snapshot restore code path
in kernel/power/user.c , since they should not be there.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---

 kernel/power/disk.c |    1 -
 kernel/power/user.c |   15 +++------------
 2 files changed, 3 insertions(+), 13 deletions(-)

Index: linux-2.6.21/kernel/power/disk.c
===================================================================
--- linux-2.6.21.orig/kernel/power/disk.c	2007-05-03 12:24:05.000000000 +0200
+++ linux-2.6.21/kernel/power/disk.c	2007-05-03 14:42:26.000000000 +0200
@@ -195,7 +195,6 @@ int hibernate(void)
 
 	if (in_suspend) {
 		enable_nonboot_cpus();
-		platform_finish();
 		device_resume();
 		resume_console();
 		pr_debug("PM: writing image.\n");
Index: linux-2.6.21/kernel/power/user.c
===================================================================
--- linux-2.6.21.orig/kernel/power/user.c	2007-05-03 12:22:57.000000000 +0200
+++ linux-2.6.21/kernel/power/user.c	2007-05-03 14:40:49.000000000 +0200
@@ -169,7 +169,7 @@ static inline int snapshot_suspend(int p
 	}
 	enable_nonboot_cpus();
  Resume_devices:
-	if (platform_suspend)
+	if (platform_suspend && (!in_suspend || error))
 		platform_finish();
 
 	device_resume();
@@ -179,17 +179,12 @@ static inline int snapshot_suspend(int p
 	return error;
 }
 
-static inline int snapshot_restore(int platform_suspend)
+static inline int snapshot_restore(void)
 {
 	int error;
 
 	mutex_lock(&pm_mutex);
 	pm_prepare_console();
-	if (platform_suspend) {
-		error = platform_prepare();
-		if (error)
-			goto Finish;
-	}
 	suspend_console();
 	error = device_suspend(PMSG_PRETHAW);
 	if (error)
@@ -201,12 +196,8 @@ static inline int snapshot_restore(int p
 
 	enable_nonboot_cpus();
  Resume_devices:
-	if (platform_suspend)
-		platform_finish();
-
 	device_resume();
 	resume_console();
- Finish:
 	pm_restore_console();
 	mutex_unlock(&pm_mutex);
 	return error;
@@ -272,7 +263,7 @@ static int snapshot_ioctl(struct inode *
 			error = -EPERM;
 			break;
 		}
-		error = snapshot_restore(data->platform_suspend);
+		error = snapshot_restore();
 		break;
 
 	case SNAPSHOT_FREE:

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-03 14:00                                         ` Alan Stern
  2007-05-03 17:17                                           ` Rafael J. Wysocki
  2007-05-03 20:33                                           ` David Brownell
@ 2007-05-03 22:18                                           ` Pavel Machek
  2007-05-04 14:57                                             ` Alan Stern
  2 siblings, 1 reply; 117+ messages in thread
From: Pavel Machek @ 2007-05-03 22:18 UTC (permalink / raw)
  To: Alan Stern; +Cc: Johannes Berg, Pekka Enberg, linux-pm, Nigel Cunningham

Hi!

> > > Well... the powerdown during hibernation... does not have _anything_
> > > to do with snapshot/restore. It is really a very deep sleep; similar
> > > to soft powerdown, but not quite.
> 
> Is this really a good idea?

We have no other choice. ACPI spec says we should use S4.

> For that matter, what are the differences among the various sorts of 
> poweroff?
> 
> 	Which devices remain minimally powered for wakeup purposes?
> 
> 	Anything else?

Blinking moon LED.

Unfortunately if we do normal powerdown, we'll confuse ACPI BIOS.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: ACPI code in platform mode hibernation code paths (was: Re: [PATCH] swsusp: do not use pm_ops)
       [not found]                                         ` <200705022126.47897.rjw@sisk.pl>
@ 2007-05-03 22:48                                           ` Pavel Machek
       [not found]                                           ` <20070503224807.GD13426@elf.ucw.cz>
  1 sibling, 0 replies; 117+ messages in thread
From: Pavel Machek @ 2007-05-03 22:48 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, ACPI Devel Maling List, Pekka Enberg,
	Johannes Berg, linux-pm

Hi!

Crazy idea... could we kill hibernate_ops-like struct, and just create
a device for ACPI, using its suspend()/resume()/whatever callbacks to
do the ACPI magic?

> Okay.  Since we're trying to separate the hibernation code from the
> suspend code anyway, we can use the opportunity to introduce some new
> callbacks for the hibernation and/or redefine the existing ones.
> 
> The spec suggests that we need the following callbacks:
> 
> (1) prepare() - called after device_suspend(), execute _PTS and
> disable GPEs

sysdev .suspend() method would do the trick.

> (2) cancel() - called at any time after prepare() if there's an error, execute
>     _WAK and enable run-time GPEs

sysdev .resume() should do the trick. 

> (3) enter() - called after the image has been saved, execute _GTS and do what's
>     currently done in pm_enter()

This one is tricky. It is essentially
powerdown_but_enter_S4_instead. I guess we can live with if()... as we
need to special-case reboot in the same place.

> (4) finish() - called after the image has been restored, do what's currently
>     done in pm_finish()

platform (?) device .resume() method should work.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: ACPI code in platform mode hibernation code paths (was: Re: [PATCH] swsusp: do not use pm_ops)
       [not found]                                           ` <20070503224807.GD13426@elf.ucw.cz>
@ 2007-05-03 23:14                                             ` Rafael J. Wysocki
  2007-05-04 10:54                                             ` Johannes Berg
       [not found]                                             ` <1178276072.7408.7.camel@johannes.berg>
  2 siblings, 0 replies; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-03 23:14 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Nigel Cunningham, ACPI Devel Maling List, Pekka Enberg,
	Johannes Berg, linux-pm

Hi,

On Friday, 4 May 2007 00:48, Pavel Machek wrote:
> Hi!
> 
> Crazy idea... could we kill hibernate_ops-like struct, and just create
> a device for ACPI, using its suspend()/resume()/whatever callbacks to
> do the ACPI magic?

Hmm, I didn't think about that.  It seems to be viable at first sight.

Still, I think we can first separate hibernation_ops from pm_ops, figure out
what they should be and then try to replace them with a cleaner solution.

> > Okay.  Since we're trying to separate the hibernation code from the
> > suspend code anyway, we can use the opportunity to introduce some new
> > callbacks for the hibernation and/or redefine the existing ones.
> > 
> > The spec suggests that we need the following callbacks:

In fact, I should have added

(0) start() - called before device_suspend(), execute _TTS(S4)

and I'm not sure if the GPEs should be disabled here or in prepare()

In principle this could be done as a device's .resume() call, but that would
have to be the very first device registered (can we do that?).

> > (1) prepare() - called after device_suspend(), execute _PTS and
> > disable GPEs
> 
> sysdev .suspend() method would do the trick.

Yes.

> > (2) cancel() - called at any time after prepare() if there's an error, execute
> >     _WAK and enable run-time GPEs
> 
> sysdev .resume() should do the trick.

But .resume() would be called unconditionally, so there should be a way of
figuring out what to do - looks complicated.
 
> > (3) enter() - called after the image has been saved, execute _GTS and do what's
> >     currently done in pm_enter()
> 
> This one is tricky. It is essentially
> powerdown_but_enter_S4_instead. I guess we can live with if()... as we
> need to special-case reboot in the same place.

Yes.

> > (4) finish() - called after the image has been restored, do what's currently
> >     done in pm_finish()
>
> platform (?) device .resume() method should work.

Hmm, perhaps.

And we need one more (in fact this one should be called finish() and the
previous one wake() or something like that):

(5) finish() - called after device_resume(), but only after the image has been
restored or in case of a hibernation error, execute _TTS(S0).  It looks like
this also should enable the GPEs (or the previous one; that's the information
I'm looking for).

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: ACPI code in platform mode hibernation code paths (was: Re: [PATCH] swsusp: do not use pm_ops)
       [not found]                                           ` <20070503224807.GD13426@elf.ucw.cz>
  2007-05-03 23:14                                             ` Rafael J. Wysocki
@ 2007-05-04 10:54                                             ` Johannes Berg
       [not found]                                             ` <1178276072.7408.7.camel@johannes.berg>
  2 siblings, 0 replies; 117+ messages in thread
From: Johannes Berg @ 2007-05-04 10:54 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Nigel Cunningham, ACPI Devel Maling List, Pekka Enberg, linux-pm


[-- Attachment #1.1: Type: text/plain, Size: 405 bytes --]

On Fri, 2007-05-04 at 00:48 +0200, Pavel Machek wrote:

> Crazy idea... could we kill hibernate_ops-like struct, and just create
> a device for ACPI, using its suspend()/resume()/whatever callbacks to
> do the ACPI magic?

Doesn't that have the ordering problem again? You must ensure that this
sysdev is suspended as the last one, and that's currently impossible if
ACPI is modular.

johannes

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: ACPI code in platform mode hibernation code paths (was: Re: [PATCH] swsusp: do not use pm_ops)
       [not found]                                             ` <1178276072.7408.7.camel@johannes.berg>
@ 2007-05-04 12:08                                               ` Pavel Machek
       [not found]                                               ` <20070504120802.GF13426@elf.ucw.cz>
  1 sibling, 0 replies; 117+ messages in thread
From: Pavel Machek @ 2007-05-04 12:08 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Nigel Cunningham, ACPI Devel Maling List, Pekka Enberg, linux-pm

Hi!

> > Crazy idea... could we kill hibernate_ops-like struct, and just create
> > a device for ACPI, using its suspend()/resume()/whatever callbacks to
> > do the ACPI magic?
> 
> Doesn't that have the ordering problem again? You must ensure that this
> sysdev is suspended as the last one, and that's currently impossible if
> ACPI is modular.

I do not think acpi has these kinds of ordering requirements... (And I
do not see what it has to do with module or not).



-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: ACPI code in platform mode hibernation code paths (was: Re: [PATCH] swsusp: do not use pm_ops)
       [not found]                                               ` <20070504120802.GF13426@elf.ucw.cz>
@ 2007-05-04 12:29                                                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-04 12:29 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Nigel Cunningham, ACPI Devel Maling List, Pekka Enberg,
	Johannes Berg, linux-pm

Hi,

On Friday, 4 May 2007 14:08, Pavel Machek wrote:
> Hi!
> 
> > > Crazy idea... could we kill hibernate_ops-like struct, and just create
> > > a device for ACPI, using its suspend()/resume()/whatever callbacks to
> > > do the ACPI magic?
> > 
> > Doesn't that have the ordering problem again? You must ensure that this
> > sysdev is suspended as the last one, and that's currently impossible if
> > ACPI is modular.
> 
> I do not think acpi has these kinds of ordering requirements... (And I
> do not see what it has to do with module or not).

Theoretically, ACPI has some ordering requirements.  For example, according to
the spec, the _PTS system-control method should be executed *after* devices are
placed in the appropriate Dx states, which (theoretically) requires us to
execute it after device_suspend() (we don't do this in practice, but I think we
should).

There are some more ordering assumptions like this in the spec and I think
we should at least try to follow them or, if that breaks things, document why
we don't.

That's why I think we should try to do what's needed using hibernation_ops 
(perhaps we'll need to add a couple of callbacks to hibernation_ops) and
then try to replace hibernation_ops with another mechanism allowing us to do
the same.  We first need to determine which operations have to be carried out
at what points so that things don't break.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-03 20:21                                                   ` Rafael J. Wysocki
@ 2007-05-04 14:40                                                     ` Alan Stern
  2007-05-04 20:20                                                       ` Rafael J. Wysocki
                                                                         ` (2 more replies)
  0 siblings, 3 replies; 117+ messages in thread
From: Alan Stern @ 2007-05-04 14:40 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek,
	Linux-pm mailing list, Johannes Berg

Rafael, David, and Pavel:

You all misunderstood the point I was trying to make.

On Thu, 3 May 2007, Rafael J. Wysocki wrote:

> > But why shouldn't a "normal" poweroff enter ACPI S4?  And why shouldn't a 
> > "hibernate" poweroff enter ACPI S5?  The choice of which state to enter is 
> > independent of the reason for shutting down, right?
> 
> Well, not exactly.
> 
> > In other words, the choice for whether or not to call
> > acpi_enter_sleep_state(ACPI_STATE_S4) shouldn't depend on whether or not 
> > you're hibernating.  So it shouldn't affect the usage of hibernation_ops 
> > at all.
> 
> This works the other way around, I think. :-)
> 
> Granted, some boxes require us to call acpi_enter_sleep_state(ACPI_STATE_S4)
> as a 'power off method' so that they work correctly after the 'return' from hibernation.
> If we do acpi_enter_sleep_state(ACPI_STATE_S5) instead, some things might
> not work on them (this is an experimental observation, I don't know what
> exactly the reason of it is).
> 
> Now, since I have such a box, I need to do the
> acpi_enter_sleep_state(ACPI_STATE_S4) thing (IOW, use the 'platform' power off
> method) and not acpi_enter_sleep_state(ACPI_STATE_S5) (the 'shutdown' power
> off method).  *However*, acpi_enter_sleep_state(ACPI_STATE_S4) cannot be used
> without previous preparations, which are made with the help of hibernation_ops.
> 
> IOW, all hibernation_ops, including the ->enter() method that actually calls
> acpi_enter_sleep_state(ACPI_STATE_S4), are just different pieces of one
> (complicated) 'platform' power off method.  It doesn't make sense to use the
> (other) hibernation_ops without the ->enter() method.

Let's look at the big picture.

Entering hibernation basically involves these steps:

	1. Freeze tasks

	2. Quiesce devices and drivers

	3. Create snapshot

	4. Reactivate devices and drivers

	5. Save snapshot to disk

	6. Prepare devices for wakeup

	7. Power down (ACPI S4 on systems which support it)

Leaving hibernation involves a similar sequence which I won't discuss.

Notice that steps 1-5 above are _completely_ independent of all issues 
concerning wakeup devices and S4 vs. S5 vs. whatever.  They have to be 
carried out for hibernation to work, no matter how the system ends up 
getting shut down.

On the other hand, steps 6 and 7 aren't really needed for hibernation.  
You _could_ shut the system off completely (ACPI S5).  Automatic wakeup
wouldn't work, but the next time the user turned the computer on manually
it would still resume from hibernation.

Conversely, steps 6 and 7 can make sense even in situations where you
don't want to hibernate.  For example, you might want a normal shutdown in
which the operating system does a full restart when the firmware is
signalled by a wakeup device.

So there should be separate data structures associated with 1-5 and 6-7.  
Maybe the one associated with 6-7 is what you are calling hibernation_ops;  
if so then fine.  But I still think that it should be usable for
situations where you are not entering hibernation, and we should be 
possible to enter hibernation without using it.  The system administrator 
should be able to choose which of S4 or S5 gets used for _any_ poweroff, 
regardless of whether it's to start hibernating.

The ACPI spec might refer to S4 as "hibernation" (does it? -- I'm too lazy
to check and see), but that doesn't mean we have to use the terms
synonymously.

Does this make sense, or am I missing something very basic?

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-03 20:33                                           ` David Brownell
  2007-05-03 20:51                                             ` Rafael J. Wysocki
@ 2007-05-04 14:51                                             ` Alan Stern
  2007-05-04 14:56                                               ` Johannes Berg
  2007-05-04 22:00                                               ` David Brownell
  1 sibling, 2 replies; 117+ messages in thread
From: Alan Stern @ 2007-05-04 14:51 UTC (permalink / raw)
  To: David Brownell
  Cc: linux-pm, Pekka Enberg, Johannes Berg, Pavel Machek,
	Nigel Cunningham

On Thu, 3 May 2007, David Brownell wrote:

> On Thursday 03 May 2007, Alan Stern wrote:
> 
> > In fact, shouldn't the poweroff at the end of a hibernate be exactly the 
> > same as a normal non-hibernate poweroff? 
> 
> No.  One of the differences between ACPI S4 (hibernate)
> and S5 (poweroff) states is for example how wakeup behaves.
> Look for example at /proc/acpi/wakeup and see how many
> devices are listed as "can wake from S5" vs from S4 ...
> most systems support some S4 events, not so for S5.
> 
> Another is that S4 can consume more power.

You are describing the difference between ACPI S4 and S5, but I was 
talking about the difference between "normal" poweroff and "hibernate" 
poweroff.  There doesn't seem to be any reason why we must always have

	hibernate = S4    and     normal = S5.

> Non-ACPI systems can make the same natural distinctions.

On such systems there seems to be even less reason for those equalities 
(or rather, their analogs).


> > We are letting ourselves in for problems if we say that when the snapshot
> > is restored, devices may or may not need to be reinitialized. 
> 
> We have those problems already.

Exactly because we are waffling on this issue.  If we settled the matter 
once and for all (devices must ALWAYS be reinitialized after the snapshot 
is restored) then we wouldn't have those problems.  (We might have other 
problems though...)


> > Even worse, the device may _appear_ not 
> > to need reinitialization because the firmware (BIOS) has already
> > initialized it but left it in a state that's useless for the kernel's
> > purposes.  (That's part of the reason why PRETHAW was added.)
> 
> That's *ALL* of the reason for PRETHAW.  I asked the
> guy who did it.  ;)

Well, be fair.  If your resume methods had some way to know whether or not 
a snapshot had just been restored then you wouldn't have needed to add 
PRETHAW.  So another part of the reason is that restore() methods don't 
take a pm_message_t argument.


> > Why shouldn't the same devices work for wakeup from hibernate and wakeup 
> > from normal poweroff?
> 
> You're suggesting Linux not use the S5 state, essentially.

No, I'm suggesting that the user should be able to control whether Linux 
uses S4 vs. S5 at poweroff time.  If the user selected always to use S4 
then wakeup devices would function in both hibernation and normal 
shutdown.  If the user selected always to use S5 then wakeup devices would 
not function in either hibernation or normal shutdown.

> So the question is really "why should Linux use S5 (and similar
> states on non-ACPI systems), instead of disregarding the ACPI
> spec?"
> 
> The short answer:  having a "true OFF" state is valuable, if
> for no other reason than to cope with buggy "partial-ON" states
> like S4.  Also, it's not clear that disregarding ACPI's guidance
> here would be a good thing.

Which part of ACPI's so-called guidance are you referring to?

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 14:51                                             ` Alan Stern
@ 2007-05-04 14:56                                               ` Johannes Berg
  2007-05-04 20:27                                                 ` Rafael J. Wysocki
  2007-05-04 22:00                                               ` David Brownell
  1 sibling, 1 reply; 117+ messages in thread
From: Johannes Berg @ 2007-05-04 14:56 UTC (permalink / raw)
  To: Alan Stern; +Cc: linux-pm, Pekka Enberg, Nigel Cunningham, Pavel Machek


[-- Attachment #1.1: Type: text/plain, Size: 887 bytes --]

On Fri, 2007-05-04 at 10:51 -0400, Alan Stern wrote:

> Exactly because we are waffling on this issue.  If we settled the matter 
> once and for all (devices must ALWAYS be reinitialized after the snapshot 
> is restored) then we wouldn't have those problems.  (We might have other 
> problems though...)

From what I've understood so far, ACPI is very unhappy on some machines
if you go to S5 after hiberation. I still don't understand why, if the
ACPI code would properly re-initialise itself (treat ACPI as a device
and apply your "devices must ALWAYS be reinitialized after the snapshot
is restored") then this shouldn't be possible to happen.

And at that point I agree that the issue becomes completely orthogonal.

(btw, it's always possible right now to go to S5 instead of S4 when
doing hibernation simply by changing /sys/power/disk to "shutdown")

johannes

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-03 22:18                                           ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) Pavel Machek
@ 2007-05-04 14:57                                             ` Alan Stern
  2007-05-04 20:50                                               ` Rafael J. Wysocki
  0 siblings, 1 reply; 117+ messages in thread
From: Alan Stern @ 2007-05-04 14:57 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Johannes Berg, Pekka Enberg, linux-pm, Nigel Cunningham

On Fri, 4 May 2007, Pavel Machek wrote:

> Hi!
> 
> > > > Well... the powerdown during hibernation... does not have _anything_
> > > > to do with snapshot/restore. It is really a very deep sleep; similar
> > > > to soft powerdown, but not quite.
> > 
> > Is this really a good idea?
> 
> We have no other choice. ACPI spec says we should use S4.

I haven't checked the spec, but I find it hard to believe.  What could 
possibly be wrong with using S5?  It works just fine for normal poweroff, 
with no wakeup devices enabled.  Provided you don't enable the wakeup 
devices during hibernation, why not use S5?

> Unfortunately if we do normal powerdown, we'll confuse ACPI BIOS.

We do normal powerdown whenever someone shuts off his computer without 
hibernating.  I haven't noticed any ACPI BIOS confusion from that...

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 14:40                                                     ` Alan Stern
@ 2007-05-04 20:20                                                       ` Rafael J. Wysocki
  2007-05-04 20:21                                                         ` Johannes Berg
  2007-05-04 20:58                                                       ` Pavel Machek
  2007-05-04 21:40                                                       ` David Brownell
  2 siblings, 1 reply; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-04 20:20 UTC (permalink / raw)
  To: Alan Stern
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek,
	Linux-pm mailing list, Johannes Berg

On Friday, 4 May 2007 16:40, Alan Stern wrote:
> Rafael, David, and Pavel:
> 
> You all misunderstood the point I was trying to make.
> 
> On Thu, 3 May 2007, Rafael J. Wysocki wrote:
> 
> > > But why shouldn't a "normal" poweroff enter ACPI S4?  And why shouldn't a 
> > > "hibernate" poweroff enter ACPI S5?  The choice of which state to enter is 
> > > independent of the reason for shutting down, right?
> > 
> > Well, not exactly.
> > 
> > > In other words, the choice for whether or not to call
> > > acpi_enter_sleep_state(ACPI_STATE_S4) shouldn't depend on whether or not 
> > > you're hibernating.  So it shouldn't affect the usage of hibernation_ops 
> > > at all.
> > 
> > This works the other way around, I think. :-)
> > 
> > Granted, some boxes require us to call acpi_enter_sleep_state(ACPI_STATE_S4)
> > as a 'power off method' so that they work correctly after the 'return' from hibernation.
> > If we do acpi_enter_sleep_state(ACPI_STATE_S5) instead, some things might
> > not work on them (this is an experimental observation, I don't know what
> > exactly the reason of it is).
> > 
> > Now, since I have such a box, I need to do the
> > acpi_enter_sleep_state(ACPI_STATE_S4) thing (IOW, use the 'platform' power off
> > method) and not acpi_enter_sleep_state(ACPI_STATE_S5) (the 'shutdown' power
> > off method).  *However*, acpi_enter_sleep_state(ACPI_STATE_S4) cannot be used
> > without previous preparations, which are made with the help of hibernation_ops.
> > 
> > IOW, all hibernation_ops, including the ->enter() method that actually calls
> > acpi_enter_sleep_state(ACPI_STATE_S4), are just different pieces of one
> > (complicated) 'platform' power off method.  It doesn't make sense to use the
> > (other) hibernation_ops without the ->enter() method.
> 
> Let's look at the big picture.
> 
> Entering hibernation basically involves these steps:
> 
> 	1. Freeze tasks
> 
> 	2. Quiesce devices and drivers
> 
> 	3. Create snapshot
> 
> 	4. Reactivate devices and drivers
> 
> 	5. Save snapshot to disk
> 
> 	6. Prepare devices for wakeup
> 
> 	7. Power down (ACPI S4 on systems which support it)
> 
> Leaving hibernation involves a similar sequence which I won't discuss.
> 
> Notice that steps 1-5 above are _completely_ independent of all issues 
> concerning wakeup devices and S4 vs. S5 vs. whatever.  They have to be 
> carried out for hibernation to work, no matter how the system ends up 
> getting shut down.
> 
> On the other hand, steps 6 and 7 aren't really needed for hibernation.  
> You _could_ shut the system off completely (ACPI S5).  Automatic wakeup
> wouldn't work, but the next time the user turned the computer on manually
> it would still resume from hibernation.

That's correct, with the exception that the user may find the system not fully
functional after the resume in that case.

> Conversely, steps 6 and 7 can make sense even in situations where you
> don't want to hibernate.  For example, you might want a normal shutdown in
> which the operating system does a full restart when the firmware is
> signalled by a wakeup device.
> 
> So there should be separate data structures associated with 1-5 and 6-7.  
> Maybe the one associated with 6-7 is what you are calling hibernation_ops;  
> if so then fine.  But I still think that it should be usable for
> situations where you are not entering hibernation, and we should be 
> possible to enter hibernation without using it.  The system administrator 
> should be able to choose which of S4 or S5 gets used for _any_ poweroff, 
> regardless of whether it's to start hibernating.

Yes, this should be doable.
 
> The ACPI spec might refer to S4 as "hibernation" (does it? -- I'm too lazy
> to check and see),

Not directly.  The word "hibernation" is never used in the ACPI specification
(as of ACPI 2.0).

> but that doesn't mean we have to use the terms synonymously.

Agreed.

> Does this make sense, or am I missing something very basic?

Hmm, I think it makes sense.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 20:20                                                       ` Rafael J. Wysocki
@ 2007-05-04 20:21                                                         ` Johannes Berg
  2007-05-04 20:55                                                           ` Pavel Machek
  2007-05-04 21:06                                                           ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) Rafael J. Wysocki
  0 siblings, 2 replies; 117+ messages in thread
From: Johannes Berg @ 2007-05-04 20:21 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek,
	Linux-pm mailing list


[-- Attachment #1.1: Type: text/plain, Size: 762 bytes --]

On Fri, 2007-05-04 at 22:20 +0200, Rafael J. Wysocki wrote:

> > On the other hand, steps 6 and 7 aren't really needed for hibernation.  
> > You _could_ shut the system off completely (ACPI S5).  Automatic wakeup
> > wouldn't work, but the next time the user turned the computer on manually
> > it would still resume from hibernation.
> 
> That's correct, with the exception that the user may find the system not fully
> functional after the resume in that case.

Why is that anyway? Is it just a matter of the acpi code getting
confused about the acpi bios state? How can the acpi bios possibly be
screwed up after what it must see as a fresh boot? Does the acpi code
poke it in ways it's not supposed to be poked after a fresh boot?

johannes

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 14:56                                               ` Johannes Berg
@ 2007-05-04 20:27                                                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-04 20:27 UTC (permalink / raw)
  To: linux-pm; +Cc: Johannes Berg, Pekka Enberg, Nigel Cunningham, Pavel Machek

On Friday, 4 May 2007 16:56, Johannes Berg wrote:
> On Fri, 2007-05-04 at 10:51 -0400, Alan Stern wrote:
> 
> > Exactly because we are waffling on this issue.  If we settled the matter 
> > once and for all (devices must ALWAYS be reinitialized after the snapshot 
> > is restored) then we wouldn't have those problems.  (We might have other 
> > problems though...)
> 
> From what I've understood so far, ACPI is very unhappy on some machines
> if you go to S5 after hiberation. I still don't understand why, if the
> ACPI code would properly re-initialise itself (treat ACPI as a device
> and apply your "devices must ALWAYS be reinitialized after the snapshot
> is restored") then this shouldn't be possible to happen.

I agree, and that's why I suspect that the ACPI driver's .resume() routines
make some, well, ACPIish assumptions about the resume from hibernation, which
is the source of the problem.  If we separate the hibernation code from the
suspend (s2ram, standby) code completely, this issue will have to be resolved
somehow.

> And at that point I agree that the issue becomes completely orthogonal.
> 
> (btw, it's always possible right now to go to S5 instead of S4 when
> doing hibernation simply by changing /sys/power/disk to "shutdown")

That's correct.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 20:50                                               ` Rafael J. Wysocki
@ 2007-05-04 20:49                                                 ` Johannes Berg
  2007-05-04 21:11                                                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 117+ messages in thread
From: Johannes Berg @ 2007-05-04 20:49 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, Pekka Enberg, Nigel Cunningham, Pavel Machek


[-- Attachment #1.1: Type: text/plain, Size: 332 bytes --]

On Fri, 2007-05-04 at 22:50 +0200, Rafael J. Wysocki wrote:

> To prevent this from happening, we need a separate set of hibernation callbacks
> in device drivers.

You *can* actually do that now with prethaw and all that afaict. But all
the more argument for splitting up the callbacks as discussed
previously.

johannes

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 14:57                                             ` Alan Stern
@ 2007-05-04 20:50                                               ` Rafael J. Wysocki
  2007-05-04 20:49                                                 ` Johannes Berg
  0 siblings, 1 reply; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-04 20:50 UTC (permalink / raw)
  To: Alan Stern
  Cc: linux-pm, Pekka Enberg, Johannes Berg, Pavel Machek,
	Nigel Cunningham

On Friday, 4 May 2007 16:57, Alan Stern wrote:
> On Fri, 4 May 2007, Pavel Machek wrote:
> 
> > Hi!
> > 
> > > > > Well... the powerdown during hibernation... does not have _anything_
> > > > > to do with snapshot/restore. It is really a very deep sleep; similar
> > > > > to soft powerdown, but not quite.
> > > 
> > > Is this really a good idea?
> > 
> > We have no other choice. ACPI spec says we should use S4.
> 
> I haven't checked the spec, but I find it hard to believe.  What could 
> possibly be wrong with using S5?  It works just fine for normal poweroff, 
> with no wakeup devices enabled.  Provided you don't enable the wakeup 
> devices during hibernation, why not use S5?

I think the problem is the "reinitialize from scratch after the resume" part.

If we're waking up from the hibernation, device drivers should reinitialize
their devices, but if we're waking up from a suspend (eg. s2ram), it would be
wrong to reinitialize, for example, the ACPI subsystem from scratch.  Now,
the problem is that the drivers (including ACPI drivers) cannot tell whether
the resume is from hibernation or from suspend so they try to do something
"generic".  This may lead to having the system not fully functional after the
resume from hibernation if we don't tell the ACPI BIOS that we're hibernating
(by entering the S4 state instead of S5).

> > Unfortunately if we do normal powerdown, we'll confuse ACPI BIOS.
> 
> We do normal powerdown whenever someone shuts off his computer without 
> hibernating.  I haven't noticed any ACPI BIOS confusion from that...

In fact, I think, the BIOS isn't confused, but it may preserve some state
information that the OS can use later on.  By entering S4 we tell the BIOS
to tell the "next" kernel that we've hibernated and to preserve some
configuration information for it.  If this information is not present, our own
ACPI drivers get confised during the resume.

To prevent this from happening, we need a separate set of hibernation callbacks
in device drivers.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 20:21                                                         ` Johannes Berg
@ 2007-05-04 20:55                                                           ` Pavel Machek
  2007-05-04 21:08                                                             ` Johannes Berg
  2007-05-04 21:06                                                           ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) Rafael J. Wysocki
  1 sibling, 1 reply; 117+ messages in thread
From: Pavel Machek @ 2007-05-04 20:55 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Nigel Cunningham, Pekka Enberg, Linux-pm mailing list

Hi!

> > > On the other hand, steps 6 and 7 aren't really needed for hibernation.  
> > > You _could_ shut the system off completely (ACPI S5).  Automatic wakeup
> > > wouldn't work, but the next time the user turned the computer on manually
> > > it would still resume from hibernation.
> > 
> > That's correct, with the exception that the user may find the system not fully
> > functional after the resume in that case.
> 
> Why is that anyway? Is it just a matter of the acpi code getting
> confused about the acpi bios state? How can the acpi bios possibly be
> screwed up after what it must see as a fresh boot? Does the acpi code
> poke it in ways it's not supposed to be poked after a fresh boot?

No, ACPI BIOS does not see a fresh boot.

ACPI BIOS communicates with hw, too. Suppose it generates random
number, stores it in memory and tells it to the keyboard conroller
during bootup (more specifically during ACPI enable phase).

Now, it periodically checks if number in memory is same as the number
known by keyboard controller.

If you suspend/resume without telling acpi, it will find out, because
numbers will not match.

(And now, ACPI is probably not crazy enough to store random numbers --
but it could -- but for example "I had AC power, now I do not, and I
did not see a interrupt telling me it went away" can be counted as
confusing for ACPI).
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 14:40                                                     ` Alan Stern
  2007-05-04 20:20                                                       ` Rafael J. Wysocki
@ 2007-05-04 20:58                                                       ` Pavel Machek
  2007-05-04 21:24                                                         ` Rafael J. Wysocki
  2007-05-04 21:40                                                       ` David Brownell
  2 siblings, 1 reply; 117+ messages in thread
From: Pavel Machek @ 2007-05-04 20:58 UTC (permalink / raw)
  To: Alan Stern
  Cc: Nigel Cunningham, Pekka Enberg, Linux-pm mailing list,
	Johannes Berg

Hi!

> You all misunderstood the point I was trying to make.
> 

> > acpi_enter_sleep_state(ACPI_STATE_S4), are just different pieces of one
> > (complicated) 'platform' power off method.  It doesn't make sense to use the
> > (other) hibernation_ops without the ->enter() method.
> 
> Let's look at the big picture.
> 
> Entering hibernation basically involves these steps:
> 
> 	1. Freeze tasks
> 
> 	2. Quiesce devices and drivers
> 
> 	3. Create snapshot
> 
> 	4. Reactivate devices and drivers
> 
> 	5. Save snapshot to disk
> 
> 	6. Prepare devices for wakeup
> 
> 	7. Power down (ACPI S4 on systems which support it)
> 
> Leaving hibernation involves a similar sequence which I won't discuss.
> 
> Notice that steps 1-5 above are _completely_ independent of all issues 
> concerning wakeup devices and S4 vs. S5 vs. whatever.  They have to
> be

No, they are not. You probably should tell ACPI at step 2 that you are
suspending, and you definitely need to tell ACPI that you have resumed
(so it can re-scan AC adapters, for example).
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 20:21                                                         ` Johannes Berg
  2007-05-04 20:55                                                           ` Pavel Machek
@ 2007-05-04 21:06                                                           ` Rafael J. Wysocki
  1 sibling, 0 replies; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-04 21:06 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek,
	Linux-pm mailing list

On Friday, 4 May 2007 22:21, Johannes Berg wrote:
> On Fri, 2007-05-04 at 22:20 +0200, Rafael J. Wysocki wrote:
> 
> > > On the other hand, steps 6 and 7 aren't really needed for hibernation.  
> > > You _could_ shut the system off completely (ACPI S5).  Automatic wakeup
> > > wouldn't work, but the next time the user turned the computer on manually
> > > it would still resume from hibernation.
> > 
> > That's correct, with the exception that the user may find the system not fully
> > functional after the resume in that case.
> 
> Why is that anyway? Is it just a matter of the acpi code getting
> confused about the acpi bios state?

Yes, I think so.

> How can the acpi bios possibly be screwed up after what it must see as a
> fresh boot? Does the acpi code poke it in ways it's not supposed to be poked
> after a fresh boot?

Sort of.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 20:55                                                           ` Pavel Machek
@ 2007-05-04 21:08                                                             ` Johannes Berg
  2007-05-04 21:15                                                               ` Pavel Machek
  0 siblings, 1 reply; 117+ messages in thread
From: Johannes Berg @ 2007-05-04 21:08 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Nigel Cunningham, Pekka Enberg, Linux-pm mailing list


[-- Attachment #1.1: Type: text/plain, Size: 1569 bytes --]

On Fri, 2007-05-04 at 22:55 +0200, Pavel Machek wrote:

> > Why is that anyway? Is it just a matter of the acpi code getting
> > confused about the acpi bios state? How can the acpi bios possibly be
> > screwed up after what it must see as a fresh boot? Does the acpi code
> > poke it in ways it's not supposed to be poked after a fresh boot?
> 
> No, ACPI BIOS does not see a fresh boot.

Sure. It just booted the machine so it must see it as a fresh boot.

> ACPI BIOS communicates with hw, too. Suppose it generates random
> number, stores it in memory and tells it to the keyboard conroller
> during bootup (more specifically during ACPI enable phase).
> 
> Now, it periodically checks if number in memory is same as the number
> known by keyboard controller.
> 
> If you suspend/resume without telling acpi, it will find out, because
> numbers will not match.
> 
> (And now, ACPI is probably not crazy enough to store random numbers --
> but it could -- but for example "I had AC power, now I do not, and I
> did not see a interrupt telling me it went away" can be counted as
> confusing for ACPI).

I don't follow.

 * you have AC power.
 * you save system state and shut down (S5)
 * you boot up again on battery power
 * you restore system state
 * ...

vs.

 * you have AC power
 * you shut down
 * you boot up again on battery power
 * ...

where's the difference to the ACPI bios? Oh, I see, it stores it
somewhere in the memory that you've stored/restored? Well, that's your
bug then, don't touch it.

johannes

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 20:49                                                 ` Johannes Berg
@ 2007-05-04 21:11                                                   ` Rafael J. Wysocki
  2007-05-04 21:23                                                     ` Johannes Berg
  0 siblings, 1 reply; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-04 21:11 UTC (permalink / raw)
  To: Johannes Berg; +Cc: linux-pm, Pekka Enberg, Nigel Cunningham, Pavel Machek

On Friday, 4 May 2007 22:49, Johannes Berg wrote:
> On Fri, 2007-05-04 at 22:50 +0200, Rafael J. Wysocki wrote:
> 
> > To prevent this from happening, we need a separate set of hibernation callbacks
> > in device drivers.
> 
> You *can* actually do that now with prethaw and all that afaict.

Actually, prethaw is to prevent drivers loaded before the image is restored
from doing unreasonable things.  It doesn't have any effect on the drivers'
.resume() routines.

Besides, if the drivers in question are compiled as modules and not loaded
before the image is restored, prethaw doesn't have any effect on them and
on their devices at all. ;-)

> But all the more argument for splitting up the callbacks as discussed
> previously.

Yes.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 21:08                                                             ` Johannes Berg
@ 2007-05-04 21:15                                                               ` Pavel Machek
  2007-05-04 21:53                                                                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 117+ messages in thread
From: Pavel Machek @ 2007-05-04 21:15 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Nigel Cunningham, Pekka Enberg, Linux-pm mailing list

Hi!

> > ACPI BIOS communicates with hw, too. Suppose it generates random
> > number, stores it in memory and tells it to the keyboard conroller
> > during bootup (more specifically during ACPI enable phase).
> > 
> > Now, it periodically checks if number in memory is same as the number
> > known by keyboard controller.
> > 
> > If you suspend/resume without telling acpi, it will find out, because
> > numbers will not match.
> > 
> > (And now, ACPI is probably not crazy enough to store random numbers --
> > but it could -- but for example "I had AC power, now I do not, and I
> > did not see a interrupt telling me it went away" can be counted as
> > confusing for ACPI).
> 
> I don't follow.
> 
>  * you have AC power.
>  * you save system state and shut down (S5)
>  * you boot up again on battery power
>  * you restore system state
>  * ...
> 
> vs.
> 
>  * you have AC power
>  * you shut down
>  * you boot up again on battery power
>  * ...
> 
> where's the difference to the ACPI bios? Oh, I see, it stores it
> somewhere in the memory that you've stored/restored? Well, that's your
> bug then, don't touch it.

Not sure... yes, it stores parts somewhere in memory. Plus, it may
have some parts related to the communications with operating system
(*)... I guess we need to save those, and parts related to hw
state... where your suggestion makes sense.

(*) and yes, there probably are such parts. If we set backlight to
20%, we'll be confused if it is 100% after resume... we probably could
handle those one-by-one...
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 21:11                                                   ` Rafael J. Wysocki
@ 2007-05-04 21:23                                                     ` Johannes Berg
  2007-05-04 21:55                                                       ` Rafael J. Wysocki
  2007-05-05 16:15                                                       ` Alan Stern
  0 siblings, 2 replies; 117+ messages in thread
From: Johannes Berg @ 2007-05-04 21:23 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, Pekka Enberg, Nigel Cunningham, Pavel Machek


[-- Attachment #1.1: Type: text/plain, Size: 389 bytes --]

On Fri, 2007-05-04 at 23:11 +0200, Rafael J. Wysocki wrote:

> Actually, prethaw is to prevent drivers loaded before the image is restored
> from doing unreasonable things.  It doesn't have any effect on the drivers'
> .resume() routines.

Oh, but it can, you could have a flag in your driver saying "the next
resume is after restore" and you set that flag in prethaw.

johannes

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 20:58                                                       ` Pavel Machek
@ 2007-05-04 21:24                                                         ` Rafael J. Wysocki
  2007-05-05 16:19                                                           ` Alan Stern
  0 siblings, 1 reply; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-04 21:24 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Nigel Cunningham, Pekka Enberg, Linux-pm mailing list,
	Johannes Berg

Hi,

On Friday, 4 May 2007 22:58, Pavel Machek wrote:
> Hi!
> 
> > You all misunderstood the point I was trying to make.
> > 
> 
> > > acpi_enter_sleep_state(ACPI_STATE_S4), are just different pieces of one
> > > (complicated) 'platform' power off method.  It doesn't make sense to use the
> > > (other) hibernation_ops without the ->enter() method.
> > 
> > Let's look at the big picture.
> > 
> > Entering hibernation basically involves these steps:
> > 
> > 	1. Freeze tasks
> > 
> > 	2. Quiesce devices and drivers
> > 
> > 	3. Create snapshot
> > 
> > 	4. Reactivate devices and drivers
> > 
> > 	5. Save snapshot to disk
> > 
> > 	6. Prepare devices for wakeup
> > 
> > 	7. Power down (ACPI S4 on systems which support it)
> > 
> > Leaving hibernation involves a similar sequence which I won't discuss.
> > 
> > Notice that steps 1-5 above are _completely_ independent of all issues 
> > concerning wakeup devices and S4 vs. S5 vs. whatever.  They have to
> > be
> 
> No, they are not. You probably should tell ACPI at step 2 that you are
> suspending,

You can, but even if you don't, the BIOS shouldn't have problems.  What might
have problems is our ACPI code during the resume, if it cannot get appropriate
information from the BIOS.

> and you definitely need to tell ACPI that you have resumed 
> (so it can re-scan AC adapters, for example).

Yes, but that can be done in two different ways:

1) "We have restored the hibernation image, but the BIOS state corresponds to
a fresh reboot, so please initialize everything from scratch."

2) "We have restored the hibernation image and the ACPI S4 was used for
powering off (hint: you may try not to initialize everything from scratch)."

Of course, in the case 2) we are responsible for ensuring that the contents of
the hibernation image are consistent with the information preserved by the
BIOS.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 14:40                                                     ` Alan Stern
  2007-05-04 20:20                                                       ` Rafael J. Wysocki
  2007-05-04 20:58                                                       ` Pavel Machek
@ 2007-05-04 21:40                                                       ` David Brownell
  2007-05-04 22:19                                                         ` Rafael J. Wysocki
  2007-05-05 16:08                                                         ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) Alan Stern
  2 siblings, 2 replies; 117+ messages in thread
From: David Brownell @ 2007-05-04 21:40 UTC (permalink / raw)
  To: Alan Stern
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek,
	Linux-pm mailing list, Johannes Berg

On Friday 04 May 2007, Alan Stern wrote:
> Rafael, David, and Pavel:
> 
> You all misunderstood the point I was trying to make.
> 
> ...
> 
> Let's look at the big picture.
> 
> Entering hibernation basically involves these steps:
> 
> 	1. Freeze tasks
> 	2. Quiesce devices and drivers
> 	3. Create snapshot
> 	4. Reactivate devices and drivers
> 	5. Save snapshot to disk
> 	6. Prepare devices for wakeup
> 	7. Power down (ACPI S4 on systems which support it)
> 
> Leaving hibernation involves a similar sequence which I won't discuss.
> 
> Notice that steps 1-5 above are _completely_ independent of all issues 
> concerning wakeup devices and S4 vs. S5 vs. whatever.  They have to be 
> carried out for hibernation to work, no matter how the system ends up 
> getting shut down.

Not exactly.  Step 2 is supposed to be aware of the target state's
capabilities, including what's wakeup-capable.  ACPI uses target
device states to choose which _SxD methods to execute, etc.  (Or it
should ... though come to think of it, I don't think I ever saw a
hook whereby PCI could trigger that.)


> On the other hand, steps 6 and 7 aren't really needed for hibernation.  
> You _could_ shut the system off completely (ACPI S5).  Automatic wakeup
> wouldn't work, but the next time the user turned the computer on manually
> it would still resume from hibernation.

I believe I did comment on your point that step 7 could use S5.

However, the ACPI spec *does* say up front (2.2 in ACPI 2.0C)
that S5 == G2 "Soft OFF" is not a "sleeping" (G1) state.  (Then
fuzzes the issue in 2.4, but those bits are less relevant here;
2.2 also mentions G3 = "Mechanical OFF", which is the only state
in which machine disassembly/reassembly is expected to be safe.

ACPI is allowed to distinguish between S4 and S5 in more ways
than just the power usage.  It'd be fair for the AML to store
state in something that retains power, and rely on that.  It'd
be better not to do things that are allowed to confuse ACPI.


> Conversely, steps 6 and 7 can make sense even in situations where you
> don't want to hibernate.  For example, you might want a normal shutdown in
> which the operating system does a full restart when the firmware is
> signalled by a wakeup device.

Wakeup devices in S4 are expected to be a superset of those in S5,
and system documentation often covers that.  Yeah, I know, "who
bothers to RTFM".  Still, the point is that these systems are now
documented to work in a particular way, and there really ought to
be a good reason to invalidate user training and documentation.

 
> So there should be separate data structures associated with 1-5 and 6-7.  
> Maybe the one associated with 6-7 is what you are calling hibernation_ops;  
> if so then fine.  But I still think that it should be usable for
> situations where you are not entering hibernation, and we should be 
> possible to enter hibernation without using it.  The system administrator 
> should be able to choose which of S4 or S5 gets used for _any_ poweroff, 
> regardless of whether it's to start hibernating.

But ... why?  What value would users see from that?

We do have /sys/power/disk today, but that's only for
hibernation.  (And it's a bit confusing, too.)

A "Soft OFF" should be S5 to conform to specs and
documentation.


> The ACPI spec might refer to S4 as "hibernation" (does it? -- I'm too lazy
> to check and see), but that doesn't mean we have to use the terms
> synonymously.

It talks S4 as a "sleeping" state, like S1, S2, and S3.
Or, about S4 as a "Non-Volatle sleep" state

I think it also assumes more intelligence on resume-from-S4
than Linux has just now, which may partly explain why it
takes so long for swsusp to finish its thing.

- Dave

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 21:53                                                                 ` Rafael J. Wysocki
@ 2007-05-04 21:53                                                                   ` Johannes Berg
  2007-05-04 22:25                                                                     ` Rafael J. Wysocki
  2007-05-05 15:52                                                                   ` Alan Stern
  1 sibling, 1 reply; 117+ messages in thread
From: Johannes Berg @ 2007-05-04 21:53 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek,
	Linux-pm mailing list


[-- Attachment #1.1: Type: text/plain, Size: 646 bytes --]

On Fri, 2007-05-04 at 23:53 +0200, Rafael J. Wysocki wrote:
> > Plus, it may have some parts related to the communications with operating
> > system (*)... I guess we need to save those, and parts related to hw
> > state... where your suggestion makes sense.
> 
> If they are accessible to us, then we can, but what if they aren't (eg. the
> state information is stored in the embedded controller, can only be read with
> the help of some AML invocations and cannot be changed from the OS level)?

Well, in that case you also haven't overwritten/changed them during
restore so there's no room for mismatches and confusion.

johannes

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 21:15                                                               ` Pavel Machek
@ 2007-05-04 21:53                                                                 ` Rafael J. Wysocki
  2007-05-04 21:53                                                                   ` Johannes Berg
  2007-05-05 15:52                                                                   ` Alan Stern
  0 siblings, 2 replies; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-04 21:53 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Nigel Cunningham, Pekka Enberg, Johannes Berg,
	Linux-pm mailing list

On Friday, 4 May 2007 23:15, Pavel Machek wrote:
> Hi!
> 
> > > ACPI BIOS communicates with hw, too. Suppose it generates random
> > > number, stores it in memory and tells it to the keyboard conroller
> > > during bootup (more specifically during ACPI enable phase).
> > > 
> > > Now, it periodically checks if number in memory is same as the number
> > > known by keyboard controller.
> > > 
> > > If you suspend/resume without telling acpi, it will find out, because
> > > numbers will not match.
> > > 
> > > (And now, ACPI is probably not crazy enough to store random numbers --
> > > but it could -- but for example "I had AC power, now I do not, and I
> > > did not see a interrupt telling me it went away" can be counted as
> > > confusing for ACPI).
> > 
> > I don't follow.
> > 
> >  * you have AC power.
> >  * you save system state and shut down (S5)
> >  * you boot up again on battery power
> >  * you restore system state
> >  * ...
> > 
> > vs.
> > 
> >  * you have AC power
> >  * you shut down
> >  * you boot up again on battery power
> >  * ...
> > 
> > where's the difference to the ACPI bios? Oh, I see, it stores it
> > somewhere in the memory that you've stored/restored? Well, that's your
> > bug then, don't touch it.
> 
> Not sure... yes, it stores parts somewhere in memory.

These are reserved regions.  On the majority of systems we handle them
correctly.

> Plus, it may have some parts related to the communications with operating
> system (*)... I guess we need to save those, and parts related to hw
> state... where your suggestion makes sense.

If they are accessible to us, then we can, but what if they aren't (eg. the
state information is stored in the embedded controller, can only be read with
the help of some AML invocations and cannot be changed from the OS level)?

> (*) and yes, there probably are such parts. If we set backlight to
> 20%, we'll be confused if it is 100% after resume... we probably could
> handle those one-by-one...

*If* we reinitialize devices *and* ACPI from scratch after restoring the image,
we'll discard the old value (20%) and read the new value (100%) from the BIOS.
The problems occur, IMO, because we try to be smart and use the BIOS
after the resume as though we'd resumed from a real suspend (eg. s2ram).

Which is natural, if we use the same set of .resume() callbacks for both cases.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 21:55                                                       ` Rafael J. Wysocki
@ 2007-05-04 21:54                                                         ` Johannes Berg
  2007-05-04 22:21                                                           ` Rafael J. Wysocki
  2007-05-04 22:12                                                         ` David Brownell
  1 sibling, 1 reply; 117+ messages in thread
From: Johannes Berg @ 2007-05-04 21:54 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, Pekka Enberg, Nigel Cunningham, Pavel Machek


[-- Attachment #1.1: Type: text/plain, Size: 755 bytes --]

On Fri, 2007-05-04 at 23:55 +0200, Rafael J. Wysocki wrote:
> On Friday, 4 May 2007 23:23, Johannes Berg wrote:
> > On Fri, 2007-05-04 at 23:11 +0200, Rafael J. Wysocki wrote:
> > 
> > > Actually, prethaw is to prevent drivers loaded before the image is restored
> > > from doing unreasonable things.  It doesn't have any effect on the drivers'
> > > .resume() routines.
> > 
> > Oh, but it can, you could have a flag in your driver saying "the next
> > resume is after restore" and you set that flag in prethaw.
> 
> No, you should have set that flag in .suspend(), really. :-)

Yeah, whatever. You can fix the problem but it's ugly. Let's come up
with a good way to do the 6 callbacks mentioned in some other thread
earlier.

johannes

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 21:23                                                     ` Johannes Berg
@ 2007-05-04 21:55                                                       ` Rafael J. Wysocki
  2007-05-04 21:54                                                         ` Johannes Berg
  2007-05-04 22:12                                                         ` David Brownell
  2007-05-05 16:15                                                       ` Alan Stern
  1 sibling, 2 replies; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-04 21:55 UTC (permalink / raw)
  To: Johannes Berg; +Cc: linux-pm, Pekka Enberg, Nigel Cunningham, Pavel Machek

On Friday, 4 May 2007 23:23, Johannes Berg wrote:
> On Fri, 2007-05-04 at 23:11 +0200, Rafael J. Wysocki wrote:
> 
> > Actually, prethaw is to prevent drivers loaded before the image is restored
> > from doing unreasonable things.  It doesn't have any effect on the drivers'
> > .resume() routines.
> 
> Oh, but it can, you could have a flag in your driver saying "the next
> resume is after restore" and you set that flag in prethaw.

No, you should have set that flag in .suspend(), really. :-)

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 14:51                                             ` Alan Stern
  2007-05-04 14:56                                               ` Johannes Berg
@ 2007-05-04 22:00                                               ` David Brownell
  2007-05-05 15:49                                                 ` Alan Stern
  1 sibling, 1 reply; 117+ messages in thread
From: David Brownell @ 2007-05-04 22:00 UTC (permalink / raw)
  To: Alan Stern
  Cc: linux-pm, Pekka Enberg, Johannes Berg, Pavel Machek,
	Nigel Cunningham

On Friday 04 May 2007, Alan Stern wrote:
> On Thu, 3 May 2007, David Brownell wrote:
> 
> > On Thursday 03 May 2007, Alan Stern wrote:
> > 
> > > In fact, shouldn't the poweroff at the end of a hibernate be exactly the 
> > > same as a normal non-hibernate poweroff? 
> > 
> > No.  One of the differences between ACPI S4 (hibernate)
> > and S5 (poweroff) states is for example how wakeup behaves.
> > Look for example at /proc/acpi/wakeup and see how many
> > devices are listed as "can wake from S5" vs from S4 ...
> > most systems support some S4 events, not so for S5.
> > 
> > Another is that S4 can consume more power.
> 
> You are describing the difference between ACPI S4 and S5, but I was 
> talking about the difference between "normal" poweroff and "hibernate" 
> poweroff.  There doesn't seem to be any reason why we must always have
> 
> 	hibernate = S4    and     normal = S5.

What the ACPI spec describes for the "Non-Volatile Sleep" is
that either S4 or S5 could match "hibernate" ... but for
a software-controlled "poweroff", only S5 is appropriate.

That's a reason.  Another:  pretty much all end-user docs
on this stuff match what ACPI says.

Lacking compelling reasons to violate specs (like them
being clearly broken), I avoid breaking them.


> > Non-ACPI systems can make the same natural distinctions.
> 
> On such systems there seems to be even less reason for those equalities 
> (or rather, their analogs).

This is one of those "less is more" things, right?  :)

People doing embedded designs _like_ their flexibility.

It's common to have multiple power levels.  If you mean
that they _could_ give up that flexibility and only use
one of those state analogues, yes they could ... but if
you mean they'd see that as a Good Thing, I doubt it.
 

> 
> > > We are letting ourselves in for problems if we say that when the snapshot
> > > is restored, devices may or may not need to be reinitialized. 
> > 
> > We have those problems already.
> 
> Exactly because we are waffling on this issue.  If we settled the matter 
> once and for all (devices must ALWAYS be reinitialized after the snapshot 
> is restored) then we wouldn't have those problems.  (We might have other 
> problems though...)

We *WOULD* have problems.

I guess I don't see why you want to throw away all the
work the hardware (and/or software) designers did to
ensure that some key devices use a "retention" mode
in their S4-analogue state.

Me, I always thought that leveraging those retention
states was a great way to shrink wakeup times and get
more functionality.


> > > Even worse, the device may _appear_ not 
> > > to need reinitialization because the firmware (BIOS) has already
> > > initialized it but left it in a state that's useless for the kernel's
> > > purposes.  (That's part of the reason why PRETHAW was added.)
> > 
> > That's *ALL* of the reason for PRETHAW.  I asked the
> > guy who did it.  ;)
> 
> Well, be fair.  If your resume methods had some way to know whether or not 
> a snapshot had just been restored then you wouldn't have needed to add 
> PRETHAW.  So another part of the reason is that restore() methods don't 
> take a pm_message_t argument.

Well, to be fair he says he didn't even consider such an
intrusive change.  The entire *reason* was to address that
particular issue.  Implementation tradeoffs are separate.


> > > Why shouldn't the same devices work for wakeup from hibernate and wakeup 
> > > from normal poweroff?
> > 
> > You're suggesting Linux not use the S5 state, essentially.
> 
> No, I'm suggesting that the user should be able to control whether Linux 
> uses S4 vs. S5 at poweroff time.  If the user selected always to use S4 
> then wakeup devices would function in both hibernation and normal 
> shutdown.  If the user selected always to use S5 then wakeup devices would 
> not function in either hibernation or normal shutdown.

That's a different suggestion, yes.  I'm not sure I see any
benefit of that flexibility for "soft off" states though,
especially if it made "off" consume more power.

 
> > So the question is really "why should Linux use S5 (and similar
> > states on non-ACPI systems), instead of disregarding the ACPI
> > spec?"
> > 
> > The short answer:  having a "true OFF" state is valuable, if
> > for no other reason than to cope with buggy "partial-ON" states
> > like S4.  Also, it's not clear that disregarding ACPI's guidance
> > here would be a good thing.
> 
> Which part of ACPI's so-called guidance are you referring to?

Section 2.2 of the spec I looked at, which defines how non-volatile
sleep relates to S4 and S5 states, and to the G3 "Mechanical OFF"
which could also be entered from either of those by flick'o'switch.

- Dave

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 21:55                                                       ` Rafael J. Wysocki
  2007-05-04 21:54                                                         ` Johannes Berg
@ 2007-05-04 22:12                                                         ` David Brownell
  2007-05-04 22:31                                                           ` Rafael J. Wysocki
  1 sibling, 1 reply; 117+ messages in thread
From: David Brownell @ 2007-05-04 22:12 UTC (permalink / raw)
  To: linux-pm; +Cc: Johannes Berg, Pekka Enberg, Nigel Cunningham, Pavel Machek

On Friday 04 May 2007, Rafael J. Wysocki wrote:
> On Friday, 4 May 2007 23:23, Johannes Berg wrote:
> > On Fri, 2007-05-04 at 23:11 +0200, Rafael J. Wysocki wrote:
> > 
> > > Actually, prethaw is to prevent drivers loaded before the image is restored
> > > from doing unreasonable things.  It doesn't have any effect on the drivers'
> > > .resume() routines.
> > 
> > Oh, but it can, you could have a flag in your driver saying "the next
> > resume is after restore" and you set that flag in prethaw.
> 
> No, you should have set that flag in .suspend(), really. :-)

That doesn't work very well.  Not only does suspend() not
know the target state, but you don't want to trash the
controller state if you're getting resumed after some kind
of fault in the suspend-to-disk path...

I'm hoping that explains the smiley!

- Dave

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 21:40                                                       ` David Brownell
@ 2007-05-04 22:19                                                         ` Rafael J. Wysocki
  2007-05-07  1:05                                                           ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...)) David Brownell
  2007-05-05 16:08                                                         ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) Alan Stern
  1 sibling, 1 reply; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-04 22:19 UTC (permalink / raw)
  To: David Brownell
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek,
	Linux-pm mailing list, Johannes Berg

On Friday, 4 May 2007 23:40, David Brownell wrote:
> On Friday 04 May 2007, Alan Stern wrote:
> > Rafael, David, and Pavel:
> > 
> > You all misunderstood the point I was trying to make.
> > 
> > ...
> > 
> > Let's look at the big picture.
> > 
> > Entering hibernation basically involves these steps:
> > 
> > 	1. Freeze tasks
> > 	2. Quiesce devices and drivers
> > 	3. Create snapshot
> > 	4. Reactivate devices and drivers
> > 	5. Save snapshot to disk
> > 	6. Prepare devices for wakeup
> > 	7. Power down (ACPI S4 on systems which support it)
> > 
> > Leaving hibernation involves a similar sequence which I won't discuss.
> > 
> > Notice that steps 1-5 above are _completely_ independent of all issues 
> > concerning wakeup devices and S4 vs. S5 vs. whatever.  They have to be 
> > carried out for hibernation to work, no matter how the system ends up 
> > getting shut down.
> 
> Not exactly.  Step 2 is supposed to be aware of the target state's
> capabilities, including what's wakeup-capable.  ACPI uses target
> device states to choose which _SxD methods to execute, etc.  (Or it
> should ... though come to think of it, I don't think I ever saw a
> hook whereby PCI could trigger that.)

Still, step 4 effectively undoes at least some things we did in 2.  At least
the GPEs should be enabled for normal operation so that we can save the image.

> > On the other hand, steps 6 and 7 aren't really needed for hibernation.  
> > You _could_ shut the system off completely (ACPI S5).  Automatic wakeup
> > wouldn't work, but the next time the user turned the computer on manually
> > it would still resume from hibernation.
> 
> I believe I did comment on your point that step 7 could use S5.
> 
> However, the ACPI spec *does* say up front (2.2 in ACPI 2.0C)
> that S5 == G2 "Soft OFF" is not a "sleeping" (G1) state.  (Then
> fuzzes the issue in 2.4, but those bits are less relevant here;
> 2.2 also mentions G3 = "Mechanical OFF", which is the only state
> in which machine disassembly/reassembly is expected to be safe.

But then there's the nice picture in 9.3.3 (OS loading) that shows how OSPM
(that would be us) can verify that the hardware configuration hasn't changed.

In fact we don't do this, because we always go to the "Load OS Images" block
and load the hibernation image from this newly loaded OS (aka the boot kernel).

Thus our resume is always different from the "ACPI wake up from S4".

> ACPI is allowed to distinguish between S4 and S5 in more ways
> than just the power usage.  It'd be fair for the AML to store
> state in something that retains power, and rely on that.  It'd
> be better not to do things that are allowed to confuse ACPI.

As far as I understand the specification, OSPM (ie. we) can always discard
the fact that the system has entered S4 and reinitialize everything from
scratch.

> > Conversely, steps 6 and 7 can make sense even in situations where you
> > don't want to hibernate.  For example, you might want a normal shutdown in
> > which the operating system does a full restart when the firmware is
> > signalled by a wakeup device.
> 
> Wakeup devices in S4 are expected to be a superset of those in S5,
> and system documentation often covers that.  Yeah, I know, "who
> bothers to RTFM".  Still, the point is that these systems are now
> documented to work in a particular way, and there really ought to
> be a good reason to invalidate user training and documentation.

That's a very important point, IMO.

> > So there should be separate data structures associated with 1-5 and 6-7.  
> > Maybe the one associated with 6-7 is what you are calling hibernation_ops;  
> > if so then fine.  But I still think that it should be usable for
> > situations where you are not entering hibernation, and we should be 
> > possible to enter hibernation without using it.  The system administrator 
> > should be able to choose which of S4 or S5 gets used for _any_ poweroff, 
> > regardless of whether it's to start hibernating.
> 
> But ... why?  What value would users see from that?
> 
> We do have /sys/power/disk today, but that's only for
> hibernation.  (And it's a bit confusing, too.)
> 
> A "Soft OFF" should be S5 to conform to specs and
> documentation.
> 
> 
> > The ACPI spec might refer to S4 as "hibernation" (does it? -- I'm too lazy
> > to check and see), but that doesn't mean we have to use the terms
> > synonymously.
> 
> It talks S4 as a "sleeping" state, like S1, S2, and S3.
> Or, about S4 as a "Non-Volatle sleep" state
> 
> I think it also assumes more intelligence on resume-from-S4
> than Linux has just now, which may partly explain why it
> takes so long for swsusp to finish its thing.

Well, please look at the picture in 9.3.3 and compare it to what we're
doing. ;-)

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 21:54                                                         ` Johannes Berg
@ 2007-05-04 22:21                                                           ` Rafael J. Wysocki
  2007-05-05 15:37                                                             ` Alan Stern
  0 siblings, 1 reply; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-04 22:21 UTC (permalink / raw)
  To: Johannes Berg; +Cc: linux-pm, Pekka Enberg, Nigel Cunningham, Pavel Machek

On Friday, 4 May 2007 23:54, Johannes Berg wrote:
> On Fri, 2007-05-04 at 23:55 +0200, Rafael J. Wysocki wrote:
> > On Friday, 4 May 2007 23:23, Johannes Berg wrote:
> > > On Fri, 2007-05-04 at 23:11 +0200, Rafael J. Wysocki wrote:
> > > 
> > > > Actually, prethaw is to prevent drivers loaded before the image is restored
> > > > from doing unreasonable things.  It doesn't have any effect on the drivers'
> > > > .resume() routines.
> > > 
> > > Oh, but it can, you could have a flag in your driver saying "the next
> > > resume is after restore" and you set that flag in prethaw.
> > 
> > No, you should have set that flag in .suspend(), really. :-)
> 
> Yeah, whatever. You can fix the problem but it's ugly. Let's come up
> with a good way to do the 6 callbacks mentioned in some other thread
> earlier.

This is the plan, but we need to do some preparations.

For example, I think, we should introduce some consistent terminology, so that
we *always* know what we're talking about.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 21:53                                                                   ` Johannes Berg
@ 2007-05-04 22:25                                                                     ` Rafael J. Wysocki
  0 siblings, 0 replies; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-04 22:25 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek,
	Linux-pm mailing list

On Friday, 4 May 2007 23:53, Johannes Berg wrote:
> On Fri, 2007-05-04 at 23:53 +0200, Rafael J. Wysocki wrote:
> > > Plus, it may have some parts related to the communications with operating
> > > system (*)... I guess we need to save those, and parts related to hw
> > > state... where your suggestion makes sense.
> > 
> > If they are accessible to us, then we can, but what if they aren't (eg. the
> > state information is stored in the embedded controller, can only be read with
> > the help of some AML invocations and cannot be changed from the OS level)?
> 
> Well, in that case you also haven't overwritten/changed them during
> restore so there's no room for mismatches and confusion.

Not if we went for S5 to finish the hibernation and then we try to be smart and
rely on the BIOS-provided information/functionality *as though* we had passed
through S4.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 22:12                                                         ` David Brownell
@ 2007-05-04 22:31                                                           ` Rafael J. Wysocki
  0 siblings, 0 replies; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-04 22:31 UTC (permalink / raw)
  To: David Brownell
  Cc: linux-pm, Pekka Enberg, Johannes Berg, Pavel Machek,
	Nigel Cunningham

On Saturday, 5 May 2007 00:12, David Brownell wrote:
> On Friday 04 May 2007, Rafael J. Wysocki wrote:
> > On Friday, 4 May 2007 23:23, Johannes Berg wrote:
> > > On Fri, 2007-05-04 at 23:11 +0200, Rafael J. Wysocki wrote:
> > > 
> > > > Actually, prethaw is to prevent drivers loaded before the image is restored
> > > > from doing unreasonable things.  It doesn't have any effect on the drivers'
> > > > .resume() routines.
> > > 
> > > Oh, but it can, you could have a flag in your driver saying "the next
> > > resume is after restore" and you set that flag in prethaw.
> > 
> > No, you should have set that flag in .suspend(), really. :-)
> 
> That doesn't work very well.  Not only does suspend() not
> know the target state, but you don't want to trash the
> controller state if you're getting resumed after some kind
> of fault in the suspend-to-disk path...
> 
> I'm hoping that explains the smiley!

Yes, among other things (like that passing anything from prethaw to .resume()
really doesn't work unless the data are stored in a device ;-)).

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 22:21                                                           ` Rafael J. Wysocki
@ 2007-05-05 15:37                                                             ` Alan Stern
  2007-05-05 18:49                                                               ` Rafael J. Wysocki
  0 siblings, 1 reply; 117+ messages in thread
From: Alan Stern @ 2007-05-05 15:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Johannes Berg, Pekka Enberg, linux-pm, Pavel Machek,
	Nigel Cunningham

On Sat, 5 May 2007, Rafael J. Wysocki wrote:

> > Yeah, whatever. You can fix the problem but it's ugly. Let's come up
> > with a good way to do the 6 callbacks mentioned in some other thread
> > earlier.
> 
> This is the plan, but we need to do some preparations.
> 
> For example, I think, we should introduce some consistent terminology, so that
> we *always* know what we're talking about.

A proposal:

For suspend-to-RAM we already have suspend() and resume().  At the 
possible cost of introducing some confusion, I think it makes sense to 
keep those method names.

For hibernation we need these:

	pre_snapshot()
	post_snapshot()
	pre_restore()
	post_restore()

In addition we may want to have early/late variations on these (for use 
after interrupts have been disabled), which would lead to:

	pre_snapshot()
	pre_snapshot_late()
	post_snapshot_early()
	post_snapshot()
	pre_restore()
	pre_restore_late()
	post_restore_early()
	post_restore()

Yes, it's a large list...  But it seems to be necessary for providing all 
the information drivers will need.

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 22:00                                               ` David Brownell
@ 2007-05-05 15:49                                                 ` Alan Stern
  2007-05-07  1:10                                                   ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...)) David Brownell
  0 siblings, 1 reply; 117+ messages in thread
From: Alan Stern @ 2007-05-05 15:49 UTC (permalink / raw)
  To: David Brownell
  Cc: Linux-pm mailing list, Pekka Enberg, Johannes Berg, Pavel Machek,
	Nigel Cunningham

On Fri, 4 May 2007, David Brownell wrote:

> > You are describing the difference between ACPI S4 and S5, but I was 
> > talking about the difference between "normal" poweroff and "hibernate" 
> > poweroff.  There doesn't seem to be any reason why we must always have
> > 
> > 	hibernate = S4    and     normal = S5.
> 
> What the ACPI spec describes for the "Non-Volatile Sleep" is
> that either S4 or S5 could match "hibernate" ... but for
> a software-controlled "poweroff", only S5 is appropriate.
> 
> That's a reason.  Another:  pretty much all end-user docs
> on this stuff match what ACPI says.
> 
> Lacking compelling reasons to violate specs (like them
> being clearly broken), I avoid breaking them.

Again you misunderstand.  I concede that either S4 or S5 is appropriate
for "Non-Volatile Sleep" whereas only S5 is appropriate for
software-controlled "poweroff".

But who says that hibernate has to use "Non-Volatile Sleep" and normal 
shutdown has to use software-controlled "poweroff"?  Why shouldn't the 
user be able to do it the other way 'round?


> > > Non-ACPI systems can make the same natural distinctions.
> > 
> > On such systems there seems to be even less reason for those equalities 
> > (or rather, their analogs).
> 
> This is one of those "less is more" things, right?  :)
> 
> People doing embedded designs _like_ their flexibility.
> 
> It's common to have multiple power levels.  If you mean
> that they _could_ give up that flexibility and only use
> one of those state analogues, yes they could ... but if
> you mean they'd see that as a Good Thing, I doubt it.

No, no!  That's not what I mean.  I'm proposing that we offer the user
_more_ flexibility by giving a choice of power levels.  The user should be
able to choose whether the system uses "Non-Volatile Sleep" vs.
software-controlled "poweroff"; the choice shouldn't be dictated by
whether or not the system is entering hibernation.

> I guess I don't see why you want to throw away all the
> work the hardware (and/or software) designers did to
> ensure that some key devices use a "retention" mode
> in their S4-analogue state.
> 
> Me, I always thought that leveraging those retention
> states was a great way to shrink wakeup times and get
> more functionality.

I can't imagine why you think I proposed anything along those lines.


> > > You're suggesting Linux not use the S5 state, essentially.
> > 
> > No, I'm suggesting that the user should be able to control whether Linux 
> > uses S4 vs. S5 at poweroff time.  If the user selected always to use S4 
> > then wakeup devices would function in both hibernation and normal 
> > shutdown.  If the user selected always to use S5 then wakeup devices would 
> > not function in either hibernation or normal shutdown.
> 
> That's a different suggestion, yes.  I'm not sure I see any
> benefit of that flexibility for "soft off" states though,
> especially if it made "off" consume more power.

The benefit is that it allows more devices to function as wakeup sources, 
right?

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 21:53                                                                 ` Rafael J. Wysocki
  2007-05-04 21:53                                                                   ` Johannes Berg
@ 2007-05-05 15:52                                                                   ` Alan Stern
  2007-05-07  1:16                                                                     ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...) David Brownell
  1 sibling, 1 reply; 117+ messages in thread
From: Alan Stern @ 2007-05-05 15:52 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek, Johannes Berg,
	Linux-pm mailing list

On Fri, 4 May 2007, Rafael J. Wysocki wrote:

> > Plus, it may have some parts related to the communications with operating
> > system (*)... I guess we need to save those, and parts related to hw
> > state... where your suggestion makes sense.
> 
> If they are accessible to us, then we can, but what if they aren't (eg. the
> state information is stored in the embedded controller, can only be read with
> the help of some AML invocations and cannot be changed from the OS level)?
> 
> > (*) and yes, there probably are such parts. If we set backlight to
> > 20%, we'll be confused if it is 100% after resume... we probably could
> > handle those one-by-one...
> 
> *If* we reinitialize devices *and* ACPI from scratch after restoring the image,
> we'll discard the old value (20%) and read the new value (100%) from the BIOS.
> The problems occur, IMO, because we try to be smart and use the BIOS
> after the resume as though we'd resumed from a real suspend (eg. s2ram).
> 
> Which is natural, if we use the same set of .resume() callbacks for both cases.

Agreed, these all sound like problems in the ACPI driver's implementation 
of suspend and resume.  Problems that are caused (at least in part) by the 
fact that the PM core doesn't tell the driver whether it's doing
suspend-to-RAM vs. hibernation.  Once that is straighened out, everything 
else should become much simpler.

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 21:40                                                       ` David Brownell
  2007-05-04 22:19                                                         ` Rafael J. Wysocki
@ 2007-05-05 16:08                                                         ` Alan Stern
  2007-05-05 17:50                                                           ` Rafael J. Wysocki
  2007-05-07  1:31                                                           ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...) David Brownell
  1 sibling, 2 replies; 117+ messages in thread
From: Alan Stern @ 2007-05-05 16:08 UTC (permalink / raw)
  To: David Brownell
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek,
	Linux-pm mailing list, Johannes Berg

On Fri, 4 May 2007, David Brownell wrote:

> > Entering hibernation basically involves these steps:
> > 
> > 	1. Freeze tasks
> > 	2. Quiesce devices and drivers
> > 	3. Create snapshot
> > 	4. Reactivate devices and drivers
> > 	5. Save snapshot to disk
> > 	6. Prepare devices for wakeup
> > 	7. Power down (ACPI S4 on systems which support it)
> > 
> > Leaving hibernation involves a similar sequence which I won't discuss.
> > 
> > Notice that steps 1-5 above are _completely_ independent of all issues 
> > concerning wakeup devices and S4 vs. S5 vs. whatever.  They have to be 
> > carried out for hibernation to work, no matter how the system ends up 
> > getting shut down.
> 
> Not exactly.  Step 2 is supposed to be aware of the target state's
> capabilities, including what's wakeup-capable.

Not true.  Step 2 is (or should be) divorced from power-level
considerations.  All it needs to do is quiesce things so that a consistent
snapshot can be obtained; changing power levels would take time and
ideally should be avoided.  Furthermore, anything done in step 2 should be
reversed in step 4.

Did you mean to say that Step _6_ is supposed to be aware of the target 
state's capabilities?  I'll agree to that.


> However, the ACPI spec *does* say up front (2.2 in ACPI 2.0C)
> that S5 == G2 "Soft OFF" is not a "sleeping" (G1) state.  (Then
> fuzzes the issue in 2.4, but those bits are less relevant here;
> 2.2 also mentions G3 = "Mechanical OFF", which is the only state
> in which machine disassembly/reassembly is expected to be safe.

Sure.  But entering hibernation need not involve putting the system into a 
"sleeping" state.  Going into G3 should also work for hibernation.

> ACPI is allowed to distinguish between S4 and S5 in more ways
> than just the power usage.  It'd be fair for the AML to store
> state in something that retains power, and rely on that.  It'd
> be better not to do things that are allowed to confuse ACPI.

None of that should matter for post-snapshot-restore processing.  The 
boot kernel interacts with ACPI when the system wakes up; the restored 
kernel is handed an already-running BIOS, which it should do its best to 
reinitialize from the existing hardware state.


> > possible to enter hibernation without using it.  The system administrator 
> > should be able to choose which of S4 or S5 gets used for _any_ poweroff, 
> > regardless of whether it's to start hibernating.
> 
> But ... why?  What value would users see from that?
> 
> We do have /sys/power/disk today, but that's only for
> hibernation.  (And it's a bit confusing, too.)

Yes.  I'm proposing that it be generalized.  (And it should be renamed,
too -- that's a separate issue.)

I'm also pointing out that the policy choice decided by the contents of 
/sys/power/disk comes into play during steps 6-7 above, but not at all in 
steps 1-5.  Hence any associated software structures should explicitly be 
connected only with steps 6 and 7.

And since normal shutdown ought to have its own analog of steps 6 and 7, 
the same software structures should be used there.  Hence naming them 
"hibernation_ops" isn't a good idea.


> I think it also assumes more intelligence on resume-from-S4
> than Linux has just now, which may partly explain why it
> takes so long for swsusp to finish its thing.

And it may explain some of the strange behavior people sometimes observe
when they try to hibernate twice in a row.

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 21:23                                                     ` Johannes Berg
  2007-05-04 21:55                                                       ` Rafael J. Wysocki
@ 2007-05-05 16:15                                                       ` Alan Stern
  1 sibling, 0 replies; 117+ messages in thread
From: Alan Stern @ 2007-05-05 16:15 UTC (permalink / raw)
  To: Johannes Berg; +Cc: linux-pm, Pekka Enberg, Nigel Cunningham, Pavel Machek

On Fri, 4 May 2007, Johannes Berg wrote:

> On Fri, 2007-05-04 at 23:11 +0200, Rafael J. Wysocki wrote:
> 
> > Actually, prethaw is to prevent drivers loaded before the image is restored
> > from doing unreasonable things.  It doesn't have any effect on the drivers'
> > .resume() routines.
> 
> Oh, but it can, you could have a flag in your driver saying "the next
> resume is after restore" and you set that flag in prethaw.

You're both wrong.  PRETHAW is to prevent drivers present in the image
from doing reasonable-but-wrong things (because they were misled by
actions taken by the boot kernel or the BIOS before the image was
restored).  It gives the boot kernel's driver a chance to put the device
in a state which won't be misleading.

And while you could have a flag in your driver saying "the next resume is
after restore", setting it during PRETHAW would accomplish nothing.  
PRETHAW occurs immediately before the image is restored, which means the
flag would get overwritten by the contents of the image.

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-04 21:24                                                         ` Rafael J. Wysocki
@ 2007-05-05 16:19                                                           ` Alan Stern
  2007-05-05 17:46                                                             ` Rafael J. Wysocki
  0 siblings, 1 reply; 117+ messages in thread
From: Alan Stern @ 2007-05-05 16:19 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek,
	Linux-pm mailing list, Johannes Berg

On Fri, 4 May 2007, Rafael J. Wysocki wrote:

> > > Entering hibernation basically involves these steps:
> > > 
> > > 	1. Freeze tasks
> > > 
> > > 	2. Quiesce devices and drivers
> > > 
> > > 	3. Create snapshot
> > > 
> > > 	4. Reactivate devices and drivers
> > > 
> > > 	5. Save snapshot to disk
> > > 
> > > 	6. Prepare devices for wakeup
> > > 
> > > 	7. Power down (ACPI S4 on systems which support it)
> > > 
> > > Leaving hibernation involves a similar sequence which I won't discuss.
> > > 
> > > Notice that steps 1-5 above are _completely_ independent of all issues 
> > > concerning wakeup devices and S4 vs. S5 vs. whatever.  They have to
> > > be
> > 
> > No, they are not. You probably should tell ACPI at step 2 that you are
> > suspending,

At step 2 you don't _know_ that you are suspending!  Step 5 might fail.  
You should tell ACPI during step 6 or 7.

> You can, but even if you don't, the BIOS shouldn't have problems.  What might
> have problems is our ACPI code during the resume, if it cannot get appropriate
> information from the BIOS.
> 
> > and you definitely need to tell ACPI that you have resumed 
> > (so it can re-scan AC adapters, for example).
> 
> Yes, but that can be done in two different ways:
> 
> 1) "We have restored the hibernation image, but the BIOS state corresponds to
> a fresh reboot, so please initialize everything from scratch."
> 
> 2) "We have restored the hibernation image and the ACPI S4 was used for
> powering off (hint: you may try not to initialize everything from scratch)."
> 
> Of course, in the case 2) we are responsible for ensuring that the contents of
> the hibernation image are consistent with the information preserved by the
> BIOS.

Keep in mind also that before you can do either 1) or 2), the boot kernel 
has already communicated with the BIOS, possibly changing some of the ACPI 
state.

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-05 16:19                                                           ` Alan Stern
@ 2007-05-05 17:46                                                             ` Rafael J. Wysocki
  2007-05-05 21:42                                                               ` Alan Stern
  0 siblings, 1 reply; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-05 17:46 UTC (permalink / raw)
  To: Alan Stern
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek,
	Linux-pm mailing list, Johannes Berg

On Saturday, 5 May 2007 18:19, Alan Stern wrote:
> On Fri, 4 May 2007, Rafael J. Wysocki wrote:
> 
> > > > Entering hibernation basically involves these steps:
> > > > 
> > > > 	1. Freeze tasks
> > > > 
> > > > 	2. Quiesce devices and drivers
> > > > 
> > > > 	3. Create snapshot
> > > > 
> > > > 	4. Reactivate devices and drivers
> > > > 
> > > > 	5. Save snapshot to disk
> > > > 
> > > > 	6. Prepare devices for wakeup
> > > > 
> > > > 	7. Power down (ACPI S4 on systems which support it)
> > > > 
> > > > Leaving hibernation involves a similar sequence which I won't discuss.
> > > > 
> > > > Notice that steps 1-5 above are _completely_ independent of all issues 
> > > > concerning wakeup devices and S4 vs. S5 vs. whatever.  They have to
> > > > be
> > > 
> > > No, they are not. You probably should tell ACPI at step 2 that you are
> > > suspending,
> 
> At step 2 you don't _know_ that you are suspending!  Step 5 might fail.  
> You should tell ACPI during step 6 or 7.
> 
> > You can, but even if you don't, the BIOS shouldn't have problems.  What might
> > have problems is our ACPI code during the resume, if it cannot get appropriate
> > information from the BIOS.
> > 
> > > and you definitely need to tell ACPI that you have resumed 
> > > (so it can re-scan AC adapters, for example).
> > 
> > Yes, but that can be done in two different ways:
> > 
> > 1) "We have restored the hibernation image, but the BIOS state corresponds to
> > a fresh reboot, so please initialize everything from scratch."
> > 
> > 2) "We have restored the hibernation image and the ACPI S4 was used for
> > powering off (hint: you may try not to initialize everything from scratch)."
> > 
> > Of course, in the case 2) we are responsible for ensuring that the contents of
> > the hibernation image are consistent with the information preserved by the
> > BIOS.
> 
> Keep in mind also that before you can do either 1) or 2), the boot kernel 
> has already communicated with the BIOS, possibly changing some of the ACPI 
> state.

That's correct, but it follows from the ACPI spec that there is a way for the
boot kernel to distinguish 'normal' boot from 'S4 resume' boot.  If this
mechanism is used and the boot kernel states that it's doing a 'S4 resume',
it will be able to leave ACPI alone and restore the hibernation image.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-05 16:08                                                         ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) Alan Stern
@ 2007-05-05 17:50                                                           ` Rafael J. Wysocki
  2007-05-05 21:43                                                             ` Alan Stern
  2007-05-07  1:31                                                           ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...) David Brownell
  1 sibling, 1 reply; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-05 17:50 UTC (permalink / raw)
  To: Alan Stern
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek,
	Linux-pm mailing list, Johannes Berg

On Saturday, 5 May 2007 18:08, Alan Stern wrote:
> On Fri, 4 May 2007, David Brownell wrote:
> 
> > > Entering hibernation basically involves these steps:
> > > 
> > > 	1. Freeze tasks
> > > 	2. Quiesce devices and drivers
> > > 	3. Create snapshot
> > > 	4. Reactivate devices and drivers
> > > 	5. Save snapshot to disk
> > > 	6. Prepare devices for wakeup
> > > 	7. Power down (ACPI S4 on systems which support it)
> > > 
> > > Leaving hibernation involves a similar sequence which I won't discuss.
> > > 
> > > Notice that steps 1-5 above are _completely_ independent of all issues 
> > > concerning wakeup devices and S4 vs. S5 vs. whatever.  They have to be 
> > > carried out for hibernation to work, no matter how the system ends up 
> > > getting shut down.
> > 
> > Not exactly.  Step 2 is supposed to be aware of the target state's
> > capabilities, including what's wakeup-capable.
> 
> Not true.  Step 2 is (or should be) divorced from power-level
> considerations.  All it needs to do is quiesce things so that a consistent
> snapshot can be obtained; changing power levels would take time and
> ideally should be avoided.  Furthermore, anything done in step 2 should be
> reversed in step 4.
> 
> Did you mean to say that Step _6_ is supposed to be aware of the target 
> state's capabilities?  I'll agree to that.
> 
> 
> > However, the ACPI spec *does* say up front (2.2 in ACPI 2.0C)
> > that S5 == G2 "Soft OFF" is not a "sleeping" (G1) state.  (Then
> > fuzzes the issue in 2.4, but those bits are less relevant here;
> > 2.2 also mentions G3 = "Mechanical OFF", which is the only state
> > in which machine disassembly/reassembly is expected to be safe.
> 
> Sure.  But entering hibernation need not involve putting the system into a 
> "sleeping" state.  Going into G3 should also work for hibernation.
> 
> > ACPI is allowed to distinguish between S4 and S5 in more ways
> > than just the power usage.  It'd be fair for the AML to store
> > state in something that retains power, and rely on that.  It'd
> > be better not to do things that are allowed to confuse ACPI.
> 
> None of that should matter for post-snapshot-restore processing.  The 
> boot kernel interacts with ACPI when the system wakes up; the restored 
> kernel is handed an already-running BIOS, which it should do its best to 
> reinitialize from the existing hardware state.
> 
> 
> > > possible to enter hibernation without using it.  The system administrator 
> > > should be able to choose which of S4 or S5 gets used for _any_ poweroff, 
> > > regardless of whether it's to start hibernating.
> > 
> > But ... why?  What value would users see from that?
> > 
> > We do have /sys/power/disk today, but that's only for
> > hibernation.  (And it's a bit confusing, too.)
> 
> Yes.  I'm proposing that it be generalized.  (And it should be renamed,
> too -- that's a separate issue.)
> 
> I'm also pointing out that the policy choice decided by the contents of 
> /sys/power/disk comes into play during steps 6-7 above, but not at all in 
> steps 1-5.  Hence any associated software structures should explicitly be 
> connected only with steps 6 and 7.

At present, this policy choice does affect the earlier steps too.
 
> And since normal shutdown ought to have its own analog of steps 6 and 7, 
> the same software structures should be used there.  Hence naming them 
> "hibernation_ops" isn't a good idea.
> 
> 
> > I think it also assumes more intelligence on resume-from-S4
> > than Linux has just now, which may partly explain why it
> > takes so long for swsusp to finish its thing.
> 
> And it may explain some of the strange behavior people sometimes observe
> when they try to hibernate twice in a row.

Yes, this seems to be the case.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-05 15:37                                                             ` Alan Stern
@ 2007-05-05 18:49                                                               ` Rafael J. Wysocki
  2007-05-05 21:44                                                                 ` Alan Stern
  2007-05-07  8:51                                                                 ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) Johannes Berg
  0 siblings, 2 replies; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-05 18:49 UTC (permalink / raw)
  To: Alan Stern
  Cc: Johannes Berg, Pekka Enberg, linux-pm, Pavel Machek,
	Nigel Cunningham

On Saturday, 5 May 2007 17:37, Alan Stern wrote:
> On Sat, 5 May 2007, Rafael J. Wysocki wrote:
> 
> > > Yeah, whatever. You can fix the problem but it's ugly. Let's come up
> > > with a good way to do the 6 callbacks mentioned in some other thread
> > > earlier.
> > 
> > This is the plan, but we need to do some preparations.
> > 
> > For example, I think, we should introduce some consistent terminology, so that
> > we *always* know what we're talking about.
> 
> A proposal:
> 
> For suspend-to-RAM we already have suspend() and resume().  At the 
> possible cost of introducing some confusion, I think it makes sense to 
> keep those method names.

I agree.

> For hibernation we need these:
> 
> 	pre_snapshot()
> 	post_snapshot()
> 	pre_restore()
> 	post_restore()
> 
> In addition we may want to have early/late variations on these (for use 
> after interrupts have been disabled), which would lead to:
> 
> 	pre_snapshot()
> 	pre_snapshot_late()
> 	post_snapshot_early()
> 	post_snapshot()
> 	pre_restore()
> 	pre_restore_late()
> 	post_restore_early()
> 	post_restore()
> 
> Yes, it's a large list...  But it seems to be necessary for providing all 
> the information drivers will need.

I think we may need yet another callback, executed before pre_snapshot()
and before we shrink memory during the hibernation, to be used by drivers
that need a lot of additional memory in pre_snapshot().

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-05 17:46                                                             ` Rafael J. Wysocki
@ 2007-05-05 21:42                                                               ` Alan Stern
  2007-05-05 22:14                                                                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 117+ messages in thread
From: Alan Stern @ 2007-05-05 21:42 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek,
	Linux-pm mailing list, Johannes Berg

On Sat, 5 May 2007, Rafael J. Wysocki wrote:

> > > Yes, but that can be done in two different ways:
> > > 
> > > 1) "We have restored the hibernation image, but the BIOS state corresponds to
> > > a fresh reboot, so please initialize everything from scratch."
> > > 
> > > 2) "We have restored the hibernation image and the ACPI S4 was used for
> > > powering off (hint: you may try not to initialize everything from scratch)."
> > > 
> > > Of course, in the case 2) we are responsible for ensuring that the contents of
> > > the hibernation image are consistent with the information preserved by the
> > > BIOS.
> > 
> > Keep in mind also that before you can do either 1) or 2), the boot kernel 
> > has already communicated with the BIOS, possibly changing some of the ACPI 
> > state.
> 
> That's correct, but it follows from the ACPI spec that there is a way for the
> boot kernel to distinguish 'normal' boot from 'S4 resume' boot.  If this
> mechanism is used and the boot kernel states that it's doing a 'S4 resume',
> it will be able to leave ACPI alone and restore the hibernation image.

Okay, good.  That means part of the resume-from-hibernation handling must
be included in the standard startup code of the ACPI driver, because it 
runs in the boot kernel rather than the restored kernel.  Does it work 
that way now?  You'd think it must...

The restored kernel could do either 1) or 2), I don't see that it matters
much which.  1) might be safer, because it's possible that external power
was turned off at some point during the hibernation (and no battery power 
was available).

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-05 17:50                                                           ` Rafael J. Wysocki
@ 2007-05-05 21:43                                                             ` Alan Stern
  2007-05-05 22:16                                                               ` Rafael J. Wysocki
  0 siblings, 1 reply; 117+ messages in thread
From: Alan Stern @ 2007-05-05 21:43 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek,
	Linux-pm mailing list, Johannes Berg

On Sat, 5 May 2007, Rafael J. Wysocki wrote:

> > I'm also pointing out that the policy choice decided by the contents of 
> > /sys/power/disk comes into play during steps 6-7 above, but not at all in 
> > steps 1-5.  Hence any associated software structures should explicitly be 
> > connected only with steps 6 and 7.
> 
> At present, this policy choice does affect the earlier steps too.

Isn't this then another aspect of hibernation needing to be fixed?  Or is 
there some genuine reason I'm not aware of that the choice of shutdown 
method should affect those steps?

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-05 18:49                                                               ` Rafael J. Wysocki
@ 2007-05-05 21:44                                                                 ` Alan Stern
  2007-05-05 22:36                                                                   ` Rafael J. Wysocki
  2007-05-07  8:51                                                                 ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) Johannes Berg
  1 sibling, 1 reply; 117+ messages in thread
From: Alan Stern @ 2007-05-05 21:44 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Johannes Berg, Pekka Enberg, linux-pm, Pavel Machek,
	Nigel Cunningham

On Sat, 5 May 2007, Rafael J. Wysocki wrote:

> > In addition we may want to have early/late variations on these (for use 
> > after interrupts have been disabled), which would lead to:
> > 
> > 	pre_snapshot()
> > 	pre_snapshot_late()
> > 	post_snapshot_early()
> > 	post_snapshot()
> > 	pre_restore()
> > 	pre_restore_late()
> > 	post_restore_early()
> > 	post_restore()
> > 
> > Yes, it's a large list...  But it seems to be necessary for providing all 
> > the information drivers will need.
> 
> I think we may need yet another callback, executed before pre_snapshot()
> and before we shrink memory during the hibernation, to be used by drivers
> that need a lot of additional memory in pre_snapshot().

	pre_snapshot_early()

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-05 21:42                                                               ` Alan Stern
@ 2007-05-05 22:14                                                                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-05 22:14 UTC (permalink / raw)
  To: Alan Stern
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek,
	Linux-pm mailing list, Johannes Berg

On Saturday, 5 May 2007 23:42, Alan Stern wrote:
> On Sat, 5 May 2007, Rafael J. Wysocki wrote:
> 
> > > > Yes, but that can be done in two different ways:
> > > > 
> > > > 1) "We have restored the hibernation image, but the BIOS state corresponds to
> > > > a fresh reboot, so please initialize everything from scratch."
> > > > 
> > > > 2) "We have restored the hibernation image and the ACPI S4 was used for
> > > > powering off (hint: you may try not to initialize everything from scratch)."
> > > > 
> > > > Of course, in the case 2) we are responsible for ensuring that the contents of
> > > > the hibernation image are consistent with the information preserved by the
> > > > BIOS.
> > > 
> > > Keep in mind also that before you can do either 1) or 2), the boot kernel 
> > > has already communicated with the BIOS, possibly changing some of the ACPI 
> > > state.
> > 
> > That's correct, but it follows from the ACPI spec that there is a way for the
> > boot kernel to distinguish 'normal' boot from 'S4 resume' boot.  If this
> > mechanism is used and the boot kernel states that it's doing a 'S4 resume',
> > it will be able to leave ACPI alone and restore the hibernation image.
> 
> Okay, good.  That means part of the resume-from-hibernation handling must
> be included in the standard startup code of the ACPI driver, because it 
> runs in the boot kernel rather than the restored kernel.  Does it work 
> that way now?  You'd think it must...

Well, I'm not sure, but I don't think so.  It looks like the ACPI code that we
use in the hibernation/suspend code paths is not in a good shape in general.

IOW, we may want to implement that in the future, but I'd rather like to get
1) working reliably for everyone first.

> The restored kernel could do either 1) or 2), I don't see that it matters
> much which.  1) might be safer, because it's possible that external power
> was turned off at some point during the hibernation (and no battery power 
> was available).

I think that the 'ACPI S4' handling adds quite a lot of complexity to the
picture and should be added on top of a working infrastructure, as an
extension.

Currently, we don't handle the hibernation in accordance with the ACPI spec
anyway.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-05 21:43                                                             ` Alan Stern
@ 2007-05-05 22:16                                                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-05 22:16 UTC (permalink / raw)
  To: Alan Stern
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek,
	Linux-pm mailing list, Johannes Berg

On Saturday, 5 May 2007 23:43, Alan Stern wrote:
> On Sat, 5 May 2007, Rafael J. Wysocki wrote:
> 
> > > I'm also pointing out that the policy choice decided by the contents of 
> > > /sys/power/disk comes into play during steps 6-7 above, but not at all in 
> > > steps 1-5.  Hence any associated software structures should explicitly be 
> > > connected only with steps 6 and 7.
> > 
> > At present, this policy choice does affect the earlier steps too.
> 
> Isn't this then another aspect of hibernation needing to be fixed?  Or is 
> there some genuine reason I'm not aware of that the choice of shutdown 
> method should affect those steps?

Well, I think it should be fixed, but I'm afraid that'll take a *lot* of time.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-05 21:44                                                                 ` Alan Stern
@ 2007-05-05 22:36                                                                   ` Rafael J. Wysocki
  2007-05-06 22:01                                                                     ` Alan Stern
  0 siblings, 1 reply; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-05 22:36 UTC (permalink / raw)
  To: Alan Stern
  Cc: Johannes Berg, Pekka Enberg, linux-pm, Pavel Machek,
	Nigel Cunningham

On Saturday, 5 May 2007 23:44, Alan Stern wrote:
> On Sat, 5 May 2007, Rafael J. Wysocki wrote:
> 
> > > In addition we may want to have early/late variations on these (for use 
> > > after interrupts have been disabled), which would lead to:
> > > 
> > > 	pre_snapshot()
> > > 	pre_snapshot_late()
> > > 	post_snapshot_early()
> > > 	post_snapshot()
> > > 	pre_restore()
> > > 	pre_restore_late()
> > > 	post_restore_early()
> > > 	post_restore()
> > > 
> > > Yes, it's a large list...  But it seems to be necessary for providing all 
> > > the information drivers will need.
> > 
> > I think we may need yet another callback, executed before pre_snapshot()
> > and before we shrink memory during the hibernation, to be used by drivers
> > that need a lot of additional memory in pre_snapshot().
> 
> 	pre_snapshot_early()

OK

So, I think the hibernation code ordering should be like this (let's forget
about ACPI for now):

1) tasks are frozen
2) pre_snapshot_early()
3) memory is freed for the snapshot image
4) pre_snapshod()
5) nonboot CPUs are offlined
6) IRQs are disabled
7) pre_snapshot_late()
8) sysdev_pre_snapshot()
9) snapshot image is created
10) sysdev_post_snapshot()
11) post_snapshot_early()
12) IRQs are enabled
13) nonboot CPUs are enabled
14) post_snapshot()
15) snapshot image is saved
16) device_shutdown()
17) system is powered off

Apart from this, we may need notifiers for subsystems that should do something
before the freezing and after the thawing of tasks (like FUSE etc.).

Also, if there's an error, we have to be able to thaw tasks after
post_snapshot() and continue running.

The restore code, IMO, should be like this (again, let's ignore ACPI for now):

1) boot kernel is started, initrd is loaded etc.
2) tasks are frozen
3) snapshot image is loaded
4) pre_restore()
5) nonboot CPUs are offlined
6) IRQs are disabled
7) pre_restore_late()
8) sysdev_pre_restore()
9) boot kernel is replaced with the 'hibernated' kernel
10) sysdev_post_restore()
11) post_restore_early()
12) IRQs are enabled
13) nonboot CPUs are enabled
14) post_restore()
15) tasks are thawed
16) system is running

and we may need a notifier for subsystems that should do something after
tasks have been thawed.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-05 22:36                                                                   ` Rafael J. Wysocki
@ 2007-05-06 22:01                                                                     ` Alan Stern
  2007-05-06 22:31                                                                       ` Rafael J. Wysocki
  2007-05-07  1:37                                                                       ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: ..) David Brownell
  0 siblings, 2 replies; 117+ messages in thread
From: Alan Stern @ 2007-05-06 22:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Johannes Berg, Pekka Enberg, linux-pm, Pavel Machek,
	Nigel Cunningham

On Sun, 6 May 2007, Rafael J. Wysocki wrote:

> > > I think we may need yet another callback, executed before pre_snapshot()
> > > and before we shrink memory during the hibernation, to be used by drivers
> > > that need a lot of additional memory in pre_snapshot().
> > 
> > 	pre_snapshot_early()
> 
> OK

I changed my mind -- pre_hibernate() seems like a better name.  There
could be a matching post_hibernate(), if anyone finds it necessary.  I
considered pre_freeze(), but that's not such a good choice since the
freezer can be used for other things in addition to hibernation.

> So, I think the hibernation code ordering should be like this (let's forget
> about ACPI for now):
> 
> 1) tasks are frozen
> 2) pre_snapshot_early()

Or rather: 2) pre_hibernate()

> 3) memory is freed for the snapshot image
> 4) pre_snapshod()
> 5) nonboot CPUs are offlined
> 6) IRQs are disabled
> 7) pre_snapshot_late()
> 8) sysdev_pre_snapshot()
> 9) snapshot image is created
> 10) sysdev_post_snapshot()
> 11) post_snapshot_early()
> 12) IRQs are enabled
> 13) nonboot CPUs are enabled
> 14) post_snapshot()
> 15) snapshot image is saved
> 16) device_shutdown()
> 17) system is powered off
> 
> Apart from this, we may need notifiers for subsystems that should do something
> before the freezing and after the thawing of tasks (like FUSE etc.).

Quite so.

> Also, if there's an error, we have to be able to thaw tasks after
> post_snapshot() and continue running.
> 
> The restore code, IMO, should be like this (again, let's ignore ACPI for now):
> 
> 1) boot kernel is started, initrd is loaded etc.
> 2) tasks are frozen
> 3) snapshot image is loaded
> 4) pre_restore()
> 5) nonboot CPUs are offlined
> 6) IRQs are disabled
> 7) pre_restore_late()
> 8) sysdev_pre_restore()
> 9) boot kernel is replaced with the 'hibernated' kernel
> 10) sysdev_post_restore()
> 11) post_restore_early()
> 12) IRQs are enabled
> 13) nonboot CPUs are enabled
> 14) post_restore()
> 15) tasks are thawed
> 16) system is running
> 
> and we may need a notifier for subsystems that should do something after
> tasks have been thawed.

It sounds good to me.  Now if only it were possible to get rid of those
pesky sysdevs...

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-06 22:01                                                                     ` Alan Stern
@ 2007-05-06 22:31                                                                       ` Rafael J. Wysocki
  2007-05-07  1:37                                                                       ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: ..) David Brownell
  1 sibling, 0 replies; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-06 22:31 UTC (permalink / raw)
  To: Alan Stern
  Cc: Johannes Berg, Pekka Enberg, linux-pm, Pavel Machek,
	Nigel Cunningham

On Monday, 7 May 2007 00:01, Alan Stern wrote:
> On Sun, 6 May 2007, Rafael J. Wysocki wrote:
> 
> > > > I think we may need yet another callback, executed before pre_snapshot()
> > > > and before we shrink memory during the hibernation, to be used by drivers
> > > > that need a lot of additional memory in pre_snapshot().
> > > 
> > > 	pre_snapshot_early()
> > 
> > OK
> 
> I changed my mind -- pre_hibernate() seems like a better name.

OK

> There could be a matching post_hibernate(), if anyone finds it necessary.  I
> considered pre_freeze(), but that's not such a good choice since the
> freezer can be used for other things in addition to hibernation.

Agreed.

> > So, I think the hibernation code ordering should be like this (let's forget
> > about ACPI for now):
> > 
> > 1) tasks are frozen
> > 2) pre_snapshot_early()
> 
> Or rather: 2) pre_hibernate()

OK

> > 3) memory is freed for the snapshot image
> > 4) pre_snapshod()
> > 5) nonboot CPUs are offlined
> > 6) IRQs are disabled
> > 7) pre_snapshot_late()
> > 8) sysdev_pre_snapshot()
> > 9) snapshot image is created
> > 10) sysdev_post_snapshot()
> > 11) post_snapshot_early()
> > 12) IRQs are enabled
> > 13) nonboot CPUs are enabled
> > 14) post_snapshot()
> > 15) snapshot image is saved
> > 16) device_shutdown()
> > 17) system is powered off
> > 
> > Apart from this, we may need notifiers for subsystems that should do something
> > before the freezing and after the thawing of tasks (like FUSE etc.).
> 
> Quite so.
> 
> > Also, if there's an error, we have to be able to thaw tasks after
> > post_snapshot() and continue running.
> > 
> > The restore code, IMO, should be like this (again, let's ignore ACPI for now):
> > 
> > 1) boot kernel is started, initrd is loaded etc.
> > 2) tasks are frozen
> > 3) snapshot image is loaded
> > 4) pre_restore()
> > 5) nonboot CPUs are offlined
> > 6) IRQs are disabled
> > 7) pre_restore_late()
> > 8) sysdev_pre_restore()
> > 9) boot kernel is replaced with the 'hibernated' kernel
> > 10) sysdev_post_restore()
> > 11) post_restore_early()
> > 12) IRQs are enabled
> > 13) nonboot CPUs are enabled
> > 14) post_restore()
> > 15) tasks are thawed
> > 16) system is running
> > 
> > and we may need a notifier for subsystems that should do something after
> > tasks have been thawed.
> 
> It sounds good to me.  Now if only it were possible to get rid of those
> pesky sysdevs...

I think that will be possible over time.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...))
  2007-05-04 22:19                                                         ` Rafael J. Wysocki
@ 2007-05-07  1:05                                                           ` David Brownell
  0 siblings, 0 replies; 117+ messages in thread
From: David Brownell @ 2007-05-07  1:05 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek,
	Linux-pm mailing list, Johannes Berg

On Friday 04 May 2007, Rafael J. Wysocki wrote:
> On Friday, 4 May 2007 23:40, David Brownell wrote:
> > On Friday 04 May 2007, Alan Stern wrote:

> > > 	1. Freeze tasks
> > > 	2. Quiesce devices and drivers
> > > 	3. Create snapshot
> > > 	4. Reactivate devices and drivers
> > > 	5. Save snapshot to disk
> > > 	6. Prepare devices for wakeup
> > > 	7. Power down (ACPI S4 on systems which support it)
> > > 
> > > Leaving hibernation involves a similar sequence which I won't discuss.
> > > 
> > > Notice that steps 1-5 above are _completely_ independent of all issues 
> > > concerning wakeup devices and S4 vs. S5 vs. whatever.  They have to be 
> > > carried out for hibernation to work, no matter how the system ends up 
> > > getting shut down.
> > 
> > Not exactly.  Step 2 is supposed to be aware of the target state's
> > capabilities, including what's wakeup-capable.  ACPI uses target
> > device states to choose which _SxD methods to execute, etc.  (Or it
> > should ... though come to think of it, I don't think I ever saw a
> > hook whereby PCI could trigger that.)

The hook is there, but it's not yet implemented ... patch in the
works.  Whoever implemented pci_choose_state() botched it up.

 
> Still, step 4 effectively undoes at least some things we did in 2.  At least
> the GPEs should be enabled for normal operation so that we can save the image.

And for that matter, wakeup shouldn't be limited to wake-from-sleep;
runtime device PM should be able to use it.  ACPI doesn't use GPEs
very well at all, except maybe runtime GPEs.  Step 6 needs to know
the same info, so it can enable the GPEs that work from S4.

 

> But then there's the nice picture in 9.3.3 (OS loading) that shows how OSPM
> (that would be us) can verify that the hardware configuration hasn't changed.
> 
> In fact we don't do this, because we always go to the "Load OS Images" block
> and load the hibernation image from this newly loaded OS (aka the boot kernel).
> 
> Thus our resume is always different from the "ACPI wake up from S4".

Right ... "slower" being one consequence.


> > ACPI is allowed to distinguish between S4 and S5 in more ways
> > than just the power usage.  It'd be fair for the AML to store
> > state in something that retains power, and rely on that.  It'd
> > be better not to do things that are allowed to confuse ACPI.
> 
> As far as I understand the specification, OSPM (ie. we) can always discard
> the fact that the system has entered S4 and reinitialize everything from
> scratch.

At the price of making some things needlessly misbehave.  Devices
that can wake from D3cold will detect state being trashed if you
re-init, which is at least sub-optimal if not wrong.


> >	 Still, the point is that these systems are now
> > documented to work in a particular way, and there really ought to
> > be a good reason to invalidate user training and documentation.
> 
> That's a very important point, IMO.

So I just re-quoted it.  ;)


> > A "Soft OFF" should be S5 to conform to specs and
> > documentation.

- Dave

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...))
  2007-05-05 15:49                                                 ` Alan Stern
@ 2007-05-07  1:10                                                   ` David Brownell
  2007-05-07 18:46                                                     ` Alan Stern
  0 siblings, 1 reply; 117+ messages in thread
From: David Brownell @ 2007-05-07  1:10 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux-pm mailing list, Pekka Enberg, Johannes Berg, Pavel Machek,
	Nigel Cunningham

On Saturday 05 May 2007, Alan Stern wrote:

> But who says that hibernate has to use "Non-Volatile Sleep" and normal 
> shutdown has to use software-controlled "poweroff"?  Why shouldn't the 
> user be able to do it the other way 'round?

Well, the definition of NVS matches hibernation, and
the definition of soft-off matches poweroff.


> > > No, I'm suggesting that the user should be able to control whether Linux 
> > > uses S4 vs. S5 at poweroff time.  If the user selected always to use S4 
> > > then wakeup devices would function in both hibernation and normal 
> > > shutdown.  If the user selected always to use S5 then wakeup devices would 
> > > not function in either hibernation or normal shutdown.
> > 
> > That's a different suggestion, yes.  I'm not sure I see any
> > benefit of that flexibility for "soft off" states though,
> > especially if it made "off" consume more power.
> 
> The benefit is that it allows more devices to function as wakeup sources, 
> right?

With downsides of "more power consumed during 'off' states"
and "invalidating documentation, training, and expectations".

This is a case where the fact that something could technically
be done doesn't recommend it to me.

- Dave

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...)
  2007-05-05 15:52                                                                   ` Alan Stern
@ 2007-05-07  1:16                                                                     ` David Brownell
  2007-05-07 21:00                                                                       ` Rafael J. Wysocki
  0 siblings, 1 reply; 117+ messages in thread
From: David Brownell @ 2007-05-07  1:16 UTC (permalink / raw)
  To: Alan Stern
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek, Johannes Berg,
	Linux-pm mailing list

On Saturday 05 May 2007, Alan Stern wrote:

> Agreed, these all sound like problems in the ACPI driver's implementation 
> of suspend and resume.  Problems that are caused (at least in part) by the 
> fact that the PM core doesn't tell the driver whether it's doing
> suspend-to-RAM vs. hibernation.  Once that is straighened out, everything 
> else should become much simpler.

I'm not sure I agree with that diagnosis, but for the record:
updating drivers/pci/pci-acpi.c so that it can implement the
platform_pci_choose_state() hook requires ACPI to export that
information.

So for now I have drivers/acpi/sleep/main.c exporting

        s_state = acpi_get_target_sleep_state();

so that ACPI-aware code can know to call "_S3D" instead of
the "_S1D" or "_S4D" methods (and "_S3W" etc).  Of course
the $SUBJECT patch will finish borking that for S4.  :(

- Dave

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...)
  2007-05-05 16:08                                                         ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) Alan Stern
  2007-05-05 17:50                                                           ` Rafael J. Wysocki
@ 2007-05-07  1:31                                                           ` David Brownell
  2007-05-07 16:33                                                             ` Alan Stern
  1 sibling, 1 reply; 117+ messages in thread
From: David Brownell @ 2007-05-07  1:31 UTC (permalink / raw)
  To: Alan Stern
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek,
	Linux-pm mailing list, Johannes Berg

On Saturday 05 May 2007, Alan Stern wrote:
> On Fri, 4 May 2007, David Brownell wrote:
> 
> Did you mean to say that Step _6_ is supposed to be aware of the target 
> state's capabilities?  I'll agree to that.

Yes ... but I don't see why it would be wrong for step 2 either.
If the device can't wake from S5, it wouldn't set up with the
assumption that was a possibility.


> > However, the ACPI spec *does* say up front (2.2 in ACPI 2.0C)
> > that S5 == G2 "Soft OFF" is not a "sleeping" (G1) state.  (Then
> > fuzzes the issue in 2.4, but those bits are less relevant here;
> > 2.2 also mentions G3 = "Mechanical OFF", which is the only state
> > in which machine disassembly/reassembly is expected to be safe.
> 
> Sure.  But entering hibernation need not involve putting the system into a 
> "sleeping" state.  Going into G3 should also work for hibernation.

For some definitions of "should"; that's where specs get fuzzy.

Since disassembly is allowed in G3, if you swapped a disk that
should prevent the system from resuming ... it should force a
boot-from-scratch.  But if you just swapped a power supply it
would probably work OK.


> I'm also pointing out that the policy choice decided by the contents of 
> /sys/power/disk comes into play during steps 6-7 above, but not at all in 
> steps 1-5.  Hence any associated software structures should explicitly be 
> connected only with steps 6 and 7.

The difference between S4 and S5 could matter to step 2 though.
Perhaps it's not the most likely thing, but certainly avoiding
the work to setup wake-from-S4 is reasonable when going to S5.

 
> And since normal shutdown ought to have its own analog of steps 6 and 7, 
> the same software structures should be used there.  Hence naming them 
> "hibernation_ops" isn't a good idea.

That's something of a different stance.  And it's untrue for
step 6 too ... suspend() and shutdown() differ a lot.  Maybe
if I saw some details, that would make more sense to me.


> > I think it also assumes more intelligence on resume-from-S4
> > than Linux has just now, which may partly explain why it
> > takes so long for swsusp to finish its thing.
> 
> And it may explain some of the strange behavior people sometimes observe
> when they try to hibernate twice in a row.

There's all kinds of bizarreness there.  I kind of get the
feeling the ACPI folk were so deluged by IRQ and other resource
setup issues (the "C" in ACPI) that the power management bits
(the "P") didn't get that much attention.  As pointed out very
recently by Rafael.  :)

Plus there's the issue that while this thread has touched a lot
on ACPI issues and models, Linux must not assume ACPI.

- Dave

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ..)
  2007-05-06 22:01                                                                     ` Alan Stern
  2007-05-06 22:31                                                                       ` Rafael J. Wysocki
@ 2007-05-07  1:37                                                                       ` David Brownell
  2007-05-08  2:57                                                                         ` Greg KH
  1 sibling, 1 reply; 117+ messages in thread
From: David Brownell @ 2007-05-07  1:37 UTC (permalink / raw)
  To: linux-pm; +Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek, Johannes Berg

On Sunday 06 May 2007, Alan Stern wrote:
> It sounds good to me.  Now if only it were possible to get rid of those
> pesky sysdevs...

Other than lack of patches ... is there a reason??
I thought that sysdevs were no longer needed.

- Dave

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy))
  2007-05-05 18:49                                                               ` Rafael J. Wysocki
  2007-05-05 21:44                                                                 ` Alan Stern
@ 2007-05-07  8:51                                                                 ` Johannes Berg
  1 sibling, 0 replies; 117+ messages in thread
From: Johannes Berg @ 2007-05-07  8:51 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: linux-pm, Pekka Enberg, Nigel Cunningham, Pavel Machek


[-- Attachment #1.1: Type: text/plain, Size: 504 bytes --]

On Sat, 2007-05-05 at 20:49 +0200, Rafael J. Wysocki wrote:

> I think we may need yet another callback, executed before pre_snapshot()
> and before we shrink memory during the hibernation, to be used by drivers
> that need a lot of additional memory in pre_snapshot().

I'm not sure we really need a callback here for that, your suspend
memory allocation chain seemed good enough since most drivers won't
actually be using it and it's not a hard requirement. Not that I care
much.

johannes

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...)
  2007-05-07  1:31                                                           ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...) David Brownell
@ 2007-05-07 16:33                                                             ` Alan Stern
  2007-05-07 20:49                                                               ` Pavel Machek
  0 siblings, 1 reply; 117+ messages in thread
From: Alan Stern @ 2007-05-07 16:33 UTC (permalink / raw)
  To: David Brownell
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek,
	Linux-pm mailing list, Johannes Berg

On Sun, 6 May 2007, David Brownell wrote:

> On Saturday 05 May 2007, Alan Stern wrote:
> > On Fri, 4 May 2007, David Brownell wrote:
> > 
> > Did you mean to say that Step _6_ is supposed to be aware of the target 
> > state's capabilities?  I'll agree to that.
> 
> Yes ... but I don't see why it would be wrong for step 2 either.

The principle of information hiding: If step 2 doesn't _need_ to know 
the final target state (which it shouldn't!) then we ought not to tell it.

> If the device can't wake from S5, it wouldn't set up with the
> assumption that was a possibility.

But step 2 doesn't set up devices' wakeup functions.  It merely quiesces
them so the snapshot can be made safely.  Then step 4 reactivates the
devices, and step 6 takes care of setting up the devices for the final
sleep state.


> > Sure.  But entering hibernation need not involve putting the system into a 
> > "sleeping" state.  Going into G3 should also work for hibernation.
> 
> For some definitions of "should"; that's where specs get fuzzy.
> 
> Since disassembly is allowed in G3, if you swapped a disk that
> should prevent the system from resuming ... it should force a
> boot-from-scratch.  But if you just swapped a power supply it
> would probably work OK.

Yep.  The problem isn't so much in the specs; it's that no one has ever
(so far as I know) given a precise definition of what Linux's "hibernate"  
is supposed to do.  Is it supposed to be safe to disassemble a hibernating
computer?  Is remote wakeup necessarily supported?  I've never seen
answers to these questions.

> > I'm also pointing out that the policy choice decided by the contents of 
> > /sys/power/disk comes into play during steps 6-7 above, but not at all in 
> > steps 1-5.  Hence any associated software structures should explicitly be 
> > connected only with steps 6 and 7.
> 
> The difference between S4 and S5 could matter to step 2 though.
> Perhaps it's not the most likely thing, but certainly avoiding
> the work to setup wake-from-S4 is reasonable when going to S5.

I don't understand.  Step 2 doesn't do the work to set up wake-from-S4;  
step 6 does.  So why should the knowledge of S4 vs. S5 matter to step 2?

> > And since normal shutdown ought to have its own analog of steps 6 and 7, 
> > the same software structures should be used there.  Hence naming them 
> > "hibernation_ops" isn't a good idea.
> 
> That's something of a different stance.  And it's untrue for
> step 6 too ... suspend() and shutdown() differ a lot.  Maybe
> if I saw some details, that would make more sense to me.

It is true that for G3 type shutdown, step 6 can be empty.  We don't need 
to do anything to the devices or drivers, we just turn off all the power.  
Still, the empty set _is_ a set.  :-)

Here's another way to express my ideas: We want to support at least two 
different kinds of powered-down states:

	(A) Remote wakeup may be enabled on some devices, there can be
	    a certain power drain on the batteries or power line, it may 
	    not be safe to disassemble the machine, etc.

	(B) Remote wakeup is completely disabled, there is no power
	    drain at all, it is safe to disassemble the machine provided
	    you don't switch components like disks, etc.

(With (B) it should always be _physically_ safe to switch disks and other
components.  Whether it is _logically_ safe depends on what happens the
next time you start the machine: Will you try to restore a saved memory
image or not?  This isn't directly related to the nature of the
powered-down state except for the obvious fact that you can't restore an
image if no image has been saved.)

I don't see any reason why (A) and (B) shouldn't both be allowed for 
hibernate, as in fact they are now by way of /sys/power/disk.  And I don't 
see any reason why they shouldn't both be allowed for normal non-hibernate 
shutdowns as well.

Furthermore, the choice of whether to use (A) or (B) shouldn't matter 
during steps 1-5 of the hibernate sequence.  It should matter during steps 
6-7 and during normal shutdown (which doesn't have steps 1-5 since it 
doesn't save a memory image).

> Plus there's the issue that while this thread has touched a lot
> on ACPI issues and models, Linux must not assume ACPI.

Yes indeed.

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...))
  2007-05-07  1:10                                                   ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...)) David Brownell
@ 2007-05-07 18:46                                                     ` Alan Stern
  2007-05-07 21:29                                                       ` Rafael J. Wysocki
  2007-05-07 21:43                                                       ` David Brownell
  0 siblings, 2 replies; 117+ messages in thread
From: Alan Stern @ 2007-05-07 18:46 UTC (permalink / raw)
  To: David Brownell
  Cc: Linux-pm mailing list, Pekka Enberg, Johannes Berg, Pavel Machek,
	Nigel Cunningham

On Sun, 6 May 2007, David Brownell wrote:

> On Saturday 05 May 2007, Alan Stern wrote:
> 
> > But who says that hibernate has to use "Non-Volatile Sleep" and normal 
> > shutdown has to use software-controlled "poweroff"?  Why shouldn't the 
> > user be able to do it the other way 'round?
> 
> Well, the definition of NVS matches hibernation, and
> the definition of soft-off matches poweroff.

Okay, I read sections 2.2 and 2.4 of the ACPI 3.0 spec.  Here's the story
in a nutshell:

	G3 = "mechanical off" = no wakeup devices are enabled,
				safe to disassemble
	G2/S5 = "soft off" = wakeup may be enabled, not safe to
				disassemble
	S4 = "non-volatile sleep" = hibernation, memory image is saved
	S5 = "soft off" = almost the same as S4 except there is no
				memory image

The spec does not explicitly associate S4 with either G2 or G3, and in
fact it contains language suggesting very strongly that the system could
be in either one.  The spec also uses the same name for G2 and for S5, no 
doubt leading to extra levels of confusion.

So there's no question that S4 = NVS = hibernation.  But hibernation
can involve either G2 or G3.

And there's no question (in my mind at least) that normal shutdown should
be able to involve either G2/S5 or G3.  So although the spec doesn't put 
things quite this way, we could say:

	hibernation = S4 = G2/S4 or G3/S4,

	shutdown = S5 = G2/S5 or G3/S5.

Thus the choice between S4 vs. S5 is made at the very start, and steps 1-5 
are executed only for S4.  The choice between G2 vs. G3 can be (and should 
be!) deferred until steps 6-7.


> > > That's a different suggestion, yes.  I'm not sure I see any
> > > benefit of that flexibility for "soft off" states though,
> > > especially if it made "off" consume more power.
> > 
> > The benefit is that it allows more devices to function as wakeup sources, 
> > right?
> 
> With downsides of "more power consumed during 'off' states"
> and "invalidating documentation, training, and expectations".

Okay, let's clear up the confusion.  The additional flexibility I'm 
suggesting for "soft off" = G2 states is that we should allow both G2/S4 
and G2/S5.  They would consume the same amount of power since they are 
both G2 states; the difference is that G2/S4 involves saving and restoring 
a memory image and G2/S5 does not.

This does not invalidate any documentation or training so far as I know.  
And as for expectations...  That's a little harder.  What people _expect_ 
of Linux and what Linux actually _does_ don't always jibe well, owing to 
lack of sufficient documentation -- typical of Open Source projects.

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...)
  2007-05-07 16:33                                                             ` Alan Stern
@ 2007-05-07 20:49                                                               ` Pavel Machek
  2007-05-07 21:38                                                                 ` Alan Stern
  0 siblings, 1 reply; 117+ messages in thread
From: Pavel Machek @ 2007-05-07 20:49 UTC (permalink / raw)
  To: Alan Stern
  Cc: Nigel Cunningham, Pekka Enberg, Linux-pm mailing list,
	Johannes Berg

Hi!

> It is true that for G3 type shutdown, step 6 can be empty.  We don't need 
> to do anything to the devices or drivers, we just turn off all the power.  
> Still, the empty set _is_ a set.  :-)
> 
> Here's another way to express my ideas: We want to support at least two 
> different kinds of powered-down states:
> 
> 	(A) Remote wakeup may be enabled on some devices, there can be
> 	    a certain power drain on the batteries or power line, it may 
> 	    not be safe to disassemble the machine, etc.
> 
> 	(B) Remote wakeup is completely disabled, there is no power
> 	    drain at all, it is safe to disassemble the machine provided
> 	    you don't switch components like disks, etc.
> 
...
> 
> I don't see any reason why (A) and (B) shouldn't both be allowed for 
> hibernate, as in fact they are now by way of /sys/power/disk.  And I don't 
> see any reason why they shouldn't both be allowed for normal non-hibernate 
> shutdowns as well.

No, sorry, that does not work. Software can't select (A) vs. (B). Only
user can, by physically switching real power switch, or by unplugging
the machine.

And yes, there's documentation about expectations of swsusp, in
Doc*/power/swsusp.txt.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...)
  2007-05-07  1:16                                                                     ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...) David Brownell
@ 2007-05-07 21:00                                                                       ` Rafael J. Wysocki
  2007-05-07 21:45                                                                         ` David Brownell
  0 siblings, 1 reply; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-07 21:00 UTC (permalink / raw)
  To: David Brownell
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek, Johannes Berg,
	Linux-pm mailing list

On Monday, 7 May 2007 03:16, David Brownell wrote:
> On Saturday 05 May 2007, Alan Stern wrote:
> 
> > Agreed, these all sound like problems in the ACPI driver's implementation 
> > of suspend and resume.  Problems that are caused (at least in part) by the 
> > fact that the PM core doesn't tell the driver whether it's doing
> > suspend-to-RAM vs. hibernation.  Once that is straighened out, everything 
> > else should become much simpler.
> 
> I'm not sure I agree with that diagnosis, but for the record:
> updating drivers/pci/pci-acpi.c so that it can implement the
> platform_pci_choose_state() hook requires ACPI to export that
> information.
> 
> So for now I have drivers/acpi/sleep/main.c exporting
> 
>         s_state = acpi_get_target_sleep_state();
> 
> so that ACPI-aware code can know to call "_S3D" instead of
> the "_S1D" or "_S4D" methods (and "_S3W" etc).  Of course
> the $SUBJECT patch will finish borking that for S4.  :(

Why exactly?

Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...))
  2007-05-07 18:46                                                     ` Alan Stern
@ 2007-05-07 21:29                                                       ` Rafael J. Wysocki
  2007-05-07 22:22                                                         ` Alan Stern
  2007-05-07 21:43                                                       ` David Brownell
  1 sibling, 1 reply; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-07 21:29 UTC (permalink / raw)
  To: linux-pm; +Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek, Johannes Berg

On Monday, 7 May 2007 20:46, Alan Stern wrote:
> On Sun, 6 May 2007, David Brownell wrote:
> 
> > On Saturday 05 May 2007, Alan Stern wrote:
> > 
> > > But who says that hibernate has to use "Non-Volatile Sleep" and normal 
> > > shutdown has to use software-controlled "poweroff"?  Why shouldn't the 
> > > user be able to do it the other way 'round?
> > 
> > Well, the definition of NVS matches hibernation, and
> > the definition of soft-off matches poweroff.
> 
> Okay, I read sections 2.2 and 2.4 of the ACPI 3.0 spec.  Here's the story
> in a nutshell:
> 
> 	G3 = "mechanical off" = no wakeup devices are enabled,
> 				safe to disassemble
> 	G2/S5 = "soft off" = wakeup may be enabled, not safe to
> 				disassemble
> 	S4 = "non-volatile sleep" = hibernation, memory image is saved
> 	S5 = "soft off" = almost the same as S4 except there is no
> 				memory image
> 
> The spec does not explicitly associate S4 with either G2 or G3, and in
> fact it contains language suggesting very strongly that the system could
> be in either one.  The spec also uses the same name for G2 and for S5, no 
> doubt leading to extra levels of confusion.

Well, it's quite clearly stated in 4.5 and in 15 that S4 belongs to G1.
Moreover, it's reiterated several times in different places that
S5 Soft off = G2.

> So there's no question that S4 = NVS = hibernation.  But hibernation
> can involve either G2 or G3.

Not according to ACPI.

> And there's no question (in my mind at least) that normal shutdown should
> be able to involve either G2/S5 or G3.  So although the spec doesn't put 
> things quite this way, we could say:
> 
> 	hibernation = S4 = G2/S4 or G3/S4,
> 
> 	shutdown = S5 = G2/S5 or G3/S5.
> 
> Thus the choice between S4 vs. S5 is made at the very start, and steps 1-5 
> are executed only for S4.  The choice between G2 vs. G3 can be (and should 
> be!) deferred until steps 6-7.

The problem is that ACPI insists on treating S4 as a sleeping state.

Still, I agree that what we do in steps 1 - 5 should be independent of
whether or not we're going to enter S4.  Devices should not be
suspended before creating the image, because the system is not going to
enter any power state *at that time*.  There seems to be no reason whatsoever
for putting devices in low power states for creating the hibernation image.

> > > > That's a different suggestion, yes.  I'm not sure I see any
> > > > benefit of that flexibility for "soft off" states though,
> > > > especially if it made "off" consume more power.
> > > 
> > > The benefit is that it allows more devices to function as wakeup sources, 
> > > right?
> > 
> > With downsides of "more power consumed during 'off' states"
> > and "invalidating documentation, training, and expectations".
> 
> Okay, let's clear up the confusion.  The additional flexibility I'm 
> suggesting for "soft off" = G2 states is that we should allow both G2/S4 
> and G2/S5.  They would consume the same amount of power since they are 
> both G2 states; the difference is that G2/S4 involves saving and restoring 
> a memory image and G2/S5 does not.

There's nothing like G2/S4 in ACPI and we shouldn't refer to such a notion to
avoid confusion.

That's why I said that what we want to call 'hibernation' is and will probably
always be different from an ACPI transition to S4 (at least until we make a
bootloader capable of reading suspend images and ACPI-aware).

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...)
  2007-05-07 20:49                                                               ` Pavel Machek
@ 2007-05-07 21:38                                                                 ` Alan Stern
  2007-05-08  0:30                                                                   ` Pavel Machek
  0 siblings, 1 reply; 117+ messages in thread
From: Alan Stern @ 2007-05-07 21:38 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Nigel Cunningham, Pekka Enberg, Linux-pm mailing list,
	Johannes Berg

On Mon, 7 May 2007, Pavel Machek wrote:

> > 	(A) Remote wakeup may be enabled on some devices, there can be
> > 	    a certain power drain on the batteries or power line, it may 
> > 	    not be safe to disassemble the machine, etc.
> > 
> > 	(B) Remote wakeup is completely disabled, there is no power
> > 	    drain at all, it is safe to disassemble the machine provided
> > 	    you don't switch components like disks, etc.
> > 
> ...
> > 
> > I don't see any reason why (A) and (B) shouldn't both be allowed for 
> > hibernate, as in fact they are now by way of /sys/power/disk.  And I don't 
> > see any reason why they shouldn't both be allowed for normal non-hibernate 
> > shutdowns as well.
> 
> No, sorry, that does not work. Software can't select (A) vs. (B). Only
> user can, by physically switching real power switch, or by unplugging
> the machine.

Okay.  Then what exactly is the difference between the kind of poweroff we 
do during hibernate (say with "platform" in /sys/power/disk) and the kind 
of poweroff we do during a normal system shutdown?

> And yes, there's documentation about expectations of swsusp, in
> Doc*/power/swsusp.txt.

It says this near the start:

 * 		If you change
 * your hardware while system is suspended... well, it was not good idea;
 * but it will probably only crash.

with similar warnings elsewhere.

This appears to refer to confusion in the kernel after the image is 
restored; it doesn't seem to mean that you could damage equipment or 
electrocute yourself.

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...))
  2007-05-07 18:46                                                     ` Alan Stern
  2007-05-07 21:29                                                       ` Rafael J. Wysocki
@ 2007-05-07 21:43                                                       ` David Brownell
  2007-05-07 22:41                                                         ` Alan Stern
  1 sibling, 1 reply; 117+ messages in thread
From: David Brownell @ 2007-05-07 21:43 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux-pm mailing list, Pekka Enberg, Johannes Berg, Pavel Machek,
	Nigel Cunningham

On Monday 07 May 2007, Alan Stern wrote:
> On Sun, 6 May 2007, David Brownell wrote:
> 
> > On Saturday 05 May 2007, Alan Stern wrote:
> > 
> > > But who says that hibernate has to use "Non-Volatile Sleep" and normal 
> > > shutdown has to use software-controlled "poweroff"?  Why shouldn't the 
> > > user be able to do it the other way 'round?
> > 
> > Well, the definition of NVS matches hibernation, and
> > the definition of soft-off matches poweroff.
> 
> Okay, I read sections 2.2 and 2.4 of the ACPI 3.0 spec.  Here's the story
> in a nutshell:
> 
> 	G3 = "mechanical off" = no wakeup devices are enabled,
> 				safe to disassemble
> 	G2/S5 = "soft off" = wakeup may be enabled, not safe to
> 				disassemble
> 	S4 = "non-volatile sleep" = hibernation, memory image is saved
> 	S5 = "soft off" = almost the same as S4 except there is no
> 				memory image

This summary suggests there are two S5 states, which I believe
is incorrect.  G2 is just another name for S5.  See Fig 3-1;
the ACPI 2.0 spec has the same figure.

Also, section 2.2 highlights that after S5 the OS restarts,
which it doesn't do from S4 (table 2-1) ... although when it
describes S4/NVS it fuzzes that issue by saying the key issue
is whether an NVS state file is found and used, not the level
of power available.


> The spec does not explicitly associate S4 with either G2 or G3, and in
> fact it contains language suggesting very strongly that the system could
> be in either one.  The spec also uses the same name for G2 and for S5, no 
> doubt leading to extra levels of confusion.

Figure 3-1 seemed quite explicit to me ... S4 is one of the G1
states, S5 is the only G2 state, and G3 is is a different beast.
Text elsewhere agrees with that.

What's confusing is how it describes NVS/hibernate.  It's very
explicitly a G1 state.  But leaving G2 or G3 can also trigger
a resume-from-NVS ... according to the text in 2.2 but not the
state diagrams, which don't show entering G3 even cleanly, much
less uncleanly (like a neighborhood power failure).  Bleech.

I think the implication is that going to either G2 or G3 "off"
states discards something that a G1 state preserves.  But I'd
have to search more deeply to see if that's clearly defined.
It's suggestive that there are no "_S5D" or "_S5W" methods;
such wake events would evidently be managed by BIOS not OSPM.


> So there's no question that S4 = NVS = hibernation.  But hibernation
> can involve either G2 or G3.

I suspect there's a reason this part of ACPI is so vague;
it may relate to the desire to allow direct BIOS handling
of the NVS state.

 
> And there's no question (in my mind at least) that normal shutdown should
> be able to involve either G2/S5 or G3.

G2/S5, yes ... that can be entered under software control.

But by definition, not G3 since it requires a mechanical/manual
power switch update.  ("Mechanical OFF", or in the spec's example
"movement of a large red switch".)


> So although the spec doesn't put  
> things quite this way, we could say:
> 
> 	hibernation = S4 = G2/S4 or G3/S4,
> 
> 	shutdown = S5 = G2/S5 or G3/S5.

No, you're missing the key "mechanical" red-switch-ish step in G3.

G3 *can't* be entered under software control.  By definition.  It's
there for among other things regulatory reasons ... the only power
consumed in G3 is from the on-board RTC battery.


> > > > That's a different suggestion, yes.  I'm not sure I see any
> > > > benefit of that flexibility for "soft off" states though,
> > > > especially if it made "off" consume more power.
> > > 
> > > The benefit is that it allows more devices to function as wakeup sources, 
> > > right?
> > 
> > With downsides of "more power consumed during 'off' states"
> > and "invalidating documentation, training, and expectations".
> 
> Okay, let's clear up the confusion.  The additional flexibility I'm 
> suggesting for "soft off" = G2 states is that we should allow both G2/S4 
> and G2/S5.  They would consume the same amount of power since they are 
> both G2 states; the difference is that G2/S4 involves saving and restoring 
> a memory image and G2/S5 does not.

There is no G2/S4 state; it's G1/S4 or G2/S5.  And S5 does not
involve an NVS file, or it'd be S4.  The ACPI spec is sadly
vague in those areas, however.

- Dave

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...)
  2007-05-07 21:00                                                                       ` Rafael J. Wysocki
@ 2007-05-07 21:45                                                                         ` David Brownell
  2007-05-07 22:16                                                                           ` Rafael J. Wysocki
  0 siblings, 1 reply; 117+ messages in thread
From: David Brownell @ 2007-05-07 21:45 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek, Johannes Berg,
	Linux-pm mailing list

On Monday 07 May 2007, Rafael J. Wysocki wrote:
> On Monday, 7 May 2007 03:16, David Brownell wrote:

> > So for now I have drivers/acpi/sleep/main.c exporting
> > 
> >         s_state = acpi_get_target_sleep_state();
> > 
> > so that ACPI-aware code can know to call "_S3D" instead of
> > the "_S1D" or "_S4D" methods (and "_S3W" etc).  Of course
> > the $SUBJECT patch will finish borking that for S4.  :(
> 
> Why exactly?

Because it adds new code paths ... currently pm_ops methods
record the target state.  Fixable later.

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...)
  2007-05-07 21:45                                                                         ` David Brownell
@ 2007-05-07 22:16                                                                           ` Rafael J. Wysocki
  2007-05-09 19:23                                                                             ` David Brownell
  0 siblings, 1 reply; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-07 22:16 UTC (permalink / raw)
  To: David Brownell
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek, Johannes Berg,
	Linux-pm mailing list

On Monday, 7 May 2007 23:45, David Brownell wrote:
> On Monday 07 May 2007, Rafael J. Wysocki wrote:
> > On Monday, 7 May 2007 03:16, David Brownell wrote:
> 
> > > So for now I have drivers/acpi/sleep/main.c exporting
> > > 
> > >         s_state = acpi_get_target_sleep_state();
> > > 
> > > so that ACPI-aware code can know to call "_S3D" instead of
> > > the "_S1D" or "_S4D" methods (and "_S3W" etc).  Of course
> > > the $SUBJECT patch will finish borking that for S4.  :(
> > 
> > Why exactly?
> 
> Because it adds new code paths ... currently pm_ops methods
> record the target state.  Fixable later.

Hmm, I think hibernation_ops do the equivalent of what pm_ops did for
ACPI_STATE_S4 and the target state is still recorded (in
acpi_enter_sleep_state_prep()).  Isn't that correct?

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...))
  2007-05-07 21:29                                                       ` Rafael J. Wysocki
@ 2007-05-07 22:22                                                         ` Alan Stern
  2007-05-07 22:47                                                           ` Rafael J. Wysocki
  0 siblings, 1 reply; 117+ messages in thread
From: Alan Stern @ 2007-05-07 22:22 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek, linux-pm,
	Johannes Berg

On Mon, 7 May 2007, Rafael J. Wysocki wrote:

> > 	G3 = "mechanical off" = no wakeup devices are enabled,
> > 				safe to disassemble
> > 	G2/S5 = "soft off" = wakeup may be enabled, not safe to
> > 				disassemble
> > 	S4 = "non-volatile sleep" = hibernation, memory image is saved
> > 	S5 = "soft off" = almost the same as S4 except there is no
> > 				memory image
> > 
> > The spec does not explicitly associate S4 with either G2 or G3, and in
> > fact it contains language suggesting very strongly that the system could
> > be in either one.  The spec also uses the same name for G2 and for S5, no 
> > doubt leading to extra levels of confusion.
> 
> Well, it's quite clearly stated in 4.5 and in 15 that S4 belongs to G1.
> Moreover, it's reiterated several times in different places that
> S5 Soft off = G2.

More confusion in the spec...  It describes two different kinds of S4
states!

I was talking about "S4 Non-Volatile Sleep", defined on p.20 just above
Table 2-1.  The text says this:

	The machine will then enter the S4 state.  When the system
	leaves the Soft Off or Mechanical Off state,...

That's a pretty clear indication that S4-NVS involves G2 or G3.

You're talking about "S4 Sleeping State", defined on p.22, section 2.4.  
Evidently these two "S4" states are quite different.

> The problem is that ACPI insists on treating S4 as a sleeping state.

Section 2.4 is rather confusing.  What I gather is that S4 and S5 are 
essentially the same except for the presence or absence of a stored 
memory snapshot.  And yet S4 counts as a sleeping state while S5 doesn't.  
What's the explanation for that?

> Still, I agree that what we do in steps 1 - 5 should be independent of
> whether or not we're going to enter S4.  Devices should not be
> suspended before creating the image, because the system is not going to
> enter any power state *at that time*.  There seems to be no reason whatsoever
> for putting devices in low power states for creating the hibernation image.

Agreed.


> There's nothing like G2/S4 in ACPI and we shouldn't refer to such a notion to
> avoid confusion.

Except for the text on p.20.

> That's why I said that what we want to call 'hibernation' is and will probably
> always be different from an ACPI transition to S4 (at least until we make a
> bootloader capable of reading suspend images and ACPI-aware).

In what sense is the boot kernel different from a "bootloader"?  It 
certainly is capable of reading suspend images and is ACPI-aware.

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...))
  2007-05-07 21:43                                                       ` David Brownell
@ 2007-05-07 22:41                                                         ` Alan Stern
  0 siblings, 0 replies; 117+ messages in thread
From: Alan Stern @ 2007-05-07 22:41 UTC (permalink / raw)
  To: David Brownell
  Cc: Linux-pm mailing list, Pekka Enberg, Johannes Berg, Pavel Machek,
	Nigel Cunningham

On Mon, 7 May 2007, David Brownell wrote:

> This summary suggests there are two S5 states, which I believe
> is incorrect.  G2 is just another name for S5.  See Fig 3-1;
> the ACPI 2.0 spec has the same figure.
> 
> Also, section 2.2 highlights that after S5 the OS restarts,
> which it doesn't do from S4 (table 2-1) ... although when it
> describes S4/NVS it fuzzes that issue by saying the key issue
> is whether an NVS state file is found and used, not the level
> of power available.

It also says that the NVS state file is found and used when the system
leaves the Soft Off (G2) or Mechanical Off (G3) state.  How did it enter
either of those states in the first place if S4-NVS is a Sleeping (G1)
state?

I imagine that business about the OS not restarting from S4-NVS is 
intended to mean the OS continues from the restored image rather than 
starting over completely fresh.

> Figure 3-1 seemed quite explicit to me ... S4 is one of the G1
> states, S5 is the only G2 state, and G3 is is a different beast.
> Text elsewhere agrees with that.

Yes, okay.

> What's confusing is how it describes NVS/hibernate.  It's very
> explicitly a G1 state.  But leaving G2 or G3 can also trigger
> a resume-from-NVS ... according to the text in 2.2 but not the
> state diagrams, which don't show entering G3 even cleanly, much
> less uncleanly (like a neighborhood power failure).  Bleech.

You can understand my confusion...

> I think the implication is that going to either G2 or G3 "off"
> states discards something that a G1 state preserves.  But I'd
> have to search more deeply to see if that's clearly defined.

Or what it is that gets discarded.  Especially since 2.4 lists only one 
difference between S5 and S4: whether or not there is a saved image.

> I suspect there's a reason this part of ACPI is so vague;
> it may relate to the desire to allow direct BIOS handling
> of the NVS state.

Could be.  I wish the spec was more upfront about its vagueness,
explaining what has been left out and why instead of just skipping over
some things and contradicting itself.

> G2/S5, yes ... that can be entered under software control.
> 
> But by definition, not G3 since it requires a mechanical/manual
> power switch update.  ("Mechanical OFF", or in the spec's example
> "movement of a large red switch".)

Okay, I understand that now.

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...))
  2007-05-07 22:22                                                         ` Alan Stern
@ 2007-05-07 22:47                                                           ` Rafael J. Wysocki
  2007-05-08 14:56                                                             ` Alan Stern
  0 siblings, 1 reply; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-07 22:47 UTC (permalink / raw)
  To: Alan Stern
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek, linux-pm,
	Johannes Berg

On Tuesday, 8 May 2007 00:22, Alan Stern wrote:
> On Mon, 7 May 2007, Rafael J. Wysocki wrote:
> 
> > > 	G3 = "mechanical off" = no wakeup devices are enabled,
> > > 				safe to disassemble
> > > 	G2/S5 = "soft off" = wakeup may be enabled, not safe to
> > > 				disassemble
> > > 	S4 = "non-volatile sleep" = hibernation, memory image is saved
> > > 	S5 = "soft off" = almost the same as S4 except there is no
> > > 				memory image
> > > 
> > > The spec does not explicitly associate S4 with either G2 or G3, and in
> > > fact it contains language suggesting very strongly that the system could
> > > be in either one.  The spec also uses the same name for G2 and for S5, no 
> > > doubt leading to extra levels of confusion.
> > 
> > Well, it's quite clearly stated in 4.5 and in 15 that S4 belongs to G1.
> > Moreover, it's reiterated several times in different places that
> > S5 Soft off = G2.
> 
> More confusion in the spec...  It describes two different kinds of S4
> states!
> 
> I was talking about "S4 Non-Volatile Sleep", defined on p.20 just above
> Table 2-1.  The text says this:
> 
> 	The machine will then enter the S4 state.  When the system
> 	leaves the Soft Off or Mechanical Off state,...
> 
> That's a pretty clear indication that S4-NVS involves G2 or G3.
> 
> You're talking about "S4 Sleeping State", defined on p.22, section 2.4.  
> Evidently these two "S4" states are quite different.
> 
> > The problem is that ACPI insists on treating S4 as a sleeping state.
> 
> Section 2.4 is rather confusing.  What I gather is that S4 and S5 are 
> essentially the same except for the presence or absence of a stored 
> memory snapshot.  And yet S4 counts as a sleeping state while S5 doesn't.  
> What's the explanation for that?

As far as I understand it, for S4 the platform provides a means for verifying
if the hardware wasn't changed too much while the system was "sleeping" (via
the NVS memory region).

> > Still, I agree that what we do in steps 1 - 5 should be independent of
> > whether or not we're going to enter S4.  Devices should not be
> > suspended before creating the image, because the system is not going to
> > enter any power state *at that time*.  There seems to be no reason whatsoever
> > for putting devices in low power states for creating the hibernation image.
> 
> Agreed.
> 
> 
> > There's nothing like G2/S4 in ACPI and we shouldn't refer to such a notion to
> > avoid confusion.
> 
> Except for the text on p.20.

Yes, this is very confusing.  I think what they wanted to say there is that the
image restore could in principle happen when the system is started after being
in a "power off" state.  In that case, however, it wouldn't be known if it's
safe to restore the image and continue, because the hardware might have
changed.  For this reason, a special "sleeping" state is needed such that when
leaving it, the PM software can detect any (substantial) hardware changes
before even loading the entire image.

> > That's why I said that what we want to call 'hibernation' is and will probably
> > always be different from an ACPI transition to S4 (at least until we make a
> > bootloader capable of reading suspend images and ACPI-aware).
> 
> In what sense is the boot kernel different from a "bootloader"?  It 
> certainly is capable of reading suspend images and is ACPI-aware.

The boot loader uses the BIOS to read from disks and it can avoid initializing
ACPI.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...)
  2007-05-07 21:38                                                                 ` Alan Stern
@ 2007-05-08  0:30                                                                   ` Pavel Machek
  0 siblings, 0 replies; 117+ messages in thread
From: Pavel Machek @ 2007-05-08  0:30 UTC (permalink / raw)
  To: Alan Stern
  Cc: Nigel Cunningham, Pekka Enberg, Linux-pm mailing list,
	Johannes Berg

Hi!

> It says this near the start:
> 
>  * 		If you change
>  * your hardware while system is suspended... well, it was not good idea;
>  * but it will probably only crash.
> 
> with similar warnings elsewhere.
> 
> This appears to refer to confusion in the kernel after the image is 
> restored; it doesn't seem to mean that you could damage equipment or 
> electrocute yourself.

For electrocuting, see product manual :-). Basically, you have to
unplug PC from AC power physically in order to open it. shutdown -h
now is _not_ enough. For notebooks, remove battery, too.
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ..)
  2007-05-07  1:37                                                                       ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: ..) David Brownell
@ 2007-05-08  2:57                                                                         ` Greg KH
  0 siblings, 0 replies; 117+ messages in thread
From: Greg KH @ 2007-05-08  2:57 UTC (permalink / raw)
  To: David Brownell
  Cc: Pekka Enberg, linux-pm, Nigel Cunningham, Johannes Berg,
	Pavel Machek

On Sun, May 06, 2007 at 06:37:36PM -0700, David Brownell wrote:
> On Sunday 06 May 2007, Alan Stern wrote:
> > It sounds good to me.  Now if only it were possible to get rid of those
> > pesky sysdevs...
> 
> Other than lack of patches ... is there a reason??
> I thought that sysdevs were no longer needed.

I would love to get rid of them, patches gladly accepted :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...))
  2007-05-07 22:47                                                           ` Rafael J. Wysocki
@ 2007-05-08 14:56                                                             ` Alan Stern
  2007-05-08 19:59                                                               ` Rafael J. Wysocki
                                                                                 ` (2 more replies)
  0 siblings, 3 replies; 117+ messages in thread
From: Alan Stern @ 2007-05-08 14:56 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek, linux-pm,
	Johannes Berg

On Tue, 8 May 2007, Rafael J. Wysocki wrote:

> As far as I understand it, for S4 the platform provides a means for verifying
> if the hardware wasn't changed too much while the system was "sleeping" (via
> the NVS memory region).

Rereading p.20, it appears to go the other way: The system checks for
hardware changes when booting from Soft Off.  Or perhaps it always checks.  
I guess there aren't supposed to be any hardware changes while in S4,
since then it's not safe to disassemble the machine.

Sounds a lot like USB's power sessions...

> Yes, this is very confusing.  I think what they wanted to say there is that the
> image restore could in principle happen when the system is started after being
> in a "power off" state.  In that case, however, it wouldn't be known if it's
> safe to restore the image and continue, because the hardware might have
> changed.  For this reason, a special "sleeping" state is needed such that when
> leaving it, the PM software can detect any (substantial) hardware changes
> before even loading the entire image.

And apparently the bootloader is not expected to restore the memory image
if the hardware has changed too much.

So here's the current state of my understanding of ACPI:

	S4 is the lowest-power Sleep state.  RAM is not powered, the OS
	has stored a non-volatile memory image somewhere, and some ACPI
	state is maintained.

	S5 is misnamed, in that it isn't really a Sleep state at all --
	it's an Off state.  In fact, it is the state the computer enters
	when you first plug it in (or insert the battery).

If the OS stores a memory image and then switches to S5, at reboot the
bootloader will probably try to restore it.  (That's what p.20 says.)  
And if the user unplugs the computer (removes the battery) while it is in
S4, then upon replugging the computer will enter S5.  Thus, when waking
from either S4 or S5 the bootloader will try to restore an image if one
can be found (and if the hardware hasn't changed too much and if the user
doesn't abort the restore).

I've never encountered any documentation saying that you shouldn't unplug 
the computer while it's in hibernation.  It doesn't look like you would 
lose much by doing so, except that perhaps not as many wakeup devices are 
functional in S5 as in S4.

Now as for how all this relates to Linux:

What we do for hibernation is not an exact match for either S4 or S5.  It
may be closest to S4, but we don't use a bootloader.  Instead the boot
kernel does some sort of ACPI reset and restores the memory image all by
itself.  Whatever ACPI state information may be saved in the image is not
accessible to the boot kernel.  Conversely, the information about whether
we booted from S4 or from S5 is lost when the image overwrites the boot
kernel.

As a result, hibernation is capable of using either S4 or S5 -- as it must
be, since the user could always unplug the computer while it's in S4 --
although perhaps when using S4 it manages to confuse ACPI somewhat through
not matching the spec's expectations.

What do the differences between S4 and S5 amount to?  As far as I can 
tell, they look like this:

	ACPI expects there to be a memory image in S4.  In S5 there
	may or may not be an image.

	ACPI expects that when resuming from S4, the kernel will
	continue using some preserved ACPI state.  It expects that 
	when starting from S5, the kernel will need to reinitialize
	pretty much all the ACPI state.

	S4 involves a larger power consumption and may allow for
	more wakeup devices than S5.

And how do these relate to Linux?

	In fact, ACPI has no way of knowing whether or not there is an
	image.  The kernel is perfectly free to do whatever it wants.

	The boot kernel can't make much use of the state preserved by
	ACPI because it doesn't have access to the image kernel's
	records.  It needs to reinitialize ACPI no matter what.
	Consequently the restored kernel cannot use any preserved ACPI
	state, since this state gets wiped out by the boot kernel.
	Information about hardware changes might be available to the
	boot kernel, which could in principle then decide not to restore 
	the image.  It's not clear that this would be a good idea.  In
	any case, ACPI is limited to knowledge about devices on the
	motherboard -- it knows nothing about hotplugged devices, which
	makes the information less useful.

	Hibernation allows the user to choose whether to go to S4 or S5
	by means of /sys/power/disk.  Therefore the user gets to decide
	how the power-consumption vs. wakeup-functionality tradeoff
	should be made.

In short, the boot kernel should do whatever it needs to in order to make
ACPI happy.  This might involve telling ACPI that it has successfully
resumed from S4, even though the boot kernel is unaware of system state at
the start of hibernation.  In fact, the boot kernel has to take care of
all this before it even knows whether a valid image exists in the swap
partition.

Putting this together, it says that there should be no impediment to doing
a fresh boot from S4; i.e., not restoring a memory image but simply
letting the boot kernel continue on with a normal startup.  The corollary
is that there should be no impediment to entering S4 during a normal
shutdown.

>From the user's point of view, the differences between S4 and S5 amount to
just these: power consumption and availability of wakeup devices.  
(Perhaps also the presence of a blinking LED -- but in my experience the
blinking LED indicates STR, not hibernation.)  In the end, this is nothing 
more than the usual tradeoff between power usage and functionality.

We give the user a chance to decide how this tradeoff should go when 
entering hibernation.  Why not also give the user a chance to decide the 
tradeoff during normal shutdown?

Yes, it violates the spec in the sense that we would be entering S4 
without saving a memory image.  But we _already_ violate the spec by not 
using a bootloader to restore the image.  I don't see this as being any 
worse.


Finally, what about non-ACPI systems?  Basically this boils down to two 
choices:

	Should a memory image be stored?

	How much power/wakeup-functionality should the system
	consume/provide while it is down?

The first choice is decided by the user, by either entering hibernation or 
shutting down.  Why shouldn't the second also be decided by the user?

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...))
  2007-05-08 14:56                                                             ` Alan Stern
@ 2007-05-08 19:59                                                               ` Rafael J. Wysocki
  2007-05-08 21:26                                                                 ` Alan Stern
  2007-05-09  8:17                                                               ` Pavel Machek
  2007-05-09 19:35                                                               ` David Brownell
  2 siblings, 1 reply; 117+ messages in thread
From: Rafael J. Wysocki @ 2007-05-08 19:59 UTC (permalink / raw)
  To: Alan Stern
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek, linux-pm,
	Johannes Berg

On Tuesday, 8 May 2007 16:56, Alan Stern wrote:
> On Tue, 8 May 2007, Rafael J. Wysocki wrote:
> 
> > As far as I understand it, for S4 the platform provides a means for verifying
> > if the hardware wasn't changed too much while the system was "sleeping" (via
> > the NVS memory region).
> 
> Rereading p.20, it appears to go the other way: The system checks for
> hardware changes when booting from Soft Off.

Nope.  That's clarified later on.  Please read Section 15, "Waking and
Sleeping" (it's short ;-)), in particular 15.3.3.

> Or perhaps it always checks.  I guess there aren't supposed to be any
> hardware changes while in S4, since then it's not safe to disassemble the
> machine.

That's correct, and that's why the hardware signature in FACS is needed for S4
(according to the spec), while it's not needed for the wake up from "Soft Off"
(S5).

> Sounds a lot like USB's power sessions...

Well, not exactly that.  The hardware signature in FACS only covers some
"essential" hardware (I'm not sure what that is, probably depends on the
platform design).

> > Yes, this is very confusing.  I think what they wanted to say there is that the
> > image restore could in principle happen when the system is started after being
> > in a "power off" state.  In that case, however, it wouldn't be known if it's
> > safe to restore the image and continue, because the hardware might have
> > changed.  For this reason, a special "sleeping" state is needed such that when
> > leaving it, the PM software can detect any (substantial) hardware changes
> > before even loading the entire image.
> 
> And apparently the bootloader is not expected to restore the memory image
> if the hardware has changed too much.

Yes.

> So here's the current state of my understanding of ACPI:
> 
> 	S4 is the lowest-power Sleep state.  RAM is not powered, the OS
> 	has stored a non-volatile memory image somewhere, and some ACPI
> 	state is maintained.

That's correct, AFAICS.

> 	S5 is misnamed, in that it isn't really a Sleep state at all --
> 	it's an Off state.  In fact, it is the state the computer enters
> 	when you first plug it in (or insert the battery).

Yes.

> If the OS stores a memory image and then switches to S5, at reboot the
> bootloader will probably try to restore it.  (That's what p.20 says.)  

That may happen.  The bootloader will probably check if there's the image
and if it's there, it will try compare the hardware signature in the image with
the one in FACS.  If the test is passed, it will attempt to restore the image
(this is illustrated in the picture in 15.3.3, BTW).

> And if the user unplugs the computer (removes the battery) while it is in
> S4, then upon replugging the computer will enter S5.  Thus, when waking
> from either S4 or S5 the bootloader will try to restore an image if one
> can be found (and if the hardware hasn't changed too much and if the user
> doesn't abort the restore).

That's correct.

> I've never encountered any documentation saying that you shouldn't unplug 
> the computer while it's in hibernation.  It doesn't look like you would 
> lose much by doing so, except that perhaps not as many wakeup devices are 
> functional in S5 as in S4.
> 
> Now as for how all this relates to Linux:
> 
> What we do for hibernation is not an exact match for either S4 or S5.  It
> may be closest to S4, but we don't use a bootloader.  Instead the boot
> kernel does some sort of ACPI reset and restores the memory image all by
> itself.  Whatever ACPI state information may be saved in the image is not
> accessible to the boot kernel.

In principle, it could be, but we don't use it in the boot kernel.

> Conversely, the information about whether we booted from S4 or from S5
> is lost when the image overwrites the boot kernel.

Yes.

> As a result, hibernation is capable of using either S4 or S5 -- as it must
> be, since the user could always unplug the computer while it's in S4 --
> although perhaps when using S4 it manages to confuse ACPI somewhat through
> not matching the spec's expectations.
> 
> What do the differences between S4 and S5 amount to?  As far as I can 
> tell, they look like this:
> 
> 	ACPI expects there to be a memory image in S4.  In S5 there
> 	may or may not be an image.
> 
> 	ACPI expects that when resuming from S4, the kernel will
> 	continue using some preserved ACPI state.  It expects that 
> 	when starting from S5, the kernel will need to reinitialize
> 	pretty much all the ACPI state.
> 
> 	S4 involves a larger power consumption and may allow for
> 	more wakeup devices than S5.
> 
> And how do these relate to Linux?
> 
> 	In fact, ACPI has no way of knowing whether or not there is an
> 	image.  The kernel is perfectly free to do whatever it wants.
> 
> 	The boot kernel can't make much use of the state preserved by
> 	ACPI because it doesn't have access to the image kernel's
> 	records.  It needs to reinitialize ACPI no matter what.

To be precise, it usually needs to initialize ACPI to read the image (drivers
use ACPI to some extent).  In principle we could make it behave as though
ACPI were not compiled in and read the image while being in that state.
Then, it could use the ACPI state information contained in the image
(it would have to be pointed to by the image header, but that's easy).

> 	Consequently the restored kernel cannot use any preserved ACPI
> 	state, since this state gets wiped out by the boot kernel.
> 	Information about hardware changes might be available to the
> 	boot kernel, which could in principle then decide not to restore 
> 	the image.  It's not clear that this would be a good idea.  In
> 	any case, ACPI is limited to knowledge about devices on the
> 	motherboard -- it knows nothing about hotplugged devices, which
> 	makes the information less useful.
> 
> 	Hibernation allows the user to choose whether to go to S4 or S5
> 	by means of /sys/power/disk.  Therefore the user gets to decide
> 	how the power-consumption vs. wakeup-functionality tradeoff
> 	should be made.
> 
> In short, the boot kernel should do whatever it needs to in order to make
> ACPI happy.  This might involve telling ACPI that it has successfully
> resumed from S4, even though the boot kernel is unaware of system state at
> the start of hibernation.  In fact, the boot kernel has to take care of
> all this before it even knows whether a valid image exists in the swap
> partition.
> 
> Putting this together, it says that there should be no impediment to doing
> a fresh boot from S4; i.e., not restoring a memory image but simply
> letting the boot kernel continue on with a normal startup.  The corollary
> is that there should be no impediment to entering S4 during a normal
> shutdown.
> 
> From the user's point of view, the differences between S4 and S5 amount to
> just these: power consumption and availability of wakeup devices.  
> (Perhaps also the presence of a blinking LED -- but in my experience the
> blinking LED indicates STR, not hibernation.)  In the end, this is nothing 
> more than the usual tradeoff between power usage and functionality.
> 
> We give the user a chance to decide how this tradeoff should go when 
> entering hibernation.  Why not also give the user a chance to decide the 
> tradeoff during normal shutdown?
> 
> Yes, it violates the spec in the sense that we would be entering S4 
> without saving a memory image.  But we _already_ violate the spec by not 
> using a bootloader to restore the image.  I don't see this as being any 
> worse.
> 
> 
> Finally, what about non-ACPI systems?  Basically this boils down to two 
> choices:
> 
> 	Should a memory image be stored?
> 
> 	How much power/wakeup-functionality should the system
> 	consume/provide while it is down?
> 
> The first choice is decided by the user, by either entering hibernation or 
> shutting down.  Why shouldn't the second also be decided by the user?

I generally agree.

Moreover, it doesn't seem to be necessary to assume that the image should
be created and saved *after* we've put devices into low power states and
prepared ACPI for the power transition.  I think it's equally possible to
create and save the image *before* the power transition is initiated.

Greetings,
Rafael


> 
> Alan Stern
> 
> 
> 

-- 
If you don't have the time to read,
you don't have the time or the tools to write.
		- Stephen King

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...))
  2007-05-08 19:59                                                               ` Rafael J. Wysocki
@ 2007-05-08 21:26                                                                 ` Alan Stern
  0 siblings, 0 replies; 117+ messages in thread
From: Alan Stern @ 2007-05-08 21:26 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek, linux-pm,
	Johannes Berg

On Tue, 8 May 2007, Rafael J. Wysocki wrote:

> On Tuesday, 8 May 2007 16:56, Alan Stern wrote:
> > On Tue, 8 May 2007, Rafael J. Wysocki wrote:
> > 
> > > As far as I understand it, for S4 the platform provides a means for verifying
> > > if the hardware wasn't changed too much while the system was "sleeping" (via
> > > the NVS memory region).
> > 
> > Rereading p.20, it appears to go the other way: The system checks for
> > hardware changes when booting from Soft Off.
> 
> Nope.  That's clarified later on.  Please read Section 15, "Waking and
> Sleeping" (it's short ;-)), in particular 15.3.3.

You're right.  It says specifically that when booting from an S4 state,
the bootloader compares the signature in the NVS image with hardware
signature in the BIOS's FACS table.  (Although Figure 15-5 makes no 
mention of different pathways for S4 and S5.)

Does the Linux boot kernel actually do the comparison?

Chapter 15 doesn't seem to take into account the possibility that the
computer might be unplugged after entering S4.  It talks about the next 
wakeup being a wake from S4 -- although the actions of the BIOS are 
supposed to be the same when waking from S4 or booting from S5.  In either 
case the BIOS runs the POST and initializes the ACPI tables.  Only the 
actions of the bootloader are different.

So how is the bootloader supposed to know whether it is booting from S4 or
S5?  Does it just assume that the presence of a valid NVS image indicates
an S4 boot, even though it may really be booting from S5?

> > Sounds a lot like USB's power sessions...
> 
> Well, not exactly that.  The hardware signature in FACS only covers some
> "essential" hardware (I'm not sure what that is, probably depends on the
> platform design).

15.1.4.1 says:

	A change in hardware configuration is defined to be any change in
	the platform hardware that would cause the platform to fail when
	trying to restore the S4 context; this hardware is normally
	limited to boot devices.  For example, changing the graphics
	adapter or hard disk controller while in the S4 state should cause
	the hardware signature to change.  On the other hand, removing or
	adding a PC Card device from a PC Card slot should not cause the
	hardware signature to change.

Take it for what it's worth.


> > 	The boot kernel can't make much use of the state preserved by
> > 	ACPI because it doesn't have access to the image kernel's
> > 	records.  It needs to reinitialize ACPI no matter what.
> 
> To be precise, it usually needs to initialize ACPI to read the image (drivers
> use ACPI to some extent).  In principle we could make it behave as though
> ACPI were not compiled in and read the image while being in that state.
> Then, it could use the ACPI state information contained in the image
> (it would have to be pointed to by the image header, but that's easy).

In other words, make the boot kernel act as a bootloader.

Isn't this likely to cause problems?  There must be plenty of systems that
won't work properly without ACPI.  Certainly there are reported cases of
IRQ routing being wrong (and also cases where it is wrong only when ACPI
_is_ in use).


> I generally agree.
> 
> Moreover, it doesn't seem to be necessary to assume that the image should
> be created and saved *after* we've put devices into low power states and
> prepared ACPI for the power transition.  I think it's equally possible to
> create and save the image *before* the power transition is initiated.

Possible and desirable, both.

Okay, so the two of us are in agreement.  I don't know about anyone else, 
though...  :-)

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...))
  2007-05-08 14:56                                                             ` Alan Stern
  2007-05-08 19:59                                                               ` Rafael J. Wysocki
@ 2007-05-09  8:17                                                               ` Pavel Machek
  2007-05-09 15:21                                                                 ` Alan Stern
  2007-05-09 19:35                                                               ` David Brownell
  2 siblings, 1 reply; 117+ messages in thread
From: Pavel Machek @ 2007-05-09  8:17 UTC (permalink / raw)
  To: Alan Stern; +Cc: Nigel Cunningham, Pekka Enberg, linux-pm, Johannes Berg

Hi!

> We give the user a chance to decide how this tradeoff should go when 
> entering hibernation.  Why not also give the user a chance to decide the 
> tradeoff during normal shutdown?
> 
> Yes, it violates the spec in the sense that we would be entering S4 
> without saving a memory image.  

I think you already replied to yourself :-).

There are more reasons, like we getting useless code paths to
debug. So far you demonstrated that S4-on-shutdown is probably
possible, and while violating specs, it should probably work.

What do you expect now? Me jumping with joy and implementing
S4-on-shutdown because it should be possible?

Now... if you feel very strongly about S4-on-shutdown, you may try to
create a patch. If it is not-too-ugly, and if it is really good for
something, we may merge it.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...))
  2007-05-09  8:17                                                               ` Pavel Machek
@ 2007-05-09 15:21                                                                 ` Alan Stern
  0 siblings, 0 replies; 117+ messages in thread
From: Alan Stern @ 2007-05-09 15:21 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Nigel Cunningham, Pekka Enberg, linux-pm, Johannes Berg

On Wed, 9 May 2007, Pavel Machek wrote:

> Hi!
> 
> > We give the user a chance to decide how this tradeoff should go when 
> > entering hibernation.  Why not also give the user a chance to decide the 
> > tradeoff during normal shutdown?
> > 
> > Yes, it violates the spec in the sense that we would be entering S4 
> > without saving a memory image.  
> 
> I think you already replied to yourself :-).

Yes -- but going to S5 during hibernation (which is what "echo shutdown 
>/sys/power/disk" does, right?) also violates the spec.  So I don't feel 
too guilty about this.

> There are more reasons, like we getting useless code paths to
> debug. So far you demonstrated that S4-on-shutdown is probably
> possible, and while violating specs, it should probably work.
> 
> What do you expect now? Me jumping with joy and implementing
> S4-on-shutdown because it should be possible?

Actually all I wanted was someone to look over my reasoning and check that 
it was correct.  You and Raphael have now done so, thank you.

And when I first began contributing to this thread, the main purpose was 
to point out that hibernation_ops (or anything else related to the 
shutdown method) should not be involved in the steps responsible for 
creating and storing the snapshot image.

> Now... if you feel very strongly about S4-on-shutdown, you may try to
> create a patch. If it is not-too-ugly, and if it is really good for
> something, we may merge it.

At some time I might just do that...

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...)
  2007-05-07 22:16                                                                           ` Rafael J. Wysocki
@ 2007-05-09 19:23                                                                             ` David Brownell
  0 siblings, 0 replies; 117+ messages in thread
From: David Brownell @ 2007-05-09 19:23 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek, Johannes Berg,
	Linux-pm mailing list

On Monday 07 May 2007, Rafael J. Wysocki wrote:
> On Monday, 7 May 2007 23:45, David Brownell wrote:
> > On Monday 07 May 2007, Rafael J. Wysocki wrote:
> > > On Monday, 7 May 2007 03:16, David Brownell wrote:
> > 
> > > > So for now I have drivers/acpi/sleep/main.c exporting
> > > > 
> > > >         s_state = acpi_get_target_sleep_state();
> > > > 
> > > > so that ACPI-aware code can know to call "_S3D" instead of
> > > > the "_S1D" or "_S4D" methods (and "_S3W" etc).  Of course
> > > > the $SUBJECT patch will finish borking that for S4.  :(
> > > 
> > > Why exactly?
> > 
> > Because it adds new code paths ... currently pm_ops methods
> > record the target state.  Fixable later.
> 
> Hmm, I think hibernation_ops do the equivalent of what pm_ops did for
> ACPI_STATE_S4 and the target state is still recorded (in
> acpi_enter_sleep_state_prep()).  Isn't that correct?

I didn't use that method, because of information hiding.

See the patch I just posted.

- Dave

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...))
  2007-05-08 14:56                                                             ` Alan Stern
  2007-05-08 19:59                                                               ` Rafael J. Wysocki
  2007-05-09  8:17                                                               ` Pavel Machek
@ 2007-05-09 19:35                                                               ` David Brownell
  2007-05-09 20:04                                                                 ` Alan Stern
  2 siblings, 1 reply; 117+ messages in thread
From: David Brownell @ 2007-05-09 19:35 UTC (permalink / raw)
  To: Alan Stern
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek, linux-pm,
	Johannes Berg

On Tuesday 08 May 2007, Alan Stern wrote:

> So here's the current state of my understanding of ACPI:
> 
> 	S4 is the lowest-power Sleep state.  RAM is not powered, the OS
> 	has stored a non-volatile memory image somewhere, and some ACPI
> 	state is maintained.
> 
> 	S5 is misnamed, in that it isn't really a Sleep state at all --
> 	it's an Off state. 

It's called "Soft Off" ... :)

The reason it resembles a sleep state is that various events other
than power switches are allowed to wake systems in S5.  RTC alarms
and keyboard events come to mind as common examples.

Agreed that the distinction between S4 and S5 seems too much in the
category of "because we said so!" than because of real technical
differences (beyond presence/absence of a non-volatile image, and
a few additional wakeup event sources).


> 			In fact, it is the state the computer enters 
> 	when you first plug it in (or insert the battery).

No; again, you're missing the entire point of G3 "mechanical off".

When you first plug it in, it's going to be in G3.  Then you turn
on the power switch.  Then you press the "on/off" button.

>From then on you can use only the "on/off" button, but the system
is vampiric ... when off/dead, it can choose to come alive, and is
always sucking power/blood at a low level.

But the "large red switch" option is available to put the system
into G3 ... driving a bloody stake through its heart, so it can't
re-activate itself at midnight, and preventing constant power drain.


> From the user's point of view, the differences between S4 and S5 amount to
> just these: power consumption and availability of wakeup devices.

And the fact that in S4 there's always a resumable OS image.

- Dave

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...))
  2007-05-09 19:35                                                               ` David Brownell
@ 2007-05-09 20:04                                                                 ` Alan Stern
  2007-05-09 20:21                                                                   ` David Brownell
  2007-05-09 21:07                                                                   ` Pavel Machek
  0 siblings, 2 replies; 117+ messages in thread
From: Alan Stern @ 2007-05-09 20:04 UTC (permalink / raw)
  To: David Brownell
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek, linux-pm,
	Johannes Berg

On Wed, 9 May 2007, David Brownell wrote:

> > 			In fact, it is the state the computer enters 
> > 	when you first plug it in (or insert the battery).
> 
> No; again, you're missing the entire point of G3 "mechanical off".
> 
> When you first plug it in, it's going to be in G3.  Then you turn
> on the power switch.  Then you press the "on/off" button.
> 
> From then on you can use only the "on/off" button, but the system
> is vampiric ... when off/dead, it can choose to come alive, and is
> always sucking power/blood at a low level.
> 
> But the "large red switch" option is available to put the system
> into G3 ... driving a bloody stake through its heart, so it can't
> re-activate itself at midnight, and preventing constant power drain.

Sorry.  What I meant to say was that S5 is the state the computer enters 
when you first plug it in and turn on the power switch -- before you press 
the on/off button.

> > From the user's point of view, the differences between S4 and S5 amount to
> > just these: power consumption and availability of wakeup devices.
> 
> And the fact that in S4 there's always a resumable OS image.

Are you sure?  What happens if the OSPM writes a defective, non-resumable 
OS image and then goes into S4?

What happens if the OS writes a resumable OS image and goes into S4, and 
then the user unplugs the computer, plugs it back in, and turns the power 
switch on?  At that point the system must be in S5 (by definition), but 
there's still a resumable image.

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...))
  2007-05-09 20:04                                                                 ` Alan Stern
@ 2007-05-09 20:21                                                                   ` David Brownell
  2007-05-10 15:17                                                                     ` Alan Stern
  2007-05-09 21:07                                                                   ` Pavel Machek
  1 sibling, 1 reply; 117+ messages in thread
From: David Brownell @ 2007-05-09 20:21 UTC (permalink / raw)
  To: Alan Stern
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek, linux-pm,
	Johannes Berg

> > > From the user's point of view, the differences between S4 and S5 amount to
> > > just these: power consumption and availability of wakeup devices.
> > 
> > And the fact that in S4 there's always a resumable OS image.
> 
> Are you sure?  What happens if the OSPM writes a defective, non-resumable 
> OS image and then goes into S4?

The ACPI spec omits all such error transitions.  As well as
a fair number of non-error ones ... like how to enter G3.


> What happens if the OS writes a resumable OS image and goes into S4, and 
> then the user unplugs the computer, plugs it back in, and turns the power 
> switch on?  At that point the system must be in S5 (by definition), but 
> there's still a resumable image.

As allowed by the chapter 2 text I pointed out earlier.
S4 *always* has such an image.

- Dave

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...))
  2007-05-09 20:04                                                                 ` Alan Stern
  2007-05-09 20:21                                                                   ` David Brownell
@ 2007-05-09 21:07                                                                   ` Pavel Machek
  1 sibling, 0 replies; 117+ messages in thread
From: Pavel Machek @ 2007-05-09 21:07 UTC (permalink / raw)
  To: Alan Stern; +Cc: Nigel Cunningham, Pekka Enberg, linux-pm, Johannes Berg

Hi!

> > > 			In fact, it is the state the computer enters 
> > > 	when you first plug it in (or insert the battery).
> > 
> > No; again, you're missing the entire point of G3 "mechanical off".
> > 
> > When you first plug it in, it's going to be in G3.  Then you turn
> > on the power switch.  Then you press the "on/off" button.
> > 
> > From then on you can use only the "on/off" button, but the system
> > is vampiric ... when off/dead, it can choose to come alive, and is
> > always sucking power/blood at a low level.
> > 
> > But the "large red switch" option is available to put the system
> > into G3 ... driving a bloody stake through its heart, so it can't
> > re-activate itself at midnight, and preventing constant power drain.
> 
> Sorry.  What I meant to say was that S5 is the state the computer enters 
> when you first plug it in and turn on the power switch -- before you press 
> the on/off button.

Actually... some machines just power on when you first plug them in,
and some other have it configurable in BIOS.

For server, you want it to power up after power fail.

For home desktop, you definitely want it to stay powered off after
power fail.

								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 117+ messages in thread

* Re: Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...))
  2007-05-09 20:21                                                                   ` David Brownell
@ 2007-05-10 15:17                                                                     ` Alan Stern
  0 siblings, 0 replies; 117+ messages in thread
From: Alan Stern @ 2007-05-10 15:17 UTC (permalink / raw)
  To: David Brownell
  Cc: Nigel Cunningham, Pekka Enberg, Pavel Machek, linux-pm,
	Johannes Berg

On Wed, 9 May 2007, David Brownell wrote:

> > > > From the user's point of view, the differences between S4 and S5 amount to
> > > > just these: power consumption and availability of wakeup devices.
> > > 
> > > And the fact that in S4 there's always a resumable OS image.

> > What happens if the OS writes a resumable OS image and goes into S4, and 
> > then the user unplugs the computer, plugs it back in, and turns the power 
> > switch on?  At that point the system must be in S5 (by definition), but 
> > there's still a resumable image.
> 
> As allowed by the chapter 2 text I pointed out earlier.
> S4 *always* has such an image.

So the correct statement is that S4 always has a resumable OS image and S5
may have a resumable image.  From a user's point of view that doesn't
sound like much of a difference, especially since the image can be 
successfully restored from either state.

Alan Stern

^ permalink raw reply	[flat|nested] 117+ messages in thread

end of thread, other threads:[~2007-05-10 15:17 UTC | newest]

Thread overview: 117+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20070425072350.GA6866@ucw.cz>
     [not found] ` <Pine.LNX.4.64.0704251239110.8406@vaio.localdomain>
     [not found]   ` <alpine.LFD.0.98.0704251252070.9964@woody.linux-foundation.org>
     [not found]     ` <20070425202741.GC17387@elf.ucw.cz>
     [not found]       ` <alpine.LFD.0.98.0704251332090.9964@woody.linux-foundation.org>
     [not found]         ` <20070425214420.GG17387@elf.ucw.cz>
     [not found]           ` <alpine.LFD.0.98.0704251515140.9964@woody.linux-foundation.org>
     [not found]             ` <1177540027.5025.87.camel@nigel.suspend2.net>
     [not found]               ` <alpine.LFD.0.98.0704251550210.9964@woody.linux-foundation.org>
     [not found]                 ` <1177583998.6814.42.camel@johannes.berg>
     [not found]                   ` <20070426113005.GU17387@elf.ucw.cz>
2007-04-26 16:31                     ` suspend2 merge (was Re: [Suspend2-devel] Re: CFS and suspend2: hang in atomic copy) Johannes Berg
2007-04-26 18:40                       ` Rafael J. Wysocki
2007-04-26 18:40                         ` Johannes Berg
     [not found]                         ` <1177612802.6814.121.camel@johannes.berg>
2007-04-26 19:02                           ` Rafael J. Wysocki
     [not found]                           ` <200704262102.38568.rjw@sisk.pl>
2007-04-27  9:41                             ` Johannes Berg
     [not found]                             ` <1177666915.7828.35.camel@johannes.berg>
2007-04-27 10:09                               ` Johannes Berg
2007-04-27 10:18                               ` Rafael J. Wysocki
     [not found]                               ` <200704271218.07120.rjw@sisk.pl>
2007-04-27 10:19                                 ` Johannes Berg
     [not found]                                 ` <1177669179.7828.53.camel@johannes.berg>
     [not found]                                   ` <200704271409.56687.rjw@sisk.pl>
2007-04-27 12:07                                     ` Johannes Berg
2007-04-27 12:09                                   ` Rafael J. Wysocki
2007-04-29 12:48                       ` [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) R. J. Wysocki
2007-04-29 12:53                         ` Rafael J. Wysocki
2007-04-30  8:29                         ` Johannes Berg
2007-04-30 14:51                           ` Rafael J. Wysocki
2007-04-30 14:59                             ` Johannes Berg
2007-05-01 14:05                               ` Rafael J. Wysocki
2007-05-01 22:02                                 ` Rafael J. Wysocki
2007-05-02  5:13                                   ` Alexey Starikovskiy
2007-05-02 13:42                                     ` Rafael J. Wysocki
2007-05-02 14:11                                       ` Alexey Starikovskiy
2007-05-02 19:26                                         ` ACPI code in platform mode hibernation code paths (was: Re: [PATCH] swsusp: do not use pm_ops) Rafael J. Wysocki
     [not found]                                         ` <200705022126.47897.rjw@sisk.pl>
2007-05-03 22:48                                           ` Pavel Machek
     [not found]                                           ` <20070503224807.GD13426@elf.ucw.cz>
2007-05-03 23:14                                             ` Rafael J. Wysocki
2007-05-04 10:54                                             ` Johannes Berg
     [not found]                                             ` <1178276072.7408.7.camel@johannes.berg>
2007-05-04 12:08                                               ` Pavel Machek
     [not found]                                               ` <20070504120802.GF13426@elf.ucw.cz>
2007-05-04 12:29                                                 ` Rafael J. Wysocki
2007-05-02  8:21                                   ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) Johannes Berg
2007-05-02  9:02                                     ` Rafael J. Wysocki
2007-05-02  9:16                                     ` Pavel Machek
2007-05-02  9:25                                       ` Johannes Berg
2007-05-03 14:00                                         ` Alan Stern
2007-05-03 17:17                                           ` Rafael J. Wysocki
2007-05-03 18:33                                             ` Alan Stern
2007-05-03 19:47                                               ` Rafael J. Wysocki
2007-05-03 19:59                                                 ` Alan Stern
2007-05-03 20:21                                                   ` Rafael J. Wysocki
2007-05-04 14:40                                                     ` Alan Stern
2007-05-04 20:20                                                       ` Rafael J. Wysocki
2007-05-04 20:21                                                         ` Johannes Berg
2007-05-04 20:55                                                           ` Pavel Machek
2007-05-04 21:08                                                             ` Johannes Berg
2007-05-04 21:15                                                               ` Pavel Machek
2007-05-04 21:53                                                                 ` Rafael J. Wysocki
2007-05-04 21:53                                                                   ` Johannes Berg
2007-05-04 22:25                                                                     ` Rafael J. Wysocki
2007-05-05 15:52                                                                   ` Alan Stern
2007-05-07  1:16                                                                     ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...) David Brownell
2007-05-07 21:00                                                                       ` Rafael J. Wysocki
2007-05-07 21:45                                                                         ` David Brownell
2007-05-07 22:16                                                                           ` Rafael J. Wysocki
2007-05-09 19:23                                                                             ` David Brownell
2007-05-04 21:06                                                           ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) Rafael J. Wysocki
2007-05-04 20:58                                                       ` Pavel Machek
2007-05-04 21:24                                                         ` Rafael J. Wysocki
2007-05-05 16:19                                                           ` Alan Stern
2007-05-05 17:46                                                             ` Rafael J. Wysocki
2007-05-05 21:42                                                               ` Alan Stern
2007-05-05 22:14                                                                 ` Rafael J. Wysocki
2007-05-04 21:40                                                       ` David Brownell
2007-05-04 22:19                                                         ` Rafael J. Wysocki
2007-05-07  1:05                                                           ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...)) David Brownell
2007-05-05 16:08                                                         ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) Alan Stern
2007-05-05 17:50                                                           ` Rafael J. Wysocki
2007-05-05 21:43                                                             ` Alan Stern
2007-05-05 22:16                                                               ` Rafael J. Wysocki
2007-05-07  1:31                                                           ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...) David Brownell
2007-05-07 16:33                                                             ` Alan Stern
2007-05-07 20:49                                                               ` Pavel Machek
2007-05-07 21:38                                                                 ` Alan Stern
2007-05-08  0:30                                                                   ` Pavel Machek
2007-05-03 20:33                                               ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) David Brownell
2007-05-03 20:33                                           ` David Brownell
2007-05-03 20:51                                             ` Rafael J. Wysocki
2007-05-04 14:51                                             ` Alan Stern
2007-05-04 14:56                                               ` Johannes Berg
2007-05-04 20:27                                                 ` Rafael J. Wysocki
2007-05-04 22:00                                               ` David Brownell
2007-05-05 15:49                                                 ` Alan Stern
2007-05-07  1:10                                                   ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: ...)) David Brownell
2007-05-07 18:46                                                     ` Alan Stern
2007-05-07 21:29                                                       ` Rafael J. Wysocki
2007-05-07 22:22                                                         ` Alan Stern
2007-05-07 22:47                                                           ` Rafael J. Wysocki
2007-05-08 14:56                                                             ` Alan Stern
2007-05-08 19:59                                                               ` Rafael J. Wysocki
2007-05-08 21:26                                                                 ` Alan Stern
2007-05-09  8:17                                                               ` Pavel Machek
2007-05-09 15:21                                                                 ` Alan Stern
2007-05-09 19:35                                                               ` David Brownell
2007-05-09 20:04                                                                 ` Alan Stern
2007-05-09 20:21                                                                   ` David Brownell
2007-05-10 15:17                                                                     ` Alan Stern
2007-05-09 21:07                                                                   ` Pavel Machek
2007-05-07 21:43                                                       ` David Brownell
2007-05-07 22:41                                                         ` Alan Stern
2007-05-03 22:18                                           ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) Pavel Machek
2007-05-04 14:57                                             ` Alan Stern
2007-05-04 20:50                                               ` Rafael J. Wysocki
2007-05-04 20:49                                                 ` Johannes Berg
2007-05-04 21:11                                                   ` Rafael J. Wysocki
2007-05-04 21:23                                                     ` Johannes Berg
2007-05-04 21:55                                                       ` Rafael J. Wysocki
2007-05-04 21:54                                                         ` Johannes Berg
2007-05-04 22:21                                                           ` Rafael J. Wysocki
2007-05-05 15:37                                                             ` Alan Stern
2007-05-05 18:49                                                               ` Rafael J. Wysocki
2007-05-05 21:44                                                                 ` Alan Stern
2007-05-05 22:36                                                                   ` Rafael J. Wysocki
2007-05-06 22:01                                                                     ` Alan Stern
2007-05-06 22:31                                                                       ` Rafael J. Wysocki
2007-05-07  1:37                                                                       ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: ..) David Brownell
2007-05-08  2:57                                                                         ` Greg KH
2007-05-07  8:51                                                                 ` Re: [PATCH] swsusp: do not use pm_ops (was: Re: suspend2 merge (was: Re: CFS and suspend2: hang in atomic copy)) Johannes Berg
2007-05-04 22:12                                                         ` David Brownell
2007-05-04 22:31                                                           ` Rafael J. Wysocki
2007-05-05 16:15                                                       ` Alan Stern
2007-05-02 13:43                                       ` Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox