public inbox for linux-pm@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] Add suspend/resume for HPET
  2007-03-29 13:31   ` Maxim
@ 2007-03-29 13:46     ` Maxim Levitsky
  2007-03-29 16:53       ` Linus Torvalds
                         ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Maxim Levitsky @ 2007-03-29 13:46 UTC (permalink / raw)
  To: Sergei Shtylyov
  Cc: gregkh, Jeff Chua, linux-ide, jgarzik, Ingo Molnar, linux-pm,
	Linux Kernel Mailing List, Adrian Bunk, linux-acpi, linux-pci,
	Eric W. Biederman, Jens Axboe, Michael S. Tsirkin,
	Thomas Gleixner, Linus Torvalds, Andrew Morton

Subject: Add suspend/resume for HPET
This adds support of suspend/resume on i386 for HPET
Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>

---
 arch/i386/kernel/hpet.c |   68 +++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 68 insertions(+), 0 deletions(-)

diff --git a/arch/i386/kernel/hpet.c b/arch/i386/kernel/hpet.c
index 0fd9fba..7c67780 100644
--- a/arch/i386/kernel/hpet.c
+++ b/arch/i386/kernel/hpet.c
@@ -3,6 +3,8 @@
 #include <linux/errno.h>
 #include <linux/hpet.h>
 #include <linux/init.h>
+#include <linux/sysdev.h>
+#include <linux/pm.h>
 
 #include <asm/hpet.h>
 #include <asm/io.h>
@@ -310,6 +312,7 @@ int __init hpet_enable(void)
 out_nohpet:
 	iounmap(hpet_virt_address);
 	hpet_virt_address = NULL;
+	boot_hpet_disable = 1;
 	return 0;
 }
 
@@ -524,3 +527,68 @@ irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 #endif
+
+
+/*
+ * Suspend/resume part
+ */
+
+#ifdef CONFIG_PM
+
+static int hpet_suspend(struct sys_device *sys_device, pm_message_t state)
+{
+	unsigned long cfg = hpet_readl(HPET_CFG);
+
+	cfg &= ~(HPET_CFG_ENABLE|HPET_CFG_LEGACY);
+	hpet_writel(cfg, HPET_CFG);
+
+	return 0;
+}
+
+static int hpet_resume(struct sys_device *sys_device)
+{
+	unsigned int id;
+
+	hpet_start_counter();
+
+	id = hpet_readl(HPET_ID);
+
+	if (id & HPET_ID_LEGSUP)
+		hpet_enable_int();
+
+	return 0;
+}
+
+static struct sysdev_class hpet_class = {
+	set_kset_name("hpet"),
+	.suspend	= hpet_suspend,
+	.resume		= hpet_resume,
+};
+
+static struct sys_device hpet_device = {
+	.id		= 0,
+	.cls		= &hpet_class,
+};
+
+
+static __init int hpet_register_sysfs(void)
+{
+	int err;
+
+	if (!is_hpet_capable())
+		return 0;
+
+	err = sysdev_class_register(&hpet_class);
+
+	if (!err) {
+		err = sysdev_register(&hpet_device);
+		if (err)
+			sysdev_class_unregister(&hpet_class);
+	}
+
+	return err;
+}
+
+device_initcall(hpet_register_sysfs);
+
+#endif
-- 
1.4.4.2

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-03-29 13:46     ` [PATCH v2] Add suspend/resume for HPET Maxim Levitsky
@ 2007-03-29 16:53       ` Linus Torvalds
  2007-03-29 17:28         ` Maxim Levitsky
  2007-03-29 17:51         ` Ingo Molnar
  2007-03-29 18:11       ` Jeff Chua
  2007-03-31 15:51       ` Thomas Gleixner
  2 siblings, 2 replies; 30+ messages in thread
From: Linus Torvalds @ 2007-03-29 16:53 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Andrew Morton, Jeff Chua, linux-ide, Sergei Shtylyov, gregkh,
	linux-pm, Linux Kernel Mailing List, Adrian Bunk, linux-acpi,
	linux-pci, Eric W. Biederman, Jens Axboe, Michael S. Tsirkin,
	Thomas Gleixner, jgarzik, Ingo Molnar



On Thu, 29 Mar 2007, Maxim Levitsky wrote:
>
> Subject: Add suspend/resume for HPET
>
> This adds support of suspend/resume on i386 for HPET
>
> Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
> 
> ---
>  arch/i386/kernel/hpet.c |   68 +++++++++++++++++++++++++++++++++++++++++++++++

Btw, what about arch/x86_64/kernel/hpet.c?

That thing seems totally broken. Lookie here:

  arch/x86_64/kernel/hpet.c:irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id, struct pt_regs *regs)
  drivers/char/rtc.c:extern irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id);

anybody see a problem? The x86-64 version doesn't seem to be very well 
maintained. Is there some fundamental reason why this file isn't shared 
across architectures?

			Linus

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-03-29 16:53       ` Linus Torvalds
@ 2007-03-29 17:28         ` Maxim Levitsky
  2007-03-29 17:51         ` Ingo Molnar
  1 sibling, 0 replies; 30+ messages in thread
From: Maxim Levitsky @ 2007-03-29 17:28 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sergei Shtylyov, Ingo Molnar, Thomas Gleixner, Jeff Chua,
	Adrian Bunk, Andrew Morton, Linux Kernel Mailing List,
	Eric W. Biederman, Rafael J. Wysocki, pavel, linux-pm, gregkh,
	linux-pci, Jens Axboe, Len Brown, linux-acpi, jgarzik, linux-ide,
	Michael S. Tsirkin

On Thursday 29 March 2007 18:53:37 Linus Torvalds wrote:
> 
> On Thu, 29 Mar 2007, Maxim Levitsky wrote:
> >
> > Subject: Add suspend/resume for HPET
> >
> > This adds support of suspend/resume on i386 for HPET
> >
> > Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
> > 
> > ---
> >  arch/i386/kernel/hpet.c |   68 +++++++++++++++++++++++++++++++++++++++++++++++
> 
> Btw, what about arch/x86_64/kernel/hpet.c?
> 
> That thing seems totally broken. Lookie here:
> 
>   arch/x86_64/kernel/hpet.c:irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id, struct pt_regs *regs)
>   drivers/char/rtc.c:extern irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id);
> 
> anybody see a problem? The x86-64 version doesn't seem to be very well 
> maintained. Is there some fundamental reason why this file isn't shared 
> across architectures?
> 
> 			Linus
> 

Hi,
	I agree with that, there seems to be lot of code duplication between i386 and x86_64.
	By the way, x86_64 does take care of suspend/resume for hpet, it is done by 

	linux-2.6/arch/x86_64/kernel/time.c:timer_resume(struct sys_device *dev):
		hpet_reenable()


	on i386 PIT driver goes out of way when HPET is detected
	So it seems that there is lot of work to do to remove redundant code.


	Best regards,
		Maxim Levitsky

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [patch, v2] add suspend/resume for HPET
  2007-03-29 17:22       ` Linus Torvalds
@ 2007-03-29 17:47         ` Ingo Molnar
  0 siblings, 0 replies; 30+ messages in thread
From: Ingo Molnar @ 2007-03-29 17:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Maxim Levitsky, Jeff Chua, linux-ide, gregkh, linux-pm,
	Linux Kernel Mailing List, Adrian Bunk, linux-acpi, linux-pci,
	Eric W. Biederman, Jens Axboe, Michael S. Tsirkin,
	Thomas Gleixner, jgarzik, Andrew Morton


update: i've tested Maxim's v2 patch on both a hpet-capable and a 
hpet-less system, and it works fine here.

on a dual-core hpet-capable system, running a NO_HZ+!HIGH_RES_TIMERS 
kernel:

  europe:~> grep Clock /proc/timer_list
  Clock Event Device: hpet
  Clock Event Device: lapic
  Clock Event Device: lapic

s2ram works fine now - it hung on resume before.

on a dual-core non-hpet system, with a NO_HZ+!HIGH_RES_TIMERS kernel:

  neptune:~> grep Clock /proc/timer_list
  Clock Event Device: pit
  Clock Event Device: lapic
  Clock Event Device: lapic

s2ram worked fine before - and it still works now.

(The combination of NO_HZ+!HIGH_RES_TIMERS was the most fragile wrt. 
suspend because in the !HIGH_RES_TIMERS there's just a single instance 
after resume that we touch the timer hardware, and we very much rely on 
the periodic interrupt being set to the precise value.)

So this is a go on my systems - good work Maxim! (I've reproduced 
Maxim's patch below with minor patch-metadata updates.)

	Ingo

---------------------------->
Subject: [patch] add suspend/resume for HPET
From: Maxim Levitsky <maximlevitsky@gmail.com>

This adds support for suspend/resume on i386 for HPET.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>

---
 arch/i386/kernel/hpet.c |   68 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

Index: linux/arch/i386/kernel/hpet.c
===================================================================
--- linux.orig/arch/i386/kernel/hpet.c
+++ linux/arch/i386/kernel/hpet.c
@@ -3,6 +3,8 @@
 #include <linux/errno.h>
 #include <linux/hpet.h>
 #include <linux/init.h>
+#include <linux/sysdev.h>
+#include <linux/pm.h>
 
 #include <asm/hpet.h>
 #include <asm/io.h>
@@ -307,6 +309,7 @@ int __init hpet_enable(void)
 out_nohpet:
 	iounmap(hpet_virt_address);
 	hpet_virt_address = NULL;
+	boot_hpet_disable = 1;
 	return 0;
 }
 
@@ -523,3 +526,68 @@ irqreturn_t hpet_rtc_interrupt(int irq, 
 	return IRQ_HANDLED;
 }
 #endif
+
+
+/*
+ * Suspend/resume part
+ */
+
+#ifdef CONFIG_PM
+
+static int hpet_suspend(struct sys_device *sys_device, pm_message_t state)
+{
+	unsigned long cfg = hpet_readl(HPET_CFG);
+
+	cfg &= ~(HPET_CFG_ENABLE|HPET_CFG_LEGACY);
+	hpet_writel(cfg, HPET_CFG);
+
+	return 0;
+}
+
+static int hpet_resume(struct sys_device *sys_device)
+{
+	unsigned int id;
+
+	hpet_start_counter();
+
+	id = hpet_readl(HPET_ID);
+
+	if (id & HPET_ID_LEGSUP)
+		hpet_enable_int();
+
+	return 0;
+}
+
+static struct sysdev_class hpet_class = {
+	set_kset_name("hpet"),
+	.suspend	= hpet_suspend,
+	.resume		= hpet_resume,
+};
+
+static struct sys_device hpet_device = {
+	.id		= 0,
+	.cls		= &hpet_class,
+};
+
+
+static __init int hpet_register_sysfs(void)
+{
+	int err;
+
+	if (!is_hpet_capable())
+		return 0;
+
+	err = sysdev_class_register(&hpet_class);
+
+	if (!err) {
+		err = sysdev_register(&hpet_device);
+		if (err)
+			sysdev_class_unregister(&hpet_class);
+	}
+
+	return err;
+}
+
+device_initcall(hpet_register_sysfs);
+
+#endif

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-03-29 16:53       ` Linus Torvalds
  2007-03-29 17:28         ` Maxim Levitsky
@ 2007-03-29 17:51         ` Ingo Molnar
  2007-03-29 20:46           ` Andi Kleen
  1 sibling, 1 reply; 30+ messages in thread
From: Ingo Molnar @ 2007-03-29 17:51 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Maxim Levitsky, Jeff Chua, linux-ide, Sergei Shtylyov, gregkh,
	linux-pm, Linux Kernel Mailing List, Adrian Bunk, linux-acpi,
	linux-pci, Eric W. Biederman, Jens Axboe, Michael S. Tsirkin,
	Thomas Gleixner, jgarzik, Andrew Morton


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> Btw, what about arch/x86_64/kernel/hpet.c?

at least wrt. suspend/resume it should be fine, because in 
arch/x86_64/kernel/time.c it does this upon resume:

 static int timer_resume(struct sys_device *dev)
 {
         if (hpet_address)
                 hpet_reenable();
         else
                 i8254_timer_resume();

[ barring the issue that mixing two pieces of hardware like this in a 
  single resume function is wrong - all timer hardware should be 
  separated like we did it for i386. I've got 64-bit clockevents code in 
  -rt which does this separation. ]

> That thing seems totally broken. Lookie here:
> 
>   arch/x86_64/kernel/hpet.c:irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id, struct pt_regs *regs)
>   drivers/char/rtc.c:extern irqreturn_t hpet_rtc_interrupt(int irq, void *dev_id);
> 
> anybody see a problem? The x86-64 version doesn't seem to be very well 
> maintained. Is there some fundamental reason why this file isn't 
> shared across architectures?

there's no fundamental reason. x86_64 COW-ed hpet_timer.c and 
time_hpet.c years ago and drifted off into different areas.
They should be unified: more power to arch/x86/ ;-)

	Ingo

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-03-29 13:46     ` [PATCH v2] Add suspend/resume for HPET Maxim Levitsky
  2007-03-29 16:53       ` Linus Torvalds
@ 2007-03-29 18:11       ` Jeff Chua
  2007-03-31 15:51       ` Thomas Gleixner
  2 siblings, 0 replies; 30+ messages in thread
From: Jeff Chua @ 2007-03-29 18:11 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Sergei Shtylyov, Linus Torvalds, Ingo Molnar, Thomas Gleixner,
	Adrian Bunk, Andrew Morton, Linux Kernel Mailing List,
	Eric W. Biederman, Rafael J. Wysocki, pavel, linux-pm, gregkh,
	linux-pci, Jens Axboe, Len Brown, linux-acpi, jgarzik, linux-ide,
	Michael S. Tsirkin

On 3/29/07, Maxim Levitsky <maximlevitsky@gmail.com> wrote:
> Subject: Add suspend/resume for HPET
> This adds support of suspend/resume on i386 for HPET
> Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>

Confirmed that suspend/resume disk/ram works on X60s with
CONFIG_HPET_TIMER=y and CONFIG_NO_HZ unset.

But suspend to disk still hang with CONFIG_NO_HZ unset.

Thanks,
Jeff.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-03-29 17:51         ` Ingo Molnar
@ 2007-03-29 20:46           ` Andi Kleen
  0 siblings, 0 replies; 30+ messages in thread
From: Andi Kleen @ 2007-03-29 20:46 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Maxim Levitsky, Jeff Chua, linux-acpi, Sergei Shtylyov, jgarzik,
	gregkh, linux-pm, Linux Kernel Mailing List, Andrew Morton,
	Adrian Bunk, linux-ide, Eric W. Biederman, Jens Axboe,
	Michael S. Tsirkin, linux-pci, Linus Torvalds, Thomas Gleixner

Ingo Molnar <mingo@elte.hu> writes:
> 
> there's no fundamental reason. x86_64 COW-ed hpet_timer.c and 
> time_hpet.c years ago and drifted off into different areas.

Not quite -- x86-64 did HPET long before i386; the only stuff cowed
was the character driver support code. But the core HPET code
was always totally different code streams. We never did the complicated 
pluggable clock code i386 did though -- i never quite saw the point of that
because there aren't that many timers there.
Of course it is already obsolete with clocksources now.

-Andi

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-03-29 13:46     ` [PATCH v2] Add suspend/resume for HPET Maxim Levitsky
  2007-03-29 16:53       ` Linus Torvalds
  2007-03-29 18:11       ` Jeff Chua
@ 2007-03-31 15:51       ` Thomas Gleixner
  2007-03-31 16:01         ` Jeff Chua
                           ` (2 more replies)
  2 siblings, 3 replies; 30+ messages in thread
From: Thomas Gleixner @ 2007-03-31 15:51 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Jeff Chua, linux-ide, Sergei Shtylyov, jgarzik, gregkh, linux-pm,
	Linux Kernel Mailing List, Adrian Bunk, linux-acpi, linux-pci,
	Eric W. Biederman, Jens Axboe, Michael S. Tsirkin, Ingo Molnar,
	Linus Torvalds, Andrew Morton

On Thu, 2007-03-29 at 15:46 +0200, Maxim Levitsky wrote:
> Subject: Add suspend/resume for HPET
> This adds support of suspend/resume on i386 for HPET
> Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
>
> +static struct sysdev_class hpet_class = {
> +	set_kset_name("hpet"),
> +	.suspend	= hpet_suspend,
> +	.resume		= hpet_resume,
> +};
> +
> +static struct sys_device hpet_device = {
> +	.id		= 0,
> +	.cls		= &hpet_class,
> +};

Sorry for being inresponsive. I was travelling and unexpectedly cut off
from the internet for some days.

While I agree in principle with the patch, I'm a bit uncomfortable. The
sys device suspend / resume ordering is not guaranteed and relies on the
registering order.

Jeff still seems to have problems with CONFIG_NO_HZ=n and it might be
caused by time keeping / tick management resume happening before the
HPET resume.

The required resume order is:

clocksources
timekeeping
clockevents
tick management

I'm not sure how to do this properly with the sys device facilities, but
I look into it.

/me goes off to understand the sys device magic.

	tglx

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-03-31 15:51       ` Thomas Gleixner
@ 2007-03-31 16:01         ` Jeff Chua
  2007-03-31 16:09           ` Thomas Gleixner
  2007-03-31 16:09         ` Linus Torvalds
  2007-03-31 16:56         ` Maxim Levitsky
  2 siblings, 1 reply; 30+ messages in thread
From: Jeff Chua @ 2007-03-31 16:01 UTC (permalink / raw)
  To: tglx
  Cc: Maxim Levitsky, linux-ide, Sergei Shtylyov, jgarzik, gregkh,
	linux-pm, Linux Kernel Mailing List, Adrian Bunk, linux-acpi,
	linux-pci, Eric W. Biederman, Jens Axboe, Michael S. Tsirkin,
	Ingo Molnar, Linus Torvalds, Andrew Morton

On 3/31/07, Thomas Gleixner <tglx@linutronix.de> wrote:

> Jeff still seems to have problems with CONFIG_NO_HZ=n and it might be
> caused by time keeping / tick management resume happening before the
> HPET resume.


me>Confirmed that suspend/resume disk/ram works on X60s with
me>CONFIG_HPET_TIMER=y and CONFIG_NO_HZ unset.
me> But suspend to disk still hang with CONFIG_NO_HZ unset.


Oops, sorry. Typo (as a result copy/paste using mouse)
    ... I actually meant CONFIG_NO_HZ "set".

Just to be clear, suspend to disk still hang with CONFIG_NO_HZ=y. It
hang just before you see the percent saving %.


Jeff.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-03-31 16:01         ` Jeff Chua
@ 2007-03-31 16:09           ` Thomas Gleixner
  0 siblings, 0 replies; 30+ messages in thread
From: Thomas Gleixner @ 2007-03-31 16:09 UTC (permalink / raw)
  To: Jeff Chua
  Cc: Maxim Levitsky, Sergei Shtylyov, Linus Torvalds, Ingo Molnar,
	Adrian Bunk, Andrew Morton, Linux Kernel Mailing List,
	Eric W. Biederman, Rafael J. Wysocki, pavel, linux-pm, gregkh,
	linux-pci, Jens Axboe, Len Brown, linux-acpi, jgarzik, linux-ide,
	Michael S. Tsirkin

On Sun, 2007-04-01 at 00:01 +0800, Jeff Chua wrote:
> On 3/31/07, Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> > Jeff still seems to have problems with CONFIG_NO_HZ=n and it might be
> > caused by time keeping / tick management resume happening before the
> > HPET resume.
> 
> 
> me>Confirmed that suspend/resume disk/ram works on X60s with
> me>CONFIG_HPET_TIMER=y and CONFIG_NO_HZ unset.
> me> But suspend to disk still hang with CONFIG_NO_HZ unset.
> 
> 
> Oops, sorry. Typo (as a result copy/paste using mouse)
>     ... I actually meant CONFIG_NO_HZ "set".
> 
> Just to be clear, suspend to disk still hang with CONFIG_NO_HZ=y. It
> hang just before you see the percent saving %.

Ah, that's a different one then. In that path the timers should be
alive, but who knows.

	tglx



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-03-31 15:51       ` Thomas Gleixner
  2007-03-31 16:01         ` Jeff Chua
@ 2007-03-31 16:09         ` Linus Torvalds
  2007-03-31 16:33           ` Thomas Gleixner
  2007-03-31 16:56         ` Maxim Levitsky
  2 siblings, 1 reply; 30+ messages in thread
From: Linus Torvalds @ 2007-03-31 16:09 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Maxim Levitsky, Jeff Chua, linux-ide, Sergei Shtylyov, gregkh,
	linux-pm, Linux Kernel Mailing List, Adrian Bunk, linux-acpi,
	linux-pci, Eric W. Biederman, Jens Axboe, Michael S. Tsirkin,
	Ingo Molnar, jgarzik, Andrew Morton



On Sat, 31 Mar 2007, Thomas Gleixner wrote:
> 
> While I agree in principle with the patch, I'm a bit uncomfortable. The
> sys device suspend / resume ordering is not guaranteed and relies on the
> registering order.

Well, this is why we probably should try to get away from the "system 
device" issue, exactly because system devices are totally outside the 
normal ordering and only have a random linear order.

If the clocksources were actually in the device tree, you'd get all the 
normal guarantees about hierarchical ordering..

		Linus

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-03-31 16:09         ` Linus Torvalds
@ 2007-03-31 16:33           ` Thomas Gleixner
  2007-03-31 16:41             ` Greg KH
  2007-03-31 16:53             ` Linus Torvalds
  0 siblings, 2 replies; 30+ messages in thread
From: Thomas Gleixner @ 2007-03-31 16:33 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Maxim Levitsky, Sergei Shtylyov, Ingo Molnar, Jeff Chua,
	Adrian Bunk, Andrew Morton, Linux Kernel Mailing List,
	Eric W. Biederman, Rafael J. Wysocki, pavel, linux-pm, gregkh,
	linux-pci, Jens Axboe, Len Brown, linux-acpi, jgarzik, linux-ide,
	Michael S. Tsirkin

On Sat, 2007-03-31 at 09:09 -0700, Linus Torvalds wrote:
> 
> On Sat, 31 Mar 2007, Thomas Gleixner wrote:
> > 
> > While I agree in principle with the patch, I'm a bit uncomfortable. The
> > sys device suspend / resume ordering is not guaranteed and relies on the
> > registering order.
> 
> Well, this is why we probably should try to get away from the "system 
> device" issue, exactly because system devices are totally outside the 
> normal ordering and only have a random linear order.
> 
> If the clocksources were actually in the device tree, you'd get all the 
> normal guarantees about hierarchical ordering..

Right, but clock - sources/events need to be extremly late suspended and
early resumed. How can we ensure this ?

	tglx



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-03-31 16:33           ` Thomas Gleixner
@ 2007-03-31 16:41             ` Greg KH
  2007-03-31 16:53             ` Linus Torvalds
  1 sibling, 0 replies; 30+ messages in thread
From: Greg KH @ 2007-03-31 16:41 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Linus Torvalds, Maxim Levitsky, Sergei Shtylyov, Ingo Molnar,
	Jeff Chua, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List,
	Eric W. Biederman, Rafael J. Wysocki, pavel, linux-pm, linux-pci,
	Jens Axboe, Len Brown, linux-acpi, jgarzik, linux-ide,
	Michael S. Tsirkin

On Sat, Mar 31, 2007 at 06:33:20PM +0200, Thomas Gleixner wrote:
> On Sat, 2007-03-31 at 09:09 -0700, Linus Torvalds wrote:
> > 
> > On Sat, 31 Mar 2007, Thomas Gleixner wrote:
> > > 
> > > While I agree in principle with the patch, I'm a bit uncomfortable. The
> > > sys device suspend / resume ordering is not guaranteed and relies on the
> > > registering order.
> > 
> > Well, this is why we probably should try to get away from the "system 
> > device" issue, exactly because system devices are totally outside the 
> > normal ordering and only have a random linear order.
> > 
> > If the clocksources were actually in the device tree, you'd get all the 
> > normal guarantees about hierarchical ordering..
> 
> Right, but clock - sources/events need to be extremly late suspended and
> early resumed. How can we ensure this ?

Put it at the top of the device tree.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-03-31 16:33           ` Thomas Gleixner
  2007-03-31 16:41             ` Greg KH
@ 2007-03-31 16:53             ` Linus Torvalds
  2007-03-31 17:02               ` Ingo Molnar
  2007-03-31 17:08               ` Greg KH
  1 sibling, 2 replies; 30+ messages in thread
From: Linus Torvalds @ 2007-03-31 16:53 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Maxim Levitsky, Jeff Chua, linux-ide, Sergei Shtylyov, gregkh,
	linux-pm, Linux Kernel Mailing List, Adrian Bunk, linux-acpi,
	linux-pci, Eric W. Biederman, Jens Axboe, Michael S. Tsirkin,
	Ingo Molnar, jgarzik, Andrew Morton



On Sat, 31 Mar 2007, Thomas Gleixner wrote:
> 
> Right, but clock - sources/events need to be extremly late suspended and
> early resumed. How can we ensure this ?

Make them be at the top of the device tree by adding them early. That's 
the whole point of the device tree after all - we have an ordering that is 
enforced by its topology, and that we sort by when things were added.

Right now the way things work is (iirc - somebody like Greg should 
double-check me) is that we add all devices to the power list at 
device_add() time by traversing the devices fromt he root all the way out, 
and doing a device_add() which does a device_pm_add(), which in turn adds 
it to the power-management list - so that the list is always topologically 
sorted.

So the only thing that needs to be done is to make sure that we add the 
timer devices early during bootup - something we have to do *anyway*. If a 
device is added early in bootup, that automatically means that it will be 
suspended late, and resumed early - because we maintain that order all the 
way through..

		Linus

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-03-31 15:51       ` Thomas Gleixner
  2007-03-31 16:01         ` Jeff Chua
  2007-03-31 16:09         ` Linus Torvalds
@ 2007-03-31 16:56         ` Maxim Levitsky
  2007-03-31 17:09           ` Linus Torvalds
  2 siblings, 1 reply; 30+ messages in thread
From: Maxim Levitsky @ 2007-03-31 16:56 UTC (permalink / raw)
  To: tglx
  Cc: Sergei Shtylyov, Linus Torvalds, Ingo Molnar, Jeff Chua,
	Adrian Bunk, Andrew Morton, Linux Kernel Mailing List,
	Eric W. Biederman, Rafael J. Wysocki, pavel, linux-pm, gregkh,
	linux-pci, Jens Axboe, Len Brown, linux-acpi, jgarzik, linux-ide,
	Michael S. Tsirkin

On Saturday 31 March 2007 18:51:11 Thomas Gleixner wrote:
> On Thu, 2007-03-29 at 15:46 +0200, Maxim Levitsky wrote:
> > Subject: Add suspend/resume for HPET
> > This adds support of suspend/resume on i386 for HPET
> > Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
> >
> > +static struct sysdev_class hpet_class = {
> > +	set_kset_name("hpet"),
> > +	.suspend	= hpet_suspend,
> > +	.resume		= hpet_resume,
> > +};
> > +
> > +static struct sys_device hpet_device = {
> > +	.id		= 0,
> > +	.cls		= &hpet_class,
> > +};
> 
> Sorry for being inresponsive. I was travelling and unexpectedly cut off
> from the internet for some days.
> 
> While I agree in principle with the patch, I'm a bit uncomfortable. The
> sys device suspend / resume ordering is not guaranteed and relies on the
> registering order.
> 
> Jeff still seems to have problems with CONFIG_NO_HZ=n and it might be
> caused by time keeping / tick management resume happening before the
> HPET resume.
> 
> The required resume order is:
> 
> clocksources
> timekeeping
> clockevents
> tick management
> 
> I'm not sure how to do this properly with the sys device facilities, but
> I look into it.
> 
> /me goes off to understand the sys device magic.
> 
> 	tglx
> 
> 
> 

Hi,

So maybe I was right afrer all,
Maybe it is better to add a suspend/resume hook to each clock source and call 
it from timekeeping_resume() ?

Or maybe even unite clocksources with clockevents, don't know 

By the way I want to report maybe a bug / maybe a feature :-)  : 
(sorry for long explanation)

Basically I have two clockevent sources : PIT(HPET) and APIC
(Actually I it seems that in next version of kernel HPET will be switched out 
of 'legacy replacement mode' , so PIT and HPET and RTC could coexist of same system,
But HPET won't be able to generate IRQ0, and it will be assigned some IRQ, possibly shared with other devices)

APIC timer  is chosen by default and works fine, 
since I don't have C2/C2 states on my system (ICH8 doesn't support them :-( )

But if I force it off (nolapic_timer) HPET or PIC is chosen and strangely they are
 put in _periodic_ mode although they are capable of one-shot mode
Is this a bug ?

Secondary I am getting a very strange behavior if I use CONFIG_NOHZ + !CONFIG_HIGH_RES_TIMERS
and try to suspend to ram:

System resumes, but gets crazy:
'top' shows that  ksoftirqd consumes 9999 % of cpu time (this is not a typo)
And other 'normal' programs that are running show same 9999 too.
System slows to crawl.

Also I found that one of APICS is in periodic mode,  and second is in one shoot mode.
And I tested this with or without my patch (thank goodness it is not my fault)

CONFIG_NOHZ + CONFIG_HIGH_RES_TIMERS work just fine.

Best regards,
	Maxim Levitsky

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-03-31 16:53             ` Linus Torvalds
@ 2007-03-31 17:02               ` Ingo Molnar
  2007-03-31 18:18                 ` David Brownell
       [not found]                 ` <200703311118.55132.david-b@pacbell.net>
  2007-03-31 17:08               ` Greg KH
  1 sibling, 2 replies; 30+ messages in thread
From: Ingo Molnar @ 2007-03-31 17:02 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Maxim Levitsky, Jeff Chua, linux-ide, Sergei Shtylyov, gregkh,
	linux-pm, Linux Kernel Mailing List, Adrian Bunk, linux-acpi,
	linux-pci, Eric W. Biederman, Jens Axboe, Michael S. Tsirkin,
	Thomas Gleixner, jgarzik, Andrew Morton


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Sat, 31 Mar 2007, Thomas Gleixner wrote:
> > 
> > Right, but clock - sources/events need to be extremly late suspended and
> > early resumed. How can we ensure this ?

[...]
> So the only thing that needs to be done is to make sure that we add 
> the timer devices early during bootup - something we have to do 
> *anyway*. If a device is added early in bootup, that automatically 
> means that it will be suspended late, and resumed early - because we 
> maintain that order all the way through..

IIRC hpet is particularly hard to initialize early on in the bootup 
sequence. So the way the clockevents code works is that it will always 
try to make the best out of all available devices, and dynamically 
adapts things as devices 'arrive' or 'depart' - no matter how late that 
happens. (That way there's no dependency on how late a device gets 
registered - it will only delay the switch to high-res mode for 
example.) A given time device might 'depart' because for example the 
watchdog mechanism finds that its quality is not good enough, or because 
someone initiated cpufreq which breaks the TSC clocksource.

i dont think there's any particular problem here because suspend/resume 
wont be done during bootup - but we might need a way to move a device to 
earlier spots in the device tree, even if they got registered later on - 
instead of forcing the time devices to be registered very early?

	Ingo

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-03-31 16:53             ` Linus Torvalds
  2007-03-31 17:02               ` Ingo Molnar
@ 2007-03-31 17:08               ` Greg KH
  1 sibling, 0 replies; 30+ messages in thread
From: Greg KH @ 2007-03-31 17:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Thomas Gleixner, Maxim Levitsky, Sergei Shtylyov, Ingo Molnar,
	Jeff Chua, Adrian Bunk, Andrew Morton, Linux Kernel Mailing List,
	Eric W. Biederman, Rafael J. Wysocki, pavel, linux-pm, linux-pci,
	Jens Axboe, Len Brown, linux-acpi, jgarzik, linux-ide,
	Michael S. Tsirkin

On Sat, Mar 31, 2007 at 09:53:43AM -0700, Linus Torvalds wrote:
> Make them be at the top of the device tree by adding them early. That's 
> the whole point of the device tree after all - we have an ordering that is 
> enforced by its topology, and that we sort by when things were added.
> 
> Right now the way things work is (iirc - somebody like Greg should 
> double-check me) is that we add all devices to the power list at 
> device_add() time by traversing the devices fromt he root all the way out, 
> and doing a device_add() which does a device_pm_add(), which in turn adds 
> it to the power-management list - so that the list is always topologically 
> sorted.

Yes, this is how it works (or if not, then there's a bug that needs to
be fixed, as that is how it _should_ work...)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-03-31 16:56         ` Maxim Levitsky
@ 2007-03-31 17:09           ` Linus Torvalds
  2007-03-31 17:17             ` Ingo Molnar
  0 siblings, 1 reply; 30+ messages in thread
From: Linus Torvalds @ 2007-03-31 17:09 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Andrew Morton, Jeff Chua, linux-ide, Sergei Shtylyov, gregkh,
	linux-pm, Linux Kernel Mailing List, Adrian Bunk, linux-acpi,
	linux-pci, Eric W. Biederman, Jens Axboe, Michael S. Tsirkin,
	tglx, jgarzik, Ingo Molnar



On Sat, 31 Mar 2007, Maxim Levitsky wrote:
> 
> So maybe I was right afrer all,
> Maybe it is better to add a suspend/resume hook to each clock source and call 
> it from timekeeping_resume() ?

Umm.. WHy not make the device tree look like this:

	    -- "clocksource" -- +-- HPET
				|
				+-- TSC
				|
				+-- i8259
				|
				+-- lapic timer
				|
				.. whatever else

and use the "struct device" that we *have* for this? The whole "struct 
device" is literally designed to do this, and to be embedded into whatever 
bigger structures you have that describes higher-level behaviour. Ie you'd 
put a "struct device" inside the "struct clocksource".

That thingalready *has* the suspend/resume hooks, and it will mean that 
people will see the clocks in the device tree rather than have a new 
notion.


			Linus

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-03-31 17:09           ` Linus Torvalds
@ 2007-03-31 17:17             ` Ingo Molnar
  2007-03-31 17:58               ` Daniel Walker
  0 siblings, 1 reply; 30+ messages in thread
From: Ingo Molnar @ 2007-03-31 17:17 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Maxim Levitsky, Jeff Chua, linux-ide, Sergei Shtylyov, gregkh,
	linux-pm, Linux Kernel Mailing List, Adrian Bunk, linux-acpi,
	linux-pci, Eric W. Biederman, Jens Axboe, Michael S. Tsirkin,
	tglx, jgarzik, Andrew Morton


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> Umm.. WHy not make the device tree look like this:
> 
> 	    -- "clocksource" -- +-- HPET
> 				|
> 				+-- TSC
> 				|
> 				+-- i8259
> 				|
> 				+-- lapic timer
> 				|
> 				.. whatever else
> 
> and use the "struct device" that we *have* for this? The whole "struct 
> device" is literally designed to do this, and to be embedded into 
> whatever bigger structures you have that describes higher-level 
> behaviour. Ie you'd put a "struct device" inside the "struct 
> clocksource".

yeah. There's some practical problems that need to be sorted out: much 
of the current GTOD code is irq-driven (and all GTOD locks are 
irq-safe), while the sysfs code needs to run in process-context level. 

Clocksources 'arrive' and 'depart' in hardirq context (which is the 
primary place where we notice their breakage, determine that they are 
now verified to be usable, etc.). This came partly from legacy: the 
gradual conversion of the monolithic time code, and the need to preserve 
GTOD and non-GTOD architectures without too much duplication. It also 
came partly because there's also a fundamental need to have accurate 
time, which is better served from irq context.

i very much agree that this should and must be cleaned up, but it needs 
quite a bit more logistics than it might appear at first sight. 
Clockevents basically just followed (and had to follow) the direction of 
clocksources in this regard.

	Ingo

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-03-31 17:17             ` Ingo Molnar
@ 2007-03-31 17:58               ` Daniel Walker
  0 siblings, 0 replies; 30+ messages in thread
From: Daniel Walker @ 2007-03-31 17:58 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Maxim Levitsky, tglx, Sergei Shtylyov, Jeff Chua,
	Adrian Bunk, Andrew Morton, Linux Kernel Mailing List,
	Eric W. Biederman, Rafael J. Wysocki, pavel, linux-pm, gregkh,
	linux-pci, Jens Axboe, Len Brown, linux-acpi, jgarzik, linux-ide,
	Michael S. Tsirkin

On Sat, 2007-03-31 at 19:17 +0200, Ingo Molnar wrote:

> yeah. There's some practical problems that need to be sorted out: much 
> of the current GTOD code is irq-driven (and all GTOD locks are 
> irq-safe), while the sysfs code needs to run in process-context level. 
> 
> Clocksources 'arrive' and 'depart' in hardirq context (which is the 
> primary place where we notice their breakage, determine that they are 
> now verified to be usable, etc.). This came partly from legacy: the 
> gradual conversion of the monolithic time code, and the need to preserve 
> GTOD and non-GTOD architectures without too much duplication. It also 
> came partly because there's also a fundamental need to have accurate 
> time, which is better served from irq context.
> 

Is this in reference to the irq-context clocksource polling stuff? I
don't see a dire reason to keep that code, and I agree removing that is
a certainly a worth while cleanup .. I added this cleanup to one of my
trees when you first suggested it , and there is some infrastructure
that really should be added to facilitate it.

Daniel


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-03-31 17:02               ` Ingo Molnar
@ 2007-03-31 18:18                 ` David Brownell
       [not found]                 ` <200703311118.55132.david-b@pacbell.net>
  1 sibling, 0 replies; 30+ messages in thread
From: David Brownell @ 2007-03-31 18:18 UTC (permalink / raw)
  To: linux-pm
  Cc: Maxim Levitsky, Jeff Chua, linux-acpi, Sergei Shtylyov, jgarzik,
	gregkh, Linux Kernel Mailing List, Adrian Bunk, linux-ide,
	linux-pci, Thomas Gleixner, Eric W. Biederman, Jens Axboe,
	Michael S. Tsirkin, Andrew Morton, Linus Torvalds, Ingo Molnar

( please remove obsolute linux-pm@lists.osdl.org  from further messages!! )

On Saturday 31 March 2007 10:02 am, Ingo Molnar wrote:
> 
> i dont think there's any particular problem here because suspend/resume 
> wont be done during bootup - but we might need a way to move a device to 
> earlier spots in the device tree, even if they got registered later on - 
> instead of forcing the time devices to be registered very early?

I'm about ready to test the appended patch... a "move one device" call
might be safest at this point in the release cycle though.

- Dave

========================	SNIP!
Change how the PM list is constructed, so that devices are added right
after their parents (when they have one) rather than at the end of the
list.  This preserves sequencing guarantees, but enables sequencing of
suspend/resume operations by more important characteristics than "when
device happened to enumerate" ... e.g. clocksources and clockevents
at a clearly defined point during suspend and resume.

This patch has a potential downside for devices that have multiple
power dependencies and which "just happened to work" before.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>

--- g26.orig/drivers/base/power/main.c	2006-07-02 12:30:30.000000000 -0700
+++ g26/drivers/base/power/main.c	2007-03-31 11:02:28.000000000 -0700
@@ -52,12 +52,17 @@ EXPORT_SYMBOL_GPL(device_pm_set_parent);
 int device_pm_add(struct device * dev)
 {
 	int error;
+	struct device *parent = dev->parent;
 
-	pr_debug("PM: Adding info for %s:%s\n",
-		 dev->bus ? dev->bus->name : "No Bus", dev->kobj.name);
+	pr_debug("PM: Adding info for %s:%s, after %s\n",
+		 dev->bus ? dev->bus->name : "No Bus", dev->kobj.name,
+		 parent ? parent->bus_id : "(no parent)");
 	down(&dpm_list_sem);
-	list_add_tail(&dev->power.entry, &dpm_active);
-	device_pm_set_parent(dev, dev->parent);
+	if (parent)
+		list_add(&dev->power.entry, &parent->power.entry);
+	else
+		list_add_tail(&dev->power.entry, &dpm_active);
+	device_pm_set_parent(dev, parent);
 	if ((error = dpm_sysfs_add(dev)))
 		list_del(&dev->power.entry);
 	up(&dpm_list_sem);

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
       [not found]                 ` <200703311118.55132.david-b@pacbell.net>
@ 2007-03-31 19:32                   ` David Brownell
       [not found]                   ` <200703311232.57505.david-b@pacbell.net>
  1 sibling, 0 replies; 30+ messages in thread
From: David Brownell @ 2007-03-31 19:32 UTC (permalink / raw)
  To: linux-pm
  Cc: Maxim Levitsky, Jeff Chua, linux-acpi, Sergei Shtylyov, jgarzik,
	gregkh, Linux Kernel Mailing List, Adrian Bunk, linux-ide,
	linux-pci, Thomas Gleixner, Eric W. Biederman, Jens Axboe,
	Michael S. Tsirkin, Andrew Morton, Linus Torvalds, Ingo Molnar

On Saturday 31 March 2007 11:18 am, David Brownell wrote:
> ( please remove obsolute linux-pm@lists.osdl.org  from further messages!! )
> 
> On Saturday 31 March 2007 10:02 am, Ingo Molnar wrote:
> > 
> > i dont think there's any particular problem here because suspend/resume 
> > wont be done during bootup - but we might need a way to move a device to 
> > earlier spots in the device tree, even if they got registered later on - 
> > instead of forcing the time devices to be registered very early?
> 
> I'm about ready to test the appended patch... a "move one device" call
> might be safest at this point in the release cycle though.

As expected, this behaved OK on an x86 laptop.  I'll see if it breaks
some of the ARM boards I have handy.

To be clear, what this means is that if "clockevent" and "clocksource"
devices get registered "very early", and the various clockevent and
clock source devices get registered, then the suspend/resume methods
for those will all get grouped together ... suspended "very late" and
resumed "very early", regardless of when they get registered.  Pretty
much the driver model parts of what Linus was suggesting; clockevent
bits would still be needed.

- Dave



> - Dave
> 
> ========================	SNIP!
> Change how the PM list is constructed, so that devices are added right
> after their parents (when they have one) rather than at the end of the
> list.  This preserves sequencing guarantees, but enables sequencing of
> suspend/resume operations by more important characteristics than "when
> device happened to enumerate" ... e.g. clocksources and clockevents
> at a clearly defined point during suspend and resume.
> 
> This patch has a potential downside for devices that have multiple
> power dependencies and which "just happened to work" before.
> 
> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
> 
> --- g26.orig/drivers/base/power/main.c	2006-07-02 12:30:30.000000000 -0700
> +++ g26/drivers/base/power/main.c	2007-03-31 11:02:28.000000000 -0700
> @@ -52,12 +52,17 @@ EXPORT_SYMBOL_GPL(device_pm_set_parent);
>  int device_pm_add(struct device * dev)
>  {
>  	int error;
> +	struct device *parent = dev->parent;
>  
> -	pr_debug("PM: Adding info for %s:%s\n",
> -		 dev->bus ? dev->bus->name : "No Bus", dev->kobj.name);
> +	pr_debug("PM: Adding info for %s:%s, after %s\n",
> +		 dev->bus ? dev->bus->name : "No Bus", dev->kobj.name,
> +		 parent ? parent->bus_id : "(no parent)");
>  	down(&dpm_list_sem);
> -	list_add_tail(&dev->power.entry, &dpm_active);
> -	device_pm_set_parent(dev, dev->parent);
> +	if (parent)
> +		list_add(&dev->power.entry, &parent->power.entry);
> +	else
> +		list_add_tail(&dev->power.entry, &dpm_active);
> +	device_pm_set_parent(dev, parent);
>  	if ((error = dpm_sysfs_add(dev)))
>  		list_del(&dev->power.entry);
>  	up(&dpm_list_sem);
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
       [not found]                   ` <200703311232.57505.david-b@pacbell.net>
@ 2007-04-01  3:13                     ` Jeff Chua
  2007-04-01  4:13                       ` David Brownell
  0 siblings, 1 reply; 30+ messages in thread
From: Jeff Chua @ 2007-04-01  3:13 UTC (permalink / raw)
  To: David Brownell
  Cc: Andrew Morton, linux-acpi, Sergei Shtylyov, jgarzik, gregkh,
	Linux Kernel Mailing List, Adrian Bunk, linux-ide, linux-pci,
	Thomas Gleixner, Eric W. Biederman, Jens Axboe,
	Michael S. Tsirkin, Ingo Molnar, Linus Torvalds, linux-pm,
	Maxim Levitsky

On 4/1/07, David Brownell <david-b@pacbell.net> wrote:
> for those will all get grouped together ... suspended "very late" and
> resumed "very early", regardless of when they get registered.  Pretty
> much the driver model parts of what Linus was suggesting; clockevent
> bits would still be needed.

I tested your patch with CONFIG_NO_HZ=y, but it still hangs while
suspending to disk (before the percent saving).

But one discovery. I get tired of all these hangs, so I decided to
press some keys and the power button. Accidentally, the suspend
proceeded and successfully suspended!

I tried many more times, and discovered that to proceed with the
suspend, hit any 4 keys slowly. (e.g. "F1 F2 F3 F4", or "1 2 3 4").

My .config has CONFIG_NO_HZ=y and CONFIG_HIGH_RES_TIMERS=y, but I
suppose CONFIG_HIGH_RES_TIMERS=y gots nothing to do with the hang.

I went back with my vanilla 2.6.21-rc5 with Maxim's patch, and with
hitting the keys, I could suspend to disk with CONFIG_NO_HZ=y and
CONFIG_HIGH_RES_TIMERS=y.

Jeff.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-04-01  3:13                     ` Jeff Chua
@ 2007-04-01  4:13                       ` David Brownell
  0 siblings, 0 replies; 30+ messages in thread
From: David Brownell @ 2007-04-01  4:13 UTC (permalink / raw)
  To: Jeff Chua
  Cc: Andrew Morton, linux-acpi, Sergei Shtylyov, jgarzik, gregkh,
	Linux Kernel Mailing List, Adrian Bunk, linux-ide, linux-pci,
	Thomas Gleixner, Eric W. Biederman, Jens Axboe,
	Michael S. Tsirkin, Ingo Molnar, Linus Torvalds, linux-pm,
	Maxim Levitsky

On Saturday 31 March 2007 8:13 pm, Jeff Chua wrote:
> On 4/1/07, David Brownell <david-b@pacbell.net> wrote:
> > for those will all get grouped together ... suspended "very late" and
> > resumed "very early", regardless of when they get registered.  Pretty
> > much the driver model parts of what Linus was suggesting; clockevent
> > bits would still be needed.
> 
> I tested your patch with CONFIG_NO_HZ=y, but it still hangs while
> suspending to disk (before the percent saving).

Of course.  No clockevent updates...

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [linux-pm] [PATCH v2] Add suspend/resume for HPET
@ 2007-04-02 14:16 Alan Stern
  2007-04-02 17:09 ` Linus Torvalds
  2007-04-02 19:16 ` David Brownell
  0 siblings, 2 replies; 30+ messages in thread
From: Alan Stern @ 2007-04-02 14:16 UTC (permalink / raw)
  To: David Brownell
  Cc: Maxim Levitsky, Ingo Molnar, Linus Torvalds, Greg KH,
	Thomas Gleixner, Kernel development list, Linux-pm mailing list

On Saturday March 31, 2007, David Brownell wrote:

> I'm about ready to test the appended patch... a "move one device" call
> might be safest at this point in the release cycle though.
> 
> - Dave
> 
> ========================	SNIP!
> Change how the PM list is constructed, so that devices are added right
> after their parents (when they have one) rather than at the end of the
> list.  This preserves sequencing guarantees, but enables sequencing of
> suspend/resume operations by more important characteristics than "when
> device happened to enumerate" ... e.g. clocksources and clockevents
> at a clearly defined point during suspend and resume.
> 
> This patch has a potential downside for devices that have multiple
> power dependencies and which "just happened to work" before.

Dave:

Would you mind retracting this patch?  It will interfere with some work
I've been doing in USB, work that relies on exactly the sort of multiple 
power dependency you mention.

The problem is this: When a low- or full-speed USB device is resumed 
following a power loss (this is part of the "persistent USB" patch), it's 
necessary for the corresponding EHCI root hub to be resumed first so that 
the port's OWNER bit can be set properly.

It works fine as long as devices are resumed in order of registration,
because a low- or full-speed USB device can't be registered before the
high-speed root hub.  (If it was, it would be disconnected when the
high-speed root hub initialized and took control of the port.)  So this
isn't a "just happened to work" situation; it really is a critical
ordering dependency.

The clock source problem can be solved in other ways.  For instance, right 
now things are set up so that clock sources are registered at various 
random times and the clockevents code adapts to use the best available 
device, right?.  So simply arrange for a clock source to "depart" as it is 
suspended; then clockevents can fall back to using other lower-precision 
sources.  As long as devices suspend in reverse order of discovery, the 
system should always function properly.

Alan Stern

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-04-02 14:16 [linux-pm] [PATCH v2] Add suspend/resume for HPET Alan Stern
@ 2007-04-02 17:09 ` Linus Torvalds
  2007-04-02 19:16 ` David Brownell
  1 sibling, 0 replies; 30+ messages in thread
From: Linus Torvalds @ 2007-04-02 17:09 UTC (permalink / raw)
  To: Alan Stern
  Cc: Maxim Levitsky, Ingo Molnar, Linux-pm mailing list,
	Thomas Gleixner, Kernel development list



On Mon, 2 Apr 2007, Alan Stern wrote:
> 
> Would you mind retracting this patch?  It will interfere with some work
> I've been doing in USB, work that relies on exactly the sort of multiple 
> power dependency you mention.

I agree. We should just make the damn timers be added at the right point. 
We can for example *add* the devices early, even if we then actually start 
using them at some later date.

			Linus

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-04-02 14:16 [linux-pm] [PATCH v2] Add suspend/resume for HPET Alan Stern
  2007-04-02 17:09 ` Linus Torvalds
@ 2007-04-02 19:16 ` David Brownell
  2007-04-02 20:04   ` Alan Stern
  1 sibling, 1 reply; 30+ messages in thread
From: David Brownell @ 2007-04-02 19:16 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux-pm mailing list, Maxim Levitsky, Ingo Molnar,
	Linus Torvalds, Thomas Gleixner, Kernel development list

On Monday 02 April 2007 7:16 am, Alan Stern wrote:
> On Saturday March 31, 2007, David Brownell wrote:
> 
> > Change how the PM list is constructed, so that devices are added right
> > after their parents (when they have one) rather than at the end of the
> > list.  ...
> > 
> > This patch has a potential downside for devices that have multiple
> > power dependencies and which "just happened to work" before.
> 
> Dave:
> 
> Would you mind retracting this patch?  It will interfere with some work
> I've been doing in USB, work that relies on exactly the sort of multiple 
> power dependency you mention.

It's out there for discussion right now more than merging.

What the driver model does now is problematic.  Even if the clockevent
stuff gets different fixes, those driver model issues still exist:  the
existence of dependencies that are not part of the device tree.


> The problem is this: When a low- or full-speed USB device is resumed 
> following a power loss (this is part of the "persistent USB" patch), it's 

Yeah, that "persistent USB" stuff that I don't like at all!   Power off
is power off, and should be treated just like any other case of the USB
circuit being broken.  (Offtopic here, of course.)


> necessary for the corresponding EHCI root hub to be resumed first so that 
> the port's OWNER bit can be set properly.
> 
> It works fine as long as devices are resumed in order of registration,
> because a low- or full-speed USB device can't be registered before the
> high-speed root hub.  (If it was, it would be disconnected when the
> high-speed root hub initialized and took control of the port.)  So this
> isn't a "just happened to work" situation; it really is a critical
> ordering dependency.

No, it's a "just happened to work" because the only ordering promise
that was explicitly made is that the parent/child relationships will
be obeyed.  Any additional ordering dependency is out-of-scope of the
current driver model -- and that's a proble that eventually needs to
be fixed.

This is the kind of thing that the pm_parent relationship was (AFAICT)
originally supposed to handle.  Of course, it doesn't/can't, given the
current implementation ... that relationship is never used.


> The clock source problem can be solved in other ways.  For instance, right 
> now things are set up so that clock sources are registered at various 
> random times and the clockevents code adapts to use the best available 
> device, right?.

Clocksource ~= counter, clockevent ~= timer irq ... you're mixing the
two, they're different!  Both can be registered at various times; and
the best available instance of both is used.  (Unfortunately "best" is
not a static policy on x86 hardware.)


> So simply arrange for a clock source to "depart" as it is  
> suspended; then clockevents can fall back to using other lower-precision 
> sources.  As long as devices suspend in reverse order of discovery, the 
> system should always function properly.

It's not that simple though, especially with HPET.  The BIOS may expect
the PIT to work, but Linux currently (and problematically!) uses HPET in
"legacy replacement mode".  And ISTR the problems are coming up when the
system is already in a low-functionality state:  IRQs off everywhere,
even timer ticks have stopped.

- Dave

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-04-02 19:16 ` David Brownell
@ 2007-04-02 20:04   ` Alan Stern
  2007-04-03  5:54     ` Thomas Gleixner
  2007-04-04 15:06     ` David Brownell
  0 siblings, 2 replies; 30+ messages in thread
From: Alan Stern @ 2007-04-02 20:04 UTC (permalink / raw)
  To: David Brownell
  Cc: Linux-pm mailing list, Maxim Levitsky, Ingo Molnar,
	Linus Torvalds, Thomas Gleixner, Kernel development list

On Mon, 2 Apr 2007, David Brownell wrote:

> What the driver model does now is problematic.  Even if the clockevent
> stuff gets different fixes, those driver model issues still exist:  the
> existence of dependencies that are not part of the device tree.

> No, it's a "just happened to work" because the only ordering promise
> that was explicitly made is that the parent/child relationships will
> be obeyed.  Any additional ordering dependency is out-of-scope of the
> current driver model -- and that's a proble that eventually needs to
> be fixed.
> 
> This is the kind of thing that the pm_parent relationship was (AFAICT)
> originally supposed to handle.  Of course, it doesn't/can't, given the
> current implementation ... that relationship is never used.

Just so.  In fact, there almost certainly are other dependencies that 
nobody is aware of, simply because they have never had a chance to bite.

Such things can be rather difficult to pin down when they occur.  I would
be happy enough to leave matters as they are, with a strict LIFO approach.  
If someone ever tries to parallelize suspend/resume in multiple threads,
they will have their work cut out for them -- even probing ran into
trouble when attempts were made to parallelize it, and it's a lot simpler.


> It's not that simple though, especially with HPET.  The BIOS may expect
> the PIT to work, but Linux currently (and problematically!) uses HPET in
> "legacy replacement mode".  And ISTR the problems are coming up when the
> system is already in a low-functionality state:  IRQs off everywhere,
> even timer ticks have stopped.

I know nothing about the workings of the HPET and other clock code.  My 
point was this: Suspend passes through various intermediate stages in 
which some devices are available and others aren't.  So long as those 
stages are exact duplicates (in reverse order) of the stages that occurred 
during startup, it should be possible to make them all work.

Alan Stern

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-04-02 20:04   ` Alan Stern
@ 2007-04-03  5:54     ` Thomas Gleixner
  2007-04-04 15:06     ` David Brownell
  1 sibling, 0 replies; 30+ messages in thread
From: Thomas Gleixner @ 2007-04-03  5:54 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux-pm mailing list, Maxim Levitsky, Ingo Molnar,
	Linus Torvalds, Kernel development list

On Mon, 2007-04-02 at 16:04 -0400, Alan Stern wrote:
> > It's not that simple though, especially with HPET.  The BIOS may expect
> > the PIT to work, but Linux currently (and problematically!) uses HPET in
> > "legacy replacement mode".  And ISTR the problems are coming up when the
> > system is already in a low-functionality state:  IRQs off everywhere,
> > even timer ticks have stopped.
> 
> I know nothing about the workings of the HPET and other clock code.  My 
> point was this: Suspend passes through various intermediate stages in 
> which some devices are available and others aren't.  So long as those 
> stages are exact duplicates (in reverse order) of the stages that occurred 
> during startup, it should be possible to make them all work.

Unfortunately it is not a fully linear problem. Devices are initialized
late and put the system into a more complex state (i.e. dynticks,
highres) which needs to be suspended and resumed. If we want to do this
completely linear we need to do a full reverse rollback of the system
states, which moves even more complexity into such systems.

Also the linear approach is not working with other devices, as one can
see with the still unresolved "IRQ#X nobody cared" issues at resume,
which break my laptop. It works nice on startup of the system, but
breaks on resume.

	tglx

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v2] Add suspend/resume for HPET
  2007-04-02 20:04   ` Alan Stern
  2007-04-03  5:54     ` Thomas Gleixner
@ 2007-04-04 15:06     ` David Brownell
  1 sibling, 0 replies; 30+ messages in thread
From: David Brownell @ 2007-04-04 15:06 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux-pm mailing list, Maxim Levitsky, Ingo Molnar,
	Linus Torvalds, Thomas Gleixner, Kernel development list

On Monday 02 April 2007 1:04 pm, Alan Stern wrote:
> On Mon, 2 Apr 2007, David Brownell wrote:

> > This is the kind of thing that the pm_parent relationship was (AFAICT)
> > originally supposed to handle.  Of course, it doesn't/can't, given the
> > current implementation ... that relationship is never used.
> 
> Just so.  In fact, there almost certainly are other dependencies that 
> nobody is aware of, simply because they have never had a chance to bite.

In any given system, yes there are bugs lurking.  But I was more concerned
with a provably wrong assumption made by the current framework.  Such things
cause cascading fragility.

As Thomas mentioned, HPET isn't the only place where a "linear" model fails.


> Such things can be rather difficult to pin down when they occur.  I would
> be happy enough to leave matters as they are, with a strict LIFO approach.

I wouldn't.  Much better to have a solid handle on the interdependencies
than to need to cope, long term, with a framework that doesn't allow that.

Remember also that a LIFO model assumes that there's only one sequence by
which the hardware powers up/down ... i.e. that there's no runtime PM going
on, whereby large chunks are regularly powered down/up based on usage.
Better runtime PM becomes more important as system complexity rises.

- Dave

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2007-04-04 15:06 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-04-02 14:16 [linux-pm] [PATCH v2] Add suspend/resume for HPET Alan Stern
2007-04-02 17:09 ` Linus Torvalds
2007-04-02 19:16 ` David Brownell
2007-04-02 20:04   ` Alan Stern
2007-04-03  5:54     ` Thomas Gleixner
2007-04-04 15:06     ` David Brownell
     [not found] <Pine.LNX.4.64.0703160925270.26106@woody.linux-foundation.org>
2007-03-29  5:47 ` [ PATCH] Add suspend/resume for HPET was: Re: [3/6] 2.6.21-rc4: known regressions Maxim
2007-03-29 16:35   ` Linus Torvalds
2007-03-29 16:51     ` Maxim Levitsky
2007-03-29 17:22       ` Linus Torvalds
2007-03-29 17:47         ` [patch, v2] add suspend/resume for HPET Ingo Molnar
2007-03-29 13:20 ` [ PATCH] Add suspend/resume for HPET was: Re: [3/6] 2.6.21-rc4: known regressions Sergei Shtylyov
2007-03-29 13:31   ` Maxim
2007-03-29 13:46     ` [PATCH v2] Add suspend/resume for HPET Maxim Levitsky
2007-03-29 16:53       ` Linus Torvalds
2007-03-29 17:28         ` Maxim Levitsky
2007-03-29 17:51         ` Ingo Molnar
2007-03-29 20:46           ` Andi Kleen
2007-03-29 18:11       ` Jeff Chua
2007-03-31 15:51       ` Thomas Gleixner
2007-03-31 16:01         ` Jeff Chua
2007-03-31 16:09           ` Thomas Gleixner
2007-03-31 16:09         ` Linus Torvalds
2007-03-31 16:33           ` Thomas Gleixner
2007-03-31 16:41             ` Greg KH
2007-03-31 16:53             ` Linus Torvalds
2007-03-31 17:02               ` Ingo Molnar
2007-03-31 18:18                 ` David Brownell
     [not found]                 ` <200703311118.55132.david-b@pacbell.net>
2007-03-31 19:32                   ` David Brownell
     [not found]                   ` <200703311232.57505.david-b@pacbell.net>
2007-04-01  3:13                     ` Jeff Chua
2007-04-01  4:13                       ` David Brownell
2007-03-31 17:08               ` Greg KH
2007-03-31 16:56         ` Maxim Levitsky
2007-03-31 17:09           ` Linus Torvalds
2007-03-31 17:17             ` Ingo Molnar
2007-03-31 17:58               ` Daniel Walker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox