* [PATCH 1/2] Add some basic resume trace facilities
2006-06-13 21:30 [PATCH 0/2] suspend-to-ram debugging patches Linus Torvalds
@ 2006-06-13 21:35 ` Linus Torvalds
2006-06-13 22:10 ` Nigel Cunningham
2006-06-14 10:25 ` Pavel Machek
0 siblings, 2 replies; 10+ messages in thread
From: Linus Torvalds @ 2006-06-13 21:35 UTC (permalink / raw)
To: Power management list
Considering that there isn't a lot of hw we can depend on during
resume, this is about as good as it gets.
Use "#include <linux/resume-trace.h>", and then sprinkle TRACE_RESUME(0)
commands liberally over the driver that you're trying to figure out why
and where it hangs. Expect to waste a _lot_ of time, but at least this
gives you _some_ chance to actually debug it, instead of just staring at a
dead machine.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
Not a lot of space in the RTC, but it's the only piece of hardware that is
(a) reachable at all times, regardless of any other setup and (b) doesn't
lose it state or have firmware reset the memory of at boot.
Side note: you really don't want to do this unless you have an external
time-source like NTP that resets the clock to the right value after the
boot is done ;)
diff --git a/arch/i386/kernel/vmlinux.lds.S b/arch/i386/kernel/vmlinux.lds.S
index 8831303..509af98 100644
--- a/arch/i386/kernel/vmlinux.lds.S
+++ b/arch/i386/kernel/vmlinux.lds.S
@@ -37,6 +37,13 @@ SECTIONS
RODATA
+ . = ALIGN(4);
+ __tracedata_start = .;
+ .tracedata : AT(ADDR(.tracedata) - LOAD_OFFSET) {
+ *(.tracedata)
+ }
+ __tracedata_end = .;
+
/* writeable */
.data : AT(ADDR(.data) - LOAD_OFFSET) { /* Data */
*(.data)
diff --git a/drivers/base/power/Makefile b/drivers/base/power/Makefile
index c0219ad..adc4250 100644
--- a/drivers/base/power/Makefile
+++ b/drivers/base/power/Makefile
@@ -1,5 +1,5 @@
obj-y := shutdown.o
-obj-$(CONFIG_PM) += main.o suspend.o resume.o runtime.o sysfs.o
+obj-$(CONFIG_PM) += main.o suspend.o resume.o runtime.o sysfs.o trace.o
ifeq ($(CONFIG_DEBUG_DRIVER),y)
EXTRA_CFLAGS += -DDEBUG
diff --git a/drivers/base/power/trace.c b/drivers/base/power/trace.c
new file mode 100644
index 0000000..bcc5f12
--- /dev/null
+++ b/drivers/base/power/trace.c
@@ -0,0 +1,228 @@
+/*
+ * drivers/base/power/trace.c
+ *
+ * Copyright (C) 2006 Linus Torvalds
+ *
+ * Trace facility for suspend/resume problems, when none of the
+ * devices may be working.
+ */
+
+#include <linux/resume-trace.h>
+#include <linux/rtc.h>
+
+#include <asm/rtc.h>
+
+#include "power.h"
+
+/*
+ * Horrid, horrid, horrid.
+ *
+ * It turns out that the _only_ piece of hardware that actually
+ * keeps its value across a hard boot (and, more importantly, the
+ * POST init sequence) is literally the realtime clock.
+ *
+ * Never mind that an RTC chip has 114 bytes (and often a whole
+ * other bank of an additional 128 bytes) of nice SRAM that is
+ * _designed_ to keep data - the POST will clear it. So we literally
+ * can just use the few bytes of actual time data, which means that
+ * we're really limited.
+ *
+ * It means, for example, that we can't use the seconds at all
+ * (since the time between the hang and the boot might be more
+ * than a minute), and we'd better not depend on the low bits of
+ * the minutes either.
+ *
+ * There are the wday fields etc, but I wouldn't guarantee those
+ * are dependable either. And if the date isn't valid, either the
+ * hw or POST will do strange things.
+ *
+ * So we're left with:
+ * - year: 0-99
+ * - month: 0-11
+ * - day-of-month: 1-28
+ * - hour: 0-23
+ * - min: (0-30)*2
+ *
+ * Giving us a total range of 0-16128000 (0xf61800), ie less
+ * than 24 bits of actual data we can save across reboots.
+ *
+ * And if your box can't boot in less than three minutes,
+ * you're screwed.
+ *
+ * Now, almost 24 bits of data is pitifully small, so we need
+ * to be pretty dense if we want to use it for anything nice.
+ * What we do is that instead of saving off nice readable info,
+ * we save off _hashes_ of information that we can hopefully
+ * regenerate after the reboot.
+ *
+ * In particular, this means that we might be unlucky, and hit
+ * a case where we have a hash collision, and we end up not
+ * being able to tell for certain exactly which case happened.
+ * But that's hopefully unlikely.
+ *
+ * What we do is to take the bits we can fit, and split them
+ * into three parts (16*997*1009 = 16095568), and use the values
+ * for:
+ * - 0-15: user-settable
+ * - 0-996: file + line number
+ * - 0-1008: device
+ */
+#define USERHASH (16)
+#define FILEHASH (997)
+#define DEVHASH (1009)
+
+#define DEVSEED (7919)
+
+static unsigned int dev_hash_value;
+
+static int set_magic_time(unsigned int user, unsigned int file, unsigned int device)
+{
+ unsigned int n = user + USERHASH*(file + FILEHASH*device);
+
+ // June 7th, 2006
+ static struct rtc_time time = {
+ .tm_sec = 0,
+ .tm_min = 0,
+ .tm_hour = 0,
+ .tm_mday = 7,
+ .tm_mon = 5, // June - counting from zero
+ .tm_year = 106,
+ .tm_wday = 3,
+ .tm_yday = 160,
+ .tm_isdst = 1
+ };
+
+ time.tm_year = (n % 100);
+ n /= 100;
+ time.tm_mon = (n % 12);
+ n /= 12;
+ time.tm_mday = (n % 28) + 1;
+ n /= 28;
+ time.tm_hour = (n % 24);
+ n /= 24;
+ time.tm_min = (n % 20) * 3;
+ n /= 20;
+ set_rtc_time(&time);
+ return n ? -1 : 0;
+}
+
+static unsigned int read_magic_time(void)
+{
+ struct rtc_time time;
+ unsigned int val;
+
+ get_rtc_time(&time);
+ printk("Time: %2d:%02d:%02d Date: %02d/%02d/%02d\n",
+ time.tm_hour, time.tm_min, time.tm_sec,
+ time.tm_mon, time.tm_mday, time.tm_year);
+ val = time.tm_year; /* 100 years */
+ if (val > 100)
+ val -= 100;
+ val += time.tm_mon * 100; /* 12 months */
+ val += (time.tm_mday-1) * 100 * 12; /* 28 month-days */
+ val += time.tm_hour * 100 * 12 * 28; /* 24 hours */
+ val += (time.tm_min / 3) * 100 * 12 * 28 * 24; /* 20 3-minute intervals */
+ return val;
+}
+
+/*
+ * This is just the sdbm hash function with a user-supplied
+ * seed and final size parameter.
+ */
+static unsigned int hash_string(unsigned int seed, const char *data, unsigned int mod)
+{
+ unsigned char c;
+ while ((c = *data++) != 0) {
+ seed = (seed << 16) + (seed << 6) - seed + c;
+ }
+ return seed % mod;
+}
+
+void set_trace_device(struct device *dev)
+{
+ dev_hash_value = hash_string(DEVSEED, dev->bus_id, DEVHASH);
+}
+
+/*
+ * We could just take the "tracedata" index into the .tracedata
+ * section instead. Generating a hash of the data gives us a
+ * chance to work across kernel versions, and perhaps more
+ * importantly it also gives us valid/invalid check (ie we will
+ * likely not give totally bogus reports - if the hash matches,
+ * it's not any guarantee, but it's a high _likelihood_ that
+ * the match is valid).
+ */
+void generate_resume_trace(void *tracedata, unsigned int user)
+{
+ unsigned short lineno = *(unsigned short *)tracedata;
+ const char *file = *(const char **)(tracedata + 2);
+ unsigned int user_hash_value, file_hash_value;
+
+ user_hash_value = user % USERHASH;
+ file_hash_value = hash_string(lineno, file, FILEHASH);
+ set_magic_time(user_hash_value, file_hash_value, dev_hash_value);
+}
+
+extern char __tracedata_start, __tracedata_end;
+static int show_file_hash(unsigned int value)
+{
+ int match;
+ char *tracedata;
+
+ match = 0;
+ for (tracedata = &__tracedata_start ; tracedata < &__tracedata_end ; tracedata += 6) {
+ unsigned short lineno = *(unsigned short *)tracedata;
+ const char *file = *(const char **)(tracedata + 2);
+ unsigned int hash = hash_string(lineno, file, FILEHASH);
+ if (hash != value)
+ continue;
+ printk(" hash matches %s:%u\n", file, lineno);
+ match++;
+ }
+ return match;
+}
+
+static int show_dev_hash(unsigned int value)
+{
+ int match = 0;
+ struct list_head * entry = dpm_active.prev;
+
+ while (entry != &dpm_active) {
+ struct device * dev = to_device(entry);
+ unsigned int hash = hash_string(DEVSEED, dev->bus_id, DEVHASH);
+ if (hash == value) {
+ printk(" hash matches device %s\n", dev->bus_id);
+ match++;
+ }
+ entry = entry->prev;
+ }
+ return match;
+}
+
+static unsigned int hash_value_early_read;
+
+static int early_resume_init(void)
+{
+ hash_value_early_read = read_magic_time();
+ return 0;
+}
+
+static int late_resume_init(void)
+{
+ unsigned int val = hash_value_early_read;
+ unsigned int user, file, dev;
+
+ user = val % USERHASH;
+ val = val / USERHASH;
+ file = val % FILEHASH;
+ val = val / FILEHASH;
+ dev = val /* % DEVHASH */;
+
+ printk(" Magic number: %d:%d:%d\n", user, file, dev);
+ show_file_hash(file);
+ show_dev_hash(dev);
+ return 0;
+}
+
+core_initcall(early_resume_init);
+late_initcall(late_resume_init);
diff --git a/include/asm-generic/rtc.h b/include/asm-generic/rtc.h
index cef08db..4087037 100644
--- a/include/asm-generic/rtc.h
+++ b/include/asm-generic/rtc.h
@@ -114,6 +114,7 @@ #endif
/* Set the current date and time in the real time clock. */
static inline int set_rtc_time(struct rtc_time *time)
{
+ unsigned long flags;
unsigned char mon, day, hrs, min, sec;
unsigned char save_control, save_freq_select;
unsigned int yrs;
@@ -131,7 +132,7 @@ #endif
if (yrs > 255) /* They are unsigned */
return -EINVAL;
- spin_lock_irq(&rtc_lock);
+ spin_lock_irqsave(&rtc_lock, flags);
#ifdef CONFIG_MACH_DECSTATION
real_yrs = yrs;
leap_yr = ((!((yrs + 1900) % 4) && ((yrs + 1900) % 100)) ||
@@ -152,7 +153,7 @@ #endif
* whether the chip is in binary mode or not.
*/
if (yrs > 169) {
- spin_unlock_irq(&rtc_lock);
+ spin_unlock_irqrestore(&rtc_lock, flags);
return -EINVAL;
}
@@ -187,7 +188,7 @@ #endif
CMOS_WRITE(save_control, RTC_CONTROL);
CMOS_WRITE(save_freq_select, RTC_FREQ_SELECT);
- spin_unlock_irq(&rtc_lock);
+ spin_unlock_irqrestore(&rtc_lock, flags);
return 0;
}
diff --git a/include/linux/resume-trace.h b/include/linux/resume-trace.h
new file mode 100644
index 0000000..e2e1e14
--- /dev/null
+++ b/include/linux/resume-trace.h
@@ -0,0 +1,21 @@
+#ifndef RESUME_TRACE_H
+#define RESUME_TRACE_H
+
+struct device;
+extern void set_trace_device(struct device *);
+extern void generate_resume_trace(void *tracedata, unsigned int user);
+
+#define TRACE_DEVICE(dev) set_trace_device(dev)
+#define TRACE_RESUME(user) do { \
+ void *tracedata; \
+ asm volatile("movl $1f,%0\n" \
+ ".section .tracedata,\"a\"\n" \
+ "1:\t.word %c1\n" \
+ "\t.long %c2\n" \
+ ".previous" \
+ :"=r" (tracedata) \
+ : "i" (__LINE__), "i" (__FILE__)); \
+ generate_resume_trace(tracedata, user); \
+} while (0)
+
+#endif
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] Add some basic resume trace facilities
2006-06-13 21:35 ` [PATCH 1/2] Add some basic resume trace facilities Linus Torvalds
@ 2006-06-13 22:10 ` Nigel Cunningham
2006-06-13 22:50 ` Linus Torvalds
2006-06-14 10:25 ` Pavel Machek
1 sibling, 1 reply; 10+ messages in thread
From: Nigel Cunningham @ 2006-06-13 22:10 UTC (permalink / raw)
To: linux-pm; +Cc: Linus Torvalds
[-- Attachment #1.1: Type: text/plain, Size: 11611 bytes --]
Hi.
On Wednesday 14 June 2006 07:35, Linus Torvalds wrote:
> Considering that there isn't a lot of hw we can depend on during
> resume, this is about as good as it gets.
>
> Use "#include <linux/resume-trace.h>", and then sprinkle TRACE_RESUME(0)
> commands liberally over the driver that you're trying to figure out why
> and where it hangs. Expect to waste a _lot_ of time, but at least this
> gives you _some_ chance to actually debug it, instead of just staring at a
> dead machine.
>
> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
s/On laptop per child/One bdi2000 per computer/? I'll give it a try.
Regards,
Nigel
> ---
>
> Not a lot of space in the RTC, but it's the only piece of hardware that is
> (a) reachable at all times, regardless of any other setup and (b) doesn't
> lose it state or have firmware reset the memory of at boot.
>
> Side note: you really don't want to do this unless you have an external
> time-source like NTP that resets the clock to the right value after the
> boot is done ;)
>
> diff --git a/arch/i386/kernel/vmlinux.lds.S
> b/arch/i386/kernel/vmlinux.lds.S index 8831303..509af98 100644
> --- a/arch/i386/kernel/vmlinux.lds.S
> +++ b/arch/i386/kernel/vmlinux.lds.S
> @@ -37,6 +37,13 @@ SECTIONS
>
> RODATA
>
> + . = ALIGN(4);
> + __tracedata_start = .;
> + .tracedata : AT(ADDR(.tracedata) - LOAD_OFFSET) {
> + *(.tracedata)
> + }
> + __tracedata_end = .;
> +
> /* writeable */
> .data : AT(ADDR(.data) - LOAD_OFFSET) { /* Data */
> *(.data)
> diff --git a/drivers/base/power/Makefile b/drivers/base/power/Makefile
> index c0219ad..adc4250 100644
> --- a/drivers/base/power/Makefile
> +++ b/drivers/base/power/Makefile
> @@ -1,5 +1,5 @@
> obj-y := shutdown.o
> -obj-$(CONFIG_PM) += main.o suspend.o resume.o runtime.o sysfs.o
> +obj-$(CONFIG_PM) += main.o suspend.o resume.o runtime.o sysfs.o trace.o
>
> ifeq ($(CONFIG_DEBUG_DRIVER),y)
> EXTRA_CFLAGS += -DDEBUG
> diff --git a/drivers/base/power/trace.c b/drivers/base/power/trace.c
> new file mode 100644
> index 0000000..bcc5f12
> --- /dev/null
> +++ b/drivers/base/power/trace.c
> @@ -0,0 +1,228 @@
> +/*
> + * drivers/base/power/trace.c
> + *
> + * Copyright (C) 2006 Linus Torvalds
> + *
> + * Trace facility for suspend/resume problems, when none of the
> + * devices may be working.
> + */
> +
> +#include <linux/resume-trace.h>
> +#include <linux/rtc.h>
> +
> +#include <asm/rtc.h>
> +
> +#include "power.h"
> +
> +/*
> + * Horrid, horrid, horrid.
> + *
> + * It turns out that the _only_ piece of hardware that actually
> + * keeps its value across a hard boot (and, more importantly, the
> + * POST init sequence) is literally the realtime clock.
> + *
> + * Never mind that an RTC chip has 114 bytes (and often a whole
> + * other bank of an additional 128 bytes) of nice SRAM that is
> + * _designed_ to keep data - the POST will clear it. So we literally
> + * can just use the few bytes of actual time data, which means that
> + * we're really limited.
> + *
> + * It means, for example, that we can't use the seconds at all
> + * (since the time between the hang and the boot might be more
> + * than a minute), and we'd better not depend on the low bits of
> + * the minutes either.
> + *
> + * There are the wday fields etc, but I wouldn't guarantee those
> + * are dependable either. And if the date isn't valid, either the
> + * hw or POST will do strange things.
> + *
> + * So we're left with:
> + * - year: 0-99
> + * - month: 0-11
> + * - day-of-month: 1-28
> + * - hour: 0-23
> + * - min: (0-30)*2
> + *
> + * Giving us a total range of 0-16128000 (0xf61800), ie less
> + * than 24 bits of actual data we can save across reboots.
> + *
> + * And if your box can't boot in less than three minutes,
> + * you're screwed.
> + *
> + * Now, almost 24 bits of data is pitifully small, so we need
> + * to be pretty dense if we want to use it for anything nice.
> + * What we do is that instead of saving off nice readable info,
> + * we save off _hashes_ of information that we can hopefully
> + * regenerate after the reboot.
> + *
> + * In particular, this means that we might be unlucky, and hit
> + * a case where we have a hash collision, and we end up not
> + * being able to tell for certain exactly which case happened.
> + * But that's hopefully unlikely.
> + *
> + * What we do is to take the bits we can fit, and split them
> + * into three parts (16*997*1009 = 16095568), and use the values
> + * for:
> + * - 0-15: user-settable
> + * - 0-996: file + line number
> + * - 0-1008: device
> + */
> +#define USERHASH (16)
> +#define FILEHASH (997)
> +#define DEVHASH (1009)
> +
> +#define DEVSEED (7919)
> +
> +static unsigned int dev_hash_value;
> +
> +static int set_magic_time(unsigned int user, unsigned int file, unsigned
> int device) +{
> + unsigned int n = user + USERHASH*(file + FILEHASH*device);
> +
> + // June 7th, 2006
> + static struct rtc_time time = {
> + .tm_sec = 0,
> + .tm_min = 0,
> + .tm_hour = 0,
> + .tm_mday = 7,
> + .tm_mon = 5, // June - counting from zero
> + .tm_year = 106,
> + .tm_wday = 3,
> + .tm_yday = 160,
> + .tm_isdst = 1
> + };
> +
> + time.tm_year = (n % 100);
> + n /= 100;
> + time.tm_mon = (n % 12);
> + n /= 12;
> + time.tm_mday = (n % 28) + 1;
> + n /= 28;
> + time.tm_hour = (n % 24);
> + n /= 24;
> + time.tm_min = (n % 20) * 3;
> + n /= 20;
> + set_rtc_time(&time);
> + return n ? -1 : 0;
> +}
> +
> +static unsigned int read_magic_time(void)
> +{
> + struct rtc_time time;
> + unsigned int val;
> +
> + get_rtc_time(&time);
> + printk("Time: %2d:%02d:%02d Date: %02d/%02d/%02d\n",
> + time.tm_hour, time.tm_min, time.tm_sec,
> + time.tm_mon, time.tm_mday, time.tm_year);
> + val = time.tm_year; /* 100 years */
> + if (val > 100)
> + val -= 100;
> + val += time.tm_mon * 100; /* 12 months */
> + val += (time.tm_mday-1) * 100 * 12; /* 28 month-days */
> + val += time.tm_hour * 100 * 12 * 28; /* 24 hours */
> + val += (time.tm_min / 3) * 100 * 12 * 28 * 24; /* 20 3-minute intervals
> */ + return val;
> +}
> +
> +/*
> + * This is just the sdbm hash function with a user-supplied
> + * seed and final size parameter.
> + */
> +static unsigned int hash_string(unsigned int seed, const char *data,
> unsigned int mod) +{
> + unsigned char c;
> + while ((c = *data++) != 0) {
> + seed = (seed << 16) + (seed << 6) - seed + c;
> + }
> + return seed % mod;
> +}
> +
> +void set_trace_device(struct device *dev)
> +{
> + dev_hash_value = hash_string(DEVSEED, dev->bus_id, DEVHASH);
> +}
> +
> +/*
> + * We could just take the "tracedata" index into the .tracedata
> + * section instead. Generating a hash of the data gives us a
> + * chance to work across kernel versions, and perhaps more
> + * importantly it also gives us valid/invalid check (ie we will
> + * likely not give totally bogus reports - if the hash matches,
> + * it's not any guarantee, but it's a high _likelihood_ that
> + * the match is valid).
> + */
> +void generate_resume_trace(void *tracedata, unsigned int user)
> +{
> + unsigned short lineno = *(unsigned short *)tracedata;
> + const char *file = *(const char **)(tracedata + 2);
> + unsigned int user_hash_value, file_hash_value;
> +
> + user_hash_value = user % USERHASH;
> + file_hash_value = hash_string(lineno, file, FILEHASH);
> + set_magic_time(user_hash_value, file_hash_value, dev_hash_value);
> +}
> +
> +extern char __tracedata_start, __tracedata_end;
> +static int show_file_hash(unsigned int value)
> +{
> + int match;
> + char *tracedata;
> +
> + match = 0;
> + for (tracedata = &__tracedata_start ; tracedata < &__tracedata_end ;
> tracedata += 6) { + unsigned short lineno = *(unsigned short *)tracedata;
> + const char *file = *(const char **)(tracedata + 2);
> + unsigned int hash = hash_string(lineno, file, FILEHASH);
> + if (hash != value)
> + continue;
> + printk(" hash matches %s:%u\n", file, lineno);
> + match++;
> + }
> + return match;
> +}
> +
> +static int show_dev_hash(unsigned int value)
> +{
> + int match = 0;
> + struct list_head * entry = dpm_active.prev;
> +
> + while (entry != &dpm_active) {
> + struct device * dev = to_device(entry);
> + unsigned int hash = hash_string(DEVSEED, dev->bus_id, DEVHASH);
> + if (hash == value) {
> + printk(" hash matches device %s\n", dev->bus_id);
> + match++;
> + }
> + entry = entry->prev;
> + }
> + return match;
> +}
> +
> +static unsigned int hash_value_early_read;
> +
> +static int early_resume_init(void)
> +{
> + hash_value_early_read = read_magic_time();
> + return 0;
> +}
> +
> +static int late_resume_init(void)
> +{
> + unsigned int val = hash_value_early_read;
> + unsigned int user, file, dev;
> +
> + user = val % USERHASH;
> + val = val / USERHASH;
> + file = val % FILEHASH;
> + val = val / FILEHASH;
> + dev = val /* % DEVHASH */;
> +
> + printk(" Magic number: %d:%d:%d\n", user, file, dev);
> + show_file_hash(file);
> + show_dev_hash(dev);
> + return 0;
> +}
> +
> +core_initcall(early_resume_init);
> +late_initcall(late_resume_init);
> diff --git a/include/asm-generic/rtc.h b/include/asm-generic/rtc.h
> index cef08db..4087037 100644
> --- a/include/asm-generic/rtc.h
> +++ b/include/asm-generic/rtc.h
> @@ -114,6 +114,7 @@ #endif
> /* Set the current date and time in the real time clock. */
> static inline int set_rtc_time(struct rtc_time *time)
> {
> + unsigned long flags;
> unsigned char mon, day, hrs, min, sec;
> unsigned char save_control, save_freq_select;
> unsigned int yrs;
> @@ -131,7 +132,7 @@ #endif
> if (yrs > 255) /* They are unsigned */
> return -EINVAL;
>
> - spin_lock_irq(&rtc_lock);
> + spin_lock_irqsave(&rtc_lock, flags);
> #ifdef CONFIG_MACH_DECSTATION
> real_yrs = yrs;
> leap_yr = ((!((yrs + 1900) % 4) && ((yrs + 1900) % 100)) ||
> @@ -152,7 +153,7 @@ #endif
> * whether the chip is in binary mode or not.
> */
> if (yrs > 169) {
> - spin_unlock_irq(&rtc_lock);
> + spin_unlock_irqrestore(&rtc_lock, flags);
> return -EINVAL;
> }
>
> @@ -187,7 +188,7 @@ #endif
> CMOS_WRITE(save_control, RTC_CONTROL);
> CMOS_WRITE(save_freq_select, RTC_FREQ_SELECT);
>
> - spin_unlock_irq(&rtc_lock);
> + spin_unlock_irqrestore(&rtc_lock, flags);
>
> return 0;
> }
> diff --git a/include/linux/resume-trace.h b/include/linux/resume-trace.h
> new file mode 100644
> index 0000000..e2e1e14
> --- /dev/null
> +++ b/include/linux/resume-trace.h
> @@ -0,0 +1,21 @@
> +#ifndef RESUME_TRACE_H
> +#define RESUME_TRACE_H
> +
> +struct device;
> +extern void set_trace_device(struct device *);
> +extern void generate_resume_trace(void *tracedata, unsigned int user);
> +
> +#define TRACE_DEVICE(dev) set_trace_device(dev)
> +#define TRACE_RESUME(user) do { \
> + void *tracedata; \
> + asm volatile("movl $1f,%0\n" \
> + ".section .tracedata,\"a\"\n" \
> + "1:\t.word %c1\n" \
> + "\t.long %c2\n" \
> + ".previous" \
> + :"=r" (tracedata) \
> + : "i" (__LINE__), "i" (__FILE__)); \
> + generate_resume_trace(tracedata, user); \
> +} while (0)
> +
> +#endif
> _______________________________________________
> linux-pm mailing list
> linux-pm@lists.osdl.org
> https://lists.osdl.org/mailman/listinfo/linux-pm
--
Nigel, Michelle and Alisdair Cunningham
5 Mitchell Street
Cobden 3266
Victoria, Australia
[-- Attachment #1.2: Type: application/pgp-signature, Size: 189 bytes --]
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] Add some basic resume trace facilities
@ 2006-06-13 22:25 Gross, Mark
2006-06-13 22:59 ` Linus Torvalds
0 siblings, 1 reply; 10+ messages in thread
From: Gross, Mark @ 2006-06-13 22:25 UTC (permalink / raw)
To: Nigel Cunningham, linux-pm; +Cc: Linus Torvalds
>-----Original Message-----
>From: linux-pm-bounces@lists.osdl.org
[mailto:linux-pm-bounces@lists.osdl.org] On Behalf Of Nigel
>Cunningham
>Sent: Tuesday, June 13, 2006 3:10 PM
>To: linux-pm@lists.osdl.org
>Cc: Linus Torvalds
>Subject: Re: [linux-pm] [PATCH 1/2] Add some basic resume trace
facilities
>
>Hi.
>
>On Wednesday 14 June 2006 07:35, Linus Torvalds wrote:
>> Considering that there isn't a lot of hw we can depend on during
>> resume, this is about as good as it gets.
>>
>> Use "#include <linux/resume-trace.h>", and then sprinkle
TRACE_RESUME(0)
>> commands liberally over the driver that you're trying to figure out
why
>> and where it hangs. Expect to waste a _lot_ of time, but at least
this
>> gives you _some_ chance to actually debug it, instead of just staring
at a
>> dead machine.
>>
>> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
>
>s/On laptop per child/One bdi2000 per computer/? I'll give it a try.
That thing has a lot of flash one could just use to dump the system log
too.
If you can spare a few blocks it's not too hard to write a low level
synchronous flash write code that works at panic time. I did this to a
2.4.10 kernel a while back, and it was *very* useful for debugging
problems.
--mgross
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] Add some basic resume trace facilities
2006-06-13 22:10 ` Nigel Cunningham
@ 2006-06-13 22:50 ` Linus Torvalds
0 siblings, 0 replies; 10+ messages in thread
From: Linus Torvalds @ 2006-06-13 22:50 UTC (permalink / raw)
To: Nigel Cunningham; +Cc: linux-pm
On Wed, 14 Jun 2006, Nigel Cunningham wrote:
> >
> > Use "#include <linux/resume-trace.h>", and then sprinkle TRACE_RESUME(0)
> > commands liberally over the driver that you're trying to figure out why
> > and where it hangs. Expect to waste a _lot_ of time, but at least this
> > gives you _some_ chance to actually debug it, instead of just staring at a
> > dead machine.
> >
> > Signed-off-by: Linus Torvalds <torvalds@osdl.org>
>
> s/On laptop per child/One bdi2000 per computer/? I'll give it a try.
Yeah, well it ain't no JTAG scanner, exactly ;)
The minimal patch to actually _use_ this would be something like the
appended. It's worth noting that the TRACE_DEVICE() macro does _not_
actually generate a trace event in itself, it just prepares the device
hash so that the TRACE_RESUME() code then save the device, filename and
linenumber information in the "trace buffer".
When you reboot, if everything went well, you'll see something like
Magic number: 1:660:259
hash matches drivers/usb/host/ehci-pci.c:258
hash matches device 0000:00:1d.7
in the bootup dmesg logs. The "magic number" is just the hashes, where the
first number is between 0-15 and can be a dynamic value, ie if you are
inside a loop you can do
TRACE_RESUME(loopcounter);
and it will save off the low four bits of the loopcounter in the RTC and
it will show it as the first "magic number". Otherwise you'll just have to
live with totally static information (filename and line number of the last
trace event that triggered).
(The above trace event was obviously not generated with this minimal
patch: it's from a much bigger "sprinkle TRACE_RESUME() stuff all over"
thing of mine, from a real debug session).
And the real problem, of course, is that the trace buffer is just a single
entry deep. It was "interesting" to just fit even _that_, much less a real
trace buffer into the RTC.
Of course, with helper hardware we could do much better, but the whole
point of this was literally to _not_ need any special debug hardware. This
should work on anything.
Linus
---
diff --git a/drivers/base/power/resume.c b/drivers/base/power/resume.c
index 317edbf..bf6ee38 100644
--- a/drivers/base/power/resume.c
+++ b/drivers/base/power/resume.c
@@ -9,6 +9,7 @@
*/
#include <linux/device.h>
+#include <linux/resume-trace.h>
#include "../base.h"
#include "power.h"
@@ -23,6 +24,8 @@ int resume_device(struct device * dev)
{
int error = 0;
+ TRACE_DEVICE(dev);
+ TRACE_RESUME(0);
down(&dev->sem);
if (dev->power.pm_parent
&& dev->power.pm_parent->power.power_state.event) {
@@ -36,6 +39,7 @@ int resume_device(struct device * dev)
error = dev->bus->resume(dev);
}
up(&dev->sem);
+ TRACE_RESUME(1);
return error;
}
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] Add some basic resume trace facilities
2006-06-13 22:25 [PATCH 1/2] Add some basic resume trace facilities Gross, Mark
@ 2006-06-13 22:59 ` Linus Torvalds
2006-06-13 23:04 ` Dave Jones
2006-06-16 1:49 ` Benjamin Herrenschmidt
0 siblings, 2 replies; 10+ messages in thread
From: Linus Torvalds @ 2006-06-13 22:59 UTC (permalink / raw)
To: Gross, Mark; +Cc: linux-pm, Nigel Cunningham
On Tue, 13 Jun 2006, Gross, Mark wrote:
>
> If you can spare a few blocks it's not too hard to write a low level
> synchronous flash write code that works at panic time. I did this to a
> 2.4.10 kernel a while back, and it was *very* useful for debugging
> problems.
Absolutely.
The problem is, there's no safe and documented hardware to do that in
general.
On a standard PC, you have the RTC, and that's it ;(
It used to be that you could toggle serial lines by hand, but that's not
the case any more these days (they're not at a fixed address any more and
generally needs tons of very chip-specific setup to even be visible to the
CPU, but more importantly, most modern hardware doesn't even have the
_connector_ any more).
The Apple Mac Mini I can't even get to _beep_, which is really annoying.
It's a wonderful debug sequence ("oh, I head 15 beeps, it got to point
X"). Or rather, it's "wonderful" compared to something that gives you just
one single piece of data after the reboot ;)
Now, the LPC obviously does have the BIOS flash chip connected to it, but
I suspect very few people would be happy with code that overwrites even
just parts of that. I also suspect it varies a lot from machine to
machine. I wasn't going to even try.
The RTC chip actually has enough memory in it that I could have saved a
lot more information, but the firmware (at least on the Mac Mini) will
clear it on a full-post boot (and the whole point of this is that we
_will_ do a full POST when the resume fails, of course).
It even clears the second 128-byte RTC memory bank, from my testing ;(
The real-time clock was literally the only thing I could find that didn't
get cleared, and that obviously has some serious limitations size-wise.
Linus
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] Add some basic resume trace facilities
2006-06-13 22:59 ` Linus Torvalds
@ 2006-06-13 23:04 ` Dave Jones
2006-06-13 23:13 ` Linus Torvalds
2006-06-16 1:49 ` Benjamin Herrenschmidt
1 sibling, 1 reply; 10+ messages in thread
From: Dave Jones @ 2006-06-13 23:04 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-pm, Nigel Cunningham
On Tue, Jun 13, 2006 at 03:59:57PM -0700, Linus Torvalds wrote:
> The real-time clock was literally the only thing I could find that didn't
> get cleared, and that obviously has some serious limitations size-wise.
Hmm, do they even clear all of video ram on a reset ?
Dave
--
http://www.codemonkey.org.uk
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] Add some basic resume trace facilities
2006-06-13 23:04 ` Dave Jones
@ 2006-06-13 23:13 ` Linus Torvalds
0 siblings, 0 replies; 10+ messages in thread
From: Linus Torvalds @ 2006-06-13 23:13 UTC (permalink / raw)
To: Dave Jones; +Cc: linux-pm, Nigel Cunningham
On Tue, 13 Jun 2006, Dave Jones wrote:
>
> On Tue, Jun 13, 2006 at 03:59:57PM -0700, Linus Torvalds wrote:
>
> > The real-time clock was literally the only thing I could find that didn't
> > get cleared, and that obviously has some serious limitations size-wise.
>
> Hmm, do they even clear all of video ram on a reset ?
As far as I can tell, yes. Definitely the parts I looked at.
(This is an integrated video device, so "video ram" is just a part of
system ram, set aside by the BIOS for the graphics).
Linus
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] Add some basic resume trace facilities
2006-06-13 21:35 ` [PATCH 1/2] Add some basic resume trace facilities Linus Torvalds
2006-06-13 22:10 ` Nigel Cunningham
@ 2006-06-14 10:25 ` Pavel Machek
1 sibling, 0 replies; 10+ messages in thread
From: Pavel Machek @ 2006-06-14 10:25 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Power management list
Hi!
> Not a lot of space in the RTC, but it's the only piece of hardware that is
> (a) reachable at all times, regardless of any other setup and (b) doesn't
> lose it state or have firmware reset the memory of at boot.
>
> Side note: you really don't want to do this unless you have an external
> time-source like NTP that resets the clock to the right value after the
> boot is done ;)
Clever hack, I'd say. I used hardware debugger last time I was trying
to debug this, but I guess that's just not an option in mac mini case...
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] Add some basic resume trace facilities
2006-06-13 22:59 ` Linus Torvalds
2006-06-13 23:04 ` Dave Jones
@ 2006-06-16 1:49 ` Benjamin Herrenschmidt
2006-06-16 3:08 ` Linus Torvalds
1 sibling, 1 reply; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2006-06-16 1:49 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-pm, Nigel Cunningham
On Tue, 2006-06-13 at 15:59 -0700, Linus Torvalds wrote:
> The Apple Mac Mini I can't even get to _beep_, which is really annoying.
> It's a wonderful debug sequence ("oh, I head 15 beeps, it got to point
> X"). Or rather, it's "wonderful" compared to something that gives you just
> one single piece of data after the reboot ;)
No magic GPIO or ACPU command you can use to tweak the front led ?
> The real-time clock was literally the only thing I could find that didn't
> get cleared, and that obviously has some serious limitations size-wise.
What about vram ? the entire of it gets cleared or only the displayed
part ?
Ben.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] Add some basic resume trace facilities
2006-06-16 1:49 ` Benjamin Herrenschmidt
@ 2006-06-16 3:08 ` Linus Torvalds
0 siblings, 0 replies; 10+ messages in thread
From: Linus Torvalds @ 2006-06-16 3:08 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linux-pm, Nigel Cunningham
On Fri, 16 Jun 2006, Benjamin Herrenschmidt wrote:
>
> On Tue, 2006-06-13 at 15:59 -0700, Linus Torvalds wrote:
> > The Apple Mac Mini I can't even get to _beep_, which is really annoying.
> > It's a wonderful debug sequence ("oh, I head 15 beeps, it got to point
> > X"). Or rather, it's "wonderful" compared to something that gives you just
> > one single piece of data after the reboot ;)
>
> No magic GPIO or ACPU command you can use to tweak the front led ?
There's a magic bit that can make it flash. I considered doing that, but
decided that I'm better off with the "almost 24 bits of data" and a
reboot, than trying to desperately look at how the led flashes.
That's especially true since resume on that machine is sometimes delayed
by a SATA problem - so it actually _does_ come back, but it has a
30-second timeout because the SATA controller didn't resume. It seems to
be very timing-dependent: fixing the irq#9 issue on that box actually
exposed it, because it no longer gets 10,000 spurious interrupts to slow
it down ;)
An audible beep is actually _much_ better than a visual LED flash, because
you hear it even if you're not looking at the box intently for half a
minute.
Trust me, if I had to watch the LED all the time, I'd just go even crazier
than I am already ;^p
> > The real-time clock was literally the only thing I could find that didn't
> > get cleared, and that obviously has some serious limitations size-wise.
>
> What about vram ? the entire of it gets cleared or only the displayed
> part ?
There's no VRAM. It's all system RAM. This is integrated video.
So I bet that it's all cleared (before it's even marked for the integrated
graphics chip - which is done by the firmware by writing a magic
register for "top or RAM").
And perhaps more importantly, accessing that hidden memory is actually
pretty hard. Since it's past the "top of RAM" register, you can't do it
from the CPU directly, you have to do it through the AGP bridge.
So I didn't verify it all. I did verify that random memory locations were
cleared (actually, not cleared - it looks like the firmware wrote a
test-pattern to it), and I also verified that at least part of the "ACPI
NVS" are was also cleared (it's called "Non Volatile Storage", which is
why I tried it, but in ACPI terms that seems to mean that it's nonvolatile
as far as the ACPI stuff is concerned: the OS isn't supposed to change it,
not non-volatile in the OS kind of sense).
The RTC has the added advantage that it literally exists in every single
PC out there, ie it's a piece of hardware that has absolutely _zero_
firmware or hardware dependencies. That particular code will most likely
run on everything.
The thing is, I don't actually enjoy debugging my own machines. I _much_
prefer having other people debug _their_ machines, and fixing my machine
in the process. So I didn't want just something that worked on the Mac
Mini, I wanted something that works _universally_, so that hopefully
people who are even crazier than me will waste _their_ time trying to get
these machines working.
I say that with a smile, but I'm serious. There's simply no _point_ for me
to write code that just fixes one machine. If I don't believe it can help
fix a bigger problem, it's not worth doing.
Sometimes fixing just one machine tends to mean that a lot of other
machines also suddenly start working. That has historically clearly not
been the case wrt suspend-to-RAM, which is why I wanted something
different.
Linus
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2006-06-16 3:08 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-06-13 22:25 [PATCH 1/2] Add some basic resume trace facilities Gross, Mark
2006-06-13 22:59 ` Linus Torvalds
2006-06-13 23:04 ` Dave Jones
2006-06-13 23:13 ` Linus Torvalds
2006-06-16 1:49 ` Benjamin Herrenschmidt
2006-06-16 3:08 ` Linus Torvalds
-- strict thread matches above, loose matches on Subject: below --
2006-06-13 21:30 [PATCH 0/2] suspend-to-ram debugging patches Linus Torvalds
2006-06-13 21:35 ` [PATCH 1/2] Add some basic resume trace facilities Linus Torvalds
2006-06-13 22:10 ` Nigel Cunningham
2006-06-13 22:50 ` Linus Torvalds
2006-06-14 10:25 ` Pavel Machek
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox