public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4] PM: add statistics debugfs file for suspend to ram
@ 2011-08-10  7:28 Liu, ShuoX
  2011-08-10 18:00 ` [linux-pm] " Mansoor, Illyas
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Liu, ShuoX @ 2011-08-10  7:28 UTC (permalink / raw)
  To: linux-pm@lists.linux-foundation.org
  Cc: linux-kernel@vger.kernel.org, Rafael J. Wysocki, Greg KH,
	Brown, Len, pavel@ucw.cz, Yanmin_zhang@linux.intel.com

From: ShuoX Liu <shuox.liu@intel.com>

Record S3 failure time about each reason and the latest two failed
devices' names in S3 progress.
We can check it through 'suspend_stats' entry in debugfs.

The motivation of the patch:

We are enabling power features on Medfield. Comparing with PC/notebook,
a mobile enters/exits suspend-2-ram (we call it s3 on Medfield) far
more frequently. If it can't enter suspend-2-ram in time, the power
might be used up soon.

We often find sometimes, a device suspend fails. Then, system retries
s3 over and over again. As display is off, testers and developers
don't know what happens.

Some testers and developers complain they don't know if system
tries suspend-2-ram, and what device fails to suspend. They need
such info for a quick check. The patch adds suspend_stats under
debugfs for users to check suspend to RAM statistics quickly.

If not using this patch, we have other methods to get info about
what device fails. One is to turn on  CONFIG_PM_DEBUG, but users
would get too much info and testers need recompile the system.

In addition, dynamic debug is another good tool to dump debug info.
But it still doesn't match our utilization scenario closely.
1) user need write a user space parser to process the syslog output;
2) Our testing scenario is we leave the mobile for at least hours.
   Then, check its status. No serial console available during the
   testing. One is because console would be suspended, and the other
   is serial console connecting with spi or HSU devices would consume
   power. These devices are powered off at suspend-2-ram.

Thank Greg, MyungJoo and Rafael for their great comments.

Change Log:
  V4: Improve output format.
  V3: Change some programming styles.
  V2:
	1) Change the sysfs entry to debugfs.
	2) Add some explanation in document file.

Contribution:
	yanmin_zhang@linux.intel.com

Signed-off-by: ShuoX Liu <shuox.liu@intel.com>
---
 Documentation/power/basic-pm-debugging.txt |   24 +++++++
 drivers/base/power/main.c                  |   31 +++++++--
 include/linux/suspend.h                    |   52 ++++++++++++++
 kernel/power/main.c                        |  102 ++++++++++++++++++++++++++++
 kernel/power/suspend.c                     |   17 ++++-
 5 files changed, 218 insertions(+), 8 deletions(-)

diff --git a/Documentation/power/basic-pm-debugging.txt b/Documentation/power/basic-pm-debugging.txt
index ddd7817..62eca08 100644
--- a/Documentation/power/basic-pm-debugging.txt
+++ b/Documentation/power/basic-pm-debugging.txt
@@ -201,3 +201,27 @@ case, you may be able to search for failing drivers by following the procedure
 analogous to the one described in section 1.  If you find some failing drivers,
 you will have to unload them every time before an STR transition (ie. before
 you run s2ram), and please report the problems with them.
+
+There is a debugfs entry which shows the suspend to RAM statistics. Here is an
+example of its output.
+	# mount -t debugfs none /sys/kernel/debug
+	# cat /sys/kernel/debug/suspend_stats
+	success: 20
+	fail: 5
+	failed_freeze: 0
+	failed_prepare: 0
+	failed_suspend: 5
+	failed_suspend_noirq: 0
+	failed_resume: 0
+	failed_resume_noirq: 0
+	failures:
+	  last_failed_dev:	alarm
+				adc
+	  last_failed_errno:	-16
+				-16
+	  last_failed_step:	suspend
+				suspend
+Field success means the success number of suspend to RAM, and field fail means
+the failure number. Others are the failure number of different steps of suspend
+to RAM. suspend_stats just lists the last 2 failed devices, error number and
+failed step of suspend.
diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index a854591..23b755d 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -46,6 +46,7 @@ LIST_HEAD(dpm_prepared_list);
 LIST_HEAD(dpm_suspended_list);
 LIST_HEAD(dpm_noirq_list);
 
+struct suspend_stats suspend_stats;
 static DEFINE_MUTEX(dpm_list_mtx);
 static pm_message_t pm_transition;
 
@@ -464,8 +465,12 @@ void dpm_resume_noirq(pm_message_t state)
 		mutex_unlock(&dpm_list_mtx);
 
 		error = device_resume_noirq(dev, state);
-		if (error)
+		if (error) {
+			suspend_stats.failed_resume_noirq++;
+			dpm_save_failed_step(SUSPEND_RESUME_NOIRQ);
+			dpm_save_failed_dev(dev_name(dev));
 			pm_dev_err(dev, state, " early", error);
+		}
 
 		mutex_lock(&dpm_list_mtx);
 		put_device(dev);
@@ -626,8 +631,12 @@ void dpm_resume(pm_message_t state)
 			mutex_unlock(&dpm_list_mtx);
 
 			error = device_resume(dev, state, false);
-			if (error)
+			if (error) {
+				suspend_stats.failed_resume++;
+				dpm_save_failed_step(SUSPEND_RESUME);
+				dpm_save_failed_dev(dev_name(dev));
 				pm_dev_err(dev, state, "", error);
+			}
 
 			mutex_lock(&dpm_list_mtx);
 		}
@@ -802,6 +811,9 @@ int dpm_suspend_noirq(pm_message_t state)
 		mutex_lock(&dpm_list_mtx);
 		if (error) {
 			pm_dev_err(dev, state, " late", error);
+			suspend_stats.failed_suspend_noirq++;
+			dpm_save_failed_step(SUSPEND_SUSPEND_NOIRQ);
+			dpm_save_failed_dev(dev_name(dev));
 			put_device(dev);
 			break;
 		}
@@ -923,8 +935,10 @@ static void async_suspend(void *data, async_cookie_t cookie)
 	int error;
 
 	error = __device_suspend(dev, pm_transition, true);
-	if (error)
+	if (error) {
+		dpm_save_failed_dev(dev_name(dev));
 		pm_dev_err(dev, pm_transition, " async", error);
+	}
 
 	put_device(dev);
 }
@@ -967,6 +981,7 @@ int dpm_suspend(pm_message_t state)
 		mutex_lock(&dpm_list_mtx);
 		if (error) {
 			pm_dev_err(dev, state, "", error);
+			dpm_save_failed_dev(dev_name(dev));
 			put_device(dev);
 			break;
 		}
@@ -980,7 +995,10 @@ int dpm_suspend(pm_message_t state)
 	async_synchronize_full();
 	if (!error)
 		error = async_error;
-	if (!error)
+	if (error) {
+		suspend_stats.failed_suspend++;
+		dpm_save_failed_step(SUSPEND_SUSPEND);
+	} else
 		dpm_show_time(starttime, state, NULL);
 	return error;
 }
@@ -1088,7 +1106,10 @@ int dpm_suspend_start(pm_message_t state)
 	int error;
 
 	error = dpm_prepare(state);
-	if (!error)
+	if (error) {
+		suspend_stats.failed_prepare++;
+		dpm_save_failed_step(SUSPEND_PREPARE);
+	} else
 		error = dpm_suspend(state);
 	return error;
 }
diff --git a/include/linux/suspend.h b/include/linux/suspend.h
index 6bbcef2..76f42e4 100644
--- a/include/linux/suspend.h
+++ b/include/linux/suspend.h
@@ -34,6 +34,58 @@ typedef int __bitwise suspend_state_t;
 #define PM_SUSPEND_MEM		((__force suspend_state_t) 3)
 #define PM_SUSPEND_MAX		((__force suspend_state_t) 4)
 
+enum suspend_stat_step {
+	SUSPEND_FREEZE = 1,
+	SUSPEND_PREPARE,
+	SUSPEND_SUSPEND,
+	SUSPEND_SUSPEND_NOIRQ,
+	SUSPEND_RESUME_NOIRQ,
+	SUSPEND_RESUME
+};
+
+struct suspend_stats {
+	int	success;
+	int	fail;
+	int	failed_freeze;
+	int	failed_prepare;
+	int	failed_suspend;
+	int	failed_suspend_noirq;
+	int	failed_resume;
+	int	failed_resume_noirq;
+#define	REC_FAILED_NUM	2
+	int	last_failed_dev;
+	char	failed_devs[REC_FAILED_NUM][40];
+	int	last_failed_errno;
+	int	errno[REC_FAILED_NUM];
+	int	last_failed_step;
+	enum suspend_stat_step	failed_steps[REC_FAILED_NUM];
+};
+
+extern struct suspend_stats suspend_stats;
+
+static inline void dpm_save_failed_dev(const char *name)
+{
+	strlcpy(suspend_stats.failed_devs[suspend_stats.last_failed_dev],
+		name,
+		sizeof(suspend_stats.failed_devs[0]));
+	suspend_stats.last_failed_dev++;
+	suspend_stats.last_failed_dev %= REC_FAILED_NUM;
+}
+
+static inline void dpm_save_failed_errno(int err)
+{
+	suspend_stats.errno[suspend_stats.last_failed_errno] = err;
+	suspend_stats.last_failed_errno++;
+	suspend_stats.last_failed_errno %= REC_FAILED_NUM;
+}
+
+static inline void dpm_save_failed_step(enum suspend_stat_step step)
+{
+	suspend_stats.failed_steps[suspend_stats.last_failed_step] = step;
+	suspend_stats.last_failed_step++;
+	suspend_stats.last_failed_step %= REC_FAILED_NUM;
+}
+
 /**
  * struct platform_suspend_ops - Callbacks for managing platform dependent
  *	system sleep states.
diff --git a/kernel/power/main.c b/kernel/power/main.c
index 6c601f8..2757acb 100644
--- a/kernel/power/main.c
+++ b/kernel/power/main.c
@@ -12,6 +12,8 @@
 #include <linux/string.h>
 #include <linux/resume-trace.h>
 #include <linux/workqueue.h>
+#include <linux/debugfs.h>
+#include <linux/seq_file.h>
 
 #include "power.h"
 
@@ -133,6 +135,101 @@ power_attr(pm_test);
 
 #endif /* CONFIG_PM_SLEEP */
 
+#ifdef CONFIG_DEBUG_FS
+static char *suspend_step_name(enum suspend_stat_step step)
+{
+	switch (step) {
+	case SUSPEND_FREEZE:
+		return "freeze";
+	case SUSPEND_PREPARE:
+		return "prepare";
+	case SUSPEND_SUSPEND:
+		return "suspend";
+	case SUSPEND_SUSPEND_NOIRQ:
+		return "suspend_noirq";
+	case SUSPEND_RESUME_NOIRQ:
+		return "resume_noirq";
+	case SUSPEND_RESUME:
+		return "resume";
+	default:
+		return "";
+	}
+}
+
+static int suspend_stats_show(struct seq_file *s, void *unused)
+{
+	int i, index, last_dev, last_errno, last_step;
+
+	last_dev = suspend_stats.last_failed_dev + REC_FAILED_NUM - 1;
+	last_dev %= REC_FAILED_NUM;
+	last_errno = suspend_stats.last_failed_errno + REC_FAILED_NUM - 1;
+	last_errno %= REC_FAILED_NUM;
+	last_step = suspend_stats.last_failed_step + REC_FAILED_NUM - 1;
+	last_step %= REC_FAILED_NUM;
+	seq_printf(s, "%s: %d\n%s: %d\n%s: %d\n%s: %d\n"
+			"%s: %d\n%s: %d\n%s: %d\n%s: %d\n",
+			"success", suspend_stats.success,
+			"fail", suspend_stats.fail,
+			"failed_freeze", suspend_stats.failed_freeze,
+			"failed_prepare", suspend_stats.failed_prepare,
+			"failed_suspend", suspend_stats.failed_suspend,
+			"failed_suspend_noirq",
+				suspend_stats.failed_suspend_noirq,
+			"failed_resume", suspend_stats.failed_resume,
+			"failed_resume_noirq",
+				suspend_stats.failed_resume_noirq);
+	seq_printf(s,	"failures:\n  last_failed_dev:\t%-s\n",
+			suspend_stats.failed_devs[last_dev]);
+	for (i = 1; i < REC_FAILED_NUM; i++) {
+		index = last_dev + REC_FAILED_NUM - i;
+		index %= REC_FAILED_NUM;
+		seq_printf(s, "\t\t\t%-s\n",
+			suspend_stats.failed_devs[index]);
+	}
+	seq_printf(s,	"  last_failed_errno:\t%-d\n",
+			suspend_stats.errno[last_errno]);
+	for (i = 1; i < REC_FAILED_NUM; i++) {
+		index = last_errno + REC_FAILED_NUM - i;
+		index %= REC_FAILED_NUM;
+		seq_printf(s, "\t\t\t%-d\n",
+			suspend_stats.errno[index]);
+	}
+	seq_printf(s,	"  last_failed_step:\t%-s\n",
+			suspend_step_name(
+				suspend_stats.failed_steps[last_step]));
+	for (i = 1; i < REC_FAILED_NUM; i++) {
+		index = last_step + REC_FAILED_NUM - i;
+		index %= REC_FAILED_NUM;
+		seq_printf(s, "\t\t\t%-s\n",
+			suspend_step_name(
+				suspend_stats.failed_steps[index]));
+	}
+
+	return 0;
+}
+
+static int suspend_stats_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, suspend_stats_show, NULL);
+}
+
+static const struct file_operations suspend_stats_operations = {
+	.open           = suspend_stats_open,
+	.read           = seq_read,
+	.llseek         = seq_lseek,
+	.release        = single_release,
+};
+
+static int __init pm_debugfs_init(void)
+{
+	debugfs_create_file("suspend_stats", S_IFREG | S_IRUGO,
+			NULL, NULL, &suspend_stats_operations);
+	return 0;
+}
+
+late_initcall(pm_debugfs_init);
+#endif /* CONFIG_DEBUG_FS */
+
 struct kobject *power_kobj;
 
 /**
@@ -194,6 +291,11 @@ static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
 	}
 	if (state < PM_SUSPEND_MAX && *s)
 		error = enter_state(state);
+		if (error) {
+			suspend_stats.fail++;
+			dpm_save_failed_errno(error);
+		} else
+			suspend_stats.success++;
 #endif
 
  Exit:
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index b6b71ad..595a3dd 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -104,7 +104,10 @@ static int suspend_prepare(void)
 		goto Finish;
 
 	error = suspend_freeze_processes();
-	if (!error)
+	if (error) {
+		suspend_stats.failed_freeze++;
+		dpm_save_failed_step(SUSPEND_FREEZE);
+	} else
 		return 0;
 
 	suspend_thaw_processes();
@@ -315,8 +318,16 @@ int enter_state(suspend_state_t state)
  */
 int pm_suspend(suspend_state_t state)
 {
-	if (state > PM_SUSPEND_ON && state <= PM_SUSPEND_MAX)
-		return enter_state(state);
+	int ret;
+	if (state > PM_SUSPEND_ON && state <= PM_SUSPEND_MAX) {
+		ret = enter_state(state);
+		if (ret) {
+			suspend_stats.fail++;
+			dpm_save_failed_errno(ret);
+		} else
+			suspend_stats.success++;
+		return ret;
+	}
 	return -EINVAL;
 }
 EXPORT_SYMBOL(pm_suspend);
-- 
1.7.1




^ permalink raw reply related	[flat|nested] 12+ messages in thread

* RE: [linux-pm] [PATCH v4] PM: add statistics debugfs file for suspend to ram
  2011-08-10  7:28 [PATCH v4] PM: add statistics debugfs file for suspend to ram Liu, ShuoX
@ 2011-08-10 18:00 ` Mansoor, Illyas
  2011-08-10 19:45   ` Greg KH
  2011-08-10 21:07 ` Rafael J. Wysocki
  2011-08-10 21:44 ` Pavel Machek
  2 siblings, 1 reply; 12+ messages in thread
From: Mansoor, Illyas @ 2011-08-10 18:00 UTC (permalink / raw)
  To: Liu, ShuoX, linux-pm@lists.linux-foundation.org
  Cc: Brown, Len, Yanmin_zhang@linux.intel.com, Greg KH,
	linux-kernel@vger.kernel.org

> +struct suspend_stats suspend_stats;
>  static DEFINE_MUTEX(dpm_list_mtx);
>  static pm_message_t pm_transition;
> 
> @@ -464,8 +465,12 @@ void dpm_resume_noirq(pm_message_t state)
>  		mutex_unlock(&dpm_list_mtx);
> 
>  		error = device_resume_noirq(dev, state);
> -		if (error)
> +		if (error) {
> +			suspend_stats.failed_resume_noirq++;
> +			dpm_save_failed_step(SUSPEND_RESUME_NOIRQ);
> +			dpm_save_failed_dev(dev_name(dev));

Please make these statistics conditionally enabled, so on a production system
If we need to disable these statistics code we should be able to do so.

May be CONFIG_SUSPEND_STATS and convert those lines of code into function calls
That can be stubbed out when the config is not enabled.

-Illyas

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [linux-pm] [PATCH v4] PM: add statistics debugfs file for suspend to ram
  2011-08-10 18:00 ` [linux-pm] " Mansoor, Illyas
@ 2011-08-10 19:45   ` Greg KH
  2011-08-10 19:52     ` Mansoor, Illyas
  0 siblings, 1 reply; 12+ messages in thread
From: Greg KH @ 2011-08-10 19:45 UTC (permalink / raw)
  To: Mansoor, Illyas
  Cc: Liu, ShuoX, linux-pm@lists.linux-foundation.org, Brown, Len,
	Yanmin_zhang@linux.intel.com, linux-kernel@vger.kernel.org

On Wed, Aug 10, 2011 at 11:30:40PM +0530, Mansoor, Illyas wrote:
> > +struct suspend_stats suspend_stats;
> >  static DEFINE_MUTEX(dpm_list_mtx);
> >  static pm_message_t pm_transition;
> > 
> > @@ -464,8 +465,12 @@ void dpm_resume_noirq(pm_message_t state)
> >  		mutex_unlock(&dpm_list_mtx);
> > 
> >  		error = device_resume_noirq(dev, state);
> > -		if (error)
> > +		if (error) {
> > +			suspend_stats.failed_resume_noirq++;
> > +			dpm_save_failed_step(SUSPEND_RESUME_NOIRQ);
> > +			dpm_save_failed_dev(dev_name(dev));
> 
> Please make these statistics conditionally enabled, so on a production system
> If we need to disable these statistics code we should be able to do so.

Why, are they taking time or space that is needed for something else?
What's the downside here of just not always having this enabled?

greg k-h

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [linux-pm] [PATCH v4] PM: add statistics debugfs file for suspend to ram
  2011-08-10 19:45   ` Greg KH
@ 2011-08-10 19:52     ` Mansoor, Illyas
  2011-08-10 19:58       ` Greg KH
  0 siblings, 1 reply; 12+ messages in thread
From: Mansoor, Illyas @ 2011-08-10 19:52 UTC (permalink / raw)
  To: Greg KH
  Cc: Liu, ShuoX, linux-pm@lists.linux-foundation.org, Brown, Len,
	Yanmin_zhang@linux.intel.com, linux-kernel@vger.kernel.org

static pm_message_t pm_transition;
> > >
> > > @@ -464,8 +465,12 @@ void dpm_resume_noirq(pm_message_t state)
> > >  		mutex_unlock(&dpm_list_mtx);
> > >
> > >  		error = device_resume_noirq(dev, state);
> > > -		if (error)
> > > +		if (error) {
> > > +			suspend_stats.failed_resume_noirq++;
> > > +			dpm_save_failed_step(SUSPEND_RESUME_NOIRQ);
> > > +			dpm_save_failed_dev(dev_name(dev));
> >
> > Please make these statistics conditionally enabled, so on a production system
> > If we need to disable these statistics code we should be able to do so.
> 
> Why, are they taking time or space that is needed for something else?
> What's the downside here of just not always having this enabled?

Why have something that is not required/Used?
Its only useful if DEBUGFS is configured anyways

-Illyas

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [linux-pm] [PATCH v4] PM: add statistics debugfs file for suspend to ram
  2011-08-10 19:52     ` Mansoor, Illyas
@ 2011-08-10 19:58       ` Greg KH
  2011-08-10 20:20         ` Mansoor, Illyas
  2011-08-12  7:04         ` Pavel Machek
  0 siblings, 2 replies; 12+ messages in thread
From: Greg KH @ 2011-08-10 19:58 UTC (permalink / raw)
  To: Mansoor, Illyas
  Cc: Liu, ShuoX, linux-pm@lists.linux-foundation.org, Brown, Len,
	Yanmin_zhang@linux.intel.com, linux-kernel@vger.kernel.org

On Thu, Aug 11, 2011 at 01:22:56AM +0530, Mansoor, Illyas wrote:
> static pm_message_t pm_transition;
> > > >
> > > > @@ -464,8 +465,12 @@ void dpm_resume_noirq(pm_message_t state)
> > > >  		mutex_unlock(&dpm_list_mtx);
> > > >
> > > >  		error = device_resume_noirq(dev, state);
> > > > -		if (error)
> > > > +		if (error) {
> > > > +			suspend_stats.failed_resume_noirq++;
> > > > +			dpm_save_failed_step(SUSPEND_RESUME_NOIRQ);
> > > > +			dpm_save_failed_dev(dev_name(dev));
> > >
> > > Please make these statistics conditionally enabled, so on a production system
> > > If we need to disable these statistics code we should be able to do so.
> > 
> > Why, are they taking time or space that is needed for something else?
> > What's the downside here of just not always having this enabled?
> 
> Why have something that is not required/Used?

Because someone might need it and rebuilding a kernel isn't possible on
lots of devices.

> Its only useful if DEBUGFS is configured anyways

Almost all systems these days have debugfs enabled, so that's a moot
point.

greg k-h

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [linux-pm] [PATCH v4] PM: add statistics debugfs file for suspend to ram
  2011-08-10 19:58       ` Greg KH
@ 2011-08-10 20:20         ` Mansoor, Illyas
  2011-08-10 20:25           ` Mansoor, Illyas
  2011-08-12  7:04         ` Pavel Machek
  1 sibling, 1 reply; 12+ messages in thread
From: Mansoor, Illyas @ 2011-08-10 20:20 UTC (permalink / raw)
  To: Greg KH
  Cc: Liu, ShuoX, linux-pm@lists.linux-foundation.org, Brown, Len,
	Yanmin_zhang@linux.intel.com, linux-kernel@vger.kernel.org

> On Thu, Aug 11, 2011 at 01:22:56AM +0530, Mansoor, Illyas wrote:
> > static pm_message_t pm_transition;
> > > > >
> > > > > @@ -464,8 +465,12 @@ void dpm_resume_noirq(pm_message_t state)
> > > > >  		mutex_unlock(&dpm_list_mtx);
> > > > >
> > > > >  		error = device_resume_noirq(dev, state);
> > > > > -		if (error)
> > > > > +		if (error) {
> > > > > +			suspend_stats.failed_resume_noirq++;
> > > > > +			dpm_save_failed_step(SUSPEND_RESUME_NOIRQ);
> > > > > +			dpm_save_failed_dev(dev_name(dev));
> > > >
> > > > Please make these statistics conditionally enabled, so on a production
> system
> > > > If we need to disable these statistics code we should be able to do so.
> > >
> > > Why, are they taking time or space that is needed for something else?
> > > What's the downside here of just not always having this enabled?
> >
> > Why have something that is not required/Used?
> 
> Because someone might need it and rebuilding a kernel isn't possible on
> lots of devices.

Agreed.

> 
> > Its only useful if DEBUGFS is configured anyways
> 
> Almost all systems these days have debugfs enabled, so that's a moot
> point.

We could do the same for this as well, since DEBUGFS is still a compile time
Option even thou many enable it by default we don't make it part of the kernel isn't it.

-Illyas

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [linux-pm] [PATCH v4] PM: add statistics debugfs file for suspend to ram
  2011-08-10 20:20         ` Mansoor, Illyas
@ 2011-08-10 20:25           ` Mansoor, Illyas
  0 siblings, 0 replies; 12+ messages in thread
From: Mansoor, Illyas @ 2011-08-10 20:25 UTC (permalink / raw)
  To: Mansoor, Illyas, Greg KH
  Cc: Liu, ShuoX, linux-pm@lists.linux-foundation.org,
	Yanmin_zhang@linux.intel.com, linux-kernel@vger.kernel.org,
	Brown, Len

> 
> > On Thu, Aug 11, 2011 at 01:22:56AM +0530, Mansoor, Illyas wrote:
> > > static pm_message_t pm_transition;
> > > > > >
> > > > > > @@ -464,8 +465,12 @@ void dpm_resume_noirq(pm_message_t state)
> > > > > >  		mutex_unlock(&dpm_list_mtx);
> > > > > >
> > > > > >  		error = device_resume_noirq(dev, state);
> > > > > > -		if (error)
> > > > > > +		if (error) {
> > > > > > +			suspend_stats.failed_resume_noirq++;
> > > > > > +			dpm_save_failed_step(SUSPEND_RESUME_NOIRQ);
> > > > > > +			dpm_save_failed_dev(dev_name(dev));
> > > > >
> > > > > Please make these statistics conditionally enabled, so on a production
> > system
> > > > > If we need to disable these statistics code we should be able to do so.
> > > >
> > > > Why, are they taking time or space that is needed for something else?
> > > > What's the downside here of just not always having this enabled?
> > >
> > > Why have something that is not required/Used?
> >
> > Because someone might need it and rebuilding a kernel isn't possible on
> > lots of devices.
> 
> Agreed.
> 
> >
> > > Its only useful if DEBUGFS is configured anyways
> >
> > Almost all systems these days have debugfs enabled, so that's a moot
> > point.
> 
> We could do the same for this as well, since DEBUGFS is still a compile time
> Option even thou many enable it by default we don't make it part of the kernel
> isn't it.
> 

May be it's not worth debating on this one, I'm okay if it's part of the kernel as well.

-Illyas

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4] PM: add statistics debugfs file for suspend to ram
  2011-08-10  7:28 [PATCH v4] PM: add statistics debugfs file for suspend to ram Liu, ShuoX
  2011-08-10 18:00 ` [linux-pm] " Mansoor, Illyas
@ 2011-08-10 21:07 ` Rafael J. Wysocki
  2011-08-10 21:44 ` Pavel Machek
  2 siblings, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2011-08-10 21:07 UTC (permalink / raw)
  To: Liu, ShuoX
  Cc: linux-pm@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
	Greg KH, Brown, Len, pavel@ucw.cz, Yanmin_zhang@linux.intel.com

On Wednesday, August 10, 2011, Liu, ShuoX wrote:
> From: ShuoX Liu <shuox.liu@intel.com>
> 
> Record S3 failure time about each reason and the latest two failed
> devices' names in S3 progress.
> We can check it through 'suspend_stats' entry in debugfs.
> 
> The motivation of the patch:
> 
> We are enabling power features on Medfield. Comparing with PC/notebook,
> a mobile enters/exits suspend-2-ram (we call it s3 on Medfield) far
> more frequently. If it can't enter suspend-2-ram in time, the power
> might be used up soon.
> 
> We often find sometimes, a device suspend fails. Then, system retries
> s3 over and over again. As display is off, testers and developers
> don't know what happens.
> 
> Some testers and developers complain they don't know if system
> tries suspend-2-ram, and what device fails to suspend. They need
> such info for a quick check. The patch adds suspend_stats under
> debugfs for users to check suspend to RAM statistics quickly.
> 
> If not using this patch, we have other methods to get info about
> what device fails. One is to turn on  CONFIG_PM_DEBUG, but users
> would get too much info and testers need recompile the system.
> 
> In addition, dynamic debug is another good tool to dump debug info.
> But it still doesn't match our utilization scenario closely.
> 1) user need write a user space parser to process the syslog output;
> 2) Our testing scenario is we leave the mobile for at least hours.
>    Then, check its status. No serial console available during the
>    testing. One is because console would be suspended, and the other
>    is serial console connecting with spi or HSU devices would consume
>    power. These devices are powered off at suspend-2-ram.
> 
> Thank Greg, MyungJoo and Rafael for their great comments.
> 
> Change Log:
>   V4: Improve output format.
>   V3: Change some programming styles.
>   V2:
> 	1) Change the sysfs entry to debugfs.
> 	2) Add some explanation in document file.
> 
> Contribution:
> 	yanmin_zhang@linux.intel.com
> 
> Signed-off-by: ShuoX Liu <shuox.liu@intel.com>

Applied to linux-pm/linux-next as 3.2 material.

Thanks,
Rafael


> ---
>  Documentation/power/basic-pm-debugging.txt |   24 +++++++
>  drivers/base/power/main.c                  |   31 +++++++--
>  include/linux/suspend.h                    |   52 ++++++++++++++
>  kernel/power/main.c                        |  102 ++++++++++++++++++++++++++++
>  kernel/power/suspend.c                     |   17 ++++-
>  5 files changed, 218 insertions(+), 8 deletions(-)
> 
> diff --git a/Documentation/power/basic-pm-debugging.txt b/Documentation/power/basic-pm-debugging.txt
> index ddd7817..62eca08 100644
> --- a/Documentation/power/basic-pm-debugging.txt
> +++ b/Documentation/power/basic-pm-debugging.txt
> @@ -201,3 +201,27 @@ case, you may be able to search for failing drivers by following the procedure
>  analogous to the one described in section 1.  If you find some failing drivers,
>  you will have to unload them every time before an STR transition (ie. before
>  you run s2ram), and please report the problems with them.
> +
> +There is a debugfs entry which shows the suspend to RAM statistics. Here is an
> +example of its output.
> +	# mount -t debugfs none /sys/kernel/debug
> +	# cat /sys/kernel/debug/suspend_stats
> +	success: 20
> +	fail: 5
> +	failed_freeze: 0
> +	failed_prepare: 0
> +	failed_suspend: 5
> +	failed_suspend_noirq: 0
> +	failed_resume: 0
> +	failed_resume_noirq: 0
> +	failures:
> +	  last_failed_dev:	alarm
> +				adc
> +	  last_failed_errno:	-16
> +				-16
> +	  last_failed_step:	suspend
> +				suspend
> +Field success means the success number of suspend to RAM, and field fail means
> +the failure number. Others are the failure number of different steps of suspend
> +to RAM. suspend_stats just lists the last 2 failed devices, error number and
> +failed step of suspend.
> diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
> index a854591..23b755d 100644
> --- a/drivers/base/power/main.c
> +++ b/drivers/base/power/main.c
> @@ -46,6 +46,7 @@ LIST_HEAD(dpm_prepared_list);
>  LIST_HEAD(dpm_suspended_list);
>  LIST_HEAD(dpm_noirq_list);
>  
> +struct suspend_stats suspend_stats;
>  static DEFINE_MUTEX(dpm_list_mtx);
>  static pm_message_t pm_transition;
>  
> @@ -464,8 +465,12 @@ void dpm_resume_noirq(pm_message_t state)
>  		mutex_unlock(&dpm_list_mtx);
>  
>  		error = device_resume_noirq(dev, state);
> -		if (error)
> +		if (error) {
> +			suspend_stats.failed_resume_noirq++;
> +			dpm_save_failed_step(SUSPEND_RESUME_NOIRQ);
> +			dpm_save_failed_dev(dev_name(dev));
>  			pm_dev_err(dev, state, " early", error);
> +		}
>  
>  		mutex_lock(&dpm_list_mtx);
>  		put_device(dev);
> @@ -626,8 +631,12 @@ void dpm_resume(pm_message_t state)
>  			mutex_unlock(&dpm_list_mtx);
>  
>  			error = device_resume(dev, state, false);
> -			if (error)
> +			if (error) {
> +				suspend_stats.failed_resume++;
> +				dpm_save_failed_step(SUSPEND_RESUME);
> +				dpm_save_failed_dev(dev_name(dev));
>  				pm_dev_err(dev, state, "", error);
> +			}
>  
>  			mutex_lock(&dpm_list_mtx);
>  		}
> @@ -802,6 +811,9 @@ int dpm_suspend_noirq(pm_message_t state)
>  		mutex_lock(&dpm_list_mtx);
>  		if (error) {
>  			pm_dev_err(dev, state, " late", error);
> +			suspend_stats.failed_suspend_noirq++;
> +			dpm_save_failed_step(SUSPEND_SUSPEND_NOIRQ);
> +			dpm_save_failed_dev(dev_name(dev));
>  			put_device(dev);
>  			break;
>  		}
> @@ -923,8 +935,10 @@ static void async_suspend(void *data, async_cookie_t cookie)
>  	int error;
>  
>  	error = __device_suspend(dev, pm_transition, true);
> -	if (error)
> +	if (error) {
> +		dpm_save_failed_dev(dev_name(dev));
>  		pm_dev_err(dev, pm_transition, " async", error);
> +	}
>  
>  	put_device(dev);
>  }
> @@ -967,6 +981,7 @@ int dpm_suspend(pm_message_t state)
>  		mutex_lock(&dpm_list_mtx);
>  		if (error) {
>  			pm_dev_err(dev, state, "", error);
> +			dpm_save_failed_dev(dev_name(dev));
>  			put_device(dev);
>  			break;
>  		}
> @@ -980,7 +995,10 @@ int dpm_suspend(pm_message_t state)
>  	async_synchronize_full();
>  	if (!error)
>  		error = async_error;
> -	if (!error)
> +	if (error) {
> +		suspend_stats.failed_suspend++;
> +		dpm_save_failed_step(SUSPEND_SUSPEND);
> +	} else
>  		dpm_show_time(starttime, state, NULL);
>  	return error;
>  }
> @@ -1088,7 +1106,10 @@ int dpm_suspend_start(pm_message_t state)
>  	int error;
>  
>  	error = dpm_prepare(state);
> -	if (!error)
> +	if (error) {
> +		suspend_stats.failed_prepare++;
> +		dpm_save_failed_step(SUSPEND_PREPARE);
> +	} else
>  		error = dpm_suspend(state);
>  	return error;
>  }
> diff --git a/include/linux/suspend.h b/include/linux/suspend.h
> index 6bbcef2..76f42e4 100644
> --- a/include/linux/suspend.h
> +++ b/include/linux/suspend.h
> @@ -34,6 +34,58 @@ typedef int __bitwise suspend_state_t;
>  #define PM_SUSPEND_MEM		((__force suspend_state_t) 3)
>  #define PM_SUSPEND_MAX		((__force suspend_state_t) 4)
>  
> +enum suspend_stat_step {
> +	SUSPEND_FREEZE = 1,
> +	SUSPEND_PREPARE,
> +	SUSPEND_SUSPEND,
> +	SUSPEND_SUSPEND_NOIRQ,
> +	SUSPEND_RESUME_NOIRQ,
> +	SUSPEND_RESUME
> +};
> +
> +struct suspend_stats {
> +	int	success;
> +	int	fail;
> +	int	failed_freeze;
> +	int	failed_prepare;
> +	int	failed_suspend;
> +	int	failed_suspend_noirq;
> +	int	failed_resume;
> +	int	failed_resume_noirq;
> +#define	REC_FAILED_NUM	2
> +	int	last_failed_dev;
> +	char	failed_devs[REC_FAILED_NUM][40];
> +	int	last_failed_errno;
> +	int	errno[REC_FAILED_NUM];
> +	int	last_failed_step;
> +	enum suspend_stat_step	failed_steps[REC_FAILED_NUM];
> +};
> +
> +extern struct suspend_stats suspend_stats;
> +
> +static inline void dpm_save_failed_dev(const char *name)
> +{
> +	strlcpy(suspend_stats.failed_devs[suspend_stats.last_failed_dev],
> +		name,
> +		sizeof(suspend_stats.failed_devs[0]));
> +	suspend_stats.last_failed_dev++;
> +	suspend_stats.last_failed_dev %= REC_FAILED_NUM;
> +}
> +
> +static inline void dpm_save_failed_errno(int err)
> +{
> +	suspend_stats.errno[suspend_stats.last_failed_errno] = err;
> +	suspend_stats.last_failed_errno++;
> +	suspend_stats.last_failed_errno %= REC_FAILED_NUM;
> +}
> +
> +static inline void dpm_save_failed_step(enum suspend_stat_step step)
> +{
> +	suspend_stats.failed_steps[suspend_stats.last_failed_step] = step;
> +	suspend_stats.last_failed_step++;
> +	suspend_stats.last_failed_step %= REC_FAILED_NUM;
> +}
> +
>  /**
>   * struct platform_suspend_ops - Callbacks for managing platform dependent
>   *	system sleep states.
> diff --git a/kernel/power/main.c b/kernel/power/main.c
> index 6c601f8..2757acb 100644
> --- a/kernel/power/main.c
> +++ b/kernel/power/main.c
> @@ -12,6 +12,8 @@
>  #include <linux/string.h>
>  #include <linux/resume-trace.h>
>  #include <linux/workqueue.h>
> +#include <linux/debugfs.h>
> +#include <linux/seq_file.h>
>  
>  #include "power.h"
>  
> @@ -133,6 +135,101 @@ power_attr(pm_test);
>  
>  #endif /* CONFIG_PM_SLEEP */
>  
> +#ifdef CONFIG_DEBUG_FS
> +static char *suspend_step_name(enum suspend_stat_step step)
> +{
> +	switch (step) {
> +	case SUSPEND_FREEZE:
> +		return "freeze";
> +	case SUSPEND_PREPARE:
> +		return "prepare";
> +	case SUSPEND_SUSPEND:
> +		return "suspend";
> +	case SUSPEND_SUSPEND_NOIRQ:
> +		return "suspend_noirq";
> +	case SUSPEND_RESUME_NOIRQ:
> +		return "resume_noirq";
> +	case SUSPEND_RESUME:
> +		return "resume";
> +	default:
> +		return "";
> +	}
> +}
> +
> +static int suspend_stats_show(struct seq_file *s, void *unused)
> +{
> +	int i, index, last_dev, last_errno, last_step;
> +
> +	last_dev = suspend_stats.last_failed_dev + REC_FAILED_NUM - 1;
> +	last_dev %= REC_FAILED_NUM;
> +	last_errno = suspend_stats.last_failed_errno + REC_FAILED_NUM - 1;
> +	last_errno %= REC_FAILED_NUM;
> +	last_step = suspend_stats.last_failed_step + REC_FAILED_NUM - 1;
> +	last_step %= REC_FAILED_NUM;
> +	seq_printf(s, "%s: %d\n%s: %d\n%s: %d\n%s: %d\n"
> +			"%s: %d\n%s: %d\n%s: %d\n%s: %d\n",
> +			"success", suspend_stats.success,
> +			"fail", suspend_stats.fail,
> +			"failed_freeze", suspend_stats.failed_freeze,
> +			"failed_prepare", suspend_stats.failed_prepare,
> +			"failed_suspend", suspend_stats.failed_suspend,
> +			"failed_suspend_noirq",
> +				suspend_stats.failed_suspend_noirq,
> +			"failed_resume", suspend_stats.failed_resume,
> +			"failed_resume_noirq",
> +				suspend_stats.failed_resume_noirq);
> +	seq_printf(s,	"failures:\n  last_failed_dev:\t%-s\n",
> +			suspend_stats.failed_devs[last_dev]);
> +	for (i = 1; i < REC_FAILED_NUM; i++) {
> +		index = last_dev + REC_FAILED_NUM - i;
> +		index %= REC_FAILED_NUM;
> +		seq_printf(s, "\t\t\t%-s\n",
> +			suspend_stats.failed_devs[index]);
> +	}
> +	seq_printf(s,	"  last_failed_errno:\t%-d\n",
> +			suspend_stats.errno[last_errno]);
> +	for (i = 1; i < REC_FAILED_NUM; i++) {
> +		index = last_errno + REC_FAILED_NUM - i;
> +		index %= REC_FAILED_NUM;
> +		seq_printf(s, "\t\t\t%-d\n",
> +			suspend_stats.errno[index]);
> +	}
> +	seq_printf(s,	"  last_failed_step:\t%-s\n",
> +			suspend_step_name(
> +				suspend_stats.failed_steps[last_step]));
> +	for (i = 1; i < REC_FAILED_NUM; i++) {
> +		index = last_step + REC_FAILED_NUM - i;
> +		index %= REC_FAILED_NUM;
> +		seq_printf(s, "\t\t\t%-s\n",
> +			suspend_step_name(
> +				suspend_stats.failed_steps[index]));
> +	}
> +
> +	return 0;
> +}
> +
> +static int suspend_stats_open(struct inode *inode, struct file *file)
> +{
> +	return single_open(file, suspend_stats_show, NULL);
> +}
> +
> +static const struct file_operations suspend_stats_operations = {
> +	.open           = suspend_stats_open,
> +	.read           = seq_read,
> +	.llseek         = seq_lseek,
> +	.release        = single_release,
> +};
> +
> +static int __init pm_debugfs_init(void)
> +{
> +	debugfs_create_file("suspend_stats", S_IFREG | S_IRUGO,
> +			NULL, NULL, &suspend_stats_operations);
> +	return 0;
> +}
> +
> +late_initcall(pm_debugfs_init);
> +#endif /* CONFIG_DEBUG_FS */
> +
>  struct kobject *power_kobj;
>  
>  /**
> @@ -194,6 +291,11 @@ static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
>  	}
>  	if (state < PM_SUSPEND_MAX && *s)
>  		error = enter_state(state);
> +		if (error) {
> +			suspend_stats.fail++;
> +			dpm_save_failed_errno(error);
> +		} else
> +			suspend_stats.success++;
>  #endif
>  
>   Exit:
> diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> index b6b71ad..595a3dd 100644
> --- a/kernel/power/suspend.c
> +++ b/kernel/power/suspend.c
> @@ -104,7 +104,10 @@ static int suspend_prepare(void)
>  		goto Finish;
>  
>  	error = suspend_freeze_processes();
> -	if (!error)
> +	if (error) {
> +		suspend_stats.failed_freeze++;
> +		dpm_save_failed_step(SUSPEND_FREEZE);
> +	} else
>  		return 0;
>  
>  	suspend_thaw_processes();
> @@ -315,8 +318,16 @@ int enter_state(suspend_state_t state)
>   */
>  int pm_suspend(suspend_state_t state)
>  {
> -	if (state > PM_SUSPEND_ON && state <= PM_SUSPEND_MAX)
> -		return enter_state(state);
> +	int ret;
> +	if (state > PM_SUSPEND_ON && state <= PM_SUSPEND_MAX) {
> +		ret = enter_state(state);
> +		if (ret) {
> +			suspend_stats.fail++;
> +			dpm_save_failed_errno(ret);
> +		} else
> +			suspend_stats.success++;
> +		return ret;
> +	}
>  	return -EINVAL;
>  }
>  EXPORT_SYMBOL(pm_suspend);
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4] PM: add statistics debugfs file for suspend to ram
  2011-08-10  7:28 [PATCH v4] PM: add statistics debugfs file for suspend to ram Liu, ShuoX
  2011-08-10 18:00 ` [linux-pm] " Mansoor, Illyas
  2011-08-10 21:07 ` Rafael J. Wysocki
@ 2011-08-10 21:44 ` Pavel Machek
  2011-08-11  1:55   ` Yanmin Zhang
  2 siblings, 1 reply; 12+ messages in thread
From: Pavel Machek @ 2011-08-10 21:44 UTC (permalink / raw)
  To: Liu, ShuoX
  Cc: linux-pm@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
	Rafael J. Wysocki, Greg KH, Brown, Len,
	Yanmin_zhang@linux.intel.com

Hi!

> If not using this patch, we have other methods to get info about
> what device fails. One is to turn on  CONFIG_PM_DEBUG, but users
> would get too much info and testers need recompile the system.
> 
> In addition, dynamic debug is another good tool to dump debug info.
> But it still doesn't match our utilization scenario closely.
> 1) user need write a user space parser to process the syslog output;
> 2) Our testing scenario is we leave the mobile for at least hours.
>    Then, check its status. No serial console available during the
>    testing. One is because console would be suspended, and the other
>    is serial console connecting with spi or HSU devices would consume
>    power. These devices are powered off at suspend-2-ram.

1) yes you need parser

1a) Yes, you need CONFIG_PM_DEBUG; but that's better than forcing 200
lines of pure debugging code onto everyone

2)  You can do that. Just check and parse dmesg from userland after
each resume.

If dmesg provides too little/too much information, improve loglevels
so that it is not spammed.
 
								Pavel


> Signed-off-by: ShuoX Liu <shuox.liu@intel.com>
> ---
>  Documentation/power/basic-pm-debugging.txt |   24 +++++++
>  drivers/base/power/main.c                  |   31 +++++++--
>  include/linux/suspend.h                    |   52 ++++++++++++++
>  kernel/power/main.c                        |  102 ++++++++++++++++++++++++++++
>  kernel/power/suspend.c                     |   17 ++++-
>  5 files changed, 218 insertions(+), 8 deletions(-)
> 
> diff --git a/Documentation/power/basic-pm-debugging.txt b/Documentation/power/basic-pm-debugging.txt
> index ddd7817..62eca08 100644
> --- a/Documentation/power/basic-pm-debugging.txt
> +++ b/Documentation/power/basic-pm-debugging.txt
> @@ -201,3 +201,27 @@ case, you may be able to search for failing drivers by following the procedure
>  analogous to the one described in section 1.  If you find some failing drivers,
>  you will have to unload them every time before an STR transition (ie. before
>  you run s2ram), and please report the problems with them.
> +
> +There is a debugfs entry which shows the suspend to RAM statistics. Here is an
> +example of its output.
> +	# mount -t debugfs none /sys/kernel/debug
> +	# cat /sys/kernel/debug/suspend_stats
> +	success: 20
> +	fail: 5
> +	failed_freeze: 0
> +	failed_prepare: 0
> +	failed_suspend: 5
> +	failed_suspend_noirq: 0
> +	failed_resume: 0
> +	failed_resume_noirq: 0
> +	failures:
> +	  last_failed_dev:	alarm
> +				adc
> +	  last_failed_errno:	-16
> +				-16
> +	  last_failed_step:	suspend
> +				suspend
> +Field success means the success number of suspend to RAM, and field fail means
> +the failure number. Others are the failure number of different steps of suspend
> +to RAM. suspend_stats just lists the last 2 failed devices, error number and
> +failed step of suspend.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4] PM: add statistics debugfs file for suspend to ram
  2011-08-10 21:44 ` Pavel Machek
@ 2011-08-11  1:55   ` Yanmin Zhang
  0 siblings, 0 replies; 12+ messages in thread
From: Yanmin Zhang @ 2011-08-11  1:55 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Liu, ShuoX, linux-pm@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, Rafael J. Wysocki, Greg KH,
	Brown, Len, MyungJoo Ham

On Wed, 2011-08-10 at 23:44 +0200, Pavel Machek wrote:
> Hi!
> 
> > If not using this patch, we have other methods to get info about
> > what device fails. One is to turn on  CONFIG_PM_DEBUG, but users
> > would get too much info and testers need recompile the system.
> > 
> > In addition, dynamic debug is another good tool to dump debug info.
> > But it still doesn't match our utilization scenario closely.
> > 1) user need write a user space parser to process the syslog output;
> > 2) Our testing scenario is we leave the mobile for at least hours.
> >    Then, check its status. No serial console available during the
> >    testing. One is because console would be suspended, and the other
> >    is serial console connecting with spi or HSU devices would consume
> >    power. These devices are powered off at suspend-2-ram.
> 
Pavel,

Thanks for your good comments. To be honest, we did consider the weakness of
the patch before sending out. There is a tradeoff we need balance.

> 1) yes you need parser
> 
> 1a) Yes, you need CONFIG_PM_DEBUG; but that's better than forcing 200
> lines of pure debugging code onto everyone
The new code lines are pretty simple and direct. We don't change any logic of PM.
CONFIG_PM_DEBUG is used mostly to print out debug info timely, not for statistics.

> 2)  You can do that. Just check and parse dmesg from userland after
> each resume.
Indeed, there are many methods to do so. But if we implement it in kernel directly,
more developers could benefit from it. And any new kernel patch which touches the logic
would also maintain the consistency between main PM logic and the statistics.

> 
> If dmesg provides too little/too much information, improve loglevels
> so that it is not spammed.
We got very good feedback from internal 1st-line developers and testers. They raised
the requirement and we implement it. 

Thanks,
Yanmin



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [linux-pm] [PATCH v4] PM: add statistics debugfs file for suspend to ram
  2011-08-10 19:58       ` Greg KH
  2011-08-10 20:20         ` Mansoor, Illyas
@ 2011-08-12  7:04         ` Pavel Machek
  2011-08-12  7:13           ` Rafael J. Wysocki
  1 sibling, 1 reply; 12+ messages in thread
From: Pavel Machek @ 2011-08-12  7:04 UTC (permalink / raw)
  To: Greg KH
  Cc: Mansoor, Illyas, Liu, ShuoX, linux-pm@lists.linux-foundation.org,
	Brown, Len, Yanmin_zhang@linux.intel.com,
	linux-kernel@vger.kernel.org

On Wed 2011-08-10 12:58:54, Greg KH wrote:
> On Thu, Aug 11, 2011 at 01:22:56AM +0530, Mansoor, Illyas wrote:
> > static pm_message_t pm_transition;
> > > > >
> > > > > @@ -464,8 +465,12 @@ void dpm_resume_noirq(pm_message_t state)
> > > > >  		mutex_unlock(&dpm_list_mtx);
> > > > >
> > > > >  		error = device_resume_noirq(dev, state);
> > > > > -		if (error)
> > > > > +		if (error) {
> > > > > +			suspend_stats.failed_resume_noirq++;
> > > > > +			dpm_save_failed_step(SUSPEND_RESUME_NOIRQ);
> > > > > +			dpm_save_failed_dev(dev_name(dev));
> > > >
> > > > Please make these statistics conditionally enabled, so on a production system
> > > > If we need to disable these statistics code we should be able to do so.
> > > 
> > > Why, are they taking time or space that is needed for something else?
> > > What's the downside here of just not always having this enabled?
> > 
> > Why have something that is not required/Used?
> 
> Because someone might need it and rebuilding a kernel isn't possible on
> lots of devices.

Yeah, and someone may need tetris, lets put  it into kernel and enable
unconditionaly :-(.

Really, this is just patch to provide dmesg subset, because someone is
cool enough to hack kernel, but will not just use grep.

The patch should be just dropped. Use perf or dmesg parsing to collect
data; fix printks() to be easier to grep for if needed.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [linux-pm] [PATCH v4] PM: add statistics debugfs file for  suspend to ram
  2011-08-12  7:04         ` Pavel Machek
@ 2011-08-12  7:13           ` Rafael J. Wysocki
  0 siblings, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2011-08-12  7:13 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Greg KH, Mansoor, Illyas, Liu, ShuoX,
	linux-pm@lists.linux-foundation.org, Brown, Len,
	Yanmin_zhang@linux.intel.com, linux-kernel@vger.kernel.org

On Friday, August 12, 2011, Pavel Machek wrote:
> On Wed 2011-08-10 12:58:54, Greg KH wrote:
> > On Thu, Aug 11, 2011 at 01:22:56AM +0530, Mansoor, Illyas wrote:
> > > static pm_message_t pm_transition;
> > > > > >
> > > > > > @@ -464,8 +465,12 @@ void dpm_resume_noirq(pm_message_t state)
> > > > > >  		mutex_unlock(&dpm_list_mtx);
> > > > > >
> > > > > >  		error = device_resume_noirq(dev, state);
> > > > > > -		if (error)
> > > > > > +		if (error) {
> > > > > > +			suspend_stats.failed_resume_noirq++;
> > > > > > +			dpm_save_failed_step(SUSPEND_RESUME_NOIRQ);
> > > > > > +			dpm_save_failed_dev(dev_name(dev));
> > > > >
> > > > > Please make these statistics conditionally enabled, so on a production system
> > > > > If we need to disable these statistics code we should be able to do so.
> > > > 
> > > > Why, are they taking time or space that is needed for something else?
> > > > What's the downside here of just not always having this enabled?
> > > 
> > > Why have something that is not required/Used?
> > 
> > Because someone might need it and rebuilding a kernel isn't possible on
> > lots of devices.
> 
> Yeah, and someone may need tetris, lets put  it into kernel and enable
> unconditionaly :-(.
> 
> Really, this is just patch to provide dmesg subset, because someone is
> cool enough to hack kernel, but will not just use grep.
> 
> The patch should be just dropped.

And what's the _technical_ reason?

Do you think it's generally wrong to have two different ways of getting
the same debug information from the kernel?

Really, the truth is that suspend debugging is still way too difficult
for the majority of users and I don't quite see a reason to drop patches
that may improve the situation in this area.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2011-08-12  7:11 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-08-10  7:28 [PATCH v4] PM: add statistics debugfs file for suspend to ram Liu, ShuoX
2011-08-10 18:00 ` [linux-pm] " Mansoor, Illyas
2011-08-10 19:45   ` Greg KH
2011-08-10 19:52     ` Mansoor, Illyas
2011-08-10 19:58       ` Greg KH
2011-08-10 20:20         ` Mansoor, Illyas
2011-08-10 20:25           ` Mansoor, Illyas
2011-08-12  7:04         ` Pavel Machek
2011-08-12  7:13           ` Rafael J. Wysocki
2011-08-10 21:07 ` Rafael J. Wysocki
2011-08-10 21:44 ` Pavel Machek
2011-08-11  1:55   ` Yanmin Zhang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox