public inbox for linux-acpi@vger.kernel.org
* [PATCH 2.6.25-rc1] cpufreq: fix cpufreq policy refcount imbalance
       [not found] ` <1201043126.3861.5.camel@yangyi-dev.bj.intel.com>
@ 2008-02-14 23:44   ` Yi Yang
  2008-02-14 23:48   ` Yi Yang
  1 sibling, 0 replies; 9+ messages in thread
From: Yi Yang @ 2008-02-14 23:44 UTC
  To: akpm, torvalds, gregkh
  Cc: davej, mingo, cpufreq, linux-pm, linux-kernel, linux-acpi,
	yhlu.kernel, elendil

When a CPU is taken offline, the calling process hangs. According to the
trace data, the cause is a refcount error in the cpufreq driver:
cpufreq_cpu_callback waits for completion of policy->kobj_unregister,
which never completes because a refcount imbalance in
__cpufreq_remove_dev in drivers/cpufreq/cpufreq.c prevents the kobject
release method from being called.

In drivers/cpufreq/cpufreq.c, the refcount of data->kobj isn't 1 when it
is about to be unregistered. This problem didn't exist in 2.6.24 and
earlier.

The root cause is the kobject API switch: in 2.6.25-rc1,
kobject_init_and_add and kobject_put replace the older kobject_register
and kobject_unregister. Compared to 2.6.24, kobject_unregister was deleted
from __cpufreq_remove_dev but was not replaced with kobject_put.

This patch adds a kobject_put to balance the refcount. I noticed Greg
suggests that removing the kobject_get statement block will fix a
power-off issue, but I think that isn't the best way, because that code
block has existed for a very long time and it is helpful: the statements
that follow it access the relevant data.


Signed-off-by: Yi Yang <yi.y.yang@intel.com>
---
 cpufreq.c |    5 +++++
 1 file changed, 5 insertions(+)

--- a/drivers/cpufreq/cpufreq.c	2008-02-15 04:41:29.000000000 +0800
+++ b/drivers/cpufreq/cpufreq.c	2008-02-15 06:56:56.000000000 +0800
@@ -1057,12 +1057,17 @@ static int __cpufreq_remove_dev (struct 
 
 	unlock_policy_rwsem_write(cpu);
 
+	/* it matches the previous kobject_get */
 	kobject_put(&data->kobj);
 
 	/* we need to make sure that the underlying kobj is actually
 	 * not referenced anymore by anybody before we proceed with
 	 * unloading.
 	 */
+
+	/* unregister data->kobj, it matches kobject_init_and_add */
+	kobject_put(&data->kobj);
+
 	dprintk("waiting for dropping of refcount\n");
 	wait_for_completion(&data->kobj_unregister);
 	dprintk("wait complete\n");




* [PATCH 2.6.25-rc1] cpufreq: fix cpufreq policy refcount imbalance
       [not found] ` <1201043126.3861.5.camel@yangyi-dev.bj.intel.com>
  2008-02-14 23:44   ` [PATCH 2.6.25-rc1] cpufreq: fix cpufreq policy refcount imbalance Yi Yang
@ 2008-02-14 23:48   ` Yi Yang
  2008-02-15 15:52     ` [linux-pm] " Alan Stern
                       ` (2 more replies)
  1 sibling, 3 replies; 9+ messages in thread
From: Yi Yang @ 2008-02-14 23:48 UTC
  To: akpm
  Cc: torvalds, gregkh, davej, mingo, cpufreq, linux-pm, linux-kernel,
	linux-acpi

When a CPU is taken offline, the calling process hangs. According to the
trace data, the cause is a refcount error in the cpufreq driver:
cpufreq_cpu_callback waits for completion of policy->kobj_unregister,
which never completes because a refcount imbalance in
__cpufreq_remove_dev in drivers/cpufreq/cpufreq.c prevents the kobject
release method from being called.

In drivers/cpufreq/cpufreq.c, the refcount of data->kobj isn't 1 when it
is about to be unregistered. This problem didn't exist in 2.6.24 and
earlier.

The root cause is the kobject API switch: in 2.6.25-rc1,
kobject_init_and_add and kobject_put replace the older kobject_register
and kobject_unregister. Compared to 2.6.24, kobject_unregister was deleted
from __cpufreq_remove_dev but was not replaced with kobject_put.

This patch adds a kobject_put to balance the refcount. I noticed Greg
suggests that removing the kobject_get statement block will fix a
power-off issue, but I think that isn't the best way, because that code
block has existed for a very long time and it is helpful: the statements
that follow it access the relevant data.


Signed-off-by: Yi Yang <yi.y.yang@intel.com>
---
 cpufreq.c |    5 +++++
 1 file changed, 5 insertions(+)

--- a/drivers/cpufreq/cpufreq.c	2008-02-15 04:41:29.000000000 +0800
+++ b/drivers/cpufreq/cpufreq.c	2008-02-15 06:56:56.000000000 +0800
@@ -1057,12 +1057,17 @@ static int __cpufreq_remove_dev (struct 
 
 	unlock_policy_rwsem_write(cpu);
 
+	/* it matches the previous kobject_get */
 	kobject_put(&data->kobj);
 
 	/* we need to make sure that the underlying kobj is actually
 	 * not referenced anymore by anybody before we proceed with
 	 * unloading.
 	 */
+
+	/* unregister data->kobj, it matches kobject_init_and_add */
+	kobject_put(&data->kobj);
+
 	dprintk("waiting for dropping of refcount\n");
 	wait_for_completion(&data->kobj_unregister);
 	dprintk("wait complete\n");




* Re: [linux-pm] [PATCH 2.6.25-rc1] cpufreq: fix cpufreq policy refcount imbalance
  2008-02-14 23:48   ` Yi Yang
@ 2008-02-15 15:52     ` Alan Stern
  2008-02-15 18:24       ` Greg KH
  2008-02-15 21:01     ` Greg KH
  2008-02-25  0:46     ` [PATCH 2.6.25-rc3] cpuidle: fix cpuidle time and usage overflow Yi Yang
  2 siblings, 1 reply; 9+ messages in thread
From: Alan Stern @ 2008-02-15 15:52 UTC
  To: Yi Yang
  Cc: akpm, davej, cpufreq, gregkh, linux-kernel, linux-acpi, mingo,
	torvalds, linux-pm

On Fri, 15 Feb 2008, Yi Yang wrote:

> This patch adds a kobject_put to balance the refcount. I noticed Greg
> suggests that removing the kobject_get statement block will fix a
> power-off issue, but I think that isn't the best way, because that code
> block has existed for a very long time and it is helpful: the statements
> that follow it access the relevant data.

Are you referring to this section of code (before the region affected 
by your patch)?

	if (!kobject_get(&data->kobj)) {
		spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
		cpufreq_debug_enable_ratelimit();
		unlock_policy_rwsem_write(cpu);
		return -EFAULT;
	}

Greg is correct that the kobject_get() here is useless and should be
removed.  kobject_get() never returns NULL unless its argument is NULL.  
Since &data->kobj can never be NULL, the "if" test will never fail.  
Hence there's no point in making the test at all.
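
The point about the NULL test can be illustrated with a small userspace
model. This is only a sketch: toy_kobj, toy_policy, toy_kobject_get and
get_would_fail are hypothetical names, not kernel code, and the model
assumes only what is stated above, namely that the get operation returns
its argument and returns NULL only when it is handed NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct kobject: just a counter. */
struct toy_kobj {
	int refcount;
};

/* Hypothetical stand-in for the cpufreq policy that embeds the kobject. */
struct toy_policy {
	int cpu;
	struct toy_kobj kobj;
};

/*
 * Model of the get operation as described in this thread: NULL in,
 * NULL out; otherwise take one more reference and return the pointer.
 */
struct toy_kobj *toy_kobject_get(struct toy_kobj *kobj)
{
	if (kobj) {
		/* A live object must already hold at least one reference. */
		assert(kobj->refcount > 0);
		kobj->refcount++;
	}
	return kobj;
}

/*
 * Returns 1 if the "if (!kobject_get(&data->kobj))" branch would be
 * taken. The address of an embedded member of a valid struct is never
 * NULL, so for any valid data pointer this returns 0.
 */
int get_would_fail(struct toy_policy *data)
{
	return toy_kobject_get(&data->kobj) == NULL;
}
```

With a policy whose embedded kobject starts at one reference,
get_would_fail() returns 0 and leaves the count at 2, which is the extra
reference discussed in this thread.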

The fact that a section of code has existed for a long time doesn't 
mean that it is right.  :-)

Furthermore, there's no reason to do the kobject_get().  Holding 2 
references to a kobject is no better than holding just 1 reference.  
Assuming you know that the kobject is still registered, then you also 
know that there is already a reference to it.  So you have no reason to 
take an additional reference.

Alan Stern



* Re: [linux-pm] [PATCH 2.6.25-rc1] cpufreq: fix cpufreq policy refcount imbalance
  2008-02-15 15:52     ` [linux-pm] " Alan Stern
@ 2008-02-15 18:24       ` Greg KH
  0 siblings, 0 replies; 9+ messages in thread
From: Greg KH @ 2008-02-15 18:24 UTC
  To: Alan Stern
  Cc: Yi Yang, akpm, davej, cpufreq, linux-kernel, linux-acpi, mingo,
	torvalds, linux-pm

On Fri, Feb 15, 2008 at 10:52:51AM -0500, Alan Stern wrote:
> On Fri, 15 Feb 2008, Yi Yang wrote:
> 
> > This patch adds a kobject_put to balance the refcount. I noticed Greg
> > suggests that removing the kobject_get statement block will fix a
> > power-off issue, but I think that isn't the best way, because that code
> > block has existed for a very long time and it is helpful: the statements
> > that follow it access the relevant data.
> 
> Are you referring to this section of code (before the region affected 
> by your patch)?
> 
> 	if (!kobject_get(&data->kobj)) {
> 		spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
> 		cpufreq_debug_enable_ratelimit();
> 		unlock_policy_rwsem_write(cpu);
> 		return -EFAULT;
> 	}
> 
> Greg is correct that the kobject_get() here is useless and should be
> removed.  kobject_get() never returns NULL unless its argument is NULL.  
> Since &data->kobj can never be NULL, the "if" test will never fail.  
> Hence there's no point in making the test at all.
> 
> The fact that a section of code has existed for a long time doesn't 
> mean that it is right.  :-)
> 
> Furthermore, there's no reason to do the kobject_get().  Holding 2 
> references to a kobject is no better than holding just 1 reference.  
> Assuming you know that the kobject is still registered, then you also 
> know that there is already a reference to it.  So you have no reason to 
> take an additional reference.

There's the additional problem that this second reference count is never
dropped, causing a bug :)

thanks,

greg k-h


* Re: [PATCH 2.6.25-rc1] cpufreq: fix cpufreq policy refcount imbalance
  2008-02-14 23:48   ` Yi Yang
  2008-02-15 15:52     ` [linux-pm] " Alan Stern
@ 2008-02-15 21:01     ` Greg KH
  2008-02-25  0:46     ` [PATCH 2.6.25-rc3] cpuidle: fix cpuidle time and usage overflow Yi Yang
  2 siblings, 0 replies; 9+ messages in thread
From: Greg KH @ 2008-02-15 21:01 UTC
  To: Yi Yang
  Cc: akpm, torvalds, davej, mingo, cpufreq, linux-pm, linux-kernel,
	linux-acpi

On Fri, Feb 15, 2008 at 07:48:41AM +0800, Yi Yang wrote:
> When a CPU is taken offline, the calling process hangs. According to the
> trace data, the cause is a refcount error in the cpufreq driver:
> cpufreq_cpu_callback waits for completion of policy->kobj_unregister,
> which never completes because a refcount imbalance in
> __cpufreq_remove_dev in drivers/cpufreq/cpufreq.c prevents the kobject
> release method from being called.
> 
> In drivers/cpufreq/cpufreq.c, the refcount of data->kobj isn't 1 when it
> is about to be unregistered. This problem didn't exist in 2.6.24 and
> earlier.
> 
> The root cause is the kobject API switch: in 2.6.25-rc1,
> kobject_init_and_add and kobject_put replace the older kobject_register
> and kobject_unregister. Compared to 2.6.24, kobject_unregister was deleted
> from __cpufreq_remove_dev but was not replaced with kobject_put.
> 
> This patch adds a kobject_put to balance the refcount. I noticed Greg
> suggests that removing the kobject_get statement block will fix a
> power-off issue, but I think that isn't the best way, because that code
> block has existed for a very long time and it is helpful: the statements
> that follow it access the relevant data.
> 
> 
> Signed-off-by: Yi Yang <yi.y.yang@intel.com>

No, the additional kobject_get() needs to be removed.  I posted a patch
for this last night, and so did someone else earlier at:
		http://lkml.org/lkml/2008/2/8/342

this patch should not be added, I'll get the other one in instead.
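
The two competing fixes can be compared in a simplified userspace model
of the reference count. This is a sketch under stated assumptions, not
the kernel implementation: toy_kobj and its helpers are hypothetical, the
count starts at 1 as it would after an init-and-add, and the released
flag stands in for the release method completing kobj_unregister. Real
kobjects can hold further references that this model ignores.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a refcounted object. */
struct toy_kobj {
	int refcount;
	bool released;	/* stands in for completing kobj_unregister */
};

void toy_init(struct toy_kobj *k)
{
	k->refcount = 1;	/* the reference owned by the creator */
	k->released = false;
}

void toy_get(struct toy_kobj *k)
{
	assert(k->refcount > 0);
	k->refcount++;
}

void toy_put(struct toy_kobj *k)
{
	assert(k->refcount > 0);
	/* The release step runs only when the last reference goes away. */
	if (--k->refcount == 0)
		k->released = true;
}

/*
 * Models one removal: optionally take the extra reference, then drop
 * `puts` references. Returns true if the release step ran, i.e. a
 * waiter on the completion would wake up.
 */
bool remove_dev_completes(bool extra_get, int puts)
{
	struct toy_kobj k;

	toy_init(&k);
	if (extra_get)
		toy_get(&k);
	for (int i = 0; i < puts; i++)
		toy_put(&k);
	return k.released;
}
```

remove_dev_completes(true, 1) models the hang (one get, one put, release
never runs); remove_dev_completes(true, 2) models the patch proposed in
this thread; remove_dev_completes(false, 1) models removing the extra
kobject_get(). The last two both reach the release step in this model;
the replies above give the reasons for preferring the removal.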

thanks,

greg k-h


* [PATCH 2.6.25-rc3] cpuidle: fix cpuidle time and usage overflow
  2008-02-14 23:48   ` Yi Yang
  2008-02-15 15:52     ` [linux-pm] " Alan Stern
  2008-02-15 21:01     ` Greg KH
@ 2008-02-25  0:46     ` Yi Yang
  2008-02-25 10:15       ` Ingo Molnar
  2008-03-26  4:46       ` Len Brown
  2 siblings, 2 replies; 9+ messages in thread
From: Yi Yang @ 2008-02-25  0:46 UTC
  To: akpm; +Cc: venkatesh.pallipadi, cpufreq, linux-pm, linux-kernel, linux-acpi

The cpuidle C-state sysfs nodes "time" and "usage" overflow very easily
because both are of type unsigned int. The time counter overflows within
about two hours; the usage counter takes longer to overflow, but both
only ever increase.

This patch converts them to unsigned long long.
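
As a rough check on the overflow estimate, the arithmetic can be sketched
in plain C. The helper name minutes_until_wrap is hypothetical, and the
sketch assumes a 32-bit unsigned int counting microseconds, as the
"in US" comment on the time field in the patch suggests.

```c
#include <stdint.h>

/*
 * Minutes of accumulated residency after which a microsecond counter
 * with the given maximum value wraps around to zero.
 */
double minutes_until_wrap(uint64_t max_value)
{
	/* The counter wraps after max_value + 1 microseconds. */
	double microseconds = (double)max_value + 1.0;

	return microseconds / 1e6 / 60.0;
}
```

For a 32-bit counter (UINT32_MAX) this gives roughly 71.6 minutes of
accumulated residency in a single state. Wall-clock time to overflow is
longer when the CPU divides its idle time among several C-states, which
is compatible with the "within about two hours" figure above. For a
64-bit counter (UINT64_MAX) the result is on the order of hundreds of
thousands of years.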


Signed-off-by: Yi Yang <yi.y.yang@intel.com>
---
 drivers/cpuidle/cpuidle.c |    2 +-
 drivers/cpuidle/sysfs.c   |   10 ++++++++--
 include/linux/cpuidle.h   |    4 ++--

--- a/include/linux/cpuidle.h	2008-02-25 02:31:26.000000000 -0500
+++ b/include/linux/cpuidle.h	2008-02-25 04:30:24.000000000 -0500
@@ -38,8 +38,8 @@ struct cpuidle_state {
 	unsigned int	power_usage; /* in mW */
 	unsigned int	target_residency; /* in US */
 
-	unsigned int	usage;
-	unsigned int	time; /* in US */
+	unsigned long long	usage;
+	unsigned long long	time; /* in US */
 
 	int (*enter)	(struct cpuidle_device *dev,
 			 struct cpuidle_state *state);
--- a/drivers/cpuidle/cpuidle.c	2008-02-25 02:37:14.000000000 -0500
+++ b/drivers/cpuidle/cpuidle.c	2008-02-25 04:29:19.000000000 -0500
@@ -67,7 +67,7 @@ static void cpuidle_idle_call(void)
 	/* enter the state and update stats */
 	dev->last_residency = target_state->enter(dev, target_state);
 	dev->last_state = target_state;
-	target_state->time += dev->last_residency;
+	target_state->time += (unsigned long long)dev->last_residency;
 	target_state->usage++;
 
 	/* give the governor an opportunity to reflect on the outcome */
--- a/drivers/cpuidle/sysfs.c	2008-02-25 02:33:14.000000000 -0500
+++ b/drivers/cpuidle/sysfs.c	2008-02-25 03:10:50.000000000 -0500
@@ -218,6 +218,12 @@ static ssize_t show_state_##_name(struct
 	return sprintf(buf, "%u\n", state->_name);\
 }
 
+#define define_show_state_ull_function(_name) \
+static ssize_t show_state_##_name(struct cpuidle_state *state, char *buf) \
+{ \
+	return sprintf(buf, "%llu\n", state->_name);\
+}
+
 #define define_show_state_str_function(_name) \
 static ssize_t show_state_##_name(struct cpuidle_state *state, char *buf) \
 { \
@@ -228,8 +234,8 @@ static ssize_t show_state_##_name(struct
 
 define_show_state_function(exit_latency)
 define_show_state_function(power_usage)
-define_show_state_function(usage)
-define_show_state_function(time)
+define_show_state_ull_function(usage)
+define_show_state_ull_function(time)
 define_show_state_str_function(name)
 define_show_state_str_function(desc)
 




* Re: [PATCH 2.6.25-rc3] cpuidle: fix cpuidle time and usage overflow
  2008-02-25 10:15       ` Ingo Molnar
@ 2008-02-25  1:10         ` Yi Yang
  0 siblings, 0 replies; 9+ messages in thread
From: Yi Yang @ 2008-02-25  1:10 UTC
  To: Ingo Molnar
  Cc: akpm, venkatesh.pallipadi, cpufreq, linux-pm, linux-kernel,
	linux-acpi

On Mon, 2008-02-25 at 11:15 +0100, Ingo Molnar wrote:
> * Yi Yang <yi.y.yang@intel.com> wrote:
> 
> > The cpuidle C-state sysfs nodes "time" and "usage" overflow very easily
> > because both are of type unsigned int. The time counter overflows within
> > about two hours; the usage counter takes longer to overflow, but both
> > only ever increase.
> > 
> > This patch converts them to unsigned long long.
> 
> what happens if such an overflow happens - any particular regression or 
> other misbehavior, or just funny looking stats in sysfs?
They are just statistics in sysfs; cpuidle's behavior doesn't depend on
them. I didn't notice any regression or misbehavior.
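
What an overflowed statistic looks like can be sketched as follows. The
function names are hypothetical and the fixed-width types are stand-ins
for unsigned int and unsigned long long; the sketch only shows that
unsigned addition wraps modulo 2^32 in the narrow counter, so the
reported value drops back toward zero, while the wide counter keeps
growing.

```c
#include <stdint.h>

/* Accumulate residency into a 32-bit total, as before the patch. */
uint32_t add_residency_u32(uint32_t total, uint32_t residency_us)
{
	/* Unsigned arithmetic wraps modulo 2^32 instead of trapping. */
	return total + residency_us;
}

/* Accumulate residency into a 64-bit total, as after the patch. */
uint64_t add_residency_u64(uint64_t total, uint32_t residency_us)
{
	/* residency_us is widened to 64 bits before the addition. */
	return total + residency_us;
}
```

Starting from 4294967000 and adding 1000 microseconds, the 32-bit total
wraps to 704, while the 64-bit total is 4294968000.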

> 
> 	Ingo



* Re: [PATCH 2.6.25-rc3] cpuidle: fix cpuidle time and usage overflow
  2008-02-25  0:46     ` [PATCH 2.6.25-rc3] cpuidle: fix cpuidle time and usage overflow Yi Yang
@ 2008-02-25 10:15       ` Ingo Molnar
  2008-02-25  1:10         ` Yi Yang
  2008-03-26  4:46       ` Len Brown
  1 sibling, 1 reply; 9+ messages in thread
From: Ingo Molnar @ 2008-02-25 10:15 UTC
  To: Yi Yang
  Cc: akpm, venkatesh.pallipadi, cpufreq, linux-pm, linux-kernel,
	linux-acpi


* Yi Yang <yi.y.yang@intel.com> wrote:

> The cpuidle C-state sysfs nodes "time" and "usage" overflow very easily
> because both are of type unsigned int. The time counter overflows within
> about two hours; the usage counter takes longer to overflow, but both
> only ever increase.
> 
> This patch converts them to unsigned long long.

what happens if such an overflow happens - any particular regression or 
other misbehavior, or just funny looking stats in sysfs?

	Ingo


* Re: [PATCH 2.6.25-rc3] cpuidle: fix cpuidle time and usage overflow
  2008-02-25  0:46     ` [PATCH 2.6.25-rc3] cpuidle: fix cpuidle time and usage overflow Yi Yang
  2008-02-25 10:15       ` Ingo Molnar
@ 2008-03-26  4:46       ` Len Brown
  1 sibling, 0 replies; 9+ messages in thread
From: Len Brown @ 2008-03-26  4:46 UTC
  To: yi.y.yang; +Cc: cpufreq, linux-kernel, linux-acpi, akpm, linux-pm

applied

thanks,
-len

On Sunday 24 February 2008, Yi Yang wrote:
> The cpuidle C-state sysfs nodes "time" and "usage" overflow very easily
> because both are of type unsigned int. The time counter overflows within
> about two hours; the usage counter takes longer to overflow, but both
> only ever increase.
> 
> This patch converts them to unsigned long long.
> 
> 
> Signed-off-by: Yi Yang <yi.y.yang@intel.com>
> ---
>  drivers/cpuidle/cpuidle.c |    2 +-
>  drivers/cpuidle/sysfs.c   |   10 ++++++++--
>  include/linux/cpuidle.h   |    4 ++--
> 
> --- a/include/linux/cpuidle.h	2008-02-25 02:31:26.000000000 -0500
> +++ b/include/linux/cpuidle.h	2008-02-25 04:30:24.000000000 -0500
> @@ -38,8 +38,8 @@ struct cpuidle_state {
>  	unsigned int	power_usage; /* in mW */
>  	unsigned int	target_residency; /* in US */
>  
> -	unsigned int	usage;
> -	unsigned int	time; /* in US */
> +	unsigned long long	usage;
> +	unsigned long long	time; /* in US */
>  
>  	int (*enter)	(struct cpuidle_device *dev,
>  			 struct cpuidle_state *state);
> --- a/drivers/cpuidle/cpuidle.c	2008-02-25 02:37:14.000000000 -0500
> +++ b/drivers/cpuidle/cpuidle.c	2008-02-25 04:29:19.000000000 -0500
> @@ -67,7 +67,7 @@ static void cpuidle_idle_call(void)
>  	/* enter the state and update stats */
>  	dev->last_residency = target_state->enter(dev, target_state);
>  	dev->last_state = target_state;
> -	target_state->time += dev->last_residency;
> +	target_state->time += (unsigned long long)dev->last_residency;
>  	target_state->usage++;
>  
>  	/* give the governor an opportunity to reflect on the outcome */
> --- a/drivers/cpuidle/sysfs.c	2008-02-25 02:33:14.000000000 -0500
> +++ b/drivers/cpuidle/sysfs.c	2008-02-25 03:10:50.000000000 -0500
> @@ -218,6 +218,12 @@ static ssize_t show_state_##_name(struct
>  	return sprintf(buf, "%u\n", state->_name);\
>  }
>  
> +#define define_show_state_ull_function(_name) \
> +static ssize_t show_state_##_name(struct cpuidle_state *state, char *buf) \
> +{ \
> +	return sprintf(buf, "%llu\n", state->_name);\
> +}
> +
>  #define define_show_state_str_function(_name) \
>  static ssize_t show_state_##_name(struct cpuidle_state *state, char *buf) \
>  { \
> @@ -228,8 +234,8 @@ static ssize_t show_state_##_name(struct
>  
>  define_show_state_function(exit_latency)
>  define_show_state_function(power_usage)
> -define_show_state_function(usage)
> -define_show_state_function(time)
> +define_show_state_ull_function(usage)
> +define_show_state_ull_function(time)
>  define_show_state_str_function(name)
>  define_show_state_str_function(desc)
>  
> 
> 



Thread overview: 9+ messages
     [not found] <1199441414.19185.9.camel@yangyi-dev.bj.intel.com>
     [not found] ` <1201043126.3861.5.camel@yangyi-dev.bj.intel.com>
2008-02-14 23:44   ` [PATCH 2.6.25-rc1] cpufreq: fix cpufreq policy refcount imbalance Yi Yang
2008-02-14 23:48   ` Yi Yang
2008-02-15 15:52     ` [linux-pm] " Alan Stern
2008-02-15 18:24       ` Greg KH
2008-02-15 21:01     ` Greg KH
2008-02-25  0:46     ` [PATCH 2.6.25-rc3] cpuidle: fix cpuidle time and usage overflow Yi Yang
2008-02-25 10:15       ` Ingo Molnar
2008-02-25  1:10         ` Yi Yang
2008-03-26  4:46       ` Len Brown
