linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] cpuidle - fix lock contention in the idle path
@ 2012-12-26 10:01 Daniel Lezcano
  2012-12-29 10:03 ` Daniel Lezcano
  2013-01-02 21:13 ` Russ Anderson
  0 siblings, 2 replies; 7+ messages in thread
From: Daniel Lezcano @ 2012-12-26 10:01 UTC (permalink / raw)
  To: rafael.j.wysocki; +Cc: rja, linux-pm, pdeschrijver, akpm, linux-kernel

The commit bf4d1b5ddb78f86078ac6ae0415802d5f0c68f92 introduces
a lock in the cpuidle_get_cpu_driver function. This function
is used in the idle_call function.

The problem is the contention with a large number of cpus because
they try to access the idle routine at the same time.

The lock could be safely removed because of how is used the
cpuidle api. The cpuidle_register_driver is called first but
until the cpuidle_register_device is not called we don't
enter in the cpuidle idle call function because the device
is not enabled.

The cpuidle_unregister_driver function, leading the a NULL driver,
is not called before the cpuidle_unregister_device.

This is how is used the cpuidle api from the different drivers.

However, a cleanup around the lock and a proper refcounting
mechanism should be used to ensure the consistency in the api,
like cpuidle_unregister_driver should failed if its refcounting
is not 0.

These modifications will need some code reorganization and rewrite
which does not fit with a fix.

The following patch is a hot fix by returning to the initial behavior
by removing the lock when getting the driver.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 drivers/cpuidle/driver.c |    8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/cpuidle/driver.c b/drivers/cpuidle/driver.c
index 3af841f..c2b281a 100644
--- a/drivers/cpuidle/driver.c
+++ b/drivers/cpuidle/driver.c
@@ -235,16 +235,10 @@ EXPORT_SYMBOL_GPL(cpuidle_get_driver);
  */
 struct cpuidle_driver *cpuidle_get_cpu_driver(struct cpuidle_device *dev)
 {
-	struct cpuidle_driver *drv;
-
 	if (!dev)
 		return NULL;
 
-	spin_lock(&cpuidle_driver_lock);
-	drv = __cpuidle_get_cpu_driver(dev->cpu);
-	spin_unlock(&cpuidle_driver_lock);
-
-	return drv;
+	return __cpuidle_get_cpu_driver(dev->cpu);
 }
 EXPORT_SYMBOL_GPL(cpuidle_get_cpu_driver);
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] cpuidle - fix lock contention in the idle path
  2012-12-26 10:01 [PATCH] cpuidle - fix lock contention in the idle path Daniel Lezcano
@ 2012-12-29 10:03 ` Daniel Lezcano
  2012-12-31 15:08   ` Russ Anderson
  2013-01-02 21:13 ` Russ Anderson
  1 sibling, 1 reply; 7+ messages in thread
From: Daniel Lezcano @ 2012-12-29 10:03 UTC (permalink / raw)
  To: rafael.j.wysocki, rja; +Cc: linux-pm, pdeschrijver, akpm, linux-kernel


Hi Russ,

Is it possible you try this patch on your 2048 cpus ?

Thanks

  -- Daniel

On 12/26/2012 11:01 AM, Daniel Lezcano wrote:
> The commit bf4d1b5ddb78f86078ac6ae0415802d5f0c68f92 introduces
> a lock in the cpuidle_get_cpu_driver function. This function
> is used in the idle_call function.
> 
> The problem is the contention with a large number of cpus because
> they try to access the idle routine at the same time.
> 
> The lock could be safely removed because of how is used the
> cpuidle api. The cpuidle_register_driver is called first but
> until the cpuidle_register_device is not called we don't
> enter in the cpuidle idle call function because the device
> is not enabled.
> 
> The cpuidle_unregister_driver function, leading the a NULL driver,
> is not called before the cpuidle_unregister_device.
> 
> This is how is used the cpuidle api from the different drivers.
> 
> However, a cleanup around the lock and a proper refcounting
> mechanism should be used to ensure the consistency in the api,
> like cpuidle_unregister_driver should failed if its refcounting
> is not 0.
> 
> These modifications will need some code reorganization and rewrite
> which does not fit with a fix.
> 
> The following patch is a hot fix by returning to the initial behavior
> by removing the lock when getting the driver.
> 
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> ---
>  drivers/cpuidle/driver.c |    8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
> 
> diff --git a/drivers/cpuidle/driver.c b/drivers/cpuidle/driver.c
> index 3af841f..c2b281a 100644
> --- a/drivers/cpuidle/driver.c
> +++ b/drivers/cpuidle/driver.c
> @@ -235,16 +235,10 @@ EXPORT_SYMBOL_GPL(cpuidle_get_driver);
>   */
>  struct cpuidle_driver *cpuidle_get_cpu_driver(struct cpuidle_device *dev)
>  {
> -	struct cpuidle_driver *drv;
> -
>  	if (!dev)
>  		return NULL;
>  
> -	spin_lock(&cpuidle_driver_lock);
> -	drv = __cpuidle_get_cpu_driver(dev->cpu);
> -	spin_unlock(&cpuidle_driver_lock);
> -
> -	return drv;
> +	return __cpuidle_get_cpu_driver(dev->cpu);
>  }
>  EXPORT_SYMBOL_GPL(cpuidle_get_cpu_driver);
>  
> 


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] cpuidle - fix lock contention in the idle path
  2012-12-29 10:03 ` Daniel Lezcano
@ 2012-12-31 15:08   ` Russ Anderson
  0 siblings, 0 replies; 7+ messages in thread
From: Russ Anderson @ 2012-12-31 15:08 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: rafael.j.wysocki, linux-pm, pdeschrijver, akpm, linux-kernel, rja

On Sat, Dec 29, 2012 at 11:03:03AM +0100, Daniel Lezcano wrote:
> 
> Hi Russ,
> 
> Is it possible you try this patch on your 2048 cpus ?

Yes, I will try it later today.
Thanks


> Thanks
> 
>   -- Daniel
> 
> On 12/26/2012 11:01 AM, Daniel Lezcano wrote:
> > The commit bf4d1b5ddb78f86078ac6ae0415802d5f0c68f92 introduces
> > a lock in the cpuidle_get_cpu_driver function. This function
> > is used in the idle_call function.
> > 
> > The problem is the contention with a large number of cpus because
> > they try to access the idle routine at the same time.
> > 
> > The lock could be safely removed because of how is used the
> > cpuidle api. The cpuidle_register_driver is called first but
> > until the cpuidle_register_device is not called we don't
> > enter in the cpuidle idle call function because the device
> > is not enabled.
> > 
> > The cpuidle_unregister_driver function, leading the a NULL driver,
> > is not called before the cpuidle_unregister_device.
> > 
> > This is how is used the cpuidle api from the different drivers.
> > 
> > However, a cleanup around the lock and a proper refcounting
> > mechanism should be used to ensure the consistency in the api,
> > like cpuidle_unregister_driver should failed if its refcounting
> > is not 0.
> > 
> > These modifications will need some code reorganization and rewrite
> > which does not fit with a fix.
> > 
> > The following patch is a hot fix by returning to the initial behavior
> > by removing the lock when getting the driver.
> > 
> > Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> > ---
> >  drivers/cpuidle/driver.c |    8 +-------
> >  1 file changed, 1 insertion(+), 7 deletions(-)
> > 
> > diff --git a/drivers/cpuidle/driver.c b/drivers/cpuidle/driver.c
> > index 3af841f..c2b281a 100644
> > --- a/drivers/cpuidle/driver.c
> > +++ b/drivers/cpuidle/driver.c
> > @@ -235,16 +235,10 @@ EXPORT_SYMBOL_GPL(cpuidle_get_driver);
> >   */
> >  struct cpuidle_driver *cpuidle_get_cpu_driver(struct cpuidle_device *dev)
> >  {
> > -	struct cpuidle_driver *drv;
> > -
> >  	if (!dev)
> >  		return NULL;
> >  
> > -	spin_lock(&cpuidle_driver_lock);
> > -	drv = __cpuidle_get_cpu_driver(dev->cpu);
> > -	spin_unlock(&cpuidle_driver_lock);
> > -
> > -	return drv;
> > +	return __cpuidle_get_cpu_driver(dev->cpu);
> >  }
> >  EXPORT_SYMBOL_GPL(cpuidle_get_cpu_driver);
> >  
> > 
> 
> 
> -- 
>  <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
> 
> Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
> <http://twitter.com/#!/linaroorg> Twitter |
> <http://www.linaro.org/linaro-blog/> Blog

-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc          rja@sgi.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] cpuidle - fix lock contention in the idle path
  2012-12-26 10:01 [PATCH] cpuidle - fix lock contention in the idle path Daniel Lezcano
  2012-12-29 10:03 ` Daniel Lezcano
@ 2013-01-02 21:13 ` Russ Anderson
  2013-01-04  6:27   ` Daniel Lezcano
  1 sibling, 1 reply; 7+ messages in thread
From: Russ Anderson @ 2013-01-02 21:13 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: rafael.j.wysocki, linux-pm, pdeschrijver, akpm, linux-kernel, rja

On Wed, Dec 26, 2012 at 11:01:48AM +0100, Daniel Lezcano wrote:
> The commit bf4d1b5ddb78f86078ac6ae0415802d5f0c68f92 introduces
> a lock in the cpuidle_get_cpu_driver function. This function
> is used in the idle_call function.
> 
> The problem is the contention with a large number of cpus because
> they try to access the idle routine at the same time.
> 
> The lock could be safely removed because of how is used the
> cpuidle api. The cpuidle_register_driver is called first but
> until the cpuidle_register_device is not called we don't
> enter in the cpuidle idle call function because the device
> is not enabled.
> 
> The cpuidle_unregister_driver function, leading the a NULL driver,
> is not called before the cpuidle_unregister_device.
> 
> This is how is used the cpuidle api from the different drivers.
> 
> However, a cleanup around the lock and a proper refcounting
> mechanism should be used to ensure the consistency in the api,
> like cpuidle_unregister_driver should failed if its refcounting
> is not 0.
> 
> These modifications will need some code reorganization and rewrite
> which does not fit with a fix.

I agree.

> The following patch is a hot fix by returning to the initial behavior
> by removing the lock when getting the driver.

The patch fixes the problem.  Verified on a system with 1024 cpus.
Thanks.

> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>

Reported-by: Russ Anderson <rja@sgi.com>
Acked-by: Russ Anderson <rja@sgi.com>

> ---
>  drivers/cpuidle/driver.c |    8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
> 
> diff --git a/drivers/cpuidle/driver.c b/drivers/cpuidle/driver.c
> index 3af841f..c2b281a 100644
> --- a/drivers/cpuidle/driver.c
> +++ b/drivers/cpuidle/driver.c
> @@ -235,16 +235,10 @@ EXPORT_SYMBOL_GPL(cpuidle_get_driver);
>   */
>  struct cpuidle_driver *cpuidle_get_cpu_driver(struct cpuidle_device *dev)
>  {
> -	struct cpuidle_driver *drv;
> -
>  	if (!dev)
>  		return NULL;
>  
> -	spin_lock(&cpuidle_driver_lock);
> -	drv = __cpuidle_get_cpu_driver(dev->cpu);
> -	spin_unlock(&cpuidle_driver_lock);
> -
> -	return drv;
> +	return __cpuidle_get_cpu_driver(dev->cpu);
>  }
>  EXPORT_SYMBOL_GPL(cpuidle_get_cpu_driver);
>  
> -- 
> 1.7.9.5

-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc          rja@sgi.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] cpuidle - fix lock contention in the idle path
  2013-01-02 21:13 ` Russ Anderson
@ 2013-01-04  6:27   ` Daniel Lezcano
  2013-01-05  0:03     ` Rafael J. Wysocki
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Lezcano @ 2013-01-04  6:27 UTC (permalink / raw)
  To: rafael.j.wysocki
  Cc: Russ Anderson, linux-pm, pdeschrijver, akpm, linux-kernel, rja

On 01/02/2013 10:13 PM, Russ Anderson wrote:
> On Wed, Dec 26, 2012 at 11:01:48AM +0100, Daniel Lezcano wrote:
>> The commit bf4d1b5ddb78f86078ac6ae0415802d5f0c68f92 introduces
>> a lock in the cpuidle_get_cpu_driver function. This function
>> is used in the idle_call function.
>>
>> The problem is the contention with a large number of cpus because
>> they try to access the idle routine at the same time.
>>
>> The lock could be safely removed because of how is used the
>> cpuidle api. The cpuidle_register_driver is called first but
>> until the cpuidle_register_device is not called we don't
>> enter in the cpuidle idle call function because the device
>> is not enabled.
>>
>> The cpuidle_unregister_driver function, leading the a NULL driver,
>> is not called before the cpuidle_unregister_device.
>>
>> This is how is used the cpuidle api from the different drivers.
>>
>> However, a cleanup around the lock and a proper refcounting
>> mechanism should be used to ensure the consistency in the api,
>> like cpuidle_unregister_driver should failed if its refcounting
>> is not 0.
>>
>> These modifications will need some code reorganization and rewrite
>> which does not fit with a fix.
> 
> I agree.
> 
>> The following patch is a hot fix by returning to the initial behavior
>> by removing the lock when getting the driver.
> 
> The patch fixes the problem.  Verified on a system with 1024 cpus.
> Thanks.
> 
>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> 
> Reported-by: Russ Anderson <rja@sgi.com>
> Acked-by: Russ Anderson <rja@sgi.com>

Hi Rafael,

could you consider this patch for merging ?

Thanks
  -- Daniel


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] cpuidle - fix lock contention in the idle path
  2013-01-04  6:27   ` Daniel Lezcano
@ 2013-01-05  0:03     ` Rafael J. Wysocki
  2013-01-06 23:07       ` Daniel Lezcano
  0 siblings, 1 reply; 7+ messages in thread
From: Rafael J. Wysocki @ 2013-01-05  0:03 UTC (permalink / raw)
  To: Daniel Lezcano, Russ Anderson
  Cc: linux-pm, pdeschrijver, akpm, linux-kernel, rja

On Friday, January 04, 2013 07:27:24 AM Daniel Lezcano wrote:
> On 01/02/2013 10:13 PM, Russ Anderson wrote:
> > On Wed, Dec 26, 2012 at 11:01:48AM +0100, Daniel Lezcano wrote:
> >> The commit bf4d1b5ddb78f86078ac6ae0415802d5f0c68f92 introduces
> >> a lock in the cpuidle_get_cpu_driver function. This function
> >> is used in the idle_call function.
> >>
> >> The problem is the contention with a large number of cpus because
> >> they try to access the idle routine at the same time.
> >>
> >> The lock could be safely removed because of how is used the
> >> cpuidle api. The cpuidle_register_driver is called first but
> >> until the cpuidle_register_device is not called we don't
> >> enter in the cpuidle idle call function because the device
> >> is not enabled.
> >>
> >> The cpuidle_unregister_driver function, leading the a NULL driver,
> >> is not called before the cpuidle_unregister_device.
> >>
> >> This is how is used the cpuidle api from the different drivers.
> >>
> >> However, a cleanup around the lock and a proper refcounting
> >> mechanism should be used to ensure the consistency in the api,
> >> like cpuidle_unregister_driver should failed if its refcounting
> >> is not 0.
> >>
> >> These modifications will need some code reorganization and rewrite
> >> which does not fit with a fix.
> > 
> > I agree.
> > 
> >> The following patch is a hot fix by returning to the initial behavior
> >> by removing the lock when getting the driver.
> > 
> > The patch fixes the problem.  Verified on a system with 1024 cpus.
> > Thanks.
> > 
> >> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> > 
> > Reported-by: Russ Anderson <rja@sgi.com>
> > Acked-by: Russ Anderson <rja@sgi.com>
> 
> Hi Rafael,
> 
> could you consider this patch for merging ?

Yes, I've taken it already.

I'll include it into the next pull request.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] cpuidle - fix lock contention in the idle path
  2013-01-05  0:03     ` Rafael J. Wysocki
@ 2013-01-06 23:07       ` Daniel Lezcano
  0 siblings, 0 replies; 7+ messages in thread
From: Daniel Lezcano @ 2013-01-06 23:07 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Russ Anderson, linux-pm, pdeschrijver, akpm, linux-kernel, rja


[ ... ]

>>>> The following patch is a hot fix by returning to the initial behavior
>>>> by removing the lock when getting the driver.
>>>
>>> The patch fixes the problem.  Verified on a system with 1024 cpus.
>>> Thanks.
>>>
>>>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
>>>
>>> Reported-by: Russ Anderson <rja@sgi.com>
>>> Acked-by: Russ Anderson <rja@sgi.com>
>>
>> Hi Rafael,
>>
>> could you consider this patch for merging ?
> 
> Yes, I've taken it already.
> 
> I'll include it into the next pull request.

Ok, thanks.

I noticed you rephrased the changelog. Thanks for that also.

 -- Daniel



-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

  * English - detected
  * English
  * French

  * English
  * French

 <javascript:void(0);> <#>

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2013-01-06 23:07 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-12-26 10:01 [PATCH] cpuidle - fix lock contention in the idle path Daniel Lezcano
2012-12-29 10:03 ` Daniel Lezcano
2012-12-31 15:08   ` Russ Anderson
2013-01-02 21:13 ` Russ Anderson
2013-01-04  6:27   ` Daniel Lezcano
2013-01-05  0:03     ` Rafael J. Wysocki
2013-01-06 23:07       ` Daniel Lezcano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).