public inbox for linux-kernel@vger.kernel.org
* Re: [Regression 3.4] tick_broadcast_mask is not restored after a CPU has been offline/onlined
       [not found] <20120730151559.772d4055@feng-i7>
@ 2012-07-30 13:39 ` Paul E. McKenney
  2012-07-30 15:07   ` Feng Tang
  0 siblings, 1 reply; 6+ messages in thread
From: Paul E. McKenney @ 2012-07-30 13:39 UTC
  To: Feng Tang
  Cc: Paul E. McKenney, Len Brown, Rafael J. Wysocki,
	Linux Kernel Mail List

On Mon, Jul 30, 2012 at 03:15:59PM +0800, Feng Tang wrote:
> Hi All,
> 
> When I debugged a suspend/resume bug, I found that tick_broadcast_mask is not
> restored for a CPU after it is offline/onlined since kernel 3.4, while it's
> fine for 3.3.

Could you please try 3.5?

> Further checking shows it is caused by commit 9505626d7bfe
>    ACPI: Fix unprotected smp_processor_id() in acpi_processor_cst_has_changed()
> 	
>     The acpi_processor_cst_has_changed() function is invoked from a
>     CPU_ONLINE or CPU_DEAD function, which might well execute on CPU 0
>     even though the CPU being hotplugged is some other CPU.  In addition,
>     acpi_processor_cst_has_changed() invokes smp_processor_id() without
>     protection, resulting in splats when onlining CPUs.
>     
>     This commit therefore changes the smp_processor_id() to pr->id, as is
>     used elsewhere in the code, for example, in acpi_processor_add().
>     
>     Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> 
> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> index 0e8e2de..9e57b06 100644
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -1159,8 +1159,7 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr)
>          * to make the code that updates C-States be called once.
>          */
> 
> -       if (smp_processor_id() == 0 &&
> -                       cpuidle_get_driver() == &acpi_idle_driver) {
> +       if (pr->id == 0 && cpuidle_get_driver() == &acpi_idle_driver) {
> 
>                 cpuidle_pause_and_lock();
>                 /* Protect against cpu-hotplug */
> 
> The root cause is that acpi_processor_cst_has_changed() is also called when
> cpu_up() runs on CPU 0 to bring up another CPU. This commit prevents the
> following code from running for that CPU, which triggers side effects such
> as the broadcast mask not being restored.
> 
> I am raising this problem, but I don't know if a revert is a good solution here.

Indeed, that would re-introduce the splats from unprotected use of
smp_processor_id().  :-(

							Thanx, Paul
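
To make the regression described above concrete, here is a minimal userspace
sketch of the two forms of the guard condition. It is not kernel code: the
names toy_processor, check_kind, cst_update_runs and NR_CPUS are hypothetical
and exist only for illustration, and the cpuidle_get_driver() half of the
condition is assumed to be true. It models only who passes the check, not
what the guarded block does.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for this model */

/* Toy stand-in for struct acpi_processor; only the id field is modeled. */
struct toy_processor {
	int id;		/* CPU this object describes (pr->id in the thread) */
};

/* Which form of the "call once" condition to model. */
enum check_kind {
	CHECK_RUNNING_CPU,	/* before the commit: smp_processor_id() == 0 */
	CHECK_TARGET_CPU,	/* after commit 9505626d7bfe: pr->id == 0 */
};

/*
 * Return true if the guarded C-state update block would be entered.
 * running_cpu models the CPU executing the hotplug callback, which may
 * differ from the CPU being brought online (pr->id).
 */
static bool cst_update_runs(enum check_kind kind, int running_cpu,
			    const struct toy_processor *pr)
{
	assert(running_cpu >= 0 && running_cpu < NR_CPUS);
	assert(pr->id >= 0 && pr->id < NR_CPUS);

	if (kind == CHECK_RUNNING_CPU)
		return running_cpu == 0;
	return pr->id == 0;
}
```

With CPU 0 executing the callback while CPU 1 is being onlined,
cst_update_runs(CHECK_RUNNING_CPU, 0, &cpu1) is true, but
cst_update_runs(CHECK_TARGET_CPU, 0, &cpu1) is false, so under this model the
guarded block is skipped for every hotplugged CPU other than CPU 0.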



* Re: [Regression 3.4] tick_broadcast_mask is not restored after a CPU has been offline/onlined
  2012-07-30 13:39 ` [Regression 3.4] tick_broadcast_mask is not restored after a CPU has been offline/onlined Paul E. McKenney
@ 2012-07-30 15:07   ` Feng Tang
  2012-07-30 17:08     ` Paul E. McKenney
  0 siblings, 1 reply; 6+ messages in thread
From: Feng Tang @ 2012-07-30 15:07 UTC
  To: paulmck
  Cc: Paul E. McKenney, Len Brown, Rafael J. Wysocki,
	Linux Kernel Mail List, linux-kernel

Hi Paul,

On Mon, 30 Jul 2012 06:39:13 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Mon, Jul 30, 2012 at 03:15:59PM +0800, Feng Tang wrote:
> > Hi All,
> > 
> > When I debugged a suspend/resume bug, I found that tick_broadcast_mask is
> > not restored for a CPU after it is offline/onlined since kernel 3.4, while
> > it's fine for 3.3.
> 
> Could you please try 3.5?

Yes, it's the same for 3.5

Thanks,
Feng


* Re: [Regression 3.4] tick_broadcast_mask is not restored after a CPU has been offline/onlined
  2012-07-30 15:07   ` Feng Tang
@ 2012-07-30 17:08     ` Paul E. McKenney
  2012-07-30 17:42       ` Paul E. McKenney
  0 siblings, 1 reply; 6+ messages in thread
From: Paul E. McKenney @ 2012-07-30 17:08 UTC
  To: Feng Tang
  Cc: Paul E. McKenney, Len Brown, Rafael J. Wysocki,
	Linux Kernel Mail List

On Mon, Jul 30, 2012 at 11:07:47PM +0800, Feng Tang wrote:
> Hi Paul,
> 
> On Mon, 30 Jul 2012 06:39:13 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Mon, Jul 30, 2012 at 03:15:59PM +0800, Feng Tang wrote:
> > > Hi All,
> > > 
> > > When I debugged a suspend/resume bug, I found that tick_broadcast_mask is
> > > not restored for a CPU after it is offline/onlined since kernel 3.4, while
> > > it's fine for 3.3.
> > 
> > Could you please try 3.5?
> 
> Yes, it's the same for 3.5

Thank you for checking, Feng.

Len, the comment above the change says:

	/*
	 * FIXME:  Design the ACPI notification to make it once per
	 * system instead of once per-cpu.  This condition is a hack
	 * to make the code that updates C-States be called once.
	 */

Is it time for this design-level change?  Or is there something obvious
that I missed when fixing the smp_processor_id() splat?

I could revert back, but use raw_smp_processor_id() rather than
smp_processor_id(), but that feels like papering over a problem rather
than fixing it.

Thoughts?

							Thanx, Paul



* Re: [Regression 3.4] tick_broadcast_mask is not restored after a CPU has been offline/onlined
  2012-07-30 17:08     ` Paul E. McKenney
@ 2012-07-30 17:42       ` Paul E. McKenney
  2012-07-31  3:18         ` Feng Tang
  0 siblings, 1 reply; 6+ messages in thread
From: Paul E. McKenney @ 2012-07-30 17:42 UTC
  To: Feng Tang
  Cc: Paul E. McKenney, Len Brown, Rafael J. Wysocki,
	Linux Kernel Mail List

On Mon, Jul 30, 2012 at 10:08:47AM -0700, Paul E. McKenney wrote:
> On Mon, Jul 30, 2012 at 11:07:47PM +0800, Feng Tang wrote:
> > Hi Paul,
> > 
> > On Mon, 30 Jul 2012 06:39:13 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > 
> > > On Mon, Jul 30, 2012 at 03:15:59PM +0800, Feng Tang wrote:
> > > > Hi All,
> > > > 
> > > > When I debugged a suspend/resume bug, I found that tick_broadcast_mask is
> > > > not restored for a CPU after it is offline/onlined since kernel 3.4, while
> > > > it's fine for 3.3.
> > > 
> > > Could you please try 3.5?
> > 
> > Yes, it's the same for 3.5
> 
> Thank you for checking, Feng.
> 
> Len, the comment above the change says:
> 
> 	/*
> 	 * FIXME:  Design the ACPI notification to make it once per
> 	 * system instead of once per-cpu.  This condition is a hack
> 	 * to make the code that updates C-States be called once.
> 	 */
> 
> Is it time for this design-level change?  Or is there something obvious
> that I missed when fixing the smp_processor_id() splat?
> 
> I could revert back, but use raw_smp_processor_id() rather than
> smp_processor_id(), but that feels like papering over a problem rather
> than fixing it.

But should papering be appropriate, here is the patch.

							Thanx, Paul

------------------------------------------------------------------------

ACPI: Repair fix to unprotected smp_processor_id()

Commit 9505626d (ACPI: Fix unprotected smp_processor_id() in
acpi_processor_cst_has_changed()) introduced a suspend/resume bug.
This commit therefore introduces a bug-for-bug compatible fix for the
original problem.

Reported-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>

diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 47a8caa..19c151a 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -1218,7 +1218,8 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr)
 	 * to make the code that updates C-States be called once.
 	 */
 
-	if (pr->id == 0 && cpuidle_get_driver() == &acpi_idle_driver) {
+	if (raw_smp_processor_id() == 0 &&
+	    cpuidle_get_driver() == &acpi_idle_driver) {
 
 		cpuidle_pause_and_lock();
 		/* Protect against cpu-hotplug */



* Re: [Regression 3.4] tick_broadcast_mask is not restored after a CPU has been offline/onlined
  2012-07-30 17:42       ` Paul E. McKenney
@ 2012-07-31  3:18         ` Feng Tang
  2012-07-31  4:09           ` Paul E. McKenney
  0 siblings, 1 reply; 6+ messages in thread
From: Feng Tang @ 2012-07-31  3:18 UTC
  To: paulmck
  Cc: Paul E. McKenney, Len Brown, Rafael J. Wysocki,
	Linux Kernel Mail List

Hi Paul,

On Mon, 30 Jul 2012 10:42:18 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Mon, Jul 30, 2012 at 10:08:47AM -0700, Paul E. McKenney wrote:
> > On Mon, Jul 30, 2012 at 11:07:47PM +0800, Feng Tang wrote:
> > > Hi Paul,
> > > 
> > > On Mon, 30 Jul 2012 06:39:13 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > 
> > > > On Mon, Jul 30, 2012 at 03:15:59PM +0800, Feng Tang wrote:
> > > > > Hi All,
> > > > > 
> > > > > When I debugged a suspend/resume bug, I found that
> > > > > tick_broadcast_mask is not restored for a CPU after it is
> > > > > offline/onlined since kernel 3.4, while it's fine for 3.3.
> > > > 
> > > > Could you please try 3.5?
> > > 
> > > Yes, it's the same for 3.5
> > 
> > Thank you for checking, Feng.
> > 
> > Len, the comment above the change says:
> > 
> > 	/*
> > 	 * FIXME:  Design the ACPI notification to make it once per
> > 	 * system instead of once per-cpu.  This condition is a hack
> > 	 * to make the code that updates C-States be called once.
> > 	 */
> > 
> > Is it time for this design-level change?  Or is there something obvious
> > that I missed when fixing the smp_processor_id() splat?
> > 
> > I could revert back, but use raw_smp_processor_id() rather than
> > smp_processor_id(), but that feels like papering over a problem rather
> > than fixing it.
> 
> But should papering be appropriate, here is the patch.
> 
> 							Thanx, Paul

I just found a typo in ACPI's processor_driver.c and have a patch to fix it,
which could also fix this tick_broadcast_mask issue.

The patch is at https://lkml.org/lkml/2012/7/30/483

So I think we don't need this "papering over" patch :)

Thanks,
Feng

> 
> ------------------------------------------------------------------------
> 
> ACPI: Repair fix to unprotected smp_processor_id()
> 
> Commit 9505626d (ACPI: Fix unprotected smp_processor_id() in
> acpi_processor_cst_has_changed()) introduced a suspend/resume bug.
> This commit therefore introduces a bug-for-bug compatible fix for the
> original problem.
> 
> Reported-by: Feng Tang <feng.tang@intel.com>
> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> 
> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> index 47a8caa..19c151a 100644
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -1218,7 +1218,8 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr)
>  	 * to make the code that updates C-States be called once.
>  	 */
>  
> -	if (pr->id == 0 && cpuidle_get_driver() == &acpi_idle_driver) {
> +	if (raw_smp_processor_id() == 0 &&
> +	    cpuidle_get_driver() == &acpi_idle_driver) {
>  
>  		cpuidle_pause_and_lock();
>  		/* Protect against cpu-hotplug */
> 


* Re: [Regression 3.4] tick_broadcast_mask is not restored after a CPU has been offline/onlined
  2012-07-31  3:18         ` Feng Tang
@ 2012-07-31  4:09           ` Paul E. McKenney
  0 siblings, 0 replies; 6+ messages in thread
From: Paul E. McKenney @ 2012-07-31  4:09 UTC
  To: Feng Tang
  Cc: Paul E. McKenney, Len Brown, Rafael J. Wysocki,
	Linux Kernel Mail List

On Tue, Jul 31, 2012 at 11:18:32AM +0800, Feng Tang wrote:
> Hi Paul,
> 
> On Mon, 30 Jul 2012 10:42:18 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Mon, Jul 30, 2012 at 10:08:47AM -0700, Paul E. McKenney wrote:
> > > On Mon, Jul 30, 2012 at 11:07:47PM +0800, Feng Tang wrote:
> > > > Hi Paul,
> > > > 
> > > > On Mon, 30 Jul 2012 06:39:13 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > > 
> > > > > On Mon, Jul 30, 2012 at 03:15:59PM +0800, Feng Tang wrote:
> > > > > > Hi All,
> > > > > > 
> > > > > > When I debugged a suspend/resume bug, I found that
> > > > > > tick_broadcast_mask is not restored for a CPU after it is
> > > > > > offline/onlined since kernel 3.4, while it's fine for 3.3.
> > > > > 
> > > > > Could you please try 3.5?
> > > > 
> > > > Yes, it's the same for 3.5
> > > 
> > > Thank you for checking, Feng.
> > > 
> > > Len, the comment above the change says:
> > > 
> > > 	/*
> > > 	 * FIXME:  Design the ACPI notification to make it once per
> > > 	 * system instead of once per-cpu.  This condition is a hack
> > > 	 * to make the code that updates C-States be called once.
> > > 	 */
> > > 
> > > Is it time for this design-level change?  Or is there something obvious
> > > that I missed when fixing the smp_processor_id() splat?
> > > 
> > > I could revert back, but use raw_smp_processor_id() rather than
> > > smp_processor_id(), but that feels like papering over a problem rather
> > > than fixing it.
> > 
> > But should papering be appropriate, here is the patch.
> > 
> > 							Thanx, Paul
> 
> I just found a typo in ACPI's processor_driver.c and have a patch to fix it,
> which could also fix this tick_broadcast_mask issue.
> 
> The patch is at https://lkml.org/lkml/2012/7/30/483
> 
> So I think we don't need this "papering over" patch :)

Very good, I have dropped it.

							Thanx, Paul

> Thanks,
> Feng
> 
> > 
> > ------------------------------------------------------------------------
> > 
> > ACPI: Repair fix to unprotected smp_processor_id()
> > 
> > Commit 9505626d (ACPI: Fix unprotected smp_processor_id() in
> > acpi_processor_cst_has_changed()) introduced a suspend/resume bug.
> > This commit therefore introduces a bug-for-bug compatible fix for the
> > original problem.
> > 
> > Reported-by: Feng Tang <feng.tang@intel.com>
> > Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> > 
> > diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> > index 47a8caa..19c151a 100644
> > --- a/drivers/acpi/processor_idle.c
> > +++ b/drivers/acpi/processor_idle.c
> > @@ -1218,7 +1218,8 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr)
> >  	 * to make the code that updates C-States be called once.
> >  	 */
> >  
> > -	if (pr->id == 0 && cpuidle_get_driver() == &acpi_idle_driver) {
> > +	if (raw_smp_processor_id() == 0 &&
> > +	    cpuidle_get_driver() == &acpi_idle_driver) {
> >  
> >  		cpuidle_pause_and_lock();
> >  		/* Protect against cpu-hotplug */
> > 
> 



Thread overview: 6+ messages
     [not found] <20120730151559.772d4055@feng-i7>
2012-07-30 13:39 ` [Regression 3.4] tick_broadcast_mask is not restored after a CPU has been offline/onlined Paul E. McKenney
2012-07-30 15:07   ` Feng Tang
2012-07-30 17:08     ` Paul E. McKenney
2012-07-30 17:42       ` Paul E. McKenney
2012-07-31  3:18         ` Feng Tang
2012-07-31  4:09           ` Paul E. McKenney
