* Re: [Regression 3.4] tick_broadcast_mask is not restored after a CPU has been offline/onlined
       [not found] <20120730151559.772d4055@feng-i7>
From: Paul E. McKenney @ 2012-07-30 13:39 UTC
  To: Feng Tang
  Cc: Paul E. McKenney, Len Brown, Rafael J. Wysocki, Linux Kernel Mail List

On Mon, Jul 30, 2012 at 03:15:59PM +0800, Feng Tang wrote:
> Hi All,
>
> When I debugged a suspend/resume bug, I found that tick_broadcast_mask is not
> restored for a CPU after it is offline/onlined since kernel 3.4, while it's
> fine for 3.3.

Could you please try 3.5?

> Further checking shows it is caused by the commit 9505626d7bfe
>     ACPI: Fix unprotected smp_processor_id() in acpi_processor_cst_has_changed()
>
>     The acpi_processor_cst_has_changed() function is invoked from a
>     CPU_ONLINE or CPU_DEAD function, which might well execute on CPU 0
>     even though the CPU being hotplugged is some other CPU.  In addition,
>     acpi_processor_cst_has_changed() invokes smp_processor_id() without
>     protection, resulting in splats when onlining CPUs.
>
>     This commit therefore changes the smp_processor_id() to pr->id, as is
>     used elsewhere in the code, for example, in acpi_processor_add().
>
>     Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
>
> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> index 0e8e2de..9e57b06 100644
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -1159,8 +1159,7 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr)
>       * to make the code that updates C-States be called once.
>       */
>
> -    if (smp_processor_id() == 0 &&
> -            cpuidle_get_driver() == &acpi_idle_driver) {
> +    if (pr->id == 0 && cpuidle_get_driver() == &acpi_idle_driver) {
>
>          cpuidle_pause_and_lock();
>          /* Protect against cpu-hotplug */
>
> The root cause is that acpi_processor_cst_has_changed() will also be called
> when cpu_up() is run on cpu 0 to boot up another cpu. This commit prevents the
> following code from being run for that cpu, which triggers side effects such
> as the broadcast_mask not being restored.
>
> I am raising this problem, and I don't know if a revert is a good solution here.

Indeed, that would re-introduce the splats from unprotected use of
smp_processor_id().  :-(

                                                        Thanx, Paul
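The behavioral difference described in the message above can be illustrated with a small userspace model. This is only a sketch, not kernel code: `struct model_processor`, `gate_by_running_cpu()` and `gate_by_target_cpu()` are hypothetical names invented for illustration, and the sketch assumes the scenario from the report, in which the hotplug callback executes on CPU 0 while some other CPU is being brought online.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the kernel's struct acpi_processor.
 * Only the id field matters for this illustration.
 */
struct model_processor {
    int id; /* CPU that this ACPI processor object describes */
};

/*
 * Models the condition before commit 9505626d: the gate tests the CPU
 * that is executing the callback (what smp_processor_id() returned).
 */
static bool gate_by_running_cpu(int running_cpu)
{
    return running_cpu == 0;
}

/*
 * Models the condition after commit 9505626d: the gate tests the CPU
 * that is being hotplugged (pr->id).
 */
static bool gate_by_target_cpu(const struct model_processor *pr)
{
    assert(pr); /* the model requires a valid processor object */
    return pr->id == 0;
}
```

With the callback running on CPU 0 and `pr->id == 1`, the first predicate is true and the second is false, so the gated C-state update path that used to run when onlining CPU 1 is skipped. That matches the side effect reported in the thread, under the assumptions stated above.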
* Re: [Regression 3.4] tick_broadcast_mask is not restored after a CPU has been offline/onlined
From: Feng Tang @ 2012-07-30 15:07 UTC
  To: paulmck
  Cc: Paul E. McKenney, Len Brown, Rafael J. Wysocki, Linux Kernel Mail List, linux-kernel

Hi Paul,

On Mon, 30 Jul 2012 06:39:13 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Mon, Jul 30, 2012 at 03:15:59PM +0800, Feng Tang wrote:
> > Hi All,
> >
> > When I debugged a suspend/resume bug, I found that tick_broadcast_mask is
> > not restored for a CPU after it is offline/onlined since kernel 3.4, while
> > it's fine for 3.3.
>
> Could you please try 3.5?

Yes, it's the same for 3.5

Thanks,
Feng
* Re: [Regression 3.4] tick_broadcast_mask is not restored after a CPU has been offline/onlined
From: Paul E. McKenney @ 2012-07-30 17:08 UTC
  To: Feng Tang
  Cc: Paul E. McKenney, Len Brown, Rafael J. Wysocki, Linux Kernel Mail List

On Mon, Jul 30, 2012 at 11:07:47PM +0800, Feng Tang wrote:
> Hi Paul,
>
> On Mon, 30 Jul 2012 06:39:13 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
>
> > On Mon, Jul 30, 2012 at 03:15:59PM +0800, Feng Tang wrote:
> > > Hi All,
> > >
> > > When I debugged a suspend/resume bug, I found that tick_broadcast_mask is
> > > not restored for a CPU after it is offline/onlined since kernel 3.4, while
> > > it's fine for 3.3.
> >
> > Could you please try 3.5?
>
> Yes, it's the same for 3.5

Thank you for checking, Feng.

Len, the comment above the change says:

    /*
     * FIXME: Design the ACPI notification to make it once per
     * system instead of once per-cpu. This condition is a hack
     * to make the code that updates C-States be called once.
     */

Is it time for this design-level change?  Or is there something obvious
that I missed when fixing the smp_processor_id() splat?

I could revert back, but use raw_smp_processor_id() rather than
smp_processor_id(), but that feels like papering over a problem rather
than fixing it.

Thoughts?

                                                        Thanx, Paul
* Re: [Regression 3.4] tick_broadcast_mask is not restored after a CPU has been offline/onlined
From: Paul E. McKenney @ 2012-07-30 17:42 UTC
  To: Feng Tang
  Cc: Paul E. McKenney, Len Brown, Rafael J. Wysocki, Linux Kernel Mail List

On Mon, Jul 30, 2012 at 10:08:47AM -0700, Paul E. McKenney wrote:
> On Mon, Jul 30, 2012 at 11:07:47PM +0800, Feng Tang wrote:
> > Hi Paul,
> >
> > On Mon, 30 Jul 2012 06:39:13 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >
> > > On Mon, Jul 30, 2012 at 03:15:59PM +0800, Feng Tang wrote:
> > > > Hi All,
> > > >
> > > > When I debugged a suspend/resume bug, I found that tick_broadcast_mask is
> > > > not restored for a CPU after it is offline/onlined since kernel 3.4, while
> > > > it's fine for 3.3.
> > >
> > > Could you please try 3.5?
> >
> > Yes, it's the same for 3.5
>
> Thank you for checking, Feng.
>
> Len, the comment above the change says:
>
>     /*
>      * FIXME: Design the ACPI notification to make it once per
>      * system instead of once per-cpu. This condition is a hack
>      * to make the code that updates C-States be called once.
>      */
>
> Is it time for this design-level change?  Or is there something obvious
> that I missed when fixing the smp_processor_id() splat?
>
> I could revert back, but use raw_smp_processor_id() rather than
> smp_processor_id(), but that feels like papering over a problem rather
> than fixing it.

But should papering be appropriate, here is the patch.

                                                        Thanx, Paul

------------------------------------------------------------------------

ACPI: Repair fix to unprotected smp_processor_id()

Commit 9505626d (ACPI: Fix unprotected smp_processor_id() in
acpi_processor_cst_has_changed()) introduced a suspend/resume bug.
This commit therefore introduces a bug-for-bug compatible fix for the
original problem.

Reported-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>

diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 47a8caa..19c151a 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -1218,7 +1218,8 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr)
      * to make the code that updates C-States be called once.
      */

-    if (pr->id == 0 && cpuidle_get_driver() == &acpi_idle_driver) {
+    if (raw_smp_processor_id() == 0 &&
+        cpuidle_get_driver() == &acpi_idle_driver) {

         cpuidle_pause_and_lock();
         /* Protect against cpu-hotplug */
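The reason raw_smp_processor_id() is described as "papering over" can be sketched with a userspace model. This is a rough illustration, not the kernel implementation: `struct model_ctx` and both `model_*` functions are hypothetical. It assumes a debug-enabled kernel in which smp_processor_id() complains when called from preemptible context, while the raw variant returns the same CPU number without any check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical execution context for the model (not a kernel structure). */
struct model_ctx {
    int cpu;          /* CPU the caller is currently running on */
    bool preemptible; /* true if the caller could migrate at any time */
    int splats;       /* number of debug warnings the model has "printed" */
};

/* Models raw_smp_processor_id(): returns the current CPU, no checking. */
static int model_raw_smp_processor_id(const struct model_ctx *ctx)
{
    assert(ctx);
    return ctx->cpu;
}

/*
 * Models smp_processor_id() with preemption debugging enabled: the
 * answer is the same, but a preemptible caller triggers a warning
 * because the value may be stale as soon as it is returned.
 */
static int model_smp_processor_id(struct model_ctx *ctx)
{
    assert(ctx);
    if (ctx->preemptible)
        ctx->splats++; /* stands in for the kernel's warning splat */
    return ctx->cpu;
}
```

Both functions return the same value in the model. The raw variant only silences the warning; it does not make the caller any less preemptible, which is the concern raised in the message above.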
* Re: [Regression 3.4] tick_broadcast_mask is not restored after a CPU has been offline/onlined
From: Feng Tang @ 2012-07-31 3:18 UTC
  To: paulmck
  Cc: Paul E. McKenney, Len Brown, Rafael J. Wysocki, Linux Kernel Mail List

Hi Paul,

On Mon, 30 Jul 2012 10:42:18 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Mon, Jul 30, 2012 at 10:08:47AM -0700, Paul E. McKenney wrote:
> > On Mon, Jul 30, 2012 at 11:07:47PM +0800, Feng Tang wrote:
> > > Hi Paul,
> > >
> > > On Mon, 30 Jul 2012 06:39:13 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >
> > > > On Mon, Jul 30, 2012 at 03:15:59PM +0800, Feng Tang wrote:
> > > > > Hi All,
> > > > >
> > > > > When I debugged a suspend/resume bug, I found that
> > > > > tick_broadcast_mask is not restored for a CPU after it is
> > > > > offline/onlined since kernel 3.4, while it's fine for 3.3.
> > > >
> > > > Could you please try 3.5?
> > >
> > > Yes, it's the same for 3.5
> >
> > Thank you for checking, Feng.
> >
> > Len, the comment above the change says:
> >
> >     /*
> >      * FIXME: Design the ACPI notification to make it once per
> >      * system instead of once per-cpu. This condition is a hack
> >      * to make the code that updates C-States be called once.
> >      */
> >
> > Is it time for this design-level change?  Or is there something obvious
> > that I missed when fixing the smp_processor_id() splat?
> >
> > I could revert back, but use raw_smp_processor_id() rather than
> > smp_processor_id(), but that feels like papering over a problem rather
> > than fixing it.
>
> But should papering be appropriate, here is the patch.
>
>                                                       Thanx, Paul

Just found and have a patch to fix a typo in acpi processor_driver.c, which
could also fix this tick_broadcast_mask issue.

Patch is in https://lkml.org/lkml/2012/7/30/483

So I think we don't need this "papering over" patch :)

Thanks,
Feng

>
> ------------------------------------------------------------------------
>
> ACPI: Repair fix to unprotected smp_processor_id()
>
> Commit 9505626d (ACPI: Fix unprotected smp_processor_id() in
> acpi_processor_cst_has_changed()) introduced a suspend/resume bug.
> This commit therefore introduces a bug-for-bug compatible fix for the
> original problem.
>
> Reported-by: Feng Tang <feng.tang@intel.com>
> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
>
> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> index 47a8caa..19c151a 100644
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -1218,7 +1218,8 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr)
>       * to make the code that updates C-States be called once.
>       */
>
> -    if (pr->id == 0 && cpuidle_get_driver() == &acpi_idle_driver) {
> +    if (raw_smp_processor_id() == 0 &&
> +        cpuidle_get_driver() == &acpi_idle_driver) {
>
>          cpuidle_pause_and_lock();
>          /* Protect against cpu-hotplug */
>
* Re: [Regression 3.4] tick_broadcast_mask is not restored after a CPU has been offline/onlined
From: Paul E. McKenney @ 2012-07-31 4:09 UTC
  To: Feng Tang
  Cc: Paul E. McKenney, Len Brown, Rafael J. Wysocki, Linux Kernel Mail List

On Tue, Jul 31, 2012 at 11:18:32AM +0800, Feng Tang wrote:
> Hi Paul,
>
> On Mon, 30 Jul 2012 10:42:18 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
>
> > On Mon, Jul 30, 2012 at 10:08:47AM -0700, Paul E. McKenney wrote:
> > > On Mon, Jul 30, 2012 at 11:07:47PM +0800, Feng Tang wrote:
> > > > Hi Paul,
> > > >
> > > > On Mon, 30 Jul 2012 06:39:13 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > >
> > > > > On Mon, Jul 30, 2012 at 03:15:59PM +0800, Feng Tang wrote:
> > > > > > Hi All,
> > > > > >
> > > > > > When I debugged a suspend/resume bug, I found that
> > > > > > tick_broadcast_mask is not restored for a CPU after it is
> > > > > > offline/onlined since kernel 3.4, while it's fine for 3.3.
> > > > >
> > > > > Could you please try 3.5?
> > > >
> > > > Yes, it's the same for 3.5
> > >
> > > Thank you for checking, Feng.
> > >
> > > Len, the comment above the change says:
> > >
> > >     /*
> > >      * FIXME: Design the ACPI notification to make it once per
> > >      * system instead of once per-cpu. This condition is a hack
> > >      * to make the code that updates C-States be called once.
> > >      */
> > >
> > > Is it time for this design-level change?  Or is there something obvious
> > > that I missed when fixing the smp_processor_id() splat?
> > >
> > > I could revert back, but use raw_smp_processor_id() rather than
> > > smp_processor_id(), but that feels like papering over a problem rather
> > > than fixing it.
> >
> > But should papering be appropriate, here is the patch.
> >
> >                                                     Thanx, Paul
>
> Just found and have a patch to fix a typo in acpi processor_driver.c, which
> could also fix this tick_broadcast_mask issue.
>
> Patch is in https://lkml.org/lkml/2012/7/30/483
>
> So I think we don't need this "papering over" patch :)

Very good, I have dropped it.

                                                        Thanx, Paul

> Thanks,
> Feng
>
> >
> > ------------------------------------------------------------------------
> >
> > ACPI: Repair fix to unprotected smp_processor_id()
> >
> > Commit 9505626d (ACPI: Fix unprotected smp_processor_id() in
> > acpi_processor_cst_has_changed()) introduced a suspend/resume bug.
> > This commit therefore introduces a bug-for-bug compatible fix for the
> > original problem.
> >
> > Reported-by: Feng Tang <feng.tang@intel.com>
> > Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> >
> > diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> > index 47a8caa..19c151a 100644
> > --- a/drivers/acpi/processor_idle.c
> > +++ b/drivers/acpi/processor_idle.c
> > @@ -1218,7 +1218,8 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr)
> >       * to make the code that updates C-States be called once.
> >       */
> >
> > -    if (pr->id == 0 && cpuidle_get_driver() == &acpi_idle_driver) {
> > +    if (raw_smp_processor_id() == 0 &&
> > +        cpuidle_get_driver() == &acpi_idle_driver) {
> >
> >          cpuidle_pause_and_lock();
> >          /* Protect against cpu-hotplug */
> >
>