* Re: [Regression 3.4] tick_broadcast_mask is not restored after a CPU has been offline/onlined
From: Paul E. McKenney @ 2012-07-30 13:39 UTC
To: Feng Tang
Cc: Paul E. McKenney, Len Brown, Rafael J. Wysocki,
Linux Kernel Mail List
On Mon, Jul 30, 2012 at 03:15:59PM +0800, Feng Tang wrote:
> Hi All,
>
> When I debugged a suspend/resume bug, I found that tick_broadcast_mask is not
> restored for a CPU after it is offline/onlined since kernel 3.4, while it's
> fine for 3.3.
Could you please try 3.5?
> Further checking shows it is caused by commit 9505626d7bfe
> ACPI: Fix unprotected smp_processor_id() in acpi_processor_cst_has_changed()
>
> The acpi_processor_cst_has_changed() function is invoked from a
> CPU_ONLINE or CPU_DEAD function, which might well execute on CPU 0
> even though the CPU being hotplugged is some other CPU. In addition,
> acpi_processor_cst_has_changed() invokes smp_processor_id() without
> protection, resulting in splats when onlining CPUs.
>
> This commit therefore changes the smp_processor_id() to pr->id, as is
> used elsewhere in the code, for example, in acpi_processor_add().
>
> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
>
> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> index 0e8e2de..9e57b06 100644
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -1159,8 +1159,7 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr)
>  	 * to make the code that updates C-States be called once.
>  	 */
>
> -	if (smp_processor_id() == 0 &&
> -	    cpuidle_get_driver() == &acpi_idle_driver) {
> +	if (pr->id == 0 && cpuidle_get_driver() == &acpi_idle_driver) {
>
>  		cpuidle_pause_and_lock();
>  		/* Protect against cpu-hotplug */
>
> The root cause is that acpi_processor_cst_has_changed() is also called when
> cpu_up() runs on CPU 0 to bring up another CPU. This commit prevents the
> following code from running for that CPU, which has side effects such as
> the broadcast mask not being restored.
>
> I am raising this problem, and I don't know if a revert is a good solution here.
Indeed, that would re-introduce the splats from unprotected use of
smp_processor_id(). :-(
Thanx, Paul
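
For readers following the thread, the difference between the two conditions can be sketched in plain userspace C. This is only a model under stated assumptions: `model_processor`, `runs_update_old()`, and `runs_update_new()` are hypothetical names, not kernel symbols, and the `cpuidle_get_driver()` half of the real condition is left out because it is the same before and after the commit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct acpi_processor; only the id matters here. */
struct model_processor {
	int id;	/* CPU that this processor object describes */
};

/* Condition before commit 9505626d7bfe: depends on the CPU running the code. */
bool runs_update_old(int executing_cpu, const struct model_processor *pr)
{
	(void)pr;
	return executing_cpu == 0;
}

/* Condition after commit 9505626d7bfe: depends on the CPU being hotplugged. */
bool runs_update_new(int executing_cpu, const struct model_processor *pr)
{
	(void)executing_cpu;
	return pr->id == 0;
}

/* Scenario from the report: code running on CPU 0 brings CPU 1 back online. */
void check_hotplug_scenario(void)
{
	struct model_processor cpu1 = { .id = 1 };

	assert(runs_update_old(0, &cpu1));	/* update path was taken */
	assert(!runs_update_new(0, &cpu1));	/* update path is now skipped */
}
```

In this model, onlining CPU 1 from CPU 0 takes the update path only under the old condition, which matches the behaviour change described in the report.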
* Re: [Regression 3.4] tick_broadcast_mask is not restored after a CPU has been offline/onlined
From: Feng Tang @ 2012-07-30 15:07 UTC
To: paulmck
Cc: Paul E. McKenney, Len Brown, Rafael J. Wysocki,
Linux Kernel Mail List, linux-kernel
Hi Paul,
On Mon, 30 Jul 2012 06:39:13 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> On Mon, Jul 30, 2012 at 03:15:59PM +0800, Feng Tang wrote:
> > Hi All,
> >
> > When I debugged a suspend/resume bug, I found that tick_broadcast_mask is
> > not restored for a CPU after it is offline/onlined since kernel 3.4, while
> > it's fine for 3.3.
>
> Could you please try 3.5?
Yes, it's the same for 3.5
Thanks,
Feng
* Re: [Regression 3.4] tick_broadcast_mask is not restored after a CPU has been offline/onlined
From: Paul E. McKenney @ 2012-07-30 17:08 UTC
To: Feng Tang
Cc: Paul E. McKenney, Len Brown, Rafael J. Wysocki,
Linux Kernel Mail List
On Mon, Jul 30, 2012 at 11:07:47PM +0800, Feng Tang wrote:
> Hi Paul,
>
> On Mon, 30 Jul 2012 06:39:13 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
>
> > On Mon, Jul 30, 2012 at 03:15:59PM +0800, Feng Tang wrote:
> > > Hi All,
> > >
> > > When I debugged a suspend/resume bug, I found that tick_broadcast_mask is
> > > not restored for a CPU after it is offline/onlined since kernel 3.4, while
> > > it's fine for 3.3.
> >
> > Could you please try 3.5?
>
> Yes, it's the same for 3.5
Thank you for checking, Feng.
Len, the comment above the change says:
/*
* FIXME: Design the ACPI notification to make it once per
* system instead of once per-cpu. This condition is a hack
* to make the code that updates C-States be called once.
*/
Is it time for this design-level change? Or is there something obvious
that I missed when fixing the smp_processor_id() splat?
I could revert back, but use raw_smp_processor_id() rather than
smp_processor_id(), but that feels like papering over a problem rather
than fixing it.
Thoughts?
Thanx, Paul
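
The distinction between the two accessors mentioned above can be illustrated with a small userspace model. This is a simplified sketch, not kernel code: `model_cpu_state`, `model_raw_cpu_id()`, and `model_checked_cpu_id()` are hypothetical names, and the real debug check behind smp_processor_id() has more exemptions than the single preemption test modelled here.

```c
#include <assert.h>

/* Hypothetical model state, not a kernel data structure. */
struct model_cpu_state {
	int current_cpu;	/* CPU the task is running on right now */
	int preempt_count;	/* greater than 0 means preemption is disabled */
	int splat_count;	/* warnings emitted by the checked accessor */
};

/* Models raw_smp_processor_id(): returns the CPU number with no checking. */
int model_raw_cpu_id(const struct model_cpu_state *s)
{
	return s->current_cpu;
}

/* Models smp_processor_id() with debugging enabled: warns if preemptible. */
int model_checked_cpu_id(struct model_cpu_state *s)
{
	if (s->preempt_count == 0)
		s->splat_count++;	/* the "splat" discussed in the thread */
	return s->current_cpu;
}

/* Both accessors return the same number; only the checked one complains. */
void check_accessor_model(void)
{
	struct model_cpu_state s = { .current_cpu = 0 };

	assert(model_raw_cpu_id(&s) == 0);
	assert(s.splat_count == 0);
	assert(model_checked_cpu_id(&s) == 0);
	assert(s.splat_count == 1);
}
```

The model shows why switching to the raw accessor silences the warning without making the returned CPU number any more stable, which is the sense in which it papers over the problem.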
* Re: [Regression 3.4] tick_broadcast_mask is not restored after a CPU has been offline/onlined
From: Paul E. McKenney @ 2012-07-30 17:42 UTC
To: Feng Tang
Cc: Paul E. McKenney, Len Brown, Rafael J. Wysocki,
Linux Kernel Mail List
On Mon, Jul 30, 2012 at 10:08:47AM -0700, Paul E. McKenney wrote:
> On Mon, Jul 30, 2012 at 11:07:47PM +0800, Feng Tang wrote:
> > Hi Paul,
> >
> > On Mon, 30 Jul 2012 06:39:13 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >
> > > On Mon, Jul 30, 2012 at 03:15:59PM +0800, Feng Tang wrote:
> > > > Hi All,
> > > >
> > > > When I debugged a suspend/resume bug, I found that tick_broadcast_mask is
> > > > not restored for a CPU after it is offline/onlined since kernel 3.4, while
> > > > it's fine for 3.3.
> > >
> > > Could you please try 3.5?
> >
> > Yes, it's the same for 3.5
>
> Thank you for checking, Feng.
>
> Len, the comment above the change says:
>
> /*
> * FIXME: Design the ACPI notification to make it once per
> * system instead of once per-cpu. This condition is a hack
> * to make the code that updates C-States be called once.
> */
>
> Is it time for this design-level change? Or is there something obvious
> that I missed when fixing the smp_processor_id() splat?
>
> I could revert back, but use raw_smp_processor_id() rather than
> smp_processor_id(), but that feels like papering over a problem rather
> than fixing it.
But should papering be appropriate, here is the patch.
Thanx, Paul
------------------------------------------------------------------------
ACPI: Repair fix to unprotected smp_processor_id()
Commit 9505626d (ACPI: Fix unprotected smp_processor_id() in
acpi_processor_cst_has_changed()) introduced a suspend/resume bug.
This commit therefore introduces a bug-for-bug compatible fix for the
original problem.
Reported-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 47a8caa..19c151a 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -1218,7 +1218,8 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr)
 	 * to make the code that updates C-States be called once.
 	 */
-	if (pr->id == 0 && cpuidle_get_driver() == &acpi_idle_driver) {
+	if (raw_smp_processor_id() == 0 &&
+	    cpuidle_get_driver() == &acpi_idle_driver) {
 		cpuidle_pause_and_lock();
 		/* Protect against cpu-hotplug */
* Re: [Regression 3.4] tick_broadcast_mask is not restored after a CPU has been offline/onlined
From: Feng Tang @ 2012-07-31 3:18 UTC
To: paulmck
Cc: Paul E. McKenney, Len Brown, Rafael J. Wysocki,
Linux Kernel Mail List
Hi Paul,
On Mon, 30 Jul 2012 10:42:18 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> On Mon, Jul 30, 2012 at 10:08:47AM -0700, Paul E. McKenney wrote:
> > On Mon, Jul 30, 2012 at 11:07:47PM +0800, Feng Tang wrote:
> > > Hi Paul,
> > >
> > > On Mon, 30 Jul 2012 06:39:13 -0700
> > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > >
> > > > On Mon, Jul 30, 2012 at 03:15:59PM +0800, Feng Tang wrote:
> > > > > Hi All,
> > > > >
> > > > > When I debugged a suspend/resume bug, I found that
> > > > > tick_broadcast_mask is not restored for a CPU after it is
> > > > > offline/onlined since kernel 3.4, while it's fine for 3.3.
> > > >
> > > > Could you please try 3.5?
> > >
> > > Yes, it's the same for 3.5
> >
> > Thank you for checking, Feng.
> >
> > Len, the comment above the change says:
> >
> > /*
> > * FIXME: Design the ACPI notification to make it once per
> > * system instead of once per-cpu. This condition is a hack
> > * to make the code that updates C-States be called once.
> > */
> >
> > Is it time for this design-level change? Or is there something obvious
> > that I missed when fixing the smp_processor_id() splat?
> >
> > I could revert back, but use raw_smp_processor_id() rather than
> > smp_processor_id(), but that feels like papering over a problem rather
> > than fixing it.
>
> But should papering be appropriate, here is the patch.
>
> Thanx, Paul
I just found a typo in the ACPI processor_driver.c and have a patch to fix it,
which could also fix this tick_broadcast_mask issue.
The patch is at https://lkml.org/lkml/2012/7/30/483
So I think we don't need this "papering over" patch :)
Thanks,
Feng
>
> ------------------------------------------------------------------------
>
> ACPI: Repair fix to unprotected smp_processor_id()
>
> Commit 9505626d (ACPI: Fix unprotected smp_processor_id() in
> acpi_processor_cst_has_changed()) introduced a suspend/resume bug.
> This commit therefore introduces a bug-for-bug compatible fix for the
> original problem.
>
> Reported-by: Feng Tang <feng.tang@intel.com>
> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
>
> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> index 47a8caa..19c151a 100644
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -1218,7 +1218,8 @@ int acpi_processor_cst_has_changed(struct
> acpi_processor *pr)
> * to make the code that updates C-States be called once.
> */
>
> - if (pr->id == 0 && cpuidle_get_driver() == &acpi_idle_driver) {
> + if (raw_smp_processor_id() == 0 &&
> + cpuidle_get_driver() == &acpi_idle_driver) {
>
> cpuidle_pause_and_lock();
> /* Protect against cpu-hotplug */
>
* Re: [Regression 3.4] tick_broadcast_mask is not restored after a CPU has been offline/onlined
From: Paul E. McKenney @ 2012-07-31 4:09 UTC
To: Feng Tang
Cc: Paul E. McKenney, Len Brown, Rafael J. Wysocki,
Linux Kernel Mail List
On Tue, Jul 31, 2012 at 11:18:32AM +0800, Feng Tang wrote:
> Hi Paul,
>
> On Mon, 30 Jul 2012 10:42:18 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
>
> > On Mon, Jul 30, 2012 at 10:08:47AM -0700, Paul E. McKenney wrote:
> > > On Mon, Jul 30, 2012 at 11:07:47PM +0800, Feng Tang wrote:
> > > > Hi Paul,
> > > >
> > > > On Mon, 30 Jul 2012 06:39:13 -0700
> > > > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> > > >
> > > > > On Mon, Jul 30, 2012 at 03:15:59PM +0800, Feng Tang wrote:
> > > > > > Hi All,
> > > > > >
> > > > > > When I debugged a suspend/resume bug, I found that
> > > > > > tick_broadcast_mask is not restored for a CPU after it is
> > > > > > offline/onlined since kernel 3.4, while it's fine for 3.3.
> > > > >
> > > > > Could you please try 3.5?
> > > >
> > > > Yes, it's the same for 3.5
> > >
> > > Thank you for checking, Feng.
> > >
> > > Len, the comment above the change says:
> > >
> > > /*
> > > * FIXME: Design the ACPI notification to make it once per
> > > * system instead of once per-cpu. This condition is a hack
> > > * to make the code that updates C-States be called once.
> > > */
> > >
> > > Is it time for this design-level change? Or is there something obvious
> > > that I missed when fixing the smp_processor_id() splat?
> > >
> > > I could revert back, but use raw_smp_processor_id() rather than
> > > smp_processor_id(), but that feels like papering over a problem rather
> > > than fixing it.
> >
> > But should papering be appropriate, here is the patch.
> >
> > Thanx, Paul
>
> I just found a typo in the ACPI processor_driver.c and have a patch to fix it,
> which could also fix this tick_broadcast_mask issue.
>
> The patch is at https://lkml.org/lkml/2012/7/30/483
>
> So I think we don't need this "papering over" patch :)
Very good, I have dropped it.
Thanx, Paul
> Thanks,
> Feng
>
> >
> > ------------------------------------------------------------------------
> >
> > ACPI: Repair fix to unprotected smp_processor_id()
> >
> > Commit 9505626d (ACPI: Fix unprotected smp_processor_id() in
> > acpi_processor_cst_has_changed()) introduced a suspend/resume bug.
> > This commit therefore introduces a bug-for-bug compatible fix for the
> > original problem.
> >
> > Reported-by: Feng Tang <feng.tang@intel.com>
> > Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> >
> > diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> > index 47a8caa..19c151a 100644
> > --- a/drivers/acpi/processor_idle.c
> > +++ b/drivers/acpi/processor_idle.c
> > @@ -1218,7 +1218,8 @@ int acpi_processor_cst_has_changed(struct
> > acpi_processor *pr)
> > * to make the code that updates C-States be called once.
> > */
> >
> > - if (pr->id == 0 && cpuidle_get_driver() == &acpi_idle_driver) {
> > + if (raw_smp_processor_id() == 0 &&
> > + cpuidle_get_driver() == &acpi_idle_driver) {
> >
> > cpuidle_pause_and_lock();
> > /* Protect against cpu-hotplug */
> >
>