linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH]x86, reboot: Fix a warning message triggered by stop_other_cpus()
@ 2012-05-22  2:52 Feng Tang
  2012-05-22 13:33 ` Don Zickus
  0 siblings, 1 reply; 6+ messages in thread
From: Feng Tang @ 2012-05-22  2:52 UTC (permalink / raw)
  To: Don Zickus, Ingo Molnar, Peter Zijlstra, x86, linux-kernel

>From 96214a16a0e0da523ae770c30a8641dc8bec89ce Mon Sep 17 00:00:00 2001
From: Feng Tang <feng.tang@intel.com>
Date: Tue, 22 May 2012 09:50:40 +0800
Subject: [PATCH] x86, reboot: Fix a warning message triggered by
 stop_other_cpus()

When rebooting our 24 CPU Westmere servers with 3.4-rc6, we can always see
such warning msg

Restarting system.
machine restart
------------[ cut here ]------------
WARNING: at arch/x86/kernel/smp.c:125 native_smp_send_reschedule+0x74/0xa7()
Hardware name: X8DTN
Modules linked in: igb [last unloaded: scsi_wait_scan]
Pid: 1, comm: systemd-shutdow Not tainted 3.4.0-rc6+ #22
Call Trace:
 <IRQ>  [<ffffffff8102a41f>] warn_slowpath_common+0x7e/0x96
 [<ffffffff8102a44c>] warn_slowpath_null+0x15/0x17
 [<ffffffff81018cf7>] native_smp_send_reschedule+0x74/0xa7
 [<ffffffff810561c1>] trigger_load_balance+0x279/0x2a6
 [<ffffffff81050112>] scheduler_tick+0xe0/0xe9
 [<ffffffff81036768>] update_process_times+0x60/0x70
 [<ffffffff81062f2f>] tick_sched_timer+0x68/0x92
 [<ffffffff81046e33>] __run_hrtimer+0xb3/0x13c
 [<ffffffff81062ec7>] ? tick_nohz_handler+0xd0/0xd0
 [<ffffffff810474f2>] hrtimer_interrupt+0xdb/0x198
 [<ffffffff81019a35>] smp_apic_timer_interrupt+0x81/0x94
 [<ffffffff81655187>] apic_timer_interrupt+0x67/0x70
 <EOI>  [<ffffffff8101a3c4>] ? default_send_IPI_mask_allbutself_phys+0xb4/0xc4
 [<ffffffff8101c680>] physflat_send_IPI_allbutself+0x12/0x14
 [<ffffffff81018db4>] native_nmi_stop_other_cpus+0x8a/0xd6
 [<ffffffff810188ba>] native_machine_shutdown+0x50/0x67
 [<ffffffff81018926>] machine_shutdown+0xa/0xc
 [<ffffffff8101897e>] native_machine_restart+0x20/0x32
 [<ffffffff810189b0>] machine_restart+0xa/0xc
 [<ffffffff8103b196>] kernel_restart+0x47/0x4c
 [<ffffffff8103b2e6>] sys_reboot+0x13e/0x17c
 [<ffffffff8164e436>] ? _raw_spin_unlock_bh+0x10/0x12
 [<ffffffff810fcac9>] ? bdi_queue_work+0xcf/0xd8
 [<ffffffff810fe82f>] ? __bdi_start_writeback+0xae/0xb7
 [<ffffffff810e0d64>] ? iterate_supers+0xa3/0xb7
 [<ffffffff816547a2>] system_call_fastpath+0x16/0x1b
---[ end trace 320af5cb1cb60c5b ]---

The root cause seems to be the default_send_IPI_mask_allbutself_phys()
takes quiet some time (I measured it could be several ms) to complete
sending NMIs to all the other 23 CPUs, and for HZ=250/1000 system, the
time is long enough for a timer interrupt to happen, which will in
turn trigger to kick load balance to a stopped CPU and cause this
warning in native_smp_send_reschedule().

So disabling the local irq before stop_other_cpu() can fix this
problem (tested 25 times reboot ok), and it is fine as there should
be nobody caring the timer interrupt in such reboot stage.

The patch is against v3.4-rc6

Signed-off-by: Feng Tang <feng.tang@intel.com>
---
 arch/x86/kernel/reboot.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index d840e69..5aa8797 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -659,9 +659,12 @@ void native_machine_shutdown(void)
 	/* Make certain I only run on the appropriate processor */
 	set_cpus_allowed_ptr(current, cpumask_of(reboot_cpu_id));
 
-	/* O.K Now that I'm on the appropriate processor,
-	 * stop all of the others.
+	/*
+	 * O.K Now that I'm on the appropriate processor, stop all of the
+	 * others. Also disable the local irq to not receive the per-cpu
+	 * timer interrupt which may trigger scheduler's load balance.
 	 */
+	local_irq_disable();
 	stop_other_cpus();
 #endif
 
-- 
1.7.5.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH]x86, reboot: Fix a warning message triggered by stop_other_cpus()
  2012-05-22  2:52 [PATCH]x86, reboot: Fix a warning message triggered by stop_other_cpus() Feng Tang
@ 2012-05-22 13:33 ` Don Zickus
  2012-05-28  1:22   ` Feng Tang
  0 siblings, 1 reply; 6+ messages in thread
From: Don Zickus @ 2012-05-22 13:33 UTC (permalink / raw)
  To: Feng Tang; +Cc: Ingo Molnar, Peter Zijlstra, x86, linux-kernel

On Tue, May 22, 2012 at 10:52:33AM +0800, Feng Tang wrote:
> 
> The root cause seems to be the default_send_IPI_mask_allbutself_phys()
> takes quiet some time (I measured it could be several ms) to complete
> sending NMIs to all the other 23 CPUs, and for HZ=250/1000 system, the

I sent Ingo a patch (which he took in), that reverts this path to interrupts
again (and NMIs as a fallback if it fails).  The problem will still exist
I assume though I don't know if it changes the timing characteristics.

> time is long enough for a timer interrupt to happen, which will in
> turn trigger to kick load balance to a stopped CPU and cause this
> warning in native_smp_send_reschedule().
> 
> So disabling the local irq before stop_other_cpu() can fix this
> problem (tested 25 times reboot ok), and it is fine as there should
> be nobody caring the timer interrupt in such reboot stage.

It seems to be fine.  I just do not know enough about the load balancing
the timer interrupt does to know if this is the right way to go or not.

So from the shutdown cpu perspective, you are just blocking the interrupt
from occurring while we shutdown the other cpus, which prevents the load
balancer from running and seeing cpus disappear beneath it.  Sounds
reasonable.

Though on the other hand the same interrupt on another cpu might spit out
the same warning as cpus disappear without notification if it tries to
reschedule.  Probably rare and the window is probably small as the cpus
would probably all 'stop' rather quickly.

I don't know.  I guess it is ok.

Cheers,
Don

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH]x86, reboot: Fix a warning message triggered by stop_other_cpus()
  2012-05-22 13:33 ` Don Zickus
@ 2012-05-28  1:22   ` Feng Tang
  2012-05-30  8:51     ` Ingo Molnar
  0 siblings, 1 reply; 6+ messages in thread
From: Feng Tang @ 2012-05-28  1:22 UTC (permalink / raw)
  To: Don Zickus; +Cc: Ingo Molnar, Peter Zijlstra, x86, linux-kernel

Hi Don,

On Tue, 22 May 2012 09:33:56 -0400
Don Zickus <dzickus@redhat.com> wrote:

> On Tue, May 22, 2012 at 10:52:33AM +0800, Feng Tang wrote:
> > 
> > The root cause seems to be the default_send_IPI_mask_allbutself_phys()
> > takes quiet some time (I measured it could be several ms) to complete
> > sending NMIs to all the other 23 CPUs, and for HZ=250/1000 system, the
> 
> I sent Ingo a patch (which he took in), that reverts this path to interrupts
> again (and NMIs as a fallback if it fails).  The problem will still exist
> I assume though I don't know if it changes the timing characteristics.

Thanks for your review and letting me know the revert which I just saw in
the latest Linus' tree. And I think this new mechanism will greatly reduce
the possibility, but it still exist as the old nmi_stop_other_cpus's
implementation still exist. So

Ingo,

Could you review this patch? thanks,

-Feng


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH]x86, reboot: Fix a warning message triggered by stop_other_cpus()
  2012-05-28  1:22   ` Feng Tang
@ 2012-05-30  8:51     ` Ingo Molnar
  2012-05-30 15:15       ` Feng Tang
  0 siblings, 1 reply; 6+ messages in thread
From: Ingo Molnar @ 2012-05-30  8:51 UTC (permalink / raw)
  To: Feng Tang; +Cc: Don Zickus, Ingo Molnar, Peter Zijlstra, x86, linux-kernel


* Feng Tang <feng.tang@intel.com> wrote:

> Hi Don,
> 
> On Tue, 22 May 2012 09:33:56 -0400
> Don Zickus <dzickus@redhat.com> wrote:
> 
> > On Tue, May 22, 2012 at 10:52:33AM +0800, Feng Tang wrote:
> > > 
> > > The root cause seems to be the default_send_IPI_mask_allbutself_phys()
> > > takes quiet some time (I measured it could be several ms) to complete
> > > sending NMIs to all the other 23 CPUs, and for HZ=250/1000 system, the
> > 
> > I sent Ingo a patch (which he took in), that reverts this path to interrupts
> > again (and NMIs as a fallback if it fails).  The problem will still exist
> > I assume though I don't know if it changes the timing characteristics.
> 
> Thanks for your review and letting me know the revert which I just saw in
> the latest Linus' tree. And I think this new mechanism will greatly reduce
> the possibility, but it still exist as the old nmi_stop_other_cpus's
> implementation still exist. So
> 
> Ingo,
> 
> Could you review this patch? thanks,

It doesn't apply anymore, so please merge and add Don's 
Acked-by.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH]x86, reboot: Fix a warning message triggered by stop_other_cpus()
  2012-05-30  8:51     ` Ingo Molnar
@ 2012-05-30 15:15       ` Feng Tang
  2012-06-06 15:19         ` [tip:x86/urgent] x86/reboot: " tip-bot for Feng Tang
  0 siblings, 1 reply; 6+ messages in thread
From: Feng Tang @ 2012-05-30 15:15 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Don Zickus, Ingo Molnar, Peter Zijlstra, x86, linux-kernel

On Wed, 30 May 2012 10:51:10 +0200
Ingo Molnar <mingo@kernel.org> wrote:

> 
> * Feng Tang <feng.tang@intel.com> wrote:
> 
> > Hi Don,
> > 
> > On Tue, 22 May 2012 09:33:56 -0400
> > Don Zickus <dzickus@redhat.com> wrote:
> > 
> > > On Tue, May 22, 2012 at 10:52:33AM +0800, Feng Tang wrote:
> > > > 
> > > > The root cause seems to be the default_send_IPI_mask_allbutself_phys()
> > > > takes quiet some time (I measured it could be several ms) to complete
> > > > sending NMIs to all the other 23 CPUs, and for HZ=250/1000 system, the
> > > 
> > > I sent Ingo a patch (which he took in), that reverts this path to
> > > interrupts again (and NMIs as a fallback if it fails).  The problem will
> > > still exist I assume though I don't know if it changes the timing
> > > characteristics.
> > 
> > Thanks for your review and letting me know the revert which I just saw in
> > the latest Linus' tree. And I think this new mechanism will greatly reduce
> > the possibility, but it still exist as the old nmi_stop_other_cpus's
> > implementation still exist. So
> > 
> > Ingo,
> > 
> > Could you review this patch? thanks,
> 
> It doesn't apply anymore, so please merge and add Don's 
> Acked-by.

Hi Ingo,

Following is the updated patch, which could be applied to Linus's tree
and tip tree's master branch.

Thanks,
Feng
-------------------------

>From 9c4fc1b95da9cf35e1b1e8a71a67fc594adfdd78 Mon Sep 17 00:00:00 2001
From: Feng Tang <feng.tang@intel.com>
Date: Wed, 30 May 2012 21:01:49 +0800
Subject: [PATCH] x86, reboot: Fix a warning message triggered by
 stop_other_cpus()

When rebooting our 24 CPU Westmere servers with 3.4-rc6, we can always see
such warning msg:

Restarting system.
machine restart
------------[ cut here ]------------
WARNING: at arch/x86/kernel/smp.c:125 native_smp_send_reschedule+0x74/0xa7()
Hardware name: X8DTN
Modules linked in: igb [last unloaded: scsi_wait_scan]
Pid: 1, comm: systemd-shutdow Not tainted 3.4.0-rc6+ #22
Call Trace:
 <IRQ>  [<ffffffff8102a41f>] warn_slowpath_common+0x7e/0x96
 [<ffffffff8102a44c>] warn_slowpath_null+0x15/0x17
 [<ffffffff81018cf7>] native_smp_send_reschedule+0x74/0xa7
 [<ffffffff810561c1>] trigger_load_balance+0x279/0x2a6
 [<ffffffff81050112>] scheduler_tick+0xe0/0xe9
 [<ffffffff81036768>] update_process_times+0x60/0x70
 [<ffffffff81062f2f>] tick_sched_timer+0x68/0x92
 [<ffffffff81046e33>] __run_hrtimer+0xb3/0x13c
 [<ffffffff81062ec7>] ? tick_nohz_handler+0xd0/0xd0
 [<ffffffff810474f2>] hrtimer_interrupt+0xdb/0x198
 [<ffffffff81019a35>] smp_apic_timer_interrupt+0x81/0x94
 [<ffffffff81655187>] apic_timer_interrupt+0x67/0x70
 <EOI>  [<ffffffff8101a3c4>] ? default_send_IPI_mask_allbutself_phys+0xb4/0xc4
 [<ffffffff8101c680>] physflat_send_IPI_allbutself+0x12/0x14
 [<ffffffff81018db4>] native_nmi_stop_other_cpus+0x8a/0xd6
 [<ffffffff810188ba>] native_machine_shutdown+0x50/0x67
 [<ffffffff81018926>] machine_shutdown+0xa/0xc
 [<ffffffff8101897e>] native_machine_restart+0x20/0x32
 [<ffffffff810189b0>] machine_restart+0xa/0xc
 [<ffffffff8103b196>] kernel_restart+0x47/0x4c
 [<ffffffff8103b2e6>] sys_reboot+0x13e/0x17c
 [<ffffffff8164e436>] ? _raw_spin_unlock_bh+0x10/0x12
 [<ffffffff810fcac9>] ? bdi_queue_work+0xcf/0xd8
 [<ffffffff810fe82f>] ? __bdi_start_writeback+0xae/0xb7
 [<ffffffff810e0d64>] ? iterate_supers+0xa3/0xb7
 [<ffffffff816547a2>] system_call_fastpath+0x16/0x1b
---[ end trace 320af5cb1cb60c5b ]---

The root cause seems to be the default_send_IPI_mask_allbutself_phys()
takes quiet some time (I measured it could be several ms) to complete
sending NMIs to all the other 23 CPUs, and for HZ=250/1000 system, the
time is long enough for a timer interrupt to happen, which will in
turn trigger to kick load balance to a stopped CPU and cause this
warning in native_smp_send_reschedule().

So disabling the local irq before stop_other_cpu() can fix this
problem (tested 25 times reboot ok), and it is fine as there should
be nobody caring the timer interrupt in such reboot stage.

The latest 3.4 kernel slightly change this behavior by sending
REBOOT_VECTOR first and only send NMI_VECTOR if the REBOOT_VCTOR
fails, and this patch is still needed to prevent the problem.

Signed-off-by: Feng Tang <feng.tang@intel.com>
Acked-by: Don Zickus <dzickus@redhat.com>
---
 arch/x86/kernel/reboot.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 77215c2..ba823c8 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -658,9 +658,11 @@ void native_machine_shutdown(void)
 	set_cpus_allowed_ptr(current, cpumask_of(reboot_cpu_id));
 
 	/*
-	 * O.K Now that I'm on the appropriate processor,
-	 * stop all of the others.
+	 * O.K Now that I'm on the appropriate processor, stop all of the
+	 * others. Also disable the local irq to not receive the per-cpu
+	 * timer interrupt which may trigger scheduler's load balance.
 	 */
+	local_irq_disable();
 	stop_other_cpus();
 #endif
 
-- 
1.7.9


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [tip:x86/urgent] x86/reboot: Fix a warning message triggered by stop_other_cpus()
  2012-05-30 15:15       ` Feng Tang
@ 2012-06-06 15:19         ` tip-bot for Feng Tang
  0 siblings, 0 replies; 6+ messages in thread
From: tip-bot for Feng Tang @ 2012-06-06 15:19 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, peterz, tglx, feng.tang, dzickus

Commit-ID:  55c844a4dd16a4d1fdc0cf2a283ec631a02ec448
Gitweb:     http://git.kernel.org/tip/55c844a4dd16a4d1fdc0cf2a283ec631a02ec448
Author:     Feng Tang <feng.tang@intel.com>
AuthorDate: Wed, 30 May 2012 23:15:41 +0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 6 Jun 2012 12:03:23 +0200

x86/reboot: Fix a warning message triggered by stop_other_cpus()

When rebooting our 24 CPU Westmere servers with 3.4-rc6, we
always see this warning msg:

Restarting system.
machine restart
------------[ cut here ]------------
WARNING: at arch/x86/kernel/smp.c:125
native_smp_send_reschedule+0x74/0xa7() Hardware name: X8DTN
Modules linked in: igb [last unloaded: scsi_wait_scan]
Pid: 1, comm: systemd-shutdow Not tainted 3.4.0-rc6+ #22
Call Trace:
 <IRQ>  [<ffffffff8102a41f>] warn_slowpath_common+0x7e/0x96
 [<ffffffff8102a44c>] warn_slowpath_null+0x15/0x17
 [<ffffffff81018cf7>] native_smp_send_reschedule+0x74/0xa7
 [<ffffffff810561c1>] trigger_load_balance+0x279/0x2a6
 [<ffffffff81050112>] scheduler_tick+0xe0/0xe9
 [<ffffffff81036768>] update_process_times+0x60/0x70
 [<ffffffff81062f2f>] tick_sched_timer+0x68/0x92
 [<ffffffff81046e33>] __run_hrtimer+0xb3/0x13c
 [<ffffffff81062ec7>] ? tick_nohz_handler+0xd0/0xd0
 [<ffffffff810474f2>] hrtimer_interrupt+0xdb/0x198
 [<ffffffff81019a35>] smp_apic_timer_interrupt+0x81/0x94
 [<ffffffff81655187>] apic_timer_interrupt+0x67/0x70
 <EOI>  [<ffffffff8101a3c4>] ? default_send_IPI_mask_allbutself_phys+0xb4/0xc4
 [<ffffffff8101c680>] physflat_send_IPI_allbutself+0x12/0x14
 [<ffffffff81018db4>] native_nmi_stop_other_cpus+0x8a/0xd6
 [<ffffffff810188ba>] native_machine_shutdown+0x50/0x67
 [<ffffffff81018926>] machine_shutdown+0xa/0xc
 [<ffffffff8101897e>] native_machine_restart+0x20/0x32
 [<ffffffff810189b0>] machine_restart+0xa/0xc
 [<ffffffff8103b196>] kernel_restart+0x47/0x4c
 [<ffffffff8103b2e6>] sys_reboot+0x13e/0x17c
 [<ffffffff8164e436>] ? _raw_spin_unlock_bh+0x10/0x12
 [<ffffffff810fcac9>] ? bdi_queue_work+0xcf/0xd8
 [<ffffffff810fe82f>] ? __bdi_start_writeback+0xae/0xb7
 [<ffffffff810e0d64>] ? iterate_supers+0xa3/0xb7
 [<ffffffff816547a2>] system_call_fastpath+0x16/0x1b
---[ end trace 320af5cb1cb60c5b ]---

The root cause seems to be the
default_send_IPI_mask_allbutself_phys() takes quite some time (I
measured it could be several ms) to complete sending NMIs to all
the other 23 CPUs, and for HZ=250/1000 system, the time is long
enough for a timer interrupt to happen, which will in turn
trigger to kick load balance to a stopped CPU and cause this
warning in native_smp_send_reschedule().

So disabling the local irq before stop_other_cpu() can fix this
problem (tested 25 times reboot ok), and it is fine as there
should be nobody caring the timer interrupt in such reboot
stage.

The latest 3.4 kernel slightly changes this behavior by sending
REBOOT_VECTOR first and only send NMI_VECTOR if the REBOOT_VCTOR
fails, and this patch is still needed to prevent the problem.

Signed-off-by: Feng Tang <feng.tang@intel.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20120530231541.4c13433a@feng-i7
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/reboot.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 79c45af..25b48ed 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -639,9 +639,11 @@ void native_machine_shutdown(void)
 	set_cpus_allowed_ptr(current, cpumask_of(reboot_cpu_id));
 
 	/*
-	 * O.K Now that I'm on the appropriate processor,
-	 * stop all of the others.
+	 * O.K Now that I'm on the appropriate processor, stop all of the
+	 * others. Also disable the local irq to not receive the per-cpu
+	 * timer interrupt which may trigger scheduler's load balance.
 	 */
+	local_irq_disable();
 	stop_other_cpus();
 #endif
 

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2012-06-06 15:19 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-05-22  2:52 [PATCH]x86, reboot: Fix a warning message triggered by stop_other_cpus() Feng Tang
2012-05-22 13:33 ` Don Zickus
2012-05-28  1:22   ` Feng Tang
2012-05-30  8:51     ` Ingo Molnar
2012-05-30 15:15       ` Feng Tang
2012-06-06 15:19         ` [tip:x86/urgent] x86/reboot: " tip-bot for Feng Tang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).