public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Frederic Weisbecker <fweisbec@gmail.com>
To: Borislav Petkov <bp@alien8.de>
Cc: Jiri Kosina <jkosina@suse.cz>, Tony Luck <tony.luck@intel.com>,
	linux-kernel@vger.kernel.org, x86@kernel.org
Subject: Re: NOHZ: WARNING: at arch/x86/kernel/smp.c:123 native_smp_send_reschedule
Date: Fri, 10 May 2013 23:38:52 +0200	[thread overview]
Message-ID: <20130510213851.GC9358@somewhere> (raw)
In-Reply-To: <20130510162340.GE22942@pd.tnic>

On Fri, May 10, 2013 at 06:23:40PM +0200, Borislav Petkov wrote:
> On Fri, May 10, 2013 at 05:43:50PM +0200, Frederic Weisbecker wrote:
> > So either interrupts are spuriously enabled early, or ts->tick_stopped
> > is not correctly initialized.
> 
> Hmm, it can't be interrupts disabled because add_timer_on() does
> spin_lock_irqsave() before calling wake_up_nohz_cpu()... Maybe something
> like the below could help check this...

Right, by the time we call wake_up_nohz_cpu(), irqs can't be enabled, but may
be they were enabled before in start_secondary().

> 
> Although AFAICT, we're enabling interrupts much later in
> start_secondary, even after we've set the bit in the online mask.

Right.

> 
> ---
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 9c73b51817e4..1b679b0fa57a 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -201,6 +201,8 @@ static void __cpuinit smp_callin(void)
>  	 */
>  	setup_vector_irq(smp_processor_id());
>  
> +	WARN_ON(!irqs_disabled());
> +

The problem is that it doesn't catch issues with irqs that have been enabled
before in start_secondary(), then re-disabled somewhow. Warning on offline CPU from the place 
that disables the tick should catch the issue.

Jiri, could you test the following patch? I also added some code to dump
the value of ts->tick_stopped, in case it's not well initialized or something.

Note this may give you spurious warning when you unplug a CPU or when you shutdown the
system. But it's interesting if it dumps something in the boot.

Thanks!

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 58453b8..9853125 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -616,8 +616,17 @@ static bool wake_up_full_nohz_cpu(int cpu)
 {
 	if (tick_nohz_full_cpu(cpu)) {
 		if (cpu != smp_processor_id() ||
-		    tick_nohz_tick_stopped())
+		    tick_nohz_tick_stopped()) {
+			if (!cpu_online(cpu)) {
+				static int printed = 0;
+				if (!printed) {
+					printk("src: %d dst: %d stopped: %d\n", cpu, smp_processor_id(), tick_nohz_tick_stopped());
+					dump_stack();
+					printed = 1;
+				}
+			}
 			smp_send_reschedule(cpu);
+		}
 		return true;
 	}
 
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index bc67d42..abfa8c3 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -650,6 +650,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
 
 			ts->last_tick = hrtimer_get_expires(&ts->sched_timer);
 			ts->tick_stopped = 1;
+			WARN_ON_ONCE(!cpu_online(cpu));
 			trace_tick_stop(1, " ");
 		}
 


  reply	other threads:[~2013-05-10 21:38 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-09 12:29 NOHZ: WARNING: at arch/x86/kernel/smp.c:123 native_smp_send_reschedule Jiri Kosina
2013-05-09 12:50 ` Borislav Petkov
2013-05-09 12:58   ` Borislav Petkov
2013-05-15 18:45     ` Paul E. McKenney
2013-05-15 22:43       ` Borislav Petkov
2013-05-15 23:55         ` Paul E. McKenney
2013-05-17 13:56           ` NOHZ: WARNING: at arch/x86/kernel/smp.c:123 native_smp_send_reschedule, round 2 Borislav Petkov
2013-05-20  3:16             ` Michael Wang
2013-05-20  4:50               ` Borislav Petkov
2013-05-20  6:23                 ` Michael Wang
2013-05-20  6:47                   ` Borislav Petkov
2013-05-20  6:58                     ` Michael Wang
2013-05-20  7:06                       ` Michael Wang
2013-05-20  7:12                         ` Viresh Kumar
2013-05-20  7:25                           ` Michael Wang
2013-05-20  8:56                             ` Michael Wang
2013-05-20  9:09                               ` Viresh Kumar
2013-05-20  9:24                                 ` Michael Wang
2013-05-20 13:23                                   ` Borislav Petkov
2013-05-20 13:43                                     ` Viresh Kumar
2013-05-20 15:08                                       ` Borislav Petkov
2013-05-21  2:20                                     ` Michael Wang
2013-05-21  2:37                                       ` Michael Wang
2013-05-21  7:21                                       ` Borislav Petkov
2013-05-21  7:58                                         ` Michael Wang
2013-05-20  7:36                     ` Tejun Heo
2013-05-20  8:10                 ` Frederic Weisbecker
2013-05-20  9:31                   ` Srivatsa S. Bhat
2013-05-20  9:40                     ` Viresh Kumar
2013-05-20 10:24                       ` Viresh Kumar
2013-06-04 21:20             ` Jiri Kosina
2013-06-05  2:30               ` Michael Wang
2013-06-05  8:08                 ` Jiri Kosina
2013-06-05  8:12                   ` Michael Wang
2013-05-10  0:29 ` NOHZ: WARNING: at arch/x86/kernel/smp.c:123 native_smp_send_reschedule Frederic Weisbecker
2013-05-10  9:28   ` Borislav Petkov
2013-05-10  9:26     ` Frederic Weisbecker
2013-05-10  9:37       ` Ingo Molnar
2013-05-10  9:45         ` Borislav Petkov
2013-05-10 15:31           ` Frederic Weisbecker
2013-05-10  9:43       ` Borislav Petkov
2013-05-10 15:42       ` Jiri Kosina
2013-05-10 15:03   ` Jiri Kosina
2013-05-10 15:21     ` Borislav Petkov
2013-05-10 15:43       ` Frederic Weisbecker
2013-05-10 16:23         ` Borislav Petkov
2013-05-10 21:38           ` Frederic Weisbecker [this message]
2013-05-13 14:56             ` Jiri Kosina
2013-05-13 19:40               ` Thomas Gleixner
2013-05-13 20:01                 ` Jiri Kosina
2013-05-14 15:46                 ` [tip:timers/urgent] tick: Don't invoke tick_nohz_stop_sched_tick( ) if the cpu is offline tip-bot for Thomas Gleixner
2013-05-15 19:41                   ` Frederic Weisbecker
2013-05-16 14:06                     ` Thomas Gleixner
2013-05-16 14:15                       ` Borislav Petkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130510213851.GC9358@somewhere \
    --to=fweisbec@gmail.com \
    --cc=bp@alien8.de \
    --cc=jkosina@suse.cz \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox