From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	linux-kernel@vger.kernel.org,
	Christof Schmitt <christof.schmitt@de.ibm.com>,
	Frank Blaschka <frank.blaschka@de.ibm.com>,
	Horst Hartmann <horsth@linux.vnet.ibm.com>,
	stable@kernel.org
Subject: Re: [patch 2/3] nohz: fix printk_needs_cpu() return value on offline cpus
Date: Fri, 26 Nov 2010 13:11:52 +0100
Message-ID: <1290773512.2145.139.camel@laptop>
In-Reply-To: <20101126120235.406766476@de.ibm.com>

On Fri, 2010-11-26 at 13:00 +0100, Heiko Carstens wrote:
> plain text document attachment (002_printk_needs_cpu.diff)
> From: Heiko Carstens <heiko.carstens@de.ibm.com>
> 
> This patch fixes a hang observed with 2.6.32 kernels where timers got
> enqueued on offline cpus.
> 
> printk_needs_cpu() may return 1 if called on offline cpus. When a cpu gets
> offlined it schedules the idle process which, before killing its own cpu,
> will call tick_nohz_stop_sched_tick().
> That function in turn will call printk_needs_cpu() in order to check if the
> local tick can be disabled. On offline cpus this function should naturally
> return 0 since, regardless of whether the tick gets disabled or not, the cpu will
> be dead shortly after. That is besides the fact that __cpu_disable() should already
> have made sure that no interrupts on the offlined cpu will be delivered anyway.
> 
> In this case it prevents tick_nohz_stop_sched_tick() from calling
> select_nohz_load_balancer(). No idea if that really is a problem. However what
> made me debug this is that on 2.6.32 the function get_nohz_load_balancer() is
> used within __mod_timer() to select a cpu on which a timer gets enqueued.
> If printk_needs_cpu() returns 1 then the nohz_load_balancer cpu doesn't get
> updated when a cpu gets offlined. It may contain the cpu number of an offline
> cpu. In turn timers get enqueued on an offline cpu and not very surprisingly
> they never expire and cause system hangs.
> 
> This has been observed on 2.6.32 kernels. On current kernels __mod_timer() uses
> get_nohz_timer_target() which doesn't have that problem. However there might
> be other problems because of the too early exit from tick_nohz_stop_sched_tick()
> in case a cpu goes offline.
> 
> Easiest way to fix this is just to test if the current cpu is offline and
> call printk_tick() directly which clears the condition.
> 
> Alternatively I tried a cpu hotplug notifier which would clear the condition,
> however between calling the notifier function and printk_needs_cpu() something
> could have called printk() again and the problem is back again. This seems to
> be the safest fix.
> 
> Cc: stable@kernel.org
> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
> ---
>  kernel/printk.c |    2 ++
>  1 file changed, 2 insertions(+)
> 
> --- a/kernel/printk.c
> +++ b/kernel/printk.c
> @@ -1082,6 +1082,8 @@ void printk_tick(void)
>  
>  int printk_needs_cpu(int cpu)
>  {
> +	if (unlikely(cpu_is_offline(cpu)))
> +		printk_tick();
>  	return per_cpu(printk_pending, cpu);
>  }
>  

Nice, applied.
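
For readers following the reasoning in the quoted changelog, here is a minimal user-space sketch of why clearing the flag inside printk_needs_cpu() closes the window that a hotplug notifier leaves open. It is a model under stated assumptions, not kernel code: NR_CPUS, cpu_online_map, the model_* helpers, the needs_cpu_* pair and offline_race() are all hypothetical names, plain arrays stand in for the per-cpu variable and the online mask, and the model's printk_tick() only clears the flag, whereas the changelog describes the real one simply as clearing the condition.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4   /* hypothetical CPU count for this model */

/* User-space stand-ins for the kernel's per-cpu state (assumptions). */
static bool cpu_online_map[NR_CPUS];
static int  printk_pending[NR_CPUS];

/* Model of printk(): leaves work pending on @cpu. */
static void model_printk(int cpu) { printk_pending[cpu] = 1; }

/* Model of printk_tick(): clears the pending condition on @cpu. */
static void model_printk_tick(int cpu) { printk_pending[cpu] = 0; }

/* Behaviour before the patch: reports the flag even on an offline CPU. */
static int needs_cpu_unpatched(int cpu) { return printk_pending[cpu]; }

/* Behaviour after the patch: an offline CPU clears the flag first. */
static int needs_cpu_patched(int cpu)
{
	if (!cpu_online_map[cpu])
		model_printk_tick(cpu);
	return printk_pending[cpu];
}

/*
 * Replay the sequence from the changelog: the CPU goes offline, a
 * notifier-style hook clears the flag, a late printk() sets it again,
 * and only then does the idle path ask whether the tick is needed.
 */
int offline_race(bool patched)
{
	const int cpu = 1;

	assert(cpu >= 0 && cpu < NR_CPUS);
	cpu_online_map[cpu] = true;
	printk_pending[cpu] = 0;

	cpu_online_map[cpu] = false;   /* CPU is marked offline */
	model_printk_tick(cpu);        /* notifier clears the condition */
	model_printk(cpu);             /* late printk() re-arms it */

	return patched ? needs_cpu_patched(cpu) : needs_cpu_unpatched(cpu);
}
```

In this model offline_race(false) still reports pending work after the notifier ran, while offline_race(true) does not, because the check and the clearing happen in the same call and nothing can slip in between them.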



