From: Prarit Bhargava <prarit@redhat.com>
To: mingo@kernel.org, hpa@zytor.com, linux-kernel@vger.kernel.org,
	bitbucket@online.de, tglx@linutronix.de, prarit@redhat.com
Cc: tip-bot for Thomas Gleixner <tipbot@zytor.com>,
	linux-tip-commits@vger.kernel.org
Subject: Re: [tip:timers/urgent] tick: Cleanup NOHZ per cpu data on cpu down
Date: Mon, 13 May 2013 10:51:44 -0400
Message-ID: <5190FE00.6010508@redhat.com>
In-Reply-To: <tip-4b0c0f294f60abcdd20994a8341a95c8ac5eeb96@git.kernel.org>



On 05/12/2013 06:27 AM, tip-bot for Thomas Gleixner wrote:
> Commit-ID:  4b0c0f294f60abcdd20994a8341a95c8ac5eeb96
> Gitweb:     http://git.kernel.org/tip/4b0c0f294f60abcdd20994a8341a95c8ac5eeb96
> Author:     Thomas Gleixner <tglx@linutronix.de>
> AuthorDate: Fri, 3 May 2013 15:02:50 +0200
> Committer:  Thomas Gleixner <tglx@linutronix.de>
> CommitDate: Sun, 12 May 2013 12:20:09 +0200
> 
> tick: Cleanup NOHZ per cpu data on cpu down
> 
> Prarit reported a crash on CPU offline/online. The reason is that on
> CPU down the NOHZ related per cpu data of the dead cpu is not cleaned
> up. If at cpu online an interrupt happens before the per cpu tick
> device is registered the irq_enter() check potentially sees stale data
> and dereferences a NULL pointer.
> 
> Cleanup the data after the cpu is dead.

Thomas, while this does fix up the NULL pointer issue, I think you've introduced
a new bug in the schedule timer code.

While repeatedly offlining and onlining the same CPU, I now occasionally see
long delays in both the up and the down transitions...

[   65.150073] smpboot: Booting Node 1 Processor 19 APIC 0x28
[   66.715339] smpboot: CPU 19 is now offline
[   67.752751] smpboot: Booting Node 1 Processor 19 APIC 0x28
[   68.758711] smpboot: CPU 19 is now offline

Everything is normal ...

[   69.711612] smpboot: Booting Node 1 Processor 19 APIC 0x28
[   70.731521] smpboot: CPU 19 is now offline

Long delay in bringing CPU "down"

[   81.744565] smpboot: Booting Node 1 Processor 19 APIC 0x28
[   82.848591] smpboot: CPU 19 is now offline

Long delay in bringing CPU "up"

[   89.826533] smpboot: Booting Node 1 Processor 19 APIC 0x28
[   84.905358] smpboot: CPU 19 is now offline
[   87.565274] smpboot: Booting Node 1 Processor 19 APIC 0x28

Also, if the system is in this state I cannot reboot -- the system appears to
hang while bringing down CPUs...

Oddly, if I do

+       memset(ts, 0, sizeof(*ts));
+       ts->tick_stopped = 1;

instead of your memset, everything works.  I'm looking at the tick-sched.c code
to see why setting tick_stopped = 1 seems to fix the problem.

P.
