public inbox for linux-kernel@vger.kernel.org
From: john stultz <johnstul@us.ibm.com>
To: Ruben Kerkhof <ruben@rubenkerkhof.com>
Cc: "Greg KH" <greg@kroah.com>,
	linux-kernel@vger.kernel.org, seto.hidetoshi@jp.fujitsu.com,
	"Peter Zijlstra" <peterz@infradead.org>,
	"MINOURA Makoto" <minoura@valinux.co.jp>,
	"Ingo Molnar" <mingo@elte.hu>,
	stable@kernel.org, "Hervé Commowick" <hcommowick@exosec.fr>,
	Rand@jasper.es, "Andrew Morton" <akpm@linux-foundation.org>,
	"Willy Tarreau" <w@1wt.eu>,
	"Faidon Liambotis" <paravoid@debian.org>
Subject: Re: [stable] 2.6.32.21 - uptime related crashes?
Date: Tue, 25 Oct 2011 15:44:30 -0700
Message-ID: <1319582670.17505.31.camel@work-vm>
In-Reply-To: <CAPed3OHzO5usHfyeD_rK8dDhcGakZF+ByzEyZZb_Tdh3U00vOg@mail.gmail.com>

On Sun, 2011-10-23 at 20:31 +0200, Ruben Kerkhof wrote:
> On Mon, Sep 5, 2011 at 01:26, Faidon Liambotis <paravoid@debian.org> wrote:
> > On Tue, Aug 30, 2011 at 03:38:29PM -0700, Greg KH wrote:
> >> On Thu, Aug 25, 2011 at 09:56:16PM +0300, Faidon Liambotis wrote:
> >> > On Thu, Jul 21, 2011 at 08:45:25PM +0200, Ingo Molnar wrote:
> >> > > * Peter Zijlstra <peterz@infradead.org> wrote:
> >> > >
> >> > > > On Thu, 2011-07-21 at 14:50 +0200, Nikola Ciprich wrote:
> >> > > > > thanks for the patch! I'll put this on our testing boxes...
> >> > > >
> >> > > > With a patch that frobs the starting value close to overflowing I hope,
> >> > > > otherwise we'll not hear from you in like 7 months ;-)
> >> > > >
> >> > > > > Are You going to push this upstream so we can ask Greg to push this to
> >> > > > > -stable?
> >> > > >
> >> > > > Yeah, I think we want to commit this with a -stable tag, Ingo?
> >> > >
> >> > > yeah - and we also want a Reported-by tag and an explanation of how
> >> > > it can crash and why it matters in practice. I can then stick it into
> >> > > the urgent branch for Linus. (probably will only hit upstream in the
> >> > > merge window though.)
> >> >
> >> > Has this been pushed or has the problem been solved somehow? Time is
> >> > against us on this bug as more boxes will crash as they reach 200 days
> >> > of uptime...
> >> >
> >> > In any case, feel free to use me as a Reported-by, my full report of the
> >> > problem being <20110430173905.GA25641@tty.gr>.
> >> >
> >> > FWIW and if I understand correctly, my symptoms were caused by *two*
> >> > different bugs:
> >> > a) the 54 bits wraparound at 208 days that Peter fixed above,
> >> > b) a kernel crash at ~215 days related to RT tasks, fixed by
> >> > 305e6835e05513406fa12820e40e4a8ecb63743c (already in -stable).
> >>
> >> So, what do I do here as part of the .32-longterm kernel?  Is there a
> >> fix that is in Linus's tree that I need to apply here?
> >>
> >> confused,
> >
> > Is this even pushed upstream? I checked Linus' tree and the proposed
> > patch is *not* merged there. I'm not really sure if it was fixed some
> > other way, though. I thought this was intended to be an "urgent" fix or
> > something?
> >
> > Regards,
> > Faidon
> 
> I just had two crashes on two different machines, both with an uptime
> of 208 days.
> Both were 5520's running 2.6.34.8, but with a CONFIG_HZ of 1000
> 
> 2011-10-23T16:49:18.618029+02:00 phy001 kernel: BUG: soft lockup -
> CPU#0 stuck for 17163091968s! [qemu-kvm:16949]

So were these actual crashes, or just softlockup false positives?

I had thought PeterZ's fix for the earlier crash issue (div by zero) had
already been pushed upstream, but maybe that was just against 2.6.32 and
not 2.6.33?

The softlockup false positive issue should have been fixed by Peter's
"x86, intel: Don't mark sched_clock() as stable" below. But I'm not
seeing it upstream.  Peter, is this still the right fix?

thanks
-john
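
A side note on the figure in the soft lockup message quoted above:
17163091968 is exactly 2^34 - 2^24, which is (2^64 - 2^54) >> 30. That is
the value one would expect if a counter that wrapped at 54 bits were
subtracted as though it wrapped at 64 bits, and the result were then
turned into approximate seconds by shifting right by 30. This is only an
arithmetic observation that fits a false positive, not a confirmed
diagnosis. The shift by 30 as a cheap ns-to-seconds conversion is an
assumption here, and the helper names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: seconds are approximated as ns >> 30 (2^30 is close to 10^9). */
uint64_t approx_seconds(uint64_t ns)
{
    return ns >> 30;
}

/*
 * Hypothetical helper: the delta a caller computes with plain 64-bit
 * unsigned subtraction when a counter that is only wrap_bits wide goes
 * from its maximum back to zero.
 */
uint64_t delta_across_wrap(unsigned wrap_bits)
{
    assert(wrap_bits < 64);
    uint64_t before = UINT64_C(1) << wrap_bits; /* value at the wrap point */
    uint64_t after = 0;                         /* value just after the wrap */
    return after - before;                      /* 2^64 - 2^wrap_bits */
}
```

With these definitions, approx_seconds(delta_across_wrap(54)) evaluates to
17163091968, the same number of seconds reported in the log line.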


From: Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: x86, intel: Don't mark sched_clock() as stable

Because the x86 sched_clock() implementation wraps at 54 bits and the
scheduler code assumes it wraps at the full 64 bits, we can get into
trouble after 208 days (~7 months) of uptime.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 arch/x86/kernel/cpu/intel.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index ed6086e..c8dc48b 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -91,8 +91,15 @@ static void __cpuinit early_init_intel(struct cpuinfo_x86 *c)
        if (c->x86_power & (1 << 8)) {
                set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
                set_cpu_cap(c, X86_FEATURE_NONSTOP_TSC);
+               /*
+                * Unfortunately our __cycles_2_ns() implementation makes
+                * the raw sched_clock() interface wrap at 54-bits, which
+                * makes it unsuitable for direct use, so disable this
+                * for now.
+                *
                if (!check_tsc_unstable())
                        sched_clock_stable = 1;
+                */
        }
 
        /*
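
For reference, the 208-day figure in the changelog follows directly from
the counter width: 2^54 ns is roughly 18014398 seconds, or about 208.5
days. The sketch below works that out and shows what a consumer that
assumes 64-bit wraparound computes across a 54-bit wrap. It is an
illustration only. The function names are hypothetical, and the
wrap-aware variant is shown for contrast; it is not what the patch does
(the patch simply stops marking sched_clock() as stable).

```c
#include <assert.h>
#include <stdint.h>

/* Days until a nanosecond counter that is only `bits` wide wraps to zero. */
double wrap_period_days(unsigned bits)
{
    assert(bits < 64);
    const double ns_per_day = 86400.0 * 1e9;
    return (double)(UINT64_C(1) << bits) / ns_per_day;
}

/* Elapsed ns as seen by code that assumes the counter uses all 64 bits. */
uint64_t delta_assuming_64bit(uint64_t now, uint64_t then)
{
    return now - then;
}

/* Elapsed ns when the subtraction is reduced modulo the real counter width. */
uint64_t delta_wrap_aware(uint64_t now, uint64_t then, unsigned bits)
{
    assert(bits < 64);
    uint64_t mask = (UINT64_C(1) << bits) - 1;
    return (now - then) & mask;
}
```

Taking then = 2^54 - 1000 and now = 500 (1500 ns really elapsed across the
wrap), delta_wrap_aware() returns 1500, while delta_assuming_64bit()
returns 2^64 - 2^54 + 1500, an apparent jump of several hundred years.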



