From: Shaohua Li <shli@fb.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>,
	lkml <linux-kernel@vger.kernel.org>,
	Prarit Bhargava <prarit@redhat.com>,
	Richard Cochran <richardcochran@gmail.com>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Ingo Molnar <mingo@kernel.org>,
	Clark Williams <williams@redhat.com>,
	Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH 8/9] clocksource: Improve unstable clocksource detection
Date: Wed, 26 Aug 2015 10:15:34 -0700
Message-ID: <20150826171533.GA2189998@devbig257.prn2.facebook.com>
In-Reply-To: <alpine.DEB.2.11.1508182215100.3873@nanos>

On Tue, Aug 18, 2015 at 10:18:09PM +0200, Thomas Gleixner wrote:
> On Tue, 18 Aug 2015, John Stultz wrote:
> > On Tue, Aug 18, 2015 at 12:28 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> > > On Tue, 18 Aug 2015, John Stultz wrote:
> > >> On Tue, Aug 18, 2015 at 1:38 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> > >> > On Mon, 17 Aug 2015, John Stultz wrote:
> > >> >> On Mon, Aug 17, 2015 at 3:04 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> > >> >> > On Mon, 17 Aug 2015, John Stultz wrote:
> > >> >> >
> > >> >> >> From: Shaohua Li <shli@fb.com>
> > >> >> >>
> > >> >> >> >From time to time we saw TSC is marked as unstable in our systems, while
> > >> >> >
> > >> >> > Stray '>'
> > >> >> >
> > >> >> >> the CPUs declare to have stable TSC. Looking at the clocksource unstable
> > >> >> >> detection, there are two problems:
> > >> >> >> - watchdog clock source wrap. HPET is the most common watchdog clock
> > >> >> >>   source. It's 32-bit and runs in 14.3Mhz. That means the hpet counter
> > >> >> >>   can wrap in about 5 minutes.
> > >> >> >> - threshold isn't scaled against interval. The threshold is 0.0625s in
> > >> >> >>   0.5s interval. What if the actual interval is bigger than 0.5s?
> > >> >> >>
> > >> >> >> The watchdog runs in a timer bh, so hard/soft irq can defer its running.
> > >> >> >> Heavy network stack softirq can hog a cpu. IPMI driver can disable
> > >> >> >> interrupt for a very long time.
> > >> >> >
> > >> >> > And they hold off the timer softirq for more than a second? Don't you
> > >> >> > think that's the problem which needs to be fixed?
> > >> >>
> > >> >> Though this is an issue I've experienced (and tried unsuccessfully to
> > >> >> fix in a more complicated way) with the RT kernel, where high priority
> > >> >> tasks blocked the watchdog long enough that we'd disqualify the TSC.
> > >> >
> > >> > Did it disqualify the watchdog due to HPET wraparounds (5 minutes) or
> > >> > due to the fixed threshold being applied?
> > >>
> > >> This was years ago, but in my experience, the watchdog false positives
> > >> were due to HPET wraparounds.
> > >
> > > Blocking stuff for 5 minutes is insane ....
> > 
> > Yea. It was usually due to -RT stress testing, which kept the
> > machines busy for quite awhile. But again, if you have machines being
> > maxed out with networking load, etc, even for long amounts of time, we
> > still want to avoid false positives. Because after the watchdog
> 
> The networking softirq does not hog the other softirqs. It has a limit
> on processing loops and then goes back to let the other softirqs be
> handled. So no, I doubt that heavy networking can cause this. If it
> does then we have some other way more serious problems.
> 
> I can see the issue with RT stress testing, but not with networking in
> mainline.

OK, the issue is triggered in my KVM guest. I guess it's easier to
trigger in KVM because the HPET there runs at 100MHz, so it wraps sooner.

[  135.930067] clocksource: timekeeping watchdog: Marking clocksource 'tsc' as unstable because the skew is too large:
[  135.930095] clocksource:                       'hpet' wd_now: 2bc19ea0 wd_last: 6c4e5570 mask: ffffffff
[  135.930105] clocksource:                       'tsc' cs_now: 481250b45b cs_last: 219e6efb50 mask: ffffffffffffffff
[  135.938750] clocksource: Switched to clocksource hpet

The HPET clock is 100MHz and the CPU speed is 2200MHz. KVM is passed the
correct CPU info, so the guest's cpuinfo shows the TSC as stable.

hpet interval is ((0x2bc19ea0 - 0x6c4e5570) & 0xffffffff) / 100000000 = 32.1s.

The HPET wrap interval is 0xffffffff / 100000000 = 42.9s

tsc interval is (0x481250b45b - 0x219e6efb50) / 2200000000 = 75s

32.1 + 42.9 = 75

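So the HPET wrapped once between the two watchdog runs. The masked
delta hides that wrap, the watchdog sees 32.1s on the HPET against 75s
on the TSC, and the TSC is marked unstable.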
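
For anyone who wants to re-check the arithmetic, here is a minimal
Python sketch. It uses only the counter values from the log above, the
nominal 100MHz and 2200MHz rates, and the 0.0625s threshold quoted
earlier in the thread. The helper masked_delta() and the variable names
are made up for illustration; this is not the kernel's watchdog code.

```python
# Counter values copied from the watchdog log in this mail.
WD_NOW, WD_LAST, WD_MASK = 0x2bc19ea0, 0x6c4e5570, 0xffffffff
CS_NOW, CS_LAST, CS_MASK = 0x481250b45b, 0x219e6efb50, 0xffffffffffffffff

HPET_HZ = 100_000_000     # guest HPET rate stated in this mail
TSC_HZ = 2_200_000_000    # nominal guest TSC rate stated in this mail
THRESHOLD_S = 0.0625      # watchdog threshold quoted earlier in the thread


def masked_delta(now, last, mask):
    """Counter delta modulo the counter width; whole wraps are lost."""
    return (now - last) & mask


# Elapsed time as seen by each clocksource.
wd_seconds = masked_delta(WD_NOW, WD_LAST, WD_MASK) / HPET_HZ
cs_seconds = masked_delta(CS_NOW, CS_LAST, CS_MASK) / TSC_HZ

# Time for the 32-bit HPET counter to wrap once at this rate.
wrap_seconds = (WD_MASK + 1) / HPET_HZ

# Skew the watchdog would see, and how many HPET wraps explain it.
skew = cs_seconds - wd_seconds
missed_wraps = round(skew / wrap_seconds)
residual = skew - missed_wraps * wrap_seconds

print(f"hpet interval : {wd_seconds:.2f}s")
print(f"tsc interval  : {cs_seconds:.2f}s")
print(f"hpet wrap     : {wrap_seconds:.2f}s")
print(f"missed wraps  : {missed_wraps}")
print(f"residual skew : {residual * 1000:.3f}ms")
print(f"over threshold without wrap handling: {abs(skew) > THRESHOLD_S}")
print(f"over threshold with wrap accounted  : {abs(residual) > THRESHOLD_S}")
```

With these inputs the skew comes out at almost exactly one HPET wrap
period, and what is left after accounting for that single wrap is well
below the threshold.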

Thanks,
Shaohua
