public inbox for linux-kernel@vger.kernel.org
From: Greg Kroah-Hartman <gregkh@suse.de>
To: linux-kernel@vger.kernel.org, stable@kernel.org
Cc: Justin Forbes <jmforbes@linuxtx.org>,
	Zwane Mwaikambo <zwane@arm.linux.org.uk>,
	"Theodore Ts'o" <tytso@mit.edu>,
	Randy Dunlap <rdunlap@xenotime.net>,
	Dave Jones <davej@redhat.com>,
	Chuck Wolber <chuckw@quantumlinux.com>,
	Chris Wedgwood <reviews@ml.cw.f00f.org>,
	Michael Krufky <mkrufky@linuxtv.org>,
	Chuck Ebbert <cebbert@redhat.com>,
	Domenico Andreoli <cavokz@gmail.com>,
	torvalds@linux-foundation.org, akpm@linux-foundation.org,
	alan@lxorguk.ukuu.org.uk,
	Dave Johnson <djohnson@sw.starentnetworks.com>,
	Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>
Subject: [patch 03/26] x86: fix TSC clock source calibration error
Date: Mon, 19 Nov 2007 10:18:19 -0800
Message-ID: <20071119181819.GD15425@kroah.com>
In-Reply-To: <20071119181746.GA15425@kroah.com>

[-- Attachment #1: x86-fix-tsc-clock-source-calibration-error.patch --]
[-- Type: text/plain, Size: 3442 bytes --]

2.6.22-stable review patch.  If anyone has any objections, please let us
know.

------------------
From: Dave Johnson <djohnson@sw.starentnetworks.com>

patch edaf420fdc122e7a42326fe39274c8b8c9b19d41 in mainline.

I ran into this problem on a system that was unable to obtain NTP sync
because the clock was running very slow (over 10000ppm slow). ntpd had
declared all of its peers 'reject' with 'peer_dist' reason.

On investigation, the tsc_khz variable was significantly incorrect,
causing xtime to run slow.  After a reboot tsc_khz was correct, so I
ran a reboot test to see how often the problem occurred:

Test was done on a 2000 MHz Xeon system.  Of 689 reboots, 8 had
unacceptable tsc_khz values (>500ppm):

 range of tsc_khz  # of boots  % of boots
 ----------------  ----------  ----------
        < 1999750           0      0.000%
1999750 - 1999800          21      3.048%
1999800 - 1999850         166     24.128%
1999850 - 1999900         241     35.029%
1999900 - 1999950         211     30.669%
1999950 - 2000000          42      6.105%
2000000 - 2000050           0      0.000%
2000050 - 2000100           0      0.000%
                   [...]
2000100 - 2015000           1      0.145%  << BAD
2015000 - 2030000           6      0.872%  << BAD
2030000 - 2045000           1      0.145%  << BAD
2045000 <                   0      0.000%

The worst boot was 2032.577 MHz, over 1.5% off!
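
For reference, the ppm figures above follow from the relative error
against the nominal 2000000 kHz.  A minimal sketch of that arithmetic
(the helper name ppm_error is made up for illustration and is not part
of the kernel):

```c
/*
 * Signed error of a measured frequency relative to the nominal one,
 * in parts per million.  Integer division truncates toward zero.
 * ppm_error is a hypothetical helper, not a kernel function.
 */
static long long ppm_error(long long measured_khz, long long nominal_khz)
{
	return (measured_khz - nominal_khz) * 1000000LL / nominal_khz;
}
```

ppm_error(2032577, 2000000) gives 16288, i.e. about 1.63%, far past the
500ppm limit used above.  The low edge of the good range, 1999750,
works out to -125.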

It appears that on rare occasions, mach_countup() is taking longer to
complete than necessary.

I suspect that this is caused by the CPU taking a periodic SMI right
at the end of the 30ms calibration loop.  This would cause the loop to
delay while the BIOS SMI handler runs.  The resulting TSC value is
beyond what it should be, resulting in a higher tsc_khz.

The patch below makes native_calculate_cpu_khz() take the best
(shortest duration, lowest khz) run of its 3 calibration loops.  If an
SMI goes off, causing a bad result (long duration, higher khz), it
will be discarded.
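
The idea can be sketched outside the kernel as follows.  This is only
an illustration under stated assumptions: measure_fn, struct replay,
replay_next, best_delta and delta_to_khz are hypothetical names, the
sample values are invented, and the 30000us window is taken from the
30ms figure mentioned above.

```c
#include <stdint.h>

/* Hypothetical callback: TSC ticks elapsed across one calibration window. */
typedef uint64_t (*measure_fn)(void *ctx);

/* Test stand-in for real hardware: replays canned tick counts in order. */
struct replay {
	const uint64_t *samples;
	int next;
};

static uint64_t replay_next(void *ctx)
{
	struct replay *r = ctx;

	return r->samples[r->next++];
}

/*
 * Run the measurement 'runs' times and keep the shortest one.  An SMI can
 * only stretch a run, never shorten it, so the minimum is the least
 * disturbed sample.
 */
static uint64_t best_delta(measure_fn measure, void *ctx, int runs)
{
	uint64_t best = UINT64_MAX;

	for (int i = 0; i < runs; i++) {
		uint64_t d = measure(ctx);

		if (d < best)
			best = d;
	}
	return best;
}

/* Ticks counted over window_us microseconds, converted to kHz. */
static uint64_t delta_to_khz(uint64_t delta, uint64_t window_us)
{
	return delta * 1000 / window_us;
}
```

With invented samples of 60000000, 60977310 and 59996000 ticks, the
inflated middle run (which would correspond to 2032577 kHz over a
30000us window) is dropped and 59996000 is kept.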

With the patch applied, 300 boots of the same system produce good
results:

 range of tsc_khz  # of boots  % of boots
 ----------------  ----------  ----------
        < 1999750           0      0.000%
1999750 - 1999800          30     10.000%
1999800 - 1999850         166     55.333%
1999850 - 1999900          89     29.667%
1999900 - 1999950          15      5.000%
1999950 <                   0      0.000%

Problem was found and tested against 2.6.18.  Patch is against 2.6.22.

Signed-off-by: Dave Johnson <djohnson@sw.starentnetworks.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 arch/i386/kernel/tsc.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

--- a/arch/i386/kernel/tsc.c
+++ b/arch/i386/kernel/tsc.c
@@ -122,7 +122,7 @@ unsigned long native_calculate_cpu_khz(v
 {
 	unsigned long long start, end;
 	unsigned long count;
-	u64 delta64;
+	u64 delta64 = (u64)ULLONG_MAX;
 	int i;
 	unsigned long flags;
 
@@ -134,6 +134,7 @@ unsigned long native_calculate_cpu_khz(v
 		rdtscll(start);
 		mach_countup(&count);
 		rdtscll(end);
+		delta64 = min(delta64, (end - start));
 	}
 	/*
 	 * Error: ECTCNEVERSET
@@ -144,8 +145,6 @@ unsigned long native_calculate_cpu_khz(v
 	if (count <= 1)
 		goto err;
 
-	delta64 = end - start;
-
 	/* cpu freq too fast: */
 	if (delta64 > (1ULL<<32))
 		goto err;

-- 
