kernel-testers.vger.kernel.org archive mirror
From: Ingo Molnar <mingo-X9Un+BFzKDI@public.gmane.org>
To: "Rafael J. Wysocki" <rjw-KKrjLPT3xs0@public.gmane.org>
Cc: Linux Kernel Mailing List
	<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Kernel Testers List
	<kernel-testers-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	alexs <alex.shi-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>,
	Yanmin Zhang
	<yanmin_zhang-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>,
	john stultz <johnstul-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Subject: Re: [Bug #11970] gettimeofday return a old time in mmbench
Date: Thu, 4 Dec 2008 08:45:40 +0100
Message-ID: <20081204074540.GA29151@elte.hu>
In-Reply-To: <bfPM5B8Tx2B.A.Qi.7pxNJB@chimera>


* Rafael J. Wysocki <rjw-KKrjLPT3xs0@public.gmane.org> wrote:

> This message has been generated automatically as a part of a report
> of recent regressions.
> 
> The following bug entry is on the current list of known regressions
> from 2.6.27.  Please verify if it still should be listed and let me know
> (either way).
> 
> 
> Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=11970
> Subject		: gettimeofday return a old time in mmbench
> Submitter	: alexs <alex.shi-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Date		: 2008-11-06 23:57 (28 days old)
> First-Bad-Commit: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=99ebcf8285df28f32fd2d1c19a7166e70f00309c
> Handled-By	: Ingo Molnar <mingo-X9Un+BFzKDI@public.gmane.org>
> 		  Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>
> 		  Yanmin Zhang <yanmin_zhang-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>

fixed by the patch below from John Stultz, queued up in 
tip/timers/urgent.

The bisection-blamed merge commit above likely just causes a random shift 
in the timings or compiler optimization conditions of this code - making 
the bug more likely to trigger. The bug/race itself is old.

	Ingo

------------------------->
From 6c9bacb41c10ba84ff68f238e234d96f35fb64f7 Mon Sep 17 00:00:00 2001
From: john stultz <johnstul-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Date: Mon, 1 Dec 2008 18:34:41 -0800
Subject: [PATCH] time: catch xtime_nsec underflows and fix them

Impact: fix time warp bug

Alex Shi, along with Yanmin Zhang, has been noticing occasional time
inconsistencies recently. Through their great diagnosis, they found that
the xtime_nsec value used in update_wall_time() was occasionally going
negative. After looking through the code for a while, I realized we have
the possibility of an underflow when three conditions are met in
update_wall_time():

  1) We have accumulated a second's worth of nanoseconds, so we
     incremented xtime.tv_sec and appropriately decremented xtime_nsec.
     (This doesn't cause xtime_nsec to go negative, but it can cause it
      to be small).

  2) The remaining offset value is large, but just slightly less than
     cycle_interval.

  3) clocksource_adjust() is speeding up the clock, causing a
     corrective amount (compensating for the increase in the multiplier
     being multiplied against the unaccumulated offset value) to be
     subtracted from xtime_nsec.

This can cause xtime_nsec to underflow.
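
The three conditions above can be sketched in userspace as follows. This
is only a toy model under stated assumptions: struct toy_clock,
toy_speed_up() and toy_underflow_demo() are hypothetical names, the
numbers are made up, and the real clocksource_adjust() does more than
this. It only shows how subtracting the compensation (unaccumulated
offset times the multiplier increase) from a small xtime_nsec wraps an
unsigned field into a value that reads as negative when cast to signed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the clocksource state. */
struct toy_clock {
	uint64_t xtime_nsec;	/* shifted nanoseconds, assumed unsigned */
	uint32_t mult;		/* cycles-to-nanoseconds multiplier */
};

/*
 * Speed the clock up by adj and subtract the compensation for the
 * cycles that have not been accumulated yet (offset_cycles * adj).
 */
static void toy_speed_up(struct toy_clock *c, uint64_t offset_cycles,
			 uint32_t adj)
{
	assert(adj > 0);	/* this sketch only models a speed-up */
	c->mult += adj;
	c->xtime_nsec -= offset_cycles * adj;	/* may wrap below zero */
}

/* Conditions 1-3: small xtime_nsec, large offset, positive adj. */
static int64_t toy_underflow_demo(void)
{
	struct toy_clock c = { .xtime_nsec = 100, .mult = 1000 };

	toy_speed_up(&c, 900, 1);
	return (int64_t)c.xtime_nsec;	/* reads as -800 */
}
```

With xtime_nsec at 100 and an unaccumulated offset of 900 cycles, a
multiplier increase of 1 is enough to push the value to -800.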

Unfortunately, since we notify the NTP subsystem via second_overflow()
whenever we accumulate a full second, and this affects the error
accumulation that has already occurred, we cannot simply revert the
accumulated second from xtime nor move the second accumulation to after
the clocksource_adjust() call without a change in behavior.

This leaves us with (at least) two options:

1) Simply return from clocksource_adjust() without making a change if we
   notice the adjustment would cause xtime_nsec to go negative.

This would work, but I'm concerned that if a large adjustment was needed
(due to the error being large), it may be possible to get stuck with an
ever increasing error that becomes too large to correct (since it may
always force xtime_nsec negative). This may just be paranoia on my part.

2) Catch xtime_nsec if it is negative, then add back the amount it's
   negative to both xtime_nsec and the error.

This second method is consistent with how we've handled earlier rounding
issues, and also has the benefit that the error being added is always in
the opposite direction of, and always equal to or smaller than, the
correction being applied. So the risk of a corner case where things get
out of control is lessened.
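
As a worked illustration of this second option, here is a minimal
userspace sketch. It is not kernel code: struct toy_clock and
toy_fix_underflow() are hypothetical names, and TOY_NTP_SCALE_SHIFT is
assumed to be 32 to mirror the kernel's NTP_SCALE_SHIFT. The idea is to
clamp a negative xtime_nsec to zero and carry the clamped amount,
rescaled to the error's units, into the error term.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NTP_SCALE_SHIFT 32	/* assumption: same as NTP_SCALE_SHIFT */

/* Hypothetical, simplified stand-in for the clocksource state. */
struct toy_clock {
	uint64_t xtime_nsec;	/* shifted nanoseconds, assumed unsigned */
	int64_t error;		/* NTP error, scaled by TOY_NTP_SCALE_SHIFT */
	uint32_t shift;		/* clocksource shift */
};

/*
 * If xtime_nsec reads as negative, push it forward to zero and add the
 * same amount to the error. Returns the amount that was added back.
 */
static int64_t toy_fix_underflow(struct toy_clock *c)
{
	assert(c->shift <= TOY_NTP_SCALE_SHIFT);

	if ((int64_t)c->xtime_nsec < 0) {
		int64_t neg = -(int64_t)c->xtime_nsec;

		c->xtime_nsec = 0;
		c->error += neg << (TOY_NTP_SCALE_SHIFT - c->shift);
		return neg;
	}
	return 0;
}
```

For example, with shift set to 22 and xtime_nsec at -800 (shifted
units), the function returns 800, leaves xtime_nsec at 0, and adds
800 << 10 = 819200 to the error.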

This patch fixes bug 11970, as tested by Yanmin Zhang
http://bugzilla.kernel.org/show_bug.cgi?id=11970

Reported-by: alex.shi-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org
Signed-off-by: John Stultz <johnstul-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Acked-by: "Zhang, Yanmin" <yanmin_zhang-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
Tested-by: "Zhang, Yanmin" <yanmin_zhang-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
Signed-off-by: Ingo Molnar <mingo-X9Un+BFzKDI@public.gmane.org>
---
 kernel/time/timekeeping.c |   22 ++++++++++++++++++++++
 1 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index e7acfb4..fa05e88 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -518,6 +518,28 @@ void update_wall_time(void)
 	/* correct the clock when NTP error is too big */
 	clocksource_adjust(offset);
 
+	/*
+	 * Since in the loop above, we accumulate any amount of time
+	 * in xtime_nsec over a second into xtime.tv_sec, it's possible for
+	 * xtime_nsec to be fairly small after the loop. Further, if we're
+	 * slightly speeding the clocksource up in clocksource_adjust(),
+	 * it's possible the required corrective factor to xtime_nsec could
+	 * cause it to underflow.
+	 *
+	 * Now, we cannot simply roll the accumulated second back, since
+	 * the NTP subsystem has been notified via second_overflow. So
+	 * instead we push xtime_nsec forward by the amount we underflowed,
+	 * and add that amount into the error.
+	 *
+	 * We'll correct this error next time through this function, when
+	 * xtime_nsec is not as small.
+	 */
+	if (unlikely((s64)clock->xtime_nsec < 0)) {
+		s64 neg = -(s64)clock->xtime_nsec;
+		clock->xtime_nsec = 0;
+		clock->error += neg << (NTP_SCALE_SHIFT - clock->shift);
+	}
+
 	/* store full nanoseconds into xtime after rounding it up and
 	 * add the remainder to the error difference.
 	 */
