Re: [RFC][PATCH] timekeeping: Keep xtime_nsec remainder separate from ntp_error

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: john stultz <johnstul@us.ibm.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Roman Zippel <zippel@linux-m68k.org>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [RFC][PATCH] timekeeping: Keep xtime_nsec remainder separate from ntp_error
Date: Wed, 08 Dec 2010 11:59:42 -0800	[thread overview]
Message-ID: <1291838382.2909.17.camel@work-vm> (raw)
In-Reply-To: <1291833054.5015.1133.camel@gandalf.stny.rr.com>

On Wed, 2010-12-08 at 13:30 -0500, Steven Rostedt wrote:
> While doing my end of year unlikely() cleanup, running the annotate
> branch profiler, I came across this:
> 
>  correct incorrect  %        Function                  File              Line
>  ------- ---------  -        --------                  ----              ----
>   122588 65653641  99 timekeeping_adjust             timekeeping.c        664
>   167493 14584927  98 timekeeping_adjust             timekeeping.c        658
> 
> This shows that the following likely()'s are wrong most of the time:
> 
> 	if (error > interval) {
> 		error >>= 2;
> 		if (likely(error <= interval))
> 			adj = 1;
> 		else
> 			adj = timekeeping_bigadjust(error, &interval, &offset);
> 	} else if (error < -interval) {
> 		error >>= 2;
> 		if (likely(error >= -interval)) {
> 
> Talking about this with John Stultz, we both agreed that the annotations
> should be correct and is not the issue, but something else is going
> wrong.
> 
> Adding in trace_printks(), I saw that the adj values that were added to
> the "mult" multiplier were sometimes quite large. The time intervals
> never got down into a small error, but instead was making large
> oscillations, both positive and negative to where it should be.
> 
> John noticed that if he removed the commit:
> 
> commit 5cd1c9c5cf30d4b33df3d3f74d8142f278d536b7
> timekeeping: fix rounding problem during clock update
> 
> that the problem would go away and we would get back into a tight
> oscillation that would stay within the fast path (and the likely()'s
> were again likely).
> 
> What the above commit did was to fix a bug that caused time to go
> backward a nanosec due to the truncating of the xtime_nsec shifted into
> the xtime.tv_nsec.  The fix for that bug (and what that commit did) was
> to always round up one. It added +1 to the xtime.tv_nsec after it did
> the conversion, and then took the difference between this shifted and
> the xtime_nsec and stored that into the ntp_error.
> 
> The ntp_error is used to control the frequency, and this constant adding
> of the shift remainder would cause the large oscillation.
> 
> This patch instead adds another field to the timekeeping structure that
> stores the remainder separately. On re-entry into update_wall_time(),
> the remainder is added back onto the xtime_nsec after it is set to the
> xtime.tv_nsec and restoring its original value.
> 
> This handles the rounding problem that the original commit addressed but
> does not cause the large oscillation that it caused.

Hey Steven!

Thanks for the great analysis and tooling to help find these unexpected
behaviors! 

Sadly, I believe your proposed change can still cause minor nsec
inconsistencies from gettimeofday/vgettimeofday. In fact, the previous
implementation where the nsec inconsistency error was observed preserved
the sub-nanosecond remainder in xtime_nsec.

I suspect we may need to still round up and store the error, but tweak
the adjustment code to handle the larger error per-iteration then it was
originally designed for (note: the current code is still functioning
properly, its just not often hitting the expected trivial case).

The only alternative would be to integrate the sub-ns remainder into the
gettime caclculation (including reworking all the vsyscall
implementations to utilize it as well).

thanks
-john

next prev parent reply	other threads:[~2010-12-08 19:59 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-12-08 18:30 [RFC][PATCH] timekeeping: Keep xtime_nsec remainder separate from ntp_error Steven Rostedt
2010-12-08 19:59 ` john stultz [this message]
2010-12-08 20:15   ` Steven Rostedt
2010-12-08 22:47     ` john stultz
2010-12-09  2:09       ` Steven Rostedt
2010-12-09  3:44         ` john stultz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1291838382.2909.17.camel@work-vm \
    --to=johnstul@us.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=zippel@linux-m68k.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox