From: John Stultz <john.stultz@linaro.org>
To: Miroslav Lichvar <mlichvar@redhat.com>
Cc: linux-kernel@vger.kernel.org, Prarit Bhargava <prarit@redhat.com>,
Richard Cochran <richardcochran@gmail.com>
Subject: Re: [PATCH RFC] timekeeping: Fix clock stability with nohz
Date: Mon, 02 Dec 2013 16:53:58 -0800 [thread overview]
Message-ID: <529D2BA6.5020005@linaro.org> (raw)
In-Reply-To: <20131120183900.GA25735@localhost>
On 11/20/2013 10:39 AM, Miroslav Lichvar wrote:
> On Mon, Nov 18, 2013 at 12:46:00PM -0800, John Stultz wrote:
>>> In a simulation with 1GHz TSC clock and 10Hz clock update the maximum
>>> error went down from 4.7 microseconds to 5.5 nanoseconds. With 1Hz
>>> update the maximum error went down from 480 microseconds to 55
>>> nanoseconds.
>> Curious what sort of a environment you're using for simulation (as well
>> as the real test below)?
> I compile the kernel timekeeping.c file into a test program where a
> fake TSC is provided to the kernel code and I can easily control the
> tick length and timekeeper updates. The program collects pairs of
> TSC/getnstimeofday() values and then it runs linear regression through
> the points to see the frequency offset and how stable the clock was.
>
> http://mlichvar.fedorapeople.org/tmp/tk_test.tar.gz
So.. this doesn't build for me.
timekeeping.o: In function `change_clocksource':
timekeeping.c:(.text+0x535): undefined reference to `lock_acquire'
timekeeping.c:(.text+0x588): undefined reference to `lock_release'
timekeeping.o: In function `__getnstimeofday':
timekeeping.c:(.text+0x675): undefined reference to `trace_hardirqs_off'
timekeeping.c:(.text+0x69b): undefined reference to `lock_acquire'
timekeeping.c:(.text+0x6b0): undefined reference to `lock_release'
timekeeping.c:(.text+0x6c2): undefined reference to `trace_hardirqs_on'
timekeeping.c:(.text+0x77c): undefined reference to `trace_hardirqs_off'
Do you need a special .config in your kernel source? Changing lockdep
and irq tracer configs don't seem to fix things for me.
>> But this is all very interesting! Thanks again for digging into this
>> issue and sending the patch!
> Thanks for looking into it. I agree this is a rather radical change
> and it needs to be done very carefully. At this point, I'd like to
> know if you think there are no fundamental problems in the design and
> whether it would be an acceptable replacement.
>
> A different approach to fix this problem would be controlling the
> maximum idle interval from the tick adjusting code so that the NTP
> error corrections can be accurate. That would further increase the
> complexity of the code and increase the interrupt rate.
So I'm still trying to break apart the larger change you've made into
smaller functional steps.
The main two differences I see are:
1) Calculating the mult value directly from the ntp_tick_length() value
(via the division) each update cycle.
2) Drastically simplifying the ntp_error correction to a 0/1 multiplier
adjustment (assuming the calculated mult will be slightly slow, due to
truncating the remainder in the division) based on the sign of the error.
(and I'm ignoring the previously mentioned accounting changes that
appear to be improperly dropped ;)
This makes me suspect the main issue is we're over-correcting in the
timekeeping_bigadjust() logic with nohz and I'm curious if instead we
could either limit timekeeping_bigadjust() adjustment to achieve the
same stability? In particular, bigadjust()'s exponential adjustment of
mult seems problematic.
Am I missing or glossing over anything else that is key to your changes?
thanks
-john
next prev parent reply other threads:[~2013-12-03 0:54 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-11-14 14:50 [PATCH RFC] timekeeping: Fix clock stability with nohz Miroslav Lichvar
2013-11-14 18:01 ` Rik van Riel
2013-11-16 7:03 ` Richard Cochran
2013-11-18 21:28 ` John Stultz
2013-11-19 14:13 ` Richard Cochran
2013-11-27 10:07 ` Richard Cochran
2013-11-21 10:12 ` Miroslav Lichvar
2013-11-18 20:46 ` John Stultz
2013-11-20 18:39 ` Miroslav Lichvar
2013-12-03 0:53 ` John Stultz [this message]
2013-12-03 4:03 ` John Stultz
2013-12-06 14:26 ` Miroslav Lichvar
2013-12-06 18:09 ` John Stultz
2013-12-06 18:37 ` Miroslav Lichvar
2013-12-07 1:43 ` John Stultz
2013-12-07 17:56 ` Richard Cochran
2013-12-07 22:16 ` John Stultz
2013-12-10 10:20 ` Miroslav Lichvar
2013-12-10 11:26 ` Miroslav Lichvar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=529D2BA6.5020005@linaro.org \
--to=john.stultz@linaro.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mlichvar@redhat.com \
--cc=prarit@redhat.com \
--cc=richardcochran@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).