From: Nicolai Stange <nicstange@gmail.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Nicolai Stange <nicstange@gmail.com>,
John Stultz <john.stultz@linaro.org>,
linux-kernel@vger.kernel.org
Subject: Re: [RFC v7 00/23] adapt clockevents frequencies to mono clock
Date: Mon, 26 Sep 2016 12:15:55 +0200
Message-ID: <87twd30xwk.fsf@gmail.com>
In-Reply-To: <87shsrlfte.fsf@gmail.com> (Nicolai Stange's message of "Fri, 23 Sep 2016 00:39:41 +0200")

Nicolai Stange <nicstange@gmail.com> writes:
> Thomas Gleixner <tglx@linutronix.de> writes:
>
>> On Wed, 21 Sep 2016, Nicolai Stange wrote:
>>> Thomas Gleixner <tglx@linutronix.de> writes:
>>>
>>> > On Wed, 21 Sep 2016, Nicolai Stange wrote:
>>> >> Thomas Gleixner <tglx@linutronix.de> writes:
>>> >> > Have you ever measured the overhead of the extra work which has to be done
>>> >> > in clockevents_adjust_all_freqs() ?
>>> >>
>>> >> Not exactly, I had a look at its invocation frequency which seems to
>>> >> decay exponentially with uptime, presumably because the NTP error
>>> >> approaches zero.
>>> >>
>>> >> However, I've just gathered a function_graph ftrace on my Intel
>>> >> i7-4800MQ (Haswell, 8HTs):
>>> >>
>>> >> # TIME        CPU  DURATION                  FUNCTION CALLS
>>> >> #  |          |     |   |                     |   |   |   |
>>> >>   85.287027 |   0)   0.899 us    |  clockevents_adjust_all_freqs();
>>> >>   85.288026 |   0)   0.759 us    |  clockevents_adjust_all_freqs();
>>> >>   85.289026 |   0)   0.735 us    |  clockevents_adjust_all_freqs();
>>> >>   85.290026 |   0)   0.671 us    |  clockevents_adjust_all_freqs();
>>> >>  149.503656 |   2)   2.477 us    |  clockevents_adjust_all_freqs();
>>> >
>>> > That's not that bad. Though I'd like to see numbers for ARM (especially the
>>> > less powerful SoCs) as well.
>>>
>>> On a Raspberry Pi 2B (bcm2836, ARMv7) with CONFIG_SMP=y, the mean over
>>> ~5300 samples is 5.14+/-1.04us with a max of 11.15us.
>>
>> So why is the variance that high?
>
> I think this is because the histogram has got two peaks, cf. [1]
>
> Also, the 11us maximum is not isolated but a flat tail is reaching to
> this point which I admittedly can't explain.

It turned out that the linux-next kernel always ran the RPi2B at what
apparently is its minimum speed.

lmbench3's mhz gave me 560MHz and lat_mem_rd reported a memory latency of
120ns on linux-next.

Compare this to an "official" kernel from the Raspberry Pi Foundation
obtained from [2]: mhz says that the CPU runs at 900MHz and, according to
lat_mem_rd, the memory latency is 50ns.

The high memory latency in particular killed my benchmark: both the second
peak and the long tail stemmed from cache misses.

In order to verify this, I separated the tracing data from linux-next
into those samples that do not have any other calls to
clockevents_adjust_all_freqs() within a time span of 100ms before them
("first of run") and those that do ("not first of run"). The result can
be seen at [3]: the second peak as well as the long tail are generated
exclusively by the "first of run" samples.
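
The split described above can be sketched in a few lines of Python. This
is only an illustration, not the script actually used for [3]: the
function name split_first_of_run() and the (timestamp, duration) tuple
layout are assumptions, and the sample values are simply the five Haswell
lines from the trace quoted earlier.

```python
def split_first_of_run(samples, gap=0.100):
    """Split samples into "first of run" and "not first of run".

    samples: (timestamp_s, duration_us) tuples sorted by timestamp.
    gap:     a sample is "first of run" if no other call happened
             within this many seconds before it (100ms by default).
    """
    first, rest = [], []
    prev_ts = None
    for ts, dur in samples:
        if prev_ts is None or ts - prev_ts > gap:
            first.append((ts, dur))
        else:
            rest.append((ts, dur))
        prev_ts = ts
    return first, rest


# Timestamps (s) and durations (us) copied from the Haswell trace above.
samples = [
    (85.287027, 0.899),
    (85.288026, 0.759),
    (85.289026, 0.735),
    (85.290026, 0.671),
    (149.503656, 2.477),
]
first, rest = split_first_of_run(samples)
print([d for _, d in first])  # [0.899, 2.477]
print([d for _, d in rest])   # [0.759, 0.735, 0.671]
```

Each group can then be fed into its own histogram. Comparing against only
the immediately preceding call is sufficient here because the samples are
sorted by time.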

Unfortunately, I was not able to get this RPi2B running at its full
capabilities with linux-next. So I applied this series on top of the RPi
Foundation's kernel and did further benchmarking there. The results can
be found at [4]: no second peak, no particularly long tail.

Some statistics (durations in us):

       0%    25%    50%    75%   100%
    1.250  1.511  1.667  1.927  7.031

    Mean: 1.89
    sd:   0.69
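
For reference, a summary of this shape (quantiles, mean, standard
deviation) can be computed along the following lines. This is a sketch
under assumptions: which tool produced the numbers above is not stated,
summarize() is a hypothetical helper, and the input is the five Haswell
durations from the quoted trace, not the RPi2B data set.

```python
import statistics


def summarize(durations_us):
    """Return ([0%, 25%, 50%, 75%, 100%] quantiles, mean, sample sd)."""
    data = sorted(durations_us)
    # method="inclusive" interpolates between the min and max of the data.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    quantiles = [data[0], q1, q2, q3, data[-1]]
    return quantiles, statistics.mean(data), statistics.stdev(data)


# Durations (us) from the Haswell trace quoted earlier, for illustration only.
quantiles, mean, sd = summarize([0.899, 0.759, 0.735, 0.671, 2.477])
print(quantiles)  # [0.671, 0.735, 0.759, 0.899, 2.477]
print(f"Mean: {mean:.2f}")
print(f"sd: {sd:.2f}")
```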

Much better IMHO. Good enough?

A random note: during tracing, I noticed that the adjustment should
skip CLOCK_EVT_FEAT_DUMMY devices. v8 will do this. Both
measurements already include that change.

>> You have an outlier on that intel as well which might be caused by
>> NMI, but it might also be a systematic issue depending on the input
>> parameters.
>
> AFAICT, the "algorithmic" runtime should be constant per CED, so it
> should not be dependent on any input parameters.

Well, this is not exactly true: __do_div64() on ARM is implemented in
software. Basically, this algorithm's runtime depends on the position of
the dividend's MSB. However, the range of the "adj" dividend as given by
__clockevents_calc_adjust_freq() should be relatively narrow.

I traced __do_div64() and there haven't been any apparent abnormalities.
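
To illustrate why a software division's runtime follows the position of
the dividend's MSB, here is a generic textbook shift-and-subtract divider
in Python. It is explicitly NOT the ARM __do_div64() implementation, and
shift_subtract_div() is a made-up name; the sketch only shows that the
number of loop iterations equals the dividend's bit length.

```python
def shift_subtract_div(dividend, divisor):
    """Restoring (shift-and-subtract) division of non-negative integers.

    Returns (quotient, remainder, steps), where steps counts the loop
    iterations: one per bit from the dividend's MSB down to bit 0.
    """
    if dividend < 0 or divisor <= 0:
        raise ValueError("need dividend >= 0 and divisor > 0")
    quotient = remainder = steps = 0
    for bit in range(dividend.bit_length() - 1, -1, -1):
        # Bring down the next dividend bit, then try to subtract.
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        quotient <<= 1
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
        steps += 1
    return quotient, remainder, steps


print(shift_subtract_div(1000, 7))          # (142, 6, 10)
print(shift_subtract_div(1 << 40, 7)[2])    # 41
```

With dividends confined to a narrow range, the MSB position and hence the
iteration count barely vary, which matches the expectation stated above.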
>> 11 us on that ARM worries me.

This is 7us now. Also, this max value isn't nearly as connected to the
rest of the histogram as that 11us sample before. So it *might* be an
outlier now. I can't tell for sure though.

Thanks,

Nicolai

> [1] https://nicst.de/cev-freqadjust/adjust_all_freqs-function_graph_hist.png
[2] https://github.com/raspberrypi/linux
[3] https://nicst.de/cev-freqadjust/hist-adjust-smp.pdf
[4] https://nicst.de/cev-freqadjust/hist-adjust-official-smp.pdf