From: Philippe Gerum <rpm@xenomai.org>
To: Jan Kiszka <jan.kiszka@domain.hid>
Cc: xenomai-core <xenomai@xenomai.org>
Subject: Re: [Xenomai-core] ns vs. tsc as internal timer base
Date: Tue, 13 Jun 2006 15:51:57 +0200
Message-ID: <448EC2FD.8050805@domain.hid>
In-Reply-To: <448EBE8C.60900@domain.hid>
Jan Kiszka wrote:
> Philippe Gerum wrote:
>
>>Jan Kiszka wrote:
>>
>>>Philippe Gerum wrote:
>>>
>>>>from i386/kernel/timers/timer_tsc.c. And indeed, I had x 20 performance
>>>>improvements in some cases.
>>>
>>>Oops, that sounds like a bit too extreme optimisations. Is the original
>>>version varying that much? I didn't observe this.
>>>
>>>Here is my current version, BTW:
>>>
>>>long tsc_scale;
>>>unsigned int tsc_shift = 31;
>>>
>>>static inline long long fast_tsc_to_ns(long long ts)
>>>{
>>> long long ret;
>>>
>>> __asm__ (
>>> /* HI = HIWORD(ts) * tsc_scale */
>>> "mov %%eax,%%ebx\n\t"
>>> "mov %%edx,%%eax\n\t"
>>> "imull %2\n\t"
>>> "mov %%eax,%%esi\n\t"
>>> "mov %%edx,%%edi\n\t"
>>>
>>> /* LO = LOWORD(ts) * tsc_scale */
>>> "mov %%ebx,%%eax\n\t"
>>> "mull %2\n\t"
>>>
>>> /* ret = (HI << 32) + LO */
>>> "add %%esi,%%edx\n\t"
>>> "adc $0,%%edi\n\t"
>>>
>>> /* ret = ret >> tsc_shift */
>>> "shrd %%cl,%%edx,%%eax\n\t"
>>> "shrd %%cl,%%edi,%%edx\n\t"
>>> : "=A"(ret)
>>> : "A" (ts), "m" (tsc_scale), "c" (tsc_shift)
>>> : "ebx", "esi", "edi");
>>>
>>> return ret;
>>>}
>>>
>>>void init_tsc(unsigned long cpu_freq)
>>>{
>>> unsigned long long scale;
>>>
>>> while (1) {
>>> scale = 1000000000ULL << tsc_shift;
>>> do_div(scale, cpu_freq); /* divides scale in place, returns the remainder */
>>> if (scale <= 0x7FFFFFFF)
>>> break;
>>> tsc_shift--;
>>> }
>>> tsc_scale = scale;
>>>}
>>>
>>>This version will use 31 (GHz cpu_freq) to 26 (~32 MHz) shifts, i.e. a
>>>bit more than the Linux kernel's 22 bits.
>>>
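For readers without an i386 toolchain at hand, the scale/shift scheme quoted above can be sketched in portable C. This is only an illustration under stated assumptions: struct tsc_conv, tsc_conv_init() and tsc_to_ns() are hypothetical names, not Xenomai or Linux identifiers, and plain 64-bit arithmetic stands in for the inline assembly and do_div().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the conversion constants (not a Xenomai type). */
struct tsc_conv {
    uint32_t scale;   /* (1e9 << shift) / cpu_freq_hz, kept within 31 bits */
    unsigned shift;   /* number of fractional bits in scale */
};

/* Start at 31 fractional bits and back off until the scale fits in 31 bits. */
static struct tsc_conv tsc_conv_init(uint64_t cpu_freq_hz)
{
    struct tsc_conv c = { 0, 31 };
    uint64_t scale;

    assert(cpu_freq_hz != 0);
    for (;;) {
        scale = (UINT64_C(1000000000) << c.shift) / cpu_freq_hz;
        if (scale <= 0x7FFFFFFF)
            break;
        c.shift--;
    }
    c.scale = (uint32_t)scale;
    return c;
}

/* ns = (tsc * scale) >> shift, using two 32x32->64 partial products. */
static uint64_t tsc_to_ns(struct tsc_conv c, uint64_t tsc)
{
    uint64_t lo = (tsc & 0xFFFFFFFFu) * c.scale;  /* low word  * scale */
    uint64_t hi = (tsc >> 32) * c.scale;          /* high word * scale */

    /* The full product is (hi << 32) + lo; shift is at most 31, so the
     * high part is shifted left by (32 - shift) and the low part right
     * by shift. Bits beyond 64 in the result are dropped. */
    return (hi << (32 - c.shift)) + (lo >> c.shift);
}
```

With a 2 GHz clock this picks shift 31, and an input of 32 MHz picks shift 26, which matches the range given above. An exact 1 GHz input lands on shift 30 because the scale must stay below 2^31.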
>>
>>Here is likely why we have different levels of accuracy and performance:
>>firstly, my version is bluntly based on the kHz freq; secondly, it
>>calculates the other way around, i.e. ns2tsc, so that TSCs are kept in
>>the inner code, but more efficiently converted from ns counts passed to
>>the outer interface:
>>
>>static unsigned long ns2cyc_scale;
>>#define NS2CYC_SCALE_FACTOR 10 /* 2^10, carefully chosen */
>
>
> Linux only uses 10 bits for scheduling time calculation, which is
> tick-based (low-res) anyway.
This code is rather used to compute TSC offsets within a tick, so the
max operand is short, bounded and known by design. Hence the scale
factor, AFAICS.
> The tsc clock_source uses 22 bits. The
> latter overflows after an hour or so, because they drop all bits > 64
> after the multiplication - insignificantly faster when using optimised
> code anyway.
>
This optimization path is about computing reasonably short delays, so
roll-over and precision would not be key factors.
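The "overflows after an hour or so" figure can be checked with a little arithmetic. The sketch below assumes a multiplier of the form (1000000 << shift) / cpu_khz and a product truncated to 64 bits; rollover_seconds() is a hypothetical helper written for this illustration, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds of elapsed cycles before cycles * mult stops fitting in 64 bits,
 * assuming mult = (1000000 << shift) / cpu_khz. */
static uint64_t rollover_seconds(uint32_t cpu_khz, unsigned shift)
{
    uint64_t mult, max_cycles;

    assert(cpu_khz != 0 && shift < 32);
    mult = (UINT64_C(1000000) << shift) / cpu_khz;
    assert(mult != 0);

    /* Largest cycle count whose product with mult still fits in 64 bits. */
    max_cycles = UINT64_MAX / mult;

    /* Convert cycles to whole seconds at the given clock rate. */
    return max_cycles / ((uint64_t)cpu_khz * 1000);
}
```

Under these assumptions, a 1 GHz clock with 22 fractional bits rolls over after about 4398 seconds (roughly 73 minutes), while 10 fractional bits push the limit out to about 18 million seconds. Neither matters when the operand is a short delay.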
>
>>static inline void set_ns2cyc_scale(unsigned long cpu_khz)
>>{
>> ns2cyc_scale = (cpu_khz << NS2CYC_SCALE_FACTOR) / 1000000;
>>}
>>
>>static inline unsigned long long ns_2_cycles(unsigned long long ns)
>>{
>> return ns * ns2cyc_scale >> NS2CYC_SCALE_FACTOR;
>>}
>>
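To give a feel for the precision of a 10-bit scale factor, here is a small sketch that compares the scaled conversion against a plain multiply-and-divide. The helper names are hypothetical, and 64-bit types are used throughout. With a 32-bit unsigned long, the shift in set_ns2cyc_scale() would wrap once cpu_khz reaches 2^22, i.e. about 4.19 GHz.

```c
#include <assert.h>
#include <stdint.h>

#define NS2CYC_SCALE_FACTOR 10  /* same value as in the quoted code */

/* Cycles per ns as a fixed-point value with 10 fractional bits. */
static uint64_t ns2cyc_scale_for(uint32_t cpu_khz)
{
    uint64_t scale = ((uint64_t)cpu_khz << NS2CYC_SCALE_FACTOR) / 1000000;

    assert(scale != 0);  /* truncates to 0 below roughly 977 kHz */
    return scale;
}

/* Fast path: one multiply and one shift. */
static uint64_t ns_to_cycles_scaled(uint64_t ns, uint32_t cpu_khz)
{
    return (ns * ns2cyc_scale_for(cpu_khz)) >> NS2CYC_SCALE_FACTOR;
}

/* Reference path: multiply and divide, for comparison only. */
static uint64_t ns_to_cycles_exact(uint64_t ns, uint32_t cpu_khz)
{
    return ns * cpu_khz / 1000000;
}
```

At 2.4 GHz the scale truncates from 2457.6 to 2457, so a 1 ms delay converts to 2399414 cycles instead of 2400000, an error of about 0.02%. At exactly 1 GHz the scale is 1024 and the conversion is exact.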
>>
>>>>TSC are not the whole nucleus time base, but only the timer management
>>>>one. The motivation to use TSCs in nucleus/timer.c was to pick a unit
>>>>which would not require any conversion beyond the initial one in
>>>>xntimer_start.
>>>
>>>
>>>That helps strictly periodic application timers, not aperiodic ones like
>>>timeouts.
>>>
>>
>>It depends, periodic timers usually exhibit larger delays, so the gain
>>is more significant with oneshot timings incurring smaller delays, hence
>>a higher number of calculations.
>>
>>
>>>>>Any pitfalls down the road (except introducing regressions)?
>>>>
>>>>Well, pitfalls expected from changing the core idea of time of the timer
>>>>management code... :o>
>>>>
>>>You mean turning
>>>
>>>rthal_timer_program_shot(rthal_imuldiv(delay,RTHAL_TIMER_FREQ,RTHAL_CPU_FREQ));
>>>
>>>
>>>into
>>>
>>>rthal_timer_program_shot(rthal_imuldiv(delay,RTHAL_TIMER_FREQ,1000000000));
>>>
>>>
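The two calls above differ only in the divisor: the CPU frequency when the delay is held in TSC cycles, 1000000000 when it is held in nanoseconds. A minimal sketch of such a multiply-then-divide helper follows. imuldiv_sketch() is a hypothetical stand-in and not the actual rthal_imuldiv() implementation, and the 1193182 Hz timer frequency used in the usage note is an assumption (the classic x86 PIT rate).

```c
#include <assert.h>
#include <stdint.h>

/* value * mult / div, with a 64-bit intermediate so the product of two
 * 32-bit operands cannot overflow. The quotient is assumed to fit in
 * 32 bits, as it does for short timer delays. */
static uint32_t imuldiv_sketch(uint32_t value, uint32_t mult, uint32_t div)
{
    assert(div != 0);
    return (uint32_t)(((uint64_t)value * mult) / div);
}
```

For a 1 ms delay, imuldiv_sketch(1000000, 1193182, 1000000000) and imuldiv_sketch(2400000, 1193182, 2400000000u) both yield 1193 timer ticks, the second one assuming a 2.4 GHz TSC.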
>>
>>Not really, it was a general remark about changing code that might
>>have some assumptions on using TSCs. Additionally, only x86 needs to
>>rescale TSC values to the timer frequency, other archs use the same unit
>>on both sides, and such unit might even have nothing to do with any CPU
>>accounting (e.g. blackfin uses a free running timer, ppc uses the
>>internal timebase, etc).
>
>
> Ok, an interesting aspect I already assumed but didn't check in details
> yet. That makes dealing with TSCs interesting again on != x86. In
> contrast, on x86, there is the aspect of frequency scaling that Anders
> brought up and which would speak pro nanos.
>
>
>>This said, it should not have that many assumptions, and in any case,
>>they should be confined to nucleus/timer.c. I think we should give this
>>kind of optimization a try.
>>
>
>
> Yep, it just needs some more brain cycles on how to do this precisely.
>
> Jan
>
--
Philippe.