From: Jan Kiszka <jan.kiszka@domain.hid>
To: Philippe Gerum <rpm@xenomai.org>
Cc: xenomai-core <xenomai@xenomai.org>
Subject: Re: [Xenomai-core] ns vs. tsc as internal timer base
Date: Tue, 13 Jun 2006 13:56:39 +0200
Message-ID: <448EA7F7.5000802@domain.hid>
In-Reply-To: <448E9E8B.70809@domain.hid>
Philippe Gerum wrote:
> Jan Kiszka wrote:
>> Hi,
>>
>> between some football half-times of the last days ;), I played a bit
>> with a hand-optimised xnarch_tsc_to_ns() for x86. Using scaled math, I
>> achieved conversions 3 (P-I 133 MHz) to 4 times (P-M 1.3 GHz) faster
>> than with the current variant. While this optimisation only saves a
>> few tens of nanoseconds on high-end processors, slow ones can gain
>> several hundred nanoseconds per conversion (my P-133: -600 ns).
>>
>
> I did exactly the same a few weeks ago, based on Anzinger's scaled math
> from i386/kernel/timers/timer_tsc.c. And indeed, I had 20x performance
> improvements in some cases.

:) We should coordinate better.

Oops, that sounds like a rather extreme optimisation. Does the original
version vary that much? I didn't observe this.

Here is my current version, BTW:
long tsc_scale;
unsigned int tsc_shift = 31;

static inline long long fast_tsc_to_ns(long long ts)
{
	long long ret;

	__asm__ (
		/* HI = HIWORD(ts) * tsc_scale */
		"mov %%eax,%%ebx\n\t"
		"mov %%edx,%%eax\n\t"
		"imull %2\n\t"
		"mov %%eax,%%esi\n\t"
		"mov %%edx,%%edi\n\t"

		/* LO = LOWORD(ts) * tsc_scale */
		"mov %%ebx,%%eax\n\t"
		"mull %2\n\t"

		/* ret = (HI << 32) + LO */
		"add %%esi,%%edx\n\t"
		"adc $0,%%edi\n\t"

		/* ret = ret >> tsc_shift */
		"shrd %%cl,%%edx,%%eax\n\t"
		"shrd %%cl,%%edi,%%edx\n\t"
		: "=A"(ret)
		: "A" (ts), "m" (tsc_scale), "c" (tsc_shift)
		: "ebx", "esi", "edi");

	return ret;
}
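void init_tsc(unsigned long cpu_freq)
{
	unsigned long long scale;

	/* Lower the shift until the multiplier fits into 31 bits. */
	while (1) {
		scale = 1000000000ULL << tsc_shift;
		/* do_div() divides in place and returns the remainder. */
		do_div(scale, cpu_freq);
		if (scale <= 0x7FFFFFFF)
			break;
		tsc_shift--;
	}

	tsc_scale = scale;
}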

This version uses shift values from 31 (GHz-range cpu_freq) down to 26
(~32 MHz), i.e. a few more bits than the Linux kernel's 22.
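
For readers without an x86 assembler at hand, the same multiply-and-shift
idea can be sketched in portable C. This is only an illustration under
stated assumptions: the names scale_mult, scale_shift, scaled_init and
scaled_tsc_to_ns are hypothetical, and the tick count is treated as
unsigned, which differs from the signed imull used above.

```c
#include <stdint.h>

/* Hypothetical names, standing in for tsc_scale and tsc_shift. */
static uint32_t scale_mult;
static unsigned int scale_shift;

/* Pick the largest shift (starting at 31) whose multiplier fits 31 bits. */
static void scaled_init(uint32_t cpu_freq_hz)
{
    uint64_t scale;

    scale_shift = 31;
    for (;;) {
        scale = (UINT64_C(1000000000) << scale_shift) / cpu_freq_hz;
        if (scale <= 0x7FFFFFFF)
            break;
        scale_shift--;
    }
    scale_mult = (uint32_t)scale;
}

/* ns = (tsc * scale_mult) >> scale_shift, split into two 32x32 products. */
static uint64_t scaled_tsc_to_ns(uint64_t tsc)
{
    uint64_t hi = (tsc >> 32) * scale_mult;         /* HIWORD(tsc) * scale */
    uint64_t lo = (tsc & 0xFFFFFFFFu) * scale_mult; /* LOWORD(tsc) * scale */

    /*
     * ((hi << 32) + lo) >> shift. Since shift <= 31, the high part can be
     * shifted separately without losing bits; results beyond 64 bits wrap.
     */
    return (hi << (32 - scale_shift)) + (lo >> scale_shift);
}
```

Because the multiplier is rounded down, the result can be slightly smaller
than the exactly divided value, never larger.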
>
>> This does not come for free: accuracy of very large values is slightly
>> worse, but that's likely negligible compared to the clock accuracy of
>> TSCs (does anyone have any real numbers on the latter, BTW?).
>>
>
> We do start losing significant precision for 2 ms delays and above,
> IIRC. This could be an issue for some events in aperiodic mode, albeit
> we could use a plain divide for those. The cost of conditionally doing
> this remains to be evaluated though.
Maybe I tested (not calculated - math is too hard for me :o)) the wrong
values, but I didn't see such high regressions.
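
One way to put numbers on this is to compare the truncating scaled result
with plain 64-bit division for a given delay. The helper below is a
hypothetical sketch, not code from either implementation. It assumes tick
counts below 2^32 and a shift for which the multiplier still fits into 31
bits, as selected by init_tsc above.

```c
#include <stdint.h>

/*
 * Hypothetical helper: absolute error in ns of a truncating
 * multiply-and-shift conversion compared with exact division.
 * Assumes ticks < 2^32 and a multiplier below 2^31, so that both
 * 64-bit products stay in range.
 */
static uint64_t scaled_error_ns(uint64_t ticks, uint32_t cpu_freq_hz,
                                unsigned int shift)
{
    uint64_t mult = (UINT64_C(1000000000) << shift) / cpu_freq_hz;
    uint64_t exact = ticks * UINT64_C(1000000000) / cpu_freq_hz;
    uint64_t approx = (ticks * mult) >> shift;

    /* mult is rounded down, so approx never exceeds exact. */
    return exact - approx;
}
```

For example, scaled_error_ns(2600000, 1300000000, 31) evaluates the error
of a 2 ms delay on a 1.3 GHz TSC. As the multiplier is off by less than
2^-shift, the error stays below ticks / 2^shift + 1 ns, so it grows with
the delay and shrinks with every additional shift bit.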
>
>> As we lose some bits in one direction, converting back still requires
>> "real" division (i.e. the use of the existing, slower xnarch_ns_to_tsc).
>> Otherwise, we would get significant errors already for small intervals.
>>
>> To avoid losing the optimisation again in ns_to_tsc, I thought about
>> basing the whole internal timer arithmetic on nanoseconds instead of
>> TSCs as it is now. Although I dug quite a lot into the current timer
>> subsystem over the last weeks, I may still overlook aspects and I'm
>> x86-biased. Therefore my question before thinking or even patching
>> further this way: What was the motivation to choose TSCs as internal
>> time base?
>
> TSCs are not the whole nucleus time base, but only the timer management
> one. The motivation to use TSCs in nucleus/timer.c was to pick a unit
> which would not require any conversion beyond the initial one in
> xntimer_start.
That helps strictly periodic application timers, not aperiodic ones like
timeouts.
>
>> Any pitfalls down the road (except introducing regressions)?
>
> Well, pitfalls expected from changing the core idea of time of the timer
> management code... :o>
>
You mean turning

    rthal_timer_program_shot(rthal_imuldiv(delay,RTHAL_TIMER_FREQ,RTHAL_CPU_FREQ));

into

    rthal_timer_program_shot(rthal_imuldiv(delay,RTHAL_TIMER_FREQ,1000000000));

e.g.?
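
To make the difference between the two calls concrete, here is a minimal
sketch of a multiply-then-divide helper with a 64-bit intermediate. The
name imuldiv_sketch and both example frequencies (a 1.3 GHz TSC and a
1.193182 MHz timer) are assumptions for illustration and are not taken
from the HAL sources.

```c
#include <stdint.h>

/* Assumed example frequencies, for illustration only. */
#define EXAMPLE_CPU_FREQ   1300000000 /* TSC ticks per second */
#define EXAMPLE_TIMER_FREQ 1193182    /* timer ticks per second */

/* Hypothetical helper: (i * mult) / div with a 64-bit intermediate. */
static int imuldiv_sketch(int i, int mult, int div)
{
    return (int)(((int64_t)i * mult) / div);
}

/* Delay kept in TSC ticks: scale by timer frequency over CPU frequency. */
static int shot_from_tsc(int delay_tsc)
{
    return imuldiv_sketch(delay_tsc, EXAMPLE_TIMER_FREQ, EXAMPLE_CPU_FREQ);
}

/* Delay kept in nanoseconds: scale by timer frequency over 10^9. */
static int shot_from_ns(int delay_ns)
{
    return imuldiv_sketch(delay_ns, EXAMPLE_TIMER_FREQ, 1000000000);
}
```

With these example frequencies, a 1 ms delay yields the same timer count
whether it is passed as 1300000 TSC ticks or as 1000000 ns; only the
divisor changes.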
Jan