All of lore.kernel.org
 help / color / mirror / Atom feed
* [Xenomai] rtdm_clock_read overhead/accuracy question
@ 2015-11-18 13:34 JK.Behnke
  2015-11-18 13:52 ` Gilles Chanteperdrix
  0 siblings, 1 reply; 14+ messages in thread
From: JK.Behnke @ 2015-11-18 13:34 UTC (permalink / raw)
  To: xenomai

   Hello,

   I'm currently debugging a problem in my xenomai application.
   In order ot get timing information I do the following inside the
   read-handler of my rtdm driver.

   // --- start code snippet ---------------------------------
   nanosecs_abs_t  t1, t2;
   rtdm_lock_get_irqsave(&lockobj, lockctx);

   t1=rtdm_clock_read();
   rtdm_task_busy_sleep(10000); // code block to be measured.
   t2=rtdm_clock_read();

   rtdm_lock_put_irqrestore(&lockobj, lockctx);
   // --- end code snippet ---------------------------------

   I observe the following time differences  t2-t1:
   min=10176 ns,  max=10737 ns, avg=10314 ns

   This means in this case the average overhead of the two rtdm_clock_read
   calls is
   approximately 314 ns/2 = 157 ns.

   Is that realistic?
   Are there any other functions I might use to measure execution time of
   an arbitrary code block?

   I'm running xenomai 2.6.3 on linux kernel 3.8.13.
   Hardware platform is Intel Atom E640 (1.0 GHz)

   Thanks in advance
   Jochen

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: [Xenomai] rtdm_clock_read overhead/accuracy question
@ 2015-11-19  8:09 JK.Behnke
  2015-11-19  8:55 ` Gilles Chanteperdrix
  0 siblings, 1 reply; 14+ messages in thread
From: JK.Behnke @ 2015-11-19  8:09 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai

>On Wed, Nov 18, 2015 at 06:11:00PM +0100, JK.Behnke at web.de wrote:
>> Hello Gilles,
>> 
>> >On Wed, Nov 18, 2015 at 03:36:25PM +0100, JK.Behnke at web.de wrote:
>> >> Hello Gilles,
>> >>
>> >> thanks for your quick response.
>> >> See my comments/questions below.
>> >>
>> >> >On Wed, Nov 18, 2015 at 02:34:56PM +0100, JK.Behnke at web.de wrote:
>> >> >> Hello,
>> >> >>
>> >> >> I'm currently debugging a problem in my xenomai application.
>> >> >> In order ot get timing information I do the following inside the
>> >> >> read-handler of my rtdm driver.
>> >> >>
>> >> >> // --- start code snippet ---------------------------------
>> >> >> nanosecs_abs_t t1, t2;
>> >> >> rtdm_lock_get_irqsave(&lockobj, lockctx);
>> >> >>
>> >> >> t1=rtdm_clock_read();
>> >> >> rtdm_task_busy_sleep(10000); // code block to be measured.
>> >> >> t2=rtdm_clock_read();
>> >> >>
>> >> >> rtdm_lock_put_irqrestore(&lockobj, lockctx);
>> >> >> // --- end code snippet ---------------------------------
>> >> >>
>> >> >> I observe the following time differences t2-t1:
>> >> >> min=10176 ns, max=10737 ns, avg=10314 ns
>> >> >>
>> >> >> This means in this case the average overhead of the two rtdm_clock_read
>> >> >> calls is
>> >> >> approximately 314 ns/2 = 157 ns.
>> >> >
>> >> >No, this means that the average overhead of rtdm_clock_read is
>> >> >314ns.
>> >> >
>> >> >what you do is:
>> >> >read time
>> >> >convert to ns
>> >> >busy sleep
>> >> >read time
>> >> >convert to ns
>> >> >
>> >> >As you can see, between the two time reads, there is only one
>> >> >conversion to nanoseconds, (which we assume is the bulk of the
>> >> >overhead). And more generally, the overhead of the second time read
>> >> >is not counted either.
>> >> >
>> >> >
>> >> >>
>> >> >> Is that realistic?
>> >> >
>> >> >This seems to be a lot. The "tsc" test will measure the average
>> >> >difference between two time reads, and the "arith" test will measure
>> >> >the average time for a "multiply then shift" operation, adding the
>> >> >two should give you an idea of the overhead. From what I recall,
>> >> >reading the tsc on an atom N230 (a very old one) had an overhead
>> >> >around 20ns.
>> >> Where do I find the "tsc test", which test application do I have to run?
>> >
>> >The application is named "tsc".
>> 
>> Thanks, meanwhile I found the tsc tool under
>> /usr/xenomai/bin/regression
>
>Yeah, well, not a very good default directory. You can onverride it
>with an option from the configure script.
>
>> 
>> Here are my "tsc" results
>> Atom E640 (1.0 GHz): min: 26, max: 936, avg: 34.0001 -> 0.0262211 us  ==> 26 ns
>> Atom E660 (1.3 GHz): min: 26, max: 1105, avg: 34.0001 -> 0.0262211 us ==> 26 ns
>
>Well, there is a problem with the time conversion routines. 34
>cycles on a 1.0 GHz machine should be 34ns, not 26ns. Other than
>that, it seems consistent with what I observe on my old atom.

What can I do about that?

By the way, I measured the difference of two consecutive xnarch_get_cpu_tsc() calls.
Atom E640 (1.0 GHz): min=30 max=40 avg=30.2

As in my case I have disabled interrupts using rtdm_lock_get_irqsave() during
execution of xnarch_get_cpu_tsc(), I wonder why I get different min/max values at all?
I assume that xnarch_get_cpu_tsc() is compiled to a single RDTSC instruction.
Doesn't a RDTSC machine instruction have a deterministic number of cpu cycles?
Aah, obviously this has something to do with so called serializing instructions inside the cpu.
Some people recommend using the new RDTSCP instruction to measure execution times.
Is it for performance reasons that xnarch_get_cpu_tsc uses RDTSC instruction?


Regards
Jochen


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2015-11-20  6:44 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-11-18 13:34 [Xenomai] rtdm_clock_read overhead/accuracy question JK.Behnke
2015-11-18 13:52 ` Gilles Chanteperdrix
2015-11-18 14:36   ` JK.Behnke
2015-11-18 16:24     ` Gilles Chanteperdrix
2015-11-18 17:11       ` JK.Behnke
2015-11-18 17:57         ` Gilles Chanteperdrix
2015-11-18 14:38   ` Jeroen Van den Keybus
2015-11-18 16:03     ` JK.Behnke
2015-11-18 16:22     ` Gilles Chanteperdrix
  -- strict thread matches above, loose matches on Subject: below --
2015-11-19  8:09 JK.Behnke
2015-11-19  8:55 ` Gilles Chanteperdrix
2015-11-19 10:34   ` JK.Behnke
2015-11-19 20:21     ` Gilles Chanteperdrix
2015-11-20  6:44       ` JK.Behnke

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.