All of lore.kernel.org
 help / color / mirror / Atom feed
* [Xenomai-help] Xenomai degrades syscall time on PPC405
@ 2007-08-20 12:11 Dirk Eibach
  2007-08-23 20:45 ` Philippe Gerum
  0 siblings, 1 reply; 4+ messages in thread
From: Dirk Eibach @ 2007-08-20 12:11 UTC (permalink / raw)
  To: Xenomai help

Hello,

I am trying to understand some performance problems I ran into on my 
xenomai system.

The testcode I attached performs a syscall(gettimeofday). Measuring time on 
a vanilla Kernel gives the following output:

ticks for ppc_getccounter: 116, for gettimeofday: 1479

The same kernel with xenomai patched in gives this:

ticks for ppc_getccounter: 116, for gettimeofday: 4278

Is this the expected behaviour? Why is the time for a system call nearly 3 
times as much?

#include <stdio.h>
#include <sys/time.h>
#include <time.h>

static void
ppc_getcounter(unsigned long long *v)
{
	register unsigned long tbu, tb, tbu2;

     loop:
	asm volatile ("mftbu %0" : "=r" (tbu) );
	asm volatile ("mftb  %0" : "=r" (tb)  );
	asm volatile ("mftbu %0" : "=r" (tbu2));
	if (__builtin_expect(tbu != tbu2, 0)) goto loop;

	/* The slightly peculiar way of writing the next lines is
	compiled better by GCC than any other way I tried. */
	((long*)(v))[0] = tbu;
	((long*)(v))[1] = tb;
}

int main(int argc, char* argv[])
{
	while(1) {
		struct timeval tv;
		unsigned long long count0, count1, count2;
		ppc_getcounter(&count0);
		ppc_getcounter(&count1);
		gettimeofday(&tv, NULL);
		ppc_getcounter(&count2);
		printf("ticks for ppc_getccounter: %lld, for gettimeofday: %lld\n",
			count1 - count0, count2- count1);
		sleep(1);
	}
}


Cheers
-- 
Dirk Eibach
Entwicklung
Guntermann & Drunck GmbH Systementwicklung





^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Xenomai-help] Xenomai degrades syscall time on PPC405
  2007-08-20 12:11 [Xenomai-help] Xenomai degrades syscall time on PPC405 Dirk Eibach
@ 2007-08-23 20:45 ` Philippe Gerum
  2007-08-24  6:33   ` Dirk Eibach
  0 siblings, 1 reply; 4+ messages in thread
From: Philippe Gerum @ 2007-08-23 20:45 UTC (permalink / raw)
  To: Dirk Eibach; +Cc: Xenomai help

On Mon, 2007-08-20 at 14:11 +0200, Dirk Eibach wrote:
> Hello,
> 
> I am trying to understand some performance problems I ran into on my 
> xenomai system.
> 
> The testcode I attached performs a syscall(gettimeofday). Measuring time on 
> a vanilla Kernel gives the following output:
> 
> ticks for ppc_getccounter: 116, for gettimeofday: 1479
> 
> The same kernel with xenomai patched in gives this:
> 
> ticks for ppc_getccounter: 116, for gettimeofday: 4278
> 
> Is this the expected behaviour? Why is the time for a system call nearly 3 
> times as much?
> 

When reporting figures, please report all of them. Here is your test ran
on a 405GPr, linux-2.6.14, I-pipe 1.5-03, with full output:

bash-3.00# ./ppc405_rt 
ticks for ppc_getccounter: 122, for gettimeofday: 7337
ticks for ppc_getccounter: 208, for gettimeofday: 2816
ticks for ppc_getccounter: 210, for gettimeofday: 2512
ticks for ppc_getccounter: 174, for gettimeofday: 2622
ticks for ppc_getccounter: 174, for gettimeofday: 2320
ticks for ppc_getccounter: 208, for gettimeofday: 2634
ticks for ppc_getccounter: 174, for gettimeofday: 2344
ticks for ppc_getccounter: 208, for gettimeofday: 2660
ticks for ppc_getccounter: 174, for gettimeofday: 2320
ticks for ppc_getccounter: 208, for gettimeofday: 2816
ticks for ppc_getccounter: 208, for gettimeofday: 2734
ticks for ppc_getccounter: 208, for gettimeofday: 2802
...

Only the first shot exhibits a pathological latency, and switching on
CONFIG_XENO_OPT_DEBUG_NUCLEUS explains why:

# dmesg

...
Looking up port of RPC 100003/2 on 192.168.0.7
Looking up port of RPC 100005/1 on 192.168.0.7
VFS: Mounted root (nfs filesystem).
Freeing unused kernel memory: 128k init
Xenomai: Switching ppc405_rt to secondary mode after exception #1025 from user-space at 0xfe54f40 (pid 248)

This is confirmed by the task stats:

# cat /proc/xenomai/stat

CPU  PID    MSW        CSW        PF    STAT       %CPU  NAME
  0  0      0          2          0     00400080   99.9  ROOT
  0  248    1          1          1*    00300380    0.0  ppc405_rt
  0  0      0          48031      0     00000000    0.1  IRQ32: [timer]

(*) 1 page fault caught in primary mode, causing a Xenomai mode switch.

The test program hits a minor fault, due to a virtual memory management
artefact, which in turn forces a Xenomai switch for your task from
primary to secondary mode, in order to process the fault gracefully
(basically mapping a missing page table entry). 7 us is the time needed
to yield the control of the task to the regular kernel, then handle the
fault in Linux context, then eventually perform the syscall. This is a
known issue, and a solution may be available at some point, but not yet.

This mailing list contains other reports of the same issue, and detailed
explanations.

Regarding the other figures in the 2-3 us range, it is absolutely true
that Xenomai adds some overhead to the system call path which may get
noticeable on low-end boards. This is due to the pipelining of system
calls to the Xenomai domain: there is more code to execute than the
regular Linux has to run for the same task. This value may improve too
over time, since we do have a few opportunities for optimization here.

Last point: your measure do not take into account the possibility for
the syscall path to be preempted by a real-time interrupt, or even by a
plain Linux one after the task has switched to secondary mode, so you
should always read such data with a grain of salt. In any case, you
should never rely on a single sample.

> #include <stdio.h>
> #include <sys/time.h>
> #include <time.h>
> 
> static void
> ppc_getcounter(unsigned long long *v)
> {
> 	register unsigned long tbu, tb, tbu2;
> 
>      loop:
> 	asm volatile ("mftbu %0" : "=r" (tbu) );
> 	asm volatile ("mftb  %0" : "=r" (tb)  );
> 	asm volatile ("mftbu %0" : "=r" (tbu2));
> 	if (__builtin_expect(tbu != tbu2, 0)) goto loop;
> 
> 	/* The slightly peculiar way of writing the next lines is
> 	compiled better by GCC than any other way I tried. */
> 	((long*)(v))[0] = tbu;
> 	((long*)(v))[1] = tb;
> }
> 
> int main(int argc, char* argv[])
> {
> 	while(1) {
> 		struct timeval tv;
> 		unsigned long long count0, count1, count2;
> 		ppc_getcounter(&count0);
> 		ppc_getcounter(&count1);
> 		gettimeofday(&tv, NULL);
> 		ppc_getcounter(&count2);
> 		printf("ticks for ppc_getccounter: %lld, for gettimeofday: %lld\n",
> 			count1 - count0, count2- count1);
> 		sleep(1);
> 	}
> }
> 
> 
> Cheers
-- 
Philippe.




^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Xenomai-help] Xenomai degrades syscall time on PPC405
  2007-08-23 20:45 ` Philippe Gerum
@ 2007-08-24  6:33   ` Dirk Eibach
  2007-08-24  8:45     ` Philippe Gerum
  0 siblings, 1 reply; 4+ messages in thread
From: Dirk Eibach @ 2007-08-24  6:33 UTC (permalink / raw)
  To: rpm; +Cc: Xenomai help

Hello Philippe,

tank you for your detailed answer.

>> Hello,
>>
>> I am trying to understand some performance problems I ran into on my 
>> xenomai system.
>>
>> The testcode I attached performs a syscall(gettimeofday). Measuring time on 
>> a vanilla Kernel gives the following output:
>>
>> ticks for ppc_getccounter: 116, for gettimeofday: 1479
>>
>> The same kernel with xenomai patched in gives this:
>>
>> ticks for ppc_getccounter: 116, for gettimeofday: 4278
>>
>> Is this the expected behaviour? Why is the time for a system call nearly 3 
>> times as much?
>>
> 
> When reporting figures, please report all of them. Here is your test ran
> on a 405GPr, linux-2.6.14, I-pipe 1.5-03, with full output:
> 
> bash-3.00# ./ppc405_rt 
> ticks for ppc_getccounter: 122, for gettimeofday: 7337
> ticks for ppc_getccounter: 208, for gettimeofday: 2816
> ticks for ppc_getccounter: 210, for gettimeofday: 2512
> ticks for ppc_getccounter: 174, for gettimeofday: 2622
> ticks for ppc_getccounter: 174, for gettimeofday: 2320
> ticks for ppc_getccounter: 208, for gettimeofday: 2634
> ticks for ppc_getccounter: 174, for gettimeofday: 2344
> ticks for ppc_getccounter: 208, for gettimeofday: 2660
> ticks for ppc_getccounter: 174, for gettimeofday: 2320
> ticks for ppc_getccounter: 208, for gettimeofday: 2816
> ticks for ppc_getccounter: 208, for gettimeofday: 2734
> ticks for ppc_getccounter: 208, for gettimeofday: 2802
> ...

I certainly tried to pick a representative sample. Here is the full series 
with xenomai patched in:
# ./syscall-test
ticks for ppc_getccounter: 92, for gettimeofday: 18375
ticks for ppc_getccounter: 116, for gettimeofday: 4250
ticks for ppc_getccounter: 116, for gettimeofday: 4416
ticks for ppc_getccounter: 116, for gettimeofday: 4372
ticks for ppc_getccounter: 116, for gettimeofday: 4486
ticks for ppc_getccounter: 116, for gettimeofday: 4376
ticks for ppc_getccounter: 116, for gettimeofday: 4446
ticks for ppc_getccounter: 116, for gettimeofday: 4434
ticks for ppc_getccounter: 116, for gettimeofday: 4368
ticks for ppc_getccounter: 116, for gettimeofday: 4418
ticks for ppc_getccounter: 116, for gettimeofday: 4404
ticks for ppc_getccounter: 116, for gettimeofday: 4384
...


And here we go without:
# ./syscall-test
ticks for ppc_getccounter: 92, for gettimeofday: 6761
ticks for ppc_getccounter: 116, for gettimeofday: 1768
ticks for ppc_getccounter: 116, for gettimeofday: 1594
ticks for ppc_getccounter: 116, for gettimeofday: 1560
ticks for ppc_getccounter: 116, for gettimeofday: 1600
ticks for ppc_getccounter: 116, for gettimeofday: 1574
ticks for ppc_getccounter: 116, for gettimeofday: 1618
ticks for ppc_getccounter: 116, for gettimeofday: 1560
ticks for ppc_getccounter: 116, for gettimeofday: 1600
ticks for ppc_getccounter: 116, for gettimeofday: 1540
ticks for ppc_getccounter: 116, for gettimeofday: 1618
ticks for ppc_getccounter: 116, for gettimeofday: 1566
ticks for ppc_getccounter: 116, for gettimeofday: 1608
...

> 
> Only the first shot exhibits a pathological latency, and switching on
> CONFIG_XENO_OPT_DEBUG_NUCLEUS explains why:
> 
> # dmesg
> 
> ...
> Looking up port of RPC 100003/2 on 192.168.0.7
> Looking up port of RPC 100005/1 on 192.168.0.7
> VFS: Mounted root (nfs filesystem).
> Freeing unused kernel memory: 128k init
> Xenomai: Switching ppc405_rt to secondary mode after exception #1025 from user-space at 0xfe54f40 (pid 248)
> 
> This is confirmed by the task stats:
> 
> # cat /proc/xenomai/stat
> 
> CPU  PID    MSW        CSW        PF    STAT       %CPU  NAME
>   0  0      0          2          0     00400080   99.9  ROOT
>   0  248    1          1          1*    00300380    0.0  ppc405_rt
>   0  0      0          48031      0     00000000    0.1  IRQ32: [timer]
> 
> (*) 1 page fault caught in primary mode, causing a Xenomai mode switch.
> 
> The test program hits a minor fault, due to a virtual memory management
> artefact, which in turn forces a Xenomai switch for your task from
> primary to secondary mode, in order to process the fault gracefully
> (basically mapping a missing page table entry). 7 us is the time needed
> to yield the control of the task to the regular kernel, then handle the
> fault in Linux context, then eventually perform the syscall. This is a
> known issue, and a solution may be available at some point, but not yet.
> 
> This mailing list contains other reports of the same issue, and detailed
> explanations.
> 
> Regarding the other figures in the 2-3 us range, it is absolutely true
> that Xenomai adds some overhead to the system call path which may get
> noticeable on low-end boards. This is due to the pipelining of system
> calls to the Xenomai domain: there is more code to execute than the
> regular Linux has to run for the same task. This value may improve too
> over time, since we do have a few opportunities for optimization here.

But when I compare my values above I still get an overhead of 10us which is 
at least strange and not acceptable for my application. I am asking myself 
if it will be possible to get this fixed with xenomai or if I have to go a 
completely different way.

> Last point: your measure do not take into account the possibility for
> the syscall path to be preempted by a real-time interrupt, or even by a
> plain Linux one after the task has switched to secondary mode, so you
> should always read such data with a grain of salt. In any case, you
> should never rely on a single sample.
> 
>> #include <stdio.h>
>> #include <sys/time.h>
>> #include <time.h>
>>
>> static void
>> ppc_getcounter(unsigned long long *v)
>> {
>> 	register unsigned long tbu, tb, tbu2;
>>
>>      loop:
>> 	asm volatile ("mftbu %0" : "=r" (tbu) );
>> 	asm volatile ("mftb  %0" : "=r" (tb)  );
>> 	asm volatile ("mftbu %0" : "=r" (tbu2));
>> 	if (__builtin_expect(tbu != tbu2, 0)) goto loop;
>>
>> 	/* The slightly peculiar way of writing the next lines is
>> 	compiled better by GCC than any other way I tried. */
>> 	((long*)(v))[0] = tbu;
>> 	((long*)(v))[1] = tb;
>> }
>>
>> int main(int argc, char* argv[])
>> {
>> 	while(1) {
>> 		struct timeval tv;
>> 		unsigned long long count0, count1, count2;
>> 		ppc_getcounter(&count0);
>> 		ppc_getcounter(&count1);
>> 		gettimeofday(&tv, NULL);
>> 		ppc_getcounter(&count2);
>> 		printf("ticks for ppc_getccounter: %lld, for gettimeofday: %lld\n",
>> 			count1 - count0, count2- count1);
>> 		sleep(1);
>> 	}
>> }
>>
>>
>> Cheers


-- 
Dirk Eibach
Entwicklung
Guntermann & Drunck GmbH Systementwicklung

Telefon: +49 2739 8901-100
Fax:     +49 2739 8901-120
EMail:   mailto:eibach@domain.hid

-------------------------------------------------------------------------------
Guntermann & Drunck GmbH Systementwicklung
Dortmunder Str. 4a
D-57234 Wilnsdorf - Germany
Tel: +49 (0) 27 39 / 89 01 - 100  Fax: +49 (0) 27 39 / 89 01 - 120
E-Mail: mailto:sales@domain.hid Web: www.gdsys.de
-------------------------------------------------------------------------------
Geschäftsführer: Udo Guntermann - Martin Drunck - Reiner Ruelmann - Klaus Tocke
HRB 2884, Amtsgericht Siegen
USt.-Id.-Nr. DE 126575222 - Steuer-Nr. 342 / 5835 / 1041
-------------------------------------------------------------------------------





^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Xenomai-help] Xenomai degrades syscall time on PPC405
  2007-08-24  6:33   ` Dirk Eibach
@ 2007-08-24  8:45     ` Philippe Gerum
  0 siblings, 0 replies; 4+ messages in thread
From: Philippe Gerum @ 2007-08-24  8:45 UTC (permalink / raw)
  To: Dirk Eibach; +Cc: Xenomai help

On Fri, 2007-08-24 at 08:33 +0200, Dirk Eibach wrote:
> Hello Philippe,
> 
> tank you for your detailed answer.
> 
> >> Hello,
> >>
> >> I am trying to understand some performance problems I ran into on my 
> >> xenomai system.
> >>
> >> The testcode I attached performs a syscall(gettimeofday). Measuring time on 
> >> a vanilla Kernel gives the following output:
> >>
> >> ticks for ppc_getccounter: 116, for gettimeofday: 1479
> >>
> >> The same kernel with xenomai patched in gives this:
> >>
> >> ticks for ppc_getccounter: 116, for gettimeofday: 4278
> >>
> >> Is this the expected behaviour? Why is the time for a system call nearly 3 
> >> times as much?
> >>
> > 
> > When reporting figures, please report all of them. Here is your test ran
> > on a 405GPr, linux-2.6.14, I-pipe 1.5-03, with full output:
> > 
> > bash-3.00# ./ppc405_rt 
> > ticks for ppc_getccounter: 122, for gettimeofday: 7337
> > ticks for ppc_getccounter: 208, for gettimeofday: 2816
> > ticks for ppc_getccounter: 210, for gettimeofday: 2512
> > ticks for ppc_getccounter: 174, for gettimeofday: 2622
> > ticks for ppc_getccounter: 174, for gettimeofday: 2320
> > ticks for ppc_getccounter: 208, for gettimeofday: 2634
> > ticks for ppc_getccounter: 174, for gettimeofday: 2344
> > ticks for ppc_getccounter: 208, for gettimeofday: 2660
> > ticks for ppc_getccounter: 174, for gettimeofday: 2320
> > ticks for ppc_getccounter: 208, for gettimeofday: 2816
> > ticks for ppc_getccounter: 208, for gettimeofday: 2734
> > ticks for ppc_getccounter: 208, for gettimeofday: 2802
> > ...
> 
> I certainly tried to pick a representative sample. Here is the full series 
> with xenomai patched in:
> # ./syscall-test
> ticks for ppc_getccounter: 92, for gettimeofday: 18375
> ticks for ppc_getccounter: 116, for gettimeofday: 4250
> ticks for ppc_getccounter: 116, for gettimeofday: 4416
> ticks for ppc_getccounter: 116, for gettimeofday: 4372
> ticks for ppc_getccounter: 116, for gettimeofday: 4486
> ticks for ppc_getccounter: 116, for gettimeofday: 4376
> ticks for ppc_getccounter: 116, for gettimeofday: 4446
> ticks for ppc_getccounter: 116, for gettimeofday: 4434
> ticks for ppc_getccounter: 116, for gettimeofday: 4368
> ticks for ppc_getccounter: 116, for gettimeofday: 4418
> ticks for ppc_getccounter: 116, for gettimeofday: 4404
> ticks for ppc_getccounter: 116, for gettimeofday: 4384
> ...
> 
> 
> And here we go without:
> # ./syscall-test
> ticks for ppc_getccounter: 92, for gettimeofday: 6761
> ticks for ppc_getccounter: 116, for gettimeofday: 1768
> ticks for ppc_getccounter: 116, for gettimeofday: 1594

I have comparable values on the 405GPr in the vanilla case:

ticks for ppc_getccounter: 127, for gettimeofday: 6851
ticks for ppc_getccounter: 177, for gettimeofday: 1733
ticks for ppc_getccounter: 153, for gettimeofday: 1455
ticks for ppc_getccounter: 153, for gettimeofday: 1401

but still much lower values than you have in the Xenomai case.
The answer is either:

- in your kernel configuration,
- in the I-pipe patch,
- in the Xenomai version you are currently using

Btw, where does the I-pipe patch 2.6.18 1.5-00 you seem to be using come
from? Where did you pick it?

> ticks for ppc_getccounter: 116, for gettimeofday: 1560
> ticks for ppc_getccounter: 116, for gettimeofday: 1600
> ticks for ppc_getccounter: 116, for gettimeofday: 1574
> ticks for ppc_getccounter: 116, for gettimeofday: 1618
> ticks for ppc_getccounter: 116, for gettimeofday: 1560
> ticks for ppc_getccounter: 116, for gettimeofday: 1600
> ticks for ppc_getccounter: 116, for gettimeofday: 1540
> ticks for ppc_getccounter: 116, for gettimeofday: 1618
> ticks for ppc_getccounter: 116, for gettimeofday: 1566
> ticks for ppc_getccounter: 116, for gettimeofday: 1608
> ...
> 

-- 
Philippe.




^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2007-08-24  8:45 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-08-20 12:11 [Xenomai-help] Xenomai degrades syscall time on PPC405 Dirk Eibach
2007-08-23 20:45 ` Philippe Gerum
2007-08-24  6:33   ` Dirk Eibach
2007-08-24  8:45     ` Philippe Gerum

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.