* syscalls performance
@ 2011-02-25 17:03 Mauro Romano Trajber
2011-02-25 17:15 ` Daniel Baluta
0 siblings, 1 reply; 13+ messages in thread
From: Mauro Romano Trajber @ 2011-02-25 17:03 UTC (permalink / raw)
To: kernelnewbies
I was doing some performance tests of system calls and I found an interesting
behavior.
Using RDTSC to count CPU cycles, a single call to getpid() consumes about
7,000 clock cycles, while ten calls consume approximately 9,800 cycles in
total.
In other words, from the second call onward, the cost grows by only about
350 cycles per call.
Why does this happen? Is there some hardware optimization when the syscall
ID is already in the EAX register?
Any ideas?
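The measurement is roughly along these lines (just a minimal sketch, not the
actual test code):

#include <stdio.h>
#include <stdint.h>
#include <unistd.h>

/* Read the x86 timestamp counter. */
static inline uint64_t rdtsc(void)
{
        uint32_t lo, hi;
        __asm__ volatile ("rdtsc" : "=a" (lo), "=d" (hi));
        return ((uint64_t)hi << 32) | lo;
}

int main(void)
{
        uint64_t start, end;
        int i, n = 10;

        start = rdtsc();
        for (i = 0; i < n; i++)
                getpid();               /* the call being measured */
        end = rdtsc();
        printf("%d calls: %llu cycles\n", n, (unsigned long long)(end - start));
        return 0;
}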
* syscalls performance
[not found] <D3CBA33B-3809-4AF5-9F66-0F2D256C4A82@cs.ucsd.edu>
@ 2011-02-25 17:14 ` Enrico Granata
0 siblings, 0 replies; 13+ messages in thread
From: Enrico Granata @ 2011-02-25 17:14 UTC (permalink / raw)
To: kernelnewbies
If you are calling getpid() as a libc function, the library caches the result. This means that after the first call (when you really do have to go into the kernel), subsequent calls simply return the cached value without ever leaving user level.
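A quick way to see the difference (just an illustrative sketch) is to compare
the wrapper with a direct syscall(), which always traps into the kernel:

#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
        pid_t a = getpid();              /* glibc wrapper: may be served from a cache */
        pid_t b = syscall(SYS_getpid);   /* raw system call: always enters the kernel */

        printf("%d %d\n", (int)a, (int)b);
        return 0;
}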
Enrico Granata
Computer Science & Engineering Department (EBU3B) - Room 3240
office phone 858 534 9914
University of California, San Diego
> On Feb 25, 2011, at 9:03 AM, Mauro Romano Trajber wrote:
>
>> I was doing some performance tests of system calls and I found an interesting behavior.
>> Using RDTSC to count CPU cycles, a single call to getpid() consumes about 7,000 clock cycles, while ten calls consume approximately 9,800 cycles in total.
>> In other words, from the second call onward, the cost grows by only about 350 cycles per call.
>> Why does this happen? Is there some hardware optimization when the syscall ID is already in the EAX register?
>>
>> Any ideas?
* syscalls performance
2011-02-25 17:03 Mauro Romano Trajber
@ 2011-02-25 17:15 ` Daniel Baluta
2011-02-25 18:22 ` Mauro Romano Trajber
0 siblings, 1 reply; 13+ messages in thread
From: Daniel Baluta @ 2011-02-25 17:15 UTC (permalink / raw)
To: kernelnewbies
Hi Mauro,
On Fri, Feb 25, 2011 at 7:03 PM, Mauro Romano Trajber <trajber@gmail.com> wrote:
> I was doing some performance tests of system calls and I found an interesting
> behavior.
> Using RDTSC to count CPU cycles, a single call to getpid() consumes about
> 7,000 clock cycles, while ten calls consume approximately 9,800 cycles in
> total.
> In other words, from the second call onward, the cost grows by only about
> 350 cycles per call.
> Why does this happen? Is there some hardware optimization when the syscall
> ID is already in the EAX register?
Use strace and check the number of getpid() syscalls.
You'll notice that only one system call is made; glibc is caching the pid value.
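Something like this should show it (assuming the test binary is called
syscallperf; the -c summary counts how many times each syscall was made):

strace -c ./syscallperf 10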
thanks,
Daniel.
* syscalls performance
2011-02-25 17:15 ` Daniel Baluta
@ 2011-02-25 18:22 ` Mauro Romano Trajber
2011-02-25 20:15 ` Daniel Baluta
0 siblings, 1 reply; 13+ messages in thread
From: Mauro Romano Trajber @ 2011-02-25 18:22 UTC (permalink / raw)
To: kernelnewbies
Thanks Enrico and Daniel, you're right: glibc was caching getpid(). But this
is not the root cause of the behavior.
Going further, I decided to call getpid without glibc, using
syscall(SYS_getpid), and the same thing happened again.
Calling it once, the test consumes about 7k CPU cycles, while 10 calls consume
about 10k CPU cycles in total.
Any ideas?
On Fri, Feb 25, 2011 at 2:15 PM, Daniel Baluta <daniel.baluta@gmail.com> wrote:
> Hi Mauro,
>
> On Fri, Feb 25, 2011 at 7:03 PM, Mauro Romano Trajber <trajber@gmail.com>
> wrote:
> > I was doing some performance tests of system calls and I found an
> > interesting behavior.
> > Using RDTSC to count CPU cycles, a single call to getpid() consumes
> > about 7,000 clock cycles, while ten calls consume approximately 9,800
> > cycles in total.
> > In other words, from the second call onward, the cost grows by only
> > about 350 cycles per call.
> > Why does this happen? Is there some hardware optimization when the
> > syscall ID is already in the EAX register?
>
> Use strace and check the number of getpid() syscalls.
> You'll notice that only one system call is made; glibc is caching the pid value.
>
> thanks,
> Daniel.
>
* syscalls performance
2011-02-25 18:22 ` Mauro Romano Trajber
@ 2011-02-25 20:15 ` Daniel Baluta
2011-02-25 20:30 ` Mauro Romano Trajber
0 siblings, 1 reply; 13+ messages in thread
From: Daniel Baluta @ 2011-02-25 20:15 UTC (permalink / raw)
To: kernelnewbies
On Fri, Feb 25, 2011 at 8:22 PM, Mauro Romano Trajber <trajber@gmail.com> wrote:
> Thanks Enrico and Daniel, you're right: glibc was caching getpid(). But this
> is not the root cause of the behavior.
> Going further, I decided to call getpid without glibc, using
> syscall(SYS_getpid), and the same thing happened again.
> Calling it once, the test consumes about 7k CPU cycles, while 10 calls consume
> about 10k CPU cycles in total.
> Any ideas?
Can you post a pointer to your code and some information about how you got
these numbers?
thanks,
Daniel.
* syscalls performance
2011-02-25 20:15 ` Daniel Baluta
@ 2011-02-25 20:30 ` Mauro Romano Trajber
2011-02-25 20:42 ` Enrico Granata
0 siblings, 1 reply; 13+ messages in thread
From: Mauro Romano Trajber @ 2011-02-25 20:30 UTC (permalink / raw)
To: kernelnewbies
Sure, the code is attached.
On Fri, Feb 25, 2011 at 5:15 PM, Daniel Baluta <daniel.baluta@gmail.com> wrote:
> On Fri, Feb 25, 2011 at 8:22 PM, Mauro Romano Trajber <trajber@gmail.com>
> wrote:
> > Thanks Enrico and Daniel, you're right: glibc was caching getpid(). But
> > this is not the root cause of the behavior.
> > Going further, I decided to call getpid without glibc, using
> > syscall(SYS_getpid), and the same thing happened again.
> > Calling it once, the test consumes about 7k CPU cycles, while 10 calls
> > consume about 10k CPU cycles in total.
> > Any ideas?
>
> Can you post a pointer to your code and some information about how you got
> these numbers?
>
> thanks,
> Daniel.
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: syscallperf.c
Type: text/x-csrc
Size: 732 bytes
Desc: not available
Url : http://lists.kernelnewbies.org/pipermail/kernelnewbies/attachments/20110225/f273b11b/attachment.bin
* syscalls performance
2011-02-25 20:30 ` Mauro Romano Trajber
@ 2011-02-25 20:42 ` Enrico Granata
2011-02-25 20:59 ` Jim Kukunas
2011-02-26 4:31 ` Mulyadi Santosa
0 siblings, 2 replies; 13+ messages in thread
From: Enrico Granata @ 2011-02-25 20:42 UTC (permalink / raw)
To: kernelnewbies
I modified the source code to show exactly how many clock ticks each call takes. It seems that the behavior hinted at by Mauro Romano Trajber is actually there:
[enrico at espresso ~]$ ./syscallperf 15
4925
1190
942
942
935
942
636
577
627
621
580
591
565
580
565
I am starting to wonder whether this depends on the syscall itself or on some call optimization... any gcc experts around?
Enrico Granata
Computer Science & Engineering Department (EBU3B) - Room 3240
office phone 858 534 9914
University of California, San Diego
On Feb 25, 2011, at 12:30 PM, Mauro Romano Trajber wrote:
> Sure, the code is attached.
>
>
> On Fri, Feb 25, 2011 at 5:15 PM, Daniel Baluta <daniel.baluta@gmail.com> wrote:
> On Fri, Feb 25, 2011 at 8:22 PM, Mauro Romano Trajber <trajber@gmail.com> wrote:
> > Thanks Enrico and Daniel, you're right. glibc was caching getpid(); but this
> > is not the root cause of this behavior.
> > Going further, I decide to use call getpid without glibc, using
> > syscall(SYS_getpid) to test this behavior and it happened again.
> > Calling it once, the test consumes about 7k CPU cycles and 10 calls consumes
> > about 10k CPU cycles.
> > Any ideas ?
>
> Can you post a pointer to your code and information about how you got
> this numbers?
>
> thanks,
> Daniel.
>
> <syscallperf.c>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: syscallperf.c
Type: application/octet-stream
Size: 745 bytes
Desc: not available
Url : http://lists.kernelnewbies.org/pipermail/kernelnewbies/attachments/20110225/59862a7d/attachment.obj
* syscalls performance
2011-02-25 20:42 ` Enrico Granata
@ 2011-02-25 20:59 ` Jim Kukunas
2011-02-25 21:00 ` Enrico Granata
2011-02-26 4:31 ` Mulyadi Santosa
1 sibling, 1 reply; 13+ messages in thread
From: Jim Kukunas @ 2011-02-25 20:59 UTC (permalink / raw)
To: kernelnewbies
On Fri, Feb 25, 2011 at 12:42 PM, Enrico Granata <egranata@ucsd.edu> wrote:
>
>
> I modified the source code to show exactly how many clock ticks each call
> takes. It seems that the behavior hinted at by Mauro Romano Trajber is
> actually there:
> [enrico at espresso ~]$ ./syscallperf 15
> 4925
> 1190
> 942
> 942
> 935
> 942
> 636
> 577
> 627
> 621
> 580
> 591
> 565
> 580
> 565
> I am starting to wonder whether this depends on the syscall itself or on
> some call optimization... any gcc experts around?
From the getpid(2) manpage:
"Since glibc version 2.3.4, the glibc wrapper function for getpid()
caches PIDs, so as to avoid additional system calls when a process
calls getpid() repeatedly."
> Enrico Granata
> Computer Science & Engineering Department (EBU3B) - Room 3240
> office phone 858 534 9914
> University of California, San Diego
> On Feb 25, 2011, at 12:30 PM, Mauro Romano Trajber wrote:
>
> Sure, the code is attached.
>
> On Fri, Feb 25, 2011 at 5:15 PM, Daniel Baluta <daniel.baluta@gmail.com>
> wrote:
>>
>> On Fri, Feb 25, 2011 at 8:22 PM, Mauro Romano Trajber <trajber@gmail.com>
>> wrote:
>> > Thanks Enrico and Daniel, you're right: glibc was caching getpid(). But
>> > this is not the root cause of the behavior.
>> > Going further, I decided to call getpid without glibc, using
>> > syscall(SYS_getpid), and the same thing happened again.
>> > Calling it once, the test consumes about 7k CPU cycles, while 10 calls
>> > consume about 10k CPU cycles in total.
>> > Any ideas?
>>
>> Can you post a pointer to your code and some information about how you got
>> these numbers?
>>
>> thanks,
>> Daniel.
>
> <syscallperf.c>
>
--
Jim Kukunas
jkukunas at acm.org
http://member.acm.org/~treak007
* syscalls performance
2011-02-25 20:59 ` Jim Kukunas
@ 2011-02-25 21:00 ` Enrico Granata
2011-02-25 22:03 ` Mauro Romano Trajber
0 siblings, 1 reply; 13+ messages in thread
From: Enrico Granata @ 2011-02-25 21:00 UTC (permalink / raw)
To: kernelnewbies
We already ruled that out this morning. The code is now using syscall() directly:
inline long mygetpid() {
        return syscall(SYS_getpid);
}
Enrico Granata
Computer Science & Engineering Department (EBU3B) - Room 3240
office phone 858 534 9914
University of California, San Diego
On Feb 25, 2011, at 12:59 PM, Jim Kukunas wrote:
> On Fri, Feb 25, 2011 at 12:42 PM, Enrico Granata <egranata@ucsd.edu> wrote:
>>
>>
>> I modified the source code to show exactly how many clock ticks each call
>> takes. It seems that the behavior hinted at by Mauro Romano Trajber is
>> actually there:
>> [enrico at espresso ~]$ ./syscallperf 15
>> 4925
>> 1190
>> 942
>> 942
>> 935
>> 942
>> 636
>> 577
>> 627
>> 621
>> 580
>> 591
>> 565
>> 580
>> 565
>> I am starting to wonder whether this depends on the syscall itself or on
>> some call optimization... any gcc experts around?
>
> From the getpid(2) manpage:
>
> "Since glibc version 2.3.4, the glibc wrapper function for getpid()
> caches PIDs, so as to avoid additional system calls when a process
> calls getpid() repeatedly."
>
>
>> Enrico Granata
>> Computer Science & Engineering Department (EBU3B) - Room 3240
>> office phone 858 534 9914
>> University of California, San Diego
>> On Feb 25, 2011, at 12:30 PM, Mauro Romano Trajber wrote:
>>
>> Sure, the code is attached.
>>
>> On Fri, Feb 25, 2011 at 5:15 PM, Daniel Baluta <daniel.baluta@gmail.com>
>> wrote:
>>>
>>> On Fri, Feb 25, 2011 at 8:22 PM, Mauro Romano Trajber <trajber@gmail.com>
>>> wrote:
>>>> Thanks Enrico and Daniel, you're right: glibc was caching getpid(). But
>>>> this is not the root cause of the behavior.
>>>> Going further, I decided to call getpid without glibc, using
>>>> syscall(SYS_getpid), and the same thing happened again.
>>>> Calling it once, the test consumes about 7k CPU cycles, while 10 calls
>>>> consume about 10k CPU cycles in total.
>>>> Any ideas?
>>>
>>> Can you post a pointer to your code and some information about how you got
>>> these numbers?
>>>
>>> thanks,
>>> Daniel.
>>
>> <syscallperf.c>
>>
>
> --
> Jim Kukunas
> jkukunas at acm.org
> http://member.acm.org/~treak007
* syscalls performance
2011-02-25 21:00 ` Enrico Granata
@ 2011-02-25 22:03 ` Mauro Romano Trajber
2011-02-25 22:06 ` Daniel Baluta
0 siblings, 1 reply; 13+ messages in thread
From: Mauro Romano Trajber @ 2011-02-25 22:03 UTC (permalink / raw)
To: kernelnewbies
Same behavior for a new syscall created from scratch.
On Friday, February 25, 2011, Enrico Granata <egranata@ucsd.edu> wrote:
> We already ruled that out this morning. The code is now using syscall() directly:
>
> inline long mygetpid() {
>         return syscall(SYS_getpid);
> }
>
>
> Enrico Granata
> Computer Science & Engineering Department (EBU3B) - Room 3240
> office phone 858 534 9914
> University of California, San Diego
>
>
> On Feb 25, 2011, at 12:59 PM, Jim Kukunas wrote:
> On Fri, Feb 25, 2011 at 12:42 PM, Enrico Granata <egranata@ucsd.edu> wrote:
>
>
> I modified the source code to show exactly how many clock ticks each call
> takes. It seems that the behavior hinted at by Mauro Romano Trajber is
> actually there:
> [enrico at espresso ~]$ ./syscallperf 15
> 4925
> 1190
> 942
> 942
> 935
> 942
> 636
> 577
> 627
> 621
> 580
> 591
> 565
> 580
> 565
> I am starting to wonder whether this depends on the syscall itself or on
> some call optimization... any gcc experts around?
>
> From the getpid(2) manpage:
>
> ??????"Since glibc version 2.3.4, the ?glibc ?wrapper ?function ?for ?getpid()
> ???????caches ?PIDs, ?so ?as ?to ?avoid additional system calls when a process
> ???????calls getpid() repeatedly."
>
>
> Enrico Granata
> Computer Science & Engineering Department (EBU3B) - Room 3240
> office phone 858 534 9914
> University of California, San Diego
> On Feb 25, 2011, at 12:30 PM, Mauro Romano Trajber wrote:
>
> Sure, the code is attached.
>
> On Fri, Feb 25, 2011 at 5:15 PM, Daniel Baluta <daniel.baluta@gmail.com>
> wrote:
>
> On Fri, Feb 25, 2011 at 8:22 PM, Mauro Romano Trajber <trajber@gmail.com>
> wrote:
> Thanks Enrico and Daniel, you're right: glibc was caching getpid(). But
> this is not the root cause of the behavior.
> Going further, I decided to call getpid without glibc, using
> syscall(SYS_getpid), and the same thing happened again.
> Calling it once, the test consumes about 7k CPU cycles, while 10 calls
> consume about 10k CPU cycles in total.
> Any ideas?
>
> Can you post a pointer to your code and some information about how you got
> these numbers?
>
> thanks,
> Daniel.
>
> <syscallperf.c>
>
* syscalls performance
2011-02-25 22:03 ` Mauro Romano Trajber
@ 2011-02-25 22:06 ` Daniel Baluta
2011-02-25 22:14 ` Enrico Granata
0 siblings, 1 reply; 13+ messages in thread
From: Daniel Baluta @ 2011-02-25 22:06 UTC (permalink / raw)
To: kernelnewbies
On Sat, Feb 26, 2011 at 12:03 AM, Mauro Romano Trajber
<trajber@gmail.com> wrote:
> Same behavior for a new syscall created from scratch.
Instruction caching?
* syscalls performance
2011-02-25 22:06 ` Daniel Baluta
@ 2011-02-25 22:14 ` Enrico Granata
0 siblings, 0 replies; 13+ messages in thread
From: Enrico Granata @ 2011-02-25 22:14 UTC (permalink / raw)
To: kernelnewbies
Same behavior if you strip away the function call and call syscall() directly inside the for() loop:

for (i = 0; i < qtd; i++) {
        start = RDTSC();
        syscall(SYS_getpid);
        end = RDTSC();
        printf("%lld\n", (end - start));
}
I do not have the source code for syscall() available ATM, but I would look into it to see whether something there might be caching information.
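As far as I know, glibc's syscall() is just a thin assembly stub, so caching
there seems unlikely. One way to take the wrapper out of the picture entirely
would be to issue the system call with inline assembly - a rough x86-64
sketch (on 32-bit it would be int $0x80 with the number in EAX instead):

#include <stdio.h>
#include <sys/syscall.h>

/* Invoke getpid via the syscall instruction, bypassing the libc wrapper.
 * x86-64 only; the kernel clobbers rcx and r11. */
static inline long raw_getpid(void)
{
        long ret;
        __asm__ volatile ("syscall"
                          : "=a" (ret)
                          : "a" ((long)SYS_getpid)
                          : "rcx", "r11", "memory");
        return ret;
}

int main(void)
{
        printf("%ld\n", raw_getpid());
        return 0;
}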
Enrico Granata
Computer Science & Engineering Department (EBU3B) - Room 3240
office phone 858 534 9914
University of California, San Diego
On Feb 25, 2011, at 2:06 PM, Daniel Baluta wrote:
> On Sat, Feb 26, 2011 at 12:03 AM, Mauro Romano Trajber
> <trajber@gmail.com> wrote:
>> Same behavior for a new syscall created from scratch.
>
> Instruction caching?
* syscalls performance
2011-02-25 20:42 ` Enrico Granata
2011-02-25 20:59 ` Jim Kukunas
@ 2011-02-26 4:31 ` Mulyadi Santosa
1 sibling, 0 replies; 13+ messages in thread
From: Mulyadi Santosa @ 2011-02-26 4:31 UTC (permalink / raw)
To: kernelnewbies
Hi Enrico...
On Sat, Feb 26, 2011 at 03:42, Enrico Granata <egranata@ucsd.edu> wrote:
>
>
> I modified the source code to show exactly how many clock ticks each call
> takes. It seems that the behavior hinted at by Mauro Romano Trajber is
> actually there:
> [enrico at espresso ~]$ ./syscallperf 15
> 4925
> 1190
> 942
Watching strace, objdump, etc., I roughly conclude that the high initial
TSC number might be due to the initial symbol resolution of mygetpid() and/or
syscall() (IIRC this is called PLT/GOT symbol resolution).
After the GOT/PLT table holds the syscall/mygetpid address, the rest goes fast.
PS: if using gcc -O2, it seems mygetpid() really is inlined...
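If lazy PLT/GOT resolution is indeed the cause, it should be easy to confirm:
either run the test with LD_BIND_NOW=1 in the environment (forcing the dynamic
linker to resolve all symbols up front), or make one untimed warm-up call
before the timed loop, e.g. (reusing the loop quoted earlier, just a sketch):

        /* Warm up: resolve the PLT/GOT entry for syscall() and pull the
         * relevant code into the caches before the timed iterations. */
        syscall(SYS_getpid);

        for (i = 0; i < qtd; i++) {
                start = RDTSC();
                syscall(SYS_getpid);
                end = RDTSC();
                printf("%lld\n", (end - start));
        }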
--
regards,
Mulyadi Santosa
Freelance Linux trainer and consultant
blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com