* syscalls performance
From: Mauro Romano Trajber @ 2011-02-25 17:03 UTC
To: kernelnewbies

I was doing some performance tests of system calls and I found an interesting
behavior. Using RDTSC to count CPU cycles, a single call to getpid() consumes
about 7,000 clock cycles, while ten calls consume approximately 9,800 cycles.
In other words, from the second call onward the cost grows at only about 350
cycles per call. Why does this happen? Is there some hardware optimization
when the syscall ID is already in the EAX register?

Any ideas?
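A minimal sketch of the kind of measurement harness being described (this is
not the original syscallperf.c, which is posted later in the thread; the
rdtsc() helper, the loop count, and the output format are illustrative
assumptions):

    /* Sketch: time getpid() with RDTSC; assumes x86/x86-64 and GCC inline asm. */
    #include <stdio.h>
    #include <stdint.h>
    #include <unistd.h>

    static inline uint64_t rdtsc(void)
    {
        uint32_t lo, hi;
        /* RDTSC returns the 64-bit time-stamp counter in EDX:EAX */
        __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
        return ((uint64_t)hi << 32) | lo;
    }

    int main(void)
    {
        int i;
        uint64_t start, end;

        for (i = 0; i < 10; i++) {
            start = rdtsc();
            getpid();                    /* the call under test */
            end = rdtsc();
            printf("%llu\n", (unsigned long long)(end - start));
        }
        return 0;
    }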
* syscalls performance
From: Daniel Baluta @ 2011-02-25 17:15 UTC
To: kernelnewbies

Hi Mauro,

On Fri, Feb 25, 2011 at 7:03 PM, Mauro Romano Trajber <trajber@gmail.com> wrote:
> I was doing some performance tests of system calls and I found an interesting
> behavior. Using RDTSC to count CPU cycles, a single call to getpid() consumes
> about 7,000 clock cycles, while ten calls consume approximately 9,800 cycles.
> From the second call onward the cost grows at only about 350 cycles per call.
> Why does this happen? Is there some hardware optimization when the syscall
> ID is already in the EAX register?

Use strace and check the number of getpid() syscalls. You'll notice that only
one system call is made; glibc caches the PID value.

thanks,
Daniel.
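One way to see that caching directly (a sketch, not part of the thread's
attached code; run it under something like strace -e trace=getpid ./a.out):
repeated getpid() calls may be served from glibc's cached PID, while
syscall(SYS_getpid) always enters the kernel.

    /* Sketch: compare libc getpid() against a raw syscall under strace.
     * With PID caching, the first loop should produce at most one getpid
     * syscall; the second loop should produce one per iteration. */
    #include <sys/syscall.h>
    #include <unistd.h>

    int main(void)
    {
        int i;

        for (i = 0; i < 5; i++)
            getpid();                 /* may be served from glibc's cache */
        for (i = 0; i < 5; i++)
            syscall(SYS_getpid);      /* always a real system call */
        return 0;
    }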
* syscalls performance
From: Mauro Romano Trajber @ 2011-02-25 18:22 UTC
To: kernelnewbies

Thanks Enrico and Daniel, you're right: glibc was caching getpid(). But this
is not the root cause of the behavior. Going further, I decided to call
getpid without the glibc wrapper, using syscall(SYS_getpid), and the same
thing happened again: calling it once, the test consumes about 7k CPU cycles,
and 10 calls consume about 10k CPU cycles.

Any ideas?

On Fri, Feb 25, 2011 at 2:15 PM, Daniel Baluta <daniel.baluta@gmail.com> wrote:
> Use strace and check the number of getpid() syscalls. You'll notice that
> only one system call is made; glibc caches the PID value.
>
> thanks,
> Daniel.
* syscalls performance
From: Daniel Baluta @ 2011-02-25 20:15 UTC
To: kernelnewbies

On Fri, Feb 25, 2011 at 8:22 PM, Mauro Romano Trajber <trajber@gmail.com> wrote:
> Thanks Enrico and Daniel, you're right: glibc was caching getpid(). But this
> is not the root cause of the behavior. Going further, I decided to call
> getpid without the glibc wrapper, using syscall(SYS_getpid), and the same
> thing happened again: calling it once, the test consumes about 7k CPU cycles,
> and 10 calls consume about 10k CPU cycles.
> Any ideas?

Can you post a pointer to your code and information about how you got these
numbers?

thanks,
Daniel.
* syscalls performance
From: Mauro Romano Trajber @ 2011-02-25 20:30 UTC
To: kernelnewbies

Sure, the code is attached.

On Fri, Feb 25, 2011 at 5:15 PM, Daniel Baluta <daniel.baluta@gmail.com> wrote:
> Can you post a pointer to your code and information about how you got these
> numbers?
>
> thanks,
> Daniel.

[attachment: syscallperf.c, text/x-csrc, 732 bytes]
* syscalls performance
From: Enrico Granata @ 2011-02-25 20:42 UTC
To: kernelnewbies

I modified the source code to show exactly how many clock ticks each call
takes. It seems that the behavior hinted at by Mauro Romano Trajber is
actually there:

[enrico@espresso ~]$ ./syscallperf 15
4925
1190
942
942
935
942
636
577
627
621
580
591
565
580
565

I am starting to wonder whether this depends on the syscall itself or on some
call optimization... any gcc experts around?

Enrico Granata
Computer Science & Engineering Department (EBU3B) - Room 3240
office phone 858 534 9914
University of California, San Diego

On Feb 25, 2011, at 12:30 PM, Mauro Romano Trajber wrote:
> Sure, the code is attached.

[attachment: syscallperf.c (modified), 745 bytes]
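A side note on reading per-call RDTSC numbers like these (not something raised
in the thread itself): RDTSC is not a serializing instruction, so out-of-order
execution can shift part of a call's cost into a neighboring measurement. A
common mitigation, sketched below under the assumption of an x86-64 GCC build,
is to fence the timed region, for example CPUID before the starting read and
RDTSCP for the ending read:

    /* Sketch of fenced timestamp reads (x86-64, GCC inline asm).
     * CPUID serializes before the start read; RDTSCP waits for prior
     * instructions to retire before the end read. Illustrative only. */
    #include <stdint.h>

    static inline uint64_t tsc_begin(void)
    {
        uint32_t lo, hi;
        __asm__ __volatile__("cpuid\n\t"
                             "rdtsc"
                             : "=a"(lo), "=d"(hi)
                             :
                             : "rbx", "rcx");
        return ((uint64_t)hi << 32) | lo;
    }

    static inline uint64_t tsc_end(void)
    {
        uint32_t lo, hi;
        __asm__ __volatile__("rdtscp"
                             : "=a"(lo), "=d"(hi)
                             :
                             : "rcx");
        return ((uint64_t)hi << 32) | lo;
    }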
* syscalls performance
From: Jim Kukunas @ 2011-02-25 20:59 UTC
To: kernelnewbies

On Fri, Feb 25, 2011 at 12:42 PM, Enrico Granata <egranata@ucsd.edu> wrote:
> I modified the source code to show exactly how many clock ticks each call
> takes. It seems that the behavior hinted at by Mauro Romano Trajber is
> actually there:
> [enrico@espresso ~]$ ./syscallperf 15
> 4925
> 1190
> 942
> 942
> 935
> 942
> 636
> 577
> 627
> 621
> 580
> 591
> 565
> 580
> 565
> I am starting to wonder whether this depends on the syscall itself or on
> some call optimization... any gcc experts around?

From the getpid(2) manpage:

    "Since glibc version 2.3.4, the glibc wrapper function for getpid()
     caches PIDs, so as to avoid additional system calls when a process
     calls getpid() repeatedly."

--
Jim Kukunas
jkukunas at acm.org
http://member.acm.org/~treak007
* syscalls performance
From: Enrico Granata @ 2011-02-25 21:00 UTC
To: kernelnewbies

We already ruled that out this morning. The code is now using syscall()
directly:

    inline long mygetpid() {
        return syscall(SYS_getpid);
    }

Enrico Granata
Computer Science & Engineering Department (EBU3B) - Room 3240
office phone 858 534 9914
University of California, San Diego

On Feb 25, 2011, at 12:59 PM, Jim Kukunas wrote:
> From the getpid(2) manpage:
>
>     "Since glibc version 2.3.4, the glibc wrapper function for getpid()
>      caches PIDs, so as to avoid additional system calls when a process
>      calls getpid() repeatedly."
* syscalls performance
From: Mauro Romano Trajber @ 2011-02-25 22:03 UTC
To: kernelnewbies

Same behavior for a new syscall created from scratch.

On Friday, February 25, 2011, Enrico Granata <egranata@ucsd.edu> wrote:
> We already ruled that out this morning. The code is now using syscall()
> directly:
>
>     inline long mygetpid() {
>         return syscall(SYS_getpid);
>     }
* syscalls performance
From: Daniel Baluta @ 2011-02-25 22:06 UTC
To: kernelnewbies

On Sat, Feb 26, 2011 at 12:03 AM, Mauro Romano Trajber <trajber@gmail.com> wrote:
> Same behavior for a new syscall created from scratch.

Instruction caching?
* syscalls performance
From: Enrico Granata @ 2011-02-25 22:14 UTC
To: kernelnewbies

Same behavior if you strip away the function call and call syscall() directly
inside the for() loop:

    for (i = 0; i < qtd; i++) {
        start = RDTSC();
        syscall(SYS_getpid);
        end = RDTSC();
        printf("%lld\n", (end - start));
    }

I do not have the source code for syscall() available at the moment, but I
would look into it to see whether something there may be caching information.

Enrico Granata
Computer Science & Engineering Department (EBU3B) - Room 3240
office phone 858 534 9914
University of California, San Diego

On Feb 25, 2011, at 2:06 PM, Daniel Baluta wrote:
> On Sat, Feb 26, 2011 at 12:03 AM, Mauro Romano Trajber
> <trajber@gmail.com> wrote:
>> Same behavior for a new syscall created from scratch.
>
> Instruction caching?
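If one wants to take glibc's syscall() wrapper out of the picture entirely,
the system call can be issued with inline assembly. A sketch, assuming x86-64
(on that ABI the syscall number goes in RAX, the result comes back in RAX, and
the kernel clobbers RCX and R11); this is not code from the thread:

    /* Sketch: invoke getpid with no libc wrapper at all (x86-64 only). */
    #include <stdio.h>
    #include <sys/syscall.h>

    static inline long raw_getpid(void)
    {
        long ret;
        __asm__ __volatile__("syscall"
                             : "=a"(ret)
                             : "a"((long)SYS_getpid)
                             : "rcx", "r11", "memory");
        return ret;
    }

    int main(void)
    {
        printf("pid = %ld\n", raw_getpid());
        return 0;
    }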
* syscalls performance
From: Mulyadi Santosa @ 2011-02-26 4:31 UTC
To: kernelnewbies

Hi Enrico...

On Sat, Feb 26, 2011 at 03:42, Enrico Granata <egranata@ucsd.edu> wrote:
> I modified the source code to show exactly how many clock ticks each call
> takes. It seems that the behavior hinted at by Mauro Romano Trajber is
> actually there:
> [enrico@espresso ~]$ ./syscallperf 15
> 4925
> 1190
> 942

Watching strace, objdump, etc., I roughly conclude that the high initial TSC
number might be due to the initial symbol resolution of mygetpid() and/or
syscall() (IIRC this is called PLT/GOT lazy symbol resolution). Once the
GOT/PLT table holds the address of syscall/mygetpid, the remaining calls go
fast.

PS: when compiling with gcc -O2, it seems mygetpid() really is inlined.

--
regards,

Mulyadi Santosa
Freelance Linux trainer and consultant
blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com
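One way to test the lazy-binding hypothesis (a suggestion in the spirit of
Mulyadi's analysis, not something posted in the thread): either run the
unmodified binary with eager binding, e.g. LD_BIND_NOW=1 ./syscallperf 15, or
add an untimed warm-up call before the measured loop and see whether the first
reading drops. A sketch of the warm-up variant, assuming an rdtsc() helper
like the one shown earlier:

    /* Sketch: untimed warm-up call so the PLT entry for syscall() is
     * already resolved (and the code path already cached) before timing. */
    #include <stdio.h>
    #include <stdint.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    static inline uint64_t rdtsc(void)
    {
        uint32_t lo, hi;
        __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
        return ((uint64_t)hi << 32) | lo;
    }

    int main(void)
    {
        int i;
        uint64_t start, end;

        syscall(SYS_getpid);             /* warm-up: resolve symbol, warm caches */

        for (i = 0; i < 10; i++) {
            start = rdtsc();
            syscall(SYS_getpid);
            end = rdtsc();
            printf("%llu\n", (unsigned long long)(end - start));
        }
        return 0;
    }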
[parent not found: <D3CBA33B-3809-4AF5-9F66-0F2D256C4A82@cs.ucsd.edu>]
* syscalls performance
From: Enrico Granata @ 2011-02-25 17:14 UTC
To: kernelnewbies

If you are calling getpid() as a libc function, the library is caching the
result. This means that after the first call (when you really do have to go
into the kernel), subsequent calls simply return the cached value without
ever leaving user level.

Enrico Granata
Computer Science & Engineering Department (EBU3B) - Room 3240
office phone 858 534 9914
University of California, San Diego

On Feb 25, 2011, at 9:03 AM, Mauro Romano Trajber wrote:
> I was doing some performance tests of system calls and I found an interesting
> behavior. Using RDTSC to count CPU cycles, a single call to getpid() consumes
> about 7,000 clock cycles, while ten calls consume approximately 9,800 cycles.
> From the second call onward the cost grows at only about 350 cycles per call.
> Why does this happen? Is there some hardware optimization when the syscall
> ID is already in the EAX register?
>
> Any ideas?
Thread overview: 13+ messages
2011-02-25 17:03 syscalls performance Mauro Romano Trajber
2011-02-25 17:15 ` Daniel Baluta
2011-02-25 18:22 ` Mauro Romano Trajber
2011-02-25 20:15 ` Daniel Baluta
2011-02-25 20:30 ` Mauro Romano Trajber
2011-02-25 20:42 ` Enrico Granata
2011-02-25 20:59 ` Jim Kukunas
2011-02-25 21:00 ` Enrico Granata
2011-02-25 22:03 ` Mauro Romano Trajber
2011-02-25 22:06 ` Daniel Baluta
2011-02-25 22:14 ` Enrico Granata
2011-02-26 4:31 ` Mulyadi Santosa
[not found] <D3CBA33B-3809-4AF5-9F66-0F2D256C4A82@cs.ucsd.edu>
2011-02-25 17:14 ` Enrico Granata