kernelnewbies.kernelnewbies.org archive mirror
* How to measure performance inside Kernel?
@ 2012-02-09 12:58 Peter Senna Tschudin
  2012-02-09 18:12 ` michi1 at michaelblizek.twilightparadox.com
  2012-02-10 21:47 ` Peter Senna Tschudin
  0 siblings, 2 replies; 14+ messages in thread
From: Peter Senna Tschudin @ 2012-02-09 12:58 UTC
  To: kernelnewbies

Dear list,

I'm looking for a way to compare the performance of two different
versions of a piece of code inside the kernel. I was able to do some
comparison in userland, but I want to test that specific portion of
code inside the kernel.

I want to compare this loop, at line 1195 of
drivers/media/video/videobuf2-core.c:
/*
 * Reinitialize all buffers for next use.
 */
for (i = 0; i < q->num_buffers; ++i)
       q->bufs[i]->state = VB2_BUF_STATE_DEQUEUED;

with this pointer-based version:

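/* buf2 */
/*
 * Reinitialize all buffers for next use.
 * q->bufs is an array of pointers, so walk the pointer array itself
 * (buf_ptr and buf_ptr_end are of type struct vb2_buffer **).
 */
buf_ptr_end = &q->bufs[q->num_buffers];

for (buf_ptr = &q->bufs[0]; buf_ptr < buf_ptr_end; ++buf_ptr)
       (*buf_ptr)->state = VB2_BUF_STATE_DEQUEUED;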

To test in userland I created two separate C programs, compiled them
with gcc -O2, and ran the "perf" tool on each entire application. With
num_buffers = 131072:

$ perf stat -e cycles,stalled-cycles-frontend,stalled-cycles-backend,cache-references,cache-misses \
    -r 2048 ./buf1

Performance counter stats for './buf1' (2048 runs):

16,538,039 cycles                  #  0.000 GHz                  (+-0.06%) [80.23%]
 6,917,411 stalled-cycles-frontend # 41.83% frontend cycles idle (+-0.14%) [80.25%]
 4,686,384 stalled-cycles-backend  # 28.34% backend  cycles idle (+-0.14%) [80.28%]
   148,990 cache-references                                      (+-0.38%) [80.24%]
    71,180 cache-misses            # 47.775 % of all cache refs  (+-0.22%) [88.14%]

0.005234340 seconds time elapsed

$ perf stat -e cycles,stalled-cycles-frontend,stalled-cycles-backend,cache-references,cache-misses \
    -r 2048 ./buf2

Performance counter stats for './buf2' (2048 runs):

14,740,563 cycles                  #  0.000 GHz                  (+-0.04%) [77.89%]
 5,187,716 stalled-cycles-frontend # 35.19% frontend cycles idle (+-0.14%) [77.81%]
 3,383,748 stalled-cycles-backend  #
   101,894 cache-references                                      (+-0.23%) [84.60%]
    66,647 cache-misses            # 65.408 % of all cache refs  (+-0.14%) [90.52%]

0.004661826 seconds time elapsed                             (+-0.06%)
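
For readers who want to reproduce the userland comparison, here is a
minimal sketch of what such a test could look like. It is not the
original buf1/buf2 source, which was not posted. The types struct buf
and struct queue and every function name below are hypothetical
stand-ins for the videobuf2 structures; the only thing taken from the
kernel snippet is that bufs is an array of pointers, as the expression
q->bufs[i]->state implies.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userland stand-ins; these are NOT the real videobuf2 types. */
enum buf_state { BUF_STATE_QUEUED, BUF_STATE_DEQUEUED };

struct buf {
	enum buf_state state;
};

struct queue {
	struct buf **bufs;      /* array of pointers, one per buffer */
	size_t num_buffers;
};

/* Allocate n separate buffers, all QUEUED. Returns 0 on success, -1 on failure. */
static int queue_init(struct queue *q, size_t n)
{
	size_t i;

	q->num_buffers = 0;
	q->bufs = calloc(n, sizeof(*q->bufs));
	if (!q->bufs)
		return -1;
	for (i = 0; i < n; ++i) {
		q->bufs[i] = malloc(sizeof(**q->bufs));
		if (!q->bufs[i])
			return -1;      /* caller should call queue_free() */
		q->bufs[i]->state = BUF_STATE_QUEUED;
		q->num_buffers = i + 1;
	}
	return 0;
}

/* Free every buffer that was allocated, then the pointer array. */
static void queue_free(struct queue *q)
{
	size_t i;

	for (i = 0; i < q->num_buffers; ++i)
		free(q->bufs[i]);
	free(q->bufs);
	q->bufs = NULL;
	q->num_buffers = 0;
}

/* Variant 1 (buf1): index-based loop. */
static void reset_indexed(struct queue *q)
{
	size_t i;

	for (i = 0; i < q->num_buffers; ++i)
		q->bufs[i]->state = BUF_STATE_DEQUEUED;
}

/* Variant 2 (buf2): walk the pointer array with a moving pointer. */
static void reset_pointer(struct queue *q)
{
	struct buf **p, **end = q->bufs + q->num_buffers;

	for (p = q->bufs; p < end; ++p)
		(*p)->state = BUF_STATE_DEQUEUED;
}

/* Count buffers in DEQUEUED state, to check both variants do the same work. */
static size_t count_dequeued(const struct queue *q)
{
	size_t i, n = 0;

	for (i = 0; i < q->num_buffers; ++i)
		if (q->bufs[i]->state == BUF_STATE_DEQUEUED)
			++n;
	return n;
}
```

Each variant would be called from its own main() after queue_init(&q,
131072) and built with gcc -O2. Checking count_dequeued() afterwards
confirms that both loops touch every buffer before their timings are
compared.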

But I want to repeat the tests on a specific portion of code, not on the
entire application. Is there a safe way to do something like:

start_bench ( ?? ); /* start measurement */

/* q->bufs is an array of pointers, so walk the pointer array itself */
buf_ptr_end = &q->bufs[q->num_buffers];

for (buf_ptr = &q->bufs[0]; buf_ptr < buf_ptr_end; ++buf_ptr)
       (*buf_ptr)->state = VB2_BUF_STATE_DEQUEUED;

end_bench ( ?? ); /* end measurement */

And is this the correct approach for testing the performance of
a specific portion of kernel code?

Thank you!

Peter



-- 
Peter Senna Tschudin
peter.senna at gmail.com
gpg id: 48274C36

Thread overview: 14+ messages
2012-02-09 12:58 How to measure performance inside Kernel? Peter Senna Tschudin
2012-02-09 18:12 ` michi1 at michaelblizek.twilightparadox.com
2012-02-10 21:47 ` Peter Senna Tschudin
2012-02-10 22:06   ` Jeff Haran
2012-02-11  0:22     ` Peter Senna Tschudin
2012-02-11  3:44       ` Graeme Russ
2012-02-11 13:14         ` Peter Senna Tschudin
2012-02-11 13:57           ` Peter Senna Tschudin
2012-02-12 11:46             ` Mulyadi Santosa
2012-02-13 23:13               ` Peter Senna Tschudin
2012-02-14  3:43                 ` Mulyadi Santosa
2012-02-11  7:34       ` michi1 at michaelblizek.twilightparadox.com
2012-02-11  7:22   ` michi1 at michaelblizek.twilightparadox.com
2012-02-11 13:29     ` Peter Senna Tschudin
