kernelnewbies.kernelnewbies.org archive mirror
* Module vs Kernel main performance
@ 2012-05-29 23:50 Abu Rasheda
  2012-05-30  4:18 ` Mulyadi Santosa
  2012-06-07 23:36 ` Peter Senna Tschudin
  0 siblings, 2 replies; 16+ messages in thread
From: Abu Rasheda @ 2012-05-29 23:50 UTC (permalink / raw)
  To: kernelnewbies

Hi,

I am working on the x86_64 arch. I profiled (with oprofile) a Linux kernel
module and noticed that a whole lot of cycles are spent in the
copy_from_user call. I compared the same flow in the kernel proper and
noticed that, for more data throughput, the cycles spent in copy_from_user
are much lower: the kernel proper uses 1/8 of the cycles compared to the
module. (There is a user process that keeps sending data, like iperf.)

I used the perf tool to gather some statistics. For the call from the
kernel proper:

 185,719,857,837 cpu-cycles           #    3.318 GHz                    [90.01%]
  99,886,030,243 instructions         #    0.54  insns per cycle        [95.00%]
   1,696,072,702 cache-references     #   30.297 M/sec                  [94.99%]
     786,929,244 cache-misses         #   46.397 % of all cache refs    [95.00%]
  16,867,747,688 branch-instructions  #  301.307 M/sec                  [95.03%]
      86,752,646 branch-misses        #    0.51% of all branches        [95.00%]
   5,482,768,332 bus-cycles           #   97.938 M/sec                  [20.08%]
    55967.269801 cpu-clock
    55981.842225 task-clock           #    0.933 CPUs utilized

and for the call from the kernel module:

   9,388,787,678 cpu-cycles           #    1.527 GHz                    [89.77%]
   1,706,203,221 instructions         #    0.18  insns per cycle        [94.59%]
     551,010,961 cache-references     #   89.588 M/sec                  [94.73%]
     369,632,492 cache-misses         #   67.083 % of all cache refs    [95.18%]
     291,358,658 branch-instructions  #   47.372 M/sec                  [94.68%]
      10,291,678 branch-misses        #    3.53% of all branches        [95.01%]
     582,651,999 bus-cycles           #   94.733 M/sec                  [20.55%]
     6112.471585 cpu-clock
     6150.490210 task-clock           #    0.102 CPUs utilized
             367 page-faults          #    0.000 M/sec
             367 minor-faults         #    0.000 M/sec
               0 major-faults         #    0.000 M/sec
          25,770 context-switches     #    0.004 M/sec
              23 cpu-migrations       #    0.000 M/sec


So evidently the CPU is stalling while it copies data, and the cache-miss
rate is higher (about 67% vs 46% of cache references). My question is: is
there a difference between calling copy_from_user from the kernel proper
and calling it from an LKM?
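
Because the two runs above differ a lot in length (roughly 56 s vs 6 s of task-clock), the raw counts are hard to compare directly; ratios are easier to read. The following is a minimal sketch in Python that recomputes the per-run ratios from the counter values quoted above. The function name derived_metrics and the "cache misses per 1000 instructions" figure are additions for illustration and are not part of the original perf output.

```python
# Counter values copied from the two perf stat outputs quoted above.
KERNEL_PROPER = {
    "cpu-cycles": 185_719_857_837,
    "instructions": 99_886_030_243,
    "cache-references": 1_696_072_702,
    "cache-misses": 786_929_244,
    "branch-instructions": 16_867_747_688,
    "branch-misses": 86_752_646,
}

MODULE = {
    "cpu-cycles": 9_388_787_678,
    "instructions": 1_706_203_221,
    "cache-references": 551_010_961,
    "cache-misses": 369_632_492,
    "branch-instructions": 291_358_658,
    "branch-misses": 10_291_678,
}


def derived_metrics(counters):
    """Hypothetical helper: turn raw perf counters into comparable ratios."""
    return {
        # Instructions retired per cycle; a low value suggests stalls.
        "ipc": counters["instructions"] / counters["cpu-cycles"],
        # Fraction of cache references that missed.
        "cache_miss_ratio": counters["cache-misses"] / counters["cache-references"],
        # Fraction of branches that were mispredicted.
        "branch_miss_ratio": counters["branch-misses"] / counters["branch-instructions"],
        # Cache misses per 1000 instructions (not reported by perf above).
        "cache_mpki": 1000 * counters["cache-misses"] / counters["instructions"],
    }


if __name__ == "__main__":
    for name, counters in (("kernel proper", KERNEL_PROPER), ("module", MODULE)):
        m = derived_metrics(counters)
        print(
            f"{name:14s} IPC={m['ipc']:.2f} "
            f"cache-miss={100 * m['cache_miss_ratio']:.1f}% "
            f"branch-miss={100 * m['branch_miss_ratio']:.2f}% "
            f"cache-MPKI={m['cache_mpki']:.1f}"
        )
```

The IPC and miss ratios computed this way should agree with the annotations perf printed; the per-instruction miss figure only restates the same counters in a form that does not depend on how long each run lasted.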


Thread overview: 16+ messages
2012-05-29 23:50 Module vs Kernel main performance Abu Rasheda
2012-05-30  4:18 ` Mulyadi Santosa
2012-05-30  4:51   ` Abu Rasheda
2012-05-30 16:45     ` Mulyadi Santosa
2012-05-30 21:44       ` Abu Rasheda
2012-05-31  0:17         ` Abu Rasheda
2012-05-31  5:35         ` Mulyadi Santosa
2012-05-31 13:35           ` Abu Rasheda
2012-06-01  0:27             ` Chetan Nanda
2012-06-01 18:52               ` Abu Rasheda
2012-06-07 13:11                 ` Peter Senna Tschudin
2012-06-07 17:47                   ` Abu Rasheda
2012-06-07 18:10                     ` Peter Senna Tschudin
2012-06-09  1:52                       ` Abu Rasheda
2012-06-07 23:36 ` Peter Senna Tschudin
2012-06-07 23:41   ` Abu Rasheda
