From: jamal <hadi@cyberus.ca>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>,
xiaosuo@gmail.com, therbert@google.com, shemminger@vyatta.com,
netdev@vger.kernel.org, Eilon Greenstein <eilong@broadcom.com>,
Brian Bloniarz <bmb@athenacr.com>
Subject: Re: [PATCH net-next-2.6] net: speedup udp receive path
Date: Wed, 28 Apr 2010 19:44:53 -0400
Message-ID: <1272498293.4258.121.camel@bigi>
In-Reply-To: <1272463605.2267.70.camel@edumazet-laptop>
[-- Attachment #1: Type: text/plain, Size: 1188 bytes --]
On Wed, 2010-04-28 at 16:06 +0200, Eric Dumazet wrote:
> Here it is ;)
Sorry - things got a little hectic with TheMan.
I am afraid I don't have good news.
Actually, I should say I don't have good news in regards to rps.
For my sample app, two things seem to be happening:
a) The overall performance has gotten better for both rps
and non-rps.
b) non-rps is now performing relatively better.
This is just what I see in net-next; it is not related to your patch.
It seems the kernels I tested prior to April 23 showed rps better.
The one I tested on Apr 23 showed rps being about the same as non-rps.
As I stated in my last result posting, I thought I didn't test properly,
but I tested again today and saw the same thing. And now non-rps is
_consistently_ better.
So some regression is going on...
Your patch has improved the performance of rps relative to what is in
net-next very slightly; but it has also improved the performance of
non-rps;->
My traces look different for the app cpu than yours - likely because of
the apps being different.
At the moment I don't have time to dig deeper into the code, but I could
test as cycles show up.
I am attaching the profile traces and results.
cheers,
jamal
[-- Attachment #2: sum-apr23and28.txt --]
[-- Type: text/plain, Size: 1469 bytes --]
April 23 net-next
kernel       sink      cpu all   cpuint    cpuapp
---------------------------------------------------------
nn           93.95%    84.5%     99.8%     79.8%
nn-rps       96.41%    85.4%     95.5%     82.5%
nn-cl        97.29%    84.0%     99.9%     79.6%
nn-cl-rps    97.76%    86.5%     96.5%     84.8%
nn: Basic net-next from Apr23
nn-rps: Basic net-next from Apr23 with rps mask ee and irq affinity to cpu0
nn-cl: Basic net-next from Apr23 + Changli patch
nn-cl-rps: Basic net-next from Apr23 + Changli patch + rps mask ee,irq aff cpu0
sink: the amount of traffic the system was able to sink in.
cpu all: avg % system cpu consumed in test
cpuint: avg %cpu consumed by the cpu where interrupts happened
cpuapp: avg %cpu consumed by a sample cpu which did app processing
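For readers unfamiliar with the mask notation in the legend above: an rps mask is a hexadecimal CPU bitmap in which bit n stands for CPU n. The following is a minimal Python sketch of decoding such a mask. The helper name `cpus_in_mask` is made up for illustration and is not part of any kernel tool, and the sketch assumes a single hex word with no comma-separated groups.

```python
def cpus_in_mask(hex_mask):
    """Return the CPU numbers whose bits are set in a hex CPU bitmap.

    Hypothetical helper for illustration; assumes one plain hex word.
    """
    value = int(hex_mask, 16)
    cpus = []
    bit = 0
    while value:
        if value & 1:          # lowest bit set -> this CPU is in the mask
            cpus.append(bit)
        value >>= 1
        bit += 1
    return cpus


print(cpus_in_mask("ee"))  # CPUs allowed to do RPS processing
print(cpus_in_mask("01"))  # a mask that selects only cpu0
```

Decoded this way, mask ee covers CPUs 1, 2, 3, 5, 6 and 7, so cpu0 (which takes the interrupts in these tests) and CPU 4 are left out of RPS processing.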
Now repeat with Eric's changes and the kernel from Apr 28
kernel       sink      cpu all   cpuint    cpuapp
---------------------------------------------------------
nn2          98.78%    83.6%     100.0%    82.8%
nn2-rps      94.43%    84.2%     98.1%     82.0%
nn2-ed       98.74%    83.2%     99.9%     81.6%
nn2-ed-rps   95.15%    84.5%     97.3%     82.1%
nn2: Basic net-next from Apr28
nn2-rps: Basic net-next from Apr28 with rps mask ee and irq affinity to cpu0
nn2-ed: Basic net-next from Apr28 + Eric patch
nn2-ed-rps: Basic net-next from Apr28 + Eric patch + rps mask ee,irq aff cpu0
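To make the rps versus non-rps comparison explicit, here is a small Python sketch that computes the difference in sink percentage for each kernel pair. The numbers are copied from the two summary tables; the dictionary layout and the `rps_delta` helper are only an illustration, not part of the test harness used for these runs.

```python
# Sink percentages copied from the two summary tables in this attachment.
sink = {
    "nn": 93.95, "nn-rps": 96.41,
    "nn-cl": 97.29, "nn-cl-rps": 97.76,
    "nn2": 98.78, "nn2-rps": 94.43,
    "nn2-ed": 98.74, "nn2-ed-rps": 95.15,
}


def rps_delta(base):
    """Sink with rps minus sink without rps, in percentage points."""
    return round(sink[base + "-rps"] - sink[base], 2)


deltas = {base: rps_delta(base) for base in ("nn", "nn-cl", "nn2", "nn2-ed")}
for base, delta in deltas.items():
    print(f"{base:8s} rps - non-rps = {delta:+.2f} points")
```

The difference is positive for both Apr 23 kernels and negative for both Apr 28 kernels, which is the reversal described in the message body.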
[-- Attachment #3: nn-apr28-summary.txt --]
[-- Type: text/plain, Size: 78977 bytes --]
I: net-next
Average udp sink: 98.78%
--------------------------------------------------------------------------------------------------
PerfTop: 3632 irqs/sec kernel:83.7% [1000Hz cycles], (all, 8 CPUs)
--------------------------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ___________________________ ____________________
2738.00 9.8% sky2_poll [sky2]
1543.00 5.5% _raw_spin_lock_irqsave [kernel]
1019.00 3.7% system_call [kernel]
740.00 2.7% copy_user_generic_string [kernel]
687.00 2.5% fget [kernel]
640.00 2.3% _raw_spin_unlock_irqrestore [kernel]
634.00 2.3% sys_epoll_ctl [kernel]
613.00 2.2% datagram_poll [kernel]
553.00 2.0% _raw_spin_lock_bh [kernel]
530.00 1.9% kmem_cache_free [kernel]
522.00 1.9% schedule [kernel]
487.00 1.7% vread_tsc [kernel].vsyscall_fn
467.00 1.7% _raw_spin_lock [kernel]
432.00 1.5% udp_recvmsg [kernel]
426.00 1.5% kmem_cache_alloc [kernel]
418.00 1.5% __udp4_lib_lookup [kernel]
417.00 1.5% sys_epoll_wait [kernel]
376.00 1.3% fput [kernel]
361.00 1.3% ip_route_input [kernel]
344.00 1.2% local_bh_enable_ip [kernel]
326.00 1.2% ip_rcv [kernel]
321.00 1.2% first_packet_length [kernel]
307.00 1.1% ep_remove [kernel]
303.00 1.1% dst_release [kernel]
301.00 1.1% skb_copy_datagram_iovec [kernel]
297.00 1.1% mutex_lock [kernel]
--------------------------------------------------------------------------------------------------
PerfTop: 4018 irqs/sec kernel:83.3% [1000Hz cycles], (all, 8 CPUs)
--------------------------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ___________________________ ______________________
4274.00 9.7% sky2_poll [sky2]
2473.00 5.6% _raw_spin_lock_irqsave [kernel]
1585.00 3.6% system_call [kernel]
1179.00 2.7% copy_user_generic_string [kernel]
1089.00 2.5% fget [kernel]
1019.00 2.3% _raw_spin_unlock_irqrestore [kernel]
1011.00 2.3% sys_epoll_ctl [kernel]
965.00 2.2% datagram_poll [kernel]
902.00 2.0% kmem_cache_free [kernel]
841.00 1.9% _raw_spin_lock_bh [kernel]
837.00 1.9% schedule [kernel]
735.00 1.7% vread_tsc [kernel].vsyscall_fn
730.00 1.7% udp_recvmsg [kernel]
729.00 1.7% _raw_spin_lock [kernel]
678.00 1.5% kmem_cache_alloc [kernel]
651.00 1.5% sys_epoll_wait [kernel]
635.00 1.4% __udp4_lib_lookup [kernel]
595.00 1.3% fput [kernel]
568.00 1.3% local_bh_enable_ip [kernel]
562.00 1.3% ip_route_input [kernel]
516.00 1.2% dst_release [kernel]
502.00 1.1% ep_remove [kernel]
485.00 1.1% skb_copy_datagram_iovec [kernel]
484.00 1.1% first_packet_length [kernel]
476.00 1.1% ip_rcv [kernel]
470.00 1.1% __alloc_skb [kernel]
459.00 1.0% epoll_ctl /lib/libc-2.7.so
458.00 1.0% mutex_lock [kernel]
--------------------------------------------------------------------------------------------------
PerfTop: 1000 irqs/sec kernel:100.0% [1000Hz cycles], (all, cpu: 0)
--------------------------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ___________________________ ________
3534.00 34.7% sky2_poll [sky2]
545.00 5.3% __udp4_lib_lookup [kernel]
537.00 5.3% ip_route_input [kernel]
427.00 4.2% _raw_spin_lock_irqsave [kernel]
401.00 3.9% __alloc_skb [kernel]
360.00 3.5% ip_rcv [kernel]
332.00 3.3% _raw_spin_lock [kernel]
292.00 2.9% sock_queue_rcv_skb [kernel]
291.00 2.9% __udp4_lib_rcv [kernel]
273.00 2.7% sock_def_readable [kernel]
269.00 2.6% __netif_receive_skb [kernel]
209.00 2.1% __wake_up_common [kernel]
196.00 1.9% __kmalloc [kernel]
164.00 1.6% _raw_read_lock [kernel]
157.00 1.5% kmem_cache_alloc [kernel]
157.00 1.5% ep_poll_callback [kernel]
133.00 1.3% resched_task [kernel]
128.00 1.3% task_rq_lock [kernel]
120.00 1.2% swiotlb_sync_single [kernel]
120.00 1.2% sky2_rx_submit [sky2]
117.00 1.1% udp_queue_rcv_skb [kernel]
108.00 1.1% ip_local_deliver [kernel]
104.00 1.0% try_to_wake_up [kernel]
102.00 1.0% _raw_spin_unlock_irqrestore [kernel]
98.00 1.0% select_task_rq_fair [kernel]
--------------------------------------------------------------------------------------------------
PerfTop: 1000 irqs/sec kernel:100.0% [1000Hz cycles], (all, cpu: 0)
--------------------------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ___________________________ ________
4601.00 34.0% sky2_poll [sky2]
732.00 5.4% __udp4_lib_lookup [kernel]
724.00 5.3% ip_route_input [kernel]
527.00 3.9% _raw_spin_lock_irqsave [kernel]
520.00 3.8% __alloc_skb [kernel]
483.00 3.6% ip_rcv [kernel]
441.00 3.3% _raw_spin_lock [kernel]
401.00 3.0% sock_queue_rcv_skb [kernel]
373.00 2.8% __udp4_lib_rcv [kernel]
365.00 2.7% sock_def_readable [kernel]
353.00 2.6% __netif_receive_skb [kernel]
285.00 2.1% __wake_up_common [kernel]
273.00 2.0% __kmalloc [kernel]
230.00 1.7% _raw_read_lock [kernel]
208.00 1.5% ep_poll_callback [kernel]
199.00 1.5% kmem_cache_alloc [kernel]
180.00 1.3% task_rq_lock [kernel]
172.00 1.3% sky2_rx_submit [sky2]
171.00 1.3% resched_task [kernel]
165.00 1.2% ip_local_deliver [kernel]
162.00 1.2% udp_queue_rcv_skb [kernel]
158.00 1.2% _raw_spin_unlock_irqrestore [kernel]
148.00 1.1% select_task_rq_fair [kernel]
144.00 1.1% try_to_wake_up [kernel]
142.00 1.0% sky2_remove [sky2]
140.00 1.0% swiotlb_sync_single [kernel]
95.00 0.7% cache_alloc_refill [kernel]
92.00 0.7% dev_gro_receive [kernel]
82.00 0.6% is_swiotlb_buffer [kernel]
--------------------------------------------------------------------------------------------------
PerfTop: 622 irqs/sec kernel:74.9% [1000Hz cycles], (all, cpu: 2)
--------------------------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ___________________________ _____________________________________
113.00 6.5% _raw_spin_lock_irqsave /lib/modules/2.6.34-rc5/build/vmlinux
105.00 6.0% system_call /lib/modules/2.6.34-rc5/build/vmlinux
69.00 3.9% fget /lib/modules/2.6.34-rc5/build/vmlinux
64.00 3.7% datagram_poll /lib/modules/2.6.34-rc5/build/vmlinux
56.00 3.2% copy_user_generic_string /lib/modules/2.6.34-rc5/build/vmlinux
55.00 3.1% sys_epoll_ctl /lib/modules/2.6.34-rc5/build/vmlinux
53.00 3.0% _raw_spin_unlock_irqrestore /lib/modules/2.6.34-rc5/build/vmlinux
46.00 2.6% _raw_spin_lock_bh /lib/modules/2.6.34-rc5/build/vmlinux
42.00 2.4% kmem_cache_free /lib/modules/2.6.34-rc5/build/vmlinux
37.00 2.1% dst_release /lib/modules/2.6.34-rc5/build/vmlinux
37.00 2.1% schedule /lib/modules/2.6.34-rc5/build/vmlinux
35.00 2.0% mutex_lock /lib/modules/2.6.34-rc5/build/vmlinux
35.00 2.0% vread_tsc [kernel].vsyscall_fn
35.00 2.0% udp_recvmsg /lib/modules/2.6.34-rc5/build/vmlinux
34.00 1.9% sys_epoll_wait /lib/modules/2.6.34-rc5/build/vmlinux
31.00 1.8% local_bh_enable_ip /lib/modules/2.6.34-rc5/build/vmlinux
29.00 1.7% ep_remove /lib/modules/2.6.34-rc5/build/vmlinux
28.00 1.6% kmem_cache_alloc /lib/modules/2.6.34-rc5/build/vmlinux
27.00 1.5% process_recv /home/hadi/udp_sink/mcpudp
25.00 1.4% mutex_unlock /lib/modules/2.6.34-rc5/build/vmlinux
24.00 1.4% ep_send_events_proc /lib/modules/2.6.34-rc5/build/vmlinux
24.00 1.4% clock_gettime /lib/librt-2.7.so
23.00 1.3% fput /lib/modules/2.6.34-rc5/build/vmlinux
23.00 1.3% skb_copy_datagram_iovec /lib/modules/2.6.34-rc5/build/vmlinux
20.00 1.1% sock_recv_ts_and_drops /lib/modules/2.6.34-rc5/build/vmlinux
20.00 1.1% inet_recvmsg /lib/modules/2.6.34-rc5/build/vmlinux
19.00 1.1% epoll_dispatch /usr/lib/libevent-1.3e.so.1.0.3
19.00 1.1% first_packet_length /lib/modules/2.6.34-rc5/build/vmlinux
--------------------------------------------------------------------------------------------------
PerfTop: 625 irqs/sec kernel:83.0% [1000Hz cycles], (all, cpu: 2)
--------------------------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ___________________________ _____________________________________
315.00 6.8% _raw_spin_lock_irqsave /lib/modules/2.6.34-rc5/build/vmlinux
232.00 5.0% system_call /lib/modules/2.6.34-rc5/build/vmlinux
175.00 3.8% fget /lib/modules/2.6.34-rc5/build/vmlinux
174.00 3.8% datagram_poll /lib/modules/2.6.34-rc5/build/vmlinux
168.00 3.6% sys_epoll_ctl /lib/modules/2.6.34-rc5/build/vmlinux
155.00 3.4% copy_user_generic_string /lib/modules/2.6.34-rc5/build/vmlinux
144.00 3.1% kmem_cache_free /lib/modules/2.6.34-rc5/build/vmlinux
133.00 2.9% _raw_spin_lock_bh /lib/modules/2.6.34-rc5/build/vmlinux
126.00 2.7% _raw_spin_unlock_irqrestore /lib/modules/2.6.34-rc5/build/vmlinux
113.00 2.4% vread_tsc [kernel].vsyscall_fn
110.00 2.4% _raw_spin_unlock_bh /lib/modules/2.6.34-rc5/build/vmlinux
106.00 2.3% schedule /lib/modules/2.6.34-rc5/build/vmlinux
103.00 2.2% local_bh_enable_ip /lib/modules/2.6.34-rc5/build/vmlinux
101.00 2.2% udp_recvmsg /lib/modules/2.6.34-rc5/build/vmlinux
97.00 2.1% sys_epoll_wait /lib/modules/2.6.34-rc5/build/vmlinux
84.00 1.8% dst_release /lib/modules/2.6.34-rc5/build/vmlinux
78.00 1.7% fput /lib/modules/2.6.34-rc5/build/vmlinux
75.00 1.6% first_packet_length /lib/modules/2.6.34-rc5/build/vmlinux
74.00 1.6% kmem_cache_alloc /lib/modules/2.6.34-rc5/build/vmlinux
71.00 1.5% ep_remove /lib/modules/2.6.34-rc5/build/vmlinux
69.00 1.5% epoll_ctl /lib/libc-2.7.so
67.00 1.5% mutex_lock /lib/modules/2.6.34-rc5/build/vmlinux
65.00 1.4% sock_recv_ts_and_drops /lib/modules/2.6.34-rc5/build/vmlinux
65.00 1.4% inet_recvmsg /lib/modules/2.6.34-rc5/build/vmlinux
64.00 1.4% process_recv /home/hadi/udp_sink/mcpudp
62.00 1.3% skb_copy_datagram_iovec /lib/modules/2.6.34-rc5/build/vmlinux
60.00 1.3% clock_gettime /lib/librt-2.7.so
--------------------------------------------------------------------------------------------------
PerfTop: 700 irqs/sec kernel:84.3% [1000Hz cycles], (all, cpu: 2)
--------------------------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ___________________________ _____________________________________
489.00 6.4% _raw_spin_lock_irqsave /lib/modules/2.6.34-rc5/build/vmlinux
376.00 4.9% system_call /lib/modules/2.6.34-rc5/build/vmlinux
308.00 4.0% fget /lib/modules/2.6.34-rc5/build/vmlinux
302.00 3.9% copy_user_generic_string /lib/modules/2.6.34-rc5/build/vmlinux
280.00 3.6% sys_epoll_ctl /lib/modules/2.6.34-rc5/build/vmlinux
274.00 3.6% datagram_poll /lib/modules/2.6.34-rc5/build/vmlinux
249.00 3.2% kmem_cache_free /lib/modules/2.6.34-rc5/build/vmlinux
223.00 2.9% _raw_spin_unlock_irqrestore /lib/modules/2.6.34-rc5/build/vmlinux
221.00 2.9% _raw_spin_unlock_bh /lib/modules/2.6.34-rc5/build/vmlinux
221.00 2.9% local_bh_enable_ip /lib/modules/2.6.34-rc5/build/vmlinux
208.00 2.7% vread_tsc [kernel].vsyscall_fn
200.00 2.6% _raw_spin_lock_bh /lib/modules/2.6.34-rc5/build/vmlinux
191.00 2.5% schedule /lib/modules/2.6.34-rc5/build/vmlinux
188.00 2.4% sys_epoll_wait /lib/modules/2.6.34-rc5/build/vmlinux
177.00 2.3% udp_recvmsg /lib/modules/2.6.34-rc5/build/vmlinux
141.00 1.8% fput /lib/modules/2.6.34-rc5/build/vmlinux
140.00 1.8% first_packet_length /lib/modules/2.6.34-rc5/build/vmlinux
128.00 1.7% kmem_cache_alloc /lib/modules/2.6.34-rc5/build/vmlinux
119.00 1.5% dst_release /lib/modules/2.6.34-rc5/build/vmlinux
105.00 1.4% ep_remove /lib/modules/2.6.34-rc5/build/vmlinux
104.00 1.4% epoll_ctl /lib/libc-2.7.so
102.00 1.3% skb_copy_datagram_iovec /lib/modules/2.6.34-rc5/build/vmlinux
100.00 1.3% mutex_lock /lib/modules/2.6.34-rc5/build/vmlinux
95.00 1.2% mutex_unlock /lib/modules/2.6.34-rc5/build/vmlinux
94.00 1.2% sock_recv_ts_and_drops /lib/modules/2.6.34-rc5/build/vmlinux
92.00 1.2% ep_send_events_proc /lib/modules/2.6.34-rc5/build/vmlinux
92.00 1.2% clock_gettime /lib/librt-2.7.so
92.00 1.2% __skb_recv_datagram /lib/modules/2.6.34-rc5/build/vmlinux
91.00 1.2% process_recv /home/hadi/udp_sink/mcpudp
88.00 1.1% kfree /lib/modules/2.6.34-rc5/build/vmlinux
86.00 1.1% _raw_spin_lock /lib/modules/2.6.34-rc5/build/vmlinux
II: net-next with rps = ee
Average udp sink: 94.43%
--------------------------------------------------------------------------------------------------
PerfTop: 4328 irqs/sec kernel:84.0% [1000Hz cycles], (all, 8 CPUs)
--------------------------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ______________________________ ______________________
3908.00 17.1% sky2_poll [sky2]
694.00 3.0% _raw_spin_lock_irqsave [kernel]
584.00 2.6% sky2_intr [sky2]
557.00 2.4% system_call [kernel]
490.00 2.1% _raw_spin_unlock_irqrestore [kernel]
488.00 2.1% fget [kernel]
425.00 1.9% ip_rcv [kernel]
405.00 1.8% sys_epoll_ctl [kernel]
398.00 1.7% __netif_receive_skb [kernel]
375.00 1.6% _raw_spin_lock [kernel]
365.00 1.6% copy_user_generic_string [kernel]
363.00 1.6% ip_route_input [kernel]
350.00 1.5% kmem_cache_free [kernel]
346.00 1.5% schedule [kernel]
319.00 1.4% call_function_single_interrupt [kernel]
295.00 1.3% vread_tsc [kernel].vsyscall_fn
270.00 1.2% __udp4_lib_lookup [kernel]
264.00 1.2% kmem_cache_alloc [kernel]
235.00 1.0% fput [kernel]
219.00 1.0% datagram_poll [kernel]
--------------------------------------------------------------------------------------------------
PerfTop: 3791 irqs/sec kernel:84.4% [1000Hz cycles], (all, 8 CPUs)
--------------------------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ______________________________ ______________________
6274.00 17.2% sky2_poll [sky2]
1139.00 3.1% _raw_spin_lock_irqsave [kernel]
953.00 2.6% system_call [kernel]
942.00 2.6% sky2_intr [sky2]
785.00 2.2% _raw_spin_unlock_irqrestore [kernel]
745.00 2.0% fget [kernel]
695.00 1.9% ip_rcv [kernel]
653.00 1.8% sys_epoll_ctl [kernel]
609.00 1.7% ip_route_input [kernel]
606.00 1.7% __netif_receive_skb [kernel]
583.00 1.6% _raw_spin_lock [kernel]
569.00 1.6% kmem_cache_free [kernel]
564.00 1.5% copy_user_generic_string [kernel]
554.00 1.5% schedule [kernel]
510.00 1.4% call_function_single_interrupt [kernel]
488.00 1.3% vread_tsc [kernel].vsyscall_fn
459.00 1.3% kmem_cache_alloc [kernel]
417.00 1.1% __udp4_lib_lookup [kernel]
387.00 1.1% fput [kernel]
358.00 1.0% __udp4_lib_rcv [kernel]
347.00 1.0% event_base_loop libevent-1.3e.so.1.0.3
-----------------------------------------------------------------------------------------------
PerfTop: 997 irqs/sec kernel:98.2% [1000Hz cycles], (all, cpu: 0)
-----------------------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ___________________________________ ________
3926.00 61.0% sky2_poll [sky2]
671.00 10.4% sky2_intr [sky2]
192.00 3.0% __alloc_skb [kernel]
126.00 2.0% get_rps_cpu [kernel]
111.00 1.7% __kmalloc [kernel]
97.00 1.5% enqueue_to_backlog [kernel]
95.00 1.5% _raw_spin_lock_irqsave [kernel]
93.00 1.4% _raw_spin_lock [kernel]
79.00 1.2% kmem_cache_alloc [kernel]
63.00 1.0% sky2_rx_submit [sky2]
-----------------------------------------------------------------------------------------------
PerfTop: 980 irqs/sec kernel:98.0% [1000Hz cycles], (all, cpu: 0)
-----------------------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ___________________________________ ____________________
6945.00 61.4% sky2_poll [sky2]
1219.00 10.8% sky2_intr [sky2]
323.00 2.9% __alloc_skb [kernel]
243.00 2.1% get_rps_cpu [kernel]
195.00 1.7% __kmalloc [kernel]
161.00 1.4% _raw_spin_lock_irqsave [kernel]
149.00 1.3% enqueue_to_backlog [kernel]
139.00 1.2% _raw_spin_lock [kernel]
136.00 1.2% kmem_cache_alloc [kernel]
135.00 1.2% irq_entries_start [kernel]
108.00 1.0% sky2_rx_submit [sky2]
-----------------------------------------------------------------------------------------------
PerfTop: 458 irqs/sec kernel:80.8% [1000Hz cycles], (all, cpu: 2)
-----------------------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ______________________________ _____________________________________
130.00 4.7% _raw_spin_lock_irqsave /lib/modules/2.6.34-rc5/build/vmlinux
114.00 4.1% system_call /lib/modules/2.6.34-rc5/build/vmlinux
91.00 3.3% ip_rcv /lib/modules/2.6.34-rc5/build/vmlinux
82.00 3.0% _raw_spin_unlock_irqrestore /lib/modules/2.6.34-rc5/build/vmlinux
74.00 2.7% call_function_single_interrupt /lib/modules/2.6.34-rc5/build/vmlinux
74.00 2.7% fget /lib/modules/2.6.34-rc5/build/vmlinux
71.00 2.6% __netif_receive_skb /lib/modules/2.6.34-rc5/build/vmlinux
69.00 2.5% ip_route_input /lib/modules/2.6.34-rc5/build/vmlinux
66.00 2.4% schedule /lib/modules/2.6.34-rc5/build/vmlinux
63.00 2.3% kmem_cache_free /lib/modules/2.6.34-rc5/build/vmlinux
61.00 2.2% sys_epoll_ctl /lib/modules/2.6.34-rc5/build/vmlinux
61.00 2.2% __udp4_lib_lookup /lib/modules/2.6.34-rc5/build/vmlinux
57.00 2.1% copy_user_generic_string /lib/modules/2.6.34-rc5/build/vmlinux
49.00 1.8% vread_tsc [kernel].vsyscall_fn
49.00 1.8% _raw_spin_lock /lib/modules/2.6.34-rc5/build/vmlinux
47.00 1.7% ep_remove /lib/modules/2.6.34-rc5/build/vmlinux
45.00 1.6% fput /lib/modules/2.6.34-rc5/build/vmlinux
44.00 1.6% sys_epoll_wait /lib/modules/2.6.34-rc5/build/vmlinux
40.00 1.4% kmem_cache_alloc /lib/modules/2.6.34-rc5/build/vmlinux
40.00 1.4% local_bh_enable_ip /lib/modules/2.6.34-rc5/build/vmlinux
38.00 1.4% sock_recv_ts_and_drops /lib/modules/2.6.34-rc5/build/vmlinux
35.00 1.3% process_recv /home/hadi/udp_sink/mcpudp
34.00 1.2% mutex_unlock /lib/modules/2.6.34-rc5/build/vmlinux
31.00 1.1% _raw_spin_unlock_bh /lib/modules/2.6.34-rc5/build/vmlinux
31.00 1.1% event_base_loop /usr/lib/libevent-1.3e.so.1.0.3
-----------------------------------------------------------------------------------------------
PerfTop: 552 irqs/sec kernel:82.4% [1000Hz cycles], (all, cpu: 2)
-----------------------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ______________________________ _____________________________________
204.00 4.7% _raw_spin_lock_irqsave /lib/modules/2.6.34-rc5/build/vmlinux
169.00 3.9% system_call /lib/modules/2.6.34-rc5/build/vmlinux
151.00 3.5% _raw_spin_unlock_irqrestore /lib/modules/2.6.34-rc5/build/vmlinux
132.00 3.0% ip_rcv /lib/modules/2.6.34-rc5/build/vmlinux
129.00 3.0% fget /lib/modules/2.6.34-rc5/build/vmlinux
123.00 2.8% __netif_receive_skb /lib/modules/2.6.34-rc5/build/vmlinux
115.00 2.6% ip_route_input /lib/modules/2.6.34-rc5/build/vmlinux
112.00 2.6% call_function_single_interrupt /lib/modules/2.6.34-rc5/build/vmlinux
112.00 2.6% sys_epoll_ctl /lib/modules/2.6.34-rc5/build/vmlinux
103.00 2.4% schedule /lib/modules/2.6.34-rc5/build/vmlinux
94.00 2.2% kmem_cache_free /lib/modules/2.6.34-rc5/build/vmlinux
89.00 2.0% copy_user_generic_string /lib/modules/2.6.34-rc5/build/vmlinux
86.00 2.0% _raw_spin_lock /lib/modules/2.6.34-rc5/build/vmlinux
83.00 1.9% __udp4_lib_lookup /lib/modules/2.6.34-rc5/build/vmlinux
76.00 1.7% vread_tsc [kernel].vsyscall_fn
68.00 1.6% ep_remove /lib/modules/2.6.34-rc5/build/vmlinux
67.00 1.5% fput /lib/modules/2.6.34-rc5/build/vmlinux
64.00 1.5% kmem_cache_alloc /lib/modules/2.6.34-rc5/build/vmlinux
62.00 1.4% sys_epoll_wait /lib/modules/2.6.34-rc5/build/vmlinux
60.00 1.4% dst_release /lib/modules/2.6.34-rc5/build/vmlinux
60.00 1.4% sock_recv_ts_and_drops /lib/modules/2.6.34-rc5/build/vmlinux
56.00 1.3% _raw_spin_lock_bh /lib/modules/2.6.34-rc5/build/vmlinux
53.00 1.2% event_base_loop /usr/lib/libevent-1.3e.so.1.0.3
51.00 1.2% datagram_poll /lib/modules/2.6.34-rc5/build/vmlinux
48.00 1.1% epoll_ctl /lib/libc-2.7.so
48.00 1.1% kfree /lib/modules/2.6.34-rc5/build/vmlinux
47.00 1.1% _raw_spin_unlock_bh /lib/modules/2.6.34-rc5/build/vmlinux
47.00 1.1% mutex_unlock /lib/modules/2.6.34-rc5/build/vmlinux
45.00 1.0% __udp4_lib_rcv /lib/modules/2.6.34-rc5/build/vmlinux
45.00 1.0% tick_nohz_stop_sched_tick /lib/modules/2.6.34-rc5/build/vmlinux
-----------------------------------------------------------------------------------------------
PerfTop: 408 irqs/sec kernel:82.1% [1000Hz cycles], (all, cpu: 2)
-----------------------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ______________________________ _____________________________________
240.00 4.8% _raw_spin_lock_irqsave /lib/modules/2.6.34-rc5/build/vmlinux
200.00 4.0% system_call /lib/modules/2.6.34-rc5/build/vmlinux
165.00 3.3% _raw_spin_unlock_irqrestore /lib/modules/2.6.34-rc5/build/vmlinux
161.00 3.2% ip_rcv /lib/modules/2.6.34-rc5/build/vmlinux
158.00 3.1% fget /lib/modules/2.6.34-rc5/build/vmlinux
150.00 3.0% sys_epoll_ctl /lib/modules/2.6.34-rc5/build/vmlinux
135.00 2.7% __netif_receive_skb /lib/modules/2.6.34-rc5/build/vmlinux
122.00 2.4% ip_route_input /lib/modules/2.6.34-rc5/build/vmlinux
117.00 2.3% call_function_single_interrupt /lib/modules/2.6.34-rc5/build/vmlinux
114.00 2.3% schedule /lib/modules/2.6.34-rc5/build/vmlinux
110.00 2.2% _raw_spin_lock /lib/modules/2.6.34-rc5/build/vmlinux
108.00 2.1% copy_user_generic_string /lib/modules/2.6.34-rc5/build/vmlinux
101.00 2.0% kmem_cache_free /lib/modules/2.6.34-rc5/build/vmlinux
94.00 1.9% vread_tsc [kernel].vsyscall_fn
90.00 1.8% __udp4_lib_lookup /lib/modules/2.6.34-rc5/build/vmlinux
85.00 1.7% fput /lib/modules/2.6.34-rc5/build/vmlinux
78.00 1.5% dst_release /lib/modules/2.6.34-rc5/build/vmlinux
77.00 1.5% ep_remove /lib/modules/2.6.34-rc5/build/vmlinux
75.00 1.5% kmem_cache_alloc /lib/modules/2.6.34-rc5/build/vmlinux
74.00 1.5% _raw_spin_lock_bh /lib/modules/2.6.34-rc5/build/vmlinux
69.00 1.4% sys_epoll_wait /lib/modules/2.6.34-rc5/build/vmlinux
68.00 1.3% event_base_loop /usr/lib/libevent-1.3e.so.1.0.3
68.00 1.3% sock_recv_ts_and_drops /lib/modules/2.6.34-rc5/build/vmlinux
62.00 1.2% _raw_spin_unlock_bh /lib/modules/2.6.34-rc5/build/vmlinux
62.00 1.2% datagram_poll /lib/modules/2.6.34-rc5/build/vmlinux
55.00 1.1% epoll_ctl /lib/libc-2.7.so
53.00 1.1% local_bh_enable_ip /lib/modules/2.6.34-rc5/build/vmlinux
53.00 1.1% tick_nohz_stop_sched_tick /lib/modules/2.6.34-rc5/build/vmlinux
52.00 1.0% mutex_unlock /lib/modules/2.6.34-rc5/build/vmlinux
-----------------------------------------------------------------------------------------------
PerfTop: 440 irqs/sec kernel:85.0% [1000Hz cycles], (all, cpu: 2)
-----------------------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ______________________________ _____________________________________
226.00 4.6% _raw_spin_lock_irqsave /lib/modules/2.6.34-rc5/build/vmlinux
213.00 4.3% system_call /lib/modules/2.6.34-rc5/build/vmlinux
154.00 3.1% _raw_spin_unlock_irqrestore /lib/modules/2.6.34-rc5/build/vmlinux
148.00 3.0% ip_rcv /lib/modules/2.6.34-rc5/build/vmlinux
143.00 2.9% fget /lib/modules/2.6.34-rc5/build/vmlinux
143.00 2.9% ip_route_input /lib/modules/2.6.34-rc5/build/vmlinux
140.00 2.8% __netif_receive_skb /lib/modules/2.6.34-rc5/build/vmlinux
124.00 2.5% call_function_single_interrupt /lib/modules/2.6.34-rc5/build/vmlinux
124.00 2.5% sys_epoll_ctl /lib/modules/2.6.34-rc5/build/vmlinux
104.00 2.1% copy_user_generic_string /lib/modules/2.6.34-rc5/build/vmlinux
103.00 2.1% vread_tsc [kernel].vsyscall_fn
101.00 2.0% schedule /lib/modules/2.6.34-rc5/build/vmlinux
100.00 2.0% kmem_cache_free /lib/modules/2.6.34-rc5/build/vmlinux
99.00 2.0% _raw_spin_lock /lib/modules/2.6.34-rc5/build/vmlinux
93.00 1.9% __udp4_lib_lookup /lib/modules/2.6.34-rc5/build/vmlinux
80.00 1.6% fput /lib/modules/2.6.34-rc5/build/vmlinux
76.00 1.5% kmem_cache_alloc /lib/modules/2.6.34-rc5/build/vmlinux
75.00 1.5% sock_recv_ts_and_drops /lib/modules/2.6.34-rc5/build/vmlinux
73.00 1.5% dst_release /lib/modules/2.6.34-rc5/build/vmlinux
70.00 1.4% sys_epoll_wait /lib/modules/2.6.34-rc5/build/vmlinux
69.00 1.4% datagram_poll /lib/modules/2.6.34-rc5/build/vmlinux
65.00 1.3% event_base_loop /usr/lib/libevent-1.3e.so.1.0.3
65.00 1.3% ep_remove /lib/modules/2.6.34-rc5/build/vmlinux
III: Kernel compiled with Eric's patch, rps mask 00
Avg udp packets sunk: 98.74%
-------------------------------------------------------------------------------
PerfTop: 4202 irqs/sec kernel:82.5% [1000Hz cycles], (all, 8 CPUs)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ___________________________ ______________________
1639.00 9.0% sky2_poll [sky2]
1051.00 5.8% _raw_spin_lock_irqsave [kernel]
665.00 3.7% system_call [kernel]
578.00 3.2% fget [kernel]
476.00 2.6% _raw_spin_unlock_irqrestore [kernel]
457.00 2.5% copy_user_generic_string [kernel]
427.00 2.4% sys_epoll_ctl [kernel]
401.00 2.2% datagram_poll [kernel]
391.00 2.2% kmem_cache_free [kernel]
349.00 1.9% schedule [kernel]
339.00 1.9% vread_tsc [kernel].vsyscall_fn
323.00 1.8% udp_recvmsg [kernel]
292.00 1.6% kmem_cache_alloc [kernel]
285.00 1.6% _raw_spin_lock [kernel]
272.00 1.5% _raw_spin_lock_bh [kernel]
268.00 1.5% sys_epoll_wait [kernel]
260.00 1.4% fput [kernel]
234.00 1.3% ip_route_input [kernel]
221.00 1.2% __udp4_lib_lookup [kernel]
212.00 1.2% dst_release [kernel]
209.00 1.2% ip_rcv [kernel]
203.00 1.1% ep_remove [kernel]
202.00 1.1% first_packet_length [kernel]
-------------------------------------------------------------------------------
PerfTop: 3999 irqs/sec kernel:82.3% [1000Hz cycles], (all, 8 CPUs)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ___________________________ ______________________
3452.00 9.3% sky2_poll [sky2]
2212.00 5.9% _raw_spin_lock_irqsave [kernel]
1350.00 3.6% system_call [kernel]
1187.00 3.2% fget [kernel]
1010.00 2.7% copy_user_generic_string [kernel]
965.00 2.6% _raw_spin_unlock_irqrestore [kernel]
842.00 2.3% sys_epoll_ctl [kernel]
833.00 2.2% datagram_poll [kernel]
770.00 2.1% kmem_cache_free [kernel]
710.00 1.9% vread_tsc [kernel].vsyscall_fn
688.00 1.8% schedule [kernel]
651.00 1.7% udp_recvmsg [kernel]
603.00 1.6% _raw_spin_lock_bh [kernel]
599.00 1.6% _raw_spin_lock [kernel]
597.00 1.6% sys_epoll_wait [kernel]
594.00 1.6% kmem_cache_alloc [kernel]
553.00 1.5% ip_route_input [kernel]
528.00 1.4% fput [kernel]
496.00 1.3% __udp4_lib_lookup [kernel]
444.00 1.2% dst_release [kernel]
433.00 1.2% ip_rcv [kernel]
408.00 1.1% first_packet_length [kernel]
-------------------------------------------------------------------------------
PerfTop: 3765 irqs/sec kernel:83.7% [1000Hz cycles], (all, 8 CPUs)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ___________________________ ______________________
4275.00 9.5% sky2_poll [sky2]
2684.00 6.0% _raw_spin_lock_irqsave [kernel]
1654.00 3.7% system_call [kernel]
1447.00 3.2% fget [kernel]
1223.00 2.7% copy_user_generic_string [kernel]
1146.00 2.5% _raw_spin_unlock_irqrestore [kernel]
1036.00 2.3% sys_epoll_ctl [kernel]
1019.00 2.3% datagram_poll [kernel]
974.00 2.2% kmem_cache_free [kernel]
843.00 1.9% vread_tsc [kernel].vsyscall_fn
799.00 1.8% schedule [kernel]
761.00 1.7% udp_recvmsg [kernel]
736.00 1.6% kmem_cache_alloc [kernel]
719.00 1.6% _raw_spin_lock_bh [kernel]
716.00 1.6% _raw_spin_lock [kernel]
696.00 1.5% sys_epoll_wait [kernel]
680.00 1.5% ip_route_input [kernel]
657.00 1.5% fput [kernel]
613.00 1.4% __udp4_lib_lookup [kernel]
552.00 1.2% dst_release [kernel]
507.00 1.1% ip_rcv [kernel]
-------------------------------------------------------------------------------
PerfTop: 1001 irqs/sec kernel:99.9% [1000Hz cycles], (all, cpu: 0)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ___________________________ ________
669.00 32.2% sky2_poll [sky2]
128.00 6.2% ip_route_input [kernel]
106.00 5.1% ip_rcv [kernel]
105.00 5.1% __udp4_lib_lookup [kernel]
86.00 4.1% _raw_spin_lock [kernel]
85.00 4.1% _raw_spin_lock_irqsave [kernel]
82.00 3.9% __alloc_skb [kernel]
78.00 3.8% sock_queue_rcv_skb [kernel]
57.00 2.7% __netif_receive_skb [kernel]
53.00 2.6% __wake_up_common [kernel]
47.00 2.3% __udp4_lib_rcv [kernel]
42.00 2.0% sock_def_readable [kernel]
37.00 1.8% kmem_cache_alloc [kernel]
34.00 1.6% ep_poll_callback [kernel]
34.00 1.6% __kmalloc [kernel]
34.00 1.6% select_task_rq_fair [kernel]
30.00 1.4% _raw_read_lock [kernel]
27.00 1.3% _raw_spin_unlock_irqrestore [kernel]
24.00 1.2% sky2_rx_submit [sky2]
22.00 1.1% udp_queue_rcv_skb [kernel]
21.00 1.0% try_to_wake_up [kernel]
-------------------------------------------------------------------------------
PerfTop: 1000 irqs/sec kernel:100.0% [1000Hz cycles], (all, cpu: 0)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ___________________________ ________
3061.00 31.9% sky2_poll [sky2]
529.00 5.5% ip_route_input [kernel]
518.00 5.4% __udp4_lib_lookup [kernel]
424.00 4.4% ip_rcv [kernel]
390.00 4.1% _raw_spin_lock_irqsave [kernel]
389.00 4.1% __alloc_skb [kernel]
365.00 3.8% _raw_spin_lock [kernel]
326.00 3.4% sock_queue_rcv_skb [kernel]
297.00 3.1% __netif_receive_skb [kernel]
273.00 2.8% __udp4_lib_rcv [kernel]
223.00 2.3% sock_def_readable [kernel]
205.00 2.1% __wake_up_common [kernel]
181.00 1.9% __kmalloc [kernel]
151.00 1.6% kmem_cache_alloc [kernel]
147.00 1.5% _raw_read_lock [kernel]
143.00 1.5% ep_poll_callback [kernel]
136.00 1.4% sky2_rx_submit [sky2]
123.00 1.3% task_rq_lock [kernel]
118.00 1.2% _raw_spin_unlock_irqrestore [kernel]
114.00 1.2% select_task_rq_fair [kernel]
104.00 1.1% resched_task [kernel]
104.00 1.1% sky2_remove [sky2]
102.00 1.1% udp_queue_rcv_skb [kernel]
-------------------------------------------------------------------------------
PerfTop: 1001 irqs/sec kernel:100.0% [1000Hz cycles], (all, cpu: 0)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ___________________________ ________
3898.00 31.0% sky2_poll [sky2]
715.00 5.7% ip_route_input [kernel]
651.00 5.2% __udp4_lib_lookup [kernel]
576.00 4.6% ip_rcv [kernel]
534.00 4.2% __alloc_skb [kernel]
518.00 4.1% _raw_spin_lock_irqsave [kernel]
441.00 3.5% sock_queue_rcv_skb [kernel]
439.00 3.5% _raw_spin_lock [kernel]
396.00 3.1% __netif_receive_skb [kernel]
351.00 2.8% __udp4_lib_rcv [kernel]
300.00 2.4% sock_def_readable [kernel]
264.00 2.1% __wake_up_common [kernel]
260.00 2.1% __kmalloc [kernel]
198.00 1.6% kmem_cache_alloc [kernel]
193.00 1.5% ep_poll_callback [kernel]
192.00 1.5% _raw_read_lock [kernel]
168.00 1.3% sky2_rx_submit [sky2]
167.00 1.3% task_rq_lock [kernel]
153.00 1.2% udp_queue_rcv_skb [kernel]
149.00 1.2% _raw_spin_unlock_irqrestore [kernel]
147.00 1.2% ip_local_deliver [kernel]
144.00 1.1% resched_task [kernel]
137.00 1.1% sky2_remove [sky2]
-------------------------------------------------------------------------------
PerfTop: 663 irqs/sec kernel:81.9% [1000Hz cycles], (all, cpu: 2)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ___________________________ ____________________
129.00 7.0% _raw_spin_lock_irqsave [kernel]
84.00 4.5% fget [kernel]
83.00 4.5% system_call [kernel]
82.00 4.4% copy_user_generic_string [kernel]
67.00 3.6% _raw_spin_unlock_irqrestore [kernel]
63.00 3.4% datagram_poll [kernel]
57.00 3.1% udp_recvmsg [kernel]
55.00 3.0% sys_epoll_ctl [kernel]
55.00 3.0% vread_tsc [kernel].vsyscall_fn
43.00 2.3% sys_epoll_wait [kernel]
43.00 2.3% _raw_spin_lock_bh [kernel]
41.00 2.2% first_packet_length [kernel]
40.00 2.2% dst_release [kernel]
37.00 2.0% fput [kernel]
37.00 2.0% kmem_cache_free [kernel]
36.00 1.9% mutex_unlock [kernel]
35.00 1.9% schedule [kernel]
34.00 1.8% skb_copy_datagram_iovec [kernel]
34.00 1.8% ep_remove [kernel]
29.00 1.6% mutex_lock [kernel]
29.00 1.6% _raw_spin_lock [kernel]
28.00 1.5% __skb_recv_datagram [kernel]
25.00 1.4% epoll_ctl /lib/libc-2.7.so
25.00 1.4% tick_nohz_stop_sched_tick [kernel]
-------------------------------------------------------------------------------
PerfTop: 629 irqs/sec kernel:81.1% [1000Hz cycles], (all, cpu: 2)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ___________________________ ______________________
351.00 7.9% _raw_spin_lock_irqsave [kernel]
248.00 5.6% system_call [kernel]
219.00 5.0% fget [kernel]
194.00 4.4% copy_user_generic_string [kernel]
184.00 4.2% datagram_poll [kernel]
162.00 3.7% sys_epoll_ctl [kernel]
159.00 3.6% _raw_spin_unlock_irqrestore [kernel]
129.00 2.9% udp_recvmsg [kernel]
129.00 2.9% kmem_cache_free [kernel]
123.00 2.8% vread_tsc [kernel].vsyscall_fn
108.00 2.4% schedule [kernel]
107.00 2.4% _raw_spin_lock_bh [kernel]
104.00 2.4% sys_epoll_wait [kernel]
100.00 2.3% fput [kernel]
94.00 2.1% dst_release [kernel]
78.00 1.8% first_packet_length [kernel]
73.00 1.7% ep_remove [kernel]
69.00 1.6% epoll_ctl /lib/libc-2.7.so
66.00 1.5% skb_copy_datagram_iovec [kernel]
66.00 1.5% mutex_unlock [kernel]
64.00 1.4% __skb_recv_datagram [kernel]
64.00 1.4% mutex_lock [kernel]
57.00 1.3% sock_recv_ts_and_drops [kernel]
51.00 1.2% kmem_cache_alloc [kernel]
49.00 1.1% ep_send_events_proc [kernel]
-------------------------------------------------------------------------------
PerfTop: 457 irqs/sec kernel:72.0% [1000Hz cycles], (all, cpu: 2)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ___________________________ ______________________
411.00 7.8% _raw_spin_lock_irqsave [kernel]
280.00 5.3% system_call [kernel]
269.00 5.1% fget [kernel]
239.00 4.5% copy_user_generic_string [kernel]
232.00 4.4% datagram_poll [kernel]
175.00 3.3% _raw_spin_unlock_irqrestore [kernel]
170.00 3.2% sys_epoll_ctl [kernel]
169.00 3.2% kmem_cache_free [kernel]
149.00 2.8% udp_recvmsg [kernel]
144.00 2.7% vread_tsc [kernel].vsyscall_fn
129.00 2.4% sys_epoll_wait [kernel]
128.00 2.4% _raw_spin_lock_bh [kernel]
115.00 2.2% fput [kernel]
112.00 2.1% schedule [kernel]
108.00 2.0% dst_release [kernel]
88.00 1.7% first_packet_length [kernel]
86.00 1.6% ep_remove [kernel]
83.00 1.6% mutex_lock [kernel]
79.00 1.5% skb_copy_datagram_iovec [kernel]
76.00 1.4% mutex_unlock [kernel]
75.00 1.4% epoll_ctl /lib/libc-2.7.so
73.00 1.4% sock_recv_ts_and_drops [kernel]
67.00 1.3% __skb_recv_datagram [kernel]
65.00 1.2% tick_nohz_stop_sched_tick [kernel]
Interesting stuff: check the cache-miss contributions. Wow, eth_type_trans is
really low (about 1% of samples in the tables below), and yet we keep optimizing it!
-------------------------------------------------------------------------------
PerfTop: 1021 irqs/sec kernel:98.8% [1000Hz cache-misses], (all, 8 CPUs)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ _______________________________ ________
5271.00 77.8% sky2_poll [sky2]
706.00 10.4% kmem_cache_alloc [kernel]
154.00 2.3% dev_gro_receive [kernel]
149.00 2.2% __napi_gro_receive [kernel]
128.00 1.9% napi_gro_receive [kernel]
106.00 1.6% __alloc_skb [kernel]
57.00 0.8% eth_type_trans [kernel]
45.00 0.7% skb_gro_reset_offset [kernel]
26.00 0.4% drain_array [kernel]
23.00 0.3% perf_session__mmap_read_counter perf
10.00 0.1% cache_alloc_refill [kernel]
9.00 0.1% __netdev_alloc_skb [kernel]
9.00 0.1% event__preprocess_sample perf
-------------------------------------------------------------------------------
PerfTop: 997 irqs/sec kernel:100.0% [1000Hz cache-misses], (all, cpu: 0)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ____________________ ________
3019.00 79.4% sky2_poll [sky2]
360.00 9.5% kmem_cache_alloc [kernel]
91.00 2.4% dev_gro_receive [kernel]
86.00 2.3% __alloc_skb [kernel]
83.00 2.2% __napi_gro_receive [kernel]
69.00 1.8% napi_gro_receive [kernel]
45.00 1.2% eth_type_trans [kernel]
25.00 0.7% skb_gro_reset_offset [kernel]
9.00 0.2% __netdev_alloc_skb [kernel]
5.00 0.1% cache_alloc_refill [kernel]
5.00 0.1% skb_pull [kernel]
-------------------------------------------------------------------------------
PerfTop: 997 irqs/sec kernel:100.0% [1000Hz cache-misses], (all, cpu: 0)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ____________________ ________
8887.00 79.8% sky2_poll [sky2]
1138.00 10.2% kmem_cache_alloc [kernel]
273.00 2.5% __napi_gro_receive [kernel]
246.00 2.2% dev_gro_receive [kernel]
189.00 1.7% napi_gro_receive [kernel]
159.00 1.4% __alloc_skb [kernel]
119.00 1.1% eth_type_trans [kernel]
86.00 0.8% skb_gro_reset_offset [kernel]
13.00 0.1% __netdev_alloc_skb [kernel]
8.00 0.1% skb_pull [kernel]
7.00 0.1% cache_alloc_refill [kernel]
Not much is going on on the other CPUs, i.e. hardly anything shows up in
their profiles.
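The eth_type_trans observation above can be checked mechanically by pulling each function's share out of a perf top table. What follows is only a rough sketch in Python, assuming the whitespace-separated "samples pcnt function DSO" row layout seen in these traces. The helper name parse_perf_rows and the sample string are illustrative and not part of perf itself.

```python
import re

# One data row of a perf top table: "<samples> <pcnt>% <function> <DSO>".
# Header, separator and "PerfTop:" lines do not match and are skipped.
ROW = re.compile(r"^\s*(\d+(?:\.\d+)?)\s+(\d+(?:\.\d+)?)%\s+(\S+)\s+(\S+)\s*$")


def parse_perf_rows(text):
    """Return {function: (samples, pcnt)} for every data row in text.

    Hypothetical helper: if a function appears in several snapshots,
    the last snapshot wins.
    """
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            samples, pcnt, func, _dso = m.groups()
            rows[func] = (float(samples), float(pcnt))
    return rows


# Rows copied from the 8-CPU cache-miss table above.
sample = """\
samples pcnt function DSO
5271.00 77.8% sky2_poll [sky2]
706.00 10.4% kmem_cache_alloc [kernel]
57.00 0.8% eth_type_trans [kernel]
"""

rows = parse_perf_rows(sample)

# Print functions ordered by sample count, highest first.
for func, (samples, pcnt) in sorted(rows.items(), key=lambda kv: -kv[1][0]):
    print(f"{func:20s} {samples:8.0f} {pcnt:5.1f}%")
```

Feeding one whole snapshot at a time keeps the per-function numbers from different snapshots separate; comparing rows["eth_type_trans"] against rows["sky2_poll"] then gives the relative cache-miss weight directly.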
IV: rps with ee and irq affinity to cpu0
Avg udp packets sunk: 95.15%
-------------------------------------------------------------------------------
PerfTop: 3558 irqs/sec kernel:84.6% [1000Hz cycles], (all, 8 CPUs)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ _____________________________ ______________________
3096.00 17.1% sky2_poll [sky2]
645.00 3.6% _raw_spin_lock_irqsave [kernel]
493.00 2.7% system_call [kernel]
462.00 2.6% sky2_intr [sky2]
416.00 2.3% _raw_spin_unlock_irqrestore [kernel]
382.00 2.1% fget [kernel]
361.00 2.0% __netif_receive_skb [kernel]
342.00 1.9% ip_rcv [kernel]
334.00 1.8% _raw_spin_lock [kernel]
320.00 1.8% sys_epoll_ctl [kernel]
298.00 1.6% copy_user_generic_string [kernel]
288.00 1.6% call_function_single_interrup [kernel]
277.00 1.5% load_balance [kernel]
271.00 1.5% ip_route_input [kernel]
270.00 1.5% vread_tsc [kernel].vsyscall_fn
256.00 1.4% kmem_cache_free [kernel]
222.00 1.2% __udp4_lib_lookup [kernel]
222.00 1.2% schedule [kernel]
194.00 1.1% fput [kernel]
189.00 1.0% kmem_cache_alloc [kernel]
171.00 0.9% sys_epoll_wait [kernel]
164.00 0.9% ep_remove [kernel]
-------------------------------------------------------------------------------
PerfTop: 3452 irqs/sec kernel:84.3% [1000Hz cycles], (all, 8 CPUs)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ _____________________________ ______________________
5033.00 16.2% sky2_poll [sky2]
1147.00 3.7% _raw_spin_lock_irqsave [kernel]
888.00 2.9% system_call [kernel]
774.00 2.5% sky2_intr [sky2]
757.00 2.4% _raw_spin_unlock_irqrestore [kernel]
702.00 2.3% fget [kernel]
630.00 2.0% __netif_receive_skb [kernel]
609.00 2.0% _raw_spin_lock [kernel]
607.00 2.0% ip_rcv [kernel]
553.00 1.8% sys_epoll_ctl [kernel]
514.00 1.7% ip_route_input [kernel]
508.00 1.6% call_function_single_interrup [kernel]
504.00 1.6% copy_user_generic_string [kernel]
466.00 1.5% kmem_cache_free [kernel]
452.00 1.5% schedule [kernel]
450.00 1.4% vread_tsc [kernel].vsyscall_fn
390.00 1.3% load_balance [kernel]
377.00 1.2% fput [kernel]
364.00 1.2% __udp4_lib_lookup [kernel]
329.00 1.1% kmem_cache_alloc [kernel]
314.00 1.0% ep_remove [kernel]
289.00 0.9% dst_release [kernel]
276.00 0.9% sys_epoll_wait [kernel]
265.00 0.9% datagram_poll [kernel]
-------------------------------------------------------------------------------
PerfTop: 3328 irqs/sec kernel:85.7% [1000Hz cycles], (all, 8 CPUs)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ _____________________________ ______________________
6788.00 17.5% sky2_poll [sky2]
1413.00 3.6% _raw_spin_lock_irqsave [kernel]
1042.00 2.7% system_call [kernel]
997.00 2.6% sky2_intr [sky2]
903.00 2.3% _raw_spin_unlock_irqrestore [kernel]
837.00 2.2% fget [kernel]
740.00 1.9% _raw_spin_lock [kernel]
725.00 1.9% __netif_receive_skb [kernel]
722.00 1.9% ip_rcv [kernel]
651.00 1.7% sys_epoll_ctl [kernel]
609.00 1.6% call_function_single_interrup [kernel]
604.00 1.6% ip_route_input [kernel]
601.00 1.5% copy_user_generic_string [kernel]
573.00 1.5% schedule [kernel]
561.00 1.4% kmem_cache_free [kernel]
538.00 1.4% load_balance [kernel]
515.00 1.3% vread_tsc [kernel].vsyscall_fn
480.00 1.2% fput [kernel]
421.00 1.1% kmem_cache_alloc [kernel]
418.00 1.1% __udp4_lib_lookup [kernel]
377.00 1.0% ep_remove [kernel]
347.00 0.9% datagram_poll [kernel]
335.00 0.9% dst_release [kernel]
-------------------------------------------------------------------------------
PerfTop: 1000 irqs/sec kernel:96.2% [1000Hz cycles], (all, cpu: 0)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ _____________________________ ______________________
2109.00 61.3% sky2_poll [sky2]
366.00 10.6% sky2_intr [sky2]
84.00 2.4% __alloc_skb [kernel]
57.00 1.7% _raw_spin_lock_irqsave [kernel]
56.00 1.6% get_rps_cpu [kernel]
52.00 1.5% __kmalloc [kernel]
39.00 1.1% irq_entries_start [kernel]
39.00 1.1% enqueue_to_backlog [kernel]
34.00 1.0% kmem_cache_alloc [kernel]
33.00 1.0% default_send_IPI_mask_sequenc [kernel]
32.00 0.9% sky2_rx_submit [sky2]
30.00 0.9% swiotlb_sync_single [kernel]
28.00 0.8% _raw_spin_lock [kernel]
23.00 0.7% sky2_remove [sky2]
22.00 0.6% __smp_call_function_single [kernel]
19.00 0.6% system_call [kernel]
18.00 0.5% sys_epoll_ctl [kernel]
18.00 0.5% fget [kernel]
17.00 0.5% cache_alloc_refill [kernel]
16.00 0.5% copy_user_generic_string [kernel]
16.00 0.5% _raw_spin_unlock_irqrestore [kernel]
15.00 0.4% dev_gro_receive [kernel]
14.00 0.4% net_rx_action [kernel]
-------------------------------------------------------------------------------
PerfTop: 1000 irqs/sec kernel:97.9% [1000Hz cycles], (all, cpu: 0)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ _______________________________ ____________________
4479.00 60.9% sky2_poll [sky2]
849.00 11.5% sky2_intr [sky2]
163.00 2.2% __alloc_skb [kernel]
155.00 2.1% get_rps_cpu [kernel]
121.00 1.6% _raw_spin_lock_irqsave [kernel]
92.00 1.3% __kmalloc [kernel]
89.00 1.2% _raw_spin_lock [kernel]
83.00 1.1% enqueue_to_backlog [kernel]
79.00 1.1% irq_entries_start [kernel]
78.00 1.1% kmem_cache_alloc [kernel]
69.00 0.9% sky2_rx_submit [sky2]
65.00 0.9% swiotlb_sync_single [kernel]
58.00 0.8% default_send_IPI_mask_sequence_ [kernel]
50.00 0.7% system_call [kernel]
45.00 0.6% fget [kernel]
40.00 0.5% sky2_remove [sky2]
37.00 0.5% __smp_call_function_single [kernel]
36.00 0.5% datagram_poll [kernel]
36.00 0.5% _raw_spin_unlock_irqrestore [kernel]
34.00 0.5% cache_alloc_refill [kernel]
31.00 0.4% net_rx_action [kernel]
28.00 0.4% kmem_cache_free [kernel]
27.00 0.4% _raw_spin_lock_bh [kernel]
27.00 0.4% copy_user_generic_string [kernel]
25.00 0.3% dev_gro_receive [kernel]
-------------------------------------------------------------------------------
PerfTop: 980 irqs/sec kernel:97.3% [1000Hz cycles], (all, cpu: 0)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ _______________________________ ____________________
6544.00 61.6% sky2_poll [sky2]
1098.00 10.3% sky2_intr [sky2]
248.00 2.3% __alloc_skb [kernel]
198.00 1.9% get_rps_cpu [kernel]
182.00 1.7% _raw_spin_lock_irqsave [kernel]
144.00 1.4% __kmalloc [kernel]
138.00 1.3% _raw_spin_lock [kernel]
127.00 1.2% kmem_cache_alloc [kernel]
125.00 1.2% irq_entries_start [kernel]
119.00 1.1% enqueue_to_backlog [kernel]
93.00 0.9% sky2_rx_submit [sky2]
91.00 0.9% swiotlb_sync_single [kernel]
83.00 0.8% default_send_IPI_mask_sequence_ [kernel]
82.00 0.8% system_call [kernel]
64.00 0.6% sky2_remove [sky2]
60.00 0.6% fget [kernel]
58.00 0.5% cache_alloc_refill [kernel]
57.00 0.5% _raw_spin_unlock_irqrestore [kernel]
51.00 0.5% datagram_poll [kernel]
47.00 0.4% copy_user_generic_string [kernel]
-------------------------------------------------------------------------------
PerfTop: 315 irqs/sec kernel:81.0% [1000Hz cycles], (all, cpu: 2)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ _____________________________ ______________________
114.00 4.5% system_call [kernel]
98.00 3.9% _raw_spin_lock_irqsave [kernel]
89.00 3.5% _raw_spin_unlock_irqrestore [kernel]
89.00 3.5% ip_rcv [kernel]
83.00 3.3% call_function_single_interrup [kernel]
76.00 3.0% __netif_receive_skb [kernel]
67.00 2.6% fget [kernel]
62.00 2.4% ip_route_input [kernel]
59.00 2.3% vread_tsc [kernel].vsyscall_fn
54.00 2.1% kmem_cache_free [kernel]
54.00 2.1% sys_epoll_ctl [kernel]
51.00 2.0% schedule [kernel]
49.00 1.9% _raw_spin_lock [kernel]
49.00 1.9% __udp4_lib_lookup [kernel]
44.00 1.7% ep_remove [kernel]
44.00 1.7% copy_user_generic_string [kernel]
41.00 1.6% fput [kernel]
38.00 1.5% sys_epoll_wait [kernel]
37.00 1.5% tick_nohz_stop_sched_tick [kernel]
36.00 1.4% kmem_cache_alloc [kernel]
34.00 1.3% datagram_poll [kernel]
33.00 1.3% __udp4_lib_rcv [kernel]
31.00 1.2% process_recv mcpudp
-------------------------------------------------------------------------------
PerfTop: 292 irqs/sec kernel:82.9% [1000Hz cycles], (all, cpu: 2)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ _____________________________ ______________________
154.00 4.7% _raw_spin_lock_irqsave [kernel]
140.00 4.2% system_call [kernel]
111.00 3.4% ip_rcv [kernel]
106.00 3.2% _raw_spin_unlock_irqrestore [kernel]
96.00 2.9% call_function_single_interrup [kernel]
95.00 2.9% fget [kernel]
90.00 2.7% __netif_receive_skb [kernel]
89.00 2.7% sys_epoll_ctl [kernel]
77.00 2.3% copy_user_generic_string [kernel]
77.00 2.3% ip_route_input [kernel]
76.00 2.3% kmem_cache_free [kernel]
74.00 2.2% _raw_spin_lock [kernel]
71.00 2.1% schedule [kernel]
69.00 2.1% vread_tsc [kernel].vsyscall_fn
58.00 1.8% __udp4_lib_lookup [kernel]
52.00 1.6% __udp4_lib_rcv [kernel]
51.00 1.5% fput [kernel]
47.00 1.4% ep_remove [kernel]
47.00 1.4% event_base_loop libevent-1.3e.so.1.0.3
39.00 1.2% process_recv mcpudp
39.00 1.2% sys_epoll_wait [kernel]
38.00 1.2% udp_recvmsg [kernel]
38.00 1.2% sock_recv_ts_and_drops [kernel]
37.00 1.1% __switch_to [kernel]
-------------------------------------------------------------------------------
PerfTop: 290 irqs/sec kernel:82.1% [1000Hz cycles], (all, cpu: 2)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ _____________________________ ______________________
175.00 4.7% _raw_spin_lock_irqsave [kernel]
153.00 4.2% system_call [kernel]
122.00 3.3% ip_rcv [kernel]
114.00 3.1% _raw_spin_unlock_irqrestore [kernel]
114.00 3.1% fget [kernel]
105.00 2.8% __netif_receive_skb [kernel]
101.00 2.7% sys_epoll_ctl [kernel]
100.00 2.7% call_function_single_interrup [kernel]
90.00 2.4% copy_user_generic_string [kernel]
84.00 2.3% schedule [kernel]
76.00 2.1% kmem_cache_free [kernel]
76.00 2.1% _raw_spin_lock [kernel]
72.00 2.0% ip_route_input [kernel]
70.00 1.9% vread_tsc [kernel].vsyscall_fn
68.00 1.8% __udp4_lib_lookup [kernel]
68.00 1.8% __udp4_lib_rcv [kernel]
57.00 1.5% ep_remove [kernel]
57.00 1.5% fput [kernel]
55.00 1.5% kmem_cache_alloc [kernel]
51.00 1.4% process_recv mcpudp