* loaded router, excessive getnstimeofday in oprofile
From: Denys Fedoryshchenko @ 2008-08-22 1:57 UTC
To: netdev; +Cc: linux-kernel
I have loaded router (~650 Mbps In+Out), based on 2xAMD Opteron 248, Sun Fire
X4100. HPET timer available (TSC seems not available on this platform).
Network interfaces is onboard, connected over PCI-X.
Right now i am using only one processor, cause using only one interface and
interrupts stick to it. Other is almost not used.
At peak time i notice in mpstat, that this processor is almost "dead", and if
i run minor application consuming resources - ping over this router will be
terrible. For me it is clear - system overloaded. I did oprofile, and here is
result (at low load time, but at peak time it is very similar).
CPU: AMD64 processors, speed 2193.74 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit
mask of 0x00 (No unit mask) count 100000
CPU_CLK_UNHALT...|
samples| %|
------------------
2679376 71.9851 vmlinux
287212 7.7163 e1000
278674 7.4870 ip_tables
259923 6.9832 nf_conntrack
29699 0.7979 iptable_nat
26752 0.7187 nf_nat
26093 0.7010 nf_conntrack_ipv4
16525 0.4440 iptable_mangle
14988 0.4027 oprofiled
CPU: AMD64 processors, speed 2193.74 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit
mask of 0x00 (No unit mask) count 100000
samples % symbol name
1031727 37.1736 getnstimeofday
230457 8.3035 __napi_schedule
122154 4.4013 __do_softirq
110036 3.9647 dev_queue_xmit
88800 3.1995 net_rx_action
71163 2.5640 ip_route_input
52232 1.8819 local_bh_enable
43804 1.5783 get_next_timer_interrupt
43387 1.5633 ip_forward
35501 1.2791 nf_iterate
35212 1.2687 __slab_alloc
34652 1.2485 default_idle
32375 1.1665 kfree
28127 1.0134 kmem_cache_alloc
What is bothering me, why getnstimeofday called so much? Even i remove HTB
shaper, it still takes 30-40% of whole vmlinux time. From other
applications - only zebra is running.
Any ideas?
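A rough sketch of how the two listings above combine, assuming the percentages in the symbol listing are relative to vmlinux samples only (the post implies this with "whole vmlinux time", although the raw sample counts do not line up exactly, so treat the result as an estimate). The helper name `overall_share` is made up for this illustration.

```python
# Figures copied from the two oprofile listings above.
VMLINUX_PCT = 71.9851          # share of all samples attributed to vmlinux
GETNSTIMEOFDAY_PCT = 37.1736   # share of the symbol listing in getnstimeofday


def overall_share(image_pct: float, symbol_pct: float) -> float:
    """Combine an image-level and a symbol-level percentage.

    Assumption: symbol_pct is measured relative to the image's own samples.
    """
    return image_pct * symbol_pct / 100.0


share = overall_share(VMLINUX_PCT, GETNSTIMEOFDAY_PCT)
print(f"getnstimeofday: about {share:.1f}% of all sampled cycles")
```

Under that assumption, roughly a quarter of all sampled cycles on the busy CPU go to reading the clock.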
* Re: loaded router, excessive getnstimeofday in oprofile
From: Denys Fedoryshchenko @ 2008-08-22 2:23 UTC
To: netdev; +Cc: linux-kernel

On Friday 22 August 2008, Denys Fedoryshchenko wrote:

Most significant event types where i notice getnstimeofday at top of list.
Additions:

Counted MEMORY_REQUESTS events (Memory requests by type) with a unit mask of
0x01 (Requests to non-cacheable (UC) memory) count 5000
samples  %        samples  %        symbol name
129      31.0843  596      31.1879  getnstimeofday
54       13.0120  251      13.1345  __napi_schedule
36       8.6747   178      9.3145   default_idle
34       8.1928   164      8.5819   irq_entries_start
23       5.5422   143      7.4830   __do_softirq

and

CPU: AMD64 processors, speed 2193.74 MHz (estimated)
Counted INTERRUPTS_MASKED_CYCLES events (Cycles with interrupts masked (IF=0))
with a unit mask of 0x00 (No unit mask) count 5000
samples  %        symbol name
630015   62.4741  getnstimeofday
28634    2.8394   get_next_timer_interrupt
23279    2.3084   __slab_alloc
15775    1.5643   schedule
14765    1.4641   __slab_free
11154    1.1061   native_read_tsc
10953    1.0861   kmem_cache_alloc
10918    1.0827   tick_nohz_stop_sched_tick
10752    1.0662   update_wall_time
10430    1.0343   net_rx_action
10220    1.0134   __do_softirq
9895     0.9812   __update_sched_clock
* Re: loaded router, excessive getnstimeofday in oprofile
From: Jarek Poplawski @ 2008-08-26 9:51 UTC
To: Denys Fedoryshchenko; +Cc: netdev, linux-kernel

On 22-08-2008 03:57, Denys Fedoryshchenko wrote:
> I have loaded router (~650 Mbps In+Out), based on 2xAMD Opteron 248, Sun Fire
> X4100. HPET timer available (TSC seems not available on this platform).
> Network interfaces is onboard, connected over PCI-X.
>
> Right now i am using only one processor, cause using only one interface and
> interrupts stick to it. Other is almost not used.
> At peak time i notice in mpstat, that this processor is almost "dead", and if
> i run minor application consuming resources - ping over this router will be
> terrible. For me it is clear - system overloaded. I did oprofile, and here is
> result (at low load time, but at peak time it is very similar).
...
> CPU: AMD64 processors, speed 2193.74 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit
> mask of 0x00 (No unit mask) count 100000
> samples % symbol name
> 1031727 37.1736 getnstimeofday
> 230457 8.3035 __napi_schedule
> 122154 4.4013 __do_softirq
> 110036 3.9647 dev_queue_xmit
...
> What is bothering me, why getnstimeofday called so much? Even i remove HTB
> shaper, it still takes 30-40% of whole vmlinux time. From other
> applications - only zebra is running.
> Any ideas?

This function is really used in many places, and these profiles are
not enough at least to me, but it seems you could have a lot of
softirqs (and probably hrtimers) scheduling, so maybe you should try
if e.g. disabling hrtimers or changing kernel HZ makes any difference.

Jarek P.
* Re: loaded router, excessive getnstimeofday in oprofile
From: Denys Fedoryshchenko @ 2008-08-26 10:29 UTC
To: Jarek Poplawski; +Cc: netdev, linux-kernel

On Tuesday 26 August 2008, Jarek Poplawski wrote:
>
> This function is really used in many places, and these profiles are
> not enough at least to me, but it seems you could have a lot of
> softirqs (and probably hrtimers) scheduling, so maybe you should try
> if e.g. disabling hrtimers or changing kernel HZ makes any difference.
>
> Jarek P.

One user is shapers, it is ok for me.
I am not sure, but maybe another user is softlockup debug option... and if
there is a lot of task switches maybe it will cause excessive load of timers
slow?
* Re: loaded router, excessive getnstimeofday in oprofile
From: Jarek Poplawski @ 2008-08-26 10:47 UTC
To: Denys Fedoryshchenko; +Cc: netdev, linux-kernel

On Tue, Aug 26, 2008 at 01:29:53PM +0300, Denys Fedoryshchenko wrote:
> On Tuesday 26 August 2008, Jarek Poplawski wrote:
> >
> > This function is really used in many places, and these profiles are
> > not enough at least to me, but it seems you could have a lot of
> > softirqs (and probably hrtimers) scheduling, so maybe you should try
> > if e.g. disabling hrtimers or changing kernel HZ makes any difference.
> >
> > Jarek P.
> One user is shapers, it is ok for me.

The question is if you really need so exact shaping at a cost of
higher system load.

> I am not sure, but maybe another user is softlockup debug option... and if
> there is a lot of task switches maybe it will cause excessive load of timers
> slow?

Maybe. Anyway, you could try if lower HZ (with longer jiffies) can
help with processing more skbs without rescheduling.

Jarek P.
* Re: loaded router, excessive getnstimeofday in oprofile
From: Denys Fedoryshchenko @ 2008-08-26 10:49 UTC
To: Jarek Poplawski; +Cc: netdev, linux-kernel

On Tuesday 26 August 2008, Jarek Poplawski wrote:
> The question is if you really need so exact shaping at a cost of
> higher system load.

Thats maybe another reason to have your patch in mainline :-)
I will try it today with this case, if it will help.

Maybe it can be optional, and enabled via kernel parameter and /sys , so it
can be useful in case of crashes when TSC used and when timer is too slow.
Because it is not so useful just to disable hrtimers completely, if you need
them for some other task...
* Re: loaded router, excessive getnstimeofday in oprofile
From: Jarek Poplawski @ 2008-08-26 11:07 UTC
To: Denys Fedoryshchenko; +Cc: netdev, linux-kernel

On Tue, Aug 26, 2008 at 01:49:09PM +0300, Denys Fedoryshchenko wrote:
> On Tuesday 26 August 2008, Jarek Poplawski wrote:
> > The question is if you really need so exact shaping at a cost of
> > higher system load.
> Thats maybe another reason to have your patch in mainline :-)

We should be first sure when it's really needed.

> I will try it today with this case, if it will help.
>
> Maybe it can be optional, and enabled via kernel parameter and /sys , so it
> can be useful in case of crashes when TSC used and when timer is too slow.
> Because it is not so useful just to disable hrtimers completely, if you need
> them for some other task...

Maybe it could be enough to use current parameters like: "highres=off"
according to Documentation/kernel-parameters.txt?

Jarek P.
* Re: loaded router, excessive getnstimeofday in oprofile
From: Jarek Poplawski @ 2008-08-26 11:15 UTC
To: Denys Fedoryshchenko; +Cc: netdev, linux-kernel

On Tue, Aug 26, 2008 at 11:07:46AM +0000, Jarek Poplawski wrote:
> On Tue, Aug 26, 2008 at 01:49:09PM +0300, Denys Fedoryshchenko wrote:
...
> > Maybe it can be optional, and enabled via kernel parameter and /sys , so it
> > can be useful in case of crashes when TSC used and when timer is too slow.
> > Because it is not so useful just to disable hrtimers completely, if you need
> > them for some other task...
>
> Maybe it could be enough to use current parameters like: "highres=off"
> according to Documentation/kernel-parameters.txt?

Hmm.. it isn't actually answer to your question, sorry. As I said
before I think we need to have more people interested in using such
additional options, and btw. I understood from your message that
disabling htb didn't solve the problem?

Jarek P.
* Re: loaded router, excessive getnstimeofday in oprofile
From: Denys Fedoryshchenko @ 2008-08-26 11:16 UTC
To: Jarek Poplawski; +Cc: netdev, linux-kernel

On Tuesday 26 August 2008, Jarek Poplawski wrote:
> Hmm.. it isn't actually answer to your question, sorry. As I said
> before I think we need to have more people interested in using such
> additional options, and btw. I understood from your message that
> disabling htb didn't solve the problem?
>
> Jarek P.

Only HTB - no. If i disable softlockup debug - seems the load is less (i must
make sure), and if i remove HTB - it is becoming low. I will try to give
exact numbers in recent days.
* Re: loaded router, excessive getnstimeofday in oprofile
From: Jarek Poplawski @ 2008-08-26 11:32 UTC
To: Denys Fedoryshchenko; +Cc: netdev, linux-kernel

On Tue, Aug 26, 2008 at 02:16:32PM +0300, Denys Fedoryshchenko wrote:
> On Tuesday 26 August 2008, Jarek Poplawski wrote:
> > Hmm.. it isn't actually answer to your question, sorry. As I said
> > before I think we need to have more people interested in using such
> > additional options, and btw. I understood from your message that
> > disabling htb didn't solve the problem?
> >
> > Jarek P.
> Only HTB - no. If i disable softlockup debug - seems the load is less (i must
> make sure), and if i remove HTB - it is becoming low. I will try to give
> exact numbers in recent days.
>

So maybe you could try again this htb patch for limiting
qdisc_watchdog_schedule()?

Jarek P.
* Re: loaded router, excessive getnstimeofday in oprofile
From: Denys Fedoryshchenko @ 2008-08-26 11:32 UTC
To: Jarek Poplawski; +Cc: netdev, linux-kernel

On Tuesday 26 August 2008, Jarek Poplawski wrote:
> So maybe you could try again this htb patch for limiting
> qdisc_watchdog_schedule()?
>
> Jarek P.

Yes, and i am going to take snapshots from system load with different boot
flags. It will take time but, cause it is major router.
* Re: loaded router, excessive getnstimeofday in oprofile
From: Evgeniy Polyakov @ 2008-08-26 20:14 UTC
To: Denys Fedoryshchenko; +Cc: netdev, linux-kernel

On Fri, Aug 22, 2008 at 04:57:40AM +0300, Denys Fedoryshchenko (denys@visp.net.lb) wrote:
> I have loaded router (~650 Mbps In+Out), based on 2xAMD Opteron 248, Sun Fire
> X4100. HPET timer available (TSC seems not available on this platform).
> Network interfaces is onboard, connected over PCI-X.
>
> Right now i am using only one processor, cause using only one interface and
> interrupts stick to it. Other is almost not used.
> At peak time i notice in mpstat, that this processor is almost "dead", and if
> i run minor application consuming resources - ping over this router will be
> terrible. For me it is clear - system overloaded. I did oprofile, and here is
> result (at low load time, but at peak time it is very similar).

Do you have any packet sockets in this system? Like running dhcp daemon?

--
	Evgeniy Polyakov
* Re: loaded router, excessive getnstimeofday in oprofile
From: Eric Dumazet @ 2008-08-26 20:44 UTC
To: Evgeniy Polyakov; +Cc: Denys Fedoryshchenko, netdev, linux-kernel

Evgeniy Polyakov wrote:
> On Fri, Aug 22, 2008 at 04:57:40AM +0300, Denys Fedoryshchenko (denys@visp.net.lb) wrote:
>> I have loaded router (~650 Mbps In+Out), based on 2xAMD Opteron 248, Sun Fire
>> X4100. HPET timer available (TSC seems not available on this platform).
>> Network interfaces is onboard, connected over PCI-X.
>>
>> Right now i am using only one processor, cause using only one interface and
>> interrupts stick to it. Other is almost not used.
>> At peak time i notice in mpstat, that this processor is almost "dead", and if
>> i run minor application consuming resources - ping over this router will be
>> terrible. For me it is clear - system overloaded. I did oprofile, and here is
>> result (at low load time, but at peak time it is very similar).
>
> Do you have any packet sockets in this system? Like running dhcp daemon?
>

Another way to see this problem can be to start a sniffer on the machine,
even with a restrictive pcap filter, to check if performance change or not.
(It should decrease)

For example, I believe that running "ping" could have the same effect
(increasing netstamp_needed variable : every incoming packet has to be
timestamped)

So beware of pings, traceroute and other networking tools...
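The mechanism described here can be pictured with a toy model. This is not kernel code: the class and method names below are invented for illustration, and the only names taken from the thread are the `netstamp_needed` counter and the enable/disable pair it is tied to. The assumption is that each interested socket bumps a global counter and that every received packet is stamped while the counter is non-zero.

```python
import time


class ToyRxPath:
    """Toy model of a receive path with a global 'timestamps wanted' counter."""

    def __init__(self, clock=time.time):
        self.netstamp_needed = 0   # number of users that asked for timestamps
        self.clock_reads = 0       # how often the (expensive) clock was read
        self._clock = clock

    def enable_timestamp(self):
        self.netstamp_needed += 1

    def disable_timestamp(self):
        if self.netstamp_needed > 0:
            self.netstamp_needed -= 1

    def receive(self, packet):
        """Return (packet, timestamp); timestamp is None when nobody asked."""
        if self.netstamp_needed > 0:
            self.clock_reads += 1
            return packet, self._clock()
        return packet, None


rx = ToyRxPath()
for i in range(1000):
    rx.receive(i)              # pure forwarding: no clock reads
reads_idle = rx.clock_reads

rx.enable_timestamp()          # e.g. one sniffer or ping opens a socket
for i in range(1000):
    rx.receive(i)              # now every packet pays for a clock read
reads_with_listener = rx.clock_reads - reads_idle
print(reads_idle, reads_with_listener)
```

In this model a single interested user is enough to make every forwarded packet pay for a clock read, which is the effect being warned about.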
* Re: loaded router, excessive getnstimeofday in oprofile
From: Evgeniy Polyakov @ 2008-08-26 20:51 UTC
To: Eric Dumazet; +Cc: Denys Fedoryshchenko, netdev, linux-kernel

On Tue, Aug 26, 2008 at 10:44:56PM +0200, Eric Dumazet (dada1@cosmosbay.com) wrote:
> >Do you have any packet sockets in this system? Like running dhcp daemon?
> >
>
> Another way to see this problem can be to start a sniffer on the machine,
> even with a restrictive pcap filter, to check if performance change or not.
> (It should decrease)

Or just check /proc/net/packet iirc.
Anyway, having at least one packet socket ends up with timestamping of
each packet, so you will get fair load of getnstimeofday() in that case.

> For example, I believe that running "ping" could have the same effect
> (increasing netstamp_needed variable : every incoming packet has to be
> timestamped)
>
> So beware of pings, traceroute and other networking tools...

Yup, this innocent toys can end up with this such behaviour on modern
highly loaded machines.

--
	Evgeniy Polyakov
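A minimal sketch of the /proc/net/packet check, assuming the file has one header line followed by one line per open packet socket. The sample text below is made up, and `parse_packet_sockets` is a hypothetical helper, not an existing tool.

```python
from pathlib import Path

# Made-up sample in the assumed layout: header line, then one socket per line.
SAMPLE = """\
sk       RefCnt Type Proto  Iface R Rmem   User   Inode
ffff880012345000 3      3    0003   2     1 0      0      4242
"""


def parse_packet_sockets(text: str) -> list:
    """Return one dict per packet socket, keyed by the header's column names."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


if __name__ == "__main__":
    try:
        text = Path("/proc/net/packet").read_text()
    except OSError:
        text = SAMPLE          # fall back to the sample off Linux
    sockets = parse_packet_sockets(text)
    if sockets:
        print(f"{len(sockets)} packet socket(s) open")
    else:
        print("no packet sockets open")
```

An empty result only rules out packet sockets; it does not show whether something else requested timestamps.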
* Re: loaded router, excessive getnstimeofday in oprofile
From: Denys Fedoryshchenko @ 2008-08-27 12:09 UTC
To: Evgeniy Polyakov; +Cc: Eric Dumazet, netdev, linux-kernel

On Tuesday 26 August 2008, Evgeniy Polyakov wrote:
> On Tue, Aug 26, 2008 at 10:44:56PM +0200, Eric Dumazet (dada1@cosmosbay.com) wrote:
> > >Do you have any packet sockets in this system? Like running dhcp daemon?
No, nothing at all.

> >
> > Another way to see this problem can be to start a sniffer on the machine,
> > even with a restrictive pcap filter, to check if performance change or
> > not. (It should decrease)
Yes, when i run tcpdump even without promisc at peak time, machine will be
almost dead. Transit traffic will be 100ms+. I know that it is timestamping
packets. Same almost for any libpcap app.

>
> Or just check /proc/net/packet iirc.
> Anyway, having at least one packet socket ends up with timestamping of
> each packet, so you will get fair load of getnstimeofday() in that case.
There is very short list of tasks. Attached.
/proc/net/packet clean, nothing there.

>
> > For example, I believe that running "ping" could have the same effect
> > (increasing netstamp_needed variable : every incoming packet has to be
> > timestamped)
Even answering icmp timestamp request will take resources.

> >
> > So beware of pings, traceroute and other networking tools...
When i am measuring performance - they are all off.

>
> Yup, this innocent toys can end up with this such behaviour on modern
> highly loaded machines.
[-- Attachment #2: tasks.txt --]

tcp   0       0  127.0.0.1:2600     0.0.0.0:*              LISTEN       3167/zebra
tcp   0       0  0.0.0.0:2601       0.0.0.0:*              LISTEN       3167/zebra
tcp   0       0  0.0.0.0:2602       0.0.0.0:*              LISTEN       3174/ripd
tcp   0       0  0.0.0.0:22         0.0.0.0:*              LISTEN       3549/sshd
tcp   0       0  194.146.153.17:22  194.146.154.161:37549  ESTABLISHED  11593/sshd
tcp   0       0  194.146.153.17:22  192.168.0.92:45891     ESTABLISHED  11803/sshd
tcp   0       0  127.0.0.1:42537    127.0.0.1:2600         ESTABLISHED  3174/ripd
tcp   0       0  194.146.153.17:22  194.146.153.18:51810   ESTABLISHED  11799/sshd
tcp   0       0  127.0.0.1:2600     127.0.0.1:42537        ESTABLISHED  3167/zebra
udp   0       0  0.0.0.0:520        0.0.0.0:*                           3174/ripd
udp   0       0  0.0.0.0:161        0.0.0.0:*                           3194/snmpd
udp   0       0  0.0.0.0:67         0.0.0.0:*                           3207/udhcpd
udp   111360  0  0.0.0.0:49619      0.0.0.0:*                           2449/syslogd
* Re: loaded router, excessive getnstimeofday in oprofile
From: Evgeniy Polyakov @ 2008-08-27 12:36 UTC
To: Denys Fedoryshchenko; +Cc: Eric Dumazet, netdev, linux-kernel

On Wed, Aug 27, 2008 at 03:09:17PM +0300, Denys Fedoryshchenko (denys@visp.net.lb) wrote:
> On Tuesday 26 August 2008, Evgeniy Polyakov wrote:
> > On Tue, Aug 26, 2008 at 10:44:56PM +0200, Eric Dumazet (dada1@cosmosbay.com) wrote:
> > > >Do you have any packet sockets in this system? Like running dhcp daemon?
> No, nothing at all.

Can you put debug print into
net_enable_timestamp()/net_disable_timestamp() to determine if someone
enabled timestamp socket option?

> tcp   0       0  127.0.0.1:2600     0.0.0.0:*              LISTEN       3167/zebra
> tcp   0       0  0.0.0.0:2601       0.0.0.0:*              LISTEN       3167/zebra
> tcp   0       0  0.0.0.0:2602       0.0.0.0:*              LISTEN       3174/ripd
> tcp   0       0  0.0.0.0:22         0.0.0.0:*              LISTEN       3549/sshd
> tcp   0       0  194.146.153.17:22  194.146.154.161:37549  ESTABLISHED  11593/sshd
> tcp   0       0  194.146.153.17:22  192.168.0.92:45891     ESTABLISHED  11803/sshd
> tcp   0       0  127.0.0.1:42537    127.0.0.1:2600         ESTABLISHED  3174/ripd
> tcp   0       0  194.146.153.17:22  194.146.153.18:51810   ESTABLISHED  11799/sshd
> tcp   0       0  127.0.0.1:2600     127.0.0.1:42537        ESTABLISHED  3167/zebra
> udp   0       0  0.0.0.0:520        0.0.0.0:*                           3174/ripd
> udp   0       0  0.0.0.0:161        0.0.0.0:*                           3194/snmpd
> udp   0       0  0.0.0.0:67         0.0.0.0:*                           3207/udhcpd

This one looks suspicious
^^^^^^^^^^

> udp   111360  0  0.0.0.0:49619      0.0.0.0:*                           2449/syslogd

--
	Evgeniy Polyakov
* Re: loaded router, excessive getnstimeofday in oprofile
From: Denys Fedoryshchenko @ 2008-08-27 14:00 UTC
To: Evgeniy Polyakov; +Cc: Eric Dumazet, netdev, linux-kernel

On Wednesday 27 August 2008, Evgeniy Polyakov wrote:
> Can you put debug print into
> net_enable_timestamp()/net_disable_timestamp() to determine if someone
> enabled timestamp socket option?
OK, i will do that on next system reboot.

> > 0.0.0.0:* 3207/udhcpd
>
> This one looks suspicious
> ^^^^^^^^^^
It is busybox udhcpd... i guess it is innocent. Even i kill it - it doesn't
change anything at all.
Only who possible listen multicast socket - it is ripd, i cannot kill him. But
i think it doesn't matter much too...
* Re: loaded router, excessive getnstimeofday in oprofile
From: Evgeniy Polyakov @ 2008-08-27 14:23 UTC
To: Denys Fedoryshchenko; +Cc: Eric Dumazet, netdev, linux-kernel

On Wed, Aug 27, 2008 at 05:00:35PM +0300, Denys Fedoryshchenko (denys@visp.net.lb) wrote:
> > > 0.0.0.0:* 3207/udhcpd
> >
> > This one looks suspicious
> > ^^^^^^^^^^
> It is busybox udhcpd... i guess it is innocent. Even i kill it - it doesn't
> change anything at all.
> Only who possible listen multicast socket - it is ripd, i cannot kill him. But
> i think it doesn't matter much too...

It depends... If it turns timestamps on, then you will have this behaviour.
Please check if timestamps are actually enabled, so we could remove one
(im)possible case.

--
	Evgeniy Polyakov
* Re: loaded router, excessive getnstimeofday in oprofile
From: Andi Kleen @ 2008-08-27 12:54 UTC
To: Evgeniy Polyakov; +Cc: Eric Dumazet, Denys Fedoryshchenko, netdev, linux-kernel

Evgeniy Polyakov <johnpol@2ka.mipt.ru> writes:
>
> Yup, this innocent toys can end up with this such behaviour on modern
> highly loaded machines.

I and also other people had some patches to move the time stamp
measuring into the socket. This way the time stamping didn't need to
be enabled on all packets, only on those that actually end up at a
socket that requires the time stamp.

Unfortunately DaveM didn't like it because some bank wanted
different semantics, see the discussion in
http://thread.gmane.org/gmane.linux.network/91679

Perhaps you can find out which bank it was and send them a bill for
your CPU time ;-)

-Andi

--
ak@linux.intel.com
* Re: loaded router, excessive getnstimeofday in oprofile
From: Rick Jones @ 2008-08-27 16:07 UTC
To: Andi Kleen; +Cc: Evgeniy Polyakov, Eric Dumazet, Denys Fedoryshchenko, netdev, linux-kernel

Andi Kleen wrote:
> Evgeniy Polyakov <johnpol@2ka.mipt.ru> writes:
>
>>Yup, this innocent toys can end up with this such behaviour on modern
>>highly loaded machines.
>
>
> I and also other people had some patches to move the time stamp
> measuring into the socket. This way the time stamping didn't need to
> be enabled on all packets, only on those that actually end up at a
> socket that requires the time stamp.
>
> Unfortunately DaveM didn't like it because some bank wanted
> different semantics, see the discussion in
> http://thread.gmane.org/gmane.linux.network/91679
>
> Perhaps you can find out which bank it was and send them a bill for
> your CPU time ;-)

Those banks really want to crank down on latency - to the point they
start disabling interrupt coalescing. I bet they'd toss anything out
they could to shave another microsecond.

rick jones
* Re: loaded router, excessive getnstimeofday in oprofile
From: Andi Kleen @ 2008-08-27 16:27 UTC
To: Rick Jones; +Cc: Andi Kleen, Evgeniy Polyakov, Eric Dumazet, Denys Fedoryshchenko, netdev, linux-kernel

> Those banks really want to crank down on latency - to the point they
> start disabling interrupt coalescing. I bet they'd toss anything out
> they could to shave another microsecond.

This change would actually likely lower their latency.

-Andi
* Re: loaded router, excessive getnstimeofday in oprofile
From: Rick Jones @ 2008-08-27 16:49 UTC
To: Andi Kleen; +Cc: Evgeniy Polyakov, Eric Dumazet, Denys Fedoryshchenko, netdev, linux-kernel

Andi Kleen wrote:
>>Those banks really want to crank down on latency - to the point they
>>start disabling interrupt coalescing. I bet they'd toss anything out
>>they could to shave another microsecond.
>
>
> This change would actually likely lower their latency.

I'm guessing you mean increase their latency? I agree, it could -
depends entirely on the PPS in production I suspect.

rick jones

ftp://ftp.cup.hp.com/dist/networking/briefs/nic_latency_vs_tput.txt

I should probably refresh/update that one of these days
* Re: loaded router, excessive getnstimeofday in oprofile
From: Andi Kleen @ 2008-08-27 16:56 UTC
To: Rick Jones; +Cc: Andi Kleen, Evgeniy Polyakov, Eric Dumazet, Denys Fedoryshchenko, netdev, linux-kernel

On Wed, Aug 27, 2008 at 09:49:10AM -0700, Rick Jones wrote:
> Andi Kleen wrote:
> >>Those banks really want to crank down on latency - to the point they
> >>start disabling interrupt coalescing. I bet they'd toss anything out
> >>they could to shave another microsecond.
> >
> >
> >This change would actually likely lower their latency.
>
> I'm guessing you mean increase their latency? I agree, it could -
> depends entirely on the PPS in production I suspect.

No, moving the time stamps into the socket decreases latency
for all packets that don't need time stamps. And they likely
have some packets which don't need time stamps too.

As a secondary effect if they use a RT kernel it might
be also beneficial to do the (depending on the platform)
costly time stamp in the lower priority socket context
than in the high priority interrupt thread.

-Andi
* Re: loaded router, excessive getnstimeofday in oprofile
From: Rick Jones @ 2008-08-27 16:57 UTC
To: Andi Kleen; +Cc: Evgeniy Polyakov, Eric Dumazet, Denys Fedoryshchenko, netdev, linux-kernel

Andi Kleen wrote:
> On Wed, Aug 27, 2008 at 09:49:10AM -0700, Rick Jones wrote:
>
>>Andi Kleen wrote:
>>
>>>>Those banks really want to crank down on latency - to the point they
>>>>start disabling interrupt coalescing. I bet they'd toss anything out
>>>>they could to shave another microsecond.
>>>
>>>
>>>This change would actually likely lower their latency.
>>
>>I'm guessing you mean increase their latency? I agree, it could -
>>depends entirely on the PPS in production I suspect.
>
>
> No, moving the time stamps into the socket decreases latency
> for all packets that don't need time stamps. And they likely
> have some packets which don't need time stamps too.

Ah, since that part of the discussion wasn't in the quoted text I
assumed you were talking about the disabling of interrupt coalescing.

rick jones

>
> As a secondary effect if they use a RT kernel it might
> be also beneficial to do the (depending on the platform)
> costly time stamp in the lower priority socket context
> than in the high priority interrupt thread.
>
> -Andi
* Re: loaded router, excessive getnstimeofday in oprofile
From: Eric Dumazet @ 2008-08-27 17:27 UTC
To: Andi Kleen; +Cc: Rick Jones, Evgeniy Polyakov, Denys Fedoryshchenko, netdev, linux-kernel

Andi Kleen wrote:
> On Wed, Aug 27, 2008 at 09:49:10AM -0700, Rick Jones wrote:
>> Andi Kleen wrote:
>>>> Those banks really want to crank down on latency - to the point they
>>>> start disabling interrupt coalescing. I bet they'd toss anything out
>>>> they could to shave another microsecond.
>>>
>>> This change would actually likely lower their latency.
>> I'm guessing you mean increase their latency? I agree, it could -
>> depends entirely on the PPS in production I suspect.
>
> No, moving the time stamps into the socket decreases latency
> for all packets that don't need time stamps. And they likely
> have some packets which don't need time stamps too.
>
> As a secondary effect if they use a RT kernel it might
> be also beneficial to do the (depending on the platform)
> costly time stamp in the lower priority socket context
> than in the high priority interrupt thread.
>

Doing the expensive timestamping in a possibly delayed thread (ie some
milliseconds after hardware notification) is wrong/useless.

Better use plain xtime instead of getnstimeofday() in this case.

We could provide a sysctl setting so that admin can choose between precise
timestamps (current behavior) or fast but low resolution timestamping
(xtime based)
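The trade-off behind this suggestion can be sketched with a toy calculation. It assumes that a tick-updated clock such as xtime only advances once per timer tick, so a stamp taken from it is the arrival time rounded down to the last tick. The HZ value and the arrival times are invented, and `coarse_stamp_ns` is a hypothetical helper.

```python
NSEC_PER_SEC = 1_000_000_000


def coarse_stamp_ns(arrival_ns: int, hz: int) -> int:
    """Timestamp a tick-updated clock would report: arrival rounded down to a tick."""
    tick_ns = NSEC_PER_SEC // hz
    return (arrival_ns // tick_ns) * tick_ns


HZ = 1000                                  # assumed tick rate, 1 ms per tick
arrivals_ns = [0, 250_000, 999_999, 1_000_000, 1_730_000, 4_200_001]

# Error of the cheap stamp relative to the exact arrival time, per packet.
errors_ns = [t - coarse_stamp_ns(t, HZ) for t in arrivals_ns]
worst_ns = max(errors_ns)
print(f"worst error: {worst_ns} ns (tick is {NSEC_PER_SEC // HZ} ns)")
```

The error always stays below one tick, so the cheap variant gives up sub-tick resolution in exchange for not reading the hardware clock for every packet.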
* Re: loaded router, excessive getnstimeofday in oprofile\
2008-08-27 17:27 ` Eric Dumazet
@ 2008-08-27 18:32 ` Andi Kleen
2008-08-27 22:23 ` David Miller
0 siblings, 1 reply; 69+ messages in thread
From: Andi Kleen @ 2008-08-27 18:32 UTC (permalink / raw)
To: Eric Dumazet
Cc: Andi Kleen, Rick Jones, Evgeniy Polyakov, Denys Fedoryshchenko, netdev, linux-kernel
> Doing the expensive timestamping in a possibly delayed thread (ie some
> milliseconds
> after hardware notification) is wrong/useless.

We had this discussion earlier, please review the thread I linked to.

Note that interrupts can be arbitrarily delayed too (both by cli
and by interrupt mitigation), even on a non RT kernel.

If you want exact notification (packet arriving at your NIC's buffers)
you need NIC hardware support (and more and more NICs have it[1]).
If you do it in software then even the interrupt is at the end of a
long queue with a pretty much arbitrary delay. Doing it in socket context
is just one queue more. It's pretty much all arbitrary.

The argument for doing it as late as possible is the prohibitive cost
on some systems, as people notice all the time.

-Andi

[1] Unfortunately not necessarily synchronized with system time.
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile\ 2008-08-27 18:32 ` loaded router, excessive getnstimeofday in oprofile\ Andi Kleen @ 2008-08-27 22:23 ` David Miller 2008-08-27 22:38 ` Andi Kleen 0 siblings, 1 reply; 69+ messages in thread From: David Miller @ 2008-08-27 22:23 UTC (permalink / raw) To: andi; +Cc: dada1, rick.jones2, johnpol, denys, netdev, linux-kernel From: Andi Kleen <andi@firstfloor.org> Date: Wed, 27 Aug 2008 20:32:16 +0200 > > Doing the expensive timestamping in a possibly delayed thread (ie some > > milliseconds > > after hardware notification) is wrong/useless. > > We had this discussion earlier, please review the thread I linked to. > > Note that interrupts can be arbitarily delayed too (both by cli > and by interrupt mitigation), even on a non RT kernel. This is a much different kind of delay compared to sleeping for seconds or longer on the socket lock while a GFP_KERNEL allocation is being satisfied by swapping tons of crap out to disk. Your socket solution is not a workable scheme. ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile\ 2008-08-27 22:23 ` David Miller @ 2008-08-27 22:38 ` Andi Kleen 0 siblings, 0 replies; 69+ messages in thread From: Andi Kleen @ 2008-08-27 22:38 UTC (permalink / raw) To: David Miller Cc: andi, dada1, rick.jones2, johnpol, denys, netdev, linux-kernel > This is a much different kind of delay compared to sleeping for seconds > or longer on the socket lock while a GFP_KERNEL allocation is being > satisfied by swapping tons of crap out to disk. When this happens then new incoming packets will be lost anyways because there will be no new packets fed back into the RX ring because their allocation will either stall or fail too. I don't think time stamps of dropped packets are very useful ;-) -Andi -- ak@linux.intel.com ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile 2008-08-27 16:27 ` Andi Kleen 2008-08-27 16:49 ` Rick Jones @ 2008-08-27 22:18 ` David Miller 2008-08-27 22:39 ` Andi Kleen 1 sibling, 1 reply; 69+ messages in thread From: David Miller @ 2008-08-27 22:18 UTC (permalink / raw) To: andi; +Cc: rick.jones2, johnpol, dada1, denys, netdev, linux-kernel From: Andi Kleen <andi@firstfloor.org> Date: Wed, 27 Aug 2008 18:27:35 +0200 > > Those banks really want to crank down on latency - to the point they > > start disabling interrupt coalescing. I bet they'd toss anything out > > they could to shave another microsecond. > > This change would actually likely lower their latency. They want the timestamps, but they want it to match when the packet arrived at their system as closely as is reasonably possible. Socket based solutions don't do that, because we can be sleeping on GFP_KERNEL memory or similar with the socket locked, and thus not be able to set the timestamp until the task wakes up and processes the backlog. ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile 2008-08-27 22:18 ` loaded router, excessive getnstimeofday in oprofile David Miller @ 2008-08-27 22:39 ` Andi Kleen 2008-08-28 0:45 ` Nick Piggin 0 siblings, 1 reply; 69+ messages in thread From: Andi Kleen @ 2008-08-27 22:39 UTC (permalink / raw) To: David Miller Cc: andi, rick.jones2, johnpol, dada1, denys, netdev, linux-kernel On Wed, Aug 27, 2008 at 03:18:24PM -0700, David Miller wrote: > From: Andi Kleen <andi@firstfloor.org> > Date: Wed, 27 Aug 2008 18:27:35 +0200 > > > > Those banks really want to crank down on latency - to the point they > > > start disabling interrupt coalescing. I bet they'd toss anything out > > > they could to shave another microsecond. > > > > This change would actually likely lower their latency. > > They want the timestamps, but they want it to match when the packet > arrived at their system as closely as is reasonably possible. Then they should use hardware time stamps which are increasingly available (e.g. current Intel e1000 design has them and I expect others too). -Andi ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile 2008-08-27 22:39 ` Andi Kleen @ 2008-08-28 0:45 ` Nick Piggin 2008-08-28 0:48 ` David Miller 0 siblings, 1 reply; 69+ messages in thread From: Nick Piggin @ 2008-08-28 0:45 UTC (permalink / raw) To: Andi Kleen Cc: David Miller, rick.jones2, johnpol, dada1, denys, netdev, linux-kernel On Thursday 28 August 2008 08:39, Andi Kleen wrote: > On Wed, Aug 27, 2008 at 03:18:24PM -0700, David Miller wrote: > > From: Andi Kleen <andi@firstfloor.org> > > Date: Wed, 27 Aug 2008 18:27:35 +0200 > > > > > > Those banks really want to crank down on latency - to the point they > > > > start disabling interrupt coalescing. I bet they'd toss anything out > > > > they could to shave another microsecond. > > > > > > This change would actually likely lower their latency. > > > > They want the timestamps, but they want it to match when the packet > > arrived at their system as closely as is reasonably possible. > > Then they should use hardware time stamps which are increasingly > available (e.g. current Intel e1000 design has them and I expect > others too). Would it make sense to make a new option for these socket timestamps and encourage some apps move over to it? ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile 2008-08-28 0:45 ` Nick Piggin @ 2008-08-28 0:48 ` David Miller 2008-08-28 1:07 ` Nick Piggin 0 siblings, 1 reply; 69+ messages in thread From: David Miller @ 2008-08-28 0:48 UTC (permalink / raw) To: nickpiggin; +Cc: andi, rick.jones2, johnpol, dada1, denys, netdev, linux-kernel From: Nick Piggin <nickpiggin@yahoo.com.au> Date: Thu, 28 Aug 2008 10:45:03 +1000 > On Thursday 28 August 2008 08:39, Andi Kleen wrote: > > On Wed, Aug 27, 2008 at 03:18:24PM -0700, David Miller wrote: > > > From: Andi Kleen <andi@firstfloor.org> > > > Date: Wed, 27 Aug 2008 18:27:35 +0200 > > > > > > > > Those banks really want to crank down on latency - to the point they > > > > > start disabling interrupt coalescing. I bet they'd toss anything out > > > > > they could to shave another microsecond. > > > > > > > > This change would actually likely lower their latency. > > > > > > They want the timestamps, but they want it to match when the packet > > > arrived at their system as closely as is reasonably possible. > > > > Then they should use hardware time stamps which are increasingly > > available (e.g. current Intel e1000 design has them and I expect > > others too). > > Would it make sense to make a new option for these socket timestamps > and encourage some apps move over to it? We don't have support to using these specific hardware provided timestamps sources yet, so it's kind of premature to recommend the facility to applications. :) ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile 2008-08-28 0:48 ` David Miller @ 2008-08-28 1:07 ` Nick Piggin 0 siblings, 0 replies; 69+ messages in thread From: Nick Piggin @ 2008-08-28 1:07 UTC (permalink / raw) To: David Miller Cc: andi, rick.jones2, johnpol, dada1, denys, netdev, linux-kernel On Thursday 28 August 2008 10:48, David Miller wrote: > From: Nick Piggin <nickpiggin@yahoo.com.au> > Date: Thu, 28 Aug 2008 10:45:03 +1000 > > > On Thursday 28 August 2008 08:39, Andi Kleen wrote: > > > On Wed, Aug 27, 2008 at 03:18:24PM -0700, David Miller wrote: > > > > From: Andi Kleen <andi@firstfloor.org> > > > > Date: Wed, 27 Aug 2008 18:27:35 +0200 > > > > > > > > > > Those banks really want to crank down on latency - to the point > > > > > > they start disabling interrupt coalescing. I bet they'd toss > > > > > > anything out they could to shave another microsecond. > > > > > > > > > > This change would actually likely lower their latency. > > > > > > > > They want the timestamps, but they want it to match when the packet > > > > arrived at their system as closely as is reasonably possible. > > > > > > Then they should use hardware time stamps which are increasingly > > > available (e.g. current Intel e1000 design has them and I expect > > > others too). > > > > Would it make sense to make a new option for these socket timestamps > > and encourage some apps move over to it? > > We don't have support to using these specific hardware provided timestamps > sources yet, so it's kind of premature to recommend the facility to > applications. :) Dang, that was a really badly quoted. I was reading the thread and got to the end and just fired off my reply from there... Sorry -- what I meant to ask was, would it make sense to have a new option to enable time stamp measuring in the socket receive layer as in the patchset that Andi referenced, but without removing existing support for early timestamping? ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile
2008-08-27 12:54 ` Andi Kleen
2008-08-27 16:07 ` Rick Jones
@ 2008-08-27 16:17 ` Stephen Hemminger
2008-08-27 17:14 ` Jarek Poplawski
2008-08-27 21:34 ` David Miller
2 siblings, 1 reply; 69+ messages in thread
From: Stephen Hemminger @ 2008-08-27 16:17 UTC (permalink / raw)
To: Andi Kleen
Cc: Evgeniy Polyakov, Eric Dumazet, Denys Fedoryshchenko, netdev, linux-kernel
On Wed, 27 Aug 2008 14:54:12 +0200
Andi Kleen <andi@firstfloor.org> wrote:

> Evgeniy Polyakov <johnpol@2ka.mipt.ru> writes:
> >
> > Yup, this innocent toys can end up with this such behaviour on modern
> > highly loaded machines.
>
> I and also other people had some patches to move the time stamp
> measuring into the socket. This way the time stamping didn't need to
> be enabled on all packets, only on those that actually end up at a
> socket that requires the time stamp.
>
> Unfortunately DaveM didn't like it because some bank wanted
> different semantics, see the discussion in
> http://thread.gmane.org/gmane.linux.network/91679
>
> Perhaps you can find out which bank it was and send them a bill for
> your CPU time ;-)
>
> -Andi
>

Look at /proc/net/ptype to see if any AF_PACKET sockets are open.
There are several causes of this:
 * Applications like DHCP use AF_PACKET when they could use something else
 * AF_PACKET API was poorly designed and always has timestamps
 * The choice was made to get more accurate timestamps by stamping early in
   receive code. A better alternative would be to do it in protocol handler
   after the socket filter. Sorry Andi, the socket layer is too late.
 * No driver is using hardware mechanisms to get accurate/free timestamps.
   I was working on sky2, but it never was stable/complete.

Easiest advice now is to fix userspace.
^ permalink raw reply [flat|nested] 69+ messages in thread
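The /proc/net/ptype check suggested above can be automated. The following is a minimal sketch under two assumptions that are not stated in the thread: that the file uses a three-column layout (Type, Device, Function), and that AF_PACKET taps show up with a handler name containing "packet_rcv" (for example packet_rcv or tpacket_rcv). The function name count_packet_taps is hypothetical, and it works on text already read into memory so it can be tried on sample input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count the lines of a /proc/net/ptype-style listing whose text
 * contains "packet_rcv". Assumption: such lines correspond to
 * open AF_PACKET sockets.
 */
int count_packet_taps(const char *listing)
{
    static const char needle[] = "packet_rcv";
    const size_t nlen = sizeof(needle) - 1;
    const char *line = listing;
    int count = 0;

    assert(listing != NULL);
    while (line != NULL && *line != '\0') {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);
        const char *p;

        /* Search for the needle inside this line only. */
        for (p = line; p + nlen <= line + len; p++) {
            if (memcmp(p, needle, nlen) == 0) {
                count++;
                break;
            }
        }
        line = end ? end + 1 : NULL;
    }
    return count;
}
```

A non-zero result on a router that is not supposed to be sniffing traffic would be a hint to look for the process holding the packet socket.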
* Re: loaded router, excessive getnstimeofday in oprofile 2008-08-27 16:17 ` Stephen Hemminger @ 2008-08-27 17:14 ` Jarek Poplawski 0 siblings, 0 replies; 69+ messages in thread From: Jarek Poplawski @ 2008-08-27 17:14 UTC (permalink / raw) To: Stephen Hemminger Cc: Andi Kleen, Evgeniy Polyakov, Eric Dumazet, Denys Fedoryshchenko, netdev, linux-kernel Stephen Hemminger wrote, On 08/27/2008 06:17 PM: > On Wed, 27 Aug 2008 14:54:12 +0200 > Andi Kleen <andi@firstfloor.org> wrote: > >> Evgeniy Polyakov <johnpol@2ka.mipt.ru> writes: >>> Yup, this innocent toys can end up with this such behaviour on modern >>> highly loaded machines. >> I and also other people had some patches to move the time stamp >> measuring into the socket. This way the time stamping didn't need to >> be enabled on all packets, only on those that actually end up at a >> socket that requires the time stamp. >> >> Unfortunately DaveM didn't like it because some bank wanted >> different semantics, see the discussion in >> http://thread.gmane.org/gmane.linux.network/91679 >> >> Perhaps you can find out which bank it was and send them a bill for >> your CPU time ;-) >> >> -Andi >> > > Look at /proc/net/ptype to see if any AF_PACKET sockets are open. > There are several causes of this: > * Applications like DHCP use AF_PACKET when they could use something else > * AF_PACKET API was poorly designed and always has timestamps > * The choice was made to get more accurate timestamps by stamping early in > receive code. A better alternative would be to do it in protocol handler > after the socket filter. Sorry, Andi socket layer is too late. > * No driver is using hardware mechanisms to get accurate/free timestamps. > I was working on sky2, but never was stable/complete. > > Easist advice now is to fix userspace. And what is working advice? Why exactly admin can't chose between 2 alternatives here? Jarek P. ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile
2008-08-27 12:54 ` Andi Kleen
2008-08-27 16:07 ` Rick Jones
2008-08-27 16:17 ` Stephen Hemminger
@ 2008-08-27 21:34 ` David Miller
2008-08-28 2:39 ` Jason Uhlenkott
2 siblings, 1 reply; 69+ messages in thread
From: David Miller @ 2008-08-27 21:34 UTC (permalink / raw)
To: andi; +Cc: johnpol, dada1, denys, netdev, linux-kernel
From: Andi Kleen <andi@firstfloor.org>
Date: Wed, 27 Aug 2008 14:54:12 +0200

> Evgeniy Polyakov <johnpol@2ka.mipt.ru> writes:
> >
> > Yup, this innocent toys can end up with this such behaviour on modern
> > highly loaded machines.
>
> I and also other people had some patches to move the time stamp
> measuring into the socket. This way the time stamping didn't need to
> be enabled on all packets, only on those that actually end up at a
> socket that requires the time stamp.

By the time you get to the socket, it might be eons (relatively
speaking) later, decreasing the usefulness of the timestamp.

As just an odd example, if the TCP socket is user locked at the
moment, because the user is blocked on a GFP_KERNEL allocation, it
could be a very long time before we actually process the packet and
timestamp it.

UDP now does similar socket locking so could potentially hit the same
kind of problem.

That was my argument against such a change.

I find it amusing that nobody is talking about fixing the tools
that are creating the timestamp requests when they have no real
reason for having them in the first place.
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile 2008-08-27 21:34 ` David Miller @ 2008-08-28 2:39 ` Jason Uhlenkott 2008-08-28 3:10 ` David Miller 0 siblings, 1 reply; 69+ messages in thread From: Jason Uhlenkott @ 2008-08-28 2:39 UTC (permalink / raw) To: David Miller; +Cc: andi, johnpol, dada1, denys, netdev, linux-kernel On Wed, Aug 27, 2008 at 14:34:01 -0700, David Miller wrote: > By the time you get to the socket, it might be eons (relatively > speaking) later, decreasing the usefulness of the timestamp. It's a *socket* option. It's named SO_TIMESTAMP. Users of it ought to *expect* that it records the time the packet hits the socket, not the time the frame hits the device. If banks want to know when frames are hitting their devices, that's fine, but setsockopt() is the wrong layer for controlling that sort of thing. An interface flag would make a lot more sense. > I find it amusing that nobody it talking about fixing the tools > that are creating the timestamp requests when they have no real > reason for having them in the first place. I don't agree that the tools are broken. Some of them may have frivolous reasons for wanting timestamps, but they're asking for something at the socket layer, with the scope of a single socket, and it's hardly their fault that we respond to that by doing something expensive and global at a much lower level. ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile 2008-08-28 2:39 ` Jason Uhlenkott @ 2008-08-28 3:10 ` David Miller 2008-08-28 6:28 ` Joe Malicki 0 siblings, 1 reply; 69+ messages in thread From: David Miller @ 2008-08-28 3:10 UTC (permalink / raw) To: juhlenko; +Cc: andi, johnpol, dada1, denys, netdev, linux-kernel From: Jason Uhlenkott <juhlenko@akamai.com> Date: Wed, 27 Aug 2008 19:39:58 -0700 > It's a *socket* option. It's named SO_TIMESTAMP. Users of it ought > to *expect* that it records the time the packet hits the socket, not > the time the frame hits the device. When expectations equal reality, and then we change reality, that's called breaking things. What might (and I do mean "might") save us is how other systems implement this. A quick check of BSD shows that at least OpenBSD fetches the timestamp inside of the RAW and UDP usrreq handler, which is basically socket receive. Our man pages simply say "reception" as when the timestamp is from, which may also give us some more leeway. From: Jason Uhlenkott <juhlenko@akamai.com> Date: Wed, 27 Aug 2008 19:39:58 -0700 > > I find it amusing that nobody it talking about fixing the tools > > that are creating the timestamp requests when they have no real > > reason for having them in the first place. > > I don't agree that the tools are broken. Some of them may have > frivolous reasons for wanting timestamps, but they're asking for > something at the socket layer, with the scope of a single socket, and > it's hardly their fault that we respond to that by doing something > expensive and global at a much lower level. Every application using AF_PACKET sockets gets timestamps by default. And we do know of several specific cases where the timestamps are unnecessary. Even for other cases, why in the world does a DHCP client need accurate timestamps? Give me a break. :) ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile 2008-08-28 3:10 ` David Miller @ 2008-08-28 6:28 ` Joe Malicki 2008-08-28 7:22 ` Andi Kleen 2008-08-28 18:00 ` Rick Jones 0 siblings, 2 replies; 69+ messages in thread From: Joe Malicki @ 2008-08-28 6:28 UTC (permalink / raw) To: David Miller Cc: andi, johnpol, dada1, denys, netdev, linux-kernel, juhlenko, sammy ----- "David Miller" <davem@davemloft.net> wrote: > > Every application using AF_PACKET sockets gets timestamps by > default. And we do know of several specific cases where the > timestamps are unnecessary. > > Even for other cases, why in the world does a DHCP client need > accurate timestamps? Give me a break. :) > I've worked with systems where SO_TIMESTAMP has been used for H.323 videoconferencing systems to synchronize audio and video where remote systems' timestamps on the protocol streams proved to be inaccurate (based off of different, unsynchronized clocks). I can't see any other realistic use of this, but trying to get timestamps for quasi-realtime protocols may be an important use case - and in that case, you want the time when it hits the interface, NOT when it hits the socket. What utility does the time of hitting the socket get you? ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile
2008-08-28 6:28 ` Joe Malicki
@ 2008-08-28 7:22 ` Andi Kleen
2008-08-28 15:02 ` Denys Fedoryshchenko
` (2 more replies)
2008-08-28 18:00 ` Rick Jones
1 sibling, 3 replies; 69+ messages in thread
From: Andi Kleen @ 2008-08-28 7:22 UTC (permalink / raw)
To: Joe Malicki
Cc: David Miller, andi, johnpol, dada1, denys, netdev, linux-kernel, juhlenko, sammy
> I've worked with systems where SO_TIMESTAMP has been used for
> H.323 videoconferencing systems to synchronize audio and video

But didn't you really want an "end2end" time stamp in this case, as in
really at the end of all kernel/hardware queues on your side?

A packet roughly travels this way on a normal NIC before it hits recvmsg():

wire -> NIC on die buffers -> NIC RX ring -> interrupt handler ->
NAPI or per CPU queue -> softirq socket lookup -> socket queue -> recvmsg

These all do their own queuing and all queues can add delays depending
on the load. Right now SO_TIMESTAMP is in the interrupt handler, but it's
just an arbitrary position in a multitude of queues.

For video conferencing (or e.g. in general if you implement a retransmit
timeout in user space) scheduling delays on the local box surely need to
be taken into account too because they all add to the final timing of the
packets on the wire. The queues inside the system are really part of the
network too.

In Linux for example the algorithms that size the TCP buffer space know
that and specifically account for it and reserve a local queue buffer.

> where remote systems' timestamps on the protocol streams proved
> to be inaccurate (based off of different, unsynchronized clocks).

Yes, but why ignore local scheduling delays?

>
> I can't see any other realistic use of this, but trying to get
> timestamps for quasi-realtime protocols may be an important use
> case - and in that case, you want the time when it hits the
> interface, NOT when it hits the socket.

I think it's the other way round. Why would the real time protocol care
when it hits some arbitrary queue in the network stack instead of the time
when the application can really read the data?

> What utility does the time of hitting the socket get you?

SO_TIMESTAMP was originally invented for passive network monitoring
as in tcpdump (for which PACKET sockets were designed originally,
DHCP is really just abusing them imho). There it makes some sense
to do the time stamp as near to the wire as possible, but really a
hardware time stamp would be better because it is even nearer.

But for anything that does end2end it's the wrong semantics anyway,
because ignoring local queueing delays would be just a bug, and
SO_TIMESTAMP ignores them currently.

-Andi
--
ak@linux.intel.com
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile 2008-08-28 7:22 ` Andi Kleen @ 2008-08-28 15:02 ` Denys Fedoryshchenko 2008-08-28 19:01 ` Ilpo Järvinen 2008-08-28 19:31 ` David Miller 2008-08-28 16:48 ` Denys Fedoryshchenko 2008-08-29 15:21 ` Joe Malicki 2 siblings, 2 replies; 69+ messages in thread From: Denys Fedoryshchenko @ 2008-08-28 15:02 UTC (permalink / raw) To: Andi Kleen Cc: Joe Malicki, David Miller, johnpol, dada1, netdev, linux-kernel, juhlenko, sammy On Thursday 28 August 2008, Andi Kleen wrote: I hit one more bug, while deleting root class for htb on ifb0 i got tc stuck (and all operations related to tc), but there was some fixes for this things in net-2.6, so i tried to update git tree. It seems i cannot test current net-2.6, because it is broken for me on USB part (fixed by workaround in init scripts), HPET totally broken in net-2.6, but works for latest main git from torvalds tree. I have to wait when net-2.6 rebased to current torvalds tree, then i will try to test. ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile 2008-08-28 15:02 ` Denys Fedoryshchenko @ 2008-08-28 19:01 ` Ilpo Järvinen 2008-08-28 19:31 ` David Miller 1 sibling, 0 replies; 69+ messages in thread From: Ilpo Järvinen @ 2008-08-28 19:01 UTC (permalink / raw) To: Denys Fedoryshchenko Cc: Andi Kleen, Joe Malicki, David Miller, johnpol, dada1, Netdev, LKML, juhlenko, sammy On Thu, 28 Aug 2008, Denys Fedoryshchenko wrote: > On Thursday 28 August 2008, Andi Kleen wrote: > I hit one more bug, while deleting root class for htb on ifb0 i got tc stuck > (and all operations related to tc), but there was some fixes for this things > in net-2.6, so i tried to update git tree. > > It seems i cannot test current net-2.6, because it is broken for me > on USB part (fixed by workaround in init scripts), HPET totally broken in > net-2.6, but works for latest main git from torvalds tree. > I have to wait when net-2.6 rebased to current torvalds tree, then i will try > to test. You could always pull net-2.6 to Linus' tree by yourself. ...And about the workflow, net-2.6 isn't rebased, instead Linus just pulls it in to his tree. -- i. ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile 2008-08-28 15:02 ` Denys Fedoryshchenko 2008-08-28 19:01 ` Ilpo Järvinen @ 2008-08-28 19:31 ` David Miller 1 sibling, 0 replies; 69+ messages in thread From: David Miller @ 2008-08-28 19:31 UTC (permalink / raw) To: denys; +Cc: andi, jmalicki, johnpol, dada1, netdev, linux-kernel, juhlenko, sammy From: Denys Fedoryshchenko <denys@visp.net.lb> Date: Thu, 28 Aug 2008 18:02:17 +0300 > It seems i cannot test current net-2.6, because it is broken for me > on USB part (fixed by workaround in init scripts), HPET totally broken in > net-2.6, but works for latest main git from torvalds tree. > I have to wait when net-2.6 rebased to current torvalds tree, then i will try > to test. Make a clone of Linus's tree, then pull in the net-2.6 tree. This is always how you should test things especially if you want to make sure you have whatever non-networking bug fixes your machine might require. ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile
2008-08-28 7:22 ` Andi Kleen
2008-08-28 15:02 ` Denys Fedoryshchenko
@ 2008-08-28 16:48 ` Denys Fedoryshchenko
2008-08-28 16:56 ` Andi Kleen
` (2 more replies)
2008-08-29 15:21 ` Joe Malicki
2 siblings, 3 replies; 69+ messages in thread
From: Denys Fedoryshchenko @ 2008-08-28 16:48 UTC (permalink / raw)
To: Andi Kleen
Cc: Joe Malicki, David Miller, johnpol, dada1, netdev, linux-kernel, juhlenko, sammy
My small IMHO regarding SO_TIMESTAMP.

1) Right now I have 400-500 Mbps passing through the router. If I run
5 "pings" simultaneously under _USER_ privileges (I know ping is suid),
instead of 20% free CPU time I will have 1-2% free CPU time. Sure, I know
ping is a suid program, but it has been "like this" for a long time.
Security psychos would call it a DoS.

2) Usefulness of this option. What is the difference, on an almost idle
machine, whether the timestamp is retrieved at a higher or a lower level?
And why do we need such a high precision timestamp (with an expensive
timer) on a highly loaded server, if in my case enabling any socket with
SO_TIMESTAMP creates delays of more than 10 ms (up to 100 ms)?

3) Who are the main users of SO_TIMESTAMP? iputils, which is installed on
almost _ANY_ linux machine? busybox, which uses the same option? Many
other multiplatform userspace applications? Or banks? I don't take dhcpd
much into account, which is maybe abusing this option.

So there are a few good solutions available (IMHO):
1) Introduce some SO_REALTIMESTAMP (anyway, even SO_TIMESTAMP is not
defined in any standard) for banks and ntp folks who need it. And even
give them timespec instead of timeval, so they will be even happier with
the resolution.
2) Provide a sysctl, kernel boot, or even "build time" option for "banks"
to have a high resolution (and expensive) SO_TIMESTAMP.
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile 2008-08-28 16:48 ` Denys Fedoryshchenko @ 2008-08-28 16:56 ` Andi Kleen 2008-08-28 18:57 ` Eric Dumazet 2008-08-28 19:36 ` David Miller 2 siblings, 0 replies; 69+ messages in thread From: Andi Kleen @ 2008-08-28 16:56 UTC (permalink / raw) To: Denys Fedoryshchenko Cc: Andi Kleen, Joe Malicki, David Miller, johnpol, dada1, netdev, linux-kernel, juhlenko, sammy On Thu, Aug 28, 2008 at 07:48:52PM +0300, Denys Fedoryshchenko wrote: > 1)Right now i have 400-500 Mbps passing router. If i will run > 5 "pings" ,simultaneous ,under _USER_ privileges(i know ping is suid), > instead of free 20% CPU time, i will have 1-2% free CPU time. Sure i know > ping is suid program, but it is has been "like this" since long time. By > security psychos it will be caled DoS. The skb timestamp overhead does not add up, it's either on or off. If multiple pings make the router slower it must be something else. -Andi -- ak@linux.intel.com ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile
2008-08-28 16:48 ` Denys Fedoryshchenko
2008-08-28 16:56 ` Andi Kleen
@ 2008-08-28 18:57 ` Eric Dumazet
2008-08-28 19:25 ` Denys Fedoryshchenko
2008-08-28 19:36 ` David Miller
2 siblings, 1 reply; 69+ messages in thread
From: Eric Dumazet @ 2008-08-28 18:57 UTC (permalink / raw)
To: Denys Fedoryshchenko
Cc: Andi Kleen, Joe Malicki, David Miller, johnpol, netdev, linux-kernel, juhlenko, sammy
Denys Fedoryshchenko wrote:
> My small IMHO regarding SO_TIMESTAMP.
>
> 1)Right now i have 400-500 Mbps passing router. If i will run
> 5 "pings" ,simultaneous ,under _USER_ privileges(i know ping is suid),
> instead of free 20% CPU time, i will have 1-2% free CPU time. Sure i know
> ping is suid program, but it is has been "like this" since long time. By
> security psychos it will be caled DoS.
>
>

So... if using ping on your machine has a direct and noticeable effect on
cpu load, the problem is elsewhere (if no ping is running, you don't have
skb timestamping, but still getnstimeofday() is the top function in
oprofile).

1) Do you have any netfilter rule using xt_time ?
   (This module also calls __net_timestamp(skb))

2) You maybe have a bad program that does something expensive relative to
   kernel time services.

bad_program()
{
	while (1) {
		struct timeval tv0, tv1;

		gettimeofday(&tv0, NULL); // or whatever function that calls getnstimeofday()
		do_small_work();
		gettimeofday(&tv1, NULL); // or whatever function that calls getnstimeofday()
		add_stat_event(&tv1, &tv0);
	}
}

> 2)Usefullness of this option. What is a difference if on almost idle machine
> timestamp retrieved on higher level or lower level?
> And why we need on highly loaded server so high precision timestamp (with
> expensive timer), if in my case enabling any socket with SO_TIMESTAMP
> creating delays more than 10ms(up to 100ms)?

Your setup is probably not common.
You want a PersonalComputer class machine to act as a SuperCiscoDevice(TM),
while most PC machines don't use more than 10% of CPU power on average...

Many existing programs depend on current SO_TIMESTAMP.
We won't break them to solve a particular problem (yet to be demonstrated).

>
> 3)Who is most users of SO_TIMESTAMP? iputils which is installed on almost
> _ANY_ linux machine? busybox which is using same option? Many others
> userspace multiplatform applications? Or banks? I dont take much in account
> dhcpd, who is maybe abusing this option.
>
> So there is few good solutions available (IMHO):
> 1)Introduce some SO_REALTIMESTAMP (anyway even SO_TIMESTAMP not defined in any
> standard) for banks and ntp folks, who need them. And even give them timespec
> instead timeval, so they will be even more happy with resolution.

The kernel already provides nanosecond resolution :)

Check SO_TIMESTAMPNS and SCM_TIMESTAMPNS
^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile 2008-08-28 18:57 ` Eric Dumazet @ 2008-08-28 19:25 ` Denys Fedoryshchenko 2008-08-28 19:37 ` Eric Dumazet 0 siblings, 1 reply; 69+ messages in thread From: Denys Fedoryshchenko @ 2008-08-28 19:25 UTC (permalink / raw) To: Eric Dumazet Cc: Andi Kleen, Joe Malicki, David Miller, johnpol, netdev, linux-kernel, juhlenko, sammy On Thursday 28 August 2008, Eric Dumazet wrote: > So... if using ping on your machine has direct an noticeable effect on cpu > load, problem is elsewhere (if no ping is running, you dont have skb > timestamping, but still getnstimeofday() is the top function in oprofile) > > 1) Do you have any netfilter rule using xt_time ? > (This module also calls __net_timestamp(skb)) No > > 2) You maybe have a bad program that do something expensive relative to > kernel time services. No, process list is very short, it is custom semi-embedded linux distro i made, so i know each process running there. Here is process list (kernel processes/threads and running shell(busybox ash) removed) 1 root /bin/sh /init 1119 root init 2451 root /sbin/syslogd -R 80.83.17.2 2453 root /sbin/klogd 3168 squid /usr/sbin/zebra -d 3175 squid /usr/sbin/ripd -d 3195 root /usr/sbin/snmpd -c /config/snmpd.conf 3208 root udhcpd /config/udhcp.office.conf -S 3550 root /usr/sbin/sshd -b /etc/banner 3566 root /sbin/getty 38400 tty1 3567 root /sbin/getty 38400 tty2 3570 root /sbin/getty 38400 tty3 4055 root /usr/sbin/sshd -b /etc/banner > Your setup is probably not common. > You want a PersonnalComputer class machine acts as a SuperCiscoDevice(TM), > while most PC machines dont use more than 10% of CPU power in average... I dont think i am alone, and almost sure there is many guys trying to run linux as high-performance router. But most of them dont know about netdev@ :-) Well, thats called "Increasing resources use efficiency and system productivity". It is never a shame to utilize resources more efficiently. 
Plus i am not using PC class machine. For example this one with HPET, is Sun Fire X4100, which costs us that time a lot of bucks, and mostly because it is reliable hardware (very good IPMI/remote kvm/... onboard, good cooling, 4 e1000, dual power supply). I can use also PC class, but i will face some issues, like building proper cooling system and maybe even it will not work well, cause some chips not designed for "heavy duty", and on load they will not be able to dissipate heat inside the chip and will be broked soon. But sometimes it is even worth to try. And most important, many routers is already "soft"-routers. What is Cisco 7206+NPE G1/G2? It is MIPS CPU with relatively large L2 cache. There is seems no ASIC for routing offloading. Means Linux can do same or better job. And means Vyatta can beat Cisco on this market, and be far away forward from Cisco soon. As result more jobs for opensource guys. Linux must enter "heavy duty" and critical jobs too, not only SOHO-class routers. > > Many existing programs depend on current SO_TIMESTAMP. > We wont break them to solve a particular problem (yet to be demonstrated) I think it wouldn't break. But sure we must be very careful and on my side i can test all possible scenarios i can implement. Maybe even good idea to not change (for now) current default behaviour, but to provide option for "high performance" systems then. > kernel already provides nanosecond resolution :) > Check SO_TIMESTAMPNS and SCM_TIMESTAMPNS Maybe this function really must be "heavy" then. ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile
2008-08-28 19:25 ` Denys Fedoryshchenko
@ 2008-08-28 19:37 ` Eric Dumazet
2008-08-28 19:55 ` Denys Fedoryshchenko
0 siblings, 1 reply; 69+ messages in thread
From: Eric Dumazet @ 2008-08-28 19:37 UTC (permalink / raw)
To: Denys Fedoryshchenko
Cc: Andi Kleen, Joe Malicki, David Miller, johnpol, netdev, linux-kernel, juhlenko, sammy

Denys Fedoryshchenko wrote:
> On Thursday 28 August 2008, Eric Dumazet wrote:
>> 2) You maybe have a bad program that do something expensive relative to
>> kernel time services.
> No, process list is very short, it is custom semi-embedded linux distro i
> made, so i know each process running there. Here is process list (kernel
> processes/threads and running shell(busybox ash) removed)
>
>     1 root     /bin/sh /init
>  1119 root     init
>  2451 root     /sbin/syslogd -R 80.83.17.2
>  2453 root     /sbin/klogd
>  3168 squid    /usr/sbin/zebra -d
>  3175 squid    /usr/sbin/ripd -d
>  3195 root     /usr/sbin/snmpd -c /config/snmpd.conf
>  3208 root     udhcpd /config/udhcp.office.conf -S
>  3550 root     /usr/sbin/sshd -b /etc/banner
>  3566 root     /sbin/getty 38400 tty1
>  3567 root     /sbin/getty 38400 tty2
>  3570 root     /sbin/getty 38400 tty3
>  4055 root     /usr/sbin/sshd -b /etc/banner
>

OK, please try oprofile with call graph analysis.

>
>> kernel already provides nanosecond resolution :)
>> Check SO_TIMESTAMPNS and SCM_TIMESTAMPNS
> Maybe this function really must be "heavy" then.

Nope... the contrary :)

Kernel timestamping has nanosecond resolution.
SO_TIMESTAMP needs a divide (by 1000), while SO_TIMESTAMPNS is native.

^ permalink raw reply [flat|nested] 69+ messages in thread
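Eric's remark about SO_TIMESTAMPNS can be illustrated from the userspace side. What follows is only a minimal sketch, assuming a Linux system whose <sys/socket.h> exposes SO_TIMESTAMPNS and SCM_TIMESTAMPNS. The helper names (enable_rx_timestamp_ns, recv_with_timestamp_ns, timespec_to_timeval_us) are made up for illustration and are not part of any kernel or libc API. The last helper shows the divide by 1000 that a microsecond SO_TIMESTAMP value implies.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <time.h>

#ifndef SCM_TIMESTAMPNS
#define SCM_TIMESTAMPNS SO_TIMESTAMPNS
#endif

/* Ask the kernel to attach a nanosecond receive timestamp to each packet. */
int enable_rx_timestamp_ns(int fd)
{
    int on = 1;

    return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPNS, &on, sizeof(on));
}

/*
 * Receive one datagram and copy out its kernel receive timestamp.
 * *ts is left at 0/0 if the kernel attached no timestamp.
 */
ssize_t recv_with_timestamp_ns(int fd, void *buf, size_t len,
                               struct timespec *ts)
{
    union {
        char buf[CMSG_SPACE(sizeof(struct timespec))];
        struct cmsghdr align; /* forces correct alignment of buf */
    } ctrl;
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg;
    struct cmsghdr *c;
    ssize_t n;

    assert(ts != NULL);
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    ts->tv_sec = 0;
    ts->tv_nsec = 0;

    n = recvmsg(fd, &msg, 0);
    if (n < 0)
        return n;

    /* Walk the ancillary data looking for the timestamp message. */
    for (c = CMSG_FIRSTHDR(&msg); c != NULL; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_TIMESTAMPNS) {
            memcpy(ts, CMSG_DATA(c), sizeof(*ts));
            break;
        }
    }
    return n;
}

/* Microsecond form of a nanosecond timestamp: this is the divide by 1000. */
struct timeval timespec_to_timeval_us(struct timespec ts)
{
    struct timeval tv;

    tv.tv_sec = ts.tv_sec;
    tv.tv_usec = ts.tv_nsec / 1000;
    return tv;
}
```

A caller would run enable_rx_timestamp_ns() once on a datagram socket and then use recv_with_timestamp_ns() wherever it would otherwise call recv().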
* Re: loaded router, excessive getnstimeofday in oprofile
2008-08-28 19:37 ` Eric Dumazet
@ 2008-08-28 19:55 ` Denys Fedoryshchenko
2008-08-29 15:43 ` Stephen Hemminger
0 siblings, 1 reply; 69+ messages in thread
From: Denys Fedoryshchenko @ 2008-08-28 19:55 UTC (permalink / raw)
To: Eric Dumazet
Cc: Andi Kleen, Joe Malicki, David Miller, johnpol, netdev, linux-kernel, juhlenko, sammy

On Thursday 28 August 2008, Eric Dumazet wrote:
> OK, please try oprofile with call graph analysis.

I did already. Most of the programs (except ripd/zebra) can be killed,
and when i kill them it changes almost nothing.

These seem to be the heavy things causing instability:

1) HTB (resolution can be lowered to improve performance, i will try
   Jarek's patch soon)
2) Occasional ping/tcpdump and other SO_TIMESTAMP users
3) Probably softlockup detection. It is disabled already; i will come
   back to it soon, if it is required.

One other issue i noticed: "CACHE MISS" accounts for maybe 5-10% in
oprofile in u32, but i am not sure it is an interesting subject to
discuss. I have to optimize all my iproute2 rules first.

^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile
2008-08-28 19:55 ` Denys Fedoryshchenko
@ 2008-08-29 15:43 ` Stephen Hemminger
0 siblings, 0 replies; 69+ messages in thread
From: Stephen Hemminger @ 2008-08-29 15:43 UTC (permalink / raw)
To: Denys Fedoryshchenko
Cc: Eric Dumazet, Andi Kleen, Joe Malicki, David Miller, johnpol, netdev, linux-kernel, juhlenko, sammy

On Thu, 28 Aug 2008 22:55:29 +0300
Denys Fedoryshchenko <denys@visp.net.lb> wrote:

> On Thursday 28 August 2008, Eric Dumazet wrote:
> > OK, please try oprofile with call graph analysis.
> I did already. Even because most of programs (except ripd/zebra) can be
> killed, and i kill them, it doesn't change almost anything.
>
> it seems heavy things causing instability:
>
> 1)HTB (resolution can be lowered to improve performance, i will try Jarek
> patch soon)

If you are using HTB, it also calls the clock to get timing
information. Each packet dequeue in HTB calls psched_get_time(), and
that becomes another call to the nanosecond real-time clock.

If your embedded processor has a really expensive clock, you probably
just want to provide an alternative, cheaper time source with less
resolution.

^ permalink raw reply [flat|nested] 69+ messages in thread
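The trade-off Stephen describes, a cheaper time source with less resolution, has a userspace analogue that is easy to measure. The sketch below is an illustration only and is not the kernel-side change discussed in the thread. It assumes a Linux system that provides CLOCK_MONOTONIC_COARSE, which as far as I know only appeared in kernels released after this thread. The function names are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Resolution of a clock in nanoseconds, or -1 if the clock is unavailable. */
long long clock_resolution_ns(clockid_t id)
{
    struct timespec res;

    if (clock_getres(id, &res) != 0)
        return -1;
    return (long long)res.tv_sec * 1000000000LL + res.tv_nsec;
}

/* Average cost of one clock_gettime() call on the given clock, in ns. */
long long clock_cost_ns(clockid_t id, int iterations)
{
    struct timespec start, end, scratch;
    long long elapsed;
    int i;

    assert(iterations > 0);

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1;
    for (i = 0; i < iterations; i++) {
        if (clock_gettime(id, &scratch) != 0)
            return -1;
    }
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1;

    elapsed = (long long)(end.tv_sec - start.tv_sec) * 1000000000LL
            + (end.tv_nsec - start.tv_nsec);
    return elapsed / iterations;
}

/* Print resolution and per-call cost for the precise and the coarse clock. */
void print_clock_report(int iterations)
{
    printf("CLOCK_MONOTONIC:        res %lld ns, cost %lld ns/call\n",
           clock_resolution_ns(CLOCK_MONOTONIC),
           clock_cost_ns(CLOCK_MONOTONIC, iterations));
    printf("CLOCK_MONOTONIC_COARSE: res %lld ns, cost %lld ns/call\n",
           clock_resolution_ns(CLOCK_MONOTONIC_COARSE),
           clock_cost_ns(CLOCK_MONOTONIC_COARSE, iterations));
}
```

Calling print_clock_report(1000000) on the machine in question would show whether the coarse clock is actually cheaper there. The numbers depend heavily on the clocksource in use (HPET, TSC, and so on), so no expected output is given.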
* Re: loaded router, excessive getnstimeofday in oprofile 2008-08-28 16:48 ` Denys Fedoryshchenko 2008-08-28 16:56 ` Andi Kleen 2008-08-28 18:57 ` Eric Dumazet @ 2008-08-28 19:36 ` David Miller 2008-08-28 19:59 ` Denys Fedoryshchenko 2 siblings, 1 reply; 69+ messages in thread From: David Miller @ 2008-08-28 19:36 UTC (permalink / raw) To: denys; +Cc: andi, jmalicki, johnpol, dada1, netdev, linux-kernel, juhlenko, sammy From: Denys Fedoryshchenko <denys@visp.net.lb> Date: Thu, 28 Aug 2008 19:48:52 +0300 > So there is few good solutions available (IMHO): > 1)Introduce some SO_REALTIMESTAMP (anyway even SO_TIMESTAMP not defined in any > standard) for banks and ntp folks, who need them. And even give them timespec > instead timeval, so they will be even more happy with resolution. > 2)Provide sysctl,kernel boot, or even "build time" option for "banks" to have > high resolution(and expensive) SO_TIMESTAMP. The performance hit hurts, but changing the default to lower resolution after it having been high resolution for 10+ years is a regression and something we really can't do. ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile
2008-08-28 19:36 ` David Miller
@ 2008-08-28 19:59 ` Denys Fedoryshchenko
0 siblings, 0 replies; 69+ messages in thread
From: Denys Fedoryshchenko @ 2008-08-28 19:59 UTC (permalink / raw)
To: David Miller
Cc: andi, jmalicki, johnpol, dada1, netdev, linux-kernel, juhlenko, sammy

On Thursday 28 August 2008, David Miller wrote:
> From: Denys Fedoryshchenko <denys@visp.net.lb>
> Date: Thu, 28 Aug 2008 19:48:52 +0300
>
> > So there is few good solutions available (IMHO):
> > 1)Introduce some SO_REALTIMESTAMP (anyway even SO_TIMESTAMP not defined
> > in any standard) for banks and ntp folks, who need them. And even give
> > them timespec instead timeval, so they will be even more happy with
> > resolution. 2)Provide sysctl,kernel boot, or even "build time" option for
> > "banks" to have high resolution(and expensive) SO_TIMESTAMP.
>
> The performance hit hurts, but changing the default to lower
> resolution after it having been high resolution for 10+ years
> is a regression and something we really can't do.

Agree. Then maybe add a way to choose, because the choice is high
resolution vs performance.

For example, Intel dynamically throttles interrupts on e1000*, and it
saves me in this case. They also leave an option for users who want low
latency/high throughput.

So maybe there must be a way for specific functions that use
get(ns)timeofday to use specific timers (cheap and less precise), by
option. Or to limit the number of calls they make to the timer.

^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile 2008-08-28 7:22 ` Andi Kleen 2008-08-28 15:02 ` Denys Fedoryshchenko 2008-08-28 16:48 ` Denys Fedoryshchenko @ 2008-08-29 15:21 ` Joe Malicki 2008-08-29 15:30 ` Andi Kleen 2008-08-29 20:43 ` Evgeniy Polyakov 2 siblings, 2 replies; 69+ messages in thread From: Joe Malicki @ 2008-08-29 15:21 UTC (permalink / raw) To: Andi Kleen Cc: David Miller, andi, johnpol, dada1, denys, netdev, linux-kernel, juhlenko, sammy ----- "Andi Kleen" <andi@firstfloor.org> wrote: > > I've worked with systems where SO_TIMESTAMP has been used for > > H.323 videoconferencing systems to synchronize audio and video > > But didn't you really want a "end2end" time stamp in this case, > as in really at the end of all kernel/hardware queues on your side. No. That adds variance, and packets aren't comparable because they may suffer different kernel/hardware delays. The goal is to approximate original sendtime when the application-level timestamps are unreliable. The more queueing delays that can be taken out of the timestamp, the better. > A packet roughly travels this way on a normal NIC before it hits > recvmsg() > > wire -> NIC on die buffers -> NIC RX ring -> interrupt handler -> > NAPI or per CPU queue -> softirq socket lookup -> socket queue -> > recvmsg > > These all do their own queuing and all queues can add delays > depending > on the load. Right now SO_TIMESTAMP is in the interrupt handler, > but it's just an arbitary position in a multitude of queues. > If it could be even earlier, it would be better. > For video conferencing (or e.g. in general if you implement a > retransmit > timeout in user space) scheduling delays on the local box > surely need to be taken into account too because they all add > to the final timing of the packets on the wire. For retransmit timeouts, that might be interesting, and might be one case where it is interesting. 
But then what value does SO_TIMESTAMP have, since you could call gettimeofday() immediately after receipt, and also include application scheduling delays? For videoconferencing, one wants to know when to display a packet as compared to other packets. > The queues inside the system are really part of the network > too. In Linux for example the algorithms who size the TCP > buffer space know that and especially take account for it > and reserve a local queue buffer. > > > where remote systems' timestamps on the protocol streams proved > > to be inaccurate (based off of different, unsynchronized clocks). > > Yes, but why ignore local scheduling delays? Because one would want to ignore even network scheduling delays if possible... unfortunately in some instances it's not. > > > > I can't see any other realistic use of this, but trying to get > > timestamps for quasi-realtime protocols may be an important use > > case - and in that case, you want the time when it hits the > > interface, NOT when it hits the socket. > > I think it's the other way round. Why would the real time > protocol care when it hits some arbitary queue in the network > stack instead of the time when the application can really > read the data? > > > What utility does the time of hitting the socket get you? > > SO_TIMESTAMP was originally invented for passive network > monitoring as in tcpdump (for which PACKET sockets were designed > originally, DHCP is really just abusing them imho). There it makes > some sense to do the time stamp as near on the wire as possible > but really a hardware time stamp would be better because > it is even nearer. But for anything that does end2end it's > the wrong semantics anyways because ignoring local queueing > delays would be just a bug, and SO_TIMESTAMP ignores them currently. > > -Andi > > -- > ak@linux.intel.com Why would you want to do end-to-end with SO_TIMESTAMP, vs. gettimeofday after recv? ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile 2008-08-29 15:21 ` Joe Malicki @ 2008-08-29 15:30 ` Andi Kleen 2008-08-29 15:43 ` Joe Malicki 2008-08-29 20:43 ` Evgeniy Polyakov 1 sibling, 1 reply; 69+ messages in thread From: Andi Kleen @ 2008-08-29 15:30 UTC (permalink / raw) To: Joe Malicki Cc: Andi Kleen, David Miller, johnpol, dada1, denys, netdev, linux-kernel, juhlenko, sammy > That adds variance, and packets aren't comparable because they may > suffer different kernel/hardware delays. And there are no "different kernel/hardware delays" in the network? If your RTT measurement method cannot handle some variance (using standard sampling and data smoothing techniques similar to TCP) then it just needs to be fixed. Besides measuring in the interrupt handler doesn't protect you against local variances anyways because the interrupt timing has variability (e.g due to irq off regions or due to interrupt mitigation or higher priority interrupts) too > > Yes, but why ignore local scheduling delays? > > Because one would want to ignore even network scheduling delays > if possible... unfortunately in some instances it's not. The local delays add to the user experience too. It's unclear why you want to ignore those. -Andi -- ak@linux.intel.com ^ permalink raw reply [flat|nested] 69+ messages in thread
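The "data smoothing techniques similar to TCP" that Andi mentions can be sketched as an exponentially weighted moving average. The code below is a minimal illustration, assuming the gains TCP uses for its retransmission timer (1/8 for the smoothed value, 1/4 for the variation). The struct and function names are hypothetical, and the sample can be any per-packet delay measure, not only an RTT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Running estimate of a noisy per-packet delay measurement. */
struct delay_estimator {
    bool primed;       /* false until the first sample has been seen */
    double smoothed;   /* smoothed delay */
    double variation;  /* smoothed mean deviation */
};

/* Fold one new delay sample into the estimate, TCP-style. */
void delay_estimator_update(struct delay_estimator *e, double sample)
{
    double err;

    assert(e != NULL);

    if (!e->primed) {
        /* First sample: take it as-is, variation starts at half of it. */
        e->smoothed = sample;
        e->variation = sample / 2.0;
        e->primed = true;
        return;
    }

    err = e->smoothed - sample;
    if (err < 0.0)
        err = -err;

    /* Variation is updated first, using the old smoothed value. */
    e->variation = 0.75 * e->variation + 0.25 * err;
    e->smoothed = 0.875 * e->smoothed + 0.125 * sample;
}
```

With samples of 100 and then 200, the smoothed value moves from 100 to 112.5 and the variation from 50 to 62.5, so a single outlier shifts the estimate only slightly.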
* Re: loaded router, excessive getnstimeofday in oprofile 2008-08-29 15:30 ` Andi Kleen @ 2008-08-29 15:43 ` Joe Malicki 0 siblings, 0 replies; 69+ messages in thread From: Joe Malicki @ 2008-08-29 15:43 UTC (permalink / raw) To: Andi Kleen Cc: Andi Kleen, David Miller, johnpol, dada1, denys, netdev, linux-kernel, juhlenko, sammy Joe Malicki Software Engineer MetaCarta, Inc. ----- "Andi Kleen" <andi@firstfloor.org> wrote: > > That adds variance, and packets aren't comparable because they may > > suffer different kernel/hardware delays. > > And there are no "different kernel/hardware delays" in the network? > > If your RTT measurement method cannot handle some variance (using > standard sampling and data smoothing techniques similar to TCP) then > it > just needs to be fixed. Noone's measuring RTT... what ever made you think that? I should explain the application of SO_TIMESTAMP better. Video camera -> Video jack -> Digitization -> Compression -> Packetization -> NIC -> Ethernet -> NIC -> Interrupt Handler -> Queue -> Application Microphone -> MIC jack -> Digitization -> Compression -> Packetization -> NIC -> Ethernet -> NIC -> Interrupt Handler -> Queue -> Application One wants to know the original time sound and light waves hit the camera and microphone, because one wants to know when they should hit the soundcard and video on the other end (i.e. any delays should be synchronized) but one only has control over the receiving system. There are timestamps at the application level for this... unfortunately, many implementations in the real world have independent clocks that skew relative to each other, with little correction on the sending system. Yeah, that's broken, but one has to be liberal in what one accepts from popular products. One way to mitigate the skew between the clocks is to take measurements on the receiving host, which you do control, and compare the average skew between the two streams and correct for it. 
Interrupt handler time has variance, but it's less than application-level time, so it's a better, more reliable estimator. > Besides measuring in the interrupt handler doesn't protect you > against local variances anyways because the interrupt timing has > variability > (e.g due to irq off regions or due to interrupt mitigation or > higher priority interrupts) too > True, but occasionally it's the best approximation to original send time. > > > Yes, but why ignore local scheduling delays? > > > > Because one would want to ignore even network scheduling delays > > if possible... unfortunately in some instances it's not. > > The local delays add to the user experience too. > It's unclear why you want to ignore those. > > -Andi You don't want to ignore them, you want to compensate for them by getting an earlier timestamp. ^ permalink raw reply [flat|nested] 69+ messages in thread
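The correction Joe describes, comparing the average skew between the two streams using receive-side measurements, could look roughly like the sketch below. It is one possible reading of his description, not his actual implementation. The struct and function names are hypothetical, and it assumes each packet carries a sender timestamp and has been given a receive timestamp (for example from SO_TIMESTAMP), both in seconds.

```c
#include <assert.h>
#include <stddef.h>

/* One packet: timestamp written by the sender, timestamp taken on receipt. */
struct packet_times {
    double sender_ts; /* seconds, sender's (possibly skewed) clock */
    double rx_ts;     /* seconds, receiver's clock */
};

/* Average (receive time - sender time) over a window of packets. */
double mean_offset(const struct packet_times *p, size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(p != NULL && n > 0);
    for (i = 0; i < n; i++)
        sum += p[i].rx_ts - p[i].sender_ts;
    return sum / (double)n;
}

/*
 * Relative skew between an audio and a video stream over the same window.
 * If both streams share the same path, the common transit delay cancels
 * and what remains is the difference between the two sender clocks.
 */
double relative_skew(const struct packet_times *audio, size_t n_audio,
                     const struct packet_times *video, size_t n_video)
{
    return mean_offset(audio, n_audio) - mean_offset(video, n_video);
}
```

The less variance there is in rx_ts, the fewer packets are needed for the averages to settle, which is the argument for taking the timestamp as early as possible on the receiving host.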
* Re: loaded router, excessive getnstimeofday in oprofile
2008-08-29 15:21 ` Joe Malicki
2008-08-29 15:30 ` Andi Kleen
@ 2008-08-29 20:43 ` Evgeniy Polyakov
1 sibling, 0 replies; 69+ messages in thread
From: Evgeniy Polyakov @ 2008-08-29 20:43 UTC (permalink / raw)
To: Joe Malicki
Cc: Andi Kleen, David Miller, dada1, denys, netdev, linux-kernel, juhlenko, sammy

On Fri, Aug 29, 2008 at 11:21:26AM -0400, Joe Malicki (jmalicki@metacarta.com) wrote:
> > But didn't you really want a "end2end" time stamp in this case,
> > as in really at the end of all kernel/hardware queues on your side.
>
> No.
>
> That adds variance, and packets aren't comparable because they may
> suffer different kernel/hardware delays.
>
> The goal is to approximate original sendtime when the application-level
> timestamps are unreliable. The more queueing delays that can be
> taken out of the timestamp, the better.

Just a note from someone who actually developed real-time audio and
video processing engines: _no_one_ really relies on the timestamps
attached to the received packet. By no one I really mean NO ONE. It is
just wrong, broken and stupid. There are so many queues in the data
path that it just can not be reliable by definition.

Instead the sending path encapsulates a packet sequence number into the
appropriate packet header (such as, and in most cases only, the RTP
header), and the receiving path just multiplies this sequence number by
the compression rate and the size of the packet.

These numbers differ from design to design, but the overall approach is
the same: no one really depends on the hardware timestamp attached on
the receiver, only the sender's data is reliable. If someone depends on
it, it is broken and just waits for the appropriate attack vector to
inject broken data into the dataflow (such users do not use tcp, since
it "introduces unneeded delays" or similar marketing and completely
untested things).

So this overall discussion of the timestamp option is meaningless: we
just bloody can not change it as is, since so many applications really
depend on it (even if they should not).

We can force lower resolution in terms of xtime or a similar counter,
which will be the default timestamp in case of some syscall (turned off
by default), but since so far no one sent a patch, this looks very
subtle.

--
	Evgeniy Polyakov

^ permalink raw reply [flat|nested] 69+ messages in thread
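The sender-driven scheme Evgeniy describes, deriving media time from the packet sequence number rather than from a receive timestamp, can be sketched as follows. This is a simplified illustration under stated assumptions: every packet carries the same number of samples, the sequence number is 16 bits wide as in RTP, and no more than 65535 packets separate the two sequence numbers. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Media time of packet `seq` relative to packet `first_seq`, in
 * microseconds, computed from sender data alone.
 */
uint64_t media_offset_us(uint16_t first_seq, uint16_t seq,
                         uint32_t samples_per_packet,
                         uint32_t sample_rate_hz)
{
    uint16_t packets;
    uint64_t samples;

    assert(sample_rate_hz > 0);

    /* Unsigned 16-bit subtraction handles sequence number wraparound. */
    packets = (uint16_t)(seq - first_seq);
    samples = (uint64_t)packets * samples_per_packet;
    return samples * 1000000ULL / sample_rate_hz;
}
```

For 8 kHz audio with 160 samples per packet (20 ms each), media_offset_us(65530, 4, 160, 8000) spans ten packets across the wraparound and yields 200000 microseconds, whatever queueing the packets suffered on the way.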
* Re: loaded router, excessive getnstimeofday in oprofile
2008-08-28 6:28 ` Joe Malicki
2008-08-28 7:22 ` Andi Kleen
@ 2008-08-28 18:00 ` Rick Jones
2008-08-28 19:42 ` David Miller
2008-09-01 2:39 ` Valdis.Kletnieks
1 sibling, 2 replies; 69+ messages in thread
From: Rick Jones @ 2008-08-28 18:00 UTC (permalink / raw)
To: Joe Malicki
Cc: David Miller, andi, johnpol, dada1, denys, netdev, linux-kernel, juhlenko, sammy

> What utility does the time of hitting the socket get you?

The earliest time the application could have been expected to start
processing the request. Until it hits the socket, it might as well be
somewhere in the cloud.

By that reasoning of course, one could argue that a gettimeofday() call
immediately following recv() would suffice.

Earlier in the thread mention was made of financial services types. If
someone has knowledge of the (probably) arcane rules under which they
must operate it would be great to hear more. Does some entity like the
SEC (Securities and Exchange Commission in the United States) mandate
some sort of timestamp for when the trading request "arrives at the
trading system" and do they define what "arriving at the trading system"
means?

rick jones

^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile
2008-08-28 18:00 ` Rick Jones
@ 2008-08-28 19:42 ` David Miller
2008-08-28 20:29 ` Rick Jones
2008-09-01 2:39 ` Valdis.Kletnieks
1 sibling, 1 reply; 69+ messages in thread
From: David Miller @ 2008-08-28 19:42 UTC (permalink / raw)
To: rick.jones2
Cc: jmalicki, andi, johnpol, dada1, denys, netdev, linux-kernel, juhlenko, sammy

From: Rick Jones <rick.jones2@hp.com>
Date: Thu, 28 Aug 2008 11:00:48 -0700

> Earlier in the thread mention was made of financial services types.
> If someone has knowledge of the (probably) arcane rules under which
> they must operate it would be great to hear more. Does some entity
> like the SEC (Securities and Exchange Commission in the United
> States) mandate some sort of timestamp for when the trading request
> "arrives at the trading system" and do they define what "arriving at
> the trading system" means?

The issue is the ordering of processing the requests.

So if request A arrived on interface 1 before request B arrived on
interface 2, the trade described in A should be performed before the
one in B.

This is not "arcane" as you seem to suppose it might be, but rather
pretty clear fair handling of requests sent between trading desks.

^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile 2008-08-28 19:42 ` David Miller @ 2008-08-28 20:29 ` Rick Jones 2008-08-28 20:32 ` David Miller 0 siblings, 1 reply; 69+ messages in thread From: Rick Jones @ 2008-08-28 20:29 UTC (permalink / raw) To: David Miller Cc: jmalicki, andi, johnpol, dada1, denys, netdev, linux-kernel, juhlenko, sammy > > The issue is the ordering of processing the requests. > > So if request A arrived on interface 1 before request B arrived on > interface 2, the trade described in A should be performed before the > one in B. > > This is not "arcance" as you seem to suppose it might be, but rather > pretty clear fair handling or requests sent between trading desks. Has the request "hit the trading system" when it hits the NIC, or when it hits the application executing the trade? If the SEC calls for when it hits the NIC, then none of what is done today is really accurate/correct and one would need to start using NIC HW timestamps, synchronized with the host and the other NICs in the system no? The way things are today, there really isn't much guarantee that hitting NIC 1 before NIC 2 will result in a driver-generated timestamp for the NIC 1 packet which is before the driver-generated timestamp for the NIC 2 packet. It will be luck of the interrupt coalescing interaction with other traffic on the NIC and/or polling out of NAPI right? rick jones ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile 2008-08-28 20:29 ` Rick Jones @ 2008-08-28 20:32 ` David Miller 2008-08-28 20:45 ` Rick Jones 0 siblings, 1 reply; 69+ messages in thread From: David Miller @ 2008-08-28 20:32 UTC (permalink / raw) To: rick.jones2 Cc: jmalicki, andi, johnpol, dada1, denys, netdev, linux-kernel, juhlenko, sammy From: Rick Jones <rick.jones2@hp.com> Date: Thu, 28 Aug 2008 13:29:30 -0700 > Has the request "hit the trading system" when it hits the NIC, or > when it hits the application executing the trade? If the SEC calls > for when it hits the NIC, then none of what is done today is really > accurate/correct and one would need to start using NIC HW > timestamps, synchronized with the host and the other NICs in the > system no? The SEC isn't mandating anything here, stop framing it that way :-) People simply won't trade with a firm if they find out that trades there are executed out of order. They are simply trying to make things as fair as possible. ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile 2008-08-28 20:32 ` David Miller @ 2008-08-28 20:45 ` Rick Jones 2008-08-28 20:47 ` David Miller 0 siblings, 1 reply; 69+ messages in thread From: Rick Jones @ 2008-08-28 20:45 UTC (permalink / raw) To: David Miller Cc: jmalicki, andi, johnpol, dada1, denys, netdev, linux-kernel, juhlenko, sammy David Miller wrote: > From: Rick Jones <rick.jones2@hp.com> > Date: Thu, 28 Aug 2008 13:29:30 -0700 > > >>Has the request "hit the trading system" when it hits the NIC, or >>when it hits the application executing the trade? If the SEC calls >>for when it hits the NIC, then none of what is done today is really >>accurate/correct and one would need to start using NIC HW >>timestamps, synchronized with the host and the other NICs in the >>system no? > > > The SEC isn't mandating anything here, stop framing it that way :-) Must be my DC upbringing. I figured that if the logic wasn't 100% concrete a US Federal Bureaucracy must be involved :) > People simply won't trade with a firm if they find out that trades > there are executed out of order. > > They are simply trying to make things as fair as possible. But that is the very crux of the question - exactly where is "in order" to be determined? Is it supposed to be arrival time at the NIC HW, initial notice by the driver, or initial notice by the trading application? Given that there are no guarantees that a packet arriving on NIC 1 and timestamped either by the NIC HW or the driver will actually hit the application before a packet arriving on NIC2, just how long are these financial services applications going to wait around before executing the trade carried in the packet arriving on NIC1? rick jones ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile 2008-08-28 20:45 ` Rick Jones @ 2008-08-28 20:47 ` David Miller 0 siblings, 0 replies; 69+ messages in thread From: David Miller @ 2008-08-28 20:47 UTC (permalink / raw) To: rick.jones2 Cc: jmalicki, andi, johnpol, dada1, denys, netdev, linux-kernel, juhlenko, sammy From: Rick Jones <rick.jones2@hp.com> Date: Thu, 28 Aug 2008 13:45:40 -0700 > Given that there are no guarantees that a packet arriving on NIC 1 > and timestamped either by the NIC HW or the driver will actually hit > the application before a packet arriving on NIC2, just how long are > these financial services applications going to wait around before > executing the trade carried in the packet arriving on NIC1? I have no idea. They also care about trade processing latency btw. ^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile
2008-08-28 18:00 ` Rick Jones
2008-08-28 19:42 ` David Miller
@ 2008-09-01 2:39 ` Valdis.Kletnieks
2008-09-01 3:51 ` David Miller
1 sibling, 1 reply; 69+ messages in thread
From: Valdis.Kletnieks @ 2008-09-01 2:39 UTC (permalink / raw)
To: Rick Jones
Cc: Joe Malicki, David Miller, andi, johnpol, dada1, denys, netdev, linux-kernel, juhlenko, sammy

On Thu, 28 Aug 2008 11:00:48 PDT, Rick Jones said:

> Earlier in the thread mention was made of financial services types. If
> someone has knowledge of the (probably) arcane rules under which they
> must operate it would be great to hear more. Does some entity like the
> SEC (Securities and Exchange Commission in the United States) mandate
> some sort of timestamp for when the trading request "arrives at the
> trading system" and do they define what "arriving at the trading system"
> means?

As a totally pragmatic point - if the market is in such free-fall that it
matters that your order got in a 10 thousandth of a second after somebody
else's, instead of before, you probably lost at least 5 to 10 times as much
during the time it took somebody to type the damn order in and hit enter.

At that point, you have *bigger* things to worry about.

^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile
2008-09-01 2:39 ` Valdis.Kletnieks
@ 2008-09-01 3:51 ` David Miller
2008-09-01 4:08 ` Valdis.Kletnieks
2008-09-02 17:04 ` Rick Jones
0 siblings, 2 replies; 69+ messages in thread
From: David Miller @ 2008-09-01 3:51 UTC (permalink / raw)
To: Valdis.Kletnieks
Cc: rick.jones2, jmalicki, andi, johnpol, dada1, denys, netdev, linux-kernel, juhlenko, sammy

From: Valdis.Kletnieks@vt.edu
Date: Sun, 31 Aug 2008 22:39:16 -0400

> As a totally pragmatic point - if the market is in such free-fall that it
> matters that your order got in a 10 thousandth of a second after somebody
> else's, instead of before, you probably lost at least 5 to 10 times as much
> during the time it took somebody to type the damn order in and hit enter.

Many trades are made programmatically using formulas and computer
algorithms in response to market activity and other trades of the
same security.

There is no typing involved :)

I don't think anyone in this thread can even pretend to understand how
any of this stuff works, that's why I'm starting to consider this
thread completely pointless.

If the financial folks say they need this stuff, then unless we're
prepared to become experts in financial markets and how the IT stuff
for them is designed and run, we might as well just trust them on
this one.

^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile
2008-09-01 3:51 ` David Miller
@ 2008-09-01 4:08 ` Valdis.Kletnieks
2008-09-01 4:10 ` David Miller
2008-09-02 17:04 ` Rick Jones
1 sibling, 1 reply; 69+ messages in thread
From: Valdis.Kletnieks @ 2008-09-01 4:08 UTC (permalink / raw)
To: David Miller
Cc: rick.jones2, jmalicki, andi, johnpol, dada1, denys, netdev, linux-kernel, juhlenko, sammy

On Sun, 31 Aug 2008 20:51:45 PDT, David Miller said:

> Many trades are made programmatically using formulas and computer
> algorithms in response to market activity and other trades of the
> same security.
>
> There is no typing involved :)

Still the same issue - the time delay in getting the ticker tape values back,
making the decision, and launching the transaction are *way* bigger than the
packet queueing order. Think - trading a few billion shares a day, if that
ticker is even 5 seconds behind, *that* is a much bigger issue than what order
the transactions happen in...

> If the financial folks say they need this stuff, then unless we're
> prepared to become experts in financial markets and how the IT stuff
> for them is designed and run, we might as well just trust them on
> this one.

The toughest part of systems analysis is getting the user to shut up about
what they say they need long enough for you to find out what it is they are
actually trying to do.

Quite frankly, unless somebody *IS* planning to become an expert on how the
IT stuff for them is designed and run, we *should not* be doing any code
changes that we don't understand, just because they say so and we should
trust them...

^ permalink raw reply [flat|nested] 69+ messages in thread
* Re: loaded router, excessive getnstimeofday in oprofile
2008-09-01 4:08 ` Valdis.Kletnieks
@ 2008-09-01 4:10 ` David Miller
0 siblings, 0 replies; 69+ messages in thread
From: David Miller @ 2008-09-01 4:10 UTC (permalink / raw)
To: Valdis.Kletnieks
Cc: rick.jones2, jmalicki, andi, johnpol, dada1, denys, netdev, linux-kernel, juhlenko, sammy

From: Valdis.Kletnieks@vt.edu
Date: Mon, 01 Sep 2008 00:08:14 -0400

> Quite frankly, unless somebody *IS* planning to become an expert on how the
> IT stuff for them are designed and run, we *should not* be doing any code
> changes that we don't understand, just because they say so and we should
> trust them...

Exactly! We should not change where and how packet timestamps are
taken! Thanks!

It is what I have been advocating this whole thread :-)
* Re: loaded router, excessive getnstimeofday in oprofile
2008-09-01 3:51 ` David Miller
2008-09-01 4:08 ` Valdis.Kletnieks
@ 2008-09-02 17:04 ` Rick Jones
1 sibling, 0 replies; 69+ messages in thread
From: Rick Jones @ 2008-09-02 17:04 UTC (permalink / raw)
To: David Miller
Cc: Valdis.Kletnieks, jmalicki, andi, johnpol, dada1, denys, netdev, linux-kernel, juhlenko, sammy

> I don't think anyone in this thread can even pretend to understand how
> any of this stuff works

So last week I started asking FSI contacts I've built up while answering
their "how to improve latency" questions. The intent is to get some direct
input from them on just what their and their regulators' expectations are
wrt timestamping of packets and/or trades.

rick jones
* Re: loaded router, excessive getnstimeofday in oprofile
2008-08-22 1:57 loaded router, excessive getnstimeofday in oprofile Denys Fedoryshchenko
` (2 preceding siblings ...)
2008-08-26 20:14 ` Evgeniy Polyakov
@ 2008-08-28 3:35 ` Stephen Hemminger
2008-08-28 8:49 ` Denys Fedoryshchenko
3 siblings, 1 reply; 69+ messages in thread
From: Stephen Hemminger @ 2008-08-28 3:35 UTC (permalink / raw)
To: Denys Fedoryshchenko; +Cc: netdev, linux-kernel

On Fri, 22 Aug 2008 04:57:40 +0300
Denys Fedoryshchenko <denys@visp.net.lb> wrote:

> I have loaded router (~650 Mbps In+Out), based on 2xAMD Opteron 248, Sun Fire
> X4100. HPET timer available (TSC seems not available on this platform).
> Network interfaces is onboard, connected over PCI-X.
>
> Right now i am using only one processor, cause using only one interface and
> interrupts stick to it. Other is almost not used.
> At peak time i notice in mpstat, that this processor is almost "dead", and if
> i run minor application consuming resources - ping over this router will be
> terrible. For me it is clear - system overloaded. I did oprofile, and here is
> result (at low load time, but at peak time it is very similar).
>
> CPU: AMD64 processors, speed 2193.74 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit
> mask of 0x00 (No unit mask) count 100000
> CPU_CLK_UNHALT...|
>   samples|      %|
> ------------------
>   2679376 71.9851 vmlinux
>    287212  7.7163 e1000
>    278674  7.4870 ip_tables
>    259923  6.9832 nf_conntrack
>     29699  0.7979 iptable_nat
>     26752  0.7187 nf_nat
>     26093  0.7010 nf_conntrack_ipv4
>     16525  0.4440 iptable_mangle
>     14988  0.4027 oprofiled
>
>
> CPU: AMD64 processors, speed 2193.74 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit
> mask of 0x00 (No unit mask) count 100000
> samples  %        symbol name
> 1031727  37.1736  getnstimeofday
> 230457    8.3035  __napi_schedule
> 122154    4.4013  __do_softirq
> 110036    3.9647  dev_queue_xmit
> 88800     3.1995  net_rx_action
> 71163     2.5640  ip_route_input
> 52232     1.8819  local_bh_enable
> 43804     1.5783  get_next_timer_interrupt
> 43387     1.5633  ip_forward
> 35501     1.2791  nf_iterate
> 35212     1.2687  __slab_alloc
> 34652     1.2485  default_idle
> 32375     1.1665  kfree
> 28127     1.0134  kmem_cache_alloc
>
> What is bothering me, why getnstimeofday called so much? Even i remove HTB
> shaper, it still takes 30-40% of whole vmlinux time. From other
> applications - only zebra is running.
> Any ideas?

What kernel version is this? There was a fix to AF_PACKET about a year ago
to reduce this.
* Re: loaded router, excessive getnstimeofday in oprofile
2008-08-28 3:35 ` Stephen Hemminger
@ 2008-08-28 8:49 ` Denys Fedoryshchenko
0 siblings, 0 replies; 69+ messages in thread
From: Denys Fedoryshchenko @ 2008-08-28 8:49 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev, linux-kernel

On Thursday 28 August 2008, Stephen Hemminger wrote:
>
> What kernel version is this? There was a fix to AF_PACKET about a year ago
> to reduce this.

It is git net-2.6, based on 2.6.27-rc3, so it is very fresh.