* Re: UDP packet loss when running lsof
From: John Miller @ 2007-05-21 22:12 UTC
To: Eric Dumazet; +Cc: netdev

Hi Eric,

> I CCed netdev since this stuff is about network and not lkml.

Ok, dropped the lkml CC...

> What kind of machine do you have ? SMP or not ?

It's an HP system with two dual-core CPUs at 3 GHz; the storage
system is connected through a QLogic FC HBA. It should really be
fast enough to handle a data stream of 50 MB/s...

> If you have many sockets on this machine, lsof can be very slow
> reading /proc/net/tcp and/or /proc/net/udp, locking some tables
> long enough to drop packets.

First I tried with one UDP socket, and during the tests I switched
to 16 sockets with no effect. As I removed nearly all daemons,
there aren't many open sockets.

/proc/net/tcp seems to be one cause of the problem: a simple
"cat /proc/net/tcp" nearly always leads to immediate UDP packet
loss. So it seems that reading TCP statistics blocks UDP packet
processing.

As it isn't my goal to collect statistics all the time, I could
live with disabling access to /proc/net/tcp, but I wouldn't call
that a good solution...

> If you have a low count of tcp sockets, you might want to boot
> with thash_entries=2048 or so, to reduce tcp hash table size.

This did help a lot: I tried thash_entries=10, and now only a
while loop around the "cat ...tcp" triggers packet loss. Tests are
running now and I can say more tomorrow.

Getting information about thash_entries is really hard, even
finding out the default value: for a system with 2 GB RAM it could
be around 100000 (a way to read back the configured size is
sketched below).

> no RcvbufErrors error as well ?

The kernel is a bit too old (2.6.18). Looking at the patch from
2.6.18 to 2.6.19 I found that RcvbufErrors is only increased when
InErrors is increased. So my answer would be yes.

> > - Network card is handled by bnx2 kernel module
> I dont know this NIC, does it support ethtool ?

It is a "Broadcom Corporation NetXtreme II BCM5708S Gigabit
Ethernet (rev 12)", and ethtool is supported.
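The size the kernel actually picks is printed once at boot, so it
can be read back without digging through the source. A minimal
check; the sizes and grub paths shown here are illustrative, not
taken from this machine:

  dmesg | grep 'Hash tables'
  # TCP: Hash tables configured (established 131072 bind 65536)

  # shrink it at the next boot via the kernel command line,
  # e.g. in a grub legacy entry:
  kernel /vmlinuz-2.6.18 ro root=/dev/sda1 thash_entries=2048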
The output below was captured after packet loss (I don't see any
hints in it, but maybe you do):

> ethtool -S eth0

NIC statistics:
     rx_bytes: 155481467364
     rx_error_bytes: 0
     tx_bytes: 5492161
     tx_error_bytes: 0
     rx_ucast_packets: 18341
     rx_mcast_packets: 137321933
     rx_bcast_packets: 2380
     tx_ucast_packets: 14416
     tx_mcast_packets: 190
     tx_bcast_packets: 8
     tx_mac_errors: 0
     tx_carrier_errors: 0
     rx_crc_errors: 0
     rx_align_errors: 0
     tx_single_collisions: 0
     tx_multi_collisions: 0
     tx_deferred: 0
     tx_excess_collisions: 0
     tx_late_collisions: 0
     tx_total_collisions: 0
     rx_fragments: 0
     rx_jabbers: 0
     rx_undersize_packets: 0
     rx_oversize_packets: 0
     rx_64_byte_packets: 244575
     rx_65_to_127_byte_packets: 6828
     rx_128_to_255_byte_packets: 167
     rx_256_to_511_byte_packets: 94
     rx_512_to_1023_byte_packets: 393
     rx_1024_to_1522_byte_packets: 137090597
     rx_1523_to_9022_byte_packets: 0
     tx_64_byte_packets: 52
     tx_65_to_127_byte_packets: 7547
     tx_128_to_255_byte_packets: 3304
     tx_256_to_511_byte_packets: 399
     tx_512_to_1023_byte_packets: 897
     tx_1024_to_1522_byte_packets: 2415
     tx_1523_to_9022_byte_packets: 0
     rx_xon_frames: 0
     rx_xoff_frames: 0
     tx_xon_frames: 0
     tx_xoff_frames: 0
     rx_mac_ctrl_frames: 0
     rx_filtered_packets: 158816
     rx_discards: 0
     rx_fw_discards: 0

> ethtool -c eth0

Coalesce parameters for eth1:
Adaptive RX: off  TX: off
stats-block-usecs: 999936
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 18
rx-frames: 6
rx-usecs-irq: 18
rx-frames-irq: 6

tx-usecs: 80
tx-frames: 20
tx-usecs-irq: 80
tx-frames-irq: 20

rx-usecs-low: 0
rx-frame-low: 0
tx-usecs-low: 0
tx-frame-low: 0

rx-usecs-high: 0
rx-frame-high: 0
tx-usecs-high: 0
tx-frame-high: 0

> ethtool -g eth0

Ring parameters for eth1:
Pre-set maximums:
RX:             1020
RX Mini:        0
RX Jumbo:       0
TX:             255
Current hardware settings:
RX:             100
RX Mini:        0
RX Jumbo:       0
TX:             255

> Just to make sure, does your application setup a huge enough
> SO_RCVBUF val?

Yes, my first try with one socket was 5MB, but I also tested with
10 and even 25MB. With 16 sockets I also set it to 5MB. When the
application is paused, netstat shows the filled buffers.

> What values do you have in /proc/sys/net/ipv4/tcp_rmem ?

I kept the default values there:
4096    43689   87378

> cat /proc/meminfo

MemTotal:      2060664 kB
MemFree:        146536 kB
Buffers:         10984 kB
Cached:        1667740 kB
SwapCached:          0 kB
Active:         255228 kB
Inactive:      1536352 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      2060664 kB
LowFree:        146536 kB
SwapTotal:           0 kB
SwapFree:            0 kB
Dirty:          820740 kB
Writeback:         112 kB
Mapped:         127612 kB
Slab:           104184 kB
CommitLimit:   1030332 kB
Committed_AS:   774944 kB
PageTables:       1928 kB
VmallocTotal: 34359738367 kB
VmallocUsed:      6924 kB
VmallocChunk: 34359731259 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB

Thanks for your help!

Regards,
John
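One caveat on the SO_RCVBUF values discussed above: setsockopt()
silently caps the request at net.core.rmem_max, so a 5MB request on
a stock kernel can end up far smaller than intended. A quick way to
check and raise the cap (26214400 here is just an example matching
the 25MB test, not a value from the thread):

  # current hard cap for SO_RCVBUF requests
  sysctl net.core.rmem_max
  # raise it so a 25MB request can actually take effect
  sysctl -w net.core.rmem_max=26214400

The effective size can then be confirmed from the application with
getsockopt(SO_RCVBUF), which reports the buffer size actually in
use.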
* Re: UDP packet loss when running lsof
From: Eric Dumazet @ 2007-05-22 6:10 UTC
To: John Miller; +Cc: netdev

John Miller wrote:
> Hi Eric,
>
> It's an HP system with two dual-core CPUs at 3 GHz; the storage
> system is connected through a QLogic FC HBA. It should really be
> fast enough to handle a data stream of 50 MB/s...

Then you might try to bind the network IRQ to one CPU:

  echo 1 > /proc/irq/XX/smp_affinity

XX being your NIC interrupt (cat /proc/interrupts to catch it),
and bind your user program to another CPU (or CPUs).

You might hit a cond_resched_softirq() bug that Ingo and others are
sorting out right now. Using a separate CPU for softirq handling
and for your programs should help a lot here.

> /proc/net/tcp seems to be one cause of the problem: a simple
> "cat /proc/net/tcp" nearly always leads to immediate UDP packet
> loss. So it seems that reading TCP statistics blocks UDP packet
> processing.
>
> This did help a lot: I tried thash_entries=10, and now only a
> while loop around the "cat ...tcp" triggers packet loss. Tests
> are running now and I can say more tomorrow.

I don't understand here: using a small thash_entries makes the bug
always appear?

> [rest of quoted message snipped]
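In concrete terms, the binding suggested above would look something
like the following. This is only a sketch: the IRQ number 16 and
the receiver name are placeholders, and the real interrupt line has
to be taken from /proc/interrupts:

  # find the NIC's interrupt line
  grep eth0 /proc/interrupts
  # pin it to CPU0 (smp_affinity takes a CPU bitmask; 1 == CPU0)
  echo 1 > /proc/irq/16/smp_affinity
  # keep the receiver off CPU0 so softirq processing has it alone
  taskset -c 1-3 ./udp_receiver

taskset ships with util-linux/schedutils; calling
sched_setaffinity() from inside the program achieves the same
binding.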
* Re: UDP packet loss when running lsof
From: Eric Dumazet @ 2007-05-22 6:47 UTC
To: Eric Dumazet; +Cc: John Miller, netdev

Eric Dumazet wrote:
> Then you might try to bind the network IRQ to one CPU
> (echo 1 > /proc/irq/XX/smp_affinity), XX being your NIC interrupt
> (cat /proc/interrupts to catch it), and bind your user program to
> another CPU (or CPUs).
>
> You might hit a cond_resched_softirq() bug that Ingo and others
> are sorting out right now. Using a separate CPU for softirq
> handling and for your programs should help a lot here.

You might try this patch, now that Ingo has signed off on it:

  http://marc.info/?l=linux-kernel&m=117981607429875&w=2

I guess that with a correct softirq resched there is no need to
play with IRQ affinities, unless you really want to push
performance.
* Re: UDP packet loss when running lsof
From: John Miller @ 2007-05-22 22:42 UTC
To: Eric Dumazet; +Cc: netdev

Hi Eric,

> > It's an HP system with two dual-core CPUs at 3 GHz, [...]
>
> Then you might try to bind the network IRQ to one CPU
> (echo 1 > /proc/irq/XX/smp_affinity), XX being your NIC interrupt
> (cat /proc/interrupts to catch it), and bind your user program to
> another CPU (or CPUs).

The NIC was already fixed to CPU0, and the irqbalance daemon moved
the timer interrupt between all CPUs and the storage HBA between
CPU1 and CPU4. Stopping the balancer, leaving the NIC alone on
CPU0, and putting the other interrupts and my program on CPU2-4
did not improve the situation. At least I could not see an
improvement over just adding thash_entries=2048.

> You might hit a cond_resched_softirq() bug that Ingo and others
> are sorting out right now. Using a separate CPU for softirq
> handling and for your programs should help a lot here.

Shouldn't I get some syslog messages if this bug is triggered?
Nevertheless, I also opened a call with Novell about this issue,
as the current cond_resched_softirq() looks completely different
from the one in 2.6.18.

> > This did help a lot: I tried thash_entries=10, and now only a
> > while loop around the "cat ...tcp" triggers packet loss.
>
> I don't understand here: using a small thash_entries makes the
> bug always appear?

No, thash_entries=10 improves the situation. Without the parameter
nearly every look at /proc/net/tcp leads to packet loss; with
thash_entries=10 (or 2048, it does not matter) I have to run a
"while true; do cat /proc/net/tcp; done" loop to get packet loss
every minute. But even with thash_entries=10, and with my program
left alone on the system, I still get packet loss every few hours.

Regards,
John
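For catching the loss as it happens during such a loop, the UDP
counters are the quickest signal; on 2.6.18 only InErrors will
move, per the observation earlier in the thread. A simple watch,
assuming net-tools' netstat is installed:

  # 'packet receive errors' is the InErrors counter
  watch -n1 'netstat -su'
  # the raw counters are also available directly:
  grep ^Udp: /proc/net/snmp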
Thread overview: 4+ messages (newest: 2007-05-22 22:42 UTC)
[not found] <loom.20070521T090409-252@post.gmane.org>
[not found] ` <20070521113503.6bb70ae4.dada1@cosmosbay.com>
2007-05-21 22:12 ` UDP packet loss when running lsof John Miller
2007-05-22 6:10 ` Eric Dumazet
2007-05-22 6:47 ` Eric Dumazet
2007-05-22 22:42 ` John Miller