* Re: UDP packet loss when running lsof
From: John Miller @ 2007-05-21 22:12 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev
Hi Eric,
> I CCed netdev since this stuff is about networking and not
> lkml.
Ok, dropped the CC...
> What kind of machine do you have? SMP or not?
It's an HP system with two dual-core CPUs at 3 GHz; the storage
is connected through a QLogic FC HBA. It should really be fast
enough to handle a data stream of 50 MB/s...
> If you have many sockets on this machine, lsof can be
> very slow reading /proc/net/tcp and/or /proc/net/udp,
> locking some tables long enough to drop packets.
First I tried with one UDP socket, and during the tests I switched
to 16 sockets, with no effect. As I have removed nearly all daemons,
there aren't many open sockets.
/proc/net/tcp seems to be one cause of the problem: a simple
"cat /proc/net/tcp" nearly always leads to immediate UDP packet
loss. So it seems that reading TCP statistics blocks UDP
packet processing.
As it isn't my goal to collect statistics all the time, I could
live with disabling access to /proc/net/tcp, but I wouldn't call
this a good solution...
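To make the trigger concrete, this is roughly what reproduces it (a
sketch: the background loop and the netstat check are my shorthand
here, not an exact capture from the tests):

  # trigger: walk the TCP hash table through procfs
  while true; do cat /proc/net/tcp > /dev/null; done &
  # observe: "packet receive errors" climbs when datagrams are dropped
  netstat -su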
> If you have a low count of tcp sockets, you might want to
> boot with thash_entries=2048 or so, to reduce tcp hash
> table size.
This helped a lot: I tried thash_entries=10, and now only a
while loop around the "cat ...tcp" triggers packet loss. Tests
are now running, and I can say more tomorrow.
Getting information about thash_entries is really hard, even
finding out the default value: for a system with 2 GB of RAM
it could be around 100000.
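For anyone else chasing this, a sketch of setting and verifying the
parameter (the GRUB config path and the exact dmesg wording are
assumptions and vary by distro and kernel):

  # append to the kernel command line, e.g. in /boot/grub/menu.lst:
  #   kernel /vmlinuz-2.6.18 ro root=/dev/sda1 thash_entries=2048
  # after a reboot, the size the kernel actually chose is in the boot log:
  dmesg | grep -i "hash tables"
  # e.g.: TCP: Hash tables configured (established 2048 bind 2048)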
> no RcvbufErrors errors either?
The kernel is a bit too old (2.6.18). Looking at the patch
from 2.6.18 to 2.6.19 I found that RcvbufErrors is only
incremented when InErrors is incremented. So my answer would be
yes.
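(The counters behind that can be read directly; a sketch, noting that
the column layout differs between these kernel versions:)

  # 2.6.18 reports InErrors; 2.6.19+ adds a RcvbufErrors column here
  grep "^Udp:" /proc/net/snmp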
> > - Network card is handled by bnx2 kernel module
> I don't know this NIC; does it support ethtool?
It is a "Broadcom Corporation NetXtreme II BCM5708S
Gigabit Ethernet (rev 12)", and ethtool seems to be supported.
The output below was captured after packet loss (I don't see
any hints in it, but maybe you do):
> ethtool -S eth0
NIC statistics:
rx_bytes: 155481467364
rx_error_bytes: 0
tx_bytes: 5492161
tx_error_bytes: 0
rx_ucast_packets: 18341
rx_mcast_packets: 137321933
rx_bcast_packets: 2380
tx_ucast_packets: 14416
tx_mcast_packets: 190
tx_bcast_packets: 8
tx_mac_errors: 0
tx_carrier_errors: 0
rx_crc_errors: 0
rx_align_errors: 0
tx_single_collisions: 0
tx_multi_collisions: 0
tx_deferred: 0
tx_excess_collisions: 0
tx_late_collisions: 0
tx_total_collisions: 0
rx_fragments: 0
rx_jabbers: 0
rx_undersize_packets: 0
rx_oversize_packets: 0
rx_64_byte_packets: 244575
rx_65_to_127_byte_packets: 6828
rx_128_to_255_byte_packets: 167
rx_256_to_511_byte_packets: 94
rx_512_to_1023_byte_packets: 393
rx_1024_to_1522_byte_packets: 137090597
rx_1523_to_9022_byte_packets: 0
tx_64_byte_packets: 52
tx_65_to_127_byte_packets: 7547
tx_128_to_255_byte_packets: 3304
tx_256_to_511_byte_packets: 399
tx_512_to_1023_byte_packets: 897
tx_1024_to_1522_byte_packets: 2415
tx_1523_to_9022_byte_packets: 0
rx_xon_frames: 0
rx_xoff_frames: 0
tx_xon_frames: 0
tx_xoff_frames: 0
rx_mac_ctrl_frames: 0
rx_filtered_packets: 158816
rx_discards: 0
rx_fw_discards: 0
> ethtool -c eth0
Coalesce parameters for eth1:
Adaptive RX: off TX: off
stats-block-usecs: 999936
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0
rx-usecs: 18
rx-frames: 6
rx-usecs-irq: 18
rx-frames-irq: 6
tx-usecs: 80
tx-frames: 20
tx-usecs-irq: 80
tx-frames-irq: 20
rx-usecs-low: 0
rx-frame-low: 0
tx-usecs-low: 0
tx-frame-low: 0
rx-usecs-high: 0
rx-frame-high: 0
tx-usecs-high: 0
tx-frame-high: 0
> ethtool -g eth0
Ring parameters for eth1:
Pre-set maximums:
RX: 1020
RX Mini: 0
RX Jumbo: 0
TX: 255
Current hardware settings:
RX: 100
RX Mini: 0
RX Jumbo: 0
TX: 255
> Just to make sure, does your application set up a huge
> enough SO_RCVBUF value?
Yes, my first try with one socket was 5 MB, but I also tested
with 10 and even 25 MB. With 16 sockets I also set it to 5 MB.
When I pause the application, netstat shows the filled buffers.
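One detail worth spelling out (a sketch; 26214400 is simply 25 MB in
bytes): SO_RCVBUF requests above net.core.rmem_max are silently
capped, so the ceiling has to be raised for buffers this large to
take effect:

  # allow receive buffers of up to 25 MB; larger setsockopt() requests
  # would otherwise be capped at the old rmem_max
  sysctl -w net.core.rmem_max=26214400
  # with the receiver paused, the backlog shows in the Recv-Q column
  netstat -u -n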
> What values do you have in /proc/sys/net/ipv4/tcp_rmem?
I kept the default values there:
4096 43689 87378
> cat /proc/meminfo
MemTotal: 2060664 kB
MemFree: 146536 kB
Buffers: 10984 kB
Cached: 1667740 kB
SwapCached: 0 kB
Active: 255228 kB
Inactive: 1536352 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 2060664 kB
LowFree: 146536 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 820740 kB
Writeback: 112 kB
Mapped: 127612 kB
Slab: 104184 kB
CommitLimit: 1030332 kB
Committed_AS: 774944 kB
PageTables: 1928 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 6924 kB
VmallocChunk: 34359731259 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
Hugepagesize: 2048 kB
Thanks for your help!
Regards,
John
* Re: UDP packet loss when running lsof
From: Eric Dumazet @ 2007-05-22 6:10 UTC (permalink / raw)
To: John Miller; +Cc: netdev
John Miller wrote:
> Hi Eric,
>
>> I CCed netdev since this stuff is about networking and not
>> lkml.
>
> Ok, dropped the CC...
>
>> What kind of machine do you have? SMP or not?
>
> It's an HP system with two dual-core CPUs at 3 GHz; the storage
> is connected through a QLogic FC HBA. It should really be fast
> enough to handle a data stream of 50 MB/s...
Then you might try to bind the network IRQ to one CPU
(echo 1 > /proc/irq/XX/smp_affinity)
XX being your NIC's interrupt (cat /proc/interrupts to find it),
and bind your user program to other CPU(s).
You might hit a cond_resched_softirq() bug that Ingo and others are sorting
out right now. Using a separate CPU for softirq handling and your programs
should help a lot here.
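Something like this, as a sketch (IRQ number 16, eth0 and the program
name are placeholders; taskset ships with schedutils/util-linux):

  grep eth0 /proc/interrupts           # find the NIC interrupt, say 16
  echo 1 > /proc/irq/16/smp_affinity   # mask 0x1 = CPU0 services the NIC
  taskset -c 1-3 ./udp_receiver        # keep the application off CPU0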
>> If you have a low count of tcp sockets, you might want to
>> boot with thash_entries=2048 or so, to reduce tcp hash
>> table size.
>
> This helped a lot: I tried thash_entries=10, and now only a
> while loop around the "cat ...tcp" triggers packet loss. Tests
> are now running, and I can say more tomorrow.
I don't understand here: does using a small thash_entries make the bug
always appear?
* Re: UDP packet loss when running lsof
From: Eric Dumazet @ 2007-05-22 6:47 UTC (permalink / raw)
To: Eric Dumazet; +Cc: John Miller, netdev
Eric Dumazet wrote:
> Then you might try to bind the network IRQ to one CPU
> (echo 1 > /proc/irq/XX/smp_affinity)
>
> XX being your NIC's interrupt (cat /proc/interrupts to find it),
>
> and bind your user program to other CPU(s).
>
> You might hit a cond_resched_softirq() bug that Ingo and others are
> sorting out right now. Using a separate CPU for softirq handling and your
> programs should help a lot here.
You might try this patch, now that Ingo has signed it off:
http://marc.info/?l=linux-kernel&m=117981607429875&w=2
I guess that with correct softirq rescheduling there is no need to play
with IRQ affinities, unless you really want to push performance.
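Applying it would look roughly like this (a sketch; the patch file
name is a placeholder, the raw diff is linked from that page):

  cd /usr/src/linux-2.6.18
  patch -p1 --dry-run < cond_resched_softirq.patch   # check it applies
  patch -p1 < cond_resched_softirq.patch
  # then rebuild the kernel and reboot into it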
* Re: UDP packet loss when running lsof
From: John Miller @ 2007-05-22 22:42 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev
Hi Eric,
> > It's an HP system with two dual-core CPUs at 3 GHz; the
> Then you might try to bind the network IRQ to one CPU
> (echo 1 > /proc/irq/XX/smp_affinity)
> XX being your NIC's interrupt (cat /proc/interrupts to find it),
> and bind your user program to other CPU(s).
The NIC was already fixed on CPU0, and the IRQ balancer switched
the timer interrupt between all CPUs and the storage HBA between
CPU1 and CPU4. Stopping the balancer, leaving the NIC alone on CPU0,
and putting the other interrupts and my program on CPU2-4 did not
improve the situation.
At least I could not see an improvement over just adding
thash_entries=2048.
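For the record, the pinning above boiled down to something like this
(a sketch; the init script name follows this system and is often
irqbalance elsewhere, and the receiver name is a placeholder):

  /etc/init.d/irq_balancer stop     # stop the IRQ balancer
  cat /proc/irq/XX/smp_affinity     # XX = NIC interrupt; 1 means CPU0 only
  taskset -c 2-4 ./receiver         # the CPU2-4 placement described above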
> You might hit a cond_resched_softirq() bug that Ingo and others
> are sorting out right now. Using a separate CPU for softirq
> handling and your programs should help a lot here.
Shouldn't I get some syslog messages if this bug is triggered?
Nevertheless I also opened a support call with Novell about this
issue, as the current cond_resched_softirq() looks completely
different from the one in 2.6.18.
> > This helped a lot: I tried thash_entries=10, and now only a
> > while loop around the "cat ...tcp" triggers packet loss. Tests
> I don't understand here: does using a small thash_entries make
> the bug always appear?
No. thash_entries=10 improves the situation. Without the parameter
nearly every read of /proc/net/tcp leads to packet loss; with
thash_entries=10 (or 2048, it does not matter) I have to run a
"while true; do cat /proc/net/tcp; done" loop to get packet loss
every minute.
But even with thash_entries=10, and even if I leave my program
alone on the system, I get packet loss every few hours.
Regards,
John