* Packet drops observed @ LINUX_MIB_TCPBACKLOGDROP
@ 2014-02-27 2:00 Sharat Masetty
2014-02-27 16:42 ` Rick Jones
2014-02-27 17:54 ` Eric Dumazet
0 siblings, 2 replies; 7+ messages in thread
From: Sharat Masetty @ 2014-02-27 2:00 UTC (permalink / raw)
To: netdev
Hi,
We are trying to achieve category 4 data rates on an ARM device. We
see that with an incoming TCP stream (IP packets coming in and ACKs
going out), lots of packets get dropped when the backlog queue is
full. This is hurting overall TCP throughput. I am trying to
understand the full context of why this queue fills up so often.
From my brief look at the code, it looks to me like the user space
process is slow in pulling the data from the socket buffer, so the
TCP stack is using this backlog queue in the meantime.
This queue is also charged against the main socket buffer allocation.
Can you please explain this backlog queue, and confirm whether my
understanding of it is accurate?
Also, can you suggest any ideas on how to mitigate these drops?
Thanks,
Sharat
* Re: Packet drops observed @ LINUX_MIB_TCPBACKLOGDROP
From: Rick Jones @ 2014-02-27 16:42 UTC (permalink / raw)
To: Sharat Masetty, netdev

On 02/26/2014 06:00 PM, Sharat Masetty wrote:
> Hi,
>
> We are trying to achieve category 4 data rates on an ARM device.

Please forgive my ignorance, but what are "category 4 data rates?"

> We see that with an incoming TCP stream (IP packets coming in and
> ACKs going out), lots of packets get dropped when the backlog queue
> is full. This is hurting overall TCP throughput. I am trying to
> understand the full context of why this queue fills up so often.
>
> From my brief look at the code, it looks to me like the user space
> process is slow in pulling the data from the socket buffer, so the
> TCP stack is using this backlog queue in the meantime. This queue
> is also charged against the main socket buffer allocation.
>
> Can you please explain this backlog queue, and confirm whether my
> understanding of it is accurate?
> Also, can you suggest any ideas on how to mitigate these drops?

Well, there is always the question of why the user process is slow
pulling the data out of the socket. If it is unable to handle this
"category 4 data rate" on a sustained basis, then something has got
to give. If it is only *sometimes* unable to keep up but is otherwise
able to go as fast or faster (so it can clear out a backlog), then you
could consider tweaking the size of the queue. But it would be better
still to find the cause of the occasional slowness and address it.

If you run something which does no processing on the data (e.g.
netperf), are you able to achieve the data rates you seek? At what
level of CPU utilization? From a system you know can generate the
desired data rate, something like:

netperf -H <yourARMsystem> -t TCP_STREAM -C -- -m <what your application sends each time>

If the ARM system is multi-core, I might go with

netperf -H <yourARMsystem> -t TCP_STREAM -C -- -m <sendsize> -o throughput,remote_cpu_util,remote_cpu_peak_util,remote_cpu_peak_id,remote_sd

so netperf will tell you the ID and utilization of the most utilized
CPU on the receiver in addition to the overall CPU utilization.

There might be other netperf options to use depending on just what the
sender is doing - to know which would require knowing more about this
stream of traffic.

happy benchmarking,

rick jones
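While running any of these tests, it helps to confirm that the throughput
dips actually line up with backlog drops. The counter behind
LINUX_MIB_TCPBACKLOGDROP is exported as TCPBacklogDrop in
/proc/net/netstat; a minimal watcher (an illustrative helper, not from the
thread, assuming the long-standing layout of a "TcpExt:" name line followed
by a "TcpExt:" value line) could look like this:

/* backlogwatch.c - print the TCPBacklogDrop counter once per second. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static long read_backlog_drops(void)
{
    FILE *f = fopen("/proc/net/netstat", "r");
    char hdr[4096], val[4096];
    char *h, *v, *hsave, *vsave;
    long drops = -1;

    if (!f)
        return -1;
    while (fgets(hdr, sizeof(hdr), f) && fgets(val, sizeof(val), f)) {
        if (strncmp(hdr, "TcpExt:", 7))
            continue;
        /* Walk the name line and the value line in lock step. */
        h = strtok_r(hdr, " \n", &hsave);
        v = strtok_r(val, " \n", &vsave);
        while (h && v) {
            if (!strcmp(h, "TCPBacklogDrop")) {
                drops = strtol(v, NULL, 10);
                break;
            }
            h = strtok_r(NULL, " \n", &hsave);
            v = strtok_r(NULL, " \n", &vsave);
        }
        break;
    }
    fclose(f);
    return drops;
}

int main(void)
{
    long prev = read_backlog_drops(), cur;

    while (prev >= 0) {
        sleep(1);
        cur = read_backlog_drops();
        if (cur < 0)
            break;
        printf("TCPBacklogDrop: %ld (+%ld)\n", cur, cur - prev);
        prev = cur;
    }
    return 0;
}

Built with "gcc -O2 -o backlogwatch backlogwatch.c" and run alongside the
iperf or netperf test, a steadily climbing delta during the dips would
confirm the correlation suspected here.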
* Re: Packet drops observed @ LINUX_MIB_TCPBACKLOGDROP
From: Sharat Masetty @ 2014-02-27 20:50 UTC (permalink / raw)
To: Rick Jones; +Cc: netdev, harout

Hi Rick,

Category 4 is 150 Mbps downlink and 50 Mbps uplink. We are using iperf
in our tests since it seems to be the most widely used tool out there.
We have not used netperf before and will definitely give it a shot.
Would you know how different iperf is from netperf? Should we expect
similar results?

The TCP throughput is consistently slower, and these drops are not
helping either. We see huge dips consistently, which I am positive are
due to these drops.

Do you know how to tune the length of this socket backlog queue?

I am trying to correlate the iperf behavior (potential slowness) with
these backlog drops. Any pointers on understanding this would be
really helpful as we look for more clues.

Thanks
Sharat

On Thu, Feb 27, 2014 at 9:42 AM, Rick Jones <rick.jones2@hp.com> wrote:
> If you run something which does no processing on the data (e.g.
> netperf), are you able to achieve the data rates you seek? At what
> level of CPU utilization?
* Re: Packet drops observed @ LINUX_MIB_TCPBACKLOGDROP
From: Rick Jones @ 2014-02-28 0:34 UTC (permalink / raw)
To: Sharat Masetty; +Cc: netdev, harout

On 02/27/2014 12:50 PM, Sharat Masetty wrote:
> Hi Rick,
>
> Category 4 is 150 Mbps downlink and 50 Mbps uplink. We are using iperf
> in our tests since it seems to be the most widely used tool out there.
> We have not used netperf before and will definitely give it a shot.
> Would you know how different iperf is from netperf? Should we expect
> similar results?

I would expect them to show similar results for similar test types.

I am not familiar with iperf's CPU utilization reporting. If it doesn't
have any, you can always run top at the same time - be certain to have
it show all CPUs. One can have a saturated CPU on a multiple-CPU system
and still have a low overall CPU utilization. That is what all that
-o <list> stuff was about showing - both the overall utilization and
the ID and utilization of the most utilized CPU. (Of course, if your
system is a single core, all that becomes moot...)

> The TCP throughput is consistently slower, and these drops are not
> helping either. We see huge dips consistently, which I am positive are
> due to these drops.

Iperf may have a rate-limiting option you could use to control how fast
it sends, and then walk that up until you see the peak - all while
looking at CPU utilization, I should think. The netperf benchmark has
rate limiting, but as an optional configuration parameter (configure
--enable-intervals ...) and then a rebuild of the netperf binary. If
you want to take that path, contact me offline and I can go through
the steps.

> Do you know how to tune the length of this socket backlog queue?

I think Eric mentioned that how much of that queue is used will (also)
relate to the size of the socket buffer, so you might try creating a
much larger SO_RCVBUF on the receive side. I assume that iperf has
options for that. If you were using netperf it would be:

netperf -H <arm> -- -S 1M

though for that to "work" you will probably have to tweak
net.core.rmem_max, because that sysctl acts as a bound on explicit
socket buffer size settings.

If I've got the right backlog queue, there is also a sysctl for it -
net.core.netdev_max_backlog.

happy benchmarking,

rick jones
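To complement the sysctl route Rick mentions, the per-socket knob can be
exercised directly. A minimal sketch (error handling omitted, values chosen
only for illustration); note that the kernel caps the request at
net.core.rmem_max and reports back roughly twice the accepted value to
cover its own bookkeeping overhead, and that setting SO_RCVBUF explicitly
disables receive-buffer autotuning for that socket:

#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int req = 1 << 20;              /* ask for a 1 MB receive buffer */
    int got = 0;
    socklen_t len = sizeof(got);

    /* Set before connect()/listen() so window scaling is negotiated
     * with the larger buffer in mind. */
    setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &req, sizeof(req));
    getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &got, &len);
    printf("requested %d bytes, kernel granted %d\n", req, got);

    close(fd);
    return 0;
}

As Eric notes elsewhere in the thread, a bigger rcvbuf also raises the
backlog ceiling, since that limit is sk_rcvbuf + sk_sndbuf.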
* Re: Packet drops observed @ LINUX_MIB_TCPBACKLOGDROP
From: Eric Dumazet @ 2014-02-27 17:54 UTC (permalink / raw)
To: Sharat Masetty; +Cc: netdev

On Wed, 2014-02-26 at 19:00 -0700, Sharat Masetty wrote:
> We see that with an incoming TCP stream (IP packets coming in and
> ACKs going out), lots of packets get dropped when the backlog queue
> is full. This is hurting overall TCP throughput.

You forgot to tell us things like:

1) Kernel version
2) Network driver in use.

Some drivers allocate huge buffers to store incoming frames.

Unfortunately we have to avoid OOM issues in the kernel, so such
drivers are likely to have some skbs dropped by the backlog.
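The driver question matters because the backlog is charged by each skb's
truesize - the full buffer the driver allocated plus metadata - not by the
1500-odd bytes of payload, so oversized per-frame buffers hit the byte
limit after far fewer packets. A back-of-the-envelope illustration (all
numbers hypothetical):

#include <stdio.h>

int main(void)
{
    /* Hypothetical socket limits and per-skb charges, for illustration. */
    int rcvbuf  = 256 * 1024;            /* sk->sk_rcvbuf            */
    int sndbuf  = 256 * 1024;            /* sk->sk_sndbuf            */
    int limit   = rcvbuf + sndbuf;       /* backlog byte limit       */
    int payload = 1500;                  /* TCP payload per segment  */
    int truesizes[] = { 2048, 4096, 16384 };
    unsigned int i;

    for (i = 0; i < sizeof(truesizes) / sizeof(truesizes[0]); i++)
        printf("truesize %5d: backlog holds ~%3d segments (~%3d KB of data)\n",
               truesizes[i], limit / truesizes[i],
               limit / truesizes[i] * payload / 1024);
    return 0;
}

With 2 KB buffers the same limit absorbs roughly eight times as many
segments as with 16 KB buffers, which is why the driver's buffer sizing is
the first thing asked about here.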
* Re: Packet drops observed @ LINUX_MIB_TCPBACKLOGDROP
From: Sharat Masetty @ 2014-02-27 20:40 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev, harout

Hi Eric,

We are using kernel version 3.10.0, and the network driver is our own,
designed to work over an HSIC-USB interconnect. One thing is for sure:
we do not preallocate lots of buffers to hold incoming traffic; we only
allocate buffers of the required size when there is data to read off
the bus.

I was under the impression that this backlog queue is charged against
the socket buffer space (sndbuf and rcvbuf), so I am trying to
understand in more detail how the driver implementation can be linked
to these drops in the backlog queue.

Can you explain the significance of this backlog queue, mainly from
the perspective of the userspace application's (in this case iperf)
socket API calls and the kernel/network stack enqueuing packets up the
stack?

Thanks
Sharat

On Thu, Feb 27, 2014 at 10:54 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> You forgot to tell us things like:
>
> 1) Kernel version
> 2) Network driver in use.
>
> Some drivers allocate huge buffers to store incoming frames.
>
> Unfortunately we have to avoid OOM issues in the kernel, so such
> drivers are likely to have some skbs dropped by the backlog.
* Re: Packet drops observed @ LINUX_MIB_TCPBACKLOGDROP
From: Eric Dumazet @ 2014-02-27 20:49 UTC (permalink / raw)
To: Sharat Masetty; +Cc: netdev, harout

On Thu, 2014-02-27 at 13:40 -0700, Sharat Masetty wrote:
> I was under the impression that this backlog queue is charged against
> the socket buffer space (sndbuf and rcvbuf), so I am trying to
> understand in more detail how the driver implementation can be linked
> to these drops in the backlog queue.
>
> Can you explain the significance of this backlog queue, mainly from
> the perspective of the userspace application's (in this case iperf)
> socket API calls and the kernel/network stack enqueuing packets up the
> stack?

The backlog queue is used to hold packets when the softirq handler
finds the socket being used by a process (a user application doing a
sendmsg() system call, for example). Since a softirq cannot sleep, it
has to queue the packet into the 'backlog' so that the process can
process it as it exits the TCP stack.

We have to put a limit on the backlog, otherwise a DoS attack would
fill the queue and the host would crash because all memory would be
consumed.

Since linux-3.5 we start dropping packets if the length of the backlog
exceeds the (sk->sk_rcvbuf + sk->sk_sndbuf) socket limit.

So maybe you set limits that are too small?
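For reference, the check described here sits in the TCP receive path; the
3.10-era logic in net/ipv4/tcp_ipv4.c looks roughly like the fragment
below (a paraphrase with prequeue handling and error paths elided, not the
verbatim source):

/* Paraphrased fragment, for illustration only. */
bh_lock_sock_nested(sk);
if (!sock_owned_by_user(sk)) {
        /* No process holds the socket: handle the segment right here
         * in softirq context. */
        ret = tcp_v4_do_rcv(sk, skb);
} else if (unlikely(sk_add_backlog(sk, skb,
                                   sk->sk_rcvbuf + sk->sk_sndbuf))) {
        /* Backlog already holds rcvbuf + sndbuf bytes of truesize:
         * drop the segment and bump TCPBacklogDrop. */
        bh_unlock_sock(sk);
        NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_TCPBACKLOGDROP);
        goto discard_and_relse;
}
bh_unlock_sock(sk);

So if the drops really stem from the limit rather than from a slow reader,
raising the socket buffer sizes (net.ipv4.tcp_rmem / tcp_wmem, or an
explicit SO_RCVBUF as sketched earlier) raises the backlog ceiling along
with them.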
Thread overview: 7+ messages (newest: ~2014-02-28 0:34 UTC)
2014-02-27  2:00 Packet drops observed @ LINUX_MIB_TCPBACKLOGDROP Sharat Masetty
2014-02-27 16:42 ` Rick Jones
2014-02-27 20:50   ` Sharat Masetty
2014-02-28  0:34     ` Rick Jones
2014-02-27 17:54 ` Eric Dumazet
2014-02-27 20:40   ` Sharat Masetty
2014-02-27 20:49     ` Eric Dumazet