* Send-Q on UDP socket growing steadily - why?
@ 2008-03-30  5:43 Deomid Ryabkov
  2008-03-30 22:01 ` Denys Vlasenko
  0 siblings, 1 reply; 3+ messages in thread

From: Deomid Ryabkov @ 2008-03-30  5:43 UTC (permalink / raw)
To: linux-kernel

This has started recently and I'm at a loss as to why.
Send-Q on a moderately active UDP socket keeps growing steadily until it
reaches ~128K (wmem_max?), at which point socket writes start failing.
The application in question is the standard ntpd from Fedora 7; the kernel
is the latest available for the distro, that is
2.6.23.15-80.fc7 #1 SMP Sun Feb 10 16:52:18 EST 2008 x86_64

BIND, running on the same machine, does not exhibit this problem, but
that may be because it does not get nearly as much load as ntpd, which is
part of pool.ntp.org. That said, load is really not very high, on the
order of 10 QPS, and the machine is 99+% idle.
ntpd seems to be doing its usual select-recvmsg-sendto routine, nothing
out of the ordinary. And yet, Send-Q keeps growing at _exactly_ 360 bytes
every 10 seconds. Here's a sample of output shortly after an ntpd restart:

# while sleep 1; do netstat -na | grep 177:123; done
udp        0  17280 89.111.168.177:123     0.0.0.0:*
udp        0  17280 89.111.168.177:123     0.0.0.0:*
udp        0  17280 89.111.168.177:123     0.0.0.0:*
udp        0  17280 89.111.168.177:123     0.0.0.0:*
udp        0  17280 89.111.168.177:123     0.0.0.0:*
udp        0  17280 89.111.168.177:123     0.0.0.0:*
udp        0  17280 89.111.168.177:123     0.0.0.0:*
udp        0  17280 89.111.168.177:123     0.0.0.0:*
-------> +360 bytes
udp        0  17640 89.111.168.177:123     0.0.0.0:*
udp        0  17640 89.111.168.177:123     0.0.0.0:*
udp        0  17640 89.111.168.177:123     0.0.0.0:*
udp        0  17640 89.111.168.177:123     0.0.0.0:*
udp        0  17640 89.111.168.177:123     0.0.0.0:*
udp        0  17640 89.111.168.177:123     0.0.0.0:*
udp        0  17640 89.111.168.177:123     0.0.0.0:*
udp        0  17640 89.111.168.177:123     0.0.0.0:*
udp        0  17640 89.111.168.177:123     0.0.0.0:*
udp        0  17640 89.111.168.177:123     0.0.0.0:*
-------> +360 bytes, 10 seconds later
udp        0  18000 89.111.168.177:123     0.0.0.0:*
udp        0  18000 89.111.168.177:123     0.0.0.0:*
udp        0  18000 89.111.168.177:123     0.0.0.0:*
udp        0  18000 89.111.168.177:123     0.0.0.0:*
udp        0  18000 89.111.168.177:123     0.0.0.0:*
udp        0  18000 89.111.168.177:123     0.0.0.0:*
udp        0  18000 89.111.168.177:123     0.0.0.0:*
udp        0  18000 89.111.168.177:123     0.0.0.0:*
udp        0  18000 89.111.168.177:123     0.0.0.0:*
udp        0  18000 89.111.168.177:123     0.0.0.0:*
-------> +360 bytes, 10 seconds later
udp        0  18360 89.111.168.177:123     0.0.0.0:*
[...]
etc, etc.

My understanding is that a non-empty send queue on a UDP socket should be a
very rare occurrence, maybe under extreme loads. And then there's this
steady creep... What's going on? It almost looks like something is leaking
somewhere.

--
Deomid Ryabkov aka Rojer
myself@rojer.pp.ru
rojer@sysadmins.ru
ICQ: 8025844

^ permalink raw reply	[flat|nested] 3+ messages in thread
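[Editorial aside: the "~128K (wmem_max?)" guess can be checked directly. The per-socket limit reported by SO_SNDBUF defaults from net.core.wmem_default and is capped by net.core.wmem_max. A minimal, Linux-specific sketch, not part of the original thread:]

```python
# Report the send-buffer sizing that governs a UDP socket's Send-Q ceiling.
# Linux-specific: reads the net.core sysctls from /proc. Note the kernel
# internally doubles the SO_SNDBUF value to leave room for bookkeeping
# overhead, so getsockopt() may report twice what you expect.
import socket

def send_buffer_limits():
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sndbuf = s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    s.close()
    with open("/proc/sys/net/core/wmem_default") as f:
        wmem_default = int(f.read())
    with open("/proc/sys/net/core/wmem_max") as f:
        wmem_max = int(f.read())
    return sndbuf, wmem_default, wmem_max

if __name__ == "__main__":
    sndbuf, wmem_default, wmem_max = send_buffer_limits()
    print(f"SO_SNDBUF={sndbuf} wmem_default={wmem_default} wmem_max={wmem_max}")
```

Once Send-Q reaches the socket's SO_SNDBUF limit, further writes on a non-blocking socket fail, which matches the "socket writes start failing" symptom above.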
* Re: Send-Q on UDP socket growing steadily - why?
  2008-03-30  5:43 Send-Q on UDP socket growing steadily - why? Deomid Ryabkov
@ 2008-03-30 22:01 ` Denys Vlasenko
  2008-05-13 18:56   ` Deomid Ryabkov
  0 siblings, 1 reply; 3+ messages in thread

From: Denys Vlasenko @ 2008-03-30 22:01 UTC (permalink / raw)
To: Deomid Ryabkov; +Cc: linux-kernel

On Sunday 30 March 2008 07:43, Deomid Ryabkov wrote:
> This has started recently and i'm at a loss as to why.
> Send-Q on a moderately active UDP socket keeps growing steadily until it
> reaches ~128K (wmem_max?) at which point socket writes start failing.
> The application in question is standard ntpd from Fedora 7, kernel is
> the latest available for the distro, that is
> 2.6.23.15-80.fc7 #1 SMP Sun Feb 10 16:52:18 EST 2008 x86_64
>
> BIND, running on the same machine, does not exhibit this problem, but
> that may be because it does not get nearly as much load as ntpd,
> which is part of the pool.ntp.org. That said, load is really not very
> high, on the order of 10 QPS, and machine is 99+% idle.
> ntpd seems to be doing its usual select-recvmsg-sendto routine, nothing
> out of the ordinary.

Where does it send (or try to send) these packets?

I managed to reproduce something like this by sending UDP packets to a
nonexistent host on the local subnet. The kernel tries to find it and
emits ARP probes, but no reply comes. As long as the kernel doesn't know
how to send a queued UDP packet, I see a non-empty queue.

However, in my simple case the kernel decides it is a lost cause within a
few seconds, and drops the packets (queue length 0).

I imagine that with routing table tricks and/or iptables/arptables you
may end up in a situation where the kernel is stuck in "I don't know how
to send these packets" mode forever.

You can strace ntpd, get a list of IPs it is trying to send packets to,
and then do "echo TEST | nc -u <ip> 123" for each of these.
Will nc's queue become non-empty (at least for some IP)?
--
vda

^ permalink raw reply	[flat|nested] 3+ messages in thread
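[Editorial aside: the experiment described above, watching whether the kernel is still holding an unsent datagram, can be done without netstat by polling the socket's own Send-Q via the SIOCOUTQ ioctl. A minimal sketch, not part of the original thread; 192.0.2.1 is a TEST-NET placeholder, and on a real test you would pick an unused address on your local subnet to provoke an unanswered ARP probe:]

```python
# Send one UDP datagram and poll the socket's Send-Q. SIOCOUTQ (0x5411,
# from linux/sockios.h) returns the number of bytes still queued for
# transmission on the socket - the same figure netstat's Send-Q column
# shows. Linux-specific.
import fcntl
import socket
import struct
import time

SIOCOUTQ = 0x5411  # linux/sockios.h

def sendq(sock):
    """Current Send-Q (unsent bytes) of a socket, in bytes."""
    buf = fcntl.ioctl(sock.fileno(), SIOCOUTQ, struct.pack("i", 0))
    return struct.unpack("i", buf)[0]

if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.sendto(b"TEST", ("192.0.2.1", 123))
    except OSError:
        pass  # e.g. ENETUNREACH if there is no route at all
    for _ in range(3):
        print(sendq(s))  # non-zero means the kernel is still holding the packet
        time.sleep(0.2)
    s.close()
```

Whether the queue stays non-zero depends entirely on routing and ARP state, which is exactly the point of the experiment.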
* Re: Send-Q on UDP socket growing steadily - why?
  2008-03-30 22:01 ` Denys Vlasenko
@ 2008-05-13 18:56   ` Deomid Ryabkov
  0 siblings, 0 replies; 3+ messages in thread

From: Deomid Ryabkov @ 2008-05-13 18:56 UTC (permalink / raw)
To: Denys Vlasenko; +Cc: linux-kernel

Denys Vlasenko wrote:
> On Sunday 30 March 2008 07:43, Deomid Ryabkov wrote:
>> This has started recently and i'm at a loss as to why.
>> Send-Q on a moderately active UDP socket keeps growing steadily until it
>> reaches ~128K (wmem_max?) at which point socket writes start failing.
>> The application in question is standard ntpd from Fedora 7, kernel is
>> the latest available for the distro, that is
>> 2.6.23.15-80.fc7 #1 SMP Sun Feb 10 16:52:18 EST 2008 x86_64
>>
>> BIND, running on the same machine, does not exhibit this problem, but
>> that may be because it does not get nearly as much load as ntpd,
>> which is part of the pool.ntp.org. That said, load is really not very
>> high, on the order of 10 QPS, and machine is 99+% idle.
>> ntpd seems to be doing its usual select-recvmsg-sendto routine, nothing
>> out of the ordinary.
>
> Wher does it (tries to) send these packets?

all over the world :)

> I managed to reproduced something like this if I try to send
> UDPs to nonexistent host on local subnet. Kernel tries to find it,
> it emits ARP probes but no reply is coming. As long as kernel
> doesn't know how to send queued UDP packet, I see nonempty
> queue.
>
> However, in my simple case kernel decides that it is a lost case
> in a few seconds, and drops packets (queue len 0).

ok, it happened again. no, it's not ARP - there are no <incomplete>
entries in the arp table.

> I imagine whit routing table tricks and/or iptables/arptables
> you may end up with situation where kernel is stuck in
> "I don't know how to send these packets" mode forever.

nothing fancy on this box - there are no firewall rules, except for one
NAT rule that does not apply to these packets.

> You can strace ntpd, get a list of IPs it is trying to send packets
> to, and then do "echo TEST | nc -u <ip> 123" for each of these.
> will nc's queue become nonempty (at least for some IP)?

as far as i can tell, apart from this one socket, networking on the box
works normally.

this is what i see in netstat:

udp        0 125280 89.111.168.177:123     0.0.0.0:*

this is what strace looks like (nothing suspicious):

select(26, [16 17 18 19 20 21 22 23 24 25], NULL, NULL, {0, 382485}) = 1 (in [22], left {0, 125000})
select(26, [16 17 18 19 20 21 22 23 24 25], NULL, NULL, {0, 0}) = 1 (in [22], left {0, 0})
recvmsg(22, {msg_name(16)={sa_family=AF_INET, sin_port=htons(101), sin_addr=inet_addr("80.250.211.2")}, msg_iov(1)=[{"#\3\n\356\0\0\17v\0\0\25 \302!\277E\313\324`\257\317\3K\16\0\0\0\0\0\0\0\0"..., 1092}], msg_controllen=32, {cmsg_len=32, cmsg_level=SOL_SOCKET, cmsg_type=0x1d /* SCM_??? */, ...}, msg_flags=0}, 0) = 48
recvmsg(22, 0x7fffb14354a0, 0) = -1 EAGAIN (Resource temporarily unavailable)
sendto(22, "$\4\n\354\0\0\r\345\0\0!\31\n\0\0\2\313\324aR!\334\3242\313\324a\250\t\32\206\224"..., 48, 0, {sa_family=AF_INET, sin_port=htons(101), sin_addr=inet_addr("80.250.211.2")}, 16) = -1 EAGAIN (Resource temporarily unavailable)
select(26, [16 17 18 19 20 21 22 23 24 25], NULL, NULL, {0, 123523}) = 1 (in [22], left {0, 73000})
select(26, [16 17 18 19 20 21 22 23 24 25], NULL, NULL, {0, 0}) = 1 (in [22], left {0, 0})
recvmsg(22, {msg_name(16)={sa_family=AF_INET, sin_port=htons(123), sin_addr=inet_addr("217.77.53.12")}, msg_iov(1)=[{"\31\3\4\372\0\0\16\200\0\7\334F\301}\217\214\313\324R\4+g\\/\0\0\0\0\0\0\0\0"..., 1092}], msg_controllen=32, {cmsg_len=32, cmsg_level=SOL_SOCKET, cmsg_type=0x1d /* SCM_??? */, ...}, msg_flags=0}, 0) = 48
recvmsg(22, 0x7fffb14354a0, 0) = -1 EAGAIN (Resource temporarily unavailable)
sendto(22, "\31\4\4\354\0\0\r\345\0\0!\31\n\0\0\2\313\324aR!\334\3242\313\324a\236$\366\274\212"..., 48, 0, {sa_family=AF_INET, sin_port=htons(123), sin_addr=inet_addr("217.77.53.12")}, 16) = -1 EAGAIN (Resource temporarily unavailable)
select(26, [16 17 18 19 20 21 22 23 24 25], NULL, NULL, {0, 71771}) = 1 (in [22], left {0, 39000})
select(26, [16 17 18 19 20 21 22 23 24 25], NULL, NULL, {0, 0}) = 1 (in [22], left {0, 0})
recvmsg(22, {msg_name(16)={sa_family=AF_INET, sin_port=htons(29080), sin_addr=inet_addr("213.33.220.118")}, msg_iov(1)=[{"\331\3\4\372\0\0\7\v\0\2\6\262Yl|\4\313\324S\272\322\235e\326\313\324#\260\247\367\352\r"..., 1092}], msg_controllen=32, {cmsg_len=32, cmsg_level=SOL_SOCKET, cmsg_type=0x1d /* SCM_??? */, ...}, msg_flags=0}, 0) = 48
recvmsg(22, 0x7fffb14354a0, 0) = -1 EAGAIN (Resource temporarily unavailable)
sendto(22, "\31\4\4\354\0\0\r\345\0\0!\31\n\0\0\2\313\324aR!\334\3242\313\324a\247;\244\206\\"..., 48, 0, {sa_family=AF_INET, sin_port=htons(29080), sin_addr=inet_addr("213.33.220.118")}, 16) = -1 EAGAIN (Resource temporarily unavailable)

etc, etc, etc

--
Deomid Ryabkov aka Rojer
myself@rojer.pp.ru
rojer@sysadmins.ru
ICQ: 8025844

^ permalink raw reply	[flat|nested] 3+ messages in thread
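[Editorial aside: the strace above shows sendto() on a non-blocking UDP socket failing with EAGAIN, which is what a full send buffer looks like from userspace: the kernel refuses to queue more data until Send-Q drains. A hypothetical sketch, not ntpd's actual code, of how a datagram server's event loop can handle this case:]

```python
# One non-blocking sendto() attempt. For UDP there is little point in
# retrying a full buffer: a queued reply only goes stale, so dropping the
# datagram on EAGAIN is the usual choice. Python raises BlockingIOError
# for EAGAIN/EWOULDBLOCK on non-blocking sockets.
import socket

def try_send(sock, payload, dest):
    """Attempt one non-blocking sendto(); False means send buffer full."""
    try:
        sock.sendto(payload, dest)
        return True
    except BlockingIOError:
        return False  # kernel send queue full - drop this datagram

if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setblocking(False)
    # With an empty send buffer this normally succeeds immediately.
    print(try_send(s, b"\x23" * 48, ("127.0.0.1", 123)))
    s.close()
```

In the healthy case Send-Q drains within microseconds and EAGAIN never appears; seeing it persistently, as in the trace above, means the queue is stuck at its limit.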
end of thread, other threads:[~2008-05-13 19:03 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --)
2008-03-30  5:43 Send-Q on UDP socket growing steadily - why? Deomid Ryabkov
2008-03-30 22:01 ` Denys Vlasenko
2008-05-13 18:56   ` Deomid Ryabkov