* Data sitting and remaining in Send-Q
From: Jan-Benedict Glaw @ 2001-12-24 17:01 UTC
To: linux-kernel

Hi!

I've got a problem with a freshly installed Debian sid system.
It's running 2.4.16, 2.4.17-rc2 and 2.4.17 (the problem
appears on all of these kernels) and something seems to break ssh.

When ssh'ing to this box (only this box, regardless of which client), the
connection breaks if I request more than a few dozen bytes at a time
(so it will break at 'ls -l' with more than 10 files, 'cat /etc/passwd'
will break, and calling 'vi' will also break, because it redraws the
whole screen).

When strace'ing the ssh client and server, I can see that both of them
are in a select() loop. On the broken server, netstat shows some
(kilo)bytes of data remaining in the Send-Q. However, this data is
actually *never* sent over the wire, so the connection dies.

Can anybody give me a hint on how to solve this?

Merry Christmas,
JBG

--
Jan-Benedict Glaw . jbglaw@lug-owl.de . +49-172-7608481
-- New APT-Proxy written in shell script --
http://lug-owl.de/~jbglaw/software/ap2/
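Editorial sketch, not part of the original message: the symptom described here (bytes that stay in Send-Q) can be watched for by filtering netstat output. The helper name `stuck_sendq` is hypothetical, and the column order is an assumption based on the usual Linux net-tools `netstat -tn` layout (Proto, Recv-Q, Send-Q, local address, foreign address, state).

```sh
#!/bin/sh
# stuck_sendq: read `netstat -tn`-style lines on stdin and print the TCP
# sockets whose Send-Q (third column) is non-zero.
# Assumed column order (Linux net-tools):
#   Proto Recv-Q Send-Q Local-Address Foreign-Address State
stuck_sendq() {
    # Header lines do not start with "tcp", so they are skipped.
    awk '$1 ~ /^tcp/ && $3 + 0 > 0 { print $3, $4, $5, $6 }'
}
```

A possible use is `netstat -tn | stuck_sendq`, run twice a few seconds apart: a socket whose Send-Q figure does not shrink between the two runs would match the behaviour reported in this message.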
* Re: Data sitting and remaining in Send-Q
From: José Luis Domingo López @ 2001-12-24 18:10 UTC
To: linux-kernel

On Monday, 24 December 2001, at 18:01:42 +0100,
Jan-Benedict Glaw wrote:

> I've got some problem with a freshly installed Debian sid system.
> It's running with 2.4.16, 2.4.17-rc2 and 2.4.17 (the problem
> appears on all these kernels) and something seems to break ssh.
>
I don't know if this has something to do with your problem, but
bugs.debian.org has a _long_ list of reported bugs for ssh, many of them
with respect to ssh's X-forwarding.

My own experience with Debian's ssh is that, sooner or later,
X-forwarding fails, with Send-Q (or Recv-Q) in the server side
completely full. The server side was Debian Sid, and client side was
Debian Woody, and it happened with both a simple xclock and gkrellm (ssh
remoteserver xclock, ssh remoteserver gkrellm).

However, interactive shells didn't seem to show this problem.

--
José Luis Domingo López
Linux Registered User #189436     Debian Linux Woody (P166 64 MB RAM)

jdomingo AT internautas DOT org  => Spam at your own risk
* Re: Data sitting and remaining in Send-Q
From: Jan-Benedict Glaw @ 2001-12-24 19:00 UTC
To: linux-kernel

On Mon, 2001-12-24 19:10:32 +0100, José Luis Domingo López <jdomingo@internautas.org>
wrote in message <20011224181031.GA7934@localhost>:
> On Monday, 24 December 2001, at 18:01:42 +0100,
> Jan-Benedict Glaw wrote:
>
> > I've got some problem with a freshly installed Debian sid system.
> > It's running with 2.4.16, 2.4.17-rc2 and 2.4.17 (the problem
> > appears on all these kernels) and something seems to break ssh.
> >
> I don't know if this has something to do with your problem, but
> bugs.debian.org has a _long_ list of reported bugs for ssh, many of them
> with respect to ssh's X-forwarding.

Yes, I know, and it's not only connected to X forwarding, but also
(this is the majority of filed bugs) with ssh's exit behaviour when
any processes were started in the background. However, I've got this
problem with the running, interactive session. If I make the server
send more than maybe 200 bytes or so, the session will hang, with
both sides sitting in select, and data on the server's Send-Q...

> My own experience with Debian's ssh is that, sooner or later,
> X-forwarding fails, with Send-Q (or Recv-Q) in the server side
> completely full. The server side was Debian Sid, and client side was
> Debian Woody, and it happened with both a simple xclock and gkrellm (ssh
> remoteserver xclock, ssh remoteserver gkrellm).

Well, my understanding is that, if there's data in any of the queues,
these bytes should be delivered. In this case, data is *not* sent over
the wire. Is this a kernel bug? ...or is data only transmitted if
we're in a position to also set the PUSH bit?

> However, interactive shells didn't seem to show this problem.

Mine does :-( And this is quite annoying, because I'm to present some
software on the box in question in a few days. But, with no ssh on a
(so far) headless box, I'll face some trouble...

MfG, JBG

--
Jan-Benedict Glaw . jbglaw@lug-owl.de . +49-172-7608481
-- New APT-Proxy written in shell script --
http://lug-owl.de/~jbglaw/software/ap2/
* Re: Data sitting and remaining in Send-Q
From: Jan-Benedict Glaw @ 2001-12-24 19:38 UTC
To: linux-kernel

On Mon, 2001-12-24 19:10:32 +0100, José Luis Domingo López <jdomingo@internautas.org>
wrote in message <20011224181031.GA7934@localhost>:
> On Monday, 24 December 2001, at 18:01:42 +0100,
> Jan-Benedict Glaw wrote:
> > I've got some problem with a freshly installed Debian sid system.
> > It's running with 2.4.16, 2.4.17-rc2 and 2.4.17 (the problem
> > appears on all these kernels) and something seems to break ssh.
>
> My own experience with Debian's ssh is that, sooner or later,
> X-forwarding fails, with Send-Q (or Recv-Q) in the server side
> completely full. The server side was Debian Sid, and client side was
> Debian Woody, and it happened with both a simple xclock and gkrellm (ssh
> remoteserver xclock, ssh remoteserver gkrellm).

Seems to be a more general problem. I just installed ftpd and telnetd.
*Both* of them show exactly the same behaviour: 'ls -l' via telnet
blocks as well. I could get a 635 byte file via ftp, but fetching a
69294 byte file stalled. (This time, strace shows that ftpd is
sitting in write(5, ...data..., 56262), and there are
13032 bytes in Send-Q for ftpd...)

So what is this? It seems that there's a general TCP I/O problem with
the current software versions in Debian unstable. A libc
problem? Could a lousy network card cause this? Are there any
debugging hints for me?

MfG, JBG

--
Jan-Benedict Glaw . jbglaw@lug-owl.de . +49-172-7608481
-- New APT-Proxy written in shell script --
http://lug-owl.de/~jbglaw/software/ap2/
* Re: Data sitting and remaining in Send-Q
From: Mr. James W. Laferriere @ 2001-12-24 20:09 UTC
To: Jan-Benedict Glaw; +Cc: linux-kernel

	Hello Jan, is this possibly related to an ECN-enabled host and,
	somewhere in between, a non-ECN-enabled device (or a Cisco router)?
		Just a thought, JimL

On Mon, 24 Dec 2001, Jan-Benedict Glaw wrote:
> On Mon, 2001-12-24 19:10:32 +0100, José Luis Domingo López <jdomingo@internautas.org>
> wrote in message <20011224181031.GA7934@localhost>:
> > On Monday, 24 December 2001, at 18:01:42 +0100,
> > Jan-Benedict Glaw wrote:
> > > I've got some problem with a freshly installed Debian sid system.
> > > It's running with 2.4.16, 2.4.17-rc2 and 2.4.17 (the problem
> > > appears on all these kernels) and something seems to break ssh.
> >
> > My own experience with Debian's ssh is that, sooner or later,
> > X-forwarding fails, with Send-Q (or Recv-Q) in the server side
> > completely full. The server side was Debian Sid, and client side was
> > Debian Woody, and it happened with both a simple xclock and gkrellm (ssh
> > remoteserver xclock, ssh remoteserver gkrellm).
>
> Seems to bo a more general problem. I just installed ftpd and telnetd.
> *Both* of them show exactly the same behaviour: 'ls -l' via telnet
> blocks also. I could get a 635 byte file via ftp, but fetching a
> 69294 bytes long file stalled. (This time, strace shows that ftpd is
> sitting in write(5, ...data..., 56262), and there are
> 13032 bytes in Send-Q for ftpd...)
>
> So what is this? Seems that there's a general TCP I/O problem with
> the software current software versions in Debian unstable. libc
> problem? Could a lousy network card cause this? Are there any
> debugging hints for me?
>
> MfG, JBG

+----------------------------------------------------------------+
| James W. Laferriere     | System Techniques    | Give me VMS   |
| Network Engineer        | P.O. Box 854         | Give me Linux |
| babydr@baby-dragons.com | Coudersport PA 16915 | only on AXP   |
+----------------------------------------------------------------+
* Re: Data sitting and remaining in Send-Q
From: Jan-Benedict Glaw @ 2001-12-24 20:17 UTC
To: linux-kernel

On Mon, 2001-12-24 15:09:07 -0500, Mr. James W. Laferriere <babydr@baby-dragons.com>
wrote in message <Pine.LNX.4.43.0112241507550.31883-100000@filesrv1.baby-dragons.com>:
>
> 	Hello Jan , Is this possibly related to a ECN enabled host &
> 	somewhere in between a Non-ECN enabled (or a cisco router) ?

That would give a different result: "functional TCP connections" or
"non-functional TCP connections". Mine are between that. If data gets
sent in small chunks, everything is fine, but if it's a larger
transfer (more than one ethernet frame may transport???), write()
stalls (or non-blocking write returns), but data is kept in
Send-Q rather than being sent down to the client.

Well, my setup is a LAN, everything here is fully functional wrt. ECN.
I've never switched ECN off, and 2.4.x is running since ages on the
boxes around. So it's definitely *not* ECN in this case :-(

MfG, JBG

--
Jan-Benedict Glaw . jbglaw@lug-owl.de . +49-172-7608481
-- New APT-Proxy written in shell script --
http://lug-owl.de/~jbglaw/software/ap2/
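Editorial sketch, not part of the original message: for readers who want to rule ECN in or out on their own machines, the setting can be read from procfs. This assumes a Linux kernel that exposes it at /proc/sys/net/ipv4/tcp_ecn; the helper name `ecn_state` and its optional file argument are hypothetical and exist only to make the function easy to try out.

```sh
#!/bin/sh
# ecn_state [FILE]: report the value of the tcp_ecn sysctl.
# FILE defaults to the Linux procfs path; the argument exists only so the
# function can be exercised against an ordinary file.
ecn_state() {
    f=${1:-/proc/sys/net/ipv4/tcp_ecn}
    if [ ! -r "$f" ]; then
        echo "unknown"
        return 1
    fi
    v=$(cat "$f")
    case "$v" in
        0) echo "off" ;;
        1) echo "on" ;;
        *) echo "other ($v)" ;;   # values other than 0 and 1 are passed through as-is
    esac
}
```

Calling `ecn_state` with no argument prints the state of the local machine. To take ECN out of the picture for a test, root can write 0 to the same procfs file.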
* Re: Data sitting and remaining in Send-Q
From: Alex Bligh - linux-kernel @ 2001-12-24 20:44 UTC
To: Jan-Benedict Glaw, linux-kernel; +Cc: Alex Bligh - linux-kernel

> That would give a different result: "functional TCP connections" or
> "non-functional TCP connections". Mine are between that. If data gets
> sent in small chunks, everything is fine, but if it's a larger
> transfer (more than one ethernet frame may transport???), write()
> stalls (or non-blocking write returns), but data is kept in
> Send-Q rather than being sent down to the client.

Just to check the completely obvious:

Difficult / impossible to tell without a tcpdump, but last time I
saw something like this, one end was silently dropping packets
exactly equal to the MTU size (or up to 3 bytes smaller), but
transmitting all other packets (in this instance it was a bizarre
802.11 problem). What happens is that small files get through,
as do files sufficiently small the TCP window hasn't grown properly,
as do interactive sessions (frequently) but large ftp's appear
to die; in fact if you leave them long enough they recover after
a long stall. This is far easier to diagnose if both devices
are on the same segment (remember it can be an L1/L2 device in
the way that does the drop though).

If you have an L3 device (router etc.) in the middle, you can get
a similar effect if the device does not fragment data correctly
(for instance the Cisco into ip tunnels bug - now fixed I think),
or, if you are using PMTU discovery (probably), if some evil device,
or the end nodes, are filtering out ICMP (or doing something else
which breaks PMTU discovery, such as some types of address
filtering if there is NAT in the way).

If you run tcpdump on both boxes, then for each packet transmitted
you should either see a received packet the other end, or an ICMP
reply (immediately); if you see a long pause, and get a reassembly
failed message come back, or a retransmit, you will know it's
this.

I'd recommend testing tcpspray or something simple before looking
at the gory internals of ssh buffer handling (openssh seems cleaner).
I'd also recommend, if you are in an environment that can stand it,
putting the two machines on a common L2 network, close together,
and removing all filters (iptables etc.) and checking that works.

--
Alex Bligh
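Editorial sketch, not part of the original message: a size-dependent drop of the kind described here can be narrowed down by probing with packets of different sizes. The functions below do a binary search for the largest payload that still gets through. Everything here is an assumption for illustration: the names `probe` and `largest_ok`, the `TARGET` variable, the placeholder host `server.example`, and the use of Linux iputils `ping` with `-M do` (set the don't-fragment bit) and `-s` (payload size). The search also assumes delivery is monotonic in size, which a flaky link may violate.

```sh
#!/bin/sh
# probe SIZE: succeed if one ICMP echo with SIZE payload bytes is answered.
# Assumes Linux iputils ping; TARGET is a hypothetical host name.
probe() {
    ping -c 1 -W 1 -M do -s "$1" "${TARGET:-server.example}" >/dev/null 2>&1
}

# largest_ok LOW HIGH: binary search for the largest size in [LOW, HIGH]
# for which probe succeeds. Assumes LOW itself works and that every size
# above the limit fails.
largest_ok() {
    lo=$1
    hi=$2
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi + 1) / 2 ))    # round up so the loop always makes progress
        if probe "$mid"; then
            lo=$mid                     # mid got through: limit is mid or higher
        else
            hi=$(( mid - 1 ))           # mid was lost: limit is below mid
        fi
    done
    echo "$lo"
}
```

For example, `TARGET=brokenbox largest_ok 64 1472` would search payloads up to what fits in a 1500-byte MTU (1472 bytes of ICMP payload plus 8 bytes ICMP header plus 20 bytes IPv4 header). A result well below 1472 on a plain Ethernet segment would point at the kind of problem discussed in this message.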
* Re: Data sitting and remaining in Send-Q
From: Thorsten Kranzkowski @ 2001-12-24 21:34 UTC
To: jbglaw; +Cc: linux-kernel

On Mon, Dec 24, 2001 at 08:44:37PM -0000, Alex Bligh - linux-kernel wrote:
> > That would give a different result: "functional TCP connections" or
> > "non-functional TCP connections". Mine are between that. If data gets
> > sent in small chunks, everything is fine, but if it's a larger
> > transfer (more than one ethernet frame may transport???), write()
> > stalls (or non-blocking write returns), but data is kept in
> > Send-Q rather than being sent down to the client.
>
> Just to check the completely obvious:
>
[...]
>
> If you have an L3 device (router etc.) in the middle, you can get
> a similar effect if the device does not fragment data correctly
> (for instance the Cisco into ip tunnels bug - now fixed I think),
> or, if you are using PMTU discovery (probably), if some evil device,

Jan,
do you have some DSL Modem in between?

Thorsten

--
| Thorsten Kranzkowski        Internet: dl8bcu@dl8bcu.de                      |
| Mobile: ++49 170 1876134       Snail: Niemannsweg 30, 49201 Dissen, Germany |
| Ampr: dl8bcu@db0lj.#rpl.deu.eu, dl8bcu@marvin.dl8bcu.ampr.org [44.130.8.19] |
* Re: Data sitting and remaining in Send-Q
From: Jan-Benedict Glaw @ 2001-12-24 21:58 UTC
To: linux-kernel

On Mon, 2001-12-24 21:34:52 +0000, Thorsten Kranzkowski <dl8bcu@dl8bcu.de>
wrote in message <20011224213452.A7761@Marvin.DL8BCU.ampr.org>:
> On Mon, Dec 24, 2001 at 08:44:37PM -0000, Alex Bligh - linux-kernel wrote:
> > If you have an L3 device (router etc.) in the middle, you can get
> > a similar effect if the device does not fragment data correctly
> > (for instance the Cisco into ip tunnels bug - now fixed I think),
> > or, if you are using PMTU discovery (probably), if some evil device,
>
> Jan,
> do you have some DSL Modem in between?

Hi Thorsten!

No, it's not the famous
MTU-too-large-and-a-lot-of-fragmentation-needed problem. It was a
broken NIC, unwilling to send frames > ~960 bytes...

MfG, JBG

--
Jan-Benedict Glaw . jbglaw@lug-owl.de . +49-172-7608481
-- New APT-Proxy written in shell script --
http://lug-owl.de/~jbglaw/software/ap2/
* Re: Data sitting and remaining in Send-Q
From: Jan-Benedict Glaw @ 2001-12-24 21:56 UTC
To: linux-kernel

On Mon, 2001-12-24 20:44:37 -0000, Alex Bligh - linux-kernel <linux-kernel@alex.org.uk>
wrote in message <1062462662.1009226676@[195.224.237.69]>:
> >That would give a different result: "functional TCP connections" or
> >"non-functional TCP connections". Mine are between that. If data gets
> >sent in small chunks, everything is fine, but if it's a larger
> >transfer (more than one ethernet frame may transport???), write()
> >stalls (or non-blocking write returns), but data is kept in
> >Send-Q rather than being sent down to the client.

Well, some testing done. I've written a small microserver bound
to port 1111/tcp via inetd:

--------------------------------------
#!/bin/sh

# Number of bytes to send, read from a control file
LEN="$(cat /root/size)"

# Emit exactly $LEN zero bytes as a single block
dd bs="$LEN" if=/dev/zero count=1 2>/dev/null

# Wait a second before exiting (which closes the connection)
sleep 1
exit 0
------------------------------------

I can control its output with a file. It seems that I can always
transmit up to ~920 bytes at a time, but never more than 940. All
values between these limits are more-or-less functional, depending on
their size (smaller packets == high chance to reach the client, larger
packets == small chance to reach the destination).

> Just to check the completely obvious:
>
> Difficult / impossible to tell without a tcpdump, but last time I
> saw something like this, one end was silently dropping packets
> exactly equal to the MTU size (or up to 3 bytes smaller), but
> transmitting all other packets (in this instance it was a bizarre
> 802.11 problem).

It's quite a problem to do tcpdumping on a host from which you can
never get more than ~920 bytes at a time, neither by ftp, nor by
ssh or telnet or whatever :-)

Well, I've tcpdumped now, and it seems my old WaveSwitch is
to blame. The "bad" server actually transmits everything
(and also tries retransmits etc.), but that never leaves the
switch again... I've changed the switch port as well as the
cable. It seems the switch and that network card don't
like each other...

I've now replaced the network card, and everything is fine now.

I've never seen a NIC failing partially, so I've learned a lot
this evening...

Thank you very much (to all who sent me notes) and have a nice X-Mas...

MfG, JBG

--
Jan-Benedict Glaw . jbglaw@lug-owl.de . +49-172-7608481
-- New APT-Proxy written in shell script --
http://lug-owl.de/~jbglaw/software/ap2/
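Editorial sketch, not part of the original message: the ~920 byte payload limit measured here can be related to on-the-wire sizes with a little arithmetic. The helper name `wire_sizes` is hypothetical, and the header sizes are assumptions: a 20-byte TCP header and a 20-byte IPv4 header with no options, plus a 14-byte Ethernet II header with the frame check sequence not counted. If TCP options such as timestamps were in use, the real figures would be somewhat larger.

```sh
#!/bin/sh
# wire_sizes PAYLOAD: print the IPv4 datagram length and the Ethernet frame
# length (without FCS) for a TCP segment carrying PAYLOAD bytes.
# Assumes option-free TCP and IPv4 headers.
wire_sizes() {
    payload=$1
    ip_len=$(( payload + 20 + 20 ))   # + TCP header + IPv4 header
    frame_len=$(( ip_len + 14 ))      # + Ethernet II header
    echo "$ip_len $frame_len"
}
```

Under these assumptions `wire_sizes 920` gives an IP datagram of 960 bytes, which is at least consistent with the "~960 bytes" figure quoted elsewhere in this thread, though the thread itself does not say how that number was derived.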
* Re: Data sitting and remaining in Send-Q
From: Alex Bligh - linux-kernel @ 2001-12-25 0:43 UTC
To: Jan-Benedict Glaw, linux-kernel; +Cc: Alex Bligh - linux-kernel

> Well, I've tcpdumped now, and it seems my old WaveSwitch is
> to blame. The "bad" server actually transmits everything
> (and also tries retransmits etc.), but that never leaves the
> switch again... I've changed the switch port as well as the
> cable. It seems the switch and that network card don't
> like each other...
>
> I've now replaced the network card, everything is fine now.
>
> I've never seen a NIC failing partially, I've learned a lot
> this evening...

Well, I dunno if a WaveSwitch is 802.11 (sounds like it might be),
but if it is, I had an identical problem - look under wireless
ethernet at
  http://www.alex.org.uk/T23

Various firmware upgrades, and crucially a settings change,
fixed it. Your symptoms sound identical to mine (and if so it's
the basestation you have to fix).

Short answer is change to rfc1042 encapsulation from 802.1h,
which (seemingly illegally) works at 1500 byte MTUs only
between some hardware pairs.

--
Alex Bligh
Thread overview: 11+ messages
2001-12-24 17:01 Data sitting and remaining in Send-Q Jan-Benedict Glaw
2001-12-24 18:10 ` José Luis Domingo López
2001-12-24 19:00   ` Jan-Benedict Glaw
2001-12-24 19:38   ` Jan-Benedict Glaw
2001-12-24 20:09     ` Mr. James W. Laferriere
2001-12-24 20:17       ` Jan-Benedict Glaw
2001-12-24 20:44         ` Alex Bligh - linux-kernel
2001-12-24 21:34           ` Thorsten Kranzkowski
2001-12-24 21:58             ` Jan-Benedict Glaw
2001-12-24 21:56           ` Jan-Benedict Glaw
2001-12-25  0:43             ` Alex Bligh - linux-kernel