netdev.vger.kernel.org archive mirror
From: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>
To: Didier Raboud <didier@raboud.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Netdev <netdev@vger.kernel.org>,
	bugme-daemon@bugzilla.kernel.org
Subject: Re: [Bugme-new] [Bug 10903] New: ssh connections hang with 2.6.26-rc5
Date: Mon, 16 Jun 2008 16:21:25 +0300 (EEST)
Message-ID: <Pine.LNX.4.64.0806161542290.16829@wrl-59.cs.helsinki.fi>
In-Reply-To: <200806151537.11986.didier@raboud.com>

On Sun, 15 Jun 2008, Didier Raboud wrote:

> On Saturday 14 June 2008 22:45:41 Ilpo Järvinen, you wrote:
> > On Fri, 13 Jun 2008, Andrew Morton wrote:
> > > (switched to email.  Please respond via emailed reply-to-all, not via the
> > > bugzilla web interface).
> 
> OK.
> 
> > > > On Fri, 13 Jun 2008 02:39:17 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote:
> > > > http://bugzilla.kernel.org/show_bug.cgi?id=10903
> > > >
> > > >            Summary: ssh connections hang with 2.6.26-rc5
> > > >            Product: Networking
> > > >            Version: 2.5
> > > >      KernelVersion: 2.6.26-rc5
> > > >           Platform: All
> > > >         OS/Version: Linux
> > > >               Tree: Mainline
> > > >             Status: NEW
> > > >           Severity: normal
> > > >           Priority: P1
> > > >          Component: Other
> > > >         AssignedTo: acme@ghostprotocols.net
> > > >         ReportedBy: didier@raboud.com
> > > >
> > > >
> > > > Latest working kernel version: 2.6.25-2
> > > > Earliest failing kernel version: 2.6.26-rc5
> > > > Distribution: Debian (Lenny + Sid)
> > > > Hardware Environment: amd64 (Dell Latitude D630)
> > > > Software Environment: KDE
> > > > Problem Description:
> > > >
> > > > > With kernel version 2.6.26-rc5, ssh connections to remote servers
> > > > > randomly hang (no error message). Enabling "ServerAliveInterval" on
> > > > > both sides brings no improvement.
> >
> > Thanks for reporting. Could you please clarify a couple of things:
> 
> Hi.
> 
> I will try to, as far as my time and knowledge allow.
> 
> > Does this only happen with a particular server/servers?
> 
> I have only tried with two of my home servers. One runs 2.6.22-4-686 and the 
> other 2.6.18-6-vserver-686.
> 
> > Any middleboxes in between (NAT, firewall, etc.)?
> 
> There is an ADSL router which "provides" internet to the servers via NAT.
> I have tried from "inside" the house (so in the same subnet) and from
> outside: it hangs in both cases.

...Ok. Such routers have some timeouts for idle sessions, but as long as
the sessions are active or kept alive, that shouldn't be a problem.

> The common point is my use of "iwl3945": I have always tried the ssh
> connections over WiFi.
>
> > Do all ssh connections hang simultaneously?
> 
> Well... It is hard to say. As far as I have seen, no. When I get one hang, I
> can still successfully open a new connection to the same server.

It's quite likely that the hangs are independent, but it was still worth
confirming.

> > How long have you waited until concluding that TCP is "hung"?
> 
> Well. The "ServerAliveInterval" option of openssh now leads to "Received 
> disconnect from $IP: 2: Timeout, your session not responding." after the 
> hang. So the openssh server notices that my session is not responding and so 
> cuts the connection.
>
> > Is TSO enabled (ethtool -k)? Have you tried without it?
> 
> It doesn't seem so:
> 
> ----
> # ethtool -k wlan0
> Offload parameters for wlan0:
> Cannot get device rx csum settings: Operation not supported
> Cannot get device tx csum settings: Operation not supported
> Cannot get device scatter-gather settings: Operation not supported
> Cannot get device tcp segmentation offload settings: Operation not supported
> Cannot get device udp large send offload settings: Operation not supported
> Cannot get device generic segmentation offload settings: Operation not supported
> no offload info available
> ----
>
> > It wouldn't hurt to include info about the eth hw too (e.g., lspci); it
> > might turn out to be unneeded at some point, but it might save an email
> > round-trip.
> 
> lspci attached.

...Thanks for all the details. I especially appreciated the kernel 
versions of the servers since TCP has two end hosts (and I forgot to 
ask)... :-)

> > TCP can appear to hang for a vast number of reasons. The only recent changes
> > that are suspect are the DEFERRED_ACCEPT thing, which is already reverted
> > in Linus' very latest tree (even -rc6 is too old for that), and a few FRTO
> > fixes (you can exclude FRTO by setting the /proc/sys/net/ipv4/tcp_frto
> > sysctl to 0, but it seems quite unlikely to change anything); your problem
> > might well come from something else, with the TCP hang being just a symptom
> > of another problem downstream.
> 
> I can't understand everything, but what I can say is that with the exact
> same software, I get no hangs with 2.6.25-2 but I get some with
> 2.6.26-rc5.

Yes, I understand that... I was just trying to point out above what has
changed between those kernels :-). ...Quite few TCP-related changes
actually (there were also some TSO-related changes but they're not
significant in your case).
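
In case it helps, ruling out FRTO could look roughly like the sketch
below. The sysctl path is the one mentioned above; the function names and
the file used to remember the old value are made up for illustration, and
FRTO_SYSCTL is a variable only so that this can be tried on an ordinary
file first. Writing to the real sysctl needs root.

```shell
# Sketch only: toggle the tcp_frto sysctl and remember the old value.
# FRTO_SYSCTL defaults to the real sysctl file; the helper names and the
# save-file location are illustrative assumptions, not an established tool.
FRTO_SYSCTL="${FRTO_SYSCTL:-/proc/sys/net/ipv4/tcp_frto}"

# Print the current setting (0 means FRTO is disabled).
frto_show() {
    cat "$FRTO_SYSCTL"
}

# Save the current setting into the file given as $1, then disable FRTO.
frto_disable() {
    frto_show > "$1" && echo 0 > "$FRTO_SYSCTL"
}

# Put back the value previously saved into the file given as $1.
frto_restore() {
    cat "$1" > "$FRTO_SYSCTL"
}

# Usage (as root):
#   frto_disable /tmp/tcp_frto.saved    # then try to reproduce the hang
#   frto_restore /tmp/tcp_frto.saved
```

If the hangs still show up with the value at 0, FRTO can be crossed off
the list and the old value restored.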

> > So please gather this information (at least for the relevant connections):
> >
> > $ netstat -pn
> > $ cat /proc/net/tcp
> 
> Attached.

I probably wasn't specific enough. ...I meant that you should collect this
once one of the ssh sessions gets stuck (do it right after you notice that
the session is stuck, so that you get the info before the connection is
cut). This info should be collected at both ends (on the client and on the
server).
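
Something along these lines might make it easier to grab the info quickly
when a session gets stuck. It is only a sketch: the function names and the
output directory are made up, PROC_NET_TCP is a variable just so the
filter can be tried on a saved copy, and the filter assumes that sshd
listens on the default port 22.

```shell
# Sketch only: dump the socket state into timestamped files the moment a
# session is noticed to be stuck. PROC_NET_TCP and the function names are
# illustrative; the ssh filter assumes sshd listens on the default port 22.
PROC_NET_TCP="${PROC_NET_TCP:-/proc/net/tcp}"

# snapshot_tcp DIR: save netstat and /proc/net/tcp output under DIR.
snapshot_tcp() {
    stamp=$(date +%Y%m%d-%H%M%S)
    mkdir -p "$1" || return 1
    # -p needs root to show the owning process of every socket.
    netstat -pn > "$1/netstat-$stamp.txt" 2>&1 || echo "netstat failed" >&2
    cat "$PROC_NET_TCP" > "$1/proc-net-tcp-$stamp.txt"
}

# Print only the entries whose local or remote port is 22 (0x0016);
# /proc/net/tcp lists addresses as hex IP:PORT in columns 2 and 3.
ssh_sockets() {
    awk '$2 ~ /:0016$/ || $3 ~ /:0016$/' "$PROC_NET_TCP"
}

# Usage, on both the client and the server:
#   snapshot_tcp /tmp/hang-info
#   ssh_sockets
```

The full files are still the more useful thing to send; the filter is just
a convenience for spotting the stuck connection.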

> > ...Also a tcpdump might be handy (though I don't know yet).
> 
> Well. It seems that there is another bug here: every time I tried a

Ah, let's try to figure that one out as well...

> # tcpdump -w /tmp/tcpdump.wlan0 -i wlan0
>
> I got a CPU lockup (or similar; I can't tell exactly, but the keyboard was
> blocked and nothing could be done).

You probably ran it under X, no? Please switch beforehand to some other vt
(a textual one) (Ctrl-Alt-Fn, where n < 6), log in, run that command
there, and see if you get some output on the screen. If you see something
(e.g., a sudden OOPS message or some other warning printed) when it locks
up, the easiest thing is to take a shot with a digicam (or write it down
somewhere else) and send that shot (or those details) to us please.
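
If the full capture turns out to be too heavy, a narrower one might look
like the sketch below. Restricting it to port 22 is my assumption that the
ssh servers use the default port (drop the filter if unsure); the function
name is made up, and TCPDUMP is a variable only so that the resulting
command line can be previewed with TCPDUMP=echo before running it for real.

```shell
# Sketch only: capture just the ssh traffic, so the dump stays small.
# TCPDUMP is a variable purely so the command line can be previewed with
# TCPDUMP=echo; the function name and port 22 are assumptions.
TCPDUMP="${TCPDUMP:-tcpdump}"

# capture_ssh IFACE FILE: write full packets (-s 0) for port 22 into FILE.
capture_ssh() {
    "$TCPDUMP" -i "$1" -s 0 -w "$2" port 22
}

# Usage (as root, on the text console):
#   capture_ssh wlan0 /tmp/tcpdump.wlan0
```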

...Once you have a tcpdump, I can probably figure at least something out
(though it might still just point in the right direction rather than
exposing the actual cause).

-- 
 i.
