* Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,
2003-01-27 22:36 [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x, through Cisco PIX David S. Miller
@ 2003-01-28 2:57 ` kuznet
2003-01-28 3:22 ` Christopher Faylor
0 siblings, 1 reply; 23+ messages in thread
From: kuznet @ 2003-01-28 2:57 UTC (permalink / raw)
To: David S. Miller; +Cc: andersg, lkernel2003, linux-kernel, tobi
Hello!
> Alexey, this piece of code was buggy first time it was coded, and it
> may still have some holes. :-)))
To my shame, I cannot say "no". It was written sort of too fast. :-)
Did the reporters see packets with wrong checksum on wire or wrong tcp
headers or something like that?
Alexey
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,
2003-01-28 2:57 ` [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x, kuznet
@ 2003-01-28 3:22 ` Christopher Faylor
0 siblings, 0 replies; 23+ messages in thread
From: Christopher Faylor @ 2003-01-28 3:22 UTC (permalink / raw)
To: kuznet; +Cc: David S. Miller, linux-kernel
On Tue, Jan 28, 2003 at 05:57:55AM +0300, kuznet@ms2.inr.ac.ru wrote:
>>Alexey, this piece of code was buggy first time it was coded, and it
>>may still have some holes. :-)))
>
>To my shame, I cannot say "no". It was written sort of too fast. :-)
>
>Did the reporters see packets with wrong checksum on wire or wrong tcp
>headers or something like that?
My knowledge of TCP/IP is extremely minimal but the sequence number looked
weird when the stall occurred. It looked like the sequence numbers you get
with the -S option to tcpdump. All of the other packets had small sequence
numbers and what I assume was the bad packet had a large one.
I'm sorry if this is gibberish and makes no sense. I don't know how to tell
if the checksum was wrong or not.
cgf
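Note on the -S remark above: by default tcpdump prints TCP sequence numbers relative to the first one it saw on the connection, while -S prints the raw 32-bit values, so a single "large" number among small ones reads like an absolute value. A minimal sketch of that arithmetic follows; rel_seq(), seq_before() and the sample values are made up for illustration and are not taken from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: sequence number relative to the connection's
 * initial sequence number (ISN).  Unsigned 32-bit subtraction wraps
 * modulo 2^32, which matches TCP sequence space. */
static uint32_t rel_seq(uint32_t abs_seq, uint32_t isn)
{
    return abs_seq - isn;
}

/* Hypothetical helper: true if a comes before b in sequence space,
 * even across the 2^32 wraparound. */
static int seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* Usage example with made-up values. */
static void rel_seq_demo(void)
{
    uint32_t isn = 4294967000u;            /* ISN close to wraparound */

    assert(rel_seq(isn + 1, isn) == 1);    /* first data byte */
    assert(rel_seq(200u, isn) == 496u);    /* wrapped past 2^32 */
    assert(seq_before(isn, 200u));         /* still ordered correctly */
}
```

The point of the sketch is only that the same packet can show up as 496 or as 200 depending on whether the ISN has been subtracted.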
* Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,
2003-01-28 20:34 [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x, through Cisco PIX David S. Miller
@ 2003-01-28 23:56 ` kuznet
2003-01-29 0:08 ` David S. Miller
0 siblings, 1 reply; 23+ messages in thread
From: kuznet @ 2003-01-28 23:56 UTC (permalink / raw)
To: David S. Miller
Cc: benoit-lists, dada1, cgf, andersg, lkernel2003, linux-kernel,
tobi
Hello!
> Alexey, most solid report is that 2.5.43-bk1 makes bug appear.
> This is good because it sort of narrows things down.
Now I do not think so. It looks like some old beast just got manifested.
It happens when 2 short consecutive segments are lost.
Funny things happen when retransmitting.
First, I do not see collapsing, which must be successful in this case.
So, the first segment is retransmitted alone, but the second is never
retransmitted, tcp even prefers to retransmit the third one. Something
is already bad, queue is broken in an interesting way, the impression is
that... that... that tcp did collapsing, but "forgot" to modify skb length.
Hey! Interesting thing has just happened, it is the first time when I found
the bug formulating a sentence while writing e-mail not while peering
to code. :-)
Shheit, look into tcp_retrans_try_collapse():
		if (skb->ip_summed != CHECKSUM_HW) {
			memcpy(skb_put(skb, next_skb_size), next_skb->data, nex$
			skb->csum = csum_block_add(skb->csum, next_skb->csum, s$
		}
WHERE IS skb_put and copy when skb->ip_summed==CHECKSUM_HW??!!
So, the fix is move of memcpy() line out of if clause.
Alexey
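Note: the failure mode described in this message can be modelled in user space. What follows is a minimal sketch, not kernel code. struct seg is a hypothetical stand-in for struct sk_buff that keeps only the fields the argument needs, collapse_buggy() and collapse_fixed() are illustrative names, and checksum updates are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { CSUM_NONE = 0, CSUM_HW = 1 };   /* stand-ins for ip_summed values */

/* Hypothetical, heavily simplified stand-in for struct sk_buff. */
struct seg {
    uint8_t  data[64];
    size_t   len;        /* bytes actually held in data[] */
    uint32_t seq;        /* first sequence number covered */
    uint32_t end_seq;    /* one past the last sequence number covered */
    int      ip_summed;
};

/* Append next's payload to s, like memcpy(skb_put(...), ...). */
static void append_payload(struct seg *s, const struct seg *next)
{
    assert(s->len + next->len <= sizeof(s->data));
    memcpy(s->data + s->len, next->data, next->len);
    s->len += next->len;
}

/* Pre-fix logic: the copy only happens on the software-checksum path. */
static void collapse_buggy(struct seg *s, const struct seg *next)
{
    if (next->ip_summed == CSUM_HW)
        s->ip_summed = CSUM_HW;
    if (s->ip_summed != CSUM_HW)
        append_payload(s, next);       /* skipped for CSUM_HW */
    s->end_seq = next->end_seq;        /* sequence range grows regardless */
}

/* Post-fix logic: the copy is unconditional. */
static void collapse_fixed(struct seg *s, const struct seg *next)
{
    append_payload(s, next);
    if (next->ip_summed == CSUM_HW)
        s->ip_summed = CSUM_HW;
    s->end_seq = next->end_seq;
}

/* A segment is consistent when its length matches its sequence range. */
static int seg_consistent(const struct seg *s)
{
    return s->len == (size_t)(s->end_seq - s->seq);
}
```

With both segments marked CSUM_HW, collapse_buggy() leaves len unchanged while end_seq already covers the second segment, which is the "forgot to modify skb length" symptom; collapse_fixed() keeps the two in step.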
* Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,
2003-01-28 23:21 [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x, through Cisco PIX David S. Miller
@ 2003-01-29 0:02 ` kuznet
2003-01-29 0:09 ` kuznet
1 sibling, 0 replies; 23+ messages in thread
From: kuznet @ 2003-01-29 0:02 UTC (permalink / raw)
To: David S. Miller
Cc: benoit-lists, dada1, cgf, andersg, lkernel2003, linux-kernel,
tobi
Hello!
> BTW, how come tcp_trim_head() can just set skb->ip_summed
> blindly to CHECKSUM_HW and not setup skb->csum?
When skb->ip_summed is CHECKSUM_HW skb->csum is ignored and
initialized at the moment when segment is transmitted in
tcp_v*_send_check().
Alexey
* Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,
2003-01-28 23:56 ` [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x, kuznet
@ 2003-01-29 0:08 ` David S. Miller
2003-01-29 3:14 ` kuznet
2003-01-29 14:12 ` David C Niemi
0 siblings, 2 replies; 23+ messages in thread
From: David S. Miller @ 2003-01-29 0:08 UTC (permalink / raw)
To: kuznet; +Cc: benoit-lists, dada1, cgf, andersg, lkernel2003, linux-kernel,
tobi
   From: kuznet@ms2.inr.ac.ru
   Date: Wed, 29 Jan 2003 02:56:41 +0300 (MSK)
   Hey! Interesting thing has just happened, it is the first time when I found
   the bug formulating a sentence while writing e-mail not while peering
   to code. :-)
Congratulations :-)
   Shheit, look into tcp_retrans_try_collapse():
   		if (skb->ip_summed != CHECKSUM_HW) {
   			memcpy(skb_put(skb, next_skb_size), next_skb->data, nex$
   			skb->csum = csum_block_add(skb->csum, next_skb->csum, s$
   		}
   WHERE IS skb_put and copy when skb->ip_summed==CHECKSUM_HW??!!
   So, the fix is move of memcpy() line out of if clause.
Indeed, this bug exists in 2.4 as well of course.
This bug is 2.4.3 vintage :-) It got added as part of initial
zerocopy merge in fact.
Here is 2.4.x version of fix, 2.5.x is identical sans some line
number differences. I will push this all to Linus/Marcelo.
BTW, Alexey, please please explain to me how that trick made
by tcp_trim_head() works. :-) I am talking about how it is
setting ip_summed to CHECKSUM_HW blindly and not even
bothering to set skb->csum correctly.
--- net/ipv4/tcp_output.c.~1~ Tue Jan 28 16:12:39 2003
+++ net/ipv4/tcp_output.c Tue Jan 28 16:14:18 2003
@@ -721,10 +721,9 @@
 		if (next_skb->ip_summed == CHECKSUM_HW)
 			skb->ip_summed = CHECKSUM_HW;
-		if (skb->ip_summed != CHECKSUM_HW) {
-			memcpy(skb_put(skb, next_skb_size), next_skb->data, next_skb_size);
+		memcpy(skb_put(skb, next_skb_size), next_skb->data, next_skb_size);
+		if (skb->ip_summed != CHECKSUM_HW)
 			skb->csum = csum_block_add(skb->csum, next_skb->csum, skb_size);
-		}
 		/* Update sequence range on original skb. */
 		TCP_SKB_CB(skb)->end_seq = TCP_SKB_CB(next_skb)->end_seq;
* Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,
2003-01-28 23:21 [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x, through Cisco PIX David S. Miller
2003-01-29 0:02 ` [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x, kuznet
@ 2003-01-29 0:09 ` kuznet
2003-01-29 0:46 ` Sebastian Benoit
2003-01-29 6:52 ` David S. Miller
1 sibling, 2 replies; 23+ messages in thread
From: kuznet @ 2003-01-29 0:09 UTC (permalink / raw)
To: David S. Miller
Cc: benoit-lists, dada1, cgf, andersg, lkernel2003, linux-kernel,
tobi
Hello!
The proposed fix is enclosed. Please, check.
Alexey
===== net/ipv4/tcp_output.c 1.19 vs edited =====
--- 1.19/net/ipv4/tcp_output.c Fri Oct 25 15:46:21 2002
+++ edited/net/ipv4/tcp_output.c Wed Jan 29 03:07:26 2003
@@ -786,13 +786,13 @@
 		/* Ok. We will be able to collapse the packet. */
 		__skb_unlink(next_skb, next_skb->list);
+		memcpy(skb_put(skb, next_skb_size), next_skb->data, next_skb_size);
+
 		if (next_skb->ip_summed == CHECKSUM_HW)
 			skb->ip_summed = CHECKSUM_HW;
-		if (skb->ip_summed != CHECKSUM_HW) {
-			memcpy(skb_put(skb, next_skb_size), next_skb->data, next_skb_size);
+		if (skb->ip_summed != CHECKSUM_HW)
 			skb->csum = csum_block_add(skb->csum, next_skb->csum, skb_size);
-		}
 		/* Update sequence range on original skb. */
 		TCP_SKB_CB(skb)->end_seq = TCP_SKB_CB(next_skb)->end_seq;
* Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,
2003-01-29 0:09 ` kuznet
@ 2003-01-29 0:46 ` Sebastian Benoit
2003-01-29 4:12 ` Christopher Faylor
2003-01-29 6:52 ` David S. Miller
1 sibling, 1 reply; 23+ messages in thread
From: Sebastian Benoit @ 2003-01-29 0:46 UTC (permalink / raw)
To: kuznet
Cc: David S. Miller, dada1, cgf, andersg, lkernel2003, linux-kernel,
tobi
kuznet@ms2.inr.ac.ru(kuznet@ms2.inr.ac.ru)@2003.01.29 03:09:21 +0000:
> Hello!
>
> The proposed fix is enclosed. Please, check.
okay, this seems to be a solution.
i can't get the ssh session to lock up with this patch.
thanks,
B.
--
Sebastian Benoit <benoit-lists@fb12.de>
My mail is GnuPG signed -- Unsigned ones are bogus -- http://www.gnupg.org/
GnuPG 0x5BA22F00 2001-07-31 2999 9839 6C9E E4BF B540 C44B 4EC4 E1BE 5BA2 2F00
I'm not as think as you stoned I am.
* Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,
2003-01-29 0:08 ` David S. Miller
@ 2003-01-29 3:14 ` kuznet
2003-01-29 7:32 ` David S. Miller
2003-01-29 14:12 ` David C Niemi
1 sibling, 1 reply; 23+ messages in thread
From: kuznet @ 2003-01-29 3:14 UTC (permalink / raw)
To: David S. Miller
Cc: benoit-lists, dada1, cgf, andersg, lkernel2003, linux-kernel,
tobi
Hello!
> BTW, Alexey, please please explain to me how that trick made
> by tcp_trim_head() works. :-) I am talking about how it is
> setting ip_summed to CHECKSUM_HW blindly and not even
> bothering to set skb->csum correctly.
skb->csum is not used inside TCP when skb->ip_summed==CHECKSUM_HW:
void tcp_v4_send_check(struct sock *sk, struct tcphdr *th, int len,
		       struct sk_buff *skb)
{
	struct inet_opt *inet = inet_sk(sk);
	if (skb->ip_summed == CHECKSUM_HW) {
		th->check = ~tcp_v4_check(th, len, inet->saddr, inet->daddr, 0);
		skb->csum = offsetof(struct tcphdr, check);
And when pushing segment down to IP, it is initialized to offset of th->check.
So, it is safe to make skb->ip_summed := CHECKSUM_HW any moment when
we are lazy to recalculate checksum. Frankly speaking, it is not very good,
I was confused _a_ _lot_ when seeing wrong checksums on those bogus
zero-length packets in tcpdumps made by Christopher. But saves some
source lines.
Alexey
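Note: to illustrate why a stale skb->csum is harmless on the CHECKSUM_HW path, here is a user-space sketch of the split described above. The assumption is plain RFC 1071 arithmetic: the stack seeds the check field with the folded pseudo-header sum, and whoever finishes the job (the NIC, or a software fallback) sums the TCP segment from its first byte and stores the complemented result at the given offset. sum16(), fold(), seed_check(), finish_check() and csum_software() are illustrative names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add len bytes to a running one's-complement sum as big-endian
 * 16-bit words (RFC 1071).  An odd trailing byte is zero-padded. */
static uint32_t sum16(const uint8_t *p, size_t len, uint32_t sum)
{
    assert(p != NULL || len == 0);
    while (len > 1) {
        sum += (uint32_t)p[0] << 8 | p[1];
        p += 2;
        len -= 2;
    }
    if (len)
        sum += (uint32_t)p[0] << 8;
    return sum;
}

/* Fold carries back into the low 16 bits. */
static uint16_t fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

enum { CHECK_OFF = 16 };   /* offset of the check field in a TCP header */

/* Software path: checksum over pseudo-header plus segment, check = 0. */
static uint16_t csum_software(const uint8_t *pseudo, size_t plen,
                              uint8_t *seg, size_t slen)
{
    seg[CHECK_OFF] = seg[CHECK_OFF + 1] = 0;
    return (uint16_t)~fold(sum16(seg, slen, sum16(pseudo, plen, 0)));
}

/* Offload path, step 1 (stack): seed check with the pseudo-header sum. */
static void seed_check(const uint8_t *pseudo, size_t plen, uint8_t *seg)
{
    uint16_t s = fold(sum16(pseudo, plen, 0));

    seg[CHECK_OFF] = (uint8_t)(s >> 8);
    seg[CHECK_OFF + 1] = (uint8_t)(s & 0xff);
}

/* Offload path, step 2 (device): sum the segment, store at the offset. */
static uint16_t finish_check(uint8_t *seg, size_t slen, size_t off)
{
    uint16_t c = (uint16_t)~fold(sum16(seg, slen, 0));

    seg[off] = (uint8_t)(c >> 8);
    seg[off + 1] = (uint8_t)(c & 0xff);
    return c;
}
```

Both paths should produce the same value for the same segment. Nothing in the offload path reads a previously accumulated payload sum, which is the property the explanation above relies on.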
* Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,
2003-01-29 0:46 ` Sebastian Benoit
@ 2003-01-29 4:12 ` Christopher Faylor
0 siblings, 0 replies; 23+ messages in thread
From: Christopher Faylor @ 2003-01-29 4:12 UTC (permalink / raw)
To: linux-kernel
On Wed, Jan 29, 2003 at 01:46:42AM +0100, Sebastian Benoit wrote:
>kuznet@ms2.inr.ac.ru(kuznet@ms2.inr.ac.ru)@2003.01.29 03:09:21 +0000:
>>The proposed fix is enclosed. Please, check.
>
>okay, this seems to be a solution. i can't get the ssh session to lock
>up with this patch.
Ditto for me.
Thank you!
cgf
* Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,
2003-01-29 0:09 ` kuznet
2003-01-29 0:46 ` Sebastian Benoit
@ 2003-01-29 6:52 ` David S. Miller
1 sibling, 0 replies; 23+ messages in thread
From: David S. Miller @ 2003-01-29 6:52 UTC (permalink / raw)
To: kuznet; +Cc: benoit-lists, dada1, cgf, andersg, lkernel2003, linux-kernel,
tobi
   From: kuznet@ms2.inr.ac.ru
   Date: Wed, 29 Jan 2003 03:09:21 +0300 (MSK)
   The proposed fix is enclosed. Please, check.
Installed locally and I will propagate everywhere as soon
as possible.
Thanks a lot Alexey.
* Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,
2003-01-29 3:14 ` kuznet
@ 2003-01-29 7:32 ` David S. Miller
0 siblings, 0 replies; 23+ messages in thread
From: David S. Miller @ 2003-01-29 7:32 UTC (permalink / raw)
To: kuznet; +Cc: benoit-lists, dada1, cgf, andersg, lkernel2003, linux-kernel,
tobi
   From: kuznet@ms2.inr.ac.ru
   Date: Wed, 29 Jan 2003 06:14:55 +0300 (MSK)
   skb->csum is not used inside TCP when skb->ip_summed==CHECKSUM_HW:
   ...
   So, it is safe to make skb->ip_summed := CHECKSUM_HW any moment when
   we are lazy to recalculate checksum.
I see, clever trick as I had suspected.
Thanks for the explanation.
* Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,
2003-01-29 0:08 ` David S. Miller
2003-01-29 3:14 ` kuznet
@ 2003-01-29 14:12 ` David C Niemi
2003-01-29 14:24 ` kuznet
2003-02-02 15:40 ` Bill Davidsen
1 sibling, 2 replies; 23+ messages in thread
From: David C Niemi @ 2003-01-29 14:12 UTC (permalink / raw)
To: David S. Miller
Cc: kuznet, benoit-lists, dada1, cgf, andersg, lkernel2003,
linux-kernel, tobi
On Tue, 28 Jan 2003, David S. Miller wrote:
> From: kuznet@ms2.inr.ac.ru
> Date: Wed, 29 Jan 2003 02:56:41 +0300 (MSK)
>
> Hey! Interesting thing has just happened, it is the first time when I
> found the bug formulating a sentence while writing e-mail not while
> peering to code. :-)
>
> Congratulations :-)
Just to confirm, this fix works for me as well.
...
> Indeed, this bug exists in 2.4 as well of course.
>
> This bug is 2.4.3 vintage :-) It got added as part of initial
> zerocopy merge in fact.
Odd, then, that I was unable to reproduce the SSH hangs under 2.4.18
even once, despite heavily using it for several days under the same
circumstances. Is there any reason 2.4.x would be better able to recover?
2.5.59 with the fix seems to feel a bit less balky than 2.4.18 without the
fix, so it seemed to me that 2.4.18 had some way of recovering at the cost
of a several second pause in the session.
DCN
* Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,
2003-01-29 14:12 ` David C Niemi
@ 2003-01-29 14:24 ` kuznet
2003-01-29 15:11 ` dada1
2003-02-02 15:40 ` Bill Davidsen
1 sibling, 1 reply; 23+ messages in thread
From: kuznet @ 2003-01-29 14:24 UTC (permalink / raw)
To: David C Niemi
Cc: davem, benoit-lists, dada1, cgf, andersg, lkernel2003,
linux-kernel, tobi
Hello!
> Odd, then, that I was unable to reproduce the SSH hangs under 2.4.18
The bug is there, but it cannot be triggered with ssh.
In 2.4 it can happen only on sockets which use sendfile().
Alexey
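Note: for readers unfamiliar with the call, a socket "uses sendfile()" when file data is handed to it with sendfile(2) rather than copied through write(2). A minimal, Linux-specific sketch follows. send_whole_file() and sendfile_demo() are hypothetical helper names, and the demo uses an AF_UNIX socketpair only to stay self-contained; the thread is about TCP sockets.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send count bytes of in_fd, starting at offset 0,
 * to out_fd using sendfile(2).  Returns 0 on success, -1 on error. */
static int send_whole_file(int out_fd, int in_fd, size_t count)
{
    off_t off = 0;

    assert(out_fd >= 0 && in_fd >= 0);
    while (count > 0) {
        ssize_t n = sendfile(out_fd, in_fd, &off, count);

        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)              /* hit end of file early */
            return -1;
        count -= (size_t)n;
    }
    return 0;
}

/* Usage example: push a small temporary file through a socketpair and
 * return the number of bytes that arrive, or -1 on failure. */
static int sendfile_demo(void)
{
    static const char msg[] = "hello";
    char buf[16];
    int sv[2];
    ssize_t got;
    FILE *f = tmpfile();

    if (!f)
        return -1;
    if (fwrite(msg, 1, 5, f) != 5 || fflush(f) != 0 ||
        socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
        fclose(f);
        return -1;
    }
    if (send_whole_file(sv[0], fileno(f), 5) != 0)
        got = -1;
    else
        got = read(sv[1], buf, sizeof(buf));
    close(sv[0]);
    close(sv[1]);
    fclose(f);
    return (int)got;
}
```

The application never touches the payload bytes in this pattern, which is the zerocopy send path the message above refers to.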
* Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,
2003-01-29 14:24 ` kuznet
@ 2003-01-29 15:11 ` dada1
0 siblings, 0 replies; 23+ messages in thread
From: dada1 @ 2003-01-29 15:11 UTC (permalink / raw)
To: kuznet, David C Niemi
Cc: davem, benoit-lists, cgf, andersg, lkernel2003, linux-kernel,
tobi
> Hello!
>
> > Odd, then, that I was unable to reproduce the SSH hangs under 2.4.18
>
> The bug is there, but it cannot be triggered with ssh.
> In 2.4 it can happen only on sockets which use sendfile().
>
> Alexey
>
Thanks VERY much Alexey for your fast fix.
Back to linux 2.5.59, is the TOS 0x10 mandatory to have such hangs, or are
all TCP sessions potentially candidates ?
Eric
* Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,
2003-01-29 14:12 ` David C Niemi
2003-01-29 14:24 ` kuznet
@ 2003-02-02 15:40 ` Bill Davidsen
1 sibling, 0 replies; 23+ messages in thread
From: Bill Davidsen @ 2003-02-02 15:40 UTC (permalink / raw)
To: David C Niemi; +Cc: David S. Miller, Linux Kernel Mailing List
On Wed, 29 Jan 2003, David C Niemi wrote:
>
> On Tue, 28 Jan 2003, David S. Miller wrote:
> > From: kuznet@ms2.inr.ac.ru
> > Date: Wed, 29 Jan 2003 02:56:41 +0300 (MSK)
> >
> > Hey! Interesting thing has just happened, it is the first time when I
> > found the bug formulating a sentence while writing e-mail not while
> > peering to code. :-)
> >
> > Congratulations :-)
>
> Just to confirm, this fix works for me as well.
>
> ...
> > Indeed, this bug exists in 2.4 as well of course.
> >
> > This bug is 2.4.3 vintage :-) It got added as part of initial
> > zerocopy merge in fact.
>
> Odd, then, that I was unable to reproduce the SSH hangs under 2.4.18
> even once, despite heavily using it for several days under the same
> circumstances. Is there any reason 2.4.x would be better able to recover?
> 2.5.59 with the fix seems to feel a bit less balky than 2.4.18 without the
> fix, so it seemed to me that 2.4.18 had some way of recovering at the cost
> of a several second pause in the session.
The problem which I have been seeing with some regularity is not the hang
you describe (I see that infrequently) but rather a hang after I exit an
ssh connection. I open several dozen windows at a time to a cluster when I
do admin, and when I close almost always at least one doesn't drop without
"~." to help. So far in an hour I haven't seen that.
--
bill davidsen <davidsen@tmr.com>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
* Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,
@ 2003-02-03 14:23 Franz Sirl
2003-02-03 17:11 ` J Sloan
2003-02-03 21:04 ` Bill Davidsen
0 siblings, 2 replies; 23+ messages in thread
From: Franz Sirl @ 2003-02-03 14:23 UTC (permalink / raw)
To: bill davidsen; +Cc: linux-kernel
On 2003-02-02 15:40:33 Bill Davidsen wrote:
>On Wed, 29 Jan 2003, David C Niemi wrote:
>
> >
> > On Tue, 28 Jan 2003, David S. Miller wrote:
> > > From: kuznet@ms2.inr.ac.ru
> > > Date: Wed, 29 Jan 2003 02:56:41 +0300 (MSK)
> > >
> > > Hey! Interesting thing has just happened, it is the first time when I
> > > > found the bug formulating a sentence while writing e-mail not while
> > > peering to code. :-)
> > >
> > > Congratulations :-)
> >
> > Just to confirm, this fix works for me as well.
> >
> > ...
> > > Indeed, this bug exists in 2.4 as well of course.
> > >
> > > This bug is 2.4.3 vintage :-) It got added as part of initial
> > > zerocopy merge in fact.
> >
> > Odd, then, that I was unable to reproduce the SSH hangs under 2.4.18
> > even once, despite heavily using it for several days under the same
> > circumstances. Is there any reason 2.4.x would be better able to recover?
> > 2.5.59 with the fix seems to feel a bit less balky than 2.4.18 without the
> > fix, so it seemed to me that 2.4.18 had some way of recovering at the cost
> > of a several second pause in the session.
>
>The problem which I have been seeing with some regularity is not the hang
>you describe (I see that infrequently) but rather a hang after I exit an
>ssh connection. I open several dozen windows at a time to a cluster when I
>do admin, and when I close almost always at least one doesn't drop without
>"~." to help. So far in an hour I haven't seen that.
That's some internal problem in OpenSSH, can be seen on Solaris as well.
Can be easily reproduced in a ssh session:
nohup sleep 60 &
logout
The ssh session will terminate only after the sleep exited.
Franz.
* Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,
2003-02-03 14:23 Franz Sirl
@ 2003-02-03 17:11 ` J Sloan
2003-02-03 18:22 ` Jeff Garzik
2003-02-03 21:04 ` Bill Davidsen
1 sibling, 1 reply; 23+ messages in thread
From: J Sloan @ 2003-02-03 17:11 UTC (permalink / raw)
To: Franz Sirl; +Cc: linux-kernel
This is a royal pain - there's been a "linux
ssh hang patch" floating around for ages -
None of the linux vendors seem to want to
fix it, and the openssh maintainers don't
seem to want to fix it - grrr.
BTW we have some demoronized ssh 3.4p1 rpms
available for suse and redhat - if it will ease
somebody's pain, feel free to grab them -
Joe
Franz Sirl wrote:
> On 2003-02-02 15:40:33 Bill Davidsen wrote:
>
>> On Wed, 29 Jan 2003, David C Niemi wrote:
>>
>> >
>> > On Tue, 28 Jan 2003, David S. Miller wrote:
>> > > From: kuznet@ms2.inr.ac.ru
>> > > Date: Wed, 29 Jan 2003 02:56:41 +0300 (MSK)
>> > >
>> > > Hey! Interesting thing has just happened, it is the first time when I
>> > > found the bug formulating a sentence while writing e-mail not while
>> > > peering to code. :-)
>> > >
>> > > Congratulations :-)
>> >
>> > Just to confirm, this fix works for me as well.
>> >
>> > ...
>> > > Indeed, this bug exists in 2.4 as well of course.
>> > >
>> > > This bug is 2.4.3 vintage :-) It got added as part of initial
>> > > zerocopy merge in fact.
>> >
>> > Odd, then, that I was unable to reproduce the SSH hangs under 2.4.18
>> > even once, despite heavily using it for several days under the same
>> > circumstances. Is there any reason 2.4.x would be better able to recover?
>> > 2.5.59 with the fix seems to feel a bit less balky than 2.4.18 without the
>> > fix, so it seemed to me that 2.4.18 had some way of recovering at the cost
>> > of a several second pause in the session.
>>
>> The problem which I have been seeing with some regularity is not the hang
>> you describe (I see that infrequently) but rather a hang after I exit an
>> ssh connection. I open several dozen windows at a time to a cluster when I
>> do admin, and when I close almost always at least one doesn't drop without
>> "~." to help. So far in an hour I haven't seen that.
>
>
> That's some internal problem in OpenSSH, can be seen on Solaris as
> well. Can be easily reproduced in a ssh session:
>
> nohup sleep 60 &
> logout
>
> The ssh session will terminate only after the sleep exited.
>
> Franz.
* Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,
2003-02-03 17:11 ` J Sloan
@ 2003-02-03 18:22 ` Jeff Garzik
0 siblings, 0 replies; 23+ messages in thread
From: Jeff Garzik @ 2003-02-03 18:22 UTC (permalink / raw)
To: J Sloan; +Cc: Franz Sirl, linux-kernel
On Mon, Feb 03, 2003 at 09:11:56AM -0800, J Sloan wrote:
> None of the linux vendors seem to want to
> fix it, and the openssh maintainers don't
> seem to want to fix it - grrr.
Seems to me like a Linux vendor just fixed it.
* Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,
2003-02-03 14:23 Franz Sirl
2003-02-03 17:11 ` J Sloan
@ 2003-02-03 21:04 ` Bill Davidsen
1 sibling, 0 replies; 23+ messages in thread
From: Bill Davidsen @ 2003-02-03 21:04 UTC (permalink / raw)
To: Franz Sirl; +Cc: linux-kernel
On Mon, 3 Feb 2003, Franz Sirl wrote:
> On 2003-02-02 15:40:33 Bill Davidsen wrote:
> >The problem which I have been seeing with some regularity is not the hang
> >you describe (I see that infrequently) but rather a hang after I exit an
> >ssh connection. I open several dozen windows at a time to a cluster when I
> >do admin, and when I close almost always at least one doesn't drop without
> >"~." to help. So far in an hour I haven't seen that.
>
> That's some internal problem in OpenSSH, can be seen on Solaris as well.
> Can be easily reproduced in a ssh session:
>
> nohup sleep 60 &
> logout
>
> The ssh session will terminate only after the sleep exited.
That is a problem with processes left running. I do not forward
connections, I do not forward X, I do not (in normal practice) leave
anything running. A typical thing to do is to go to each machine in a
cluster and look for a user activity:
grep "user" log/stats.readers
exit
nothing more. And every once in a while that hangs after executing the
logout sequence. With the patch it hasn't to date.
That doesn't mean it's a fix, I don't see it every day, I just haven't
seen it in a few days since I put in the patch.
--
bill davidsen <davidsen@tmr.com>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
* Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,
[not found] <Pine.LNX.3.96.1030203155651.28323A-100000@dstl.gov.uk>
@ 2003-02-04 9:58 ` Tony Gale
2003-02-04 14:12 ` David Ford
2003-02-04 14:40 ` Bill Davidsen
0 siblings, 2 replies; 23+ messages in thread
From: Tony Gale @ 2003-02-04 9:58 UTC (permalink / raw)
To: Bill Davidsen; +Cc: linux-kernel
On Mon, 2003-02-03 at 21:04, Bill Davidsen wrote:
>
> That is a problem with processes left running. I do not forward
> connections, I do not forward X, I do not (in normal practice) leave
> anything running. A typical thing to do is to go to each machine in a
> cluster and look for a user activity:
> grep "user" log/stats.readers
> exit
> nothing more. And every once in a while that hangs after executing the
> logout sequence. With the patch it hasn't to date.
>
> That doesn't mean it's a fix, I don't see it every day, I just haven't
> seen it in a few days since I put in the patch.
The ssh hang on exit "problem" is a policy of the ssh coders. It'll
happen when you have a background job still running when you exit, which
is still connected to the terminal.
As I said, it's an ssh policy issue (which many people disagree with)
and not a bug.
-tony
* Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,
2003-02-04 9:58 ` Tony Gale
@ 2003-02-04 14:12 ` David Ford
2003-02-04 14:40 ` Bill Davidsen
1 sibling, 0 replies; 23+ messages in thread
From: David Ford @ 2003-02-04 14:12 UTC (permalink / raw)
To: Tony Gale; +Cc: Bill Davidsen, linux-kernel
I've run into this often with a background job however .. it also hangs
when there isn't any background job. I suspect there definitely is a bug.
David
Tony Gale wrote:
>On Mon, 2003-02-03 at 21:04, Bill Davidsen wrote:
>
>
>>That is a problem with processes left running. I do not forward
>>connections, I do not forward X, I do not (in normal practice) leave
>>anything running. A typical thing to do is to go to each machine in a
>>cluster and look for a user activity:
>> grep "user" log/stats.readers
>> exit
>>nothing more. And every once in a while that hangs after executing the
>>logout sequence. With the patch it hasn't to date.
>>
>>That doesn't mean it's a fix, I don't see it every day, I just haven't
>>seen it in a few days since I put in the patch.
>>
>>
>
>The ssh hang on exit "problem" is a policy of the ssh coders. It'll
>happen when you have a background job still running when you exit, which
>is still connected to the terminal.
>
>As I said, it's an ssh policy issue (which many people disagree with)
>and not a bug.
>
>-tony
>
>
* Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,
2003-02-04 9:58 ` Tony Gale
2003-02-04 14:12 ` David Ford
@ 2003-02-04 14:40 ` Bill Davidsen
1 sibling, 0 replies; 23+ messages in thread
From: Bill Davidsen @ 2003-02-04 14:40 UTC (permalink / raw)
To: Tony Gale; +Cc: linux-kernel
On 4 Feb 2003, Tony Gale wrote:
> On Mon, 2003-02-03 at 21:04, Bill Davidsen wrote:
> >
> > That is a problem with processes left running. I do not forward
> > connections, I do not forward X, I do not (in normal practice) leave
> > anything running. A typical thing to do is to go to each machine in a
> > cluster and look for a user activity:
> > grep "user" log/stats.readers
> > exit
> > nothing more. And every once in a while that hangs after executing the
> > logout sequence. With the patch it hasn't to date.
> >
> > That doesn't mean it's a fix, I don't see it every day, I just haven't
> > seen it in a few days since I put in the patch.
>
> The ssh hang on exit "problem" is a policy of the ssh coders. It'll
> happen when you have a background job still running when you exit, which
> is still connected to the terminal.
Please go back and reread either of my comments on the topic, I think I've
made it clear that I have no background jobs, no forwarded ports, and no
forwarded X. The existence of a problem in one area doesn't mean that
nothing else is allowed to cause bad behaviour.
> As I said, it's an ssh policy issue (which many people disagree with)
> and not a bug.
--
bill davidsen <davidsen@tmr.com>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
* Re: [TEST FIX] Re: SSH Hangs in 2.5.59 and 2.5.55 but not 2.4.x,
@ 2003-02-04 19:54 jjs
0 siblings, 0 replies; 23+ messages in thread
From: jjs @ 2003-02-04 19:54 UTC (permalink / raw)
To: linux kernel
Tony Gale wrote:
>
> The ssh hang on exit "problem" is a policy of the ssh coders. It'll
> happen when you have a background job still running when you exit, which
> is still connected to the terminal.
>
> As I said, it's an ssh policy issue (which many people disagree with)
> and not a bug.
>
So, admin logs in and restarts a process -
a very very common task. oops, can't log out.
Sure sounds like a thinko to me, if not a bug.
Demoronized openssh packages for
suse and redhat are available by
popular request from:
ftp.mainphrame.com/pub/openssh
Joe