Re: Linux 2.6.0-test9

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Re: Linux 2.6.0-test9
@ 2003-10-26 19:40 Andries.Brouwer
  2003-10-27  0:01 ` Andrew Morton
  0 siblings, 1 reply; 14+ messages in thread
From: Andries.Brouwer @ 2003-10-26 19:40 UTC (permalink / raw)
  To: Andries.Brouwer, torvalds; +Cc: linux-kernel, netdev

    From: Linus Torvalds <torvalds@osdl.org>

    > I have run 2.6.0-test6 without any problems. Switched
    > to 2.6.0-test9 today. Something involving job control
    > or so is broken. Several of my remote xterms hang.

    Btw, this one sounds like a known bug in bash.

No - it is a recent kernel bug.
Mikael Pettersson noticed precisely the same thing, and says
 "Reverting 2.6.0-test8-bk3's net/ipv4/tcp.c patch fixes
  these problems."
And so it does.

Andries

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Linux 2.6.0-test9
  2003-10-26 19:40 Linux 2.6.0-test9 Andries.Brouwer
@ 2003-10-27  0:01 ` Andrew Morton
  2003-10-27  0:21   ` Linus Torvalds
  0 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2003-10-27  0:01 UTC (permalink / raw)
  To: Andries.Brouwer; +Cc: torvalds, linux-kernel, netdev

Andries.Brouwer@cwi.nl wrote:
>
>     From: Linus Torvalds <torvalds@osdl.org>
> 
>     > I have run 2.6.0-test6 without any problems. Switched
>     > to 2.6.0-test9 today. Something involving job control
>     > or so is broken. Several of my remote xterms hang.
> 
>     Btw, this one sounds like a known bug in bash.
> 
> No - it is a recent kernel bug.
> Mikael Pettersson noticed precisely the same thing, and says
>  "Reverting 2.6.0-test8-bk3's net/ipv4/tcp.c patch fixes
>   these problems."
> And so it does.
> 

Can someone show us the diff for this?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Linux 2.6.0-test9
  2003-10-27  0:01 ` Andrew Morton
@ 2003-10-27  0:21   ` Linus Torvalds
  2003-10-27  0:28     ` Linus Torvalds
  2003-10-27 19:36     ` kuznet
  0 siblings, 2 replies; 14+ messages in thread
From: Linus Torvalds @ 2003-10-27  0:21 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andries.Brouwer, Kernel Mailing List, netdev, David S. Miller,
	kuznet

On Sun, 26 Oct 2003, Andrew Morton wrote:
> 
> Can someone show us the diff for this?

There's only one change to tcp.c after -test8: it's 

	kuznet@ms2.inr.ac.ru:
	 TCP: do not return -EINTR, when data are available for read()

and I think it should just be reverted: the changeset even removes the 
comment that clearly says:

		/* We need to check signals first, to get correct SIGURG
		 * handling. FIXME: Need to check this doesn't impact 1003.1g
		 * and move it down to the bottom of the loop
		 */

And Alexey apparently tried to do the "FIXME" part, but without thinking 
about the SIGURG part.

We _need_ to stop at urgent data and we _should_ return -EINTR, and let
the SIGURG handler do the URG read. Otherwise we'll lose urgent data (or
we'll just read it inline without realizing that it was urgent data).

I don't know anybody sane who uses urgent data (telnet and rlogin do, I
don't know if they count as sane any more), but it does look like that
patch totally broke it.

I'd revert it myself, but since the networking code is fairly well 
maintained, I'll just wait for David and Alexey to weigh in, and tell me 
that I'm a total moron and I overlooked something. But I don't think I 
missed anything.

Andries, what was the situation that led to a TCP lockup? I don't see 
anything but URG being broken by that patch, so it would be good to verify 
that your breakage really was URG, to see that it's totally understood..

		Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Linux 2.6.0-test9
  2003-10-27  0:21   ` Linus Torvalds
@ 2003-10-27  0:28     ` Linus Torvalds
  2003-10-27  6:43       ` David S. Miller
  2003-10-27 19:36     ` kuznet
  1 sibling, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2003-10-27  0:28 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Andries.Brouwer, Kernel Mailing List, netdev, David S. Miller,
	kuznet

On Sun, 26 Oct 2003, Linus Torvalds wrote:
> 
> Andries, what was the situation that led to a TCP lockup? I don't see 
> anything but URG being broken by that patch, so it would be good to verify 
> that your breakage really was URG, to see that it's totally understood..

Btw, final comment: if it really is URG-only breakage, then instead of 
reverting the patch (if it had some other reasons going for it), we could 
change the URG test at the top of the loop from

	if (copied && tp->urg_data && tp->urg_seq == *seq)
		break

to

	if (tp->urg_data && tp->urg_seq == *seq) {
		if (copied)
			break;
		if (signal_pending(current)) {
			copied = timeo ? sock_intr_errno(timeo) : -EAGAIN;
			break;
		}
	}

which gives the right break semantics for URG data.

After that, the only other place where we should check for signal pending 
is probably at the tcp_data_wait() call. All the other signal pending 
checks seem bogus (ie right now a pending signal will mean that we avoid 
doing even TCP-level cleanups, which looks just wrong).

But reverting the change is clearly the "safer" thing to do, I just worry 
that Alexey might have had a real reason for tryign to avoid the EINTR in 
the first place (for non-URG data).

		Linus

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Linux 2.6.0-test9
@ 2003-10-27  1:48 Andries.Brouwer
  2003-10-27  2:10 ` Linus Torvalds
  0 siblings, 1 reply; 14+ messages in thread
From: Andries.Brouwer @ 2003-10-27  1:48 UTC (permalink / raw)
  To: akpm, torvalds; +Cc: Andries.Brouwer, davem, kuznet, linux-kernel, netdev

> Andries, what was the situation that led to a TCP lockup?

rlogin followed by "emacs -nw".

rlogin uses SIGURG for communication.

It is not the TCP protocol that is locked up.
Keystrokes are transmitted and the results are sent back.
But the reader half of rlogin hangs.

Andries

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Linux 2.6.0-test9
  2003-10-27  1:48 Andries.Brouwer
@ 2003-10-27  2:10 ` Linus Torvalds
  2003-10-27  9:40   ` David S. Miller
  0 siblings, 1 reply; 14+ messages in thread
From: Linus Torvalds @ 2003-10-27  2:10 UTC (permalink / raw)
  To: Andries.Brouwer; +Cc: akpm, davem, kuznet, linux-kernel, netdev

On Mon, 27 Oct 2003 Andries.Brouwer@cwi.nl wrote:
> 
> rlogin followed by "emacs -nw".

Ok. I bet I've never seen it partly because I only use ssh (I don't even
allow rlogin to any of my machines). But you're right, rlogin certainly
not only uses OOB data, but uses SIGURG itself. I would actually expect 
that if we delay the SIGURG until after we've read the URG data, the child 
process that wants to actually read the URG data will trivially hang, 
waiting for it.

If this is easily repeatable for you, can you test just applying this
patch on top of plain -test9? It's not the patch I'd actually do in real 
life, but it's the minimal patch to verify that it's really SIGURG and 
urgent data that is the thing you see. Sounds very likely, but it would be 
good to really verify.

I think that this is, btw, the _right_ place for checking that SIGURG
anyway.

The case of being at urgent data really is a special case, and I think it
was a mistake to try to have the signal_pending() check in a common code
sequence - it's really two totally differenct cases when we check for
"should we stop here due to SIGURG handling", or "should we return because
we would have to wait for more data and we have a signal pending".

Yes, both cases test for "signal_pending(current)", but the SIGURG case
really could test for just "do we have SIGURG pending", not just "any
signal".

		Linus

--- 1.49/net/ipv4/tcp.c	Mon Oct 20 22:27:42 2003
+++ edited/net/ipv4/tcp.c	Sun Oct 26 17:59:14 2003
@@ -1536,9 +1536,15 @@
 		struct sk_buff *skb;
 		u32 offset;

-		/* Are we at urgent data? Stop if we have read anything. */
-		if (copied && tp->urg_data && tp->urg_seq == *seq)
-			break;
+		/* Are we at urgent data? Stop if we have read anything or have SIGURG pending. */
+		if (tp->urg_data && tp->urg_seq == *seq) {
+			if (copied)
+				break;
+			if (signal_pending(current)) {
+				copied = timeo ? sock_intr_errno(timeo) : -EAGAIN;
+				break;
+			}
+		}

 		/* Next get a buffer. */

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Linux 2.6.0-test9
  2003-10-27  0:28     ` Linus Torvalds
@ 2003-10-27  6:43       ` David S. Miller
  2003-10-27 19:54         ` kuznet
  0 siblings, 1 reply; 14+ messages in thread
From: David S. Miller @ 2003-10-27  6:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: akpm, Andries.Brouwer, linux-kernel, netdev, kuznet

On Sun, 26 Oct 2003 16:28:11 -0800 (PST)
Linus Torvalds <torvalds@osdl.org> wrote:

> But reverting the change is clearly the "safer" thing to do, I just worry 
> that Alexey might have had a real reason for tryign to avoid the EINTR in 
> the first place (for non-URG data).

I'd like to hear something from Alexey first.

The problem we were trying to deal with was that when data
is available to read a lot of people were complaining that
we return -EINTR and no other system does this.

This is heavily inconsistent with how we handle every other
type of socket error.  In all other cases, a read() when data
is available will succeed until the very last byte is sucked
out of the socket, then any subsequent read() call after the
queue is emptied will return the error.

But I am starting to see that URG is different.  It is not like
other socket errors that halt the socket and make no new data
arrive after it happens.  Rather, URG can happen just about anywhere
and more data can continue to flow into the socket buffers.

In fact, this means that our change can result in an application
can never see the error if data continues to arrive faster than
the application can pull it out, see?

Alexey, I think we did not understand this case fully when making this
change.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Linux 2.6.0-test9
  2003-10-27  2:10 ` Linus Torvalds
@ 2003-10-27  9:40   ` David S. Miller
  0 siblings, 0 replies; 14+ messages in thread
From: David S. Miller @ 2003-10-27  9:40 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andries.Brouwer, akpm, kuznet, linux-kernel, netdev

I'm going to just revert the changeset in the networking
fixes I send to Linus today.

If we resolve this some other way, that's fine and the
original change is in the revision history for people
to refer to.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Linux 2.6.0-test9
@ 2003-10-27  9:47 Mikael Pettersson
  0 siblings, 0 replies; 14+ messages in thread
From: Mikael Pettersson @ 2003-10-27  9:47 UTC (permalink / raw)
  To: Andries.Brouwer, torvalds; +Cc: akpm, davem, kuznet, linux-kernel, netdev

On Sun, 26 Oct 2003 18:10:03 -0800 (PST), Linus Torvalds wrote:
>On Mon, 27 Oct 2003 Andries.Brouwer@cwi.nl wrote:
>> 
>> rlogin followed by "emacs -nw".
>
>Ok. I bet I've never seen it partly because I only use ssh (I don't even
>allow rlogin to any of my machines). But you're right, rlogin certainly
>not only uses OOB data, but uses SIGURG itself. I would actually expect 
>that if we delay the SIGURG until after we've read the URG data, the child 
>process that wants to actually read the URG data will trivially hang, 
>waiting for it.
>
>If this is easily repeatable for you, can you test just applying this
>patch on top of plain -test9? It's not the patch I'd actually do in real 
>life, but it's the minimal patch to verify that it's really SIGURG and 
>urgent data that is the thing you see. Sounds very likely, but it would be 
>good to really verify.

This patch does fix the rlogin + emacs -nw problems.

/Mikael

>--- 1.49/net/ipv4/tcp.c	Mon Oct 20 22:27:42 2003
>+++ edited/net/ipv4/tcp.c	Sun Oct 26 17:59:14 2003
>@@ -1536,9 +1536,15 @@
> 		struct sk_buff *skb;
> 		u32 offset;
> 
>-		/* Are we at urgent data? Stop if we have read anything. */
>-		if (copied && tp->urg_data && tp->urg_seq == *seq)
>-			break;
>+		/* Are we at urgent data? Stop if we have read anything or have SIGURG pending. */
>+		if (tp->urg_data && tp->urg_seq == *seq) {
>+			if (copied)
>+				break;
>+			if (signal_pending(current)) {
>+				copied = timeo ? sock_intr_errno(timeo) : -EAGAIN;
>+				break;
>+			}
>+		}
> 
> 		/* Next get a buffer. */
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Linux 2.6.0-test9
@ 2003-10-27 10:58 Andries.Brouwer
  0 siblings, 0 replies; 14+ messages in thread
From: Andries.Brouwer @ 2003-10-27 10:58 UTC (permalink / raw)
  To: Andries.Brouwer, torvalds; +Cc: akpm, davem, kuznet, linux-kernel, netdev

> If this is easily repeatable for you, can you test just applying this
> patch on top of plain -test9?

Yes, that works.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Linux 2.6.0-test9
  2003-10-27  0:21   ` Linus Torvalds
  2003-10-27  0:28     ` Linus Torvalds
@ 2003-10-27 19:36     ` kuznet
  2003-10-28  0:42       ` Tommy Christensen
  1 sibling, 1 reply; 14+ messages in thread
From: kuznet @ 2003-10-27 19:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: akpm, Andries.Brouwer, Kernel Mailing List, netdev,
	David S. Miller, kuznet

Hello!

> And Alexey apparently tried to do the "FIXME" part, but without thinking 
> about the SIGURG part.

Actually, it was thought a lot for several linux-2.x. :-)


> We _need_ to stop at urgent data and we _should_ return -EINTR, and let
> the SIGURG handler do the URG read. Otherwise we'll lose urgent data (or
> we'll just read it inline without realizing that it was urgent data).

The patch was expected not to break this property. Alas, something
is overlooked yet. I still do not understand what exactly is broken,
I feel I have to find some rlogin to experiment in vivo.

Alexey

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Linux 2.6.0-test9
  2003-10-27  6:43       ` David S. Miller
@ 2003-10-27 19:54         ` kuznet
  0 siblings, 0 replies; 14+ messages in thread
From: kuznet @ 2003-10-27 19:54 UTC (permalink / raw)
  To: David S. Miller
  Cc: Linus Torvalds, akpm, Andries.Brouwer, linux-kernel, netdev,
	kuznet

Hello!


> But I am starting to see that URG is different.  It is not like
> other socket errors that halt the socket and make no new data
> arrive after it happens.  Rather, URG can happen just about anywhere
> and more data can continue to flow into the socket buffers.
> 
> In fact, this means that our change can result in an application
> can never see the error if data continues to arrive faster than
> the application can pull it out, see?
> 
> Alexey, I think we did not understand this case fully when making this
> change.

No doubts taking into account that the things are so badly broken. :-)

But I was aware about this problem and even "proved" to you that the patch
will not change anything. :-) Remember this?


> I found only one more or less essential difference: when we have one oob
> octet in the stream with data following it, read() can return -EINTR/RESTART
> due to pending SIGURG and normal data will be read at the second read(),
> or, alternatively, it can return data following oob octet immediately.
>
> In practice there is no real difference: SIGURG is processed before read()
> returns in any case and even more, with restartable signals it is just
> exactly no difference.


Actually, I considered the same thing:

>--- 1.49/net/ipv4/tcp.c	Mon Oct 20 22:27:42 2003
>+++ edited/net/ipv4/tcp.c	Sun Oct 26 17:59:14 2003
>@@ -1536,9 +1536,15 @@
> 		struct sk_buff *skb;
> 		u32 offset;
> 
>-		/* Are we at urgent data? Stop if we have read anything. */
>-		if (copied && tp->urg_data && tp->urg_seq == *seq)
>-			break;
>+		/* Are we at urgent data? Stop if we have read anything or have SIGURG pending. */
>+		if (tp->urg_data && tp->urg_seq == *seq) {
>+			if (copied)
>+				break;
>+			if (signal_pending(current)) {
>+				copied = timeo ? sock_intr_errno(timeo) : -EAGAIN;
>+				break;
>+			}
>+		}
> 
> 		/* Next get a buffer. */
> 

as a reliable workaround for SIGURG situation, but then "figured out"
that this is not necessary (which was mistake). Though at the moment I still
do not understand what exactly is so wrong with that my argument, the signal
should be processed before read() returns, should not it? Maybe, rlogin
longjmp()s or something like this...

Alexey

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Linux 2.6.0-test9
  2003-10-27 19:36     ` kuznet
@ 2003-10-28  0:42       ` Tommy Christensen
  2003-10-28 18:25         ` kuznet
  0 siblings, 1 reply; 14+ messages in thread
From: Tommy Christensen @ 2003-10-28  0:42 UTC (permalink / raw)
  To: kuznet
  Cc: Linus Torvalds, akpm, Andries.Brouwer, Kernel Mailing List,
	netdev, David S. Miller

kuznet@ms2.inr.ac.ru wrote:
> Hello!
> 
> 
>>And Alexey apparently tried to do the "FIXME" part, but without thinking 
>>about the SIGURG part.
> 
> 
> Actually, it was thought a lot for several linux-2.x. :-)
> 
> 
> 
>>We _need_ to stop at urgent data and we _should_ return -EINTR, and let
>>the SIGURG handler do the URG read. Otherwise we'll lose urgent data (or
>>we'll just read it inline without realizing that it was urgent data).
> 
> 
> The patch was expected not to break this property. Alas, something
> is overlooked yet. I still do not understand what exactly is broken,
> I feel I have to find some rlogin to experiment in vivo.

Hi Alexey

I think the patch breaks things because it consumes (or rather skips)
the urgent data ( in the code after the label found_ok_skb: ).

Since this happens before the SIGURG handler is run, it won't find
any urgent data.

What do you think?

The patch by Linus seems to be fine though.

-Tommy

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Linux 2.6.0-test9
  2003-10-28  0:42       ` Tommy Christensen
@ 2003-10-28 18:25         ` kuznet
  0 siblings, 0 replies; 14+ messages in thread
From: kuznet @ 2003-10-28 18:25 UTC (permalink / raw)
  To: Tommy Christensen
  Cc: Linus Torvalds, akpm, Andries.Brouwer, Kernel Mailing List,
	netdev, David S. Miller

Hello!

> I think the patch breaks things because it consumes (or rather skips)
> the urgent data ( in the code after the label found_ok_skb: ).
> 
> Since this happens before the SIGURG handler is run, it won't find
> any urgent data.
> 
> What do you think?

Yes, you are absolutely right. I missed exactly this thing.


> The patch by Linus seems to be fine though.

I think the patch suggested by Linus is 100% correct and
in fact it is the only solution.

Alexey

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2003-10-28 18:25 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-10-26 19:40 Linux 2.6.0-test9 Andries.Brouwer
2003-10-27  0:01 ` Andrew Morton
2003-10-27  0:21   ` Linus Torvalds
2003-10-27  0:28     ` Linus Torvalds
2003-10-27  6:43       ` David S. Miller
2003-10-27 19:54         ` kuznet
2003-10-27 19:36     ` kuznet
2003-10-28  0:42       ` Tommy Christensen
2003-10-28 18:25         ` kuznet
  -- strict thread matches above, loose matches on Subject: below --
2003-10-27  1:48 Andries.Brouwer
2003-10-27  2:10 ` Linus Torvalds
2003-10-27  9:40   ` David S. Miller
2003-10-27  9:47 Mikael Pettersson
2003-10-27 10:58 Andries.Brouwer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).