netdev.vger.kernel.org archive mirror
* [PATCH] TCP FIN gets dropped prematurely, results in ack storm
@ 2007-05-01 15:13 Benjamin LaHaise
  2007-05-01 16:20 ` Evgeniy Polyakov
  2007-05-01 18:20 ` Evgeniy Polyakov
  0 siblings, 2 replies; 15+ messages in thread
From: Benjamin LaHaise @ 2007-05-01 15:13 UTC
  To: David Miller; +Cc: netdev

Hello,

While testing a failover scenario, I managed to trigger an ack storm 
between a Linux box and another system.  Although the cause of this particular 
ACK storm was due to the other box forgetting that it sent out a FIN (the 
second node was unaware of the FIN the first sent in its dying gasp, which 
is what I'm trying to fix, but it's a tricky race), the resulting Linux 
behaviour wasn't very robust.  Is there any particularly good reason that 
FIN flag gets cleared on a connection which is being shut down?  The trace 
that motivates this can be seen at 
http://www.kvack.org/~bcrl/ack-storm.log .  As near as I can tell, a 
similar effect can occur between two Linux boxes if the right packets get 
reordered/dropped during connection teardown.

		-ben

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 0faacf9..1e54291 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -635,9 +635,9 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len, unsigned int mss
 	TCP_SKB_CB(buff)->end_seq = TCP_SKB_CB(skb)->end_seq;
 	TCP_SKB_CB(skb)->end_seq = TCP_SKB_CB(buff)->seq;
 
-	/* PSH and FIN should only be set in the second packet. */
+	/* PSH should only be set in the second packet. */
 	flags = TCP_SKB_CB(skb)->flags;
-	TCP_SKB_CB(skb)->flags = flags & ~(TCPCB_FLAG_FIN|TCPCB_FLAG_PSH);
+	TCP_SKB_CB(skb)->flags = flags & ~(TCPCB_FLAG_PSH);
 	TCP_SKB_CB(buff)->flags = flags;
 	TCP_SKB_CB(buff)->sacked = TCP_SKB_CB(skb)->sacked;
 	TCP_SKB_CB(skb)->sacked &= ~TCPCB_AT_TAIL;
@@ -1124,9 +1124,9 @@ static int tso_fragment(struct sock *sk, struct sk_buff *skb, unsigned int len,
 	TCP_SKB_CB(buff)->end_seq = TCP_SKB_CB(skb)->end_seq;
 	TCP_SKB_CB(skb)->end_seq = TCP_SKB_CB(buff)->seq;
 
-	/* PSH and FIN should only be set in the second packet. */
+	/* PSH should only be set in the second packet. */
 	flags = TCP_SKB_CB(skb)->flags;
-	TCP_SKB_CB(skb)->flags = flags & ~(TCPCB_FLAG_FIN|TCPCB_FLAG_PSH);
+	TCP_SKB_CB(skb)->flags = flags & ~(TCPCB_FLAG_PSH);
 	TCP_SKB_CB(buff)->flags = flags;
 
 	/* This packet was never sent out yet, so no SACK bits. */
@@ -1308,7 +1308,7 @@ static int tcp_mtu_probe(struct sock *sk)
 			sk_stream_free_skb(sk, skb);
 		} else {
 			TCP_SKB_CB(nskb)->flags |= TCP_SKB_CB(skb)->flags &
-						   ~(TCPCB_FLAG_FIN|TCPCB_FLAG_PSH);
+						   ~(TCPCB_FLAG_PSH);
 			if (!skb_shinfo(skb)->nr_frags) {
 				skb_pull(skb, copy);
 				if (skb->ip_summed != CHECKSUM_PARTIAL)


* Re: [PATCH] TCP FIN gets dropped prematurely, results in ack storm
  2007-05-01 15:13 [PATCH] TCP FIN gets dropped prematurely, results in ack storm Benjamin LaHaise
@ 2007-05-01 16:20 ` Evgeniy Polyakov
  2007-05-01 16:49   ` Benjamin LaHaise
  2007-05-01 18:20 ` Evgeniy Polyakov
  1 sibling, 1 reply; 15+ messages in thread
From: Evgeniy Polyakov @ 2007-05-01 16:20 UTC
  To: Benjamin LaHaise; +Cc: David Miller, netdev

On Tue, May 01, 2007 at 11:13:54AM -0400, Benjamin LaHaise (bcrl@kvack.org) wrote:
> Hello,

Hi Benjamin.

> While testing a failover scenario, I managed to trigger an ack storm 
> between a Linux box and another system.  Although the cause of this particular 
> ACK storm was due to the other box forgetting that it sent out a FIN (the 
> second node was unaware of the FIN the first sent in its dying gasp, which 
> is what I'm trying to fix, but it's a tricky race), the resulting Linux 
> behaviour wasn't very robust.  Is there any particularly good reason that 
> FIN flag gets cleared on a connection which is being shut down?  The trace 
> that motivates this can be seen at 
> http://www.kvack.org/~bcrl/ack-storm.log .  As near as I can tell, a 
> similar effect can occur between two Linux boxes if the right packets get 
> reordered/dropped during connection teardown.

Could you compress the 24 MB file, or cut the relevant bits out of it?

> +++ b/net/ipv4/tcp_output.c
> -	TCP_SKB_CB(skb)->flags = flags & ~(TCPCB_FLAG_FIN|TCPCB_FLAG_PSH);
> +	TCP_SKB_CB(skb)->flags = flags & ~(TCPCB_FLAG_PSH);

Won't it break RFC 793, which says:

while the FIN is considered
to occur after the last actual data octet in a segment in which it occurs.

...

In this case, a FIN segment can be constructed and placed on the
outgoing segment queue.  No further SENDs from the user will be
accepted by the TCP, and it enters the FIN-WAIT-1 state.
RECEIVEs are allowed in this state.  All segments preceding and
including FIN will be retransmitted until acknowledged.  When the
other TCP has both acknowledged the FIN and sent a FIN of its
own, the first TCP can ACK this FIN.  Note that a TCP receiving
a FIN will ACK but not send its own FIN until its user has
CLOSED the connection also.

With your patch, several packets with the FIN bit set might be sent,
including one with data. If the other host does not receive the FIN
retransmit, then its logic is broken, and that cannot be fixed by
duplicating FINs. I would even say that the remote box should drop the
second packet with FIN, even though it can carry data, which would break
the higher-level connection logic.
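
To make the concern concrete, here is a minimal user-space sketch of the
split. It is not the kernel code: the struct, the flag values and the helper
names are made up for illustration. The only idea taken from the discussion
is that a FIN sits right after the last data octet of the segment carrying
it, so the first fragment has to give the flag up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag values; these are not the kernel's TCPCB_FLAG_* constants. */
#define SEG_FIN 0x01u
#define SEG_PSH 0x08u

/* Hypothetical stand-in for the parts of an skb that matter here. */
struct segment {
	uint32_t seq;      /* sequence number of the first payload octet */
	uint32_t payload;  /* number of payload octets */
	unsigned flags;
};

/* Where the receiver places the FIN: right after the last data octet. */
static uint32_t fin_position(const struct segment *s)
{
	return s->seq + s->payload;
}

/*
 * Split 'first' after 'len' octets; the remainder goes into 'second'.
 * keep_fin_on_first = false mimics the existing behaviour,
 * keep_fin_on_first = true mimics what the proposed patch would do.
 */
static void split_segment(struct segment *first, struct segment *second,
			  uint32_t len, bool keep_fin_on_first)
{
	unsigned flags = first->flags;

	assert(len > 0 && len < first->payload);

	second->seq = first->seq + len;
	second->payload = first->payload - len;
	second->flags = flags;		/* the tail keeps FIN and PSH */

	first->payload = len;
	first->flags = flags & ~SEG_PSH;
	if (!keep_fin_on_first)
		first->flags &= ~SEG_FIN;
}
```

Splitting a 100-octet FIN segment that starts at 1000 after 40 octets leaves
the FIN at 1100, on the tail only. If the head keeps the flag, it advertises
a second FIN at 1040, in the middle of the data.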

-- 
	Evgeniy Polyakov


* Re: [PATCH] TCP FIN gets dropped prematurely, results in ack storm
  2007-05-01 16:20 ` Evgeniy Polyakov
@ 2007-05-01 16:49   ` Benjamin LaHaise
  2007-05-01 17:41     ` Evgeniy Polyakov
  2007-05-01 17:54     ` John Heffner
  0 siblings, 2 replies; 15+ messages in thread
From: Benjamin LaHaise @ 2007-05-01 16:49 UTC
  To: Evgeniy Polyakov; +Cc: David Miller, netdev

On Tue, May 01, 2007 at 08:20:50PM +0400, Evgeniy Polyakov wrote:
> > http://www.kvack.org/~bcrl/ack-storm.log .  As near as I can tell, a 
> > similar effect can occur between two Linux boxes if the right packets get 
> > reordered/dropped during connection teardown.
> 
> Could you archive 24Mb file or cut more precise bits out of it?

The interesting bits are the first 10 lines.

> According to your patch, several packets with fin bit might be sent,
> including one with data. If another host does not receive fin
> retransmit, then that logic is broken, and it can not be fixed by
> duplicating fins, I would even say, that remote box should drop second
> packet with fin, while it can carry data, which will break higher
> connection logic.

The FIN hasn't been ack'd by the other side, though, and yet Linux is no 
longer transmitting packets with it set.  Read the beginning of the trace.

		-ben
-- 
"Time is of no importance, Mr. President, only life is important."
Don't Email: <zyntrop@kvack.org>.


* Re: [PATCH] TCP FIN gets dropped prematurely, results in ack storm
  2007-05-01 16:49   ` Benjamin LaHaise
@ 2007-05-01 17:41     ` Evgeniy Polyakov
  2007-05-01 17:53       ` Benjamin LaHaise
  2007-05-01 17:57       ` Evgeniy Polyakov
  2007-05-01 17:54     ` John Heffner
  1 sibling, 2 replies; 15+ messages in thread
From: Evgeniy Polyakov @ 2007-05-01 17:41 UTC
  To: Benjamin LaHaise; +Cc: David Miller, netdev

On Tue, May 01, 2007 at 12:49:35PM -0400, Benjamin LaHaise (bcrl@kvack.org) wrote:
> On Tue, May 01, 2007 at 08:20:50PM +0400, Evgeniy Polyakov wrote:
> > > http://www.kvack.org/~bcrl/ack-storm.log .  As near as I can tell, a 
> > > similar effect can occur between two Linux boxes if the right packets get 
> > > reordered/dropped during connection teardown.
> > 
> > Could you archive 24Mb file or cut more precise bits out of it?
> 
> The interesting bits are the first 10 lines.
> 
> > According to your patch, several packets with fin bit might be sent,
> > including one with data. If another host does not receive fin
> > retransmit, then that logic is broken, and it can not be fixed by
> > duplicating fins, I would even say, that remote box should drop second
> > packet with fin, while it can carry data, which will break higher
> > connection logic.
> 
> The FIN hasn't been ack'd by the other side, though and yet Linux is no 
> longer transmitting packets with it sent.  Read the beginning of the trace.

Hmm, the 2.2 machine in your test seems to behave incorrectly:

22>11: 2624175182:2624175182(0) ack 1562038077 ack
22>11: 2624175182:2624175182(0) ack 1562038077 fin

11>22: 1562038077:1562038077(0) ack 2624175183 fin
11>22: 1562038077:1562038077(0) ack 2624175183 fin
// Retransmit after 0.3 seconds. Since there was no ack, either the ack
// was dropped, or the first fin was dropped on the wire.
// In the former case 22 is in CLOSING, in the latter case FIN-WAIT-1.

22>11: 2624175182:2624175182(0) ack 1562038077 ack
// What is this ack for? It should have sequence number +1, since the fin
// was sent.
11>22: 1562038078:1562038078(0) ack 2624175183 ack
// 11 answers that this ack is bogus and that it wants 2624175183.

22>11: 2624175182:2624175182(0) ack 1562038077 ack
11>22: 1562038078:1562038078(0) ack 2624175183 ack

and so on...

I think if you storm any system with acks lower than the expected
unacknowledged number, the result will be the same: an ack saying that the
message was bogus. If the sending system sends the wrong ack again, it will
again be told that it was bogus...

-- 
	Evgeniy Polyakov


* Re: [PATCH] TCP FIN gets dropped prematurely, results in ack storm
  2007-05-01 17:41     ` Evgeniy Polyakov
@ 2007-05-01 17:53       ` Benjamin LaHaise
  2007-05-01 18:03         ` John Heffner
  2007-05-01 17:57       ` Evgeniy Polyakov
  1 sibling, 1 reply; 15+ messages in thread
From: Benjamin LaHaise @ 2007-05-01 17:53 UTC
  To: Evgeniy Polyakov; +Cc: David Miller, netdev

On Tue, May 01, 2007 at 09:41:28PM +0400, Evgeniy Polyakov wrote:
> Hmm, 2.2 machine in your test seems to behave incorrectly:

I am aware of that.  However, I think that the loss of certain packets and 
reordering can result in the same behaviour.  What's more, this 
behaviour can occur in real deployed systems.  "Be strict in what you send 
and liberal in what you accept."  Both systems should be fixed, which is 
what I'm trying to do.

		-ben
-- 
"Time is of no importance, Mr. President, only life is important."
Don't Email: <zyntrop@kvack.org>.


* Re: [PATCH] TCP FIN gets dropped prematurely, results in ack storm
  2007-05-01 16:49   ` Benjamin LaHaise
  2007-05-01 17:41     ` Evgeniy Polyakov
@ 2007-05-01 17:54     ` John Heffner
  2007-05-01 18:04       ` Benjamin LaHaise
  1 sibling, 1 reply; 15+ messages in thread
From: John Heffner @ 2007-05-01 17:54 UTC
  To: Benjamin LaHaise; +Cc: Evgeniy Polyakov, David Miller, netdev

Benjamin LaHaise wrote:
>> According to your patch, several packets with fin bit might be sent,
>> including one with data. If another host does not receive fin
>> retransmit, then that logic is broken, and it can not be fixed by
>> duplicating fins, I would even say, that remote box should drop second
>> packet with fin, while it can carry data, which will break higher
>> connection logic.
> 
> The FIN hasn't been ack'd by the other side, though and yet Linux is no 
> longer transmitting packets with it sent.  Read the beginning of the trace.

I agree completely with Evgeniy.  The patch you sent would cause bad 
breakage by sending the FIN bit on segments with different sequence numbers.

Looking at your trace, it seems like the behavior of the test system 
192.168.2.2 is broken in two ways.  First, like you said it has broken 
state in that it has forgotten that it sent the FIN.  Once you do that, 
the connection state is corrupt and all bets are off.  It's sending an 
out-of-window segment that's getting tossed by Linux, and Linux 
generates an ack in response.  This is in direct RFC compliance.  The 
second problem is that the other system is generating these broken acks 
in response to the legitimate acks Linux is sending, causing the ack 
war.  I can't really guess why it's doing that...

You might be able to change Linux to prevent this ack war, but doing so 
would break RFC compliance, and given the buggy nature of the other end, 
it sounds to me like a bad idea.
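
The ack war described above can be reproduced with a toy model. This is
only a sketch under stated assumptions: the struct and function names are
invented, the window of 183 is borrowed from the "win 183" in the trace
purely as an example value, and the stale_reacker flag (a host that answers
every incoming ACK from its stale state) is a guess at the broken side's
behaviour, since its real cause is unknown. The healthy side follows the
RFC 793 rule of answering an unacceptable segment with an ACK.

```c
#include <stdbool.h>
#include <stdint.h>

struct seg {
	uint32_t seq;
	uint32_t ack;
};

struct host {
	uint32_t snd_nxt;
	uint32_t rcv_nxt;
	uint32_t rcv_wnd;
	bool stale_reacker;	/* modelling assumption for the broken stack */
};

/* Acceptability test for zero-length segments only (pure ACKs). */
static bool ack_acceptable(const struct host *h, const struct seg *s)
{
	if (h->rcv_wnd == 0)
		return s->seq == h->rcv_nxt;
	return (uint32_t)(s->seq - h->rcv_nxt) < h->rcv_wnd;
}

/* Returns true and fills 'out' if host 'h' answers the pure ACK 'in'. */
static bool respond(const struct host *h, const struct seg *in, struct seg *out)
{
	if (!h->stale_reacker && ack_acceptable(h, in))
		return false;	/* in-window pure ACK: nothing to send back */
	out->seq = h->snd_nxt;
	out->ack = h->rcv_nxt;
	return true;
}

/* Count packets exchanged, starting with 'first' sent to 'receiver'. */
static unsigned count_packets(const struct host *receiver,
			      const struct host *sender,
			      struct seg first, unsigned cap)
{
	unsigned sent = 1;	/* 'first' is already on the wire */
	struct seg reply;

	while (sent < cap && respond(receiver, &first, &reply)) {
		const struct host *tmp = receiver;

		receiver = sender;
		sender = tmp;
		first = reply;
		sent++;
	}
	return sent;
}
```

With the sequence numbers from the trace, the exchange only stops when it
hits the cap. Against a peer that remembers its FIN, the first ACK is
in-window and nothing further is sent.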

   -John


* Re: [PATCH] TCP FIN gets dropped prematurely, results in ack storm
  2007-05-01 17:41     ` Evgeniy Polyakov
  2007-05-01 17:53       ` Benjamin LaHaise
@ 2007-05-01 17:57       ` Evgeniy Polyakov
  2007-05-01 18:02         ` Evgeniy Polyakov
  1 sibling, 1 reply; 15+ messages in thread
From: Evgeniy Polyakov @ 2007-05-01 17:57 UTC
  To: Benjamin LaHaise; +Cc: David Miller, netdev

On Tue, May 01, 2007 at 09:41:28PM +0400, Evgeniy Polyakov (johnpol@2ka.mipt.ru) wrote:
> On Tue, May 01, 2007 at 12:49:35PM -0400, Benjamin LaHaise (bcrl@kvack.org) wrote:
> > On Tue, May 01, 2007 at 08:20:50PM +0400, Evgeniy Polyakov wrote:
> > > > http://www.kvack.org/~bcrl/ack-storm.log .  As near as I can tell, a 
> > > > similar effect can occur between two Linux boxes if the right packets get 
> > > > reordered/dropped during connection teardown.
> > > 
> > > Could you archive 24Mb file or cut more precise bits out of it?
> > 
> > The interesting bits are the first 10 lines.
> > 
> > > According to your patch, several packets with fin bit might be sent,
> > > including one with data. If another host does not receive fin
> > > retransmit, then that logic is broken, and it can not be fixed by
> > > duplicating fins, I would even say, that remote box should drop second
> > > packet with fin, while it can carry data, which will break higher
> > > connection logic.
> > 
> > The FIN hasn't been ack'd by the other side, though and yet Linux is no 
> > longer transmitting packets with it sent.  Read the beginning of the trace.
> 
> Hmm, 2.2 machine in your test seems to behave incorrectly:
> 
> 22>11: 2624175182:2624175182(0) ack 1562038077 ack
> 22>11: 2624175182:2624175182(0) ack 1562038077 fin
> 
> 11>22: 1562038077:1562038077(0) ack 2624175183 fin
> 11>22: 1562038077:1562038077(0) ack 2624175183 fin // retransmit after
> 0.3 seconds, since there was no ack, it was either dropped, or first fin
> was dropped in the wire
> In former case 22 is in closing, in latter case - fin-wait1
> 
> 22>11: 2624175182:2624175182(0) ack 1562038077 ack //what is this ack
> for? It should have sequence number +1, since fin was sent.
> 11>22: 1562038078:1562038078(0) ack 2624175183 ack
> 11 answers that this ack is bogus and it wants 2624175183
> 
> 22>11: 2624175182:2624175182(0) ack 1562038077 ack
> 11>22: 1562038078:1562038078(0) ack 2624175183 ack
> 
> and so on...
> 
> I think if you will storm any system with acks lower than expected
> unacknowledged number, result will be the same - ack, that it was bogus
> message, if sending system sends wrong ack again, it will again receive
> that it was bogus...

Wrong sequence number, of course.
The case I described was not exactly correct: most likely the broken side
does not know that its messages were lost (or it deliberately crafts such
packets), so it retransmits the same bogus packet with the wrong sequence
number, which was already acked.

As far as I can see, this is it:

/* step 1: check sequence number */
if (!tcp_sequence(tp, TCP_SKB_CB(skb)->seq, TCP_SKB_CB(skb)->end_seq)) {
	if (!th->rst)
		tcp_send_dupack(sk, skb);
	goto discard;
}

-- 
	Evgeniy Polyakov


* Re: [PATCH] TCP FIN gets dropped prematurely, results in ack storm
  2007-05-01 17:57       ` Evgeniy Polyakov
@ 2007-05-01 18:02         ` Evgeniy Polyakov
  0 siblings, 0 replies; 15+ messages in thread
From: Evgeniy Polyakov @ 2007-05-01 18:02 UTC
  To: Benjamin LaHaise; +Cc: David Miller, netdev

On Tue, May 01, 2007 at 09:57:18PM +0400, Evgeniy Polyakov (johnpol@2ka.mipt.ru) wrote:
> > I think if you will storm any system with acks lower than expected
> > unacknowledged number, result will be the same - ack, that it was bogus
> > message, if sending system sends wrong ack again, it will again receive
> > that it was bogus...
> 
> Wrong syn number of course.
> I described not exactly correct case - likely broken side does not know 
> that its messages were lost (or specially crafts such packets), so it 
> retransmit the same bogus packet with wrong syn, which already was acked.
> 
> As far as I can see, this is it:
> 
> /* step 1: check sequence number */
> if (!tcp_sequence(tp, TCP_SKB_CB(skb)->seq, TCP_SKB_CB(skb)->end_seq)) {
> 	if (!th->rst)
> 		tcp_send_dupack(sk, skb);
> 	goto discard;
> }

And this is part of the RFC Linux follows:

        There are four cases for the acceptability test for an incoming
        segment:

        Segment Receive  Test
        Length  Window
        ------- -------  -------------------------------------------

           0       0     SEG.SEQ = RCV.NXT

           0      >0     RCV.NXT =< SEG.SEQ < RCV.NXT+RCV.WND

          >0       0     not acceptable

          >0      >0     RCV.NXT =< SEG.SEQ < RCV.NXT+RCV.WND
                      or RCV.NXT =< SEG.SEQ+SEG.LEN-1 < RCV.NXT+RCV.WND

        If the RCV.WND is zero, no segments will be acceptable, but
        special allowance should be made to accept valid ACKs, URGs and
        RSTs.

        If an incoming segment is not acceptable, an acknowledgment
        should be sent in reply (unless the RST bit is set, if so drop
        the segment and return):

          <SEQ=SND.NXT><ACK=RCV.NXT><CTL=ACK>

        After sending the acknowledgment, drop the unacceptable segment
        and return.
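
The four cases in the table can be written down directly. The following is
a user-space sketch, not the kernel's tcp_sequence(): the function names are
made up, and the comparisons are done modulo 2^32 so that the test keeps
working when sequence numbers wrap.

```c
#include <stdbool.h>
#include <stdint.h>

/* RCV.NXT =< seq < RCV.NXT+RCV.WND, evaluated modulo 2^32. */
static bool in_window(uint32_t seq, uint32_t rcv_nxt, uint32_t rcv_wnd)
{
	return (uint32_t)(seq - rcv_nxt) < rcv_wnd;
}

/* The RFC 793 acceptability test, one branch per row of the table. */
static bool segment_acceptable(uint32_t seg_seq, uint32_t seg_len,
			       uint32_t rcv_nxt, uint32_t rcv_wnd)
{
	if (seg_len == 0) {
		if (rcv_wnd == 0)
			return seg_seq == rcv_nxt;
		return in_window(seg_seq, rcv_nxt, rcv_wnd);
	}
	if (rcv_wnd == 0)
		return false;
	/* Either the first or the last octet must fall inside the window. */
	return in_window(seg_seq, rcv_nxt, rcv_wnd) ||
	       in_window(seg_seq + seg_len - 1, rcv_nxt, rcv_wnd);
}
```

For the trace in this thread, RCV.NXT on the Linux side is 2624175183, so a
zero-length segment with SEG.SEQ 2624175182 fails the test for any window
size used here (183 is taken from the trace only as an example value), and
an ACK goes out in reply.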


-- 
	Evgeniy Polyakov


* Re: [PATCH] TCP FIN gets dropped prematurely, results in ack storm
  2007-05-01 17:53       ` Benjamin LaHaise
@ 2007-05-01 18:03         ` John Heffner
  2007-05-01 19:19           ` Benjamin LaHaise
  0 siblings, 1 reply; 15+ messages in thread
From: John Heffner @ 2007-05-01 18:03 UTC
  To: Benjamin LaHaise; +Cc: Evgeniy Polyakov, David Miller, netdev

Benjamin LaHaise wrote:
> On Tue, May 01, 2007 at 09:41:28PM +0400, Evgeniy Polyakov wrote:
>> Hmm, 2.2 machine in your test seems to behave incorrectly:
> 
> I am aware of that.  However, I think that the loss of certain packets and 
> reordering can result in the same behaviour.  What's more, is that this 
> behaviour can occur in real deployed systems.  "Be strict in what you send 
> and liberal in what you accept."  Both systems should be fixed, which is 
> what I'm trying to do.

Actually, you cannot get into this situation by loss or reordering of 
packets, only by corruption of state on one side.  It sends the FIN, 
which effectively increases the sequence number by one.  However, all 
later segments it sends have an old, lower sequence number, and so are 
now out of window.
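
As a small illustration of that accounting (a sketch with invented helper
names, not code from any stack): in RFC 793 terms SEG.LEN counts the payload
octets plus one each for SYN and FIN, so a bare FIN still moves the sender's
next sequence number forward by one.

```c
#include <stdbool.h>
#include <stdint.h>

/* SEG.LEN in RFC 793 terms: payload octets plus one each for SYN and FIN. */
static uint32_t seg_len(uint32_t payload, bool syn, bool fin)
{
	return payload + (syn ? 1u : 0u) + (fin ? 1u : 0u);
}

/* Sequence number the sender must use for whatever it sends next. */
static uint32_t next_seq(uint32_t seq, uint32_t payload, bool syn, bool fin)
{
	return seq + seg_len(payload, syn, fin);
}

/* True if sequence number a comes before b, modulo 2^32. */
static bool seq_before(uint32_t a, uint32_t b)
{
	uint32_t d = b - a;

	return d != 0 && d < 0x80000000u;
}
```

With the numbers from the trace, the FIN sent at 2624175182 makes the next
valid sequence number 2624175183, so the later ACKs that still carry
2624175182 lie before the point the receiver expects.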

Being liberal in what you accept is good to a point, but sometimes you 
have to draw the line.

   -John


* Re: [PATCH] TCP FIN gets dropped prematurely, results in ack storm
  2007-05-01 17:54     ` John Heffner
@ 2007-05-01 18:04       ` Benjamin LaHaise
  2007-05-01 18:07         ` Evgeniy Polyakov
  0 siblings, 1 reply; 15+ messages in thread
From: Benjamin LaHaise @ 2007-05-01 18:04 UTC
  To: John Heffner; +Cc: Evgeniy Polyakov, David Miller, netdev

On Tue, May 01, 2007 at 01:54:03PM -0400, John Heffner wrote:
> Looking at your trace, it seems like the behavior of the test system 
> 192.168.2.2 is broken in two ways.  First, like you said it has broken 
> state in that it has forgotten that it sent the FIN.  Once you do that, 
> the connection state is corrupt and all bets are off.  It's sending an 
> out-of-window segment that's getting tossed by Linux, and Linux 
> generates an ack in response.  This is in direct RFC compliance.  The 
> second problem is that the other system is generating these broken acks 
> in response to the legitimate acks Linux is sending, causing the ack 
> war.  I can't really guess why it's doing that...

I know it's a bug, and I'm trying to fix it, but that doesn't change the 
fact that A) the system is already deployed and B) Linux is not retransmitting 
the FIN, which (from Linux's point of view) remains unacknowledged by the 
other side.  The patch might be wrong, but the goal of fixing the behaviour 
isn't.

		-ben
-- 
"Time is of no importance, Mr. President, only life is important."
Don't Email: <zyntrop@kvack.org>.


* Re: [PATCH] TCP FIN gets dropped prematurely, results in ack storm
  2007-05-01 18:04       ` Benjamin LaHaise
@ 2007-05-01 18:07         ` Evgeniy Polyakov
  0 siblings, 0 replies; 15+ messages in thread
From: Evgeniy Polyakov @ 2007-05-01 18:07 UTC
  To: Benjamin LaHaise; +Cc: John Heffner, David Miller, netdev

On Tue, May 01, 2007 at 02:04:05PM -0400, Benjamin LaHaise (bcrl@kvack.org) wrote:
> I know it's a bug, and I'm trying to fix it, but that doesn't change the 
> fact that A) the system is already deployed and B) Linux is not retransmitting 
> the FIN, which (from Linux's point of view) remains unacknowledged by the 
> other side.  The patch might be wrong, but the goal of fixing the behaviour 
> isn't.

It does retransmit fin:

10:23:32.298477 IP (tos 0x0, ttl 255, id 39972, offset 0, flags [DF],
proto: TCP (6), length: 52) 192.168.1.1.59192 > 192.168.2.2.bgp: F,
cksum 0x5558 (correct), 1562038077:1562038077(0) ack 2624175183 win 183
<nop,nop,timestamp 1549356119 781231>
10:23:32.599809 IP (tos 0x0, ttl 255, id 39973, offset 0, flags [DF],
proto: TCP (6), length: 52) 192.168.1.1.59192 > 192.168.2.2.bgp: F,
cksum 0x542b (correct), 1562038077:1562038077(0) ack 2624175183 win 183
<nop,nop,timestamp 1549356420 781231>
10:23:33.201722 IP (tos 0x0, ttl 255, id 40802, offset 0, flags [DF],
proto: TCP (6), length: 52) 192.168.1.1.59192 > 192.168.2.2.bgp: F,
cksum 0x51d1 (correct), 1562038077:1562038077(0) ack 2624175183 win 183
<nop,nop,timestamp 1549357022 781231>
10:23:34.405537 IP (tos 0x0, ttl 255, id 40914, offset 0, flags [DF],
proto: TCP (6), length: 52) 192.168.1.1.59192 > 192.168.2.2.bgp: F,
cksum 0x4d1d (correct), 1562038077:1562038077(0) ack 2624175183 win 183
<nop,nop,timestamp 1549358226 781231>
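
The spacing of those four F segments roughly doubles each time, which is
the usual signature of a retransmission timer backing off. A small sketch
for checking that from the timestamps follows; the function names are
invented, and the parser assumes the HH:MM:SS.uuuuuu form with exactly six
fractional digits, as in the dump above.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Parse "HH:MM:SS.uuuuuu" into microseconds since midnight.
 * Returns -1 if the string does not start with that exact form.
 */
static int64_t parse_us(const char *ts)
{
	int h, m, s, us;
	int used = 0;

	if (sscanf(ts, "%2d:%2d:%2d.%6d%n", &h, &m, &s, &us, &used) != 4 ||
	    used != 15)
		return -1;
	return ((int64_t)h * 3600 + m * 60 + s) * 1000000 + us;
}

/* True if 'gap' is within 1% of twice 'prev_gap'. */
static bool roughly_doubled(int64_t prev_gap, int64_t gap)
{
	int64_t diff = gap - 2 * prev_gap;

	if (diff < 0)
		diff = -diff;
	return diff * 100 <= 2 * prev_gap;
}
```

For the dump above the gaps come out as 301332, 601913 and 1203815
microseconds, that is about 0.3, 0.6 and 1.2 seconds.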


-- 
	Evgeniy Polyakov


* Re: [PATCH] TCP FIN gets dropped prematurely, results in ack storm
  2007-05-01 15:13 [PATCH] TCP FIN gets dropped prematurely, results in ack storm Benjamin LaHaise
  2007-05-01 16:20 ` Evgeniy Polyakov
@ 2007-05-01 18:20 ` Evgeniy Polyakov
  2007-05-01 18:25   ` Evgeniy Polyakov
  1 sibling, 1 reply; 15+ messages in thread
From: Evgeniy Polyakov @ 2007-05-01 18:20 UTC
  To: Benjamin LaHaise; +Cc: David Miller, netdev

On Tue, May 01, 2007 at 11:13:54AM -0400, Benjamin LaHaise (bcrl@kvack.org) wrote:
> Hello,
> 
> While testing a failover scenario, I managed to trigger an ack storm 
> between a Linux box and another system.  Although the cause of this particular 
> ACK storm was due to the other box forgetting that it sent out a FIN (the 
> second node was unaware of the FIN the first sent in its dying gasp, which 
> is what I'm trying to fix, but it's a tricky race), the resulting Linux 
> behaviour wasn't very robust.  Is there any particularly good reason that 

One of the packets sent by the broken 1.1 host has an incorrect checksum,
so in theory it will be dropped by the 2.2 system. Could that packet
somehow break the 2.2 stack's state machine?

-- 
	Evgeniy Polyakov


* Re: [PATCH] TCP FIN gets dropped prematurely, results in ack storm
  2007-05-01 18:20 ` Evgeniy Polyakov
@ 2007-05-01 18:25   ` Evgeniy Polyakov
  0 siblings, 0 replies; 15+ messages in thread
From: Evgeniy Polyakov @ 2007-05-01 18:25 UTC
  To: Benjamin LaHaise; +Cc: David Miller, netdev

On Tue, May 01, 2007 at 10:20:07PM +0400, Evgeniy Polyakov (johnpol@2ka.mipt.ru) wrote:
> On Tue, May 01, 2007 at 11:13:54AM -0400, Benjamin LaHaise (bcrl@kvack.org) wrote:
> > Hello,
> > 
> > While testing a failover scenario, I managed to trigger an ack storm 
> > between a Linux box and another system.  Although the cause of this particular 
> > ACK storm was due to the other box forgetting that it sent out a FIN (the 
> > second node was unaware of the FIN the first sent in its dying gasp, which 
> > is what I'm trying to fix, but it's a tricky race), the resulting Linux 
> > behaviour wasn't very robust.  Is there any particularly good reason that 
> 
> One of the packets sent by broken 1.1 host has incorrect checksum, so it
> will be dropped by 2.2 system in theory, could that packet somehow break
> 2.2 stack's state machine?

It seems so: the 2.2 stack expects sequence number i-1, so when you add FIN
to both the (i-1)'th and i'th packets, the 2.2 system correctly completes the
session, thinking that i-1 is the real last message, which it is not.

-- 
	Evgeniy Polyakov


* Re: [PATCH] TCP FIN gets dropped prematurely, results in ack storm
  2007-05-01 18:03         ` John Heffner
@ 2007-05-01 19:19           ` Benjamin LaHaise
  2007-05-01 20:24             ` David Miller
  0 siblings, 1 reply; 15+ messages in thread
From: Benjamin LaHaise @ 2007-05-01 19:19 UTC
  To: John Heffner; +Cc: Evgeniy Polyakov, David Miller, netdev

On Tue, May 01, 2007 at 02:03:04PM -0400, John Heffner wrote:
> Actually, you cannot get in this situation by loss or reordering of 
> packets, only by corruption of state on one side.  It sends the FIN, 
> which effectively increases the sequence number by one.  However, all 
> later segments it sends have an old lower sequence number, which are now 
> out of window.

Okay, I missed the other packets with a FIN later on in the storm.  What is 
different about them is that they get sent with different timestamps than 
the acks being thrown about.  Perhaps narrowly looking at the lack of FIN 
is wrong -- I'll try instrumenting what the PAWS code is doing on both 
sides as that is probably what short circuits an ACK into being sent.
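
For reference, the core of the PAWS test from RFC 1323 can be sketched as
below. This is a simplified user-space sketch with an invented function
name, not what either stack actually runs: it leaves out the RFC's
exception for a TS.Recent that has gone stale after a long idle period, and
the special handling of RST. A segment that fails the test is answered with
an ACK and dropped.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * PAWS core check: reject a segment whose timestamp value is older than
 * the most recent one accepted (TS.Recent), compared modulo 2^32.
 */
static bool paws_reject(uint32_t ts_recent, uint32_t seg_tsval)
{
	uint32_t age = ts_recent - seg_tsval;

	/* 0 < age < 2^31 means seg_tsval is strictly older than ts_recent. */
	return age != 0 && age < 0x80000000u;
}
```

As an example value only, take 781231 (the echoed timestamp in the dump
earlier in the thread) as TS.Recent: a segment stamped 781230 would be
rejected, while 781231 or anything newer would pass.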

> Being liberal in what you accept is good to a point, but sometimes you 
> have to draw the line.

True.  Still, both sides are doing completely the wrong thing in this case, 
and I'd like to get an idea of the best way to prevent the ACK storm from 
happening.

		-ben
-- 
"Time is of no importance, Mr. President, only life is important."
Don't Email: <zyntrop@kvack.org>.


* Re: [PATCH] TCP FIN gets dropped prematurely, results in ack storm
  2007-05-01 19:19           ` Benjamin LaHaise
@ 2007-05-01 20:24             ` David Miller
  0 siblings, 0 replies; 15+ messages in thread
From: David Miller @ 2007-05-01 20:24 UTC
  To: bcrl; +Cc: jheffner, johnpol, netdev

From: Benjamin LaHaise <bcrl@kvack.org>
Date: Tue, 1 May 2007 15:19:31 -0400

> On Tue, May 01, 2007 at 02:03:04PM -0400, John Heffner wrote:
> > Being liberal in what you accept is good to a point, but sometimes you 
> > have to draw the line.
> 
> True.  Still, both sides are doing completely the wrong thing in this case, 
> and I'd like to get an idea of the best way to prevent the ACK storm from 
> happening.

You do it by fixing the broken 2.2.x system, you don't do it
by breaking 2.6.x to spit out FINs with different sequence
numbers which is absolutely wrong and will cause more problems
than it will solve.

Please drop even the thought of this patch, kthx.

