netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: David Miller <davem@davemloft.net>
To: ilpo.jarvinen@helsinki.fi
Cc: ycheng@google.com, netdev@vger.kernel.org
Subject: Re: [PATCH] fix TSO FACK loss marking in tcp_mark_head_lost
Date: Mon, 27 Sep 2010 11:11:14 -0700 (PDT)	[thread overview]
Message-ID: <20100927.111114.124003889.davem@davemloft.net> (raw)
In-Reply-To: <alpine.DEB.2.00.1009271520560.26447@wel-95.cs.helsinki.fi>

From: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>
Date: Mon, 27 Sep 2010 15:22:09 +0300 (EEST)

> On Fri, 24 Sep 2010, Yuchung Cheng wrote:
> 
>> When TCP uses FACK algorithm to mark lost packets in
>> tcp_mark_head_lost(), if the number of packets in the (TSO) skb is
>> greater than the number of packets that should be marked lost, TCP
>> incorrectly exits the loop and marks no packets lost in the skb. This
>> underestimates tp->lost_out and affects the recovery/retransmission.
>> This patch fargments the skb and marks the correct amount of packets
>> lost.
>> 
>> Signed-off-by: Yuchung Cheng <ycheng@google.com>
>> ---
>>  net/ipv4/tcp_input.c |    3 ++-
>>  1 files changed, 2 insertions(+), 1 deletions(-)
>> 
>> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
>> index 1bc87a0..e4f472e 100644
>> --- a/net/ipv4/tcp_input.c
>> +++ b/net/ipv4/tcp_input.c
>> @@ -2532,7 +2532,8 @@ static void tcp_mark_head_lost(struct sock *sk, int packets)
>>  			cnt += tcp_skb_pcount(skb);
>>  
>>  		if (cnt > packets) {
>> -			if (tcp_is_sack(tp) || (oldcnt >= packets))
>> +			if ((tcp_is_sack(tp) && !tcp_is_fack(tp)) ||
>> +			    (oldcnt >= packets))
>>  				break;
>>  
>>  			mss = skb_shinfo(skb)->gso_size;
>> 
> 
> Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>

BTW, the history is that this code was added to fix a bug because
we didn't handle GSO packets at all at one point.

But, conservatively, we didn't do splits here for SACK, and it was
stated in the commit that we would look into it "at some point in the
future" :-)

--------------------
commit c137f3dda04b0aee1bc6889cdc69185f53df8a82
Author: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Date:   Mon Apr 7 22:32:38 2008 -0700

    [TCP]: Fix NewReno's fast rexmit/recovery problems with GSOed skb
    
    Fixes a long-standing bug which makes NewReno recovery crippled.
    With GSO the whole head skb was marked as LOST which is in
    violation of NewReno procedure that only wants to mark one packet
    and ended up breaking our TCP code by causing counter overflow
    because our code was built on top of assumption about valid
    NewReno procedure. This manifested as triggering a WARN_ON for
    the overflow in a number of places.
    
    It seems relatively safe alternative to just do nothing if
    tcp_fragment fails due to oom because another duplicate ACK is
    likely to be received soon and the fragmentation will be retried.
    
    Special thanks goes to Soeren Sonnenburg <kernel@nn7.de> who was
    lucky enough to be able to reproduce this so that the warning
    for the overflow was hit. It's not as easy task as it seems even
    if this bug happens quite often because the amount of outstanding
    data is pretty significant for the mismarkings to lead to an
    overflow.
    
    Because it's very late in 2.6.25-rc cycle (if this even makes in
    time), I didn't want to touch anything with SACK enabled here.
    Fragmenting might be useful for it as well but it's more or less
    a policy decision rather than mandatory fix. Thus there's no need
    to rush and we can postpone considering tcp_fragment with SACK
    for 2.6.26.
    
    In 2.6.24 and earlier, this very same bug existed but the effect
    is slightly different because of a small changes in the if
    conditions that fit to the patch's context. With them nothing
    got lost marker and thus no retransmissions happened.
    
    Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
    Signed-off-by: David S. Miller <davem@davemloft.net>
--------------------

To be honest, we should probably just remove the whole tcp_is_sack()
test, rather than special case FACK.

The cost isn't what it was when this code was added.  Back then we
didn't have Ilpo's restransmit queue coalescing code, so it would make
retransmit queue packet freeing more expensive.  But since the
coalescing code is there now, splitting all the time to record
accurate loss information should be fine.

Ilpo what do you say?

  reply	other threads:[~2010-09-27 18:10 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-09-24 23:22 [PATCH] fix TSO FACK loss marking in tcp_mark_head_lost Yuchung Cheng
2010-09-27 12:22 ` Ilpo Järvinen
2010-09-27 18:11   ` David Miller [this message]
2010-09-27 19:20     ` Ilpo Järvinen
2010-09-27 21:56       ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100927.111114.124003889.davem@davemloft.net \
    --to=davem@davemloft.net \
    --cc=ilpo.jarvinen@helsinki.fi \
    --cc=netdev@vger.kernel.org \
    --cc=ycheng@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).