* [PATCH] fix TSO FACK loss marking in tcp_mark_head_lost @ 2010-09-24 23:22 Yuchung Cheng 2010-09-27 12:22 ` Ilpo Järvinen 0 siblings, 1 reply; 5+ messages in thread From: Yuchung Cheng @ 2010-09-24 23:22 UTC (permalink / raw) To: ilpo.jarvinen, davem; +Cc: netdev, Yuchung Cheng When TCP uses FACK algorithm to mark lost packets in tcp_mark_head_lost(), if the number of packets in the (TSO) skb is greater than the number of packets that should be marked lost, TCP incorrectly exits the loop and marks no packets lost in the skb. This underestimates tp->lost_out and affects the recovery/retransmission. This patch fargments the skb and marks the correct amount of packets lost. Signed-off-by: Yuchung Cheng <ycheng@google.com> --- net/ipv4/tcp_input.c | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 1bc87a0..e4f472e 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -2532,7 +2532,8 @@ static void tcp_mark_head_lost(struct sock *sk, int packets) cnt += tcp_skb_pcount(skb); if (cnt > packets) { - if (tcp_is_sack(tp) || (oldcnt >= packets)) + if ((tcp_is_sack(tp) && !tcp_is_fack(tp)) || + (oldcnt >= packets)) break; mss = skb_shinfo(skb)->gso_size; -- 1.7.1 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] fix TSO FACK loss marking in tcp_mark_head_lost 2010-09-24 23:22 [PATCH] fix TSO FACK loss marking in tcp_mark_head_lost Yuchung Cheng @ 2010-09-27 12:22 ` Ilpo Järvinen 2010-09-27 18:11 ` David Miller 0 siblings, 1 reply; 5+ messages in thread From: Ilpo Järvinen @ 2010-09-27 12:22 UTC (permalink / raw) To: Yuchung Cheng; +Cc: David Miller, Netdev [-- Attachment #1: Type: TEXT/PLAIN, Size: 1159 bytes --] On Fri, 24 Sep 2010, Yuchung Cheng wrote: > When TCP uses FACK algorithm to mark lost packets in > tcp_mark_head_lost(), if the number of packets in the (TSO) skb is > greater than the number of packets that should be marked lost, TCP > incorrectly exits the loop and marks no packets lost in the skb. This > underestimates tp->lost_out and affects the recovery/retransmission. > This patch fargments the skb and marks the correct amount of packets > lost. > > Signed-off-by: Yuchung Cheng <ycheng@google.com> > --- > net/ipv4/tcp_input.c | 3 ++- > 1 files changed, 2 insertions(+), 1 deletions(-) > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > index 1bc87a0..e4f472e 100644 > --- a/net/ipv4/tcp_input.c > +++ b/net/ipv4/tcp_input.c > @@ -2532,7 +2532,8 @@ static void tcp_mark_head_lost(struct sock *sk, int packets) > cnt += tcp_skb_pcount(skb); > > if (cnt > packets) { > - if (tcp_is_sack(tp) || (oldcnt >= packets)) > + if ((tcp_is_sack(tp) && !tcp_is_fack(tp)) || > + (oldcnt >= packets)) > break; > > mss = skb_shinfo(skb)->gso_size; > Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> -- i. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] fix TSO FACK loss marking in tcp_mark_head_lost 2010-09-27 12:22 ` Ilpo Järvinen @ 2010-09-27 18:11 ` David Miller 2010-09-27 19:20 ` Ilpo Järvinen 0 siblings, 1 reply; 5+ messages in thread From: David Miller @ 2010-09-27 18:11 UTC (permalink / raw) To: ilpo.jarvinen; +Cc: ycheng, netdev From: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> Date: Mon, 27 Sep 2010 15:22:09 +0300 (EEST) > On Fri, 24 Sep 2010, Yuchung Cheng wrote: > >> When TCP uses FACK algorithm to mark lost packets in >> tcp_mark_head_lost(), if the number of packets in the (TSO) skb is >> greater than the number of packets that should be marked lost, TCP >> incorrectly exits the loop and marks no packets lost in the skb. This >> underestimates tp->lost_out and affects the recovery/retransmission. >> This patch fargments the skb and marks the correct amount of packets >> lost. >> >> Signed-off-by: Yuchung Cheng <ycheng@google.com> >> --- >> net/ipv4/tcp_input.c | 3 ++- >> 1 files changed, 2 insertions(+), 1 deletions(-) >> >> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c >> index 1bc87a0..e4f472e 100644 >> --- a/net/ipv4/tcp_input.c >> +++ b/net/ipv4/tcp_input.c >> @@ -2532,7 +2532,8 @@ static void tcp_mark_head_lost(struct sock *sk, int packets) >> cnt += tcp_skb_pcount(skb); >> >> if (cnt > packets) { >> - if (tcp_is_sack(tp) || (oldcnt >= packets)) >> + if ((tcp_is_sack(tp) && !tcp_is_fack(tp)) || >> + (oldcnt >= packets)) >> break; >> >> mss = skb_shinfo(skb)->gso_size; >> > > Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> BTW, the history is that this code was added to fix a bug because we didn't handle GSO packets at all at one point. But, conservatively, we didn't do splits here for SACK, and it was stated in the commit that we would look into it "at some point in the future" :-) -------------------- commit c137f3dda04b0aee1bc6889cdc69185f53df8a82 Author: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Date: Mon Apr 7 22:32:38 2008 -0700 [TCP]: Fix NewReno's fast rexmit/recovery problems with GSOed skb Fixes a long-standing bug which makes NewReno recovery crippled. With GSO the whole head skb was marked as LOST which is in violation of NewReno procedure that only wants to mark one packet and ended up breaking our TCP code by causing counter overflow because our code was built on top of assumption about valid NewReno procedure. This manifested as triggering a WARN_ON for the overflow in a number of places. It seems relatively safe alternative to just do nothing if tcp_fragment fails due to oom because another duplicate ACK is likely to be received soon and the fragmentation will be retried. Special thanks goes to Soeren Sonnenburg <kernel@nn7.de> who was lucky enough to be able to reproduce this so that the warning for the overflow was hit. It's not as easy task as it seems even if this bug happens quite often because the amount of outstanding data is pretty significant for the mismarkings to lead to an overflow. Because it's very late in 2.6.25-rc cycle (if this even makes in time), I didn't want to touch anything with SACK enabled here. Fragmenting might be useful for it as well but it's more or less a policy decision rather than mandatory fix. Thus there's no need to rush and we can postpone considering tcp_fragment with SACK for 2.6.26. In 2.6.24 and earlier, this very same bug existed but the effect is slightly different because of a small changes in the if conditions that fit to the patch's context. With them nothing got lost marker and thus no retransmissions happened. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net> -------------------- To be honest, we should probably just remove the whole tcp_is_sack() test, rather than special case FACK. The cost isn't what it was when this code was added. Back then we didn't have Ilpo's restransmit queue coalescing code, so it would make retransmit queue packet freeing more expensive. But since the coalescing code is there now, splitting all the time to record accurate loss information should be fine. Ilpo what do you say? ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] fix TSO FACK loss marking in tcp_mark_head_lost 2010-09-27 18:11 ` David Miller @ 2010-09-27 19:20 ` Ilpo Järvinen 2010-09-27 21:56 ` David Miller 0 siblings, 1 reply; 5+ messages in thread From: Ilpo Järvinen @ 2010-09-27 19:20 UTC (permalink / raw) To: David Miller; +Cc: ycheng, Netdev [-- Attachment #1: Type: TEXT/PLAIN, Size: 4893 bytes --] On Mon, 27 Sep 2010, David Miller wrote: > From: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> > Date: Mon, 27 Sep 2010 15:22:09 +0300 (EEST) > > > On Fri, 24 Sep 2010, Yuchung Cheng wrote: > > > >> When TCP uses FACK algorithm to mark lost packets in > >> tcp_mark_head_lost(), if the number of packets in the (TSO) skb is > >> greater than the number of packets that should be marked lost, TCP > >> incorrectly exits the loop and marks no packets lost in the skb. This > >> underestimates tp->lost_out and affects the recovery/retransmission. > >> This patch fargments the skb and marks the correct amount of packets > >> lost. > >> > >> Signed-off-by: Yuchung Cheng <ycheng@google.com> > >> --- > >> net/ipv4/tcp_input.c | 3 ++- > >> 1 files changed, 2 insertions(+), 1 deletions(-) > >> > >> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > >> index 1bc87a0..e4f472e 100644 > >> --- a/net/ipv4/tcp_input.c > >> +++ b/net/ipv4/tcp_input.c > >> @@ -2532,7 +2532,8 @@ static void tcp_mark_head_lost(struct sock *sk, int packets) > >> cnt += tcp_skb_pcount(skb); > >> > >> if (cnt > packets) { > >> - if (tcp_is_sack(tp) || (oldcnt >= packets)) > >> + if ((tcp_is_sack(tp) && !tcp_is_fack(tp)) || > >> + (oldcnt >= packets)) > >> break; > >> > >> mss = skb_shinfo(skb)->gso_size; > >> > > > > Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> > > BTW, the history is that this code was added to fix a bug because > we didn't handle GSO packets at all at one point. > > But, conservatively, we didn't do splits here for SACK, and it was > stated in the commit that we would look into it "at some point in the > future" :-) > > -------------------- > commit c137f3dda04b0aee1bc6889cdc69185f53df8a82 > Author: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> > Date: Mon Apr 7 22:32:38 2008 -0700 > > [TCP]: Fix NewReno's fast rexmit/recovery problems with GSOed skb > > Fixes a long-standing bug which makes NewReno recovery crippled. > With GSO the whole head skb was marked as LOST which is in > violation of NewReno procedure that only wants to mark one packet > and ended up breaking our TCP code by causing counter overflow > because our code was built on top of assumption about valid > NewReno procedure. This manifested as triggering a WARN_ON for > the overflow in a number of places. > > It seems relatively safe alternative to just do nothing if > tcp_fragment fails due to oom because another duplicate ACK is > likely to be received soon and the fragmentation will be retried. > > Special thanks goes to Soeren Sonnenburg <kernel@nn7.de> who was > lucky enough to be able to reproduce this so that the warning > for the overflow was hit. It's not as easy task as it seems even > if this bug happens quite often because the amount of outstanding > data is pretty significant for the mismarkings to lead to an > overflow. > > Because it's very late in 2.6.25-rc cycle (if this even makes in > time), I didn't want to touch anything with SACK enabled here. > Fragmenting might be useful for it as well but it's more or less > a policy decision rather than mandatory fix. Thus there's no need > to rush and we can postpone considering tcp_fragment with SACK > for 2.6.26. > > In 2.6.24 and earlier, this very same bug existed but the effect > is slightly different because of a small changes in the if > conditions that fit to the patch's context. With them nothing > got lost marker and thus no retransmissions happened. > > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> > Signed-off-by: David S. Miller <davem@davemloft.net> > -------------------- > > To be honest, we should probably just remove the whole tcp_is_sack() > test, rather than special case FACK. > > The cost isn't what it was when this code was added. Back then we > didn't have Ilpo's restransmit queue coalescing code, so it would make > retransmit queue packet freeing more expensive. But since the > coalescing code is there now, splitting all the time to record > accurate loss information should be fine. > > Ilpo what do you say? Removing it would add a bug for non-FACK SACK as lost_cnt_hint is calculated a bit differently with it (only SACKed skbs count so lost_cnt_hint is not changing over the whole skb with pcount > 1 while it does change per each seg for FACK, basically non-FACK is all or nothing as only passing by an already SACKed segment increases lost_cnt_hint). ...I also fell to this trap once while reviewing this change and wondered in my mind why not to d it for all variants :-). So Yuchung's patch is fine as is afaict. I disagree now with myself btw, it's not just "a policy decision" as we do not retransmit some segments at all before RTO in some scenarios. -- i. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] fix TSO FACK loss marking in tcp_mark_head_lost 2010-09-27 19:20 ` Ilpo Järvinen @ 2010-09-27 21:56 ` David Miller 0 siblings, 0 replies; 5+ messages in thread From: David Miller @ 2010-09-27 21:56 UTC (permalink / raw) To: ilpo.jarvinen; +Cc: ycheng, netdev From: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> Date: Mon, 27 Sep 2010 22:20:57 +0300 (EEST) > I disagree now with myself btw, it's not just "a policy decision" as we > do not retransmit some segments at all before RTO in some scenarios. Ok, thanks for explaining all of this. I've applied Yuchung's patch to net-2.6, thanks everyone! ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2010-09-27 21:56 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-09-24 23:22 [PATCH] fix TSO FACK loss marking in tcp_mark_head_lost Yuchung Cheng 2010-09-27 12:22 ` Ilpo Järvinen 2010-09-27 18:11 ` David Miller 2010-09-27 19:20 ` Ilpo Järvinen 2010-09-27 21:56 ` David Miller
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).