* [PATCH] Fix sorting of SACK blocks
@ 2007-01-25 18:29 Baruch Even
2007-01-25 18:36 ` Stephen Hemminger
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Baruch Even @ 2007-01-25 18:29 UTC (permalink / raw)
To: netdev; +Cc: David S. Miller, Stephen Hemminger
The sorting of SACK blocks actually munges them rather than sort, causing the
TCP stack to ignore some SACK information and breaking the assumption of
ordered SACK blocks after sorting.
The sort takes the data from a second buffer which isn't moved causing
subsequent data moves to occur from the wrong location. The fix is to
use a temporary buffer as a normal sort does.
Signed-Off-By: Baruch Even <baruch@ev-en.org>
diff -X 2.6-rc6/Documentation/dontdiff -ur 2.6-rc6/net/ipv4/tcp_input.c 2.6-mod/net/ipv4/tcp_input.c
--- 2.6-rc6/net/ipv4/tcp_input.c 2007-01-25 19:04:20.000000000 +0200
+++ 2.6-mod/net/ipv4/tcp_input.c 2007-01-25 19:52:04.000000000 +0200
@@ -1011,10 +1011,11 @@
for (j = 0; j < i; j++){
if (after(ntohl(sp[j].start_seq),
ntohl(sp[j+1].start_seq))){
- sp[j].start_seq = htonl(tp->recv_sack_cache[j+1].start_seq);
- sp[j].end_seq = htonl(tp->recv_sack_cache[j+1].end_seq);
- sp[j+1].start_seq = htonl(tp->recv_sack_cache[j].start_seq);
- sp[j+1].end_seq = htonl(tp->recv_sack_cache[j].end_seq);
+ struct tcp_sack_block_wire tmp;
+
+ tmp = sp[j];
+ sp[j] = sp[j+1];
+ sp[j+1] = tmp;
}
}
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH] Fix sorting of SACK blocks
2007-01-25 18:29 [PATCH] Fix sorting of SACK blocks Baruch Even
@ 2007-01-25 18:36 ` Stephen Hemminger
2007-01-25 19:08 ` Baruch Even
2007-01-25 21:34 ` David Miller
2007-01-25 23:55 ` David Miller
2 siblings, 1 reply; 7+ messages in thread
From: Stephen Hemminger @ 2007-01-25 18:36 UTC (permalink / raw)
To: Baruch Even; +Cc: netdev, David S. Miller, Stephen Hemminger
On Thu, 25 Jan 2007 20:29:03 +0200
Baruch Even <baruch@ev-en.org> wrote:
> The sorting of SACK blocks actually munges them rather than sort, causing the
> TCP stack to ignore some SACK information and breaking the assumption of
> ordered SACK blocks after sorting.
>
> The sort takes the data from a second buffer which isn't moved causing
> subsequent data moves to occur from the wrong location. The fix is to
> use a temporary buffer as a normal sort does.
>
> Signed-Off-By: Baruch Even <baruch@ev-en.org>
>
> diff -X 2.6-rc6/Documentation/dontdiff -ur 2.6-rc6/net/ipv4/tcp_input.c 2.6-mod/net/ipv4/tcp_input.c
> --- 2.6-rc6/net/ipv4/tcp_input.c 2007-01-25 19:04:20.000000000 +0200
> +++ 2.6-mod/net/ipv4/tcp_input.c 2007-01-25 19:52:04.000000000 +0200
> @@ -1011,10 +1011,11 @@
> for (j = 0; j < i; j++){
> if (after(ntohl(sp[j].start_seq),
> ntohl(sp[j+1].start_seq))){
> - sp[j].start_seq = htonl(tp->recv_sack_cache[j+1].start_seq);
> - sp[j].end_seq = htonl(tp->recv_sack_cache[j+1].end_seq);
> - sp[j+1].start_seq = htonl(tp->recv_sack_cache[j].start_seq);
> - sp[j+1].end_seq = htonl(tp->recv_sack_cache[j].end_seq);
> + struct tcp_sack_block_wire tmp;
> +
> + tmp = sp[j];
> + sp[j] = sp[j+1];
> + sp[j+1] = tmp;
> }
>
> }
This looks okay, but is there a test case that can be run?
--
Stephen Hemminger <shemminger@linux-foundation.org>
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH] Fix sorting of SACK blocks
2007-01-25 18:36 ` Stephen Hemminger
@ 2007-01-25 19:08 ` Baruch Even
0 siblings, 0 replies; 7+ messages in thread
From: Baruch Even @ 2007-01-25 19:08 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev, David S. Miller, Stephen Hemminger
* Stephen Hemminger <shemminger@linux-foundation.org> [070125 20:47]:
> On Thu, 25 Jan 2007 20:29:03 +0200
> Baruch Even <baruch@ev-en.org> wrote:
>
> > The sorting of SACK blocks actually munges them rather than sort, causing the
> > TCP stack to ignore some SACK information and breaking the assumption of
> > ordered SACK blocks after sorting.
> >
> > The sort takes the data from a second buffer which isn't moved causing
> > subsequent data moves to occur from the wrong location. The fix is to
> > use a temporary buffer as a normal sort does.
> >
> > Signed-Off-By: Baruch Even <baruch@ev-en.org>
> >
> > diff -X 2.6-rc6/Documentation/dontdiff -ur 2.6-rc6/net/ipv4/tcp_input.c 2.6-mod/net/ipv4/tcp_input.c
> > --- 2.6-rc6/net/ipv4/tcp_input.c 2007-01-25 19:04:20.000000000 +0200
> > +++ 2.6-mod/net/ipv4/tcp_input.c 2007-01-25 19:52:04.000000000 +0200
> > @@ -1011,10 +1011,11 @@
> > for (j = 0; j < i; j++){
> > if (after(ntohl(sp[j].start_seq),
> > ntohl(sp[j+1].start_seq))){
> > - sp[j].start_seq = htonl(tp->recv_sack_cache[j+1].start_seq);
> > - sp[j].end_seq = htonl(tp->recv_sack_cache[j+1].end_seq);
> > - sp[j+1].start_seq = htonl(tp->recv_sack_cache[j].start_seq);
> > - sp[j+1].end_seq = htonl(tp->recv_sack_cache[j].end_seq);
> > + struct tcp_sack_block_wire tmp;
> > +
> > + tmp = sp[j];
> > + sp[j] = sp[j+1];
> > + sp[j+1] = tmp;
> > }
> >
> > }
>
> This looks okay, but is there a test case that can be run?
There is nothing visible that shows the problem, the only option is to
add some code to print the SACK blocks after sorting and run it over a
large BDP connection that can be saturated. You'll obviously need to
have several holes, I believe that the bug will be visible when you have
ACK packets with three SACK blocks where the first block is the highest
which should be the normal case.
Cheers,
Baruch
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Fix sorting of SACK blocks
2007-01-25 18:29 [PATCH] Fix sorting of SACK blocks Baruch Even
2007-01-25 18:36 ` Stephen Hemminger
@ 2007-01-25 21:34 ` David Miller
2007-01-25 23:55 ` David Miller
2 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2007-01-25 21:34 UTC (permalink / raw)
To: baruch; +Cc: netdev, shemminger
From: Baruch Even <baruch@ev-en.org>
Date: Thu, 25 Jan 2007 20:29:03 +0200
> The sorting of SACK blocks actually munges them rather than sort, causing the
> TCP stack to ignore some SACK information and breaking the assumption of
> ordered SACK blocks after sorting.
>
> The sort takes the data from a second buffer which isn't moved causing
> subsequent data moves to occur from the wrong location. The fix is to
> use a temporary buffer as a normal sort does.
>
> Signed-Off-By: Baruch Even <baruch@ev-en.org>
Thanks for finding this bug Baruch.
It probably explains some weird TCP traces I've seen over
the years :-)
I'll review this and apply it later today, thanks again.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Fix sorting of SACK blocks
2007-01-25 18:29 [PATCH] Fix sorting of SACK blocks Baruch Even
2007-01-25 18:36 ` Stephen Hemminger
2007-01-25 21:34 ` David Miller
@ 2007-01-25 23:55 ` David Miller
2007-01-26 6:40 ` Baruch Even
2 siblings, 1 reply; 7+ messages in thread
From: David Miller @ 2007-01-25 23:55 UTC (permalink / raw)
To: baruch; +Cc: netdev, shemminger
From: Baruch Even <baruch@ev-en.org>
Date: Thu, 25 Jan 2007 20:29:03 +0200
> The sorting of SACK blocks actually munges them rather than sort, causing the
> TCP stack to ignore some SACK information and breaking the assumption of
> ordered SACK blocks after sorting.
>
> The sort takes the data from a second buffer which isn't moved causing
> subsequent data moves to occur from the wrong location. The fix is to
> use a temporary buffer as a normal sort does.
>
> Signed-Off-By: Baruch Even <baruch@ev-en.org>
BTW, in reviewing this I note that there is now only one remaining
use of tp->recv_sack_cache[] and that is the code earlier in this
function which is trying to detect if all we are doing is extending
the leading edge of a SACK block.
It would be nice to be able to clear out that usage as well, and
remove recv_sack_cache[] and thus make tcp_sock smaller.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Fix sorting of SACK blocks
2007-01-25 23:55 ` David Miller
@ 2007-01-26 6:40 ` Baruch Even
2007-01-26 8:42 ` David Miller
0 siblings, 1 reply; 7+ messages in thread
From: Baruch Even @ 2007-01-26 6:40 UTC (permalink / raw)
To: David Miller; +Cc: netdev, shemminger
* David Miller <davem@davemloft.net> [070126 01:55]:
> From: Baruch Even <baruch@ev-en.org>
> Date: Thu, 25 Jan 2007 20:29:03 +0200
>
> > The sorting of SACK blocks actually munges them rather than sort, causing the
> > TCP stack to ignore some SACK information and breaking the assumption of
> > ordered SACK blocks after sorting.
> >
> > The sort takes the data from a second buffer which isn't moved causing
> > subsequent data moves to occur from the wrong location. The fix is to
> > use a temporary buffer as a normal sort does.
> >
> > Signed-Off-By: Baruch Even <baruch@ev-en.org>
>
> BTW, in reviewing this I note that there is now only one remaining
> use of tp->recv_sack_cache[] and that is the code earlier in this
> function which is trying to detect if all we are doing is extending
> the leading edge of a SACK block.
>
> It would be nice to be able to clear out that usage as well, and
> remove recv_sack_cache[] and thus make tcp_sock smaller.
You actually need recv_sack_cache to detect if you can use the fast
path. Another alternative is to somehow hash the values of the sack
blocks but then you rely on probabilty that you will properly detect the
ability to use the fast path. Hashing will save some space but you can't
get rid of it completely unless you go back to the old and slow method
of SACK processing.
There were thoughts thrown a while back about using a different data
structure, I think you said you started working on something like that.
If that comes to fruition the cache might go.
FWIW, my other mail about possible bugs actually says that you might
need to add another value to check, the number of sack blocks in the
cache.
Baruch
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] Fix sorting of SACK blocks
2007-01-26 6:40 ` Baruch Even
@ 2007-01-26 8:42 ` David Miller
0 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2007-01-26 8:42 UTC (permalink / raw)
To: baruch; +Cc: netdev, shemminger
From: Baruch Even <baruch@ev-en.org>
Date: Fri, 26 Jan 2007 08:40:09 +0200
> You actually need recv_sack_cache to detect if you can use the fast
> path. Another alternative is to somehow hash the values of the sack
> blocks but then you rely on probabilty that you will properly detect the
> ability to use the fast path. Hashing will save some space but you can't
> get rid of it completely unless you go back to the old and slow method
> of SACK processing.
>
> There were thoughts thrown a while back about using a different data
> structure, I think you said you started working on something like that.
> If that comes to fruition the cache might go.
I don't know if those ideas will go anywhere, it's difficult
to make it work properly and it has to be done right since we'll
likely have to change drivers to handle an aggregated send queue
cleanly and we only want to do that once.
> FWIW, my other mail about possible bugs actually says that you might
> need to add another value to check, the number of sack blocks in the
> cache.
Yes I saw that, I was about to go over that and maybe come up with
some patches.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2007-01-26 8:42 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-01-25 18:29 [PATCH] Fix sorting of SACK blocks Baruch Even
2007-01-25 18:36 ` Stephen Hemminger
2007-01-25 19:08 ` Baruch Even
2007-01-25 21:34 ` David Miller
2007-01-25 23:55 ` David Miller
2007-01-26 6:40 ` Baruch Even
2007-01-26 8:42 ` David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).