* Re: Do piggybacked ACKs work?
2009-07-25 2:09 Do piggybacked ACKs work? Doug Graham
@ 2009-07-28 15:13 ` Vlad Yasevich
2009-07-28 15:49 ` Doug Graham
` (10 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Vlad Yasevich @ 2009-07-28 15:13 UTC (permalink / raw)
To: linux-sctp
Doug Graham wrote:
> Hello,
>
> I have a little test that simply has a client send 32 bytes of data to
> server, which then replies back with the same data. What doesn't look
> right to me is that I never see piggybacked ACKs. I see the client
> send 32 bytes, then the server reply in a packet containing a single
> DATA chunk of 32 bytes, and then 200ms later both SACKs are sent.
Piggybacked ACKs will not be sent with every packet. It looks like you are hitting
the typical single-packet request-response scenario.
I'd recommend you read RFC 4960 section 6.2.
Now, it's possible to do some piggybacking when bulk data is flowing in
both directions, but that would be different from your proposal below.
-vlad
>
> Looking at the code in net/sctp/output.c I see this:
>
> /* If sending DATA and haven't aleady bundled a SACK, try to
> * bundle one in to the packet.
> */
> if (sctp_chunk_is_data(chunk) && !pkt->has_sack &&
> !pkt->has_cookie_echo) {
> if (asoc->a_rwnd > asoc->rwnd)
> <append the SACK chunk to the DATA>
>
> The test that is failing is the "a_rwnd > rwnd" test, and I don't
> understand that test at all. As far as I know, a_rwnd is the last value
> of rwnd that was advertised to the peer, and rwnd is the current receive
> window. What's happening is that after the association has settled,
> a_rwnd and rwnd are equal. The client then sends 32 bytes to the peer
> SCTP on the server. The server SCTP receives these 32 bytes and decreases
> its rwnd. Then the userspace server process reads the 32 bytes from the
> socket, thus freeing up the buffer space, causing SCTP to increase rwnd back
> to where it was initially. When the server process echoes the data,
> SCTP constructs a DATA chunk and then runs the above code to see whether
> it needs to append a SACK. But since rwnd is no longer less than a_rwnd,
> no SACK is sent.
>
> I think this logic is wrong. I don't think the decision as to whether
> or not to piggyback a SACK on a DATA packet should have anything to do
> with the receive window. If there is unacknowledged data, then the
> SACK should be sent, regardless of the state of the receive window.
> Am I right about this?
>
> Thanks,
> Doug.
>
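The window arithmetic Doug walks through above can be traced with a small model. This is an illustrative sketch only, not kernel code: the struct and the function names are assumptions made for the example, and only the a_rwnd > rwnd comparison is taken from the quoted snippet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of the two window fields from the thread.
 * The field names mirror the quoted code; the struct is not the kernel's. */
struct assoc_model {
    uint32_t rwnd;     /* current receive window */
    uint32_t a_rwnd;   /* receive window last advertised to the peer */
    bool sack_needed;  /* received DATA has not been acknowledged yet */
};

/* DATA arrives: buffer space is consumed and an acknowledgement is owed. */
void model_receive_data(struct assoc_model *a, uint32_t len)
{
    a->rwnd -= len;
    a->sack_needed = true;
}

/* The application reads the data: buffer space is released again. */
void model_app_read(struct assoc_model *a, uint32_t len)
{
    a->rwnd += len;
}

/* The window comparison quoted from net/sctp/output.c. */
bool bundle_by_window_test(const struct assoc_model *a)
{
    return a->a_rwnd > a->rwnd;
}

/* The alternative Doug argues for: bundle whenever an ack is owed. */
bool bundle_if_ack_owed(const struct assoc_model *a)
{
    return a->sack_needed;
}
```

Starting from rwnd == a_rwnd, a call to model_receive_data(&a, 32) makes the window test true, but after model_app_read(&a, 32) it is false again even though sack_needed is still set. That is the state the server is in when it echoes the data.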
* Re: Do piggybacked ACKs work?
2009-07-25 2:09 Do piggybacked ACKs work? Doug Graham
2009-07-28 15:13 ` Vlad Yasevich
@ 2009-07-28 15:49 ` Doug Graham
2009-07-28 16:09 ` Vlad Yasevich
` (9 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Doug Graham @ 2009-07-28 15:49 UTC (permalink / raw)
To: linux-sctp
On Tue, Jul 28, 2009 at 11:13:05AM -0400, Vlad Yasevich wrote:
> Doug Graham wrote:
> > Hello,
> >
> > I have a little test that simply has a client send 32 bytes of data to
> > server, which then replies back with the same data. What doesn't look
> > right to me is that I never see piggybacked ACKs. I see the client
> > send 32 bytes, then the server reply in a packet containing a single
> > DATA chunk of 32 bytes, and then 200ms later both SACKs are sent.
>
> Piggybacked ACKs will not be sent every packet. Looks like you are hitting
> the typical single packet request-response senario.
>
> I'd recommend you read RFC 4960 section 6.2.
>
> Now, it's possible to do some piggy-backing when bulk data is flowing in
> both directions, but that would different then you proposal below.
I did a fairly quick scan of section 6.2 in RFC 4960, and I'm not sure
what I'm supposed to see in there that would clear this up. I still
think that section 6.1 applies:
Before an endpoint transmits a DATA chunk, if any received DATA
chunks have not been acknowledged (e.g., due to delayed ack), the
sender should create a SACK and bundle it with the outbound DATA
chunk, as long as the size of the final SCTP packet does not exceed
the current MTU. See Section 6.2.
I still don't understand what "a_rwnd > rwnd" has to do with whether or
not a SACK should be appended. Could you tell me specifically which part
of section 6.2 you think should prevent SACKs from being piggybacked in
a single packet request-response scenario? Note that I've also run my
simple test program on the only other implementation of SCTP that I have
access to, that on FreeBSD 7.2, and it does piggyback SACKS in the way
I expect. To me, it makes no sense to ever skip the piggybacking of a
SACK if you've got one to send. The whole point of delayed SACKs is to
make this possible, and thus avoid sending a SACK in a separate packet.
Here's one case where this lack of SACK piggybacking can have a big
performance impact. Suppose C(lient) and S(erver) are sending small
requests and replies between each other as quickly as possible. ie:
S sends a reply as soon as it gets a request from C, and C sends another
request as soon as it gets the previous reply from S. If Nagle is not
disabled, what happens is this:
C sends request DATA to S
S sends reply DATA to C
C and S send SACKs to each other after 200ms
After the 200ms delay introduced above, C finally has the SACK for
its first request, and only now, by the rules of Nagle, can it send
another request.
So there's a delay of 200ms in request/reply cycle. If S had piggybacked
a SACK on the DATA it sent to C, then C would have its SACK when it got
S's reply and could send another request immediately after receiving the
reply to the first. And if C piggybacks a SACK onto its second request,
then S doesn't need to wait for a SACK timeout before it can send the
second reply.
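The cost described above can be put into numbers with a rough model. This is a sketch under stated assumptions: the function names and the rtt_us parameter are hypothetical, processing time on both hosts is ignored, and the only figure taken from the thread is the 200 ms SACK delay.

```c
#include <stdint.h>

/* 200 ms delayed-SACK timer, as observed in the thread. */
#define SACK_DELAY_US 200000u

/* Time for one request/reply cycle, in microseconds, when the client may
 * not send its next request until the previous one has been acknowledged
 * (Nagle enabled). rtt_us is an assumed network round-trip time. */
uint32_t cycle_time_us(uint32_t rtt_us, int sack_piggybacked)
{
    if (sack_piggybacked)
        return rtt_us;              /* the SACK arrives with the reply */
    return rtt_us + SACK_DELAY_US;  /* the SACK leaves only when the timer fires */
}

/* Whole request/reply cycles completed per second (integer division). */
uint32_t cycles_per_second(uint32_t rtt_us, int sack_piggybacked)
{
    uint32_t cycle = cycle_time_us(rtt_us, sack_piggybacked);
    return cycle ? 1000000u / cycle : 0;
}
```

With an assumed LAN round trip of 334 microseconds, cycles_per_second(334, 0) gives 4 and cycles_per_second(334, 1) gives 2994, so in this model the delayed SACK dominates the cycle time completely.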
There's also bandwidth wasted by sending separate SACK packets.
Thanks for your reply,
Doug.
* Re: Do piggybacked ACKs work?
2009-07-25 2:09 Do piggybacked ACKs work? Doug Graham
2009-07-28 15:13 ` Vlad Yasevich
2009-07-28 15:49 ` Doug Graham
@ 2009-07-28 16:09 ` Vlad Yasevich
2009-07-28 20:45 ` Doug Graham
` (8 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Vlad Yasevich @ 2009-07-28 16:09 UTC (permalink / raw)
To: linux-sctp
Doug Graham wrote:
> On Tue, Jul 28, 2009 at 11:13:05AM -0400, Vlad Yasevich wrote:
>> Doug Graham wrote:
>>> Hello,
>>>
>>> I have a little test that simply has a client send 32 bytes of data to
>>> server, which then replies back with the same data. What doesn't look
>>> right to me is that I never see piggybacked ACKs. I see the client
>>> send 32 bytes, then the server reply in a packet containing a single
>>> DATA chunk of 32 bytes, and then 200ms later both SACKs are sent.
>> Piggybacked ACKs will not be sent every packet. Looks like you are hitting
>> the typical single packet request-response senario.
>>
>> I'd recommend you read RFC 4960 section 6.2.
>>
>> Now, it's possible to do some piggy-backing when bulk data is flowing in
>> both directions, but that would different then you proposal below.
>
> I did a fairly quick scan of section 6.2 in RFC 4960, and I'm not sure
> what I'm supposed to see in there that would clear this up. I still
> think that section 6.1 applies:
>
> Before an endpoint transmits a DATA chunk, if any received DATA
> chunks have not been acknowledged (e.g., due to delayed ack), the
> sender should create a SACK and bundle it with the outbound DATA
> chunk, as long as the size of the final SCTP packet does not exceed
> the current MTU. See Section 6.2.
>
Section 6.2 gives specific guidelines on when one should generate SACKs.
Specifically, SACKs are generated every other packet or within 200 ms of
unacknowledged DATA, whichever comes first.
What you are proposing is that SACKs are generated for every packet and
well short of the 200 ms delay limit, which is definitely more aggressive.
This is explicitly a MUST NOT.
> I still don't understand what "a_rwnd > rwnd" has to do with whether or
> not a SACK should be appended. Could you tell me specifically which part
> of section 6.2 you think should prevent SACKs from being piggybacked in
> a single packet request-response scenario? Note that I've also run my
> simple test program on the only other implementation of SCTP that I have
> access to, that on FreeBSD 7.2, and it does piggyback SACKS in the way
> I expect. To me, it makes no sense to ever skip the piggybacking of a
> SACK if you've got one to send. The whole point of delayed SACKs is to
> make this possible, and thus avoid sending a SACK in a separate packet.
But in your example of a single-message request-response, there is no SACK to
send. I'm not sure whether BSD 7.2 implements the immediate SACK draft (it might), but
that would allow one to do what you ask.
>
> Here's one case where this lack of SACK piggybacking can have a big
> performance impact. Suppose C(lient) and S(erver) are sending small
> requests and replies between each other as quickly as possible. ie:
> S sends a reply as soon as it gets a request from C, and C sends another
> request as soon as it gets the previous reply from S. If Nagle is not
> disabled, what happens is this:
>
> C sends request DATA to S
> S sends reply DATA to C
> C and S send SACKs to each other after 200ms
>
> After the 200ms delay introduced above, A finally has the SACK to
> its first request, and only now, by the rules of Nagle, can it send
> another request.
>
> So there's a delay of 200ms in request/reply cycle. If S had piggybacked
> a SACK on the DATA it sent to C, then C would have its SACK when it got
> S's reply and could send another request immediately after receiving the
> reply to the first. And if C piggybacks a SACK onto its second request,
> then S doesn't need to wait for a SACK timeout before it can send the
> second reply.
This is a well-known problem with SCTP, and the 'immediate SACK' draft tries to solve
it. However, changing the code the way you are suggesting will not fully correct
the problem. LKSCTP is notorious for not bundling SACKs.
What we need to do is defer SACKs for a very short amount of time when they need
to be sent, so that they might be bundled with other chunks. However, if the
user application happens to stall, the SACKs should still be sent. This timer
should be on the order of microseconds. This way we would be more likely to bundle
SACKs.
-vlad
>
> There's also bandwidth wasted by sending separate SACK packets.
>
> Thanks for your reply,
> Doug.
>
* Re: Do piggybacked ACKs work?
2009-07-25 2:09 Do piggybacked ACKs work? Doug Graham
` (2 preceding siblings ...)
2009-07-28 16:09 ` Vlad Yasevich
@ 2009-07-28 20:45 ` Doug Graham
2009-07-28 21:18 ` Michael Tüxen
` (7 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Doug Graham @ 2009-07-28 20:45 UTC (permalink / raw)
To: linux-sctp
Also copying Michael, since he seems interested, except he's interested on
the wrong mailing list :-). Some of what I'm replying to here is from the
other mailing list, but since you prefer to use this one, I'll reply here.
On Tue, Jul 28, 2009 at 12:09:28PM -0400, Vlad Yasevich wrote:
> Doug Graham wrote:
> > On Tue, Jul 28, 2009 at 11:13:05AM -0400, Vlad Yasevich wrote:
> >> Doug Graham wrote:
> >>>
> >>> I have a little test that simply has a client send 32 bytes of data to
> >>> server, which then replies back with the same data. What doesn't look
> >>> right to me is that I never see piggybacked ACKs. I see the client
> >>> send 32 bytes, then the server reply in a packet containing a single
> >>> DATA chunk of 32 bytes, and then 200ms later both SACKs are sent.
> >>>
> >> Piggybacked ACKs will not be sent every packet. Looks like you are hitting
> >> the typical single packet request-response senario.
> >>
> >> I'd recommend you read RFC 4960 section 6.2.
> >>
> >> Now, it's possible to do some piggy-backing when bulk data is flowing in
> >> both directions, but that would different then you proposal below.
> >
> > I did a fairly quick scan of section 6.2 in RFC 4960, and I'm not sure
> > what I'm supposed to see in there that would clear this up. I still
> > think that section 6.1 applies:
> >
> > Before an endpoint transmits a DATA chunk, if any received DATA
> > chunks have not been acknowledged (e.g., due to delayed ack), the
> > sender should create a SACK and bundle it with the outbound DATA
> > chunk, as long as the size of the final SCTP packet does not exceed
> > the current MTU. See Section 6.2.
> >
>
> Section 6.2 are specific guidelines of when one should generate SACKs.
> Specifically, SACKs are generate every other packet or within 200 ms of
> unacknowledged DATA, which-ever is faster.
I think this is a misinterpretation of the RFC. Otherwise, what could
that paragraph from section 6.1 possibly mean? I read section 6.2 to
mean that a SACK should be generated for *at least* every second packet,
and that a SACK should be sent within 200ms. The RFC only prohibits
more than one SACK from being sent for every incoming DATA packet;
it doesn't say that a SACK can *only* be sent for every second packet.
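The reading argued for here, where "every second packet" and "200 ms" are upper bounds on the delay rather than the only moments a SACK may go out, can be sketched as a small receiver-side model. The struct and function names are hypothetical, and this is not how any particular stack implements it.

```c
#include <stdbool.h>
#include <stdint.h>

#define SACK_EVERY_N_PACKETS 2   /* at least one SACK per two DATA packets */
#define SACK_MAX_DELAY_MS    200 /* and never later than 200 ms */

/* Hypothetical delayed-SACK bookkeeping for one association. */
struct sack_state {
    unsigned unacked_packets;   /* DATA packets received since the last SACK */
    uint32_t first_unacked_ms;  /* arrival time of the oldest unacked packet */
};

/* Record a DATA packet; returns true if a SACK is now mandatory because
 * this is the second unacknowledged packet. */
bool on_data_packet(struct sack_state *s, uint32_t now_ms)
{
    if (s->unacked_packets == 0)
        s->first_unacked_ms = now_ms;
    s->unacked_packets++;
    return s->unacked_packets >= SACK_EVERY_N_PACKETS;
}

/* True once the delayed-SACK timer has run out for pending data. */
bool sack_timer_expired(const struct sack_state *s, uint32_t now_ms)
{
    return s->unacked_packets > 0 &&
           now_ms - s->first_unacked_ms >= SACK_MAX_DELAY_MS;
}

/* Under this reading, an outbound DATA chunk may carry a SACK as soon as
 * anything is unacknowledged, without waiting for either limit. */
bool may_bundle_with_outbound_data(const struct sack_state *s)
{
    return s->unacked_packets > 0;
}

/* Reset the bookkeeping after a SACK has been sent, bundled or not. */
void on_sack_sent(struct sack_state *s)
{
    s->unacked_packets = 0;
}
```

In the single request/reply case, on_data_packet() returns false for the lone request, yet may_bundle_with_outbound_data() is already true when the reply is queued, so at most one SACK is sent per incoming DATA packet and no timer is needed.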
> What you are proposing is that SACKs are generated for every packet and
> well short of the 200 ms delay limit, which is definitely more aggressive.
> This is explicitly a MUST NOT.
I don't think that's what the RFC says, but I guess only the author(s) of
the RFC could tell us what they really meant. Your interpretation doesn't
make any sense to me.
> But in you example of single message request-response, there is no SACK to
> send.
I don't know what you mean by this. A SACK needs to be sent for both
the request and the reply. Or perhaps you meant that since there's only
one DATA packet in each direction, the conjectured "SACK only every two
packets" rule kicks in. But as I mention, I very much doubt that that's
what the RFC actually means to say.
> Not sure if BSD 7.2 implements immediate SACK draft (it might), but
> that would allow one to do what you ask.
That's not what I saw BSD do. As I read
http://tools.ietf.org/html/draft-tuexen-tsvwg-sctp-sack-immediately-02
it specifies a way for a data sender to request that the receiver send a
SACK immediately upon receipt of the DATA. ie: it disables delayed-ACK
for a DATA chunk with the proposed I bit set. This would result in a
SACK being sent by the receiver immediately after it receives the DATA,
and would prevent piggybacking in my scenario. The reason it would
prevent piggybacking is that at the time the receiving SCTP gets the
DATA packet and is required to send a SACK, it has no DATA to send yet,
and so nothing to piggyback the SACK on. In BSD's case, I *did* see it
piggyback the SACK.
I'll send pcap traces from BSD and Linux later today.
> > Here's one case where this lack of SACK piggybacking can have a big
> > performance impact. Suppose C(lient) and S(erver) are sending small
> > requests and replies between each other as quickly as possible. ie:
> > S sends a reply as soon as it gets a request from C, and C sends another
> > request as soon as it gets the previous reply from S. If Nagle is not
> > disabled, what happens is this:
> >
> > C sends request DATA to S
> > S sends reply DATA to C
> > C and S send SACKs to each other after 200ms
> >
> > After the 200ms delay introduced above, A finally has the SACK to
> > its first request, and only now, by the rules of Nagle, can it send
> > another request.
> >
> > So there's a delay of 200ms in request/reply cycle. If S had piggybacked
> > a SACK on the DATA it sent to C, then C would have its SACK when it got
> > S's reply and could send another request immediately after receiving the
> > reply to the first. And if C piggybacks a SACK onto its second request,
> > then S doesn't need to wait for a SACK timeout before it can send the
> > second reply.
>
> This is a well known problem with SCTP and 'immidiate sack' draft tries to solve
> it. However, changing the code the way you suggesting will not fully correct
> the problem. LKSCTP is notorious for not bundling SACKS.
It's not clear to me why my proposed fix would not solve the problem, unless you
mean that it would violate a clause in the RFC (which I still don't agree with).
The 'immediate SACK' draft would also solve my problem, except it would do so
at the expense of an extra SACK packet. ie: it prevents piggybacking when the
receiver has no DATA to send immediately, so a separate SACK would need to be
sent. With my fix, the receiver would delay the SACK as usual, but then a short
time after, it would be given DATA to send back, and would piggyback the SACK
on that.
From your other email:
>> I think this logic is wrong. I don't think the decision as to whether
>> or not to piggyback a SACK on a DATA packet should have anything to do
>> with the receive window. If there is unacknowledged data, then the
>> SACK should be sent, regardless of the state of the receive window.
>
>This test is there to catch potential window update SACK that can be bundled
>with any outgoing data. It is not there for immediate SACKs.
That's not what the comment above the code says. The comment says
/* If sending DATA and haven't aleady bundled a SACK, try to
* bundle one in to the packet.
*/
If this code is just there to generate window updates, where is the
code to implement the clause from section 6.1 that I included above,
and provide again here?:
Before an endpoint transmits a DATA chunk, if any received DATA
chunks have not been acknowledged (e.g., due to delayed ack), the
sender should create a SACK and bundle it with the outbound DATA
chunk, as long as the size of the final SCTP packet does not exceed
the current MTU. See Section 6.2.
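Read literally, the quoted paragraph reduces to a predicate on two things: whether an acknowledgement is owed, and whether the packet still fits in the MTU once the SACK is added. A minimal sketch follows, with a hypothetical function name and parameters; it is not the check in net/sctp/output.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Decide whether to bundle a SACK with an outbound DATA chunk.
 *   unacked_data_pending: some received DATA has not been acknowledged
 *   packet_len:           packet size so far, headers and DATA included
 *   sack_len:             size of the SACK chunk that would be added
 *   pmtu:                 current path MTU */
bool should_bundle_sack(bool unacked_data_pending, size_t packet_len,
                        size_t sack_len, size_t pmtu)
{
    if (!unacked_data_pending)
        return false;                       /* nothing to acknowledge */
    return packet_len + sack_len <= pmtu;   /* bundle only if it still fits */
}
```

Note that the receive window does not appear among the inputs, which is the point being made about the existing test.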
--Doug.
* Re: Do piggybacked ACKs work?
2009-07-25 2:09 Do piggybacked ACKs work? Doug Graham
` (3 preceding siblings ...)
2009-07-28 20:45 ` Doug Graham
@ 2009-07-28 21:18 ` Michael Tüxen
2009-07-28 22:31 ` Doug Graham
` (6 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Michael Tüxen @ 2009-07-28 21:18 UTC (permalink / raw)
To: linux-sctp
On Jul 28, 2009, at 10:45 PM, Doug Graham wrote:
> Also copying Michael, since he seems interested, except he's
> interested on
> the wrong mailing list :-). Some of what I'm replying to here is
> from the
> other mailing list, but since you prefer to use this one, I'll reply
> here.
>
> On Tue, Jul 28, 2009 at 12:09:28PM -0400, Vlad Yasevich wrote:
>> Doug Graham wrote:
>>> On Tue, Jul 28, 2009 at 11:13:05AM -0400, Vlad Yasevich wrote:
>>>> Doug Graham wrote:
>>>>>
>>>>> I have a little test that simply has a client send 32 bytes of
>>>>> data to
>>>>> server, which then replies back with the same data. What
>>>>> doesn't look
>>>>> right to me is that I never see piggybacked ACKs. I see the
>>>>> client
>>>>> send 32 bytes, then the server reply in a packet containing a
>>>>> single
>>>>> DATA chunk of 32 bytes, and then 200ms later both SACKs are sent.
>>>>>
>>>> Piggybacked ACKs will not be sent every packet. Looks like you
>>>> are hitting
>>>> the typical single packet request-response senario.
>>>>
>>>> I'd recommend you read RFC 4960 section 6.2.
>>>>
>>>> Now, it's possible to do some piggy-backing when bulk data is
>>>> flowing in
>>>> both directions, but that would different then you proposal below.
>>>
>>> I did a fairly quick scan of section 6.2 in RFC 4960, and I'm not
>>> sure
>>> what I'm supposed to see in there that would clear this up. I still
>>> think that section 6.1 applies:
>>>
>>> Before an endpoint transmits a DATA chunk, if any received DATA
>>> chunks have not been acknowledged (e.g., due to delayed ack), the
>>> sender should create a SACK and bundle it with the outbound DATA
>>> chunk, as long as the size of the final SCTP packet does not
>>> exceed
>>> the current MTU. See Section 6.2.
>>>
>>
>> Section 6.2 are specific guidelines of when one should generate
>> SACKs.
>> Specifically, SACKs are generate every other packet or within 200
>> ms of
>> unacknowledged DATA, which-ever is faster.
This is the way it should work...
>
> I think this is a misinterpretation of the RFC. Otherwise, what could
> that paragraph from section 6.1 possibly mean? I read section 6.2 to
> mean that a SACK should be generated for *at least* every second
> packet,
> and that a SACK should be sent within 200ms. The RFC only prohibits
> more than one SACK from being sent for every incoming DATA packet;
> it doesn't say that a SACK can *only* be be sent for every second
> packet.
>
>> What you are proposing is that SACKs are generated for every packet
>> and
>> well short of the 200 ms delay limit, which is definitely more
>> aggressive.
>> This is explicitly a MUST NOT.
>
> I don't think that's what the RFC says, but I guess only the
> author(s) of
> the RFC could tell us what they really meant. Your interpretation
> doesn't
> make any sense to.
Let us see the tracefile and I can tell you if that behaviour is the
one the authors of the RFC intended...
>
>> But in you example of single message request-response, there is no
>> SACK to
>> send.
>
> I don't know what you mean by this. A SACK needs to be sent for both
> the request and the reply. Or perhaps you meant that since there's
> only
> one DATA packet in each direction, the conjectured "SACK only every
> two
> packets" rule kicks in. But as I mention, I very much doubt that
> that's
> what the RFC actually means to say.
>
>> Not sure if BSD 7.2 implements immediate SACK draft (it might), but
>> that would allow one to do what you ask.
I looked in the svn repository and the code made it into 7.2...
>
> That's not what I saw BSD do. As I read
> http://tools.ietf.org/html/draft-tuexen-tsvwg-sctp-sack-immediately-02
> it specifies a way for a data sender to request that the receiver
> send a
> SACK immediately upon receipt of the DATA. ie: it disables delayed-
> ACK
> for a DATA chunk with the proposed I bit set. This would result in a
> SACK being sent by the receiver immediately after it receives the
> DATA,
> and would prevent piggybacking in my scenario. The reason it would
> prevent piggybacking is that at the time the receiving SCTP gets the
> DATA packet and is required to send a SACK, it has no DATA to send
> yet,
> and so nothing to piggyback the SACK on. In BSD's case, I *did* see
> it
> piggyback the SACK.
I guess you did not specify SCTP_DATA_SACK_IMMEDIATELY in the send()
call...
>
> I'll send pcap traces from BSD and Linux later today.
>
>
>>> Here's one case where this lack of SACK piggybacking can have a big
>>> performance impact. Suppose C(lient) and S(erver) are sending small
>>> requests and replies between each other as quickly as possible. ie:
>>> S sends a reply as soon as it gets a request from C, and C sends
>>> another
>>> request as soon as it gets the previous reply from S. If Nagle is
>>> not
>>> disabled, what happens is this:
>>>
>>> C sends request DATA to S
>>> S sends reply DATA to C
>>> C and S send SACKs to each other after 200ms
>>>
>>> After the 200ms delay introduced above, A finally has the SACK to
>>> its first request, and only now, by the rules of Nagle, can it send
>>> another request.
>>>
>>> So there's a delay of 200ms in request/reply cycle. If S had
>>> piggybacked
>>> a SACK on the DATA it sent to C, then C would have its SACK when
>>> it got
>>> S's reply and could send another request immediately after
>>> receiving the
>>> reply to the first. And if C piggybacks a SACK onto its second
>>> request,
>>> then S doesn't need to wait for a SACK timeout before it can send
>>> the
>>> second reply.
>>
>> This is a well known problem with SCTP and 'immidiate sack' draft
>> tries to solve
>> it. However, changing the code the way you suggesting will not
>> fully correct
>> the problem. LKSCTP is notorious for not bundling SACKS.
>
> It's not clear to me why my proposed fix would not solve the
> problem, unless you
> mean that it would violate a clause in the RFC (which I still don't
> agree with).
> The 'immediate SACK' draft would also solve my problem, except it
> would do so
> at the expense of an extra SACK packet. ie: it prevents
> piggybacking when the
> receiver has no DATA to send immediately, so a separate SACK would
> need to be
> sent. With my fix, the receiver would delay the SACK as usual, but
> then a short
> time after, it would be given DATA to send back, and would piggyback
> the SACK
> on that.
>
> From your other email:
>
>>> I think this logic is wrong. I don't think the decision as to
>>> whether
>>> or not to piggyback a SACK on a DATA packet should have anything
>>> to do
>>> with the receive window. If there is unacknowledged data, then the
>>> SACK should be sent, regardless of the state of the receive window.
>>
>> This test is there to catch potential window update SACK that can
>> be bundled
>> with any outgoing data. It is not there for immediate SACKs.
>
> That's not what the comment above the code says. The comment says
>
> /* If sending DATA and haven't aleady bundled a SACK, try to
> * bundle one in to the packet.
> */
>
> If this code is just there to generate window updates, where is the
> code to implement the clause from section 6.1 that I included above,
> and provide again here?:
>
> Before an endpoint transmits a DATA chunk, if any received DATA
> chunks have not been acknowledged (e.g., due to delayed ack), the
> sender should create a SACK and bundle it with the outbound DATA
> chunk, as long as the size of the final SCTP packet does not exceed
> the current MTU. See Section 6.2.
>
> --Doug.
>
* Re: Do piggybacked ACKs work?
2009-07-25 2:09 Do piggybacked ACKs work? Doug Graham
` (4 preceding siblings ...)
2009-07-28 21:18 ` Michael Tüxen
@ 2009-07-28 22:31 ` Doug Graham
2009-07-28 22:49 ` Doug Graham
` (5 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Doug Graham @ 2009-07-28 22:31 UTC (permalink / raw)
To: linux-sctp
[-- Attachment #1: Type: text/plain, Size: 2286 bytes --]
On Tue, Jul 28, 2009 at 11:18:05PM +0200, Michael Tüxen wrote:
> >I don't think that's what the RFC says, but I guess only the
> >author(s) of
> >the RFC could tell us what they really meant. Your interpretation
> >doesn't
> >make any sense to.
>
> Let us see the tracefile and I can tell you if that behaviour is the
> one the authors of the RFC intended...
Heh. I'm guessing that you're one of the authors then? I see you are given
credit in RFC 4960, but the only author listed at the top is R. Stewart.
I've attached linux and BSD capture files, and the client and server
test programs. The client just sends a request to the server, and then
waits for a reply. The client loops four times and sleeps 2 seconds
between iterations. I ran the client on a Fedora 10 laptop in all cases
(kernel version 2.6.27.25) and did the wireshark capture on the same
laptop. sctp_bsd72_server.cap is the capture when running the server
on a FreeBSD 7.2 machine. sctp_linux_server.cap is the capture when
running the server on a Fedora 10 desktop machine. A single iteration
with the BSD server looks like:
7 2.000205 10.0.0.15 10.0.0.11 SCTP DATA
8 2.000501 10.0.0.11 10.0.0.15 SCTP SACK DATA
9 2.200484 10.0.0.15 10.0.0.11 SCTP SACK
So one DATA packet from client to server, then the reply data packet
from server to client with a bundled SACK, then the SACK from client
to server to acknowledge the reply. The last SACK does have to wait
for the SACK timer to expire; this is to be expected, since no more
data is sent until the next iteration in a couple seconds.
A single iteration with the Linux server looks like:
7 2.000161 10.0.0.15 10.0.0.12 SCTP DATA
8 2.000495 10.0.0.12 10.0.0.15 SCTP DATA
9 2.199995 10.0.0.15 10.0.0.12 SCTP SACK
10 2.200170 10.0.0.12 10.0.0.15 SCTP SACK
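The two iterations above can be compared mechanically. The sketch below hard-codes the rows from the two traces; the struct, the function names, and the assumption that the first SACK sent by the server is the one acknowledging the client's request are mine, not part of the captures.

```c
#include <stddef.h>

/* One row of the trace: timestamp, direction, and chunk types present. */
struct trace_row {
    double t;         /* capture timestamp in seconds */
    int from_client;  /* 1 if sent by 10.0.0.15 (the client) */
    int has_data;     /* packet carries a DATA chunk */
    int has_sack;     /* packet carries a SACK chunk */
};

const struct trace_row bsd_iteration[] = {
    { 2.000205, 1, 1, 0 },
    { 2.000501, 0, 1, 1 },
    { 2.200484, 1, 0, 1 },
};

const struct trace_row linux_iteration[] = {
    { 2.000161, 1, 1, 0 },
    { 2.000495, 0, 1, 0 },
    { 2.199995, 1, 0, 1 },
    { 2.200170, 0, 0, 1 },
};

/* Number of packets that carry a SACK and no DATA. */
size_t count_standalone_sacks(const struct trace_row *rows, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (rows[i].has_sack && !rows[i].has_data)
            count++;
    return count;
}

/* Seconds between the client's first DATA and the first SACK sent by the
 * server, or -1.0 if the trace contains no such pair. */
double request_ack_delay(const struct trace_row *rows, size_t n)
{
    double sent = -1.0;
    for (size_t i = 0; i < n; i++) {
        if (sent < 0.0 && rows[i].from_client && rows[i].has_data)
            sent = rows[i].t;
        else if (sent >= 0.0 && !rows[i].from_client && rows[i].has_sack)
            return rows[i].t - sent;
    }
    return -1.0;
}
```

On these rows the BSD iteration has one standalone SACK and the request is acknowledged about 0.3 ms after it was sent, while the Linux iteration has two standalone SACKs and the request waits roughly 200 ms for its acknowledgement.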
> >In BSD's case, I *did* see it piggyback the SACK.
>
> I guess you did not specify SCTP_DATA_SACK_IMMEDIATELY in the send()
> call...
No, I didn't. If I had, I agree that no piggybacking would have been
possible. That was what I was trying to say: from the trace, it did
not look as though BSD was using the immediate SACK feature.
--Doug.
[-- Attachment #2: client.c --]
[-- Type: text/plain, Size: 1106 bytes --]
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define PORT 12342

int main(int argc, char **argv)
{
    int s, cc, i;
    struct sockaddr_in addr;
    in_addr_t destaddr = inet_addr("127.0.0.1");

    /* Optional first argument: server address (defaults to loopback). */
    if (argv[1])
        destaddr = inet_addr(argv[1]);

    if ((s = socket(PF_INET, SOCK_SEQPACKET, IPPROTO_SCTP)) < 0) {
        perror("socket");
        exit(1);
    }

    memset(&addr, '\0', sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = destaddr;
    addr.sin_port = htons(PORT);

    char buf[] = "yo momma wears army boots";
    struct iovec iov[1] = {{.iov_base = buf, .iov_len = sizeof(buf)}};
    struct msghdr msg = {
        .msg_name = &addr,
        .msg_namelen = sizeof(addr),
        .msg_iov = iov,
        .msg_iovlen = sizeof(iov)/sizeof(iov[0]),
        .msg_control = NULL,
        .msg_controllen = 0,
        .msg_flags = 0
    };

    /* Send four requests, two seconds apart; replies are not read. */
    for (i = 0; i < 4; i++) {
        printf("Sending message to %s\n", inet_ntoa(addr.sin_addr));
        if ((cc = sendmsg(s, &msg, 0)) < 0) {
            perror("sendmsg");
            exit(1);
        }
        sleep(2);
    }
    return 0;
}
[-- Attachment #3: server.c --]
[-- Type: text/plain, Size: 1033 bytes --]
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#define PORT 12342

int main(void)
{
    int s, cc;
    struct sockaddr_in addr;

    if ((s = socket(PF_INET, SOCK_SEQPACKET, IPPROTO_SCTP)) < 0) {
        perror("socket");
        exit(1);
    }

    memset(&addr, '\0', sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = INADDR_ANY;
    addr.sin_port = htons(PORT);

    if (bind(s, (struct sockaddr *) &addr, sizeof(addr)) < 0) {
        perror("bind");
        exit(1);
    }
    if (listen(s, 5) < 0) {
        perror("listen");
        exit(1);
    }

    for (;;) {
        char buf[128];
        struct iovec iov[1] = {{.iov_base = buf, .iov_len = sizeof(buf)}};
        struct msghdr msg = {
            .msg_name = &addr,
            .msg_namelen = sizeof(addr),
            .msg_iov = iov,
            .msg_iovlen = sizeof(iov)/sizeof(iov[0]),
            .msg_control = NULL,
            .msg_controllen = 0,
            .msg_flags = 0
        };

        printf("Waiting for message\n");
        /* recvmsg() fills in addr with the sender's address. */
        cc = recvmsg(s, &msg, 0);
        printf("Got msg, len = %d\n", cc);
        /* Reply to the sender; iov_len is left at sizeof(buf), not cc. */
        sendmsg(s, &msg, 0);
    }
}
[-- Attachment #4: sctp_bsd72_server.cap --]
[-- Type: application/octet-stream, Size: 2738 bytes --]
[-- Attachment #5: sctp_linux_server.cap --]
[-- Type: application/octet-stream, Size: 3080 bytes --]
* Re: Do piggybacked ACKs work?
2009-07-25 2:09 Do piggybacked ACKs work? Doug Graham
` (5 preceding siblings ...)
2009-07-28 22:31 ` Doug Graham
@ 2009-07-28 22:49 ` Doug Graham
2009-07-29 0:06 ` Michael Tüxen
` (4 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Doug Graham @ 2009-07-28 22:49 UTC (permalink / raw)
To: linux-sctp
On Tue, Jul 28, 2009 at 06:31:08PM -0400, Doug Graham wrote:
> The client just sends a request to the server, and then
> waits for a reply.
Ummm, I lied about this. In this iteration of my test program,
the client doesn't actually wait for or read the reply. But that
doesn't affect the analysis, since it's what the server does that
matters here. Even if the client does wait for and read the reply,
the packets exchanged are exactly the same.
--Doug
* Re: Do piggybacked ACKs work?
2009-07-25 2:09 Do piggybacked ACKs work? Doug Graham
` (6 preceding siblings ...)
2009-07-28 22:49 ` Doug Graham
@ 2009-07-29 0:06 ` Michael Tüxen
2009-07-29 15:19 ` Vlad Yasevich
` (3 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Michael Tüxen @ 2009-07-29 0:06 UTC (permalink / raw)
To: linux-sctp
On Jul 29, 2009, at 12:31 AM, Doug Graham wrote:
> On Tue, Jul 28, 2009 at 11:18:05PM +0200, Michael Tüxen wrote:
>>> I don't think that's what the RFC says, but I guess only the
>>> author(s) of
>>> the RFC could tell us what they really meant. Your interpretation
>>> doesn't
>>> make any sense to.
>>
>> Let us see the tracefile and I can tell you if that behaviour is the
>> one the authors of the RFC intended...
>
> Heh. I'm guessing that you're one of the authors then? I see you're given
> credit in RFC 4960, but the only author listed at the top is R. Stewart.
Randy is the editor of the document...
>
> I've attached linux and BSD capture files, and the client and server
> test programs. The client just sends a request to the server, and then
> waits for a reply. The client loops four times and sleeps 2 seconds
> between iterations. I ran the client on a Fedora 10 laptop in all cases
> (kernel version 2.6.27.25) and did the wireshark capture on the same
> laptop. sctp_bsd72_server.cap is the capture when running the server
> on a FreeBSD 7.2 machine. sctp_linux_server.cap is the capture when
> running the server on a Fedora 10 desktop machine. A single iteration
> with the BSD server looks like:
>
> 7 2.000205 10.0.0.15 10.0.0.11 SCTP DATA
> 8 2.000501 10.0.0.11 10.0.0.15 SCTP SACK DATA
> 9 2.200484 10.0.0.15 10.0.0.11 SCTP SACK
This is what I would expect.
>
> So one DATA packet from client to server, then the reply data packet
> from server to client with a bundled SACK, then the SACK from client
> to server to acknowledge the reply. The last SACK does have to wait
> for the SACK timer to expire; this is to be expected, since no more
> data is sent until the next iteration in a couple seconds.
>
> A single iteration with the Linux server looks like:
>
> 7 2.000161 10.0.0.15 10.0.0.12 SCTP DATA
> 8 2.000495 10.0.0.12 10.0.0.15 SCTP DATA
> 9 2.199995 10.0.0.15 10.0.0.12 SCTP SACK
> 10 2.200170 10.0.0.12 10.0.0.15 SCTP SACK
This is what I would not expect.
Vlad: Any reason not to bundle the SACK with the DATA chunk?
>
>
>>> In BSD's case, I *did* see it piggyback the SACK.
>>
>> I guess you did not specify SCTP_DATA_SACK_IMMEDIATELY in the send()
>> call...
>
> No, I didn't. If I had, I agree that no piggybacking would have been
> possible. That was what I was trying to say: from the trace, it did
> not look as though BSD was using the immediate SACK feature.
The I-bit is currently only set when the user requests it or you are
in SHUTDOWN-PENDING...
>
> --Doug.
> <client.c><server.c><sctp_bsd72_server.cap><sctp_linux_server.cap>
* Re: Do piggybacked ACKs work?
2009-07-25 2:09 Do piggybacked ACKs work? Doug Graham
` (7 preceding siblings ...)
2009-07-29 0:06 ` Michael Tüxen
@ 2009-07-29 15:19 ` Vlad Yasevich
2009-07-29 16:07 ` Doug Graham
` (2 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Vlad Yasevich @ 2009-07-29 15:19 UTC (permalink / raw)
To: linux-sctp
Michael Tüxen wrote:
>
> On Jul 29, 2009, at 12:31 AM, Doug Graham wrote:
>
>> On Tue, Jul 28, 2009 at 11:18:05PM +0200, Michael Tüxen wrote:
>>>> I don't think that's what the RFC says, but I guess only the author(s) of
>>>> the RFC could tell us what they really meant. Your interpretation doesn't
>>>> make any sense to me.
>>>
>>> Let us see the tracefile and I can tell you if that behaviour is the
>>> one the authors of the RFC intended...
>>
>> Heh. I'm guessing that you're one of the authors then? I see you're given
>> credit in RFC 4960, but the only author listed at the top is R. Stewart.
> Randy is the editor of the document...
>>
>> I've attached linux and BSD capture files, and the client and server
>> test programs. The client just sends a request to the server, and then
>> waits for a reply. The client loops four times and sleeps 2 seconds
>> between iterations. I ran the client on a Fedora 10 laptop in all cases
>> (kernel version 2.6.27.25) and did the wireshark capture on the same
>> laptop. sctp_bsd72_server.cap is the capture when running the server
>> on a FreeBSD 7.2 machine. sctp_linux_server.cap is the capture when
>> running the server on a Fedora 10 desktop machine. A single iteration
>> with the BSD server looks like:
>>
>> 7 2.000205 10.0.0.15 10.0.0.11 SCTP DATA
>> 8 2.000501 10.0.0.11 10.0.0.15 SCTP SACK DATA
>> 9 2.200484 10.0.0.15 10.0.0.11 SCTP SACK
> This is what I would expect.
Hmm... time to re-read 6.1 and 6.2...
>>
>> So one DATA packet from client to server, then the reply data packet
>> from server to client with a bundled SACK, then the SACK from client
>> to server to acknowledge the reply. The last SACK does have to wait
>> for the SACK timer to expire; this is to be expected, since no more
>> data is sent until the next iteration in a couple seconds.
>>
>> A single iteration with the Linux server looks like:
>>
>> 7 2.000161 10.0.0.15 10.0.0.12 SCTP DATA
>> 8 2.000495 10.0.0.12 10.0.0.15 SCTP DATA
>> 9 2.199995 10.0.0.15 10.0.0.12 SCTP SACK
>> 10 2.200170 10.0.0.12 10.0.0.15 SCTP SACK
> This is what I would not expect.
> Vlad: Any reason not to bundle the SACK with the DATA chunk?
Since we received only 1 packet so far, the SACK is delayed. The implementers
have focused on section 6.2, but seem to have ignored the following text
from 6.1:

   Before an endpoint transmits a DATA chunk, if any received DATA
   chunks have not been acknowledged (e.g., due to delayed ack), the
   sender should create a SACK and bundle it with the outbound DATA
   chunk, as long as the size of the final SCTP packet does not exceed
   the current MTU. See Section 6.2.

Looks like BSD does this and Linux doesn't. Linux has behaved this way
since the beginning...
Doug, can you regenerate your patch with a proper commit comment and sign-off
(according to Documentation/SubmittingPatches)?
Thanks
-vlad
>>
>>
>>>> In BSD's case, I *did* see it piggyback the SACK.
>>>
>>> I guess you did not specify SCTP_DATA_SACK_IMMEDIATELY in the send()
>>> call...
>>
>> No, I didn't. If I had, I agree that no piggybacking would have been
>> possible. That was what I was trying to say: from the trace, it did
>> not look as though BSD was using the immediate SACK feature.
> The I-bit is currently only set when the user requests it or you are
> in SHUTDOWN-PENDING...
>>
>> --Doug.
>> <client.c><server.c><sctp_bsd72_server.cap><sctp_linux_server.cap>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
* Re: Do piggybacked ACKs work?
2009-07-25 2:09 Do piggybacked ACKs work? Doug Graham
` (8 preceding siblings ...)
2009-07-29 15:19 ` Vlad Yasevich
@ 2009-07-29 16:07 ` Doug Graham
2009-07-29 16:21 ` Doug Graham
2009-07-29 18:14 ` Vlad Yasevich
11 siblings, 0 replies; 13+ messages in thread
From: Doug Graham @ 2009-07-29 16:07 UTC (permalink / raw)
To: linux-sctp
On Wed, Jul 29, 2009 at 11:19:08AM -0400, Vlad Yasevich wrote:
> >> 7 2.000205 10.0.0.15 10.0.0.11 SCTP DATA
> >> 8 2.000501 10.0.0.11 10.0.0.15 SCTP SACK DATA
> >> 9 2.200484 10.0.0.15 10.0.0.11 SCTP SACK
> > This is what I would expect.
>
> Hmm... time to re-read 6.1 and 6.2...
> [...]
> Since we received only 1 packet so far, the SACK is delayed. The implementers
> have focused on section 6.2, but seemed to have ignored the following text
> from 6.1:
>
> Before an endpoint transmits a DATA chunk, if any received DATA
> chunks have not been acknowledged (e.g., due to delayed ack), the
> sender should create a SACK and bundle it with the outbound DATA
> chunk, as long as the size of the final SCTP packet does not exceed
> the current MTU. See Section 6.2.
>
> Looks like BSD does this and Linux doesn't. Linux has behaved this way
> since the beginning...
>
> Doug, can you regenerate your patch with a proper commit comment and sign-off
> (according to Documentation/SubmittingPatches)?
I just sent a patch, but please keep in mind that I am not by any means
an expert on this code. I've tested my patch only under conditions
where no packet loss was occurring, so I cannot vouch for its behaviour
when losses do occur, nor do I know how it will interact with features
that I haven't used during testing. However, to the best of my knowledge,
this patch does correctly implement the intended semantics of RFC 4960.
--Doug.
* Re: Do piggybacked ACKs work?
2009-07-25 2:09 Do piggybacked ACKs work? Doug Graham
` (9 preceding siblings ...)
2009-07-29 16:07 ` Doug Graham
@ 2009-07-29 16:21 ` Doug Graham
2009-07-29 18:14 ` Vlad Yasevich
11 siblings, 0 replies; 13+ messages in thread
From: Doug Graham @ 2009-07-29 16:21 UTC (permalink / raw)
To: linux-sctp
On Wed, Jul 29, 2009 at 12:07:11PM -0400, Doug Graham wrote:
> On Wed, Jul 29, 2009 at 11:19:08AM -0400, Vlad Yasevich wrote:
> > >> 7 2.000205 10.0.0.15 10.0.0.11 SCTP DATA
> > >> 8 2.000501 10.0.0.11 10.0.0.15 SCTP SACK DATA
> > >> 9 2.200484 10.0.0.15 10.0.0.11 SCTP SACK
> > > This is what I would expect.
> >
> > Hmm... time to re-read 6.1 and 6.2...
> > [...]
> > Since we received only 1 packet so far, the SACK is delayed. The implementers
> > have focused on section 6.2, but seemed to have ignored the following text
> > from 6.1:
> >
> > Before an endpoint transmits a DATA chunk, if any received DATA
> > chunks have not been acknowledged (e.g., due to delayed ack), the
> > sender should create a SACK and bundle it with the outbound DATA
> > chunk, as long as the size of the final SCTP packet does not exceed
> > the current MTU. See Section 6.2.
> >
> > Looks like BSD does this and Linux doesn't. Linux has behaved this way
> > since the beginning...
> >
> > Doug, can you regenerate your patch with a proper commit comment and sign-off
> > (according to Documentation/SubmittingPatches)?
>
> I just sent a patch, but please keep in mind that I am not by any means
> an expert on this code. I've tested my patch only under conditions
> where no packet loss was occurring, so I cannot vouch for its behaviour
> when losses do occur, nor do I know how it will interact with features
> that I haven't used during testing. However, to the best of my knowledge,
> this patch does correctly implement the intended semantics of RFC 4960.
Oh yeah, I should also mention that I still don't understand what the
original 'asoc->a_rwnd > asoc->rwnd' condition was all about. I replaced
that condition with timer_pending(), but if the original condition really
does have something to do with sending window updates as you mentioned,
it's possible that it should be left in as well, i.e.:
if ((asoc->a_rwnd > asoc->rwnd) || timer_pending(timer))
but then keep in mind that the body of that if block has been rewritten
to assume that the timer is pending if the block is executed. If the
block can be entered as a result of either of these two conditions,
that assumption may no longer be true.
--Doug.
* Re: Do piggybacked ACKs work?
2009-07-25 2:09 Do piggybacked ACKs work? Doug Graham
` (10 preceding siblings ...)
2009-07-29 16:21 ` Doug Graham
@ 2009-07-29 18:14 ` Vlad Yasevich
11 siblings, 0 replies; 13+ messages in thread
From: Vlad Yasevich @ 2009-07-29 18:14 UTC (permalink / raw)
To: linux-sctp
Doug Graham wrote:
>
> Oh yeah, I should also mention that I still don't understand what the
> original 'asoc->a_rwnd > asoc->rwnd' condition was all about. I replaced
> that condition with timer_pending(), but if the original condition really
> does have something to do with sending window updates as you mentioned,
> it's possible that it should be left in as well, i.e.:
>
> if ((asoc->a_rwnd > asoc->rwnd) || timer_pending(timer))
>
> but then keep in mind that the body of that if block has been rewritten
> to assume that the timer is pending if the block is executed. If the
> block can be entered as a result of either of these two conditions,
> that assumption may no longer be true.
I just re-read this code and it looks like it's trying to catch a condition when
a new packet was given to the socket, but the socket issued a write instead of
a read.
In this case, the timer solution would do the same thing.
During loss, there is no timer as gaps are reported immediately. Upon loss
recovery, there is a timer, but the notification shouldn't be delayed (that it
is delayed today is a bug), and the patch corrects that as well.
I'll run this through its paces to see if anything breaks, but I think it
should be good.
-vlad
>
> --Doug.