From: Vlad Yasevich <vladislav.yasevich@hp.com>
To: linux-sctp@vger.kernel.org
Subject: Re: [PATCH] Fix piggybacked ACKs
Date: Fri, 31 Jul 2009 14:18:11 +0000 [thread overview]
Message-ID: <4A72FD23.9060303@hp.com> (raw)
In-Reply-To: <20090729160557.GC29475@nortel.com>
Michael Tüxen wrote:
> On Jul 31, 2009, at 3:17 AM, Doug Graham wrote:
>
>> On Thu, Jul 30, 2009 at 07:40:47PM -0400, Doug Graham wrote:
>>> On Thu, Jul 30, 2009 at 05:24:09PM -0400, Vlad Yasevich wrote:
>>>> If you still have BSD setup, can you try increasing you message size
>>>> to say 1442 and see what happens.
>>>>
>>>> I'd expect bundles SACKs at 1440 bytes, but then probably a separate
>>>> SACK and DATA.
>>>
>>> The largest amount of data I can send and still have the BSD server
>>> bundle
>>> a SACK with the response is 1436 bytes. The total ethernet frame size
>>> at that point is 1514 bytes, so this seems correct. I've attached
>>> wireshark captures with data sizes of 1436 bytes and 1438 bytes.
>>> It's interesting to note that if BSD decides not to bundle a SACK,
>>> it instead sends a separate SACK packet immediately; it does not wait
>>> for the SACK timer to timeout. It first sends the SACK, then the DATA
>>> immediately follows. I don't think Wei's patch would do this; I think
>>> that if his patch determined that bundling a SACK would cause the packet
>>> to exceed the MTU, then the behaviour will revert to what it was before
>>> my patch is applied: ie the SACK will not be sent for 200ms.
>>
>> I think it's about time that I sat down and carefully read the RFC all
>> the
>> way through before trying to do much more analysis of what's happening on
>> the wire, but I did just notice something surprising while try slightly
>> larger packets. For one, I could've sworn that I saw a ethernet frame
>> of 1516 bytes at one point, but I didn't save the capture and don't
>> know whether it was Linux or BSD that sent the oversized frame, or just
>> my imagination. But here's one that I did capture when sending and
>> receiving 1454 bytes of data. 1452 bytes is the most data that will fit
>> in a single 1514 byte ethernet frame, so 1454 bytes must be fragmented.
>> The capture is attached, but here's one iteration:
>>
>> 13 2.002632 10.0.0.15 10.0.0.11 DATA (1452 bytes data)
>> 14 2.203092 10.0.0.11 10.0.0.15 SACK
>> 15 2.203153 10.0.0.15 10.0.0.11 DATA (2 bytes data)
>> 16 2.203427 10.0.0.11 10.0.0.15 SACK
>> 17 2.203808 10.0.0.11 10.0.0.15 DATA (1452 bytes data)
>> 18 2.403524 10.0.0.15 10.0.0.11 SACK
>> 19 2.403686 10.0.0.11 10.0.0.15 DATA (2 bytes data)
>> 20 2.603285 10.0.0.15 10.0.0.11 SACK
>>
>> What bothers me about this is that Nagle seems to be introducing a delay
> This is the common bad interaction between Nagle and delayed SACKs.
>> here. The first DATA packets in both directions are MTU-sized packets,
>> yet both the Linux client and the BSD server wait 200ms until they get
>> the SACK to the first fragment before sending the second fragment.
>> The server can't send its reply until it gets both fragments, and the
>> client can't reassemble the reply until it gets both fragments, so from
>> the application's point of view, the reply doesn't arrive until 400ms
>> after the request is sent. This could probably be fixed by disabling
>> Nagle with SCTP_NODELAY, but that shouldn't be required. Nagle is only
>> supposed to prevent multiple outstanding *small* packets.
> Yes, but Nagle operates at the level of chunks...
> This problem is one of the reasons why we have
Michael
That doesn't make sense. Nagle was meant to prevent sending a bunch small
packets. That doesn't apply if the user sends a large enough message that it
ends up being fragmenting into a full sized data chunk and a small-sized data
chunk. It doesn't sound like Nagle should apply to the second fragment.
-vlad
> draft-tuexen-tsvwg-sctp-sack-immediately-02
> The kernel can set the I-Bit on the first chunk...
> Currently the only way around this is to disable Nagle at all...
>>
>> If you tell me I'm full of crap, I promise I'll shut up until I read
>> the whole RFC :-)
>>
>> --Doug.
>> <bsd72_server_1454.cap>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
next prev parent reply other threads:[~2009-07-31 14:18 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-07-29 16:05 [PATCH] Fix piggybacked ACKs Doug Graham
2009-07-30 6:48 ` Wei Yongjun
2009-07-30 9:51 ` Wei Yongjun
2009-07-30 16:49 ` Doug Graham
2009-07-30 17:05 ` Vlad Yasevich
2009-07-30 21:24 ` Vlad Yasevich
2009-07-30 23:40 ` Doug Graham
2009-07-31 0:53 ` Wei Yongjun
2009-07-31 1:17 ` Doug Graham
2009-07-31 1:43 ` Doug Graham
2009-07-31 4:21 ` Wei Yongjun
2009-07-31 7:30 ` Michael Tüxen
2009-07-31 7:34 ` Michael Tüxen
2009-07-31 12:59 ` Doug Graham
2009-07-31 13:11 ` Doug Graham
2009-07-31 13:39 ` Doug Graham
2009-07-31 14:18 ` Vlad Yasevich [this message]
2009-08-02 2:03 ` Doug Graham
2009-08-03 2:00 ` Wei Yongjun
2009-08-03 2:15 ` Wei Yongjun
2009-08-03 3:32 ` Wei Yongjun
2009-08-04 3:00 ` Doug Graham
2009-08-04 3:03 ` Wei Yongjun
2009-08-04 3:28 ` Doug Graham
2009-08-04 3:44 ` Doug Graham
2009-08-04 3:57 ` Doug Graham
2009-08-04 14:50 ` Vlad Yasevich
2009-08-04 17:05 ` Doug Graham
2009-08-04 17:14 ` Vlad Yasevich
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4A72FD23.9060303@hp.com \
--to=vladislav.yasevich@hp.com \
--cc=linux-sctp@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).