* [PATCH] sctp: Do not account for sizeof(struct sk_buff) in estimated rwnd
@ 2011-12-19 14:11 Thomas Graf
2011-12-19 20:10 ` David Miller
2011-12-19 23:19 ` Thomas Graf
0 siblings, 2 replies; 7+ messages in thread
From: Thomas Graf @ 2011-12-19 14:11 UTC (permalink / raw)
To: Vladislav Yasevich; +Cc: linux-sctp, David S. Miller, netdev, Thomas Graf
When checking whether a DATA chunk fits into the estimated rwnd a
full sizeof(struct sk_buff) is added to the needed chunk size. This
quickly exhausts the available rwnd space and leads to packets being
sent which are much below the PMTU limit. This can lead to much worse
performance.
The reason for this behaviour was to avoid putting too much memory
pressure on the receiver. The concept is not completely irational
because a Linux receiver does in fact clone an skb for each DATA chunk
delivered. However, Linux also reserves half the available socket
buffer space for data structures therefore usage of it is already
accounted for.
When proposing to change this the last time it was noted that this
behaviour was introduced to solve a performance issue caused by rwnd
overusage in combination with small DATA chunks.
Trying to reproduce this I found that with the sk_buff overhead removed,
the performance would improve significantly unless socket buffer limits
are increased.
The following numbers have been gathered using a patched iperf
supporting SCTP over a live 1 Gbit ethernet network. The -l option
was used to limit DATA chunk sizes. The numbers listed are based on
the average of 3 test runs each. Default values have been used for
sk_(r|w)mem.
Chunk
Size Unpatched No Overhead
-------------------------------------
4 15.2 Kbit [!] 12.2 Mbit [!]
8 35.8 Kbit [!] 26.0 Mbit [!]
16 95.5 Kbit [!] 54.4 Mbit [!]
32 106.7 Mbit 102.3 Mbit
64 189.2 Mbit 188.3 Mbit
128 331.2 Mbit 334.8 Mbit
256 537.7 Mbit 536.0 Mbit
512 766.9 Mbit 766.6 Mbit
1024 810.1 Mbit 808.6 Mbit
Signed-off-by: Thomas Graf <tgraf@redhat.com>
---
net/sctp/output.c | 8 +-------
net/sctp/outqueue.c | 6 ++----
2 files changed, 3 insertions(+), 11 deletions(-)
diff --git a/net/sctp/output.c b/net/sctp/output.c
index 08b3cea..817174e 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -697,13 +697,7 @@ static void sctp_packet_append_data(struct sctp_packet *packet,
/* Keep track of how many bytes are in flight to the receiver. */
asoc->outqueue.outstanding_bytes += datasize;
- /* Update our view of the receiver's rwnd. Include sk_buff overhead
- * while updating peer.rwnd so that it reduces the chances of a
- * receiver running out of receive buffer space even when receive
- * window is still open. This can happen when a sender is sending
- * sending small messages.
- */
- datasize += sizeof(struct sk_buff);
+ /* Update our view of the receiver's rwnd. */
if (datasize < rwnd)
rwnd -= datasize;
else
diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
index 14c2b06..cfeb1d4 100644
--- a/net/sctp/outqueue.c
+++ b/net/sctp/outqueue.c
@@ -411,8 +411,7 @@ void sctp_retransmit_mark(struct sctp_outq *q,
chunk->transport->flight_size -=
sctp_data_size(chunk);
q->outstanding_bytes -= sctp_data_size(chunk);
- q->asoc->peer.rwnd += (sctp_data_size(chunk) +
- sizeof(struct sk_buff));
+ q->asoc->peer.rwnd += sctp_data_size(chunk);
}
continue;
}
@@ -432,8 +431,7 @@ void sctp_retransmit_mark(struct sctp_outq *q,
* (Section 7.2.4)), add the data size of those
* chunks to the rwnd.
*/
- q->asoc->peer.rwnd += (sctp_data_size(chunk) +
- sizeof(struct sk_buff));
+ q->asoc->peer.rwnd += sctp_data_size(chunk);
q->outstanding_bytes -= sctp_data_size(chunk);
if (chunk->transport)
transport->flight_size -= sctp_data_size(chunk);
--
1.7.7.4
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] sctp: Do not account for sizeof(struct sk_buff) in estimated rwnd
2011-12-19 14:11 Thomas Graf
@ 2011-12-19 20:10 ` David Miller
2011-12-19 23:19 ` Thomas Graf
1 sibling, 0 replies; 7+ messages in thread
From: David Miller @ 2011-12-19 20:10 UTC (permalink / raw)
To: tgraf; +Cc: vladislav.yasevich, linux-sctp, netdev
From: Thomas Graf <tgraf@redhat.com>
Date: Mon, 19 Dec 2011 15:11:40 +0100
> When checking whether a DATA chunk fits into the estimated rwnd a
> full sizeof(struct sk_buff) is added to the needed chunk size. This
> quickly exhausts the available rwnd space and leads to packets being
> sent which are much below the PMTU limit. This can lead to much worse
> performance.
...
Vlad, please review this.
Thanks.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] sctp: Do not account for sizeof(struct sk_buff) in estimated rwnd
2011-12-19 14:11 Thomas Graf
2011-12-19 20:10 ` David Miller
@ 2011-12-19 23:19 ` Thomas Graf
1 sibling, 0 replies; 7+ messages in thread
From: Thomas Graf @ 2011-12-19 23:19 UTC (permalink / raw)
To: Vladislav Yasevich; +Cc: linux-sctp, David S. Miller, netdev
On Mon, Dec 19, 2011 at 03:11:40PM +0100, Thomas Graf wrote:
> Trying to reproduce this I found that with the sk_buff overhead removed,
> the performance would improve significantly unless socket buffer limits
> are increased.
I believe this is likely to be misunderstood. What I meant is that by
removing the sk_buff overhead and while using default socket buffer limits
the performance increases as shown below. If socket buffers are enlarged
performance differences fade until there is no longer any difference.
Sorry for poor wording.
> The following numbers have been gathered using a patched iperf
> supporting SCTP over a live 1 Gbit ethernet network. The -l option
> was used to limit DATA chunk sizes. The numbers listed are based on
> the average of 3 test runs each. Default values have been used for
> sk_(r|w)mem.
>
> Chunk
> Size Unpatched No Overhead
> -------------------------------------
> 4 15.2 Kbit [!] 12.2 Mbit [!]
> 8 35.8 Kbit [!] 26.0 Mbit [!]
> 16 95.5 Kbit [!] 54.4 Mbit [!]
> 32 106.7 Mbit 102.3 Mbit
> 64 189.2 Mbit 188.3 Mbit
> 128 331.2 Mbit 334.8 Mbit
> 256 537.7 Mbit 536.0 Mbit
> 512 766.9 Mbit 766.6 Mbit
> 1024 810.1 Mbit 808.6 Mbit
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] sctp: Do not account for sizeof(struct sk_buff) in estimated rwnd
@ 2011-12-20 5:00 Wei Yongjun
2011-12-20 9:39 ` Thomas Graf
0 siblings, 1 reply; 7+ messages in thread
From: Wei Yongjun @ 2011-12-20 5:00 UTC (permalink / raw)
To: tgraf; +Cc: vladislav.yasevich, linux-sctp, davem, netdev
Hi Thomas,
> On Mon, Dec 19, 2011 at 03:11:40PM +0100, Thomas Graf wrote:
>> Trying to reproduce this I found that with the sk_buff overhead removed,
>> the performance would improve significantly unless socket buffer limits
>> are increased.
> I believe this is likely to be misunderstood. What I meant is that by
> removing the sk_buff overhead and while using default socket buffer limits
> the performance increases as shown below. If socket buffers are enlarged
> performance differences fade until there is no longer any difference.
>
> Sorry for poor wording.
>
>> The following numbers have been gathered using a patched iperf
>> supporting SCTP over a live 1 Gbit ethernet network. The -l option
>> was used to limit DATA chunk sizes. The numbers listed are based on
>> the average of 3 test runs each. Default values have been used for
>> sk_(r|w)mem.
>>
>> Chunk
>> Size Unpatched No Overhead
>> -------------------------------------
>> 4 15.2 Kbit [!] 12.2 Mbit [!]
>> 8 35.8 Kbit [!] 26.0 Mbit [!]
>> 16 95.5 Kbit [!] 54.4 Mbit [!]
>> 32 106.7 Mbit 102.3 Mbit
>> 64 189.2 Mbit 188.3 Mbit
>> 128 331.2 Mbit 334.8 Mbit
>> 256 537.7 Mbit 536.0 Mbit
>> 512 766.9 Mbit 766.6 Mbit
>> 1024 810.1 Mbit 808.6 Mbit
I saw you discussed this with Vlad in old mail:
http://www.spinics.net/lists/linux-sctp/msg01365.html
You said you will update patch to include a per packet overhead,
but it does not include in this patch, what's wrong with in?
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] sctp: Do not account for sizeof(struct sk_buff) in estimated rwnd
2011-12-20 5:00 [PATCH] sctp: Do not account for sizeof(struct sk_buff) in estimated rwnd Wei Yongjun
@ 2011-12-20 9:39 ` Thomas Graf
2011-12-20 18:01 ` Vlad Yasevich
0 siblings, 1 reply; 7+ messages in thread
From: Thomas Graf @ 2011-12-20 9:39 UTC (permalink / raw)
To: Wei Yongjun; +Cc: vladislav.yasevich, linux-sctp, davem, netdev
On Tue, Dec 20, 2011 at 01:00:48PM +0800, Wei Yongjun wrote:
> I saw you discussed this with Vlad in old mail:
> http://www.spinics.net/lists/linux-sctp/msg01365.html
>
> You said you will update patch to include a per packet overhead,
> but it does not include in this patch, what's wrong with in?
It's not possible because upon retransmission of a chunk we need
to readd the overhead to the rwnd. There is no longer a reference
to a packet so we can't know how much to add. This explanation is
also in the original mail thread.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] sctp: Do not account for sizeof(struct sk_buff) in estimated rwnd
2011-12-20 9:39 ` Thomas Graf
@ 2011-12-20 18:01 ` Vlad Yasevich
2011-12-20 18:59 ` David Miller
0 siblings, 1 reply; 7+ messages in thread
From: Vlad Yasevich @ 2011-12-20 18:01 UTC (permalink / raw)
To: Wei Yongjun, linux-sctp, davem, netdev
On 12/20/2011 04:39 AM, Thomas Graf wrote:
> On Tue, Dec 20, 2011 at 01:00:48PM +0800, Wei Yongjun wrote:
>> I saw you discussed this with Vlad in old mail:
>> http://www.spinics.net/lists/linux-sctp/msg01365.html
>>
>> You said you will update patch to include a per packet overhead,
>> but it does not include in this patch, what's wrong with in?
>
> It's not possible because upon retransmission of a chunk we need
> to readd the overhead to the rwnd. There is no longer a reference
> to a packet so we can't know how much to add. This explanation is
> also in the original mail thread.
>
Right. The original patches were done to work around the problem of
leftover rwnd when socket buffer is exhausted and they didn't really
address the problem sufficiently. It was still possible to reach that
condition. Some subsequent patches added support to address this issue
a different way. As a result, I think this revert is just fine.
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
-vlad
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] sctp: Do not account for sizeof(struct sk_buff) in estimated rwnd
2011-12-20 18:01 ` Vlad Yasevich
@ 2011-12-20 18:59 ` David Miller
0 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2011-12-20 18:59 UTC (permalink / raw)
To: vladislav.yasevich; +Cc: weiyj.lk, linux-sctp, netdev
From: Vlad Yasevich <vladislav.yasevich@hp.com>
Date: Tue, 20 Dec 2011 13:01:13 -0500
>
>
> On 12/20/2011 04:39 AM, Thomas Graf wrote:
>> On Tue, Dec 20, 2011 at 01:00:48PM +0800, Wei Yongjun wrote:
>>> I saw you discussed this with Vlad in old mail:
>>> http://www.spinics.net/lists/linux-sctp/msg01365.html
>>>
>>> You said you will update patch to include a per packet overhead,
>>> but it does not include in this patch, what's wrong with in?
>>
>> It's not possible because upon retransmission of a chunk we need
>> to readd the overhead to the rwnd. There is no longer a reference
>> to a packet so we can't know how much to add. This explanation is
>> also in the original mail thread.
>>
>
>
> Right. The original patches were done to work around the problem of
> leftover rwnd when socket buffer is exhausted and they didn't really
> address the problem sufficiently. It was still possible to reach that
> condition. Some subsequent patches added support to address this
> issue
> a different way. As a result, I think this revert is just fine.
>
> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Applied, thanks everyone.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2011-12-20 18:59 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-12-20 5:00 [PATCH] sctp: Do not account for sizeof(struct sk_buff) in estimated rwnd Wei Yongjun
2011-12-20 9:39 ` Thomas Graf
2011-12-20 18:01 ` Vlad Yasevich
2011-12-20 18:59 ` David Miller
-- strict thread matches above, loose matches on Subject: below --
2011-12-19 14:11 Thomas Graf
2011-12-19 20:10 ` David Miller
2011-12-19 23:19 ` Thomas Graf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).