From: Vladislav Yasevich <vladislav.yasevich@hp.com>
To: Sridhar Samudrala <sri@us.ibm.com>,
linux-sctp@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH] sctp: Reducing rwnd by sizeof(struct sk_buff) for each CHUNK is too aggressive
Date: Wed, 29 Jun 2011 14:09:18 +0000
Message-ID: <4E0B320E.4040309@hp.com>
In-Reply-To: <20110627091136.GA10085@canuck.infradead.org>
On 06/27/2011 05:11 AM, Thomas Graf wrote:
> On Fri, Jun 24, 2011 at 11:21:11AM -0400, Vladislav Yasevich wrote:
>> Instead of trying to underestimate the window size, we try to over-estimate it.
>> Almost every implementation has some kind of overhead and we don't know how
>> that overhead will impact the window. As such we try to temporarily account for this
>> overhead.
>
> I looked into this some more and it turns out that adding per-packet
> overhead is difficult because when we mark chunks for retransmissions
> we have to add its data size to the peer rwnd again but we have no
> idea how many packets were used for the initial transmission. Therefore
> if we add an overhead, we can only do so per chunk.
>
Good point.
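
To illustrate the per-chunk constraint, here is a minimal sketch in Python (not kernel code). The class name, method names, and the 200-byte overhead are assumptions made for illustration; the patch under discussion uses sizeof(struct sk_buff), whose value depends on kernel version and architecture. The point is that a chunk always knows its own data size, so a per-chunk overhead can be credited back on retransmission, whereas a per-packet overhead would require remembering how many chunks shared each packet.

```python
from dataclasses import dataclass

# Hypothetical per-chunk overhead in bytes (assumption for illustration only).
CHUNK_OVERHEAD = 200


@dataclass
class PeerWindow:
    """Sender-side estimate of the peer's receive window (illustrative)."""

    rwnd: int

    def on_transmit(self, data_len: int) -> None:
        # Debit the chunk's data size plus the fixed per-chunk overhead,
        # clamping the estimate at zero.
        self.rwnd = max(0, self.rwnd - (data_len + CHUNK_OVERHEAD))

    def on_mark_for_retransmit(self, data_len: int) -> None:
        # Credit back the same per-chunk amount that on_transmit subtracts.
        # Only the chunk's own size is needed, not the packet layout.
        self.rwnd += data_len + CHUNK_OVERHEAD


if __name__ == "__main__":
    window = PeerWindow(rwnd=10_000)
    window.on_transmit(100)
    window.on_mark_for_retransmit(100)
    print(window.rwnd)
```

As long as the estimate never hits the zero clamp, the debit and credit cancel exactly, which is what retransmission marking relies on.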
>> If we treat the window as strictly available data, then we may end up sending a lot more traffic
>> than the window can take, thus causing us to enter 0 window probe and potential retransmission
>> issues that will trigger congestion control.
>> We'd like to avoid that so we put some overhead into our computations. It may not be ideal
>> since we do this on a per-chunk basis. It could probably be done on per-packet basis instead.
>> This way, we'll essentially over-estimate but under-subscribe our current view of the peer's
>> window. So in one shot, we are not going to over-fill it and will get an updated view next
>> time the SACK arrives.
>
> What kind of configuration showed this behaviour? Did you observe that
> issue with Linux peers?
Yes, this was observed with Linux peers.
> If a peer announces an a_rwnd which it cannot
> handle then that is an implementation bug of the receiver and not of the
> sender.
>
> We won't go into zero window probe mode that easily; remember it's only
> one packet allowed in flight while rwnd is 0. We always take into
> account outstanding bytes when updating rwnd with a_rwnd so our view of
> the peer's rwnd is very accurate.
>
> In fact the RFC clearly states when and how to update the peer rwnd:
>
> B) Any time a DATA chunk is transmitted (or retransmitted) to a peer,
> the endpoint subtracts the data size of the chunk from the rwnd of
> that peer.
>
> I would like to try and reproduce the behaviour you have observed and
> fix it without cutting our ability to produce pmtu maxed packets with
> small data chunks.
>
This was easily reproducible with the sctp_darn tool using a 1-byte payload.
This was a while ago, and I don't know if anyone has tried it recently.
-vlad
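
For a sense of scale in the 1-byte payload case, the following Python sketch compares the two accounting schemes discussed in this thread. It is a simplification under stated assumptions: the function names, the 65536-byte advertised window, and the 200-byte per-chunk overhead are hypothetical, and the sender is modelled as stopping once the next chunk's cost no longer fits in its rwnd estimate, which is not necessarily what the Linux implementation does. The SACK rule follows the RFC's description of setting rwnd to a_rwnd minus the bytes still outstanding.

```python
def chunks_that_fit(a_rwnd: int, data_len: int, per_chunk_overhead: int = 0) -> int:
    """Number of DATA chunks the sender believes fit in the advertised window.

    Each transmitted chunk debits data_len + per_chunk_overhead from the
    sender's estimate; per_chunk_overhead=0 is the strict RFC accounting.
    """
    cost = data_len + per_chunk_overhead
    return a_rwnd // cost


def rwnd_after_sack(a_rwnd: int, outstanding: int) -> int:
    """Sender's rwnd estimate after a SACK: a_rwnd minus outstanding bytes."""
    return max(0, a_rwnd - outstanding)


if __name__ == "__main__":
    A_RWND = 65536   # hypothetical advertised window in bytes
    OVERHEAD = 200   # hypothetical per-chunk overhead in bytes

    strict = chunks_that_fit(A_RWND, data_len=1)
    padded = chunks_that_fit(A_RWND, data_len=1, per_chunk_overhead=OVERHEAD)
    print("strict accounting:", strict, "chunks")
    print("with per-chunk overhead:", padded, "chunks")
    print("rwnd after SACK with 1000 bytes outstanding:",
          rwnd_after_sack(A_RWND, 1000))
```

Under these assumptions, strict accounting lets the sender queue tens of thousands of 1-byte chunks against one advertised window, while the per-chunk overhead limits it to a few hundred. That is the trade-off between protecting the receiver and filling PMTU-sized packets with small chunks.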