From: Vlad Yasevich <vyasevich@gmail.com>
To: Neil Horman <nhorman@tuxdriver.com>
Cc: netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>
Subject: Re: [PATCH v3] sctp: be more restrictive in transport selection on bundled sacks
Date: Fri, 29 Jun 2012 15:15:12 -0400 [thread overview]
Message-ID: <4FEDFEC0.2060405@gmail.com> (raw)
In-Reply-To: <20120629184310.GA24604@hmsreliant.think-freely.org>
On 06/29/2012 02:43 PM, Neil Horman wrote:
> On Fri, Jun 29, 2012 at 02:29:52PM -0400, Vlad Yasevich wrote:
>> On 06/29/2012 12:34 PM, Neil Horman wrote:
>>> It was noticed recently that when we send data on a transport, its possible that
>>> we might bundle a sack that arrived on a different transport. While this isn't
>>> a major problem, it does go against the SHOULD requirement in section 6.4 of RFC
>>> 2960:
>>>
>>> An endpoint SHOULD transmit reply chunks (e.g., SACK, HEARTBEAT ACK,
>>> etc.) to the same destination transport address from which it
>>> received the DATA or control chunk to which it is replying. This
>>> rule should also be followed if the endpoint is bundling DATA chunks
>>> together with the reply chunk.
>>>
>>> This patch seeks to correct that. It restricts the bundling of sack operations
>>> to only those transports which have moved the ctsn of the association forward
>>> since the last sack. By doing this we guarantee that we only bundle outbound
>>> saks on a transport that has received a chunk since the last sack. This brings
>>> us into stricter compliance with the RFC.
>>>
>>> Vlad had initially suggested that we strictly allow only sack bundling on the
>>> transport that last moved the ctsn forward. While this makes sense, I was
>>> concerned that doing so prevented us from bundling in the case where we had
>>> received chunks that moved the ctsn on multiple transports. In those cases, the
>>> RFC allows us to select any of the transports having received chunks to bundle
>>> the sack on. so I've modified the approach to allow for that, by adding a state
>>> variable to each transport that tracks weather it has moved the ctsn since the
>>> last sack. This I think keeps our behavior (and performance), close enough to
>>> our current profile that I think we can do this without a sysctl knob to
>>> enable/disable it.
>>>
>>> Signed-off-by: Neil Horman<nhorman@tuxdriver.com>
>>> CC: Vlad Yaseivch<vyasevich@gmail.com>
>>> CC: David S. Miller<davem@davemloft.net>
>>> Reported-by: Michele Baldessari<michele@redhat.com>
>>> Reported-by: sorin serban<sserban@redhat.com>
>>>
>>> ---
>>> Change Notes:
>>> V2)
>>> * Removed unused variable as per Dave M. Request
>>> * Delayed rwnd adjustment until we are sure we will sack (Vlad Y.)
>>> V3)
>>> * Switched test to use pkt->transport rather than chunk->transport
>>> * Modified detection of sacka-able transport. Instead of just setting
>>> and clearning a flag, we now mark each transport and association with
>>> a sack generation tag. We increment the associations generation on
>>> every sack, and assign that generation tag to every transport that
>>> updates the ctsn. This prevents us from having to iterate over a for
>>> loop on every sack, which is much more scalable.
>>> ---
>>> include/net/sctp/structs.h | 4 ++++
>>> include/net/sctp/tsnmap.h | 3 ++-
>>> net/sctp/associola.c | 1 +
>>> net/sctp/output.c | 9 +++++++--
>>> net/sctp/sm_make_chunk.c | 10 ++++++++++
>>> net/sctp/sm_sideeffect.c | 2 +-
>>> net/sctp/transport.c | 2 ++
>>> net/sctp/tsnmap.c | 6 +++++-
>>> net/sctp/ulpevent.c | 3 ++-
>>> net/sctp/ulpqueue.c | 2 +-
>>> 10 files changed, 35 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
>>> index e4652fe..fecdf31 100644
>>> --- a/include/net/sctp/structs.h
>>> +++ b/include/net/sctp/structs.h
>>> @@ -912,6 +912,9 @@ struct sctp_transport {
>>> /* Is this structure kfree()able? */
>>> malloced:1;
>>>
>>> + /* Has this transport moved the ctsn since we last sacked */
>>> + __u32 sack_generation;
>>> +
>>> struct flowi fl;
>>>
>>> /* This is the peer's IP address and port. */
>>> @@ -1584,6 +1587,7 @@ struct sctp_association {
>>> */
>>> __u8 sack_needed; /* Do we need to sack the peer? */
>>> __u32 sack_cnt;
>>> + __u32 sack_generation;
>>>
>>> /* These are capabilities which our peer advertised. */
>>> __u8 ecn_capable:1, /* Can peer do ECN? */
>>> diff --git a/include/net/sctp/tsnmap.h b/include/net/sctp/tsnmap.h
>>> index e7728bc..2c5d2b4 100644
>>> --- a/include/net/sctp/tsnmap.h
>>> +++ b/include/net/sctp/tsnmap.h
>>> @@ -117,7 +117,8 @@ void sctp_tsnmap_free(struct sctp_tsnmap *map);
>>> int sctp_tsnmap_check(const struct sctp_tsnmap *, __u32 tsn);
>>>
>>> /* Mark this TSN as seen. */
>>> -int sctp_tsnmap_mark(struct sctp_tsnmap *, __u32 tsn);
>>> +int sctp_tsnmap_mark(struct sctp_tsnmap *, __u32 tsn,
>>> + struct sctp_transport *trans);
>>>
>>> /* Mark this TSN and all lower as seen. */
>>> void sctp_tsnmap_skip(struct sctp_tsnmap *map, __u32 tsn);
>>> diff --git a/net/sctp/associola.c b/net/sctp/associola.c
>>> index 5bc9ab1..6c66adb 100644
>>> --- a/net/sctp/associola.c
>>> +++ b/net/sctp/associola.c
>>> @@ -271,6 +271,7 @@ static struct sctp_association *sctp_association_init(struct sctp_association *a
>>> */
>>> asoc->peer.sack_needed = 1;
>>> asoc->peer.sack_cnt = 0;
>>> + asoc->peer.sack_generation=0;
>>>
>>> /* Assume that the peer will tell us if he recognizes ASCONF
>>> * as part of INIT exchange.
>>> diff --git a/net/sctp/output.c b/net/sctp/output.c
>>> index f1b7d4b..0de6cd5 100644
>>> --- a/net/sctp/output.c
>>> +++ b/net/sctp/output.c
>>> @@ -240,14 +240,19 @@ static sctp_xmit_t sctp_packet_bundle_sack(struct sctp_packet *pkt,
>>> */
>>> if (sctp_chunk_is_data(chunk)&& !pkt->has_sack&&
>>> !pkt->has_cookie_echo) {
>>> - struct sctp_association *asoc;
>>> struct timer_list *timer;
>>> - asoc = pkt->transport->asoc;
>>> + struct sctp_association *asoc = pkt->transport->asoc;
>>> +
>>> timer =&asoc->timers[SCTP_EVENT_TIMEOUT_SACK];
>>>
>>> /* If the SACK timer is running, we have a pending SACK */
>>> if (timer_pending(timer)) {
>>> struct sctp_chunk *sack;
>>> +
>>> + if (pkt->transport->sack_generation !=
>>> + pkt->transport->asoc->peer.sack_generation)
>>> + return retval;
>>> +
>>> asoc->a_rwnd = asoc->rwnd;
>>> sack = sctp_make_sack(asoc);
>>> if (sack) {
>>> diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
>>> index a85eeeb..ffa2a8e 100644
>>> --- a/net/sctp/sm_make_chunk.c
>>> +++ b/net/sctp/sm_make_chunk.c
>>> @@ -736,6 +736,7 @@ struct sctp_chunk *sctp_make_sack(const struct sctp_association *asoc)
>>> __u16 num_gabs, num_dup_tsns;
>>> struct sctp_tsnmap *map = (struct sctp_tsnmap *)&asoc->peer.tsn_map;
>>> struct sctp_gap_ack_block gabs[SCTP_MAX_GABS];
>>> + struct sctp_transport *trans;
>>>
>>> memset(gabs, 0, sizeof(gabs));
>>> ctsn = sctp_tsnmap_get_ctsn(map);
>>> @@ -805,6 +806,15 @@ struct sctp_chunk *sctp_make_sack(const struct sctp_association *asoc)
>>> sctp_addto_chunk(retval, sizeof(__u32) * num_dup_tsns,
>>> sctp_tsnmap_get_dups(map));
>>>
>>> + /*
>>> + * Once we have a sack generated, clear the moved_tsn information
>>> + * from all the transports
>>> + */
>>> + if (!asoc->peer.sack_generation)
>>> + list_for_each_entry(trans,&asoc->peer.transport_addr_list,
>>> + transports)
>>> + trans->sack_generation = UINT_MAX;
>>> + ((struct sctp_association *)asoc)->peer.sack_generation++;
>>
>> Two points here:
>> 1) The commend no longer matches the code
> Crud, missed that, I'll fix it.
>
>> 2) Why special case the peer.sack_generations == 0 and set the
>> transport to UNIT_MAX?
>>
> To avoid wrapping problems leading to erroneous bundling errors. Consider a
> long lived connection with two trasports (A and B).
>
> If all traffic is sent on A for a long time (generating UINT_MAX sacks), and the
> peer chooses that moment to send data on transport B, its possible that we will
> bundle a sack with that data chunk erroneously, because the associations
> sack_generation has wrapped, and now matches with the transports, even though we
> never received data on transport B. The special casing ensures that we never
> hit that problem.
>
But you just move this condition to the UINT_MAX value instead. If we
use the alternate transport at the time that sack_generation ==
UINT_MAX, we may pick the wrong transport.
You may want to consider value 0 reserved as UNUSED and make
peer.sack_generation start at 1 and wrap to 1.
-vlad
next prev parent reply other threads:[~2012-06-29 19:15 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-06-26 20:31 [PATCH] sctp: be mroe restrictive in transport selection on bundled sacks Neil Horman
2012-06-27 4:05 ` David Miller
2012-06-27 10:24 ` Neil Horman
2012-06-27 13:20 ` Vlad Yasevich
2012-06-27 13:22 ` Neil Horman
2012-06-27 14:23 ` [PATCH v2] sctp: be more " Neil Horman
2012-06-27 15:10 ` Vlad Yasevich
2012-06-27 17:28 ` Neil Horman
2012-06-27 19:44 ` Vlad Yasevich
2012-06-28 15:33 ` Neil Horman
2012-06-28 15:58 ` Vlad Yasevich
2012-06-28 18:07 ` Neil Horman
2012-06-28 18:22 ` Vlad Yasevich
2012-06-28 18:36 ` Neil Horman
2012-06-28 20:14 ` Neil Horman
2012-06-29 16:34 ` [PATCH v3] " Neil Horman
2012-06-29 18:29 ` Vlad Yasevich
2012-06-29 18:43 ` Neil Horman
2012-06-29 19:15 ` Vlad Yasevich [this message]
2012-06-29 19:21 ` Neil Horman
2012-06-29 19:24 ` [PATCH v4] " Neil Horman
2012-06-29 20:15 ` [PATCH v5] " Neil Horman
2012-06-29 20:19 ` Vlad Yasevich
2012-06-29 23:34 ` David Miller
2012-06-30 12:26 ` Neil Horman
2012-07-01 0:38 ` David Miller
2012-06-30 13:04 ` [PATCH v6] " Neil Horman
2012-07-01 0:39 ` David Miller
2012-07-01 3:17 ` Vlad Yasevich
2012-07-01 5:44 ` David Miller
2012-07-01 12:47 ` Neil Horman
2012-07-01 21:43 ` David Miller
2012-07-01 23:44 ` Neil Horman
2012-07-02 12:25 ` Neil Horman
2012-07-03 0:10 ` David Miller
2012-07-03 18:45 ` Jan Ceuleers
2012-07-03 23:42 ` Neil Horman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4FEDFEC0.2060405@gmail.com \
--to=vyasevich@gmail.com \
--cc=davem@davemloft.net \
--cc=netdev@vger.kernel.org \
--cc=nhorman@tuxdriver.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).