All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vlad Yasevich <vyasevich@gmail.com>
To: Neil Horman <nhorman@tuxdriver.com>
Cc: netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>
Subject: Re: [PATCH v3] sctp: be more restrictive in transport selection on bundled sacks
Date: Fri, 29 Jun 2012 15:15:12 -0400	[thread overview]
Message-ID: <4FEDFEC0.2060405@gmail.com> (raw)
In-Reply-To: <20120629184310.GA24604@hmsreliant.think-freely.org>

On 06/29/2012 02:43 PM, Neil Horman wrote:
> On Fri, Jun 29, 2012 at 02:29:52PM -0400, Vlad Yasevich wrote:
>> On 06/29/2012 12:34 PM, Neil Horman wrote:
>>> It was noticed recently that when we send data on a transport, its possible that
>>> we might bundle a sack that arrived on a different transport.  While this isn't
>>> a major problem, it does go against the SHOULD requirement in section 6.4 of RFC
>>> 2960:
>>>
>>>   An endpoint SHOULD transmit reply chunks (e.g., SACK, HEARTBEAT ACK,
>>>     etc.) to the same destination transport address from which it
>>>     received the DATA or control chunk to which it is replying.  This
>>>     rule should also be followed if the endpoint is bundling DATA chunks
>>>     together with the reply chunk.
>>>
>>> This patch seeks to correct that.  It restricts the bundling of sack operations
>>> to only those transports which have moved the ctsn of the association forward
>>> since the last sack.  By doing this we guarantee that we only bundle outbound
>>> saks on a transport that has received a chunk since the last sack.  This brings
>>> us into stricter compliance with the RFC.
>>>
>>> Vlad had initially suggested that we strictly allow only sack bundling on the
>>> transport that last moved the ctsn forward.  While this makes sense, I was
>>> concerned that doing so prevented us from bundling in the case where we had
>>> received chunks that moved the ctsn on multiple transports.  In those cases, the
>>> RFC allows us to select any of the transports having received chunks to bundle
>>> the sack on.  so I've modified the approach to allow for that, by adding a state
>>> variable to each transport that tracks weather it has moved the ctsn since the
>>> last sack.  This I think keeps our behavior (and performance), close enough to
>>> our current profile that I think we can do this without a sysctl knob to
>>> enable/disable it.
>>>
>>> Signed-off-by: Neil Horman<nhorman@tuxdriver.com>
>>> CC: Vlad Yaseivch<vyasevich@gmail.com>
>>> CC: David S. Miller<davem@davemloft.net>
>>> Reported-by: Michele Baldessari<michele@redhat.com>
>>> Reported-by: sorin serban<sserban@redhat.com>
>>>
>>> ---
>>> Change Notes:
>>> V2)
>>> 	* Removed unused variable as per Dave M. Request
>>> 	* Delayed rwnd adjustment until we are sure we will sack (Vlad Y.)
>>> V3)
>>> 	* Switched test to use pkt->transport rather than chunk->transport
>>> 	* Modified detection of sacka-able transport.  Instead of just setting
>>> 	  and clearning a flag, we now mark each transport and association with
>>> 	  a sack generation tag.  We increment the associations generation on
>>> 	  every sack, and assign that generation tag to every transport that
>>> 	  updates the ctsn.  This prevents us from having to iterate over a for
>>> 	  loop on every sack, which is much more scalable.
>>> ---
>>>   include/net/sctp/structs.h |    4 ++++
>>>   include/net/sctp/tsnmap.h  |    3 ++-
>>>   net/sctp/associola.c       |    1 +
>>>   net/sctp/output.c          |    9 +++++++--
>>>   net/sctp/sm_make_chunk.c   |   10 ++++++++++
>>>   net/sctp/sm_sideeffect.c   |    2 +-
>>>   net/sctp/transport.c       |    2 ++
>>>   net/sctp/tsnmap.c          |    6 +++++-
>>>   net/sctp/ulpevent.c        |    3 ++-
>>>   net/sctp/ulpqueue.c        |    2 +-
>>>   10 files changed, 35 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
>>> index e4652fe..fecdf31 100644
>>> --- a/include/net/sctp/structs.h
>>> +++ b/include/net/sctp/structs.h
>>> @@ -912,6 +912,9 @@ struct sctp_transport {
>>>   		/* Is this structure kfree()able? */
>>>   		malloced:1;
>>>
>>> +	/* Has this transport moved the ctsn since we last sacked */
>>> +	__u32 sack_generation;
>>> +
>>>   	struct flowi fl;
>>>
>>>   	/* This is the peer's IP address and port. */
>>> @@ -1584,6 +1587,7 @@ struct sctp_association {
>>>   		 */
>>>   		__u8    sack_needed;     /* Do we need to sack the peer? */
>>>   		__u32	sack_cnt;
>>> +		__u32	sack_generation;
>>>
>>>   		/* These are capabilities which our peer advertised.  */
>>>   		__u8	ecn_capable:1,	    /* Can peer do ECN? */
>>> diff --git a/include/net/sctp/tsnmap.h b/include/net/sctp/tsnmap.h
>>> index e7728bc..2c5d2b4 100644
>>> --- a/include/net/sctp/tsnmap.h
>>> +++ b/include/net/sctp/tsnmap.h
>>> @@ -117,7 +117,8 @@ void sctp_tsnmap_free(struct sctp_tsnmap *map);
>>>   int sctp_tsnmap_check(const struct sctp_tsnmap *, __u32 tsn);
>>>
>>>   /* Mark this TSN as seen.  */
>>> -int sctp_tsnmap_mark(struct sctp_tsnmap *, __u32 tsn);
>>> +int sctp_tsnmap_mark(struct sctp_tsnmap *, __u32 tsn,
>>> +		     struct sctp_transport *trans);
>>>
>>>   /* Mark this TSN and all lower as seen. */
>>>   void sctp_tsnmap_skip(struct sctp_tsnmap *map, __u32 tsn);
>>> diff --git a/net/sctp/associola.c b/net/sctp/associola.c
>>> index 5bc9ab1..6c66adb 100644
>>> --- a/net/sctp/associola.c
>>> +++ b/net/sctp/associola.c
>>> @@ -271,6 +271,7 @@ static struct sctp_association *sctp_association_init(struct sctp_association *a
>>>   	 */
>>>   	asoc->peer.sack_needed = 1;
>>>   	asoc->peer.sack_cnt = 0;
>>> +	asoc->peer.sack_generation=0;
>>>
>>>   	/* Assume that the peer will tell us if he recognizes ASCONF
>>>   	 * as part of INIT exchange.
>>> diff --git a/net/sctp/output.c b/net/sctp/output.c
>>> index f1b7d4b..0de6cd5 100644
>>> --- a/net/sctp/output.c
>>> +++ b/net/sctp/output.c
>>> @@ -240,14 +240,19 @@ static sctp_xmit_t sctp_packet_bundle_sack(struct sctp_packet *pkt,
>>>   	 */
>>>   	if (sctp_chunk_is_data(chunk)&&   !pkt->has_sack&&
>>>   	!pkt->has_cookie_echo) {
>>> -		struct sctp_association *asoc;
>>>   		struct timer_list *timer;
>>> -		asoc = pkt->transport->asoc;
>>> +		struct sctp_association *asoc = pkt->transport->asoc;
>>> +
>>>   		timer =&asoc->timers[SCTP_EVENT_TIMEOUT_SACK];
>>>
>>>   		/* If the SACK timer is running, we have a pending SACK */
>>>   		if (timer_pending(timer)) {
>>>   			struct sctp_chunk *sack;
>>> +
>>> +			if (pkt->transport->sack_generation !=
>>> +			    pkt->transport->asoc->peer.sack_generation)
>>> +				return retval;
>>> +
>>>   			asoc->a_rwnd = asoc->rwnd;
>>>   			sack = sctp_make_sack(asoc);
>>>   			if (sack) {
>>> diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
>>> index a85eeeb..ffa2a8e 100644
>>> --- a/net/sctp/sm_make_chunk.c
>>> +++ b/net/sctp/sm_make_chunk.c
>>> @@ -736,6 +736,7 @@ struct sctp_chunk *sctp_make_sack(const struct sctp_association *asoc)
>>>   	__u16 num_gabs, num_dup_tsns;
>>>   	struct sctp_tsnmap *map = (struct sctp_tsnmap *)&asoc->peer.tsn_map;
>>>   	struct sctp_gap_ack_block gabs[SCTP_MAX_GABS];
>>> +	struct sctp_transport *trans;
>>>
>>>   	memset(gabs, 0, sizeof(gabs));
>>>   	ctsn = sctp_tsnmap_get_ctsn(map);
>>> @@ -805,6 +806,15 @@ struct sctp_chunk *sctp_make_sack(const struct sctp_association *asoc)
>>>   		sctp_addto_chunk(retval, sizeof(__u32) * num_dup_tsns,
>>>   				 sctp_tsnmap_get_dups(map));
>>>
>>> +	/*
>>> +	 * Once we have a sack generated, clear the moved_tsn information
>>> +	 * from all the transports
>>> +	 */
>>> +	if (!asoc->peer.sack_generation)
>>> +		list_for_each_entry(trans,&asoc->peer.transport_addr_list,
>>> +				    transports)
>>> +			trans->sack_generation = UINT_MAX;
>>> +	((struct sctp_association *)asoc)->peer.sack_generation++;
>>
>> Two points here:
>> 1) The commend no longer matches the code
> Crud, missed that, I'll fix it.
>
>> 2) Why special case the peer.sack_generations == 0 and set the
>> transport to UNIT_MAX?
>>
> To avoid wrapping problems leading to erroneous bundling errors.  Consider a
> long lived connection with two trasports (A and B).
>
> If all traffic is sent on A for a long time (generating UINT_MAX sacks), and the
> peer chooses that moment to send data on transport B, its possible that we will
> bundle a sack with that data chunk erroneously, because the associations
> sack_generation has wrapped, and now matches with the transports, even though we
> never received data on transport B.  The special casing ensures that we never
> hit that problem.
>

But you just move this condition to the UINT_MAX value instead.  If we 
use the alternate transport at the time that sack_generation == 
UINT_MAX, we may pick the wrong transport.

You may want to consider value 0 reserved as UNUSED and make 
peer.sack_generation start at 1 and wrap to 1.

-vlad

  reply	other threads:[~2012-06-29 19:15 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-06-26 20:31 [PATCH] sctp: be mroe restrictive in transport selection on bundled sacks Neil Horman
2012-06-26 20:31 ` Neil Horman
2012-06-27  4:05 ` David Miller
2012-06-27  4:05   ` David Miller
2012-06-27 10:24   ` Neil Horman
2012-06-27 10:24     ` Neil Horman
2012-06-27 13:20     ` Vlad Yasevich
2012-06-27 13:20       ` Vlad Yasevich
2012-06-27 13:22       ` Neil Horman
2012-06-27 13:22         ` Neil Horman
2012-06-27 14:23 ` [PATCH v2] sctp: be more " Neil Horman
2012-06-27 15:10   ` Vlad Yasevich
2012-06-27 17:28     ` Neil Horman
2012-06-27 19:44       ` Vlad Yasevich
2012-06-27 19:44         ` Vlad Yasevich
2012-06-28 15:33         ` Neil Horman
2012-06-28 15:33           ` Neil Horman
2012-06-28 15:58           ` Vlad Yasevich
2012-06-28 15:58             ` Vlad Yasevich
2012-06-28 18:07             ` Neil Horman
2012-06-28 18:07               ` Neil Horman
2012-06-28 18:22               ` Vlad Yasevich
2012-06-28 18:22                 ` Vlad Yasevich
2012-06-28 18:36                 ` Neil Horman
2012-06-28 18:36                   ` Neil Horman
2012-06-28 20:14                 ` Neil Horman
2012-06-28 20:14                   ` Neil Horman
2012-06-29 16:34 ` [PATCH v3] " Neil Horman
2012-06-29 18:29   ` Vlad Yasevich
2012-06-29 18:43     ` Neil Horman
2012-06-29 19:15       ` Vlad Yasevich [this message]
2012-06-29 19:21         ` Neil Horman
2012-06-29 19:24 ` [PATCH v4] " Neil Horman
2012-06-29 19:24   ` Neil Horman
2012-06-29 20:15 ` [PATCH v5] " Neil Horman
2012-06-29 20:15   ` Neil Horman
2012-06-29 20:19   ` Vlad Yasevich
2012-06-29 20:19     ` Vlad Yasevich
2012-06-29 23:34   ` David Miller
2012-06-29 23:34     ` David Miller
2012-06-30 12:26     ` Neil Horman
2012-06-30 12:26       ` Neil Horman
2012-07-01  0:38       ` David Miller
2012-07-01  0:38         ` David Miller
2012-06-30 13:04 ` [PATCH v6] " Neil Horman
2012-06-30 13:04   ` Neil Horman
2012-07-01  0:39   ` David Miller
2012-07-01  0:39     ` David Miller
2012-07-01  3:17     ` Vlad Yasevich
2012-07-01  3:17       ` Vlad Yasevich
2012-07-01  5:44       ` David Miller
2012-07-01  5:44         ` David Miller
2012-07-01 12:47     ` Neil Horman
2012-07-01 12:47       ` Neil Horman
2012-07-01 21:43       ` David Miller
2012-07-01 21:43         ` David Miller
2012-07-01 23:44         ` Neil Horman
2012-07-01 23:44           ` Neil Horman
2012-07-02 12:25           ` Neil Horman
2012-07-02 12:25             ` Neil Horman
2012-07-03  0:10             ` David Miller
2012-07-03  0:10               ` David Miller
2012-07-03 18:45             ` Jan Ceuleers
2012-07-03 18:45               ` Jan Ceuleers
2012-07-03 23:42               ` Neil Horman
2012-07-03 23:42                 ` Neil Horman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4FEDFEC0.2060405@gmail.com \
    --to=vyasevich@gmail.com \
    --cc=davem@davemloft.net \
    --cc=netdev@vger.kernel.org \
    --cc=nhorman@tuxdriver.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.