netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Vlad Yasevich <vyasevich@gmail.com>
To: Alexander Sverdlin <alexander.sverdlin@nsn.com>,
	ext Dongsheng Song <dongsheng.song@gmail.com>,
	Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nsn.com>
Cc: Daniel Borkmann <dborkman@redhat.com>,
	davem@davemloft.net, netdev@vger.kernel.org,
	"linux-sctp@vger.kernel.org" <linux-sctp@vger.kernel.org>
Subject: Re: [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer"
Date: Wed, 16 Apr 2014 14:50:37 -0400	[thread overview]
Message-ID: <534ED0FD.4040709@gmail.com> (raw)
In-Reply-To: <534E473E.20303@nsn.com>

On 04/16/2014 05:02 AM, Alexander Sverdlin wrote:
> Hi Dongsheng!
>
> On 16/04/14 10:39, ext Dongsheng Song wrote:
>> >From my testing, netperf throughput from 600 Mbit/s drop to 6 Mbit/s,
>> the penalty is 99 %.
>
> The question was, do you see this as a problem of the new rwnd algorithm?
> If yes, how exactly?

The algorithm isn't wrong, but the implementation appears to have
a bug with window update SACKs.  The problem is that
sk->sk_rmem_alloc is updated by the skb destructor when
skb is freed.  This happens after we call sctp_assoc_rwnd_update()
which tries to send the update SACK.  As a result, in default
config with per-socket accounting, the test
    if ((asoc->base.sk->sk_rcvbuf - rx_count) > 0)
uses the wrong values for rx_count and results in advertisement
of decreased rwnd instead of what is really available.

Can you try this patch without the revert applied.

Thanks
-vlad

diff --git a/net/sctp/ulpevent.c b/net/sctp/ulpevent.c
index 8d198ae..cc2d440 100644
--- a/net/sctp/ulpevent.c
+++ b/net/sctp/ulpevent.c
@@ -1011,7 +1011,6 @@ static void sctp_ulpevent_release_data(struct
sctp_ulpevent *event)
 {
 	struct sk_buff *skb, *frag;
 	unsigned int	len;
-	struct sctp_association *asoc;

 	/* Current stack structures assume that the rcv buffer is
 	 * per socket.   For UDP style sockets this is not true as
@@ -1036,11 +1035,7 @@ static void sctp_ulpevent_release_data(struct
sctp_ulpevent *event)
 	}

 done:
-	asoc = event->asoc;
-	sctp_association_hold(asoc);
 	sctp_ulpevent_release_owner(event);
-	sctp_assoc_rwnd_update(asoc, true);
-	sctp_association_put(asoc);
 }

 static void sctp_ulpevent_release_frag_data(struct sctp_ulpevent *event)
@@ -1071,12 +1066,21 @@ done:
  */
 void sctp_ulpevent_free(struct sctp_ulpevent *event)
 {
+	struct sctp_association *assoc = event->asoc;
+
 	if (sctp_ulpevent_is_notification(event))
 		sctp_ulpevent_release_owner(event);
 	else
 		sctp_ulpevent_release_data(event);

 	kfree_skb(sctp_event2skb(event));
+	/* The socket is locked and the assocaiton can't go anywhere
+	 * since we are walking the uplqueue.  No need to hold
+	 * another ref on the association.  Now that the skb has been
+	 * freed and accounted for everywhere, see if we need to send
+	 * a window update SACK.
+	 */
+	sctp_assoc_rwnd_update(asoc, true);
 }

 /* Purge the skb lists holding ulpevents. */


> The algorithm actually has no preference to any amount of data.
> It was fine-tuned before to serve as congestion control algorithm, but
this should
> be located elsewhere. Perhaps, indeed, a re-use of congestion control
modules from
> TCP would be possible...
>
>> http://www.spinics.net/lists/linux-sctp/msg03308.html
>>
>>
>> On Wed, Apr 16, 2014 at 2:57 PM, Matija Glavinic Pecotic
>> <matija.glavinic-pecotic.ext@nsn.com> wrote:
>>>
>>> Hello Vlad,
>>>
>>> On 04/14/2014 09:57 PM, ext Vlad Yasevich wrote:
>>>> The base approach is sound.  The idea is to calculate rwnd based
>>>> on receiver buffer available.  The algorithm chosen however, is
>>>> gives a much higher preference to small data and penalizes large
>>>> data transfers.  We need to figure our something else here..
>>>
>>> I don't follow you here. Could you please explain what do you see as
penalty?
>>>
>>> Thanks,
>>>
>>> Matija
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>
>

  parent reply	other threads:[~2014-04-16 18:50 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-14 19:45 [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer" Daniel Borkmann
2014-04-14 19:57 ` Vlad Yasevich
2014-04-16  6:57   ` Matija Glavinic Pecotic
2014-04-16  8:39     ` Dongsheng Song
2014-04-16  9:02       ` Alexander Sverdlin
2014-04-16 11:55         ` Matija Glavinic Pecotic
2014-04-16 13:32           ` Vlad Yasevich
2014-04-16 18:50         ` Vlad Yasevich [this message]
2014-04-16 19:05           ` Daniel Borkmann
2014-04-16 19:24             ` Matija Glavinic Pecotic
2014-04-16 19:47               ` Vlad Yasevich
2014-04-21 19:12                 ` Matija Glavinic Pecotic
2014-04-14 20:48 ` David Miller
2014-04-15  8:46   ` Alexander Sverdlin
2014-04-15  8:57     ` Daniel Borkmann
2014-04-15  6:43 ` Alexander Sverdlin
2014-04-15  7:08   ` Daniel Borkmann
2014-04-15 14:27   ` Butler, Peter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=534ED0FD.4040709@gmail.com \
    --to=vyasevich@gmail.com \
    --cc=alexander.sverdlin@nsn.com \
    --cc=davem@davemloft.net \
    --cc=dborkman@redhat.com \
    --cc=dongsheng.song@gmail.com \
    --cc=linux-sctp@vger.kernel.org \
    --cc=matija.glavinic-pecotic.ext@nsn.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).