All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vlad Yasevich <vyasevich@gmail.com>
To: Alexander Sverdlin <alexander.sverdlin@nsn.com>,
	ext Dongsheng Song <dongsheng.song@gmail.com>,
	Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nsn.com>
Cc: Daniel Borkmann <dborkman@redhat.com>,
	davem@davemloft.net, netdev@vger.kernel.org,
	"linux-sctp@vger.kernel.org" <linux-sctp@vger.kernel.org>
Subject: Re: [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver'
Date: Wed, 16 Apr 2014 18:50:37 +0000	[thread overview]
Message-ID: <534ED0FD.4040709@gmail.com> (raw)
In-Reply-To: <534E473E.20303@nsn.com>

On 04/16/2014 05:02 AM, Alexander Sverdlin wrote:
> Hi Dongsheng!
>
> On 16/04/14 10:39, ext Dongsheng Song wrote:
>> >From my testing, netperf throughput from 600 Mbit/s drop to 6 Mbit/s,
>> the penalty is 99 %.
>
> The question was, do you see this as a problem of the new rwnd algorithm?
> If yes, how exactly?

The algorithm isn't wrong, but the implementation appears to have
a bug with window update SACKs.  The problem is that
sk->sk_rmem_alloc is updated by the skb destructor when
skb is freed.  This happens after we call sctp_assoc_rwnd_update()
which tries to send the update SACK.  As a result, in default
config with per-socket accounting, the test
    if ((asoc->base.sk->sk_rcvbuf - rx_count) > 0)
uses the wrong values for rx_count and results in advertisement
of decreased rwnd instead of what is really available.

Can you try this patch without the revert applied.

Thanks
-vlad

diff --git a/net/sctp/ulpevent.c b/net/sctp/ulpevent.c
index 8d198ae..cc2d440 100644
--- a/net/sctp/ulpevent.c
+++ b/net/sctp/ulpevent.c
@@ -1011,7 +1011,6 @@ static void sctp_ulpevent_release_data(struct
sctp_ulpevent *event)
 {
 	struct sk_buff *skb, *frag;
 	unsigned int	len;
-	struct sctp_association *asoc;

 	/* Current stack structures assume that the rcv buffer is
 	 * per socket.   For UDP style sockets this is not true as
@@ -1036,11 +1035,7 @@ static void sctp_ulpevent_release_data(struct
sctp_ulpevent *event)
 	}

 done:
-	asoc = event->asoc;
-	sctp_association_hold(asoc);
 	sctp_ulpevent_release_owner(event);
-	sctp_assoc_rwnd_update(asoc, true);
-	sctp_association_put(asoc);
 }

 static void sctp_ulpevent_release_frag_data(struct sctp_ulpevent *event)
@@ -1071,12 +1066,21 @@ done:
  */
 void sctp_ulpevent_free(struct sctp_ulpevent *event)
 {
+	struct sctp_association *assoc = event->asoc;
+
 	if (sctp_ulpevent_is_notification(event))
 		sctp_ulpevent_release_owner(event);
 	else
 		sctp_ulpevent_release_data(event);

 	kfree_skb(sctp_event2skb(event));
+	/* The socket is locked and the assocaiton can't go anywhere
+	 * since we are walking the uplqueue.  No need to hold
+	 * another ref on the association.  Now that the skb has been
+	 * freed and accounted for everywhere, see if we need to send
+	 * a window update SACK.
+	 */
+	sctp_assoc_rwnd_update(asoc, true);
 }

 /* Purge the skb lists holding ulpevents. */


> The algorithm actually has no preference to any amount of data.
> It was fine-tuned before to serve as congestion control algorithm, but
this should
> be located elsewhere. Perhaps, indeed, a re-use of congestion control
modules from
> TCP would be possible...
>
>> http://www.spinics.net/lists/linux-sctp/msg03308.html
>>
>>
>> On Wed, Apr 16, 2014 at 2:57 PM, Matija Glavinic Pecotic
>> <matija.glavinic-pecotic.ext@nsn.com> wrote:
>>>
>>> Hello Vlad,
>>>
>>> On 04/14/2014 09:57 PM, ext Vlad Yasevich wrote:
>>>> The base approach is sound.  The idea is to calculate rwnd based
>>>> on receiver buffer available.  The algorithm chosen however, is
>>>> gives a much higher preference to small data and penalizes large
>>>> data transfers.  We need to figure our something else here..
>>>
>>> I don't follow you here. Could you please explain what do you see as
penalty?
>>>
>>> Thanks,
>>>
>>> Matija
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>
>


WARNING: multiple messages have this Message-ID (diff)
From: Vlad Yasevich <vyasevich@gmail.com>
To: Alexander Sverdlin <alexander.sverdlin@nsn.com>,
	ext Dongsheng Song <dongsheng.song@gmail.com>,
	Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nsn.com>
Cc: Daniel Borkmann <dborkman@redhat.com>,
	davem@davemloft.net, netdev@vger.kernel.org,
	"linux-sctp@vger.kernel.org" <linux-sctp@vger.kernel.org>
Subject: Re: [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer"
Date: Wed, 16 Apr 2014 14:50:37 -0400	[thread overview]
Message-ID: <534ED0FD.4040709@gmail.com> (raw)
In-Reply-To: <534E473E.20303@nsn.com>

On 04/16/2014 05:02 AM, Alexander Sverdlin wrote:
> Hi Dongsheng!
>
> On 16/04/14 10:39, ext Dongsheng Song wrote:
>> >From my testing, netperf throughput from 600 Mbit/s drop to 6 Mbit/s,
>> the penalty is 99 %.
>
> The question was, do you see this as a problem of the new rwnd algorithm?
> If yes, how exactly?

The algorithm isn't wrong, but the implementation appears to have
a bug with window update SACKs.  The problem is that
sk->sk_rmem_alloc is updated by the skb destructor when
skb is freed.  This happens after we call sctp_assoc_rwnd_update()
which tries to send the update SACK.  As a result, in default
config with per-socket accounting, the test
    if ((asoc->base.sk->sk_rcvbuf - rx_count) > 0)
uses the wrong values for rx_count and results in advertisement
of decreased rwnd instead of what is really available.

Can you try this patch without the revert applied.

Thanks
-vlad

diff --git a/net/sctp/ulpevent.c b/net/sctp/ulpevent.c
index 8d198ae..cc2d440 100644
--- a/net/sctp/ulpevent.c
+++ b/net/sctp/ulpevent.c
@@ -1011,7 +1011,6 @@ static void sctp_ulpevent_release_data(struct
sctp_ulpevent *event)
 {
 	struct sk_buff *skb, *frag;
 	unsigned int	len;
-	struct sctp_association *asoc;

 	/* Current stack structures assume that the rcv buffer is
 	 * per socket.   For UDP style sockets this is not true as
@@ -1036,11 +1035,7 @@ static void sctp_ulpevent_release_data(struct
sctp_ulpevent *event)
 	}

 done:
-	asoc = event->asoc;
-	sctp_association_hold(asoc);
 	sctp_ulpevent_release_owner(event);
-	sctp_assoc_rwnd_update(asoc, true);
-	sctp_association_put(asoc);
 }

 static void sctp_ulpevent_release_frag_data(struct sctp_ulpevent *event)
@@ -1071,12 +1066,21 @@ done:
  */
 void sctp_ulpevent_free(struct sctp_ulpevent *event)
 {
+	struct sctp_association *assoc = event->asoc;
+
 	if (sctp_ulpevent_is_notification(event))
 		sctp_ulpevent_release_owner(event);
 	else
 		sctp_ulpevent_release_data(event);

 	kfree_skb(sctp_event2skb(event));
+	/* The socket is locked and the assocaiton can't go anywhere
+	 * since we are walking the uplqueue.  No need to hold
+	 * another ref on the association.  Now that the skb has been
+	 * freed and accounted for everywhere, see if we need to send
+	 * a window update SACK.
+	 */
+	sctp_assoc_rwnd_update(asoc, true);
 }

 /* Purge the skb lists holding ulpevents. */


> The algorithm actually has no preference to any amount of data.
> It was fine-tuned before to serve as congestion control algorithm, but
this should
> be located elsewhere. Perhaps, indeed, a re-use of congestion control
modules from
> TCP would be possible...
>
>> http://www.spinics.net/lists/linux-sctp/msg03308.html
>>
>>
>> On Wed, Apr 16, 2014 at 2:57 PM, Matija Glavinic Pecotic
>> <matija.glavinic-pecotic.ext@nsn.com> wrote:
>>>
>>> Hello Vlad,
>>>
>>> On 04/14/2014 09:57 PM, ext Vlad Yasevich wrote:
>>>> The base approach is sound.  The idea is to calculate rwnd based
>>>> on receiver buffer available.  The algorithm chosen however, is
>>>> gives a much higher preference to small data and penalizes large
>>>> data transfers.  We need to figure our something else here..
>>>
>>> I don't follow you here. Could you please explain what do you see as
penalty?
>>>
>>> Thanks,
>>>
>>> Matija
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>
>

  parent reply	other threads:[~2014-04-16 18:50 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-14 19:45 [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's bu Daniel Borkmann
2014-04-14 19:45 ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer" Daniel Borkmann
2014-04-14 19:57 ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver' Vlad Yasevich
2014-04-14 19:57   ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer" Vlad Yasevich
2014-04-16  6:57   ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver' Matija Glavinic Pecotic
2014-04-16  6:57     ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer" Matija Glavinic Pecotic
2014-04-16  8:39     ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver' Dongsheng Song
2014-04-16  8:39       ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer" Dongsheng Song
2014-04-16  9:02       ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver' Alexander Sverdlin
2014-04-16  9:02         ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer" Alexander Sverdlin
2014-04-16 11:55         ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver' Matija Glavinic Pecotic
2014-04-16 11:55           ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer" Matija Glavinic Pecotic
2014-04-16 13:32           ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver' Vlad Yasevich
2014-04-16 13:32             ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer" Vlad Yasevich
2014-04-16 18:50         ` Vlad Yasevich [this message]
2014-04-16 18:50           ` Vlad Yasevich
2014-04-16 19:05           ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver' Daniel Borkmann
2014-04-16 19:05             ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer" Daniel Borkmann
2014-04-16 19:24             ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver' Matija Glavinic Pecotic
2014-04-16 19:24               ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer" Matija Glavinic Pecotic
2014-04-16 19:47               ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver' Vlad Yasevich
2014-04-16 19:47                 ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer" Vlad Yasevich
2014-04-21 19:12                 ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver' Matija Glavinic Pecotic
2014-04-21 19:12                   ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer" Matija Glavinic Pecotic
2014-04-14 20:48 ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver' David Miller
2014-04-14 20:48   ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer" David Miller
2014-04-15  8:46   ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver' Alexander Sverdlin
2014-04-15  8:46     ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer" Alexander Sverdlin
2014-04-15  8:57     ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver' Daniel Borkmann
2014-04-15  8:57       ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer" Daniel Borkmann
2014-04-15  6:43 ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver' Alexander Sverdlin
2014-04-15  6:43   ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer" Alexander Sverdlin
2014-04-15  7:08   ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver' Daniel Borkmann
2014-04-15  7:08     ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer" Daniel Borkmann
2014-04-15 14:27   ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver' Butler, Peter
2014-04-15 14:27     ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer" Butler, Peter
2014-04-16 18:36 ` [PATCH net] Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver' Vlad Yasevich
  -- strict thread matches above, loose matches on Subject: below --
2015-12-23  7:13 Roger Nyberg
2015-12-23 13:18 ` Marcelo Ricardo Leitner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=534ED0FD.4040709@gmail.com \
    --to=vyasevich@gmail.com \
    --cc=alexander.sverdlin@nsn.com \
    --cc=davem@davemloft.net \
    --cc=dborkman@redhat.com \
    --cc=dongsheng.song@gmail.com \
    --cc=linux-sctp@vger.kernel.org \
    --cc=matija.glavinic-pecotic.ext@nsn.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.