Re: [PATCH net] net: sctp: wake up all assocs if sndbuf policy is per socket

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Vlad Yasevich <vyasevich@gmail.com>
To: Daniel Borkmann <dborkman@redhat.com>
Cc: davem@davemloft.net, netdev@vger.kernel.org,
	linux-sctp@vger.kernel.org, Thomas Graf <tgraf@suug.ch>,
	Neil Horman <nhorman@tuxdriver.com>
Subject: Re: [PATCH net] net: sctp: wake up all assocs if sndbuf policy is per socket
Date: Tue, 08 Apr 2014 11:26:41 -0400	[thread overview]
Message-ID: <53441531.8030009@gmail.com> (raw)
In-Reply-To: <53440D33.2000809@redhat.com>

On 04/08/2014 10:52 AM, Daniel Borkmann wrote:
> On 04/08/2014 04:41 PM, Vlad Yasevich wrote:
>> On 04/08/2014 09:33 AM, Daniel Borkmann wrote:
>>> SCTP charges chunks for wmem accounting via skb->truesize in
>>> sctp_set_owner_w(), and sctp_wfree() respectively as the
>>> reverse operation. If a sender runs out of wmem, it needs to
>>> wait via sctp_wait_for_sndbuf(), and gets woken up by a call
>>> to __sctp_write_space() mostly via sctp_wfree().
>>>
>>> __sctp_write_space() is being called per association. Although
>>> we assign sk->sk_write_space() to sctp_write_space(), which
>>> is then being done per socket, it is only used if send space
>>> is increased per socket option (SO_SNDBUF), as SOCK_USE_WRITE_QUEUE
>>> is set and therefore not invoked in sock_wfree().
>>>
>>> Commit 4c3a5bdae293 ("sctp: Don't charge for data in sndbuf
>>> again when transmitting packet") fixed an issue where in case
>>> sctp_packet_transmit() manages to queue up more than sndbuf
>>> bytes, sctp_wait_for_sndbuf() will never be woken up again
>>> unless it is interrupted by a signal. However, a still
>>> remaining issue is that if net.sctp.sndbuf_policy=0, that is
>>> accounting per socket, and one-to-many sockets are in use,
>>> the reclaimed write space from sctp_wfree() is 'unfairly'
>>> handed back on the server to the association that is the lucky
>>> one to be woken up again via __sctp_write_space(), while
>>> the remaining associations are never be woken up again
>>> (unless by a signal).
>>>
>>> The effect disappears with net.sctp.sndbuf_policy=1, that
>>> is wmem accounting per association, as it guarantees a fair
>>> share of wmem among associations.
>>>
>>> Therefore, if we have reclaimed memory in case of per socket
>>> accouting, wake all related associations to a socket in a
>>> fair manner, that is, traverse the socket association list
>>> starting from the current neighbour of the association and
>>> issue a __sctp_write_space() to everyone until we end up
>>> waking ourselves. This guarantees that no association is
>>> preferred over another and even if more associations are
>>> taken into the one-to-many session, all receivers will get
>>> messages from the server and are not stalled forever on
>>> high load. This setting still leaves the advantage of per
>>> socket accounting in touch as an association can still use
>>> up global limits if unused by others.
>>>
>>> Fixes: 4eb701dfc618 ("[SCTP] Fix SCTP sendbuffer accouting.")
>>> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
>>> Cc: Thomas Graf <tgraf@suug.ch>
>>> Cc: Neil Horman <nhorman@tuxdriver.com>
>>> ---
>>>   [ When net-next opens up again, we need to think how
>>>     we can ideally make a new list interface and simplify
>>>     both open-coded list traversals. ]
>>>
>>>   net/sctp/socket.c | 31 ++++++++++++++++++++++++++++++-
>>>   1 file changed, 30 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
>>> index 981aaf8..a4c8c1f 100644
>>> --- a/net/sctp/socket.c
>>> +++ b/net/sctp/socket.c
>>> @@ -6593,6 +6593,35 @@ static void __sctp_write_space(struct
>>> sctp_association *asoc)
>>>       }
>>>   }
>>>
>>> +static void sctp_wake_up_waiters(struct sock *sk,
>>> +                 struct sctp_association *asoc)
>>> +{
>>> +    struct sctp_association *tmp = asoc;
>>> +
>>> +    /* We do accounting for the sndbuf space per association,
>>> +     * so we only need to wake our own association.
>>> +     */
>>> +    if (asoc->ep->sndbuf_policy)
>>> +        return __sctp_write_space(asoc);
>>> +
>>> +    /* Accounting for the sndbuf space is per socket, so we need
>>> +     * to wake up others, try to be fair and in case of other
>>> +     * associations, let them have a go first instead of just
>>> +     * doing a sctp_write_space() call.
>>> +     */
>>
>> May be a note saying that we are here only when association frees
>> queued up chunks and thus we are under lock and the list is guaranteed
>> not to change.
> 
> Ok, will add that to the comment and respin, thanks Vlad.
> 
>>> +    for (tmp = list_next_entry(tmp, asocs); 1;
>>
>> Why not change the stop condition to tmp == asoc.  It should work
>> since it will not be head pointer.
> 
> If I see this correctly, wouldn't we then exclude to eventually
> call __sctp_write_space(tmp) on ourselves as we also need to make
> sure to wake us up?
> 

Ahh, yes.  You are right.

This is yet another list traversal with skip_head.  I am going to
resurrect that code for net-next.

-vlad

>> -vlad
>>
>>> +         tmp = list_next_entry(tmp, asocs)) {
>>> +        /* Manually skip the head element. */
>>> +        if (&tmp->asocs == &((sctp_sk(sk))->ep->asocs))
>>> +            continue;
>>> +        /* Wake up association. */
>>> +        __sctp_write_space(tmp);
>>> +        /* We've reached the end. */
>>> +        if (tmp == asoc)
>>> +            break;
>>> +    }
>>> +}
>>> +
>>>   /* Do accounting for the sndbuf space.
>>>    * Decrement the used sndbuf space of the corresponding association
>>> by the
>>>    * data size which was just transmitted(freed).
>>> @@ -6620,7 +6649,7 @@ static void sctp_wfree(struct sk_buff *skb)
>>>       sk_mem_uncharge(sk, skb->truesize);
>>>
>>>       sock_wfree(skb);
>>> -    __sctp_write_space(asoc);
>>> +    sctp_wake_up_waiters(sk, asoc);
>>>
>>>       sctp_association_put(asoc);
>>>   }
>>>
>>

next prev parent reply	other threads:[~2014-04-08 15:27 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-08 13:33 [PATCH net] net: sctp: wake up all assocs if sndbuf policy is per socket Daniel Borkmann
2014-04-08 14:41 ` Vlad Yasevich
2014-04-08 14:52   ` Daniel Borkmann
2014-04-08 15:26     ` Vlad Yasevich [this message]
2014-04-08 16:23       ` Daniel Borkmann
2014-04-08 16:50 ` David Miller
2014-04-08 17:13   ` Vlad Yasevich
2014-04-08 17:19     ` David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53441531.8030009@gmail.com \
    --to=vyasevich@gmail.com \
    --cc=davem@davemloft.net \
    --cc=dborkman@redhat.com \
    --cc=linux-sctp@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=nhorman@tuxdriver.com \
    --cc=tgraf@suug.ch \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).