* Re: [PATCH net v2] net: sctp: wake up all assocs if sndbuf policy is per socket
2014-04-08 15:26 [PATCH net v2] net: sctp: wake up all assocs if sndbuf policy is per socket Daniel Borkmann
@ 2014-04-08 15:28 ` Vlad Yasevich
2014-04-08 15:37 ` Neil Horman
` (2 subsequent siblings)
3 siblings, 0 replies; 12+ messages in thread
From: Vlad Yasevich @ 2014-04-08 15:28 UTC (permalink / raw)
To: Daniel Borkmann, davem; +Cc: netdev, linux-sctp, Thomas Graf, Neil Horman
On 04/08/2014 11:26 AM, Daniel Borkmann wrote:
> SCTP charges chunks for wmem accounting via skb->truesize in
> sctp_set_owner_w(), and sctp_wfree() respectively as the
> reverse operation. If a sender runs out of wmem, it needs to
> wait via sctp_wait_for_sndbuf(), and gets woken up by a call
> to __sctp_write_space() mostly via sctp_wfree().
>
> __sctp_write_space() is being called per association. Although
> we assign sk->sk_write_space() to sctp_write_space(), which
> is then being done per socket, it is only used if send space
> is increased per socket option (SO_SNDBUF), as SOCK_USE_WRITE_QUEUE
> is set and therefore not invoked in sock_wfree().
>
> Commit 4c3a5bdae293 ("sctp: Don't charge for data in sndbuf
> again when transmitting packet") fixed an issue where in case
> sctp_packet_transmit() manages to queue up more than sndbuf
> bytes, sctp_wait_for_sndbuf() will never be woken up again
> unless it is interrupted by a signal. However, a still
> remaining issue is that if net.sctp.sndbuf_policy=0, that is
> accounting per socket, and one-to-many sockets are in use,
> the reclaimed write space from sctp_wfree() is 'unfairly'
> handed back on the server to the association that is the lucky
> one to be woken up again via __sctp_write_space(), while
> the remaining associations are never be woken up again
> (unless by a signal).
>
> The effect disappears with net.sctp.sndbuf_policy=1, that
> is wmem accounting per association, as it guarantees a fair
> share of wmem among associations.
>
> Therefore, if we have reclaimed memory in case of per socket
> accounting, wake all related associations to a socket in a
> fair manner, that is, traverse the socket association list
> starting from the current neighbour of the association and
> issue a __sctp_write_space() to everyone until we end up
> waking ourselves. This guarantees that no association is
> preferred over another and even if more associations are
> taken into the one-to-many session, all receivers will get
> messages from the server and are not stalled forever on
> high load. This setting still leaves the advantage of per
> socket accounting in touch as an association can still use
> up global limits if unused by others.
>
> Fixes: 4eb701dfc618 ("[SCTP] Fix SCTP sendbuffer accouting.")
> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
> Cc: Thomas Graf <tgraf@suug.ch>
> Cc: Neil Horman <nhorman@tuxdriver.com>
> Cc: Vlad Yasevich <vyasevic@redhat.com>
Acked-by: Vlad Yasevich <vyasevic@redhat.com>
-vlad
> ---
> [ When net-next opens up again, we need to think how
> we can ideally make a new list interface and simplify
> both open-coded list traversals. ]
>
> v1->v2:
> - Improved comment, included note about locking
>
> net/sctp/socket.c | 36 +++++++++++++++++++++++++++++++++++-
> 1 file changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index 981aaf8..96cfaba 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -6593,6 +6593,40 @@ static void __sctp_write_space(struct sctp_association *asoc)
> }
> }
>
> +static void sctp_wake_up_waiters(struct sock *sk,
> + struct sctp_association *asoc)
> +{
> + struct sctp_association *tmp = asoc;
> +
> + /* We do accounting for the sndbuf space per association,
> + * so we only need to wake our own association.
> + */
> + if (asoc->ep->sndbuf_policy)
> + return __sctp_write_space(asoc);
> +
> + /* Accounting for the sndbuf space is per socket, so we
> + * need to wake up others, try to be fair and in case of
> + * other associations, let them have a go first instead
> + * of just doing a sctp_write_space() call.
> + *
> + * Note that we reach sctp_wake_up_waiters() only when
> + * associations free up queued chunks, thus we are under
> + * lock and the list of associations on a socket is
> + * guaranteed not to change.
> + */
> + for (tmp = list_next_entry(tmp, asocs); 1;
> + tmp = list_next_entry(tmp, asocs)) {
> + /* Manually skip the head element. */
> + if (&tmp->asocs == &((sctp_sk(sk))->ep->asocs))
> + continue;
> + /* Wake up association. */
> + __sctp_write_space(tmp);
> + /* We've reached the end. */
> + if (tmp == asoc)
> + break;
> + }
> +}
> +
> /* Do accounting for the sndbuf space.
> * Decrement the used sndbuf space of the corresponding association by the
> * data size which was just transmitted(freed).
> @@ -6620,7 +6654,7 @@ static void sctp_wfree(struct sk_buff *skb)
> sk_mem_uncharge(sk, skb->truesize);
>
> sock_wfree(skb);
> - __sctp_write_space(asoc);
> + sctp_wake_up_waiters(sk, asoc);
>
> sctp_association_put(asoc);
> }
>
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH net v2] net: sctp: wake up all assocs if sndbuf policy is per socket
2014-04-08 15:26 [PATCH net v2] net: sctp: wake up all assocs if sndbuf policy is per socket Daniel Borkmann
2014-04-08 15:28 ` Vlad Yasevich
@ 2014-04-08 15:37 ` Neil Horman
2014-04-08 16:57 ` Daniel Borkmann
2014-04-08 17:08 ` David Miller
3 siblings, 0 replies; 12+ messages in thread
From: Neil Horman @ 2014-04-08 15:37 UTC (permalink / raw)
To: Daniel Borkmann; +Cc: davem, netdev, linux-sctp, Thomas Graf, Vlad Yasevich
On Tue, Apr 08, 2014 at 05:26:13PM +0200, Daniel Borkmann wrote:
> SCTP charges chunks for wmem accounting via skb->truesize in
> sctp_set_owner_w(), and sctp_wfree() respectively as the
> reverse operation. If a sender runs out of wmem, it needs to
> wait via sctp_wait_for_sndbuf(), and gets woken up by a call
> to __sctp_write_space() mostly via sctp_wfree().
>
> __sctp_write_space() is being called per association. Although
> we assign sk->sk_write_space() to sctp_write_space(), which
> is then being done per socket, it is only used if send space
> is increased per socket option (SO_SNDBUF), as SOCK_USE_WRITE_QUEUE
> is set and therefore not invoked in sock_wfree().
>
> Commit 4c3a5bdae293 ("sctp: Don't charge for data in sndbuf
> again when transmitting packet") fixed an issue where in case
> sctp_packet_transmit() manages to queue up more than sndbuf
> bytes, sctp_wait_for_sndbuf() will never be woken up again
> unless it is interrupted by a signal. However, a still
> remaining issue is that if net.sctp.sndbuf_policy=0, that is
> accounting per socket, and one-to-many sockets are in use,
> the reclaimed write space from sctp_wfree() is 'unfairly'
> handed back on the server to the association that is the lucky
> one to be woken up again via __sctp_write_space(), while
> the remaining associations are never be woken up again
> (unless by a signal).
>
> The effect disappears with net.sctp.sndbuf_policy=1, that
> is wmem accounting per association, as it guarantees a fair
> share of wmem among associations.
>
> Therefore, if we have reclaimed memory in case of per socket
> accounting, wake all related associations to a socket in a
> fair manner, that is, traverse the socket association list
> starting from the current neighbour of the association and
> issue a __sctp_write_space() to everyone until we end up
> waking ourselves. This guarantees that no association is
> preferred over another and even if more associations are
> taken into the one-to-many session, all receivers will get
> messages from the server and are not stalled forever on
> high load. This setting still leaves the advantage of per
> socket accounting in touch as an association can still use
> up global limits if unused by others.
>
> Fixes: 4eb701dfc618 ("[SCTP] Fix SCTP sendbuffer accouting.")
> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
> Cc: Thomas Graf <tgraf@suug.ch>
> Cc: Neil Horman <nhorman@tuxdriver.com>
> Cc: Vlad Yasevich <vyasevic@redhat.com>
> ---
> [ When net-next opens up again, we need to think how
> we can ideally make a new list interface and simplify
> both open-coded list traversals. ]
>
> v1->v2:
> - Improved comment, included note about locking
>
> net/sctp/socket.c | 36 +++++++++++++++++++++++++++++++++++-
> 1 file changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index 981aaf8..96cfaba 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -6593,6 +6593,40 @@ static void __sctp_write_space(struct sctp_association *asoc)
> }
> }
>
> +static void sctp_wake_up_waiters(struct sock *sk,
> + struct sctp_association *asoc)
> +{
> + struct sctp_association *tmp = asoc;
> +
> + /* We do accounting for the sndbuf space per association,
> + * so we only need to wake our own association.
> + */
> + if (asoc->ep->sndbuf_policy)
> + return __sctp_write_space(asoc);
> +
> + /* Accounting for the sndbuf space is per socket, so we
> + * need to wake up others, try to be fair and in case of
> + * other associations, let them have a go first instead
> + * of just doing a sctp_write_space() call.
> + *
> + * Note that we reach sctp_wake_up_waiters() only when
> + * associations free up queued chunks, thus we are under
> + * lock and the list of associations on a socket is
> + * guaranteed not to change.
> + */
> + for (tmp = list_next_entry(tmp, asocs); 1;
> + tmp = list_next_entry(tmp, asocs)) {
> + /* Manually skip the head element. */
> + if (&tmp->asocs == &((sctp_sk(sk))->ep->asocs))
> + continue;
> + /* Wake up association. */
> + __sctp_write_space(tmp);
> + /* We've reached the end. */
> + if (tmp == asoc)
> + break;
> + }
> +}
> +
> /* Do accounting for the sndbuf space.
> * Decrement the used sndbuf space of the corresponding association by the
> * data size which was just transmitted(freed).
> @@ -6620,7 +6654,7 @@ static void sctp_wfree(struct sk_buff *skb)
> sk_mem_uncharge(sk, skb->truesize);
>
> sock_wfree(skb);
> - __sctp_write_space(asoc);
> + sctp_wake_up_waiters(sk, asoc);
>
> sctp_association_put(asoc);
> }
> --
> 1.7.11.7
>
>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH net v2] net: sctp: wake up all assocs if sndbuf policy is per socket
2014-04-08 15:26 [PATCH net v2] net: sctp: wake up all assocs if sndbuf policy is per socket Daniel Borkmann
2014-04-08 15:28 ` Vlad Yasevich
2014-04-08 15:37 ` Neil Horman
@ 2014-04-08 16:57 ` Daniel Borkmann
2014-04-08 17:18 ` David Miller
2014-04-08 17:08 ` David Miller
3 siblings, 1 reply; 12+ messages in thread
From: Daniel Borkmann @ 2014-04-08 16:57 UTC (permalink / raw)
To: davem; +Cc: netdev, linux-sctp, Thomas Graf, Neil Horman, Vlad Yasevich
Hm, I still found an issue, please hold on with this one, thanks.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2] net: sctp: wake up all assocs if sndbuf policy is per socket
2014-04-08 16:57 ` Daniel Borkmann
@ 2014-04-08 17:18 ` David Miller
2014-04-08 17:20 ` Daniel Borkmann
0 siblings, 1 reply; 12+ messages in thread
From: David Miller @ 2014-04-08 17:18 UTC (permalink / raw)
To: dborkman; +Cc: netdev, linux-sctp, tgraf, nhorman, vyasevic
From: Daniel Borkmann <dborkman@redhat.com>
Date: Tue, 08 Apr 2014 18:57:06 +0200
> Hm, I still found an issue, please hold on with this one, thanks.
Ugh too late I pushed it out, you'll need to send a relative fix.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2] net: sctp: wake up all assocs if sndbuf policy is per socket
2014-04-08 17:18 ` David Miller
@ 2014-04-08 17:20 ` Daniel Borkmann
2014-04-08 18:19 ` David Miller
0 siblings, 1 reply; 12+ messages in thread
From: Daniel Borkmann @ 2014-04-08 17:20 UTC (permalink / raw)
To: David Miller; +Cc: netdev, linux-sctp, tgraf, nhorman, vyasevic
On 04/08/2014 07:18 PM, David Miller wrote:
> From: Daniel Borkmann <dborkman@redhat.com>
> Date: Tue, 08 Apr 2014 18:57:06 +0200
>
>> Hm, I still found an issue, please hold on with this one, thanks.
>
> Ugh too late I pushed it out, you'll need to send a relative fix.
No problem, will do, thanks. I suggest to take it out from
the stable queue for now. Sorry for any inconvenience.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2] net: sctp: wake up all assocs if sndbuf policy is per socket
2014-04-08 17:20 ` Daniel Borkmann
@ 2014-04-08 18:19 ` David Miller
2014-04-08 18:46 ` David Miller
0 siblings, 1 reply; 12+ messages in thread
From: David Miller @ 2014-04-08 18:19 UTC (permalink / raw)
To: dborkman; +Cc: netdev, linux-sctp, tgraf, nhorman, vyasevic
From: Daniel Borkmann <dborkman@redhat.com>
Date: Tue, 08 Apr 2014 19:20:16 +0200
> On 04/08/2014 07:18 PM, David Miller wrote:
>> From: Daniel Borkmann <dborkman@redhat.com>
>> Date: Tue, 08 Apr 2014 18:57:06 +0200
>>
>>> Hm, I still found an issue, please hold on with this one, thanks.
>>
>> Ugh too late I pushed it out, you'll need to send a relative fix.
>
> No problem, will do, thanks. I suggest to take it out from
> the stable queue for now. Sorry for any inconvenience.
I'm going to keep it in there and add the fixup when you submit it,
I'll make a mental note not to apply it until the fixup is in too.
I always re-read all of the discussion in the patchwork entry for
a patch when I backport it to -stable, so it's very unlikely that
I will accidently apply it without the fixup.
Thanks.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2] net: sctp: wake up all assocs if sndbuf policy is per socket
2014-04-08 18:19 ` David Miller
@ 2014-04-08 18:46 ` David Miller
2014-04-08 19:04 ` Daniel Borkmann
0 siblings, 1 reply; 12+ messages in thread
From: David Miller @ 2014-04-08 18:46 UTC (permalink / raw)
To: dborkman; +Cc: netdev, linux-sctp, tgraf, nhorman, vyasevic
Daniel and Vlad, I'm about to send Linus a pull request.
I know that you still need to fixup this SCTP change and it'll be
in there, but I really need to get the changes in my tree staged
so that I can do a set of -stable submissions.
So please don't freak out, I know that this change still needs work
and shouldn't go to -stable just yet :-)
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2] net: sctp: wake up all assocs if sndbuf policy is per socket
2014-04-08 18:46 ` David Miller
@ 2014-04-08 19:04 ` Daniel Borkmann
2014-04-08 19:37 ` Vlad Yasevich
0 siblings, 1 reply; 12+ messages in thread
From: Daniel Borkmann @ 2014-04-08 19:04 UTC (permalink / raw)
To: David Miller; +Cc: netdev, linux-sctp, tgraf, nhorman, vyasevic
On 04/08/2014 08:46 PM, David Miller wrote:
>
> Daniel and Vlad, I'm about to send Linus a pull request.
>
> I know that you still need to fixup this SCTP change and it'll be
> in there, but I really need to get the changes in my tree staged
> so that I can do a set of -stable submissions.
>
> So please don't freak out, I know that this change still needs work
> and shouldn't go to -stable just yet :-)
Noted, thanks. I think the issue is that in sctp_association_free()
we do a list_del(&asoc->asocs) and then flush sctp_outq_free() which
will then access on sctp_wfree() a poisoned entry. I think this
should be list_del_init() instead.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2] net: sctp: wake up all assocs if sndbuf policy is per socket
2014-04-08 19:04 ` Daniel Borkmann
@ 2014-04-08 19:37 ` Vlad Yasevich
2014-04-08 20:50 ` Daniel Borkmann
0 siblings, 1 reply; 12+ messages in thread
From: Vlad Yasevich @ 2014-04-08 19:37 UTC (permalink / raw)
To: Daniel Borkmann, David Miller; +Cc: netdev, linux-sctp, tgraf, nhorman
On 04/08/2014 03:04 PM, Daniel Borkmann wrote:
> On 04/08/2014 08:46 PM, David Miller wrote:
>>
>> Daniel and Vlad, I'm about to send Linus a pull request.
>>
>> I know that you still need to fixup this SCTP change and it'll be
>> in there, but I really need to get the changes in my tree staged
>> so that I can do a set of -stable submissions.
>>
>> So please don't freak out, I know that this change still needs work
>> and shouldn't go to -stable just yet :-)
>
> Noted, thanks. I think the issue is that in sctp_association_free()
> we do a list_del(&asoc->asocs) and then flush sctp_outq_free() which
> will then access on sctp_wfree() a poisoned entry. I think this
> should be list_del_init() instead.
Switching to list_del_init() will solve the crash, but will not address
the issue. You've just removed an association and need to notify others
of available space. You can't do that since you've been unlinked.
We either need a rcu_style unlink, or detect the delete case and loop
from the beginning.
You can do #2 easily enough by looking at asoc->base.dead to decide
where to start looping.
-vlad
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2] net: sctp: wake up all assocs if sndbuf policy is per socket
2014-04-08 19:37 ` Vlad Yasevich
@ 2014-04-08 20:50 ` Daniel Borkmann
0 siblings, 0 replies; 12+ messages in thread
From: Daniel Borkmann @ 2014-04-08 20:50 UTC (permalink / raw)
To: vyasevic; +Cc: David Miller, netdev, linux-sctp, tgraf, nhorman
On 04/08/2014 09:37 PM, Vlad Yasevich wrote:
> On 04/08/2014 03:04 PM, Daniel Borkmann wrote:
>> On 04/08/2014 08:46 PM, David Miller wrote:
>>>
>>> Daniel and Vlad, I'm about to send Linus a pull request.
>>>
>>> I know that you still need to fixup this SCTP change and it'll be
>>> in there, but I really need to get the changes in my tree staged
>>> so that I can do a set of -stable submissions.
>>>
>>> So please don't freak out, I know that this change still needs work
>>> and shouldn't go to -stable just yet :-)
>>
>> Noted, thanks. I think the issue is that in sctp_association_free()
>> we do a list_del(&asoc->asocs) and then flush sctp_outq_free() which
>> will then access on sctp_wfree() a poisoned entry. I think this
>> should be list_del_init() instead.
>
> Switching to list_del_init() will solve the crash, but will not address
> the issue. You've just removed an association and need to notify others
> of available space. You can't do that since you've been unlinked.
>
> We either need a rcu_style unlink, or detect the delete case and loop
> from the beginning.
>
> You can do #2 easily enough by looking at asoc->base.dead to decide
> where to start looping.
Agreed, I think #2 is better, so we can simply call and return with
sctp_write_space() if we see that the assoc is dead; I think SCTP is
doing too much deferring to RCU anyway. ;)
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2] net: sctp: wake up all assocs if sndbuf policy is per socket
2014-04-08 15:26 [PATCH net v2] net: sctp: wake up all assocs if sndbuf policy is per socket Daniel Borkmann
` (2 preceding siblings ...)
2014-04-08 16:57 ` Daniel Borkmann
@ 2014-04-08 17:08 ` David Miller
3 siblings, 0 replies; 12+ messages in thread
From: David Miller @ 2014-04-08 17:08 UTC (permalink / raw)
To: dborkman; +Cc: netdev, linux-sctp, tgraf, nhorman, vyasevic
From: Daniel Borkmann <dborkman@redhat.com>
Date: Tue, 8 Apr 2014 17:26:13 +0200
> SCTP charges chunks for wmem accounting via skb->truesize in
> sctp_set_owner_w(), and sctp_wfree() respectively as the
> reverse operation. If a sender runs out of wmem, it needs to
> wait via sctp_wait_for_sndbuf(), and gets woken up by a call
> to __sctp_write_space() mostly via sctp_wfree().
>
> __sctp_write_space() is being called per association. Although
> we assign sk->sk_write_space() to sctp_write_space(), which
> is then being done per socket, it is only used if send space
> is increased per socket option (SO_SNDBUF), as SOCK_USE_WRITE_QUEUE
> is set and therefore not invoked in sock_wfree().
>
> Commit 4c3a5bdae293 ("sctp: Don't charge for data in sndbuf
> again when transmitting packet") fixed an issue where in case
> sctp_packet_transmit() manages to queue up more than sndbuf
> bytes, sctp_wait_for_sndbuf() will never be woken up again
> unless it is interrupted by a signal. However, a still
> remaining issue is that if net.sctp.sndbuf_policy=0, that is
> accounting per socket, and one-to-many sockets are in use,
> the reclaimed write space from sctp_wfree() is 'unfairly'
> handed back on the server to the association that is the lucky
> one to be woken up again via __sctp_write_space(), while
> the remaining associations are never be woken up again
> (unless by a signal).
>
> The effect disappears with net.sctp.sndbuf_policy=1, that
> is wmem accounting per association, as it guarantees a fair
> share of wmem among associations.
>
> Therefore, if we have reclaimed memory in case of per socket
> accounting, wake all related associations to a socket in a
> fair manner, that is, traverse the socket association list
> starting from the current neighbour of the association and
> issue a __sctp_write_space() to everyone until we end up
> waking ourselves. This guarantees that no association is
> preferred over another and even if more associations are
> taken into the one-to-many session, all receivers will get
> messages from the server and are not stalled forever on
> high load. This setting still leaves the advantage of per
> socket accounting in touch as an association can still use
> up global limits if unused by others.
>
> Fixes: 4eb701dfc618 ("[SCTP] Fix SCTP sendbuffer accouting.")
> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Applied and queued up for -stable, thanks.
^ permalink raw reply [flat|nested] 12+ messages in thread