From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Daniel Borkmann <dborkman@redhat.com>,
	Thomas Graf <tgraf@suug.ch>, Neil Horman <nhorman@tuxdriver.com>,
	Vlad Yasevich <vyasevic@redhat.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 3.10 39/86] net: sctp: wake up all assocs if sndbuf policy is per socket
Date: Wed, 28 May 2014 21:37:14 -0700
Message-ID: <20140529043518.759787741@linuxfoundation.org>
In-Reply-To: <20140529043513.451722422@linuxfoundation.org>

3.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <dborkman@redhat.com>

[ Upstream commit 52c35befb69b005c3fc5afdaae3a5717ad013411 ]

SCTP charges chunks for wmem accounting via skb->truesize in
sctp_set_owner_w(), and sctp_wfree() respectively as the
reverse operation. If a sender runs out of wmem, it needs to
wait via sctp_wait_for_sndbuf(), and gets woken up by a call
to __sctp_write_space() mostly via sctp_wfree().

__sctp_write_space() is called per association. Although we
assign sctp_write_space() to sk->sk_write_space(), which
operates per socket, that callback is only used when send
space is increased via socket option (SO_SNDBUF): since
SOCK_USE_WRITE_QUEUE is set, it is not invoked in sock_wfree().

Commit 4c3a5bdae293 ("sctp: Don't charge for data in sndbuf
again when transmitting packet") fixed an issue where in case
sctp_packet_transmit() manages to queue up more than sndbuf
bytes, sctp_wait_for_sndbuf() will never be woken up again
unless it is interrupted by a signal. However, a remaining
issue is that if net.sctp.sndbuf_policy=0, that is
accounting per socket, and one-to-many sockets are in use,
the reclaimed write space from sctp_wfree() is 'unfairly'
handed back on the server to the association that is the lucky
one to be woken up again via __sctp_write_space(), while
the remaining associations are never woken up again
(unless by a signal).

The effect disappears with net.sctp.sndbuf_policy=1, that
is wmem accounting per association, as it guarantees a fair
share of wmem among associations.

Therefore, if we have reclaimed memory in case of per socket
accounting, wake all associations related to a socket in a
fair manner, that is, traverse the socket association list
starting from the current neighbour of the association and
issue a __sctp_write_space() to everyone until we end up
waking ourselves. This guarantees that no association is
preferred over another and even if more associations are
taken into the one-to-many session, all receivers will get
messages from the server and are not stalled forever under
high load. This setting still leaves the advantage of per
socket accounting intact, as an association can still use
up global limits if unused by others.

Fixes: 4eb701dfc618 ("[SCTP] Fix SCTP sendbuffer accouting.")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Vlad Yasevich <vyasevic@redhat.com>
Acked-by: Vlad Yasevich <vyasevic@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/sctp/socket.c |   36 +++++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -6582,6 +6582,40 @@ static void __sctp_write_space(struct sc
 	}
 }
 
+static void sctp_wake_up_waiters(struct sock *sk,
+				 struct sctp_association *asoc)
+{
+	struct sctp_association *tmp = asoc;
+
+	/* We do accounting for the sndbuf space per association,
+	 * so we only need to wake our own association.
+	 */
+	if (asoc->ep->sndbuf_policy)
+		return __sctp_write_space(asoc);
+
+	/* Accounting for the sndbuf space is per socket, so we
+	 * need to wake up others, try to be fair and in case of
+	 * other associations, let them have a go first instead
+	 * of just doing a sctp_write_space() call.
+	 *
+	 * Note that we reach sctp_wake_up_waiters() only when
+	 * associations free up queued chunks, thus we are under
+	 * lock and the list of associations on a socket is
+	 * guaranteed not to change.
+	 */
+	for (tmp = list_next_entry(tmp, asocs); 1;
+	     tmp = list_next_entry(tmp, asocs)) {
+		/* Manually skip the head element. */
+		if (&tmp->asocs == &((sctp_sk(sk))->ep->asocs))
+			continue;
+		/* Wake up association. */
+		__sctp_write_space(tmp);
+		/* We've reached the end. */
+		if (tmp == asoc)
+			break;
+	}
+}
+
 /* Do accounting for the sndbuf space.
  * Decrement the used sndbuf space of the corresponding association by the
  * data size which was just transmitted(freed).
@@ -6609,7 +6643,7 @@ static void sctp_wfree(struct sk_buff *s
 	sk_mem_uncharge(sk, skb->truesize);
 
 	sock_wfree(skb);
-	__sctp_write_space(asoc);
+	sctp_wake_up_waiters(sk, asoc);
 
 	sctp_association_put(asoc);
 }


