All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paul Gortmaker <paul.gortmaker@windriver.com>
To: David Miller <davem@davemloft.net>
Cc: <netdev@vger.kernel.org>, Jon Maloy <jon.maloy@ericsson.com>,
	Ying Xue <ying.xue@windriver.com>,
	Erik Hugne <erik.hugne@ericsson.com>,
	Neil Horman <nhorman@tuxdriver.com>,
	Paul Gortmaker <paul.gortmaker@windriver.com>
Subject: [PATCH net-next 02/16] tipc: change socket buffer overflow control to respect sk_rcvbuf
Date: Mon, 17 Jun 2013 10:54:37 -0400	[thread overview]
Message-ID: <1371480891-5264-3-git-send-email-paul.gortmaker@windriver.com> (raw)
In-Reply-To: <1371480891-5264-1-git-send-email-paul.gortmaker@windriver.com>

From: Ying Xue <ying.xue@windriver.com>

As per feedback from the netdev community, we change the buffer
overflow protection algorithm in receiving sockets so that it
always respects the nominal upper limit set in sk_rcvbuf.

Instead of scaling up from a small sk_rcvbuf value, which leads to
violation of the configured sk_rcvbuf limit, we now calculate the
weighted per-message limit by scaling down from a much bigger value,
still in the same field, according to the importance priority of the
received message.

To allow for administrative tunability of the socket receive buffer
size, we create a tipc_rmem sysctl variable to allow the user to
configure an even bigger value via sysctl command.  It is a size of
three (min/default/max) to be consistent with things like tcp_rmem.

By default, the value initialized in tipc_rmem[1] is equal to the
receive socket size needed by a TIPC_CRITICAL_IMPORTANCE message.
This value is also set as the default value of sk_rcvbuf.

Originally-by: Jon Maloy <jon.maloy@ericsson.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
[Ying: added sysctl variation to Jon's original patch]
Signed-off-by: Ying Xue <ying.xue@windriver.com>
[PG: don't compile sysctl.c if not config'd; add Documentation]
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
 Documentation/sysctl/net.txt | 17 +++++++++++-
 net/tipc/Makefile            |  1 +
 net/tipc/core.c              | 12 +++++++--
 net/tipc/core.h              |  9 +++++++
 net/tipc/port.h              |  2 ++
 net/tipc/socket.c            | 19 ++++++-------
 net/tipc/sysctl.c            | 64 ++++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 112 insertions(+), 12 deletions(-)
 create mode 100644 net/tipc/sysctl.c

diff --git a/Documentation/sysctl/net.txt b/Documentation/sysctl/net.txt
index 85ab72d..5369879 100644
--- a/Documentation/sysctl/net.txt
+++ b/Documentation/sysctl/net.txt
@@ -26,7 +26,7 @@ Table : Subdirectories in /proc/sys/net
  ipv4      IP version 4        x25        X.25 protocol
  ipx       IPX                 token-ring IBM token ring
  bridge    Bridging            decnet     DEC net
- ipv6      IP version 6
+ ipv6      IP version 6        tipc       TIPC
 ..............................................................................
 
 1. /proc/sys/net/core - Network core options
@@ -207,3 +207,18 @@ IPX.
 The /proc/net/ipx_route  table  holds  a list of IPX routes. For each route it
 gives the  destination  network, the router node (or Directly) and the network
 address of the router (or Connected) for internal networks.
+
+6. TIPC
+-------------------------------------------------------
+
+The TIPC protocol now has a tunable for the receive memory, similar to the
+tcp_rmem - i.e. a vector of 3 INTEGERs: (min, default, max)
+
+    # cat /proc/sys/net/tipc/tipc_rmem
+    4252725 34021800        68043600
+    #
+
+The max value is set to CONN_OVERLOAD_LIMIT, and the default and min values
+are scaled (shifted) versions of that same value.  Note that the min value
+is not at this point in time used in any meaningful way, but the triplet is
+preserved in order to be consistent with things like tcp_rmem.
diff --git a/net/tipc/Makefile b/net/tipc/Makefile
index 4df8e02..02636d0 100644
--- a/net/tipc/Makefile
+++ b/net/tipc/Makefile
@@ -11,3 +11,4 @@ tipc-y	+= addr.o bcast.o bearer.o config.o \
 	   socket.o log.o eth_media.o
 
 tipc-$(CONFIG_TIPC_MEDIA_IB)	+= ib_media.o
+tipc-$(CONFIG_SYSCTL)		+= sysctl.o
diff --git a/net/tipc/core.c b/net/tipc/core.c
index 7ec2c1e..b0e42a0 100644
--- a/net/tipc/core.c
+++ b/net/tipc/core.c
@@ -39,6 +39,7 @@
 #include "name_table.h"
 #include "subscr.h"
 #include "config.h"
+#include "port.h"
 
 #include <linux/module.h>
 
@@ -50,7 +51,7 @@ u32 tipc_own_addr __read_mostly;
 int tipc_max_ports __read_mostly;
 int tipc_net_id __read_mostly;
 int tipc_remote_management __read_mostly;
-
+int sysctl_tipc_rmem[3] __read_mostly;	/* min/default/max */
 
 /**
  * tipc_buf_acquire - creates a TIPC message buffer
@@ -118,6 +119,7 @@ static void tipc_core_stop(void)
 	tipc_nametbl_stop();
 	tipc_ref_table_stop();
 	tipc_socket_stop();
+	tipc_unregister_sysctl();
 }
 
 /**
@@ -142,13 +144,14 @@ static int tipc_core_start(void)
 		res = tipc_netlink_start();
 	if (!res)
 		res = tipc_socket_init();
+	if (!res)
+		res = tipc_register_sysctl();
 	if (res)
 		tipc_core_stop();
 
 	return res;
 }
 
-
 static int __init tipc_init(void)
 {
 	int res;
@@ -160,6 +163,11 @@ static int __init tipc_init(void)
 	tipc_max_ports = CONFIG_TIPC_PORTS;
 	tipc_net_id = 4711;
 
+	sysctl_tipc_rmem[0] = CONN_OVERLOAD_LIMIT >> 4 << TIPC_LOW_IMPORTANCE;
+	sysctl_tipc_rmem[1] = CONN_OVERLOAD_LIMIT >> 4 <<
+			      TIPC_CRITICAL_IMPORTANCE;
+	sysctl_tipc_rmem[2] = CONN_OVERLOAD_LIMIT;
+
 	res = tipc_core_start();
 	if (res)
 		pr_err("Unable to start in single node mode\n");
diff --git a/net/tipc/core.h b/net/tipc/core.h
index 0207db0..fe7f2b7 100644
--- a/net/tipc/core.h
+++ b/net/tipc/core.h
@@ -80,6 +80,7 @@ extern u32 tipc_own_addr __read_mostly;
 extern int tipc_max_ports __read_mostly;
 extern int tipc_net_id __read_mostly;
 extern int tipc_remote_management __read_mostly;
+extern int sysctl_tipc_rmem[3] __read_mostly;
 
 /*
  * Other global variables
@@ -97,6 +98,14 @@ extern void tipc_netlink_stop(void);
 extern int  tipc_socket_init(void);
 extern void tipc_socket_stop(void);
 
+#ifdef CONFIG_SYSCTL
+extern int tipc_register_sysctl(void);
+extern void tipc_unregister_sysctl(void);
+#else
+#define tipc_register_sysctl() 0
+#define tipc_unregister_sysctl()
+#endif
+
 /*
  * TIPC timer and signal code
  */
diff --git a/net/tipc/port.h b/net/tipc/port.h
index fb66e2e..2485649 100644
--- a/net/tipc/port.h
+++ b/net/tipc/port.h
@@ -43,6 +43,8 @@
 #include "node_subscr.h"
 
 #define TIPC_FLOW_CONTROL_WIN 512
+#define CONN_OVERLOAD_LIMIT	((TIPC_FLOW_CONTROL_WIN * 2 + 1) * \
+				SKB_TRUESIZE(TIPC_MAX_USER_MSG_SIZE))
 
 typedef void (*tipc_msg_err_event) (void *usr_handle, u32 portref,
 		struct sk_buff **buf, unsigned char const *data,
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 515ce38..aba4255 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -43,8 +43,6 @@
 #define SS_LISTENING	-1	/* socket is listening */
 #define SS_READY	-2	/* socket is connectionless */
 
-#define CONN_OVERLOAD_LIMIT	((TIPC_FLOW_CONTROL_WIN * 2 + 1) * \
-				SKB_TRUESIZE(TIPC_MAX_USER_MSG_SIZE))
 #define CONN_TIMEOUT_DEFAULT	8000	/* default connect timeout = 8s */
 
 struct tipc_sock {
@@ -203,6 +201,7 @@ static int tipc_create(struct net *net, struct socket *sock, int protocol,
 
 	sock_init_data(sock, sk);
 	sk->sk_backlog_rcv = backlog_rcv;
+	sk->sk_rcvbuf = sysctl_tipc_rmem[1];
 	sk->sk_data_ready = tipc_data_ready;
 	sk->sk_write_space = tipc_write_space;
 	tipc_sk(sk)->p = tp_ptr;
@@ -1233,10 +1232,10 @@ static u32 filter_connect(struct tipc_sock *tsock, struct sk_buff **buf)
  * For all connectionless messages, by default new queue limits are
  * as belows:
  *
- * TIPC_LOW_IMPORTANCE       (5MB)
- * TIPC_MEDIUM_IMPORTANCE    (10MB)
- * TIPC_HIGH_IMPORTANCE      (20MB)
- * TIPC_CRITICAL_IMPORTANCE  (40MB)
+ * TIPC_LOW_IMPORTANCE       (4 MB)
+ * TIPC_MEDIUM_IMPORTANCE    (8 MB)
+ * TIPC_HIGH_IMPORTANCE      (16 MB)
+ * TIPC_CRITICAL_IMPORTANCE  (32 MB)
  *
  * Returns overload limit according to corresponding message importance
  */
@@ -1246,9 +1245,10 @@ static unsigned int rcvbuf_limit(struct sock *sk, struct sk_buff *buf)
 	unsigned int limit;
 
 	if (msg_connected(msg))
-		limit = CONN_OVERLOAD_LIMIT;
+		limit = sysctl_tipc_rmem[2];
 	else
-		limit = sk->sk_rcvbuf << (msg_importance(msg) + 5);
+		limit = sk->sk_rcvbuf >> TIPC_CRITICAL_IMPORTANCE <<
+			msg_importance(msg);
 	return limit;
 }
 
@@ -1847,7 +1847,8 @@ static const struct net_proto_family tipc_family_ops = {
 static struct proto tipc_proto = {
 	.name		= "TIPC",
 	.owner		= THIS_MODULE,
-	.obj_size	= sizeof(struct tipc_sock)
+	.obj_size	= sizeof(struct tipc_sock),
+	.sysctl_rmem	= sysctl_tipc_rmem
 };
 
 /**
diff --git a/net/tipc/sysctl.c b/net/tipc/sysctl.c
new file mode 100644
index 0000000..f3fef93
--- /dev/null
+++ b/net/tipc/sysctl.c
@@ -0,0 +1,64 @@
+/*
+ * net/tipc/sysctl.c: sysctl interface to TIPC subsystem
+ *
+ * Copyright (c) 2013, Wind River Systems
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. Neither the names of the copyright holders nor the names of its
+ *    contributors may be used to endorse or promote products derived from
+ *    this software without specific prior written permission.
+ *
+ * Alternatively, this software may be distributed under the terms of the
+ * GNU General Public License ("GPL") version 2 as published by the Free
+ * Software Foundation.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "core.h"
+
+#include <linux/sysctl.h>
+
+static struct ctl_table_header *tipc_ctl_hdr;
+
+static struct ctl_table tipc_table[] = {
+	{
+		.procname	= "tipc_rmem",
+		.data		= &sysctl_tipc_rmem,
+		.maxlen		= sizeof(sysctl_tipc_rmem),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+	},
+	{}
+};
+
+int tipc_register_sysctl(void)
+{
+	tipc_ctl_hdr = register_net_sysctl(&init_net, "net/tipc", tipc_table);
+	if (tipc_ctl_hdr == NULL)
+		return -ENOMEM;
+	return 0;
+}
+
+void tipc_unregister_sysctl(void)
+{
+	unregister_net_sysctl_table(tipc_ctl_hdr);
+}
-- 
1.8.1.2

  parent reply	other threads:[~2013-06-17 14:55 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-06-17 14:54 [PATCH v2 net-next 00/16] tipc: make use of kernel threads to simplify things Paul Gortmaker
2013-06-17 14:54 ` [PATCH net-next 01/16] tipc: update code comments to reflect new uapi header path Paul Gortmaker
2013-06-17 17:33   ` Sergei Shtylyov
2013-06-17 14:54 ` Paul Gortmaker [this message]
2013-06-17 23:55   ` [PATCH net-next 02/16] tipc: change socket buffer overflow control to respect sk_rcvbuf Neil Horman
2013-06-17 14:54 ` [PATCH net-next 03/16] tipc: allow implicit connect for stream sockets Paul Gortmaker
2013-06-17 14:54 ` [PATCH net-next 04/16] tipc: introduce new TIPC server infrastructure Paul Gortmaker
2013-06-17 14:54 ` [PATCH net-next 05/16] tipc: convert topology server to use new server facility Paul Gortmaker
2013-06-17 14:54 ` [PATCH net-next 06/16] tipc: convert configuration " Paul Gortmaker
2013-06-17 14:54 ` [PATCH net-next 07/16] tipc: delete code orphaned by new server infrastructure Paul Gortmaker
2013-06-17 14:54 ` [PATCH net-next 08/16] tipc: remove user_port instance from tipc_port structure Paul Gortmaker
2013-06-17 14:54 ` [PATCH net-next 09/16] tipc: rename tipc_createport_raw to tipc_createport Paul Gortmaker
2013-06-17 14:54 ` [PATCH net-next 10/16] tipc: convert config_lock from spinlock to mutex Paul Gortmaker
2013-06-17 14:54 ` [PATCH net-next 11/16] tipc: save sock structure pointer instead of void pointer to tipc_port Paul Gortmaker
2013-06-17 14:54 ` [PATCH net-next 12/16] tipc: cosmetic realignment of function arguments Paul Gortmaker
2013-06-17 14:54 ` [PATCH net-next 13/16] tipc: enhance priority of link protocol packet Paul Gortmaker
2013-06-17 14:54 ` [PATCH net-next 14/16] tipc: make tipc_link_send_sections_fast exit earlier Paul Gortmaker
2013-06-17 14:54 ` [PATCH net-next 15/16] tipc: fix wrong return value for link_send_sections_long routine Paul Gortmaker
2013-06-17 14:54 ` [PATCH net-next 16/16] tipc: remove dev_base_lock use from enable_bearer Paul Gortmaker
2013-06-17 22:53 ` [PATCH v2 net-next 00/16] tipc: make use of kernel threads to simplify things David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1371480891-5264-3-git-send-email-paul.gortmaker@windriver.com \
    --to=paul.gortmaker@windriver.com \
    --cc=davem@davemloft.net \
    --cc=erik.hugne@ericsson.com \
    --cc=jon.maloy@ericsson.com \
    --cc=netdev@vger.kernel.org \
    --cc=nhorman@tuxdriver.com \
    --cc=ying.xue@windriver.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.