* [PATCH] net: add SO_MAX_DGRAM_QLEN for AF_UNIX SOCK_DGRAM sockets
@ 2015-03-03 0:32 Christian Seiler
2015-03-03 2:38 ` David Miller
0 siblings, 1 reply; 7+ messages in thread
From: Christian Seiler @ 2015-03-03 0:32 UTC (permalink / raw)
To: netdev
Cc: Christian Seiler, Dan Ballard, Eric Dumazet, David S. Miller,
Hannes Frederic Sowa, linux-api
Allow applications to set their own value for the maximum length of the
receive queue of SOCK_DGRAM AF_UNIX sockets. The default remains the
current value of /proc/sys/net/unix/max_dgram_qlen, which is kept at an
initial value of 10.
Rationale: applications may want to control how many datagrams the
kernel buffers before senders are blocked. A prominent example would be
to create a socket for syslog early at boot but only consume messages
once enough of the system has been set up. The default queue length of
11 messages (= 10 + 1) is too low for this kind of application.
Details:
The value chosen by applications may at most be the limit defined by
the new sysctl max_dgram_qlen_limit. The limit defaults to 2047
(allowing 2048 datagrams to be queued). If the value specified exceeds
the limit, the limit will be used, so applications don't need to try to
guess what the limit is before trying to set it, in analogy to
SO_SNDBUF/SO_RCVBUF.
Also, in analogy to SO_SNDBUF/SO_RCVBUF, a SO_MAX_DGRAM_QLEN_FORCE
option is provided for privileged users to force a higher limit. As
with SO_SNDBUF/SO_RCVBUF, CAP_NET_ADMIN in the init_user_ns is required
for this.
The setsockopt/getsockopt implementations explicitly check to see if
the socket is a UNIX domain socket of type SOCK_DGRAM, otherwise both
will fail. This ensures that listen() for non-datagram sockets is not
broken, since the datagram receive code internally uses the
sk->sk_max_ack_backlog field.
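The rules above can be sketched as a small userspace model. This is not
the kernel code: struct qlen_model and model_set_qlen() are hypothetical
names made up for illustration, and the model only mirrors the order of
checks described in this message (socket type, privilege, clamp, range).
One difference is deliberate: the model compares signed values, so a
negative request is rejected with -EINVAL, whereas the sock.c hunk as
posted clamps with min_t(u32, ...) on the unprivileged path, under which
a negative value wraps to a large unsigned number and ends up clamped to
the limit instead.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>

/* Userspace model only; none of these names exist in the kernel. */
struct qlen_model {
	bool is_unix_dgram;	/* AF_UNIX socket of type SOCK_DGRAM? */
	int limit;		/* stands in for max_dgram_qlen_limit */
	int max_qlen;		/* stands in for sk->sk_max_ack_backlog */
};

/* Returns 0 on success or a negative errno value. */
static int model_set_qlen(struct qlen_model *s, int val,
			  bool force, bool privileged)
{
	/* Other socket types share the field with listen(), so refuse. */
	if (!s->is_unix_dgram)
		return -ENOPROTOOPT;

	/* The _FORCE variant is reserved for privileged callers. */
	if (force && !privileged)
		return -EPERM;

	/* The plain variant silently clamps to the configured limit. */
	if (!force && val > s->limit)
		val = s->limit;

	/* The backing field is an unsigned short. */
	if (val < 0 || val > USHRT_MAX)
		return -EINVAL;

	s->max_qlen = val;
	return 0;
}
```

With a limit of 2047, an unprivileged request for 5000 succeeds and
stores 2047, while a forced request for 5000 stores 5000 only if the
caller is privileged.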
This is inspired by a patch previously submitted to LKML:
<https://lkml.org/lkml/2014/1/22/256>
Cc: Dan Ballard <dan@mindstab.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: linux-api@vger.kernel.org
Signed-off-by: Christian Seiler <christian@iwakd.de>
---
Addendum for the rationale: this has been on the wishlist of the
systemd developers for quite a while. If one wants to configure systemd
to forward log messages to syslog, the only way to do that reliably
now is to increase max_dgram_qlen globally.
I have hopefully addressed the issues brought up in the review of the
patch that inspired this change (arch-stuff, listen(), configurable
limit, unsigned short range check).
I only have access to x86. I've touched all of the corresponding arch
files where I could find other SO_ options, but this has only been
compiled and tested on x86_64.
checkpatch.pl gives me one error, i.e. that there are no spaces before
and after an equal sign in sysctl.h, but there I just followed the rest
of the code. The two warnings it shows me are lines longer than 80
characters; in sysctl_binary.c it follows the rest of the code and in
sock.c I don't think I could make that more readable with less.
OPEN ISSUE: this is somewhat of a generic problem with network
namespaces: upon creation, the parameters are initialized by their
defaults, so each network namespace has the same initial parameters.
This means that for net_ns created within user_ns != init_user_ns,
where sysctls are not exported, the max_dgram_qlen{,_limit} settings
can't be changed at all and will always take the hard-coded default
values, which may or may not be desirable, depending on your
standpoint.
Documentation: this patch updates the documentation in Documentation/,
and I have also prepared a draft update to the man pages, should this
patch be accepted, pushed to github at:
<https://github.com/chris-se/linux-kernel-man-pages/commit/116ebc1e8a14ea0eaffd749c8d19e66c597dabc7>
Note that this is my first submission to the Linux kernel and since
this modifies a lot of arch-specific files, get_maintainer.pl was
decidedly unhelpful (unless I really should write to the maintainers of
every single arch?). If I should have added somebody else to Cc or sent
this somewhere else, please tell me.
Documentation/networking/ip-sysctl.txt | 6 +++++
arch/alpha/include/uapi/asm/socket.h | 3 +++
arch/avr32/include/uapi/asm/socket.h | 3 +++
arch/cris/include/uapi/asm/socket.h | 3 +++
arch/frv/include/uapi/asm/socket.h | 3 +++
arch/ia64/include/uapi/asm/socket.h | 3 +++
arch/m32r/include/uapi/asm/socket.h | 3 +++
arch/mips/include/uapi/asm/socket.h | 3 +++
arch/mn10300/include/uapi/asm/socket.h | 3 +++
arch/parisc/include/uapi/asm/socket.h | 3 +++
arch/powerpc/include/uapi/asm/socket.h | 3 +++
arch/s390/include/uapi/asm/socket.h | 3 +++
arch/sparc/include/uapi/asm/socket.h | 3 +++
arch/xtensa/include/uapi/asm/socket.h | 3 +++
include/net/netns/unix.h | 1 +
include/uapi/asm-generic/socket.h | 3 +++
include/uapi/linux/sysctl.h | 1 +
kernel/sysctl_binary.c | 1 +
net/core/sock.c | 49 ++++++++++++++++++++++++++++++++++
net/unix/af_unix.c | 1 +
net/unix/sysctl_net_unix.c | 41 ++++++++++++++++++++++++++++
21 files changed, 142 insertions(+)
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 1b8c964..208132b 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -1794,6 +1794,12 @@ max_dgram_qlen - INTEGER
Default: 10
+max_dgram_qlen_limit - INTEGER
+ The maximum length a non-privileged user may set the dgram socket
+ receive queue to.
+
+ Default: 2047
+
UNDOCUMENTED:
diff --git a/arch/alpha/include/uapi/asm/socket.h b/arch/alpha/include/uapi/asm/socket.h
index 9a20821..20c5fa5 100644
--- a/arch/alpha/include/uapi/asm/socket.h
+++ b/arch/alpha/include/uapi/asm/socket.h
@@ -92,4 +92,7 @@
#define SO_ATTACH_BPF 50
#define SO_DETACH_BPF SO_DETACH_FILTER
+#define SO_MAX_DGRAM_QLEN 51
+#define SO_MAX_DGRAM_QLEN_FORCE 52
+
#endif /* _UAPI_ASM_SOCKET_H */
diff --git a/arch/avr32/include/uapi/asm/socket.h b/arch/avr32/include/uapi/asm/socket.h
index 2b65ed6..2a59ca2 100644
--- a/arch/avr32/include/uapi/asm/socket.h
+++ b/arch/avr32/include/uapi/asm/socket.h
@@ -85,4 +85,7 @@
#define SO_ATTACH_BPF 50
#define SO_DETACH_BPF SO_DETACH_FILTER
+#define SO_MAX_DGRAM_QLEN 51
+#define SO_MAX_DGRAM_QLEN_FORCE 52
+
#endif /* _UAPI__ASM_AVR32_SOCKET_H */
diff --git a/arch/cris/include/uapi/asm/socket.h b/arch/cris/include/uapi/asm/socket.h
index e2503d9f..b7a3b5d 100644
--- a/arch/cris/include/uapi/asm/socket.h
+++ b/arch/cris/include/uapi/asm/socket.h
@@ -87,6 +87,9 @@
#define SO_ATTACH_BPF 50
#define SO_DETACH_BPF SO_DETACH_FILTER
+#define SO_MAX_DGRAM_QLEN 51
+#define SO_MAX_DGRAM_QLEN_FORCE 52
+
#endif /* _ASM_SOCKET_H */
diff --git a/arch/frv/include/uapi/asm/socket.h b/arch/frv/include/uapi/asm/socket.h
index 4823ad1..f656613 100644
--- a/arch/frv/include/uapi/asm/socket.h
+++ b/arch/frv/include/uapi/asm/socket.h
@@ -85,5 +85,8 @@
#define SO_ATTACH_BPF 50
#define SO_DETACH_BPF SO_DETACH_FILTER
+#define SO_MAX_DGRAM_QLEN 51
+#define SO_MAX_DGRAM_QLEN_FORCE 52
+
#endif /* _ASM_SOCKET_H */
diff --git a/arch/ia64/include/uapi/asm/socket.h b/arch/ia64/include/uapi/asm/socket.h
index 59be3d8..dea5bf6 100644
--- a/arch/ia64/include/uapi/asm/socket.h
+++ b/arch/ia64/include/uapi/asm/socket.h
@@ -94,4 +94,7 @@
#define SO_ATTACH_BPF 50
#define SO_DETACH_BPF SO_DETACH_FILTER
+#define SO_MAX_DGRAM_QLEN 51
+#define SO_MAX_DGRAM_QLEN_FORCE 52
+
#endif /* _ASM_IA64_SOCKET_H */
diff --git a/arch/m32r/include/uapi/asm/socket.h b/arch/m32r/include/uapi/asm/socket.h
index 7bc4cb2..876ba82 100644
--- a/arch/m32r/include/uapi/asm/socket.h
+++ b/arch/m32r/include/uapi/asm/socket.h
@@ -85,4 +85,7 @@
#define SO_ATTACH_BPF 50
#define SO_DETACH_BPF SO_DETACH_FILTER
+#define SO_MAX_DGRAM_QLEN 51
+#define SO_MAX_DGRAM_QLEN_FORCE 52
+
#endif /* _ASM_M32R_SOCKET_H */
diff --git a/arch/mips/include/uapi/asm/socket.h b/arch/mips/include/uapi/asm/socket.h
index dec3c85..20d7ece 100644
--- a/arch/mips/include/uapi/asm/socket.h
+++ b/arch/mips/include/uapi/asm/socket.h
@@ -103,4 +103,7 @@
#define SO_ATTACH_BPF 50
#define SO_DETACH_BPF SO_DETACH_FILTER
+#define SO_MAX_DGRAM_QLEN 51
+#define SO_MAX_DGRAM_QLEN_FORCE 52
+
#endif /* _UAPI_ASM_SOCKET_H */
diff --git a/arch/mn10300/include/uapi/asm/socket.h b/arch/mn10300/include/uapi/asm/socket.h
index cab7d6d..60fffd1 100644
--- a/arch/mn10300/include/uapi/asm/socket.h
+++ b/arch/mn10300/include/uapi/asm/socket.h
@@ -85,4 +85,7 @@
#define SO_ATTACH_BPF 50
#define SO_DETACH_BPF SO_DETACH_FILTER
+#define SO_MAX_DGRAM_QLEN 51
+#define SO_MAX_DGRAM_QLEN_FORCE 52
+
#endif /* _ASM_SOCKET_H */
diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h
index a5cd40c..b155b48 100644
--- a/arch/parisc/include/uapi/asm/socket.h
+++ b/arch/parisc/include/uapi/asm/socket.h
@@ -84,4 +84,7 @@
#define SO_ATTACH_BPF 0x402B
#define SO_DETACH_BPF SO_DETACH_FILTER
+#define SO_MAX_DGRAM_QLEN 0x402C
+#define SO_MAX_DGRAM_QLEN_FORCE 0x402D
+
#endif /* _UAPI_ASM_SOCKET_H */
diff --git a/arch/powerpc/include/uapi/asm/socket.h b/arch/powerpc/include/uapi/asm/socket.h
index c046666..732ffb7 100644
--- a/arch/powerpc/include/uapi/asm/socket.h
+++ b/arch/powerpc/include/uapi/asm/socket.h
@@ -92,4 +92,7 @@
#define SO_ATTACH_BPF 50
#define SO_DETACH_BPF SO_DETACH_FILTER
+#define SO_MAX_DGRAM_QLEN 51
+#define SO_MAX_DGRAM_QLEN_FORCE 52
+
#endif /* _ASM_POWERPC_SOCKET_H */
diff --git a/arch/s390/include/uapi/asm/socket.h b/arch/s390/include/uapi/asm/socket.h
index 296942d..3dd7e1f 100644
--- a/arch/s390/include/uapi/asm/socket.h
+++ b/arch/s390/include/uapi/asm/socket.h
@@ -91,4 +91,7 @@
#define SO_ATTACH_BPF 50
#define SO_DETACH_BPF SO_DETACH_FILTER
+#define SO_MAX_DGRAM_QLEN 51
+#define SO_MAX_DGRAM_QLEN_FORCE 52
+
#endif /* _ASM_SOCKET_H */
diff --git a/arch/sparc/include/uapi/asm/socket.h b/arch/sparc/include/uapi/asm/socket.h
index e6a16c4..13a6b6d 100644
--- a/arch/sparc/include/uapi/asm/socket.h
+++ b/arch/sparc/include/uapi/asm/socket.h
@@ -81,6 +81,9 @@
#define SO_ATTACH_BPF 0x0034
#define SO_DETACH_BPF SO_DETACH_FILTER
+#define SO_MAX_DGRAM_QLEN 0x0035
+#define SO_MAX_DGRAM_QLEN_FORCE 0x0036
+
/* Security levels - as per NRL IPv6 - don't actually do anything */
#define SO_SECURITY_AUTHENTICATION 0x5001
#define SO_SECURITY_ENCRYPTION_TRANSPORT 0x5002
diff --git a/arch/xtensa/include/uapi/asm/socket.h b/arch/xtensa/include/uapi/asm/socket.h
index 4120af0..79f0d2f 100644
--- a/arch/xtensa/include/uapi/asm/socket.h
+++ b/arch/xtensa/include/uapi/asm/socket.h
@@ -96,4 +96,7 @@
#define SO_ATTACH_BPF 50
#define SO_DETACH_BPF SO_DETACH_FILTER
+#define SO_MAX_DGRAM_QLEN 51
+#define SO_MAX_DGRAM_QLEN_FORCE 52
+
#endif /* _XTENSA_SOCKET_H */
diff --git a/include/net/netns/unix.h b/include/net/netns/unix.h
index 284649d..4d0803c 100644
--- a/include/net/netns/unix.h
+++ b/include/net/netns/unix.h
@@ -7,6 +7,7 @@
struct ctl_table_header;
struct netns_unix {
int sysctl_max_dgram_qlen;
+ int sysctl_max_dgram_qlen_limit;
struct ctl_table_header *ctl;
};
diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
index 5c15c2a..fba1974 100644
--- a/include/uapi/asm-generic/socket.h
+++ b/include/uapi/asm-generic/socket.h
@@ -87,4 +87,7 @@
#define SO_ATTACH_BPF 50
#define SO_DETACH_BPF SO_DETACH_FILTER
+#define SO_MAX_DGRAM_QLEN 51
+#define SO_MAX_DGRAM_QLEN_FORCE 52
+
#endif /* __ASM_GENERIC_SOCKET_H */
diff --git a/include/uapi/linux/sysctl.h b/include/uapi/linux/sysctl.h
index 0956373..d62d83b 100644
--- a/include/uapi/linux/sysctl.h
+++ b/include/uapi/linux/sysctl.h
@@ -289,6 +289,7 @@ enum
NET_UNIX_DESTROY_DELAY=1,
NET_UNIX_DELETE_DELAY=2,
NET_UNIX_MAX_DGRAM_QLEN=3,
+ NET_UNIX_MAX_DGRAM_QLEN_LIMIT=4,
};
/* /proc/sys/net/netfilter */
diff --git a/kernel/sysctl_binary.c b/kernel/sysctl_binary.c
index 7e7746a..bce8059 100644
--- a/kernel/sysctl_binary.c
+++ b/kernel/sysctl_binary.c
@@ -203,6 +203,7 @@ static const struct bin_table bin_net_unix_table[] = {
/* NET_UNIX_DESTROY_DELAY unused */
/* NET_UNIX_DELETE_DELAY unused */
{ CTL_INT, NET_UNIX_MAX_DGRAM_QLEN, "max_dgram_qlen" },
+ { CTL_INT, NET_UNIX_MAX_DGRAM_QLEN_LIMIT, "max_dgram_qlen_limit" },
{}
};
diff --git a/net/core/sock.c b/net/core/sock.c
index 93c8b20..91af187 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -973,6 +973,42 @@ set_rcvbuf:
sk->sk_max_pacing_rate);
break;
+ case SO_MAX_DGRAM_QLEN:
+#ifdef CONFIG_UNIX
+ /* Only do this for UNIX datagram sockets,
+ * because listen() uses the same field.
+ */
+ if (sock->ops->family == PF_UNIX &&
+ sk->sk_type == SOCK_DGRAM) {
+ /* As with SO_SNDBUF/SO_RCVBUF, don't let
+ * applications play a game of finding out the
+ * largest possible value.
+ */
+ val = min_t(u32, val, sock_net(sk)->unx.sysctl_max_dgram_qlen_limit);
+set_max_dgram_qlen:
+ if (val < 0 || val > USHRT_MAX)
+ ret = -EINVAL;
+ else
+ sk->sk_max_ack_backlog = val;
+ break;
+ }
+#endif
+ ret = -ENOPROTOOPT;
+ break;
+
+ case SO_MAX_DGRAM_QLEN_FORCE:
+#ifdef CONFIG_UNIX
+ if (sock->ops->family == PF_UNIX &&
+ sk->sk_type == SOCK_DGRAM) {
+ if (capable(CAP_NET_ADMIN))
+ goto set_max_dgram_qlen;
+ ret = -EPERM;
+ break;
+ }
+#endif
+ ret = -ENOPROTOOPT;
+ break;
+
default:
ret = -ENOPROTOOPT;
break;
@@ -1233,6 +1269,19 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
v.val = sk->sk_incoming_cpu;
break;
+ case SO_MAX_DGRAM_QLEN:
+#ifdef CONFIG_UNIX
+ /* Only do this for UNIX datagram sockets,
+ * because listen() uses the same field.
+ */
+ if (sock->ops->family == PF_UNIX &&
+ sk->sk_type == SOCK_DGRAM) {
+ v.val = sk->sk_max_ack_backlog;
+ break;
+ }
+#endif
+ return -ENOPROTOOPT;
+
default:
return -ENOPROTOOPT;
}
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 526b6ed..33fec15 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2406,6 +2406,7 @@ static int __net_init unix_net_init(struct net *net)
int error = -ENOMEM;
net->unx.sysctl_max_dgram_qlen = 10;
+ net->unx.sysctl_max_dgram_qlen_limit = 2047;
if (unix_sysctl_register(net))
goto out;
diff --git a/net/unix/sysctl_net_unix.c b/net/unix/sysctl_net_unix.c
index b3d5150..b4c75a7 100644
--- a/net/unix/sysctl_net_unix.c
+++ b/net/unix/sysctl_net_unix.c
@@ -15,12 +15,23 @@
#include <net/af_unix.h>
+static int proc_unix_do_dgram_qlen(struct ctl_table *ctl, int write,
+ void __user *buffer, size_t *lenp,
+ loff_t *ppos);
+
static struct ctl_table unix_table[] = {
{
.procname = "max_dgram_qlen",
.data = &init_net.unx.sysctl_max_dgram_qlen,
.maxlen = sizeof(int),
.mode = 0644,
+ .proc_handler = proc_unix_do_dgram_qlen
+ },
+ {
+ .procname = "max_dgram_qlen_limit",
+ .data = &init_net.unx.sysctl_max_dgram_qlen_limit,
+ .maxlen = sizeof(int),
+ .mode = 0644,
.proc_handler = proc_dointvec
},
{ }
@@ -39,6 +50,7 @@ int __net_init unix_sysctl_register(struct net *net)
table[0].procname = NULL;
table[0].data = &net->unx.sysctl_max_dgram_qlen;
+ table[1].data = &net->unx.sysctl_max_dgram_qlen_limit;
net->unx.ctl = register_net_sysctl(net, "net/unix", table);
if (net->unx.ctl == NULL)
goto err_reg;
@@ -59,3 +71,32 @@ void unix_sysctl_unregister(struct net *net)
unregister_net_sysctl_table(net->unx.ctl);
kfree(table);
}
+
+static int proc_unix_do_dgram_qlen(struct ctl_table *ctl, int write,
+ void __user *buffer, size_t *lenp,
+ loff_t *ppos)
+{
+ struct net *net = current->nsproxy->net_ns;
+ struct ctl_table tbl;
+ int ret, new_value;
+
+ memset(&tbl, 0, sizeof(struct ctl_table));
+ tbl.maxlen = sizeof(unsigned int);
+
+ if (write)
+ tbl.data = &new_value;
+ else
+ tbl.data = &net->unx.sysctl_max_dgram_qlen;
+
+ ret = proc_dointvec(&tbl, write, buffer, lenp, ppos);
+ if (write && ret == 0) {
+ if (new_value > net->unx.sysctl_max_dgram_qlen_limit) {
+ pr_warn_once("New value of max_dgram_qlen higher than max_dgram_qlen_limit, also adjusting the latter.\n");
+ net->unx.sysctl_max_dgram_qlen_limit = new_value;
+ }
+
+ net->unx.sysctl_max_dgram_qlen = new_value;
+ }
+
+ return ret;
+}
--
2.1.4
* Re: [PATCH] net: add SO_MAX_DGRAM_QLEN for AF_UNIX SOCK_DGRAM sockets
2015-03-03 0:32 [PATCH] net: add SO_MAX_DGRAM_QLEN for AF_UNIX SOCK_DGRAM sockets Christian Seiler
@ 2015-03-03 2:38 ` David Miller
2015-03-03 9:04 ` Christian Seiler
0 siblings, 1 reply; 7+ messages in thread
From: David Miller @ 2015-03-03 2:38 UTC (permalink / raw)
To: christian; +Cc: netdev, dan, edumazet, hannes, linux-api
From: Christian Seiler <christian@iwakd.de>
Date: Tue, 3 Mar 2015 01:32:54 +0100
> Rationale: applications may want to control how many datagrams the
> kernel buffers before senders are blocked. A prominent example would be
> to create a socket for syslog early at boot but only consume messages
> once enough of the system has been set up. The default queue length of
> 11 messages (= 10 + 1) is too low for this kind of application.
I never like arguments that talk about forcing the kernel to do
excessive buffering for an application.
Queue this stuff in the userspace side, then you can have as many
messages backlogged as you like _without_ consuming unswappable kernel
memory.
I'm tossing this, you're going to have to do a much better job
explaining to me why userspace cannot take upon itself the burden of
queueing data until it can be sent.
* Re: [PATCH] net: add SO_MAX_DGRAM_QLEN for AF_UNIX SOCK_DGRAM sockets
2015-03-03 2:38 ` David Miller
@ 2015-03-03 9:04 ` Christian Seiler
[not found] ` <0b5908020d83bcbaa7f4938e5cb433ea-+GPkE3DhqnY@public.gmane.org>
2015-03-03 18:59 ` David Miller
0 siblings, 2 replies; 7+ messages in thread
From: Christian Seiler @ 2015-03-03 9:04 UTC (permalink / raw)
To: David Miller; +Cc: netdev, dan, edumazet, hannes, linux-api
On 2015-03-03 03:38, David Miller wrote:
>> Rationale: applications may want to control how many datagrams the
>> kernel buffers before senders are blocked. A prominent example would be
>> to create a socket for syslog early at boot but only consume messages
>> once enough of the system has been set up. The default queue length of
>> 11 messages (= 10 + 1) is too low for this kind of application.
>
> I never like arguments that talk about forcing the kernel to do
> excessive buffering for an application.
>
> Queue this stuff in the userspace side, then you can have as many
> messages backlogged as you like _without_ consuming unswappable kernel
> memory.
>
> I'm tossing this, you're going to have to do a much better job
> explaining to me why userspace cannot take upon itself the burden of
> queueing data until it can be sent.
There are certain things that can't be done in userspace:
- If SO_PASSCRED is active, a userspace process relaying the messages
can't fake the PID of the original process unless that one is still
around (sendmsg will return -ESRCH). Also, one needs CAP_SYS_ADMIN
to do this (and CAP_SETUID/CAP_SETGID to fake uid/gid as well).
- More importantly, timestamps of messages can't be faked at all. So
in the example of a socket used for syslog purposes, all the
timestamps on the messages queued would be wrong.
Also note that if I have a stream socket, by default I can buffer up to
256 kiB of data in the kernel. I did some test measurements on x86_64
and including overhead of internal bookkeeping structures, I can fit up
to 555 datagrams in there if each is at most 192 bytes long, at least
333 datagrams if each is at most 704 bytes long and at least 185
datagrams if each is at most 1728 bytes long. If I compare these
numbers to 11, that's an order of magnitude in difference.
I'm not asking to be able to use a lot of memory, I'm just asking to be
able to raise an artificial limit that doesn't apply to other types of
sockets.
Finally, increasing the queue length is not the only use case, some
applications might want to decrease it. For example, if the value is
set to zero, only a single datagram can be queued at a time (and all
else blocks or fails with -EAGAIN), which might be interesting if the
application processing the datagrams takes a long time to do so for
each one of them.
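The "limit + 1" behaviour mentioned above (10 allows 11 datagrams, 0
allows exactly one) can be illustrated with a tiny model. It assumes the
receive path treats the queue as full only when its length is strictly
greater than the limit, which is what the numbers in this thread imply;
model_recvq_full() and model_capacity() are hypothetical helper names,
not kernel functions.

```c
#include <stdbool.h>

/* Full only once the queue is strictly longer than the limit. */
static bool model_recvq_full(unsigned int queued, unsigned int max_qlen)
{
	return queued > max_qlen;
}

/* Count how many datagrams are accepted before the queue reports full. */
static unsigned int model_capacity(unsigned int max_qlen)
{
	unsigned int queued = 0;

	while (!model_recvq_full(queued, max_qlen))
		queued++;	/* one more datagram is accepted */

	return queued;
}
```

Under that assumption model_capacity(10) is 11, model_capacity(0) is 1,
and model_capacity(2047) is 2048, matching the figures quoted in the
patch description.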
Christian
* Re: [PATCH] net: add SO_MAX_DGRAM_QLEN for AF_UNIX SOCK_DGRAM sockets
[not found] ` <0b5908020d83bcbaa7f4938e5cb433ea-+GPkE3DhqnY@public.gmane.org>
@ 2015-03-03 14:30 ` Eric Dumazet
2015-03-03 15:05 ` Christian Seiler
0 siblings, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2015-03-03 14:30 UTC (permalink / raw)
To: Christian Seiler
Cc: David Miller, netdev, dan, edumazet, hannes, linux-api
On Tue, 2015-03-03 at 10:04 +0100, Christian Seiler wrote:
> Also note that if I have a stream socket, by default I can buffer up to
> 256 kiB of data in the kernel. I did some test measurements on x86_64
> and including overhead of internal bookkeeping structures, I can fit up
> to 555 datagrams in there if each is at most 192 bytes long, at least
> 333 datagrams if each is at most 704 bytes long and at least 185
> datagrams if each is at most 1728 bytes long. If I compare these
> numbers to 11, that's an order of magnitude in difference.
Problem about AF_UNIX socket is file descriptor passing.
Increasing the 10 limit allows attackers to OOM host faster I guess.
You could extend the limit if we were sure queued messages were without
passed fds.
Then, we could either increase sysctl_max_dgram_qlen or do something
like :
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 526b6edab018..a608317e7dd4 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -643,7 +643,9 @@ static struct sock *unix_create1(struct net *net, struct socket *sock)
&af_unix_sk_receive_queue_lock_key);
sk->sk_write_space = unix_write_space;
- sk->sk_max_ack_backlog = net->unx.sysctl_max_dgram_qlen;
+ sk->sk_max_ack_backlog = max_t(u32,
+ net->unx.sysctl_max_dgram_qlen,
+ sk->sk_rcvbuf / SKB_TRUESIZE(256));
sk->sk_destruct = unix_sock_destructor;
u = unix_sk(sk);
u->path.dentry = NULL;
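To get a feel for what the max_t() expression above would evaluate to,
here is a rough userspace calculation. The two structure sizes are
assumptions chosen for illustration (they vary with kernel version,
configuration and architecture), as is the receive buffer size used in
the note below; model_truesize() and model_backlog() are hypothetical
names. The model follows the shape of SKB_TRUESIZE(), which adds the
aligned sizes of struct sk_buff and struct skb_shared_info to the
payload size.

```c
/* Assumed sizes, for illustration only; real values differ per build. */
#define ASSUMED_SKBUFF_SIZE	256u	/* aligned sizeof(struct sk_buff) */
#define ASSUMED_SHINFO_SIZE	320u	/* aligned sizeof(struct skb_shared_info) */

/* Payload plus per-packet bookkeeping, in the spirit of SKB_TRUESIZE(). */
static unsigned int model_truesize(unsigned int payload)
{
	return payload + ASSUMED_SKBUFF_SIZE + ASSUMED_SHINFO_SIZE;
}

/* max(sysctl value, receive buffer divided by the cost of a 256-byte packet) */
static unsigned int model_backlog(unsigned int sysctl_qlen, unsigned int rcvbuf)
{
	unsigned int by_mem = rcvbuf / model_truesize(256);

	return by_mem > sysctl_qlen ? by_mem : sysctl_qlen;
}
```

With these assumed sizes a 256-byte datagram costs 832 bytes, so a
receive buffer of 212992 bytes would give a backlog of 256, while a
4096-byte buffer would fall back to the sysctl value of 10.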
* Re: [PATCH] net: add SO_MAX_DGRAM_QLEN for AF_UNIX SOCK_DGRAM sockets
2015-03-03 14:30 ` Eric Dumazet
@ 2015-03-03 15:05 ` Christian Seiler
[not found] ` <55ee39bff7875967acb06f25fa695f95-+GPkE3DhqnY@public.gmane.org>
0 siblings, 1 reply; 7+ messages in thread
From: Christian Seiler @ 2015-03-03 15:05 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev, dan, edumazet, hannes, linux-api
On 2015-03-03 15:30, Eric Dumazet wrote:
>> Also note that if I have a stream socket, by default I can buffer up to
>> 256 kiB of data in the kernel. I did some test measurements on x86_64
>> and including overhead of internal bookkeeping structures, I can fit up
>> to 555 datagrams in there if each is at most 192 bytes long, at least
>> 333 datagrams if each is at most 704 bytes long and at least 185
>> datagrams if each is at most 1728 bytes long. If I compare these
>> numbers to 11, that's an order of magnitude in difference.
>
> Problem about AF_UNIX socket is file descriptor passing.
>
> Increasing the 10 limit allows attackers to OOM host faster I guess.
But what's really preventing that currently? Sure, there's a limit to
the maximum number of file descriptors a process may create, but that's
usually high enough that one could create just a bunch of sockets and
queue stuff in all of them. Sure, if the limit is increased, this could
occur earlier, but my guess is that one would have to put
unrealistically tight restrictions on number of FDs / etc. in order to
really prevent OOM currently. And because modern applications tend to
use a ton of FDs, distros tend to set the FD number limits really high
by default. And my patch does allow the second limit to be changed.
> You could extend the limit if we were sure queued messages were without
> passed fds.
How about this? Add a flag that lets the user declare that SCM_RIGHTS
will never be used on this socket. To increase the queue length beyond
the initial limit, the process must then either be privileged or have
set that flag (which cannot be unset again). Decreasing the queue
length below the current value would not require the flag.
> Then, we could either increase sysctl_max_dgram_qlen or do something
> like :
>
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index 526b6edab018..a608317e7dd4 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -643,7 +643,9 @@ static struct sock *unix_create1(struct net *net, struct socket *sock)
> &af_unix_sk_receive_queue_lock_key);
>
> sk->sk_write_space = unix_write_space;
> - sk->sk_max_ack_backlog = net->unx.sysctl_max_dgram_qlen;
> + sk->sk_max_ack_backlog = max_t(u32,
> + net->unx.sysctl_max_dgram_qlen,
> + sk->sk_rcvbuf / SKB_TRUESIZE(256));
> sk->sk_destruct = unix_sock_destructor;
> u = unix_sk(sk);
> u->path.dentry = NULL;
Doesn't this assume a typical datagram size of 256 bytes? Isn't that
something that should be left up to the user? Also, the RCVBUF size
of UNIX domain sockets suddenly becomes relevant, even though it
is never actually checked when it comes to queuing the messages (only
the SNDBUF size of the sending socket is checked). This creates really
inconsistent semantics in my eyes where the receive buffer is useful
for some things, but not for others.
Christian
* Re: [PATCH] net: add SO_MAX_DGRAM_QLEN for AF_UNIX SOCK_DGRAM sockets
[not found] ` <55ee39bff7875967acb06f25fa695f95-+GPkE3DhqnY@public.gmane.org>
@ 2015-03-03 15:56 ` Eric Dumazet
0 siblings, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2015-03-03 15:56 UTC (permalink / raw)
To: Christian Seiler
Cc: David Miller, netdev, dan, edumazet, hannes, linux-api
On Tue, 2015-03-03 at 16:05 +0100, Christian Seiler wrote:
> Doesn't this assume a typical datagram size of 256 bytes? Isn't that
> something that should be left up to the user? Also, the RCVBUF size
> of UNIX domain sockets suddenly becomes relevant, even though it
> is never actually checked when it comes to queuing the messages (only
> the SNDBUF size of the sending socket is checked). This creates really
> inconsistent semantics in my eyes where the receive buffer is useful
> for some things, but not for others.
Simple : If the limit is expressed in terms of packets, not in terms of
memory usage, then the limit is in number of packets, not in bytes.
This is the reason we have pfifo and bfifo qdisc :
Admin can choose what he wants to limit on a device : packets or bytes.
If you want to allow the limitation being done in same SO_RCVBUF spirit,
you need to submit a patch for that.
(Using skb_set_owner_r() and tracking sk->sk_rmem_alloc instead of
skb_queue_len(&sk->sk_receive_queue) )
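The difference between the two kinds of limit can be sketched in
userspace as follows. struct model_queue and both admit functions are
hypothetical names; the packet counter stands in for skb_queue_len() and
the byte counter for sk->sk_rmem_alloc. This is only a model of the two
policies, not a proposal for how the kernel change would look.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the receive queue accounting. */
struct model_queue {
	size_t packets;	/* analogous to skb_queue_len() */
	size_t bytes;	/* analogous to sk->sk_rmem_alloc */
};

/* pfifo-style: admit while the packet count has not exceeded the limit. */
static bool admit_by_packets(const struct model_queue *q, size_t max_packets)
{
	return q->packets <= max_packets;
}

/* bfifo-style: admit only if the new packet still fits the byte budget. */
static bool admit_by_bytes(const struct model_queue *q, size_t truesize,
			   size_t max_bytes)
{
	return q->bytes + truesize <= max_bytes;
}
```

A queue holding ten small datagrams can be close to its packet limit
while using only a fraction of its byte budget, which is the mismatch
discussed above.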
* Re: [PATCH] net: add SO_MAX_DGRAM_QLEN for AF_UNIX SOCK_DGRAM sockets
2015-03-03 9:04 ` Christian Seiler
[not found] ` <0b5908020d83bcbaa7f4938e5cb433ea-+GPkE3DhqnY@public.gmane.org>
@ 2015-03-03 18:59 ` David Miller
1 sibling, 0 replies; 7+ messages in thread
From: David Miller @ 2015-03-03 18:59 UTC (permalink / raw)
To: christian; +Cc: netdev, dan, edumazet, hannes, linux-api
From: Christian Seiler <christian@iwakd.de>
Date: Tue, 03 Mar 2015 10:04:10 +0100
> - More importantly, timestamps of messages can't be faked at all. So
> in the example of a socket used for syslog purposes, all the
> timestamps on the messages queued would be wrong.
You can timestamp in the application and put that into your protocol
for sending the syslog messages.
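A minimal sketch of that idea, assuming a made-up wire format of
"seconds.nanoseconds message" (this is not the syslog protocol, and
format_stamped() is a hypothetical helper): the sender takes the
timestamp itself and places it in the payload, so the receiver does not
depend on when the datagram was queued or delivered.

```c
#include <stdio.h>
#include <time.h>

/*
 * Prefix msg with the sender-side timestamp.
 * Returns what snprintf() returns: the length the full string needs.
 */
static int format_stamped(char *buf, size_t len,
			  const struct timespec *ts, const char *msg)
{
	return snprintf(buf, len, "%lld.%09ld %s",
			(long long)ts->tv_sec, (long)ts->tv_nsec, msg);
}
```

A sender would typically fill the struct timespec with
clock_gettime(CLOCK_REALTIME, &ts) right before formatting and then
pass the buffer to send() or sendto().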