From: Patrick McHardy <kaber@trash.net>
To: Ben Greear <greearb@candelatech.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
Stephen Hemminger <shemminger@vyatta.com>,
NetDev <netdev@vger.kernel.org>
Subject: Re: iproute uses too small of a receive buffer
Date: Wed, 28 Oct 2009 20:50:48 +0100 [thread overview]
Message-ID: <4AE8A098.8040207@trash.net> (raw)
In-Reply-To: <4AE89927.9090405@candelatech.com>
[-- Attachment #1: Type: text/plain, Size: 2006 bytes --]
Ben Greear wrote:
> On 10/28/2009 12:05 PM, Patrick McHardy wrote:
>> Eric Dumazet wrote:
>>> Stephen Hemminger a écrit :
>>>> Just having larger buffer isn't guarantee of success. Allocating
>>>> a huge buffer is not going to work on embedded.
>>>>
>>>
>>> Please note we do not allocate a big buffer, only allow more small skbs
>>> to be queued on socket receive queue.
>>>
>>> If memory is not available, skb allocation will eventually fail
>>> and be reported as well, embedded or not.
>>>
>>> I vote for allowing 1024*1024 bytes instead of 32768,
>>> and eventually user should be warned that it is capped by
>>> /proc/sys/net/core/rmem_max
>>
>> How about this? It will double the receive queue limit on ENOBUFS
>> up to 1024 * 1024b, then bail out with the normal error message on
>> further ENOBUFS.
>>
>> Signed-off-by: Patrick McHardy<kaber@trash.net>
>
> First: This still pretty much guarantees that messages will be lost when
> the program starts (when messages are coming in too large of chunks for
> small buffers)
> If you are debugging something tricky, having lost messages will be
> very annoying!
Yeah, on second thought the probing also doesn't make too much sense
since the memory is only used when its really needed anyways. And its
capped by rmem_max.
> Second: Why bail on ENOBUFS at all? I don't see how it helps the user
> since they will probably just have to start it again, and will miss more
> messages than keeping going would have.
Agreed.
> And, even 1MB may not be enough for some scenarios. So, probably best to
> let users over-ride the initial setting on cmd-line. If not, then use
> a large value to start with.
How about this? It uses 1MB as receive buf limit by default (without
increasing /proc/sys/net/core/rmem_max it will be limited by less
however) and allows to specify the size manually using "-rcvbuf X"
(-r is already used, so you need to specify at least -rc).
Additionally rtnl_listen() continues on ENOBUFS after printing the
error message.
[-- Attachment #2: x --]
[-- Type: text/plain, Size: 2170 bytes --]
diff --git a/include/libnetlink.h b/include/libnetlink.h
index 0e02468..61da15b 100644
--- a/include/libnetlink.h
+++ b/include/libnetlink.h
@@ -17,6 +17,8 @@ struct rtnl_handle
__u32 dump;
};
+extern int rcvbuf;
+
extern int rtnl_open(struct rtnl_handle *rth, unsigned subscriptions);
extern int rtnl_open_byproto(struct rtnl_handle *rth, unsigned subscriptions, int protocol);
extern void rtnl_close(struct rtnl_handle *rth);
diff --git a/ip/ip.c b/ip/ip.c
index 2bd54b2..b4c076a 100644
--- a/ip/ip.c
+++ b/ip/ip.c
@@ -50,7 +50,8 @@ static void usage(void)
" tunnel | maddr | mroute | monitor | xfrm }\n"
" OPTIONS := { -V[ersion] | -s[tatistics] | -d[etails] | -r[esolve] |\n"
" -f[amily] { inet | inet6 | ipx | dnet | link } |\n"
-" -o[neline] | -t[imestamp] | -b[atch] [filename] }\n");
+" -o[neline] | -t[imestamp] | -b[atch] [filename] |\n"
+" -rc[vbuf] [size]}\n");
exit(-1);
}
@@ -213,6 +214,19 @@ int main(int argc, char **argv)
if (argc <= 1)
usage();
batch_file = argv[1];
+ } else if (matches(opt, "-rcvbuf") == 0) {
+ unsigned int size;
+
+ argc--;
+ argv++;
+ if (argc <= 1)
+ usage();
+ if (get_unsigned(&size, argv[1], 0)) {
+ fprintf(stderr, "Invalid rcvbuf size '%s'\n",
+ argv[1]);
+ exit(-1);
+ }
+ rcvbuf = size;
} else if (matches(opt, "-help") == 0) {
usage();
} else {
diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index b68e2fd..5c716ab 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -25,6 +25,8 @@
#include "libnetlink.h"
+int rcvbuf = 1024 * 1024;
+
void rtnl_close(struct rtnl_handle *rth)
{
if (rth->fd >= 0) {
@@ -38,7 +40,6 @@ int rtnl_open_byproto(struct rtnl_handle *rth, unsigned subscriptions,
{
socklen_t addr_len;
int sndbuf = 32768;
- int rcvbuf = 32768;
memset(rth, 0, sizeof(*rth));
@@ -409,6 +410,8 @@ int rtnl_listen(struct rtnl_handle *rtnl,
continue;
fprintf(stderr, "netlink receive error %s (%d)\n",
strerror(errno), errno);
+ if (errno == ENOBUFS)
+ continue;
return -1;
}
if (status == 0) {
next prev parent reply other threads:[~2009-10-28 19:50 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-10-27 23:16 iproute uses too small of a receive buffer Ben Greear
2009-10-27 23:24 ` Stephen Hemminger
2009-10-27 23:30 ` Ben Greear
2009-10-28 7:01 ` Eric Dumazet
2009-10-28 7:09 ` Eric Dumazet
2009-10-28 7:37 ` Eric Dumazet
2009-10-28 7:52 ` Eric Dumazet
2009-10-28 7:55 ` David Miller
2009-10-28 19:05 ` Patrick McHardy
2009-10-28 19:19 ` Ben Greear
2009-10-28 19:50 ` Patrick McHardy [this message]
2009-10-28 20:04 ` Ben Greear
2009-10-28 20:07 ` Patrick McHardy
2009-10-28 20:21 ` Ben Greear
2009-11-10 17:15 ` Stephen Hemminger
2009-10-28 20:38 ` Eric Dumazet
2009-10-29 8:17 ` David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4AE8A098.8040207@trash.net \
--to=kaber@trash.net \
--cc=eric.dumazet@gmail.com \
--cc=greearb@candelatech.com \
--cc=netdev@vger.kernel.org \
--cc=shemminger@vyatta.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).