* iproute uses too small of a receive buffer
From: Ben Greear @ 2009-10-27 23:16 UTC
To: NetDev

[-- Attachment #1: Type: text/plain, Size: 1065 bytes --]

I have a very busy system with a bunch of xorp router processes (mis)configured.

This thing is rapidly making route changes for whatever reason.

The 'ip monitor route' command was failing:

[root@i7-dqc-1 ]# ip monitor route
netlink receive error No buffer space available (105)
Dump terminated

It is only using a 32k rcv buffer, and it seems the OS was
overdriving it.

Please consider making the rcv buffer larger, perhaps something
like this (inline is white-space damaged...attachment should apply
if deemed useful.):

Signed-off-by: Ben Greear <greearb@candelatech.com>

diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index b68e2fd..95a7d1d 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -38,7 +38,7 @@ int rtnl_open_byproto(struct rtnl_handle *rth, unsigned subscriptions,
 {
 	socklen_t addr_len;
 	int sndbuf = 32768;
-	int rcvbuf = 32768;
+	int rcvbuf = 3276800;

 	memset(rth, 0, sizeof(*rth));

Thanks,
Ben

--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

[-- Attachment #2: iputils.patch --]
[-- Type: text/plain, Size: 343 bytes --]

diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index b68e2fd..95a7d1d 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -38,7 +38,7 @@ int rtnl_open_byproto(struct rtnl_handle *rth, unsigned subscriptions,
 {
 	socklen_t addr_len;
 	int sndbuf = 32768;
-	int rcvbuf = 32768;
+	int rcvbuf = 3276800;

 	memset(rth, 0, sizeof(*rth));

* Re: iproute uses too small of a receive buffer
From: Stephen Hemminger @ 2009-10-27 23:24 UTC
To: Ben Greear; +Cc: NetDev

On Tue, 27 Oct 2009 16:16:52 -0700
Ben Greear <greearb@candelatech.com> wrote:

> I have a very busy system with a bunch of xorp router processes (mis)configured.
>
> This thing is rapidly making route changes for whatever reason.
>
> The 'ip monitor route' command was failing:
>
> [root@i7-dqc-1 ]# ip monitor route
> netlink receive error No buffer space available (105)
> Dump terminated
>
>
> It is only using a 32k rcv buffer, and it seems the OS was
> overdriving it.
>
> Please consider making the rcv buffer larger, perhaps something
> like this (inline is white-space damaged...attachment should apply
> if deemed useful.):
>
> Signed-off-by: Ben Greear <greearb@candelatech.com>
>
> diff --git a/lib/libnetlink.c b/lib/libnetlink.c
> index b68e2fd..95a7d1d 100644
> --- a/lib/libnetlink.c
> +++ b/lib/libnetlink.c
> @@ -38,7 +38,7 @@ int rtnl_open_byproto(struct rtnl_handle *rth, unsigned subscriptions,
>  {
>  	socklen_t addr_len;
>  	int sndbuf = 32768;
> -	int rcvbuf = 32768;
> +	int rcvbuf = 3276800;
>
>  	memset(rth, 0, sizeof(*rth));
>
>
> Thanks,
> Ben
>

Just having larger buffer isn't guarantee of success.  Allocating
a huge buffer is not going to work on embedded.

Why not have it continue after one error.

--

* Re: iproute uses too small of a receive buffer
From: Ben Greear @ 2009-10-27 23:30 UTC
To: Stephen Hemminger; +Cc: NetDev

On 10/27/2009 04:24 PM, Stephen Hemminger wrote:
> On Tue, 27 Oct 2009 16:16:52 -0700
> Ben Greear<greearb@candelatech.com> wrote:
>
>> I have a very busy system with a bunch of xorp router processes (mis)configured.
>>
>> This thing is rapidly making route changes for whatever reason.
>>
>> The 'ip monitor route' command was failing:
>>
>> [root@i7-dqc-1 ]# ip monitor route
>> netlink receive error No buffer space available (105)
>> Dump terminated
>>
>>
>> It is only using a 32k rcv buffer, and it seems the OS was
>> overdriving it.
>>
>> Please consider making the rcv buffer larger, perhaps something
>> like this (inline is white-space damaged...attachment should apply
>> if deemed useful.):
>>
>> Signed-off-by: Ben Greear<greearb@candelatech.com>
>>
>> diff --git a/lib/libnetlink.c b/lib/libnetlink.c
>> index b68e2fd..95a7d1d 100644
>> --- a/lib/libnetlink.c
>> +++ b/lib/libnetlink.c
>> @@ -38,7 +38,7 @@ int rtnl_open_byproto(struct rtnl_handle *rth, unsigned subscriptions,
>>  {
>>  	socklen_t addr_len;
>>  	int sndbuf = 32768;
>> -	int rcvbuf = 32768;
>> +	int rcvbuf = 3276800;
>>
>>  	memset(rth, 0, sizeof(*rth));
>>
>>
>> Thanks,
>> Ben
>>
>
> Just having larger buffer isn't guarantee of success.  Allocating
> a huge buffer is not going to work on embedded.
>
> Why not have it continue after one error.

Probably the right way is to give a cmd-line arg to set the buffer size
and also continue if the error is ENOBUFS (but print some error out
so users know they have issues).  I can make the attempt if that
sounds good to you.

Thanks,
Ben

--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

* Re: iproute uses too small of a receive buffer
From: Eric Dumazet @ 2009-10-28 7:01 UTC
To: Ben Greear; +Cc: Stephen Hemminger, NetDev

Ben Greear wrote:
>
> Probably the right way is to give a cmd-line arg to set the buffer size
> and also continue if the error is ENOBUFS (but print some error out
> so users know they have issues).  I can make the attempt if that
> sounds good to you.

Real fix is to realloc buffer at receive time, no need for user setting.

In my testing I saw it reaching 1 Mbyte

write(2, "REALLOC buflen 8192\n"..., 20) = 20
write(2, "REALLOC buflen 16384\n"..., 21) = 21
write(2, "REALLOC buflen 32768\n"..., 21) = 21
write(2, "REALLOC buflen 65536\n"..., 21) = 21
write(2, "REALLOC buflen 131072\n"..., 22) = 22
write(2, "REALLOC buflen 262144\n"..., 22) = 22
write(2, "REALLOC buflen 524288\n"..., 22) = 22

[iproute2] realloc buffer in rtnl_listen

# ip monitor route
netlink receive error No buffer space available (105)
Dump terminated

Reported-by: Ben Greear<greearb@candelatech.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index b68e2fd..134ce7f 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -392,8 +392,14 @@ int rtnl_listen(struct rtnl_handle *rtnl,
 		.msg_iov = &iov,
 		.msg_iovlen = 1,
 	};
-	char buf[8192];
+	char *buf;
+	size_t buflen = 8192;

+	buf = malloc(buflen);
+	if (buf == NULL) {
+		fprintf(stderr, "netlink could not alloc %zu bytes\n", buflen);
+		return -1;
+	}
 	memset(&nladdr, 0, sizeof(nladdr));
 	nladdr.nl_family = AF_NETLINK;
 	nladdr.nl_pid = 0;
@@ -401,12 +407,20 @@ int rtnl_listen(struct rtnl_handle *rtnl,

 	iov.iov_base = buf;
 	while (1) {
-		iov.iov_len = sizeof(buf);
+		iov.iov_len = buflen;
 		status = recvmsg(rtnl->fd, &msg, 0);

 		if (status < 0) {
 			if (errno == EINTR || errno == EAGAIN)
 				continue;
+			if (errno == ENOBUFS) {
+				buf = realloc(buf, buflen * 2);
+				if (buf) {
+					buflen *= 2;
+					iov.iov_base = buf;
+					continue;
+				}
+			}
 			fprintf(stderr, "netlink receive error %s (%d)\n",
 				strerror(errno), errno);
 			return -1;

* Re: iproute uses too small of a receive buffer
From: Eric Dumazet @ 2009-10-28 7:09 UTC
Cc: Ben Greear, Stephen Hemminger, NetDev

Eric Dumazet wrote:
> Ben Greear wrote:
>> Probably the right way is to give a cmd-line arg to set the buffer size
>> and also continue if the error is ENOBUFS (but print some error out
>> so users know they have issues).  I can make the attempt if that
>> sounds good to you.
>
> Real fix is to realloc buffer at receive time, no need for user setting.
>

Then, another problem is that some information can be dropped at kernel level
when the socket rcvbuf is full (ip monitor too slow to read its socket).

That's hard to fix because you need to tweak /proc/sys/net/core/rmem_max

* Re: iproute uses too small of a receive buffer
From: Eric Dumazet @ 2009-10-28 7:37 UTC
To: Ben Greear, Stephen Hemminger; +Cc: NetDev

Eric Dumazet wrote:
> Ben Greear wrote:
>> Probably the right way is to give a cmd-line arg to set the buffer size
>> and also continue if the error is ENOBUFS (but print some error out
>> so users know they have issues).  I can make the attempt if that
>> sounds good to you.
>
> Real fix is to realloc buffer at receive time, no need for user setting.
>
> In my testing I saw it reaching 1 Mbyte
> write(2, "REALLOC buflen 8192\n"..., 20) = 20
> write(2, "REALLOC buflen 16384\n"..., 21) = 21
> write(2, "REALLOC buflen 32768\n"..., 21) = 21
> write(2, "REALLOC buflen 65536\n"..., 21) = 21
> write(2, "REALLOC buflen 131072\n"..., 22) = 22
> write(2, "REALLOC buflen 262144\n"..., 22) = 22
> write(2, "REALLOC buflen 524288\n"..., 22) = 22
>
>
> [iproute2] realloc buffer in rtnl_listen
>
> # ip monitor route
> netlink receive error No buffer space available (105)
> Dump terminated
>
> Reported-by: Ben Greear<greearb@candelatech.com>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

Oops, this was wrong, Ben was right, sorry...

An ENOBUFS error is a flag that actually reports to the user that some
information was dropped, not that the user-supplied buffer at recv() time
is not big enough.

I was surprised that the buffer could reach 1 Mbyte, while RCVBUF was 32768 or so.

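The distinction drawn in the message above (ENOBUFS means the kernel dropped events, not that the recvmsg() buffer was too small) can be sketched as a small errno classifier for a netlink receive loop. This is only an illustration under stated assumptions: the names `recv_action` and `classify_recv_errno` are hypothetical and do not exist in libnetlink.

```c
#include <errno.h>

/* Hypothetical helper, not part of libnetlink: what a netlink receive
 * loop might do with the errno left behind by a failed recvmsg(). */
enum recv_action {
	RECV_RETRY,      /* transient condition, call recvmsg() again */
	RECV_LOST_DATA,  /* kernel dropped messages; warn, then keep reading */
	RECV_FATAL       /* anything else: report and stop */
};

enum recv_action classify_recv_errno(int err)
{
	switch (err) {
	case EINTR:
	case EAGAIN:
		return RECV_RETRY;
	case ENOBUFS:
		/* The socket receive queue overflowed. A larger userspace
		 * buffer does not help; only a larger SO_RCVBUF or a
		 * faster reader reduces the loss. */
		return RECV_LOST_DATA;
	default:
		return RECV_FATAL;
	}
}
```

A caller would typically print a warning on `RECV_LOST_DATA` so the user knows events are missing, and then continue the loop rather than exit.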
* Re: iproute uses too small of a receive buffer
From: Eric Dumazet @ 2009-10-28 7:52 UTC
To: Stephen Hemminger; +Cc: Ben Greear, NetDev

Stephen Hemminger wrote:
>
> Just having larger buffer isn't guarantee of success.  Allocating
> a huge buffer is not going to work on embedded.
>

Please note we do not allocate a big buffer, only allow more small skbs
to be queued on socket receive queue.

If memory is not available, skb allocation will eventually fail
and be reported as well, embedded or not.

I vote for allowing 1024*1024 bytes instead of 32768,
and eventually user should be warned that it is capped by
/proc/sys/net/core/rmem_max

> Why not have it continue after one error.

Yes, but the caller of 'ip monitor' would just restart it anyway

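The warning suggested above (tell the user when the requested size is capped by /proc/sys/net/core/rmem_max) could look roughly like the following. This is a minimal sketch, not code from iproute2: the function names `read_rmem_max` and `warn_if_capped` are made up for illustration, and the sketch assumes a Linux system with /proc mounted.

```c
#include <stdio.h>

/* Hypothetical helper: read the system-wide cap on SO_RCVBUF.
 * Returns -1 if the file cannot be read or parsed. */
long read_rmem_max(void)
{
	FILE *f = fopen("/proc/sys/net/core/rmem_max", "r");
	long val = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%ld", &val) != 1)
		val = -1;
	fclose(f);
	return val;
}

/* Hypothetical helper: print a warning and return 1 if `wanted` bytes
 * exceeds the cap; return 0 otherwise (including when the cap is unknown). */
int warn_if_capped(long wanted)
{
	long cap = read_rmem_max();

	if (cap >= 0 && wanted > cap) {
		fprintf(stderr,
			"rcvbuf %ld exceeds rmem_max %ld; the kernel will use the smaller value\n",
			wanted, cap);
		return 1;
	}
	return 0;
}
```

For example, `warn_if_capped(1024 * 1024)` before the setsockopt() call would flag systems where the 1 MB request cannot be honored in full.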
* Re: iproute uses too small of a receive buffer
From: David Miller @ 2009-10-28 7:55 UTC
To: eric.dumazet; +Cc: shemminger, greearb, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 28 Oct 2009 08:52:57 +0100

> Stephen Hemminger wrote:
>>
>> Just having larger buffer isn't guarantee of success.  Allocating
>> a huge buffer is not going to work on embedded.
>>
>
> Please note we do not allocate a big buffer, only allow more small skbs
> to be queued on socket receive queue.
>
> If memory is not available, skb allocation will eventually fail
> and be reported as well, embedded or not.
>
> I vote for allowing 1024*1024 bytes instead of 32768,
> and eventually user should be warned that it is capped by
> /proc/sys/net/core/rmem_max

This discussion constantly reminds me of:

/*
 *	skb should fit one page. This choice is good for headerless malloc.
 *	But we should limit to 8K so that userspace does not have to
 *	use enormous buffer sizes on recvmsg() calls just to avoid
 *	MSG_TRUNC when PAGE_SIZE is very large.
 */
#if PAGE_SIZE < 8192UL
#define NLMSG_GOODSIZE	SKB_WITH_OVERHEAD(PAGE_SIZE)
#else
#define NLMSG_GOODSIZE	SKB_WITH_OVERHEAD(8192UL)
#endif

#define NLMSG_DEFAULT_SIZE (NLMSG_GOODSIZE - NLMSG_HDRLEN)

* Re: iproute uses too small of a receive buffer
From: Patrick McHardy @ 2009-10-28 19:05 UTC
To: Eric Dumazet; +Cc: Stephen Hemminger, Ben Greear, NetDev

[-- Attachment #1: Type: text/plain, Size: 753 bytes --]

Eric Dumazet wrote:
> Stephen Hemminger wrote:
>> Just having larger buffer isn't guarantee of success.  Allocating
>> a huge buffer is not going to work on embedded.
>>
>
> Please note we do not allocate a big buffer, only allow more small skbs
> to be queued on socket receive queue.
>
> If memory is not available, skb allocation will eventually fail
> and be reported as well, embedded or not.
>
> I vote for allowing 1024*1024 bytes instead of 32768,
> and eventually user should be warned that it is capped by
> /proc/sys/net/core/rmem_max

How about this? It will double the receive queue limit on ENOBUFS
up to 1024 * 1024b, then bail out with the normal error message on
further ENOBUFS.

Signed-off-by: Patrick McHardy <kaber@trash.net>

[-- Attachment #2: x --]
[-- Type: text/plain, Size: 894 bytes --]

diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index b68e2fd..e4fda40 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -25,6 +25,8 @@

 #include "libnetlink.h"

+static int rcvbuf = 32768;
+
 void rtnl_close(struct rtnl_handle *rth)
 {
 	if (rth->fd >= 0) {
@@ -38,7 +40,6 @@ int rtnl_open_byproto(struct rtnl_handle *rth, unsigned subscriptions,
 {
 	socklen_t addr_len;
 	int sndbuf = 32768;
-	int rcvbuf = 32768;

 	memset(rth, 0, sizeof(*rth));

@@ -407,6 +409,12 @@ int rtnl_listen(struct rtnl_handle *rtnl,
 		if (status < 0) {
 			if (errno == EINTR || errno == EAGAIN)
 				continue;
+			if (errno == ENOBUFS && rcvbuf < 1024 * 1024) {
+				rcvbuf *= 2;
+				if (setsockopt(rtnl->fd, SOL_SOCKET, SO_RCVBUF,
+					       &rcvbuf, sizeof(rcvbuf)) == 0)
+					continue;
+			}
 			fprintf(stderr, "netlink receive error %s (%d)\n",
 				strerror(errno), errno);
 			return -1;

* Re: iproute uses too small of a receive buffer
From: Ben Greear @ 2009-10-28 19:19 UTC
To: Patrick McHardy; +Cc: Eric Dumazet, Stephen Hemminger, NetDev

On 10/28/2009 12:05 PM, Patrick McHardy wrote:
> Eric Dumazet wrote:
>> Stephen Hemminger wrote:
>>> Just having larger buffer isn't guarantee of success.  Allocating
>>> a huge buffer is not going to work on embedded.
>>>
>>
>> Please note we do not allocate a big buffer, only allow more small skbs
>> to be queued on socket receive queue.
>>
>> If memory is not available, skb allocation will eventually fail
>> and be reported as well, embedded or not.
>>
>> I vote for allowing 1024*1024 bytes instead of 32768,
>> and eventually user should be warned that it is capped by
>> /proc/sys/net/core/rmem_max
>
> How about this? It will double the receive queue limit on ENOBUFS
> up to 1024 * 1024b, then bail out with the normal error message on
> further ENOBUFS.
>
> Signed-off-by: Patrick McHardy<kaber@trash.net>

First: This still pretty much guarantees that messages will be lost when
the program starts (when messages are coming in too large of chunks for
small buffers).
If you are debugging something tricky, having lost messages will be very annoying!

Second: Why bail on ENOBUFS at all?  I don't see how it helps the user
since they will probably just have to start it again, and will miss more
messages than keeping going would have.

And, even 1MB may not be enough for some scenarios.  So, probably best to
let users over-ride the initial setting on cmd-line.  If not, then use
a large value to start with.

Thanks,
Ben

--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

* Re: iproute uses too small of a receive buffer
From: Patrick McHardy @ 2009-10-28 19:50 UTC
To: Ben Greear; +Cc: Eric Dumazet, Stephen Hemminger, NetDev

[-- Attachment #1: Type: text/plain, Size: 2006 bytes --]

Ben Greear wrote:
> On 10/28/2009 12:05 PM, Patrick McHardy wrote:
>> Eric Dumazet wrote:
>>> Stephen Hemminger wrote:
>>>> Just having larger buffer isn't guarantee of success.  Allocating
>>>> a huge buffer is not going to work on embedded.
>>>>
>>>
>>> Please note we do not allocate a big buffer, only allow more small skbs
>>> to be queued on socket receive queue.
>>>
>>> If memory is not available, skb allocation will eventually fail
>>> and be reported as well, embedded or not.
>>>
>>> I vote for allowing 1024*1024 bytes instead of 32768,
>>> and eventually user should be warned that it is capped by
>>> /proc/sys/net/core/rmem_max
>>
>> How about this? It will double the receive queue limit on ENOBUFS
>> up to 1024 * 1024b, then bail out with the normal error message on
>> further ENOBUFS.
>>
>> Signed-off-by: Patrick McHardy<kaber@trash.net>
>
> First: This still pretty much guarantees that messages will be lost when
> the program starts (when messages are coming in too large of chunks for
> small buffers)
> If you are debugging something tricky, having lost messages will be
> very annoying!

Yeah, on second thought the probing also doesn't make too much
sense since the memory is only used when it's really needed anyway.
And it's capped by rmem_max.

> Second: Why bail on ENOBUFS at all?  I don't see how it helps the user
> since they will probably just have to start it again, and will miss more
> messages than keeping going would have.

Agreed.

> And, even 1MB may not be enough for some scenarios.  So, probably best to
> let users over-ride the initial setting on cmd-line.  If not, then use
> a large value to start with.

How about this? It uses 1MB as receive buf limit by default (without
increasing /proc/sys/net/core/rmem_max it will be limited by less
however) and allows to specify the size manually using "-rcvbuf X"
(-r is already used, so you need to specify at least -rc).

Additionally rtnl_listen() continues on ENOBUFS after printing the
error message.

[-- Attachment #2: x --]
[-- Type: text/plain, Size: 2170 bytes --]

diff --git a/include/libnetlink.h b/include/libnetlink.h
index 0e02468..61da15b 100644
--- a/include/libnetlink.h
+++ b/include/libnetlink.h
@@ -17,6 +17,8 @@ struct rtnl_handle
 	__u32 dump;
 };

+extern int rcvbuf;
+
 extern int rtnl_open(struct rtnl_handle *rth, unsigned subscriptions);
 extern int rtnl_open_byproto(struct rtnl_handle *rth, unsigned subscriptions, int protocol);
 extern void rtnl_close(struct rtnl_handle *rth);
diff --git a/ip/ip.c b/ip/ip.c
index 2bd54b2..b4c076a 100644
--- a/ip/ip.c
+++ b/ip/ip.c
@@ -50,7 +50,8 @@ static void usage(void)
 " tunnel | maddr | mroute | monitor | xfrm }\n"
 " OPTIONS := { -V[ersion] | -s[tatistics] | -d[etails] | -r[esolve] |\n"
 " -f[amily] { inet | inet6 | ipx | dnet | link } |\n"
-" -o[neline] | -t[imestamp] | -b[atch] [filename] }\n");
+" -o[neline] | -t[imestamp] | -b[atch] [filename] |\n"
+" -rc[vbuf] [size]}\n");
 	exit(-1);
 }

@@ -213,6 +214,19 @@ int main(int argc, char **argv)
 			if (argc <= 1)
 				usage();
 			batch_file = argv[1];
+		} else if (matches(opt, "-rcvbuf") == 0) {
+			unsigned int size;
+
+			argc--;
+			argv++;
+			if (argc <= 1)
+				usage();
+			if (get_unsigned(&size, argv[1], 0)) {
+				fprintf(stderr, "Invalid rcvbuf size '%s'\n",
+					argv[1]);
+				exit(-1);
+			}
+			rcvbuf = size;
 		} else if (matches(opt, "-help") == 0) {
 			usage();
 		} else {
diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index b68e2fd..5c716ab 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -25,6 +25,8 @@

 #include "libnetlink.h"

+int rcvbuf = 1024 * 1024;
+
 void rtnl_close(struct rtnl_handle *rth)
 {
 	if (rth->fd >= 0) {
@@ -38,7 +40,6 @@ int rtnl_open_byproto(struct rtnl_handle *rth, unsigned subscriptions,
 {
 	socklen_t addr_len;
 	int sndbuf = 32768;
-	int rcvbuf = 32768;

 	memset(rth, 0, sizeof(*rth));

@@ -409,6 +410,8 @@ int rtnl_listen(struct rtnl_handle *rtnl,
 				continue;
 			fprintf(stderr, "netlink receive error %s (%d)\n",
 				strerror(errno), errno);
+			if (errno == ENOBUFS)
+				continue;
 			return -1;
 		}
 		if (status == 0) {

* Re: iproute uses too small of a receive buffer
From: Ben Greear @ 2009-10-28 20:04 UTC
To: Patrick McHardy; +Cc: Eric Dumazet, Stephen Hemminger, NetDev

On 10/28/2009 12:50 PM, Patrick McHardy wrote:

>> And, even 1MB may not be enough for some scenarios.  So, probably best to
>> let users over-ride the initial setting on cmd-line.  If not, then use
>> a large value to start with.
>
> How about this? It uses 1MB as receive buf limit by default (without
> increasing /proc/sys/net/core/rmem_max it will be limited by less
> however) and allows to specify the size manually using "-rcvbuf X"
> (-r is already used, so you need to specify at least -rc).
>
> Additionally rtnl_listen() continues on ENOBUFS after printing the
> error message.

Looks good..except:

If rmem_max is smaller than 1M, will that cause setsockopt to
fail and thus fail early out of rtnl_open_byproto?

Maybe we should only print errors but not return in that method
when setsockopt fails?

In another project, I ended up trying ever smaller values until one
worked in order to get near what the user wanted even if rmem_max
was configured smaller.  Not sure if that is worth doing here or not.

Thanks,
Ben

--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

* Re: iproute uses too small of a receive buffer
From: Patrick McHardy @ 2009-10-28 20:07 UTC
To: Ben Greear; +Cc: Eric Dumazet, Stephen Hemminger, NetDev

Ben Greear wrote:
> On 10/28/2009 12:50 PM, Patrick McHardy wrote:
>
>>> And, even 1MB may not be enough for some scenarios.  So, probably
>>> best to
>>> let users over-ride the initial setting on cmd-line.  If not, then use
>>> a large value to start with.
>>
>> How about this? It uses 1MB as receive buf limit by default (without
>> increasing /proc/sys/net/core/rmem_max it will be limited by less
>> however) and allows to specify the size manually using "-rcvbuf X"
>> (-r is already used, so you need to specify at least -rc).
>>
>> Additionally rtnl_listen() continues on ENOBUFS after printing the
>> error message.
>
> Looks good..except:
>
> If rmem_max is smaller than 1M, will that cause setsockopt to
> fail and thus fail early out of rtnl_open_byproto?

No, the kernel takes the value as a hint and only uses the
maximum allowable value:

	case SO_RCVBUF:
		/* Don't error on this BSD doesn't and if you think
		   about it this is right. Otherwise apps have to
		   play 'guess the biggest size' games. RCVBUF/SNDBUF
		   are treated in BSD as hints */

		if (val > sysctl_rmem_max)
			val = sysctl_rmem_max;

> Maybe we should only print errors but not return in that method
> when setsockopt fails?
>
> In another project, I ended up trying ever smaller values until one
> worked in order to get near what the user wanted even if rmem_max
> was configured smaller.  Not sure if that is worth doing here or not.

I think it should be fine this way.

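The "hint" behavior described in the message above can be observed from userspace by requesting an oversized SO_RCVBUF and reading back what the kernel granted. The following is a minimal sketch assuming Linux; the function name `request_rcvbuf` is hypothetical and not part of libnetlink. Note that Linux reports twice the stored value through getsockopt() to account for bookkeeping overhead, so the number read back is not expected to equal the request.

```c
#include <sys/socket.h>

/* Hypothetical helper: ask for `requested` bytes of receive queue on an
 * already-open socket and return the size the kernel reports afterwards,
 * or -1 if either socket call fails. */
int request_rcvbuf(int fd, int requested)
{
	int granted = 0;
	socklen_t len = sizeof(granted);

	/* An oversized request is clamped by the kernel, not rejected. */
	if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
		       &requested, sizeof(requested)) < 0)
		return -1;
	if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &granted, &len) < 0)
		return -1;
	return granted;
}
```

Called with a netlink socket, for example one opened with `socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE)`, and a request larger than rmem_max, the call is expected to succeed and return a value bounded by the system limit rather than fail.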
* Re: iproute uses too small of a receive buffer
From: Ben Greear @ 2009-10-28 20:21 UTC
To: Patrick McHardy; +Cc: Eric Dumazet, Stephen Hemminger, NetDev

On 10/28/2009 01:07 PM, Patrick McHardy wrote:
> Ben Greear wrote:
>> On 10/28/2009 12:50 PM, Patrick McHardy wrote:
>>
>>>> And, even 1MB may not be enough for some scenarios.  So, probably
>>>> best to
>>>> let users over-ride the initial setting on cmd-line.  If not, then use
>>>> a large value to start with.
>>>
>>> How about this? It uses 1MB as receive buf limit by default (without
>>> increasing /proc/sys/net/core/rmem_max it will be limited by less
>>> however) and allows to specify the size manually using "-rcvbuf X"
>>> (-r is already used, so you need to specify at least -rc).
>>>
>>> Additionally rtnl_listen() continues on ENOBUFS after printing the
>>> error message.
>>
>> Looks good..except:
>>
>> If rmem_max is smaller than 1M, will that cause setsockopt to
>> fail and thus fail early out of rtnl_open_byproto?
>
> No, the kernel takes the value as a hint and only uses the
> maximum allowable value:

Sweet.  No complaints from me then.

Thanks,
Ben

--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

* Re: iproute uses too small of a receive buffer
From: Stephen Hemminger @ 2009-11-10 17:15 UTC
To: Patrick McHardy; +Cc: Ben Greear, Eric Dumazet, NetDev

>
> How about this? It uses 1MB as receive buf limit by default (without
> increasing /proc/sys/net/core/rmem_max it will be limited by less
> however) and allows to specify the size manually using "-rcvbuf X"
> (-r is already used, so you need to specify at least -rc).
>
> Additionally rtnl_listen() continues on ENOBUFS after printing the
> error message.

Applied, seems like the best workaround

* Re: iproute uses too small of a receive buffer
From: Eric Dumazet @ 2009-10-28 20:38 UTC
To: Ben Greear; +Cc: Patrick McHardy, Stephen Hemminger, NetDev

Ben Greear wrote:

> Second: Why bail on ENOBUFS at all?  I don't see how it helps the user
> since they will probably just have to start it again, and will miss more
> messages than keeping going would have.
>
> And, even 1MB may not be enough for some scenarios.  So, probably best to
> let users over-ride the initial setting on cmd-line.  If not, then use
> a large value to start with.
>

In this case, just don't call setsockopt() at all in "ip" and let the system use
the standard/default value (/proc/sys/net/core/rmem_default) that an admin
can change if he wants to handle one million devices :)

* Re: iproute uses too small of a receive buffer
From: David Miller @ 2009-10-29 8:17 UTC
To: kaber; +Cc: eric.dumazet, shemminger, greearb, netdev

From: Patrick McHardy <kaber@trash.net>
Date: Wed, 28 Oct 2009 20:05:12 +0100

> How about this? It will double the receive queue limit on ENOBUFS
> up to 1024 * 1024b, then bail out with the normal error message on
> further ENOBUFS.
>
> Signed-off-by: Patrick McHardy <kaber@trash.net>

Acked-by: David S. Miller <davem@davemloft.net>

Thread overview: 17 messages

2009-10-27 23:16 iproute uses too small of a receive buffer Ben Greear
2009-10-27 23:24 ` Stephen Hemminger
2009-10-27 23:30   ` Ben Greear
2009-10-28  7:01     ` Eric Dumazet
2009-10-28  7:09       ` Eric Dumazet
2009-10-28  7:37       ` Eric Dumazet
2009-10-28  7:52   ` Eric Dumazet
2009-10-28  7:55     ` David Miller
2009-10-28 19:05     ` Patrick McHardy
2009-10-28 19:19       ` Ben Greear
2009-10-28 19:50         ` Patrick McHardy
2009-10-28 20:04           ` Ben Greear
2009-10-28 20:07             ` Patrick McHardy
2009-10-28 20:21               ` Ben Greear
2009-11-10 17:15           ` Stephen Hemminger
2009-10-28 20:38         ` Eric Dumazet
2009-10-29  8:17       ` David Miller