Re: getsockopt/setsockopt with SO_RCVBUF and SO_SNDBUF "non-standard" behaviour

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Rick Jones <rick.jones2@hp.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Eugen Dedu <Eugen.Dedu@pu-pm.univ-fcomte.fr>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	netdev <netdev@vger.kernel.org>
Subject: Re: getsockopt/setsockopt with SO_RCVBUF and SO_SNDBUF "non-standard" behaviour
Date: Wed, 18 Jul 2012 10:32:30 -0700	[thread overview]
Message-ID: <5006F32E.8060405@hp.com> (raw)
In-Reply-To: <1342627875.2626.3070.camel@edumazet-glaptop>

On 07/18/2012 09:11 AM, Eric Dumazet wrote:
>
> That the way it's done on linux since day 0
>
> You can probably find a lot of pages on the web explaining the
> rationale.
>
> If your application handles UDP frames, what SO_RCVBUF should count ?
>
> If its the amount of payload bytes, you could have a pathological
> situation where an attacker sends 1-byte UDP frames fast enough and
> could consume a lot of kernel memory.
>
> Each frame consumes a fair amount of kernel memory (between 512 bytes
> and 8 Kbytes depending on the driver).
>
> So linux says : If user expect to receive  XXXX bytes, set a limit of
> _kernel_ memory used to store these bytes, and use an estimation of 100%
> of overhead. That is : allow 2*XXXX bytes to be allocated for socket
> receive buffers.

Expanding on/rewording that, in a setsockopt() call SO_RCVBUF specifies 
the data bytes and gets doubled to become the kernel/overhead byte 
limit.  Unless the doubling would be greater than net.core.rmem_max, in 
which case the limit becomes net.core.rmem_max.

But on getsockopt() SO_RCVBUF is always the kernel/overhead byte limit.

In one call it is fish.  In the other it is fowl.

Other stacks appear to keep their kernel/overhead limit quiet, keeping 
SO_RCVBUF an expression of a data limit in both setsockopt() and 
getsockopt().  With those stacks, there is I suppose the possible source 
of confusion when/if someone tests the queuing to a socket, sends "high 
overhead" packets and doesn't get to SO_RCVBUF worth of data though I 
don't recall encountering that in my "pre-linux" time.

The sometimes fish, sometimes fowl version (along with the auto tuning 
when one doesn't make setsockopt() calls) gave me fits in netperf for 
years until I finally relented and split the socket buffer size 
variables into three - what netperf's user requested via the command 
line, what it was right after the socket was created, and what it was at 
the end of the data phase of the test.

rick jones

next prev parent reply	other threads:[~2012-07-18 17:32 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-07-17  9:27 getsockopt/setsockopt with SO_RCVBUF and SO_SNDBUF "non-standard" behaviour Eugen Dedu
2012-07-18 15:59 ` Eugen Dedu
2012-07-18 16:11   ` Eric Dumazet
2012-07-18 17:32     ` Rick Jones [this message]
2012-07-19 16:14       ` Eugen Dedu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5006F32E.8060405@hp.com \
    --to=rick.jones2@hp.com \
    --cc=Eugen.Dedu@pu-pm.univ-fcomte.fr \
    --cc=eric.dumazet@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.