From: "J. Bruce Fields" <bfields@fieldses.org>
To: Stephen Hemminger <shemminger@vyatta.com>
Cc: Olga Kornievskaia <aglo@citi.umich.edu>,
netdev@vger.kernel.org, Jim Rees <rees@umich.edu>
Subject: Re: setsockopt()
Date: Mon, 7 Jul 2008 17:32:27 -0400
Message-ID: <20080707213227.GD19523@fieldses.org>
In-Reply-To: <20080707142408.43aa2a2e@extreme>

On Mon, Jul 07, 2008 at 02:24:08PM -0700, Stephen Hemminger wrote:
> On Mon, 07 Jul 2008 14:18:38 -0400
> Olga Kornievskaia <aglo@citi.umich.edu> wrote:
>
> > Hi,
> >
> > I'd like to ask a question regarding socket options, more
> > specifically send and receive buffer sizes.
> >
> > One simple question: (on the server-side) is it true that, to set
> > send/receive buffer size, setsockopt() can only be called before
> > listen()? From what I can tell, if I were to set socket options for the
> > listening socket, they get inherited by the socket created during the
> > accept(). However, when I try to change the send/receive buffer sizes for
> > the new socket, the changes take no effect.
> >
> > The server in question is the NFSD server in the kernel. NFSD's code
> > tries to adjust the buffer size (in order to have TCP increase the
> > window size appropriately) but it does so after the new socket is
> > created. It leads to the fact that the TCP window doesn't open beyond
> > the TCP's "default" sysctl value (that would be the 2nd value in the
> > triple net.ipv4.tcp_rmem, which on our system is set to 64KB). We
> > changed the code so that setsockopt() is called when the listening socket
> > is created and we set the buffer sizes to something bigger, like 8MB.
> > Then we try to increase the buffer size for each socket created by the
> > accept() but what is seen on the network trace is that window size
> > doesn't open beyond the values used for the listening socket.
>
> It would be better if NFSD stayed out of doing setsockopt and just
> let the sender/receiver autotuning work?

Just googling around.... Yes, that's probably exactly what we want,
thanks! Any pointers to a good tutorial on the autotuning behavior?

So all we should have to do is never mess with setsockopt, and the
receive buffer size can increase up to the maximum (the third integer in
the tcp_rmem sysctl) if necessary?

--b.

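For readers following along, here is a minimal sketch of how the three tcp_rmem fields referred to above (minimum, default, maximum, in bytes) could be read from userspace. It is not from the thread: the helper names parse_tcp_mem and read_tcp_rmem are hypothetical, the sample values in the usage note are made up for illustration, and the /proc path assumes a Linux system.

```python
from pathlib import Path


def parse_tcp_mem(text):
    """Parse a tcp_rmem/tcp_wmem sysctl string into its three byte counts.

    The sysctl holds three whitespace-separated integers:
    minimum, default, and maximum buffer size.
    """
    fields = text.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {text!r}")
    minimum, default, maximum = (int(field) for field in fields)
    return {"min": minimum, "default": default, "max": maximum}


def read_tcp_rmem(path="/proc/sys/net/ipv4/tcp_rmem"):
    """Read and parse the receive-buffer sysctl (Linux-only path)."""
    return parse_tcp_mem(Path(path).read_text())


if __name__ == "__main__":
    try:
        limits = read_tcp_rmem()
    except OSError as exc:
        # Not on Linux, or /proc is not mounted.
        print(f"could not read tcp_rmem: {exc}")
    else:
        print(f"default receive buffer: {limits['default']} bytes")
        print(f"autotuning ceiling:     {limits['max']} bytes")
```

With a hypothetical setting of "4096 65536 8388608", the second field corresponds to the 64KB default mentioned in the quoted message, and the third is the ceiling that autotuning could grow the receive buffer toward.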
>
> > I looked around in the code. There is a variable called
> > "window_clamp" that seems to specify the largest possible window
> > advertisement. window_clamp gets set during the creation of the accept
> > socket. At that time, its value is based on the sk_rcvbuf of the
> > listening socket. Thus, that would explain the behavior that window
> > doesn't grow beyond the values used in setsockopt() for the listening
> > socket, even though the new socket has new (larger) sk_sndbuf and
> > sk_rcvbuf than the listening socket.
> >
> > I realize that send/receive buffer size and window advertisement are
> > different but they are related in the way that by telling TCP that we
> > have a certain amount of memory for socket operations, it should try to
> > open big enough window (provided that there is no congestion).
> >
> > Can somebody advise us on how to properly set send/receive buffer
> > sizes for the NFSD in the kernel such that (1) the window is not bound
> > by the TCP's default sysctl value and (2) if it is possible to do so for
> > the accept sockets and not the listening socket.
> >
> > I would appreciate if we could be CC-ed on the reply as we are not
> > subscribed to the netdev mailing list.
> >
> > Thank you.
> >
> > -Olga
> >
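
The inheritance described in the quoted message can be observed from userspace with a small loopback experiment. This is only a sketch under stated assumptions: the function name rcvbuf_after_accept is hypothetical, it exercises the ordinary sockets API rather than the in-kernel NFSD code path, and the reported numbers depend on the system. On Linux, socket(7) documents that the kernel doubles the value passed to SO_RCVBUF and caps it at net.core.rmem_max, and that an explicit SO_RCVBUF fixes the buffer size, so receive autotuning no longer adjusts it for that socket.

```python
import socket


def rcvbuf_after_accept(requested=None):
    """Return (listener_rcvbuf, accepted_rcvbuf) for one loopback connection.

    If `requested` is given, SO_RCVBUF is set on the listening socket
    before listen(), which is the ordering discussed in the thread.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if requested is not None:
            listener.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick one
        listener.listen(1)
        client = socket.create_connection(listener.getsockname())
        try:
            conn, _ = listener.accept()
            try:
                return (
                    listener.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
                    conn.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
                )
            finally:
                conn.close()
        finally:
            client.close()
    finally:
        listener.close()


if __name__ == "__main__":
    print("no setsockopt:     ", rcvbuf_after_accept())
    print("128 KiB on listener:", rcvbuf_after_accept(128 * 1024))
```

Comparing the two lines of output shows whether the accepted socket picked up the size requested on the listening socket. It does not show the advertised TCP window itself, which would need a packet trace as in the original report.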