public inbox for linux-man@vger.kernel.org
* TCP_CONGESTION documentation
@ 2008-11-21 16:06 Michael Kerrisk
       [not found] ` <4926DC7B.7020203-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Kerrisk @ 2008-11-21 16:06 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: David Miller, linux-net-u79uwXL29TY76Z2rM5mHXA, Andi Kleen,
	linux-man-u79uwXL29TY76Z2rM5mHXA, Michael Kerrisk

Hello Stephen,

Back in 2.6.13, you added the TCP_CONGESTION sockopt, but
provided no man-page patch...

Below is my attempt to document this sockopt.  Could you please
review.  Please don't assume I've well understood the code: I
may well have messed up in my reading of it, so review what
I've written with care.

Also, a question: was the silent truncation of the returned
string on getsockopt() if optlen is too small really intended?
Would it not be/have been better to error on this case?

Cheers,

Michael


   TCP_CONGESTION (since Linux 2.6.13)
      Get  or  set  the  congestion-control algorithm for this
      socket.   The  optval  argument  is  a  pointer   to   a
      character-string buffer.

      For  getsockopt()  *optlen specifies the amount of space
      available in the buffer  pointed  to  by  optval,  which
      should  be  at  least  16  bytes (defined by the kernel-
      internal  constant  TCP_CA_NAME_MAX).   On  return,  the
      buffer  pointed to by optval is set to a null-terminated
      string containing the  name  of  the  congestion-control
      algorithm  for  this  socket,  and *optlen is set to the
      minimum of its original value and  TCP_CA_NAME_MAX.   If
      the  value  passed  in  *optlen  is  too small, then the
      string returned in *optval is silently truncated, and no
      terminating  null  byte is added.  If an empty string is
      returned, then the socket is using the  default  conges-
      tion-control  algorithm,  determined  as described under
      tcp_congestion_control above.
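
As an illustration of the getsockopt() behaviour described above, here
is a minimal C sketch.  The helper name get_cc_name() and the local
constant CC_NAME_MAX are hypothetical (TCP_CA_NAME_MAX is kernel-internal
and not visible to user space).  The sketch terminates the buffer itself
because, per the text above, a truncated result may lack a null byte.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>

/* Assumption: mirrors the kernel-internal TCP_CA_NAME_MAX (16 bytes). */
#define CC_NAME_MAX 16

/* Hypothetical helper: copy the congestion-control algorithm name of
   socket fd into name[], which must hold at least CC_NAME_MAX + 1 bytes.
   Returns 0 on success, or -1 with errno set by getsockopt(). */
static int get_cc_name(int fd, char *name)
{
    socklen_t len = CC_NAME_MAX;

    memset(name, 0, CC_NAME_MAX + 1);
    if (getsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, name, &len) == -1)
        return -1;
    name[CC_NAME_MAX] = '\0';   /* guard against an unterminated result */
    return 0;
}
```

A caller would pass a buffer such as char name[CC_NAME_MAX + 1] together
with any TCP socket descriptor.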

      For setsockopt() optlen specifies the length of the con-
      gestion-control  algorithm  name contained in the buffer
      pointed to by optval; this length need not  include  any
      terminating  null  byte.  The algorithm "reno" is always
      permitted; other algorithms may be available,  depending
      on  kernel configuration.  Possible errors from setsock-
      opt() include: algorithm not  found/available  (ENOENT);
      setting  this algorithm requires the CAP_NET_ADMIN capa-
      bility  (EPERM);  and  failure  getting  kernel   module
      (EBUSY).
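
The setsockopt() side might be exercised roughly as follows.  This is
only a sketch: set_cc_name() and cc_error_reason() are hypothetical
helper names, and the error descriptions simply restate the list above.

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: select the congestion-control algorithm `name`
   for socket fd.  Returns 0 on success, or -1 with errno set. */
static int set_cc_name(int fd, const char *name)
{
    /* The length passed need not include the terminating null byte. */
    return setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION,
                      name, (socklen_t)strlen(name));
}

/* Map the errno values listed in the draft text to short descriptions. */
static const char *cc_error_reason(int err)
{
    switch (err) {
    case ENOENT: return "algorithm not found/available";
    case EPERM:  return "algorithm requires CAP_NET_ADMIN";
    case EBUSY:  return "failure getting kernel module";
    default:     return "other error";
    }
}
```

For example, set_cc_name(fd, "reno") should succeed on any kernel that
supports this option, since "reno" is always permitted.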

--- tcp.7       2008-11-21 10:54:08.000000000 -0500
+++ tcp.7.TCP_CONGESTION.patch   2008-11-21 10:53:36.000000000 -0500
@@ -733,7 +733,58 @@
 socket options are valid on TCP sockets.
 For more information see
 .BR ip (7).
-.\" FIXME Document TCP_CONGESTION (new in 2.6.13)
+.TP
+.BR TCP_CONGESTION " (since Linux 2.6.13)"
+Get or set the congestion-control algorithm for this socket.
+The
+.I optval
+argument is a pointer to a character-string buffer.
+
+For
+.BR getsockopt ()
+.I *optlen
+specifies the amount of space available in the buffer pointed to by
+.IR optval ,
+which should be at least 16 bytes (defined by the kernel-internal constant
+.BR TCP_CA_NAME_MAX ).
+On return, the buffer pointed to by
+.I optval
+is set to a null-terminated string containing the name of the
+congestion-control algorithm for this socket, and
+.I *optlen
+is set to the minimum of its original value and
+.BR TCP_CA_NAME_MAX .
+If the value passed in
+.I *optlen
+is too small, then the string returned in
+.I *optval
+is silently truncated, and no terminating null byte is added.
+If an empty string is returned, then the socket is using the default
+congestion-control algorithm, determined as described under
+.I tcp_congestion_control
+above.
+
+For
+.BR setsockopt ()
+.I optlen
+specifies the length of the congestion-control algorithm name
+contained in the buffer pointed to by
+.IR optval ;
+this length need not include any terminating null byte.
+The algorithm "reno" is always permitted;
+other algorithms may be available, depending on kernel configuration.
+Possible errors from
+.BR setsockopt ()
+include:
+algorithm not found/available
+.RB ( ENOENT );
+setting this algorithm requires the
+.B CAP_NET_ADMIN
+capability
+.RB ( EPERM );
+and failure getting kernel module
+.RB ( EBUSY ).
+.I
 .TP
 .B TCP_CORK
 If set, don't send out partial frames.


--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: TCP_CONGESTION documentation
       [not found] ` <4926DC7B.7020203-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2008-11-21 16:08   ` Michael Kerrisk
  2008-11-21 20:42     ` Andi Kleen
  2008-11-21 19:57   ` Stephen Hemminger
  1 sibling, 1 reply; 12+ messages in thread
From: Michael Kerrisk @ 2008-11-21 16:08 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: David Miller, linux-net-u79uwXL29TY76Z2rM5mHXA,
	linux-man-u79uwXL29TY76Z2rM5mHXA, Michael Kerrisk, Andi Kleen

[CC+= Andi, this time with the right address]

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
git://git.kernel.org/pub/scm/docs/man-pages/man-pages.git
man-pages online: http://www.kernel.org/doc/man-pages/online_pages.html
Found a bug? http://www.kernel.org/doc/man-pages/reporting_bugs.html

* Re: TCP_CONGESTION documentation
       [not found] ` <4926DC7B.7020203-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2008-11-21 16:08   ` Michael Kerrisk
@ 2008-11-21 19:57   ` Stephen Hemminger
  2008-11-21 20:32     ` Michael Kerrisk
  1 sibling, 1 reply; 12+ messages in thread
From: Stephen Hemminger @ 2008-11-21 19:57 UTC (permalink / raw)
  Cc: David Miller, linux-net-u79uwXL29TY76Z2rM5mHXA, Andi Kleen,
	linux-man-u79uwXL29TY76Z2rM5mHXA, Michael Kerrisk

On Fri, 21 Nov 2008 11:06:19 -0500
Michael Kerrisk <mtk.manpages-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org> wrote:

> Hello Stephen,
> 
> Back in 2.6.13, you added the TCP_CONGESTION sockopt, but
> provided no man-page patch...
> 
> Below is my attempt to document this sockopt.  Could you please
> review.  Please don't assume I've well understood the code: I
> may well have messed up in my reading of it, so review what
> I've written with care.
> 
> Also, a question: was the silent truncation of the returned
> string on getsockopt() if optlen is too small really intended?
> Would it not be/have been better to error on this case?
> 
> Cheers,
> 
> Michael
> 
> 
>    TCP_CONGESTION (since Linux 2.6.13)
>       Get  or  set  the  congestion-control algorithm for this
>       socket.   The  optval  argument  is  a  pointer   to   a
>       character-string buffer.
> 
>       For  getsockopt()  *optlen specifies the amount of space
>       available in the buffer  pointed  to  by  optval,  which
>       should  be  at  least  16  bytes (defined by the kernel-
>       internal  constant  TCP_CA_NAME_MAX).   On  return,  the
>       buffer  pointed to by optval is set to a null-terminated
>       string containing the  name  of  the  congestion-control
>       algorithm  for  this  socket,  and *optlen is set to the
>       minimum of its original value and  TCP_CA_NAME_MAX.   If
>       the  value  passed  in  *optlen  is  too small, then the
>       string returned in *optval is silently truncated, and no
>       terminating  null  byte is added.  If an empty string is
>       returned, then the socket is using the  default  conges-
>       tion-control  algorithm,  determined  as described under
>       tcp_congestion_control above.
> 
>       For setsockopt() optlen specifies the length of the con-
>       gestion-control  algorithm  name contained in the buffer
>       pointed to by optval; this length need not  include  any
>       terminating  null  byte.  The algorithm "reno" is always
>       permitted; other algorithms may be available,  depending
>       on  kernel configuration.  Possible errors from setsock-
>       opt() include: algorithm not  found/available  (ENOENT);
>       setting  this algorithm requires the CAP_NET_ADMIN capa-
>       bility  (EPERM);  and  failure  getting  kernel   module
>       (EBUSY).

The tcp(7) man page is related and seems out of date as well.
At least on this system (Ubuntu 8.04), it seems to be stuck back in pre 2.6.12
timewarp (see tcp_bic, tcp_bic_low_window, ...) values that no
longer exist. Should be updated as well.

* Re: TCP_CONGESTION documentation
  2008-11-21 19:57   ` Stephen Hemminger
@ 2008-11-21 20:32     ` Michael Kerrisk
  2008-11-21 20:34       ` Michael Kerrisk
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Kerrisk @ 2008-11-21 20:32 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, linux-net, Andi Kleen, linux-man

Hi Stephen,

On Fri, Nov 21, 2008 at 2:57 PM, Stephen Hemminger
<shemminger@linux-foundation.org> wrote:
> On Fri, 21 Nov 2008 11:06:19 -0500
> Michael Kerrisk <mtk.manpages@googlemail.com> wrote:
>
>> Hello Stephen,
>>
>> Back in 2.6.13, you added the TCP_CONGESTION sockopt, but
>> provided no man-page patch...
>>
>> Below is my attempt to document this sockopt.  Could you please
>> review.  Please don't assume I've well understood the code: I
>> may well have messed up in my reading of it, so review what
>> I've written with care.
>>
>> Also, a question: was the silent truncation of the returned
>> string on getsockopt() if optlen is too small really intended?
>> Would it not be/have been better to error on this case?

You added some other stuff below, but I got no response to the review
request and the question above?

>>    TCP_CONGESTION (since Linux 2.6.13)
>>       Get  or  set  the  congestion-control algorithm for this
>>       socket.   The  optval  argument  is  a  pointer   to   a
>>       character-string buffer.
>>
>>       For  getsockopt()  *optlen specifies the amount of space
>>       available in the buffer  pointed  to  by  optval,  which
>>       should  be  at  least  16  bytes (defined by the kernel-
>>       internal  constant  TCP_CA_NAME_MAX).   On  return,  the
>>       buffer  pointed to by optval is set to a null-terminated
>>       string containing the  name  of  the  congestion-control
>>       algorithm  for  this  socket,  and *optlen is set to the
>>       minimum of its original value and  TCP_CA_NAME_MAX.   If
>>       the  value  passed  in  *optlen  is  too small, then the
>>       string returned in *optval is silently truncated, and no
>>       terminating  null  byte is added.  If an empty string is
>>       returned, then the socket is using the  default  conges-
>>       tion-control  algorithm,  determined  as described under
>>       tcp_congestion_control above.
>>
>>       For setsockopt() optlen specifies the length of the con-
>>       gestion-control  algorithm  name contained in the buffer
>>       pointed to by optval; this length need not  include  any
>>       terminating  null  byte.  The algorithm "reno" is always
>>       permitted; other algorithms may be available,  depending
>>       on  kernel configuration.  Possible errors from setsock-
>>       opt() include: algorithm not  found/available  (ENOENT);
>>       setting  this algorithm requires the CAP_NET_ADMIN capa-
>>       bility  (EPERM);  and  failure  getting  kernel   module
>>       (EBUSY).

> The tcp(7) man page is related

(I'm a little confused by that remark: the patch at the end of my mail
was *for* the tcp(7) man page; maybe you missed that.)

> and seems out of date as well.

Yes, many things on it are out of date.  (I've made occasional
requests to linux-net for help on this point, but have had little
response.)

> At least on this system (Ubuntu 8.04), it seems to be stuck back in pre 2.6.12

Yes, 2.6.12 sounds about right.

> timewarp (see tcp_bic, tcp_bic_low_window, ...) values that no
> longer exist.

And when they disappear, no one CCs the man-page maintainer :-(.

Anyway, thanks for the heads up

tcp_bic* is now fixed (disappeared in 2.6.13)
and
tcp_westwood is now fixed (disappeared in 2.6.13)
and
tcp_vegas_cong_avoid is now fixed (disappeared in 2.6.13)

> Should be updated as well.

I'm working on it.  Many updates to tcp(7) will be in man-pages-3.14.

Cheers,

Michael


* Re: TCP_CONGESTION documentation
  2008-11-21 20:32     ` Michael Kerrisk
@ 2008-11-21 20:34       ` Michael Kerrisk
  0 siblings, 0 replies; 12+ messages in thread
From: Michael Kerrisk @ 2008-11-21 20:34 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, linux-net, linux-man, Andi Kleen

Bother.  CC += Andi again.


* Re: TCP_CONGESTION documentation
  2008-11-21 16:08   ` Michael Kerrisk
@ 2008-11-21 20:42     ` Andi Kleen
       [not found]       ` <20081121204210.GG6703-qrUzlfsMFqo/4alezvVtWx2eb7JE58TQ@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Andi Kleen @ 2008-11-21 20:42 UTC (permalink / raw)
  To: mtk.manpages
  Cc: Stephen Hemminger, David Miller, linux-net, linux-man,
	Michael Kerrisk, Andi Kleen

On Fri, Nov 21, 2008 at 11:08:22AM -0500, Michael Kerrisk wrote:
> [CC+= Andi, this time with the right address]

Just a general comment. The initial DESCRIPTION in tcp should
probably be adapted to mention that Linux has pluggable
congestion avoidance modules now and also that the defaults
have changed (from NewReno to CUBIC etc.)
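
For readers who want to see which algorithm a given system defaults to,
here is a rough C sketch that reads a sysctl value through /proc.  The
helper name read_sysctl_line() is hypothetical, and the paths in the note
below assume the usual /proc/sys/net/ipv4 layout.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read the first line of a /proc sysctl file into
   buf, stripping the trailing newline.
   Returns 0 on success, -1 if the file cannot be opened or read. */
static int read_sysctl_line(const char *path, char *buf, size_t size)
{
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    if (fgets(buf, (int)size, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

Calling it on /proc/sys/net/ipv4/tcp_congestion_control should give the
system-wide default; on kernels that provide it,
/proc/sys/net/ipv4/tcp_available_congestion_control lists the algorithms
currently available.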

-Andi


* Re: TCP_CONGESTION documentation
       [not found]       ` <20081121204210.GG6703-qrUzlfsMFqo/4alezvVtWx2eb7JE58TQ@public.gmane.org>
@ 2008-11-21 20:44         ` Michael Kerrisk
       [not found]           ` <cfd18e0f0811211244v5d391f8du3380332a721ed33-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Kerrisk @ 2008-11-21 20:44 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Stephen Hemminger, David Miller, linux-net-u79uwXL29TY76Z2rM5mHXA,
	linux-man-u79uwXL29TY76Z2rM5mHXA

Hi Andi,

On Fri, Nov 21, 2008 at 3:42 PM, Andi Kleen <andi-Vw/NltI1exuRpAAqCnN02g@public.gmane.org> wrote:
> On Fri, Nov 21, 2008 at 11:08:22AM -0500, Michael Kerrisk wrote:
>> [CC+= Andi, this time with the right address]
>
> Just a general comment. The initial DESCRIPTION in tcp should
> probably be adapted to mention that Linux has pluggable
> congestion avoidance modules now and also that the defaults
> have changed (from NewReno to CUBIC etc.)

If I try to do this, I'm going to create rubbish, because I know next
to nothing about these details...

Could I ask a favor?  Below is the DESCRIPTION text.  Could you
write some sentences in the rough location where you think they belong?
I will then turn that into a *roff patch.

Thanks,

Michael

       This  is an implementation of the TCP protocol defined in
       RFC 793, RFC 1122 and RFC 2001 with the NewReno and  SACK
       extensions.   It  provides  a  reliable, stream-oriented,
       full-duplex connection between  two  sockets  on  top  of
       ip(7),  for both v4 and v6 versions.  TCP guarantees that
       the data arrives in order and retransmits  lost  packets.
       It  generates  and  checks a per-packet checksum to catch
       transmission errors.  TCP does not preserve record bound-
       aries.

       A newly created TCP socket has no remote or local address
       and is not fully specified.  To create  an  outgoing  TCP
       connection  use  connect(2)  to establish a connection to
       another TCP socket.  To receive new incoming connections,
       first  bind(2) the socket to a local address and port and
       then call listen(2) to put the socket into the  listening
       state.  After that a new socket for each incoming connec-
       tion can be accepted using accept(2).  A socket which has
       had  accept(2) or connect(2) successfully called on it is
       fully specified and may transmit data.   Data  cannot  be
       transmitted on listening or not yet connected sockets.
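
The bind(2)/listen(2)/connect(2)/accept(2) sequence in the paragraph
above can be sketched over the loopback interface as follows.  The helper
names listen_loopback() and loopback_roundtrip() are hypothetical, and
the sketch assumes 127.0.0.1 is reachable; it is an illustration rather
than text for the page.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind a TCP socket to 127.0.0.1 on a kernel-chosen
   port and put it into the listening state.  The bound address is stored
   in *addr.  Returns the listening descriptor, or -1 on error. */
static int listen_loopback(struct sockaddr_in *addr)
{
    socklen_t len = sizeof(*addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1)
        return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = 0;                       /* let the kernel pick */
    if (bind(fd, (struct sockaddr *)addr, sizeof(*addr)) == -1 ||
        listen(fd, 1) == -1 ||
        getsockname(fd, (struct sockaddr *)addr, &len) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Connect to addr, accept the connection on lfd, and pass one byte from
   the connecting side to the accepted side.  Returns 0 on success. */
static int loopback_roundtrip(int lfd, const struct sockaddr_in *addr)
{
    char sent = 'x', got = 0;
    int afd = -1, ok = -1;
    int cfd = socket(AF_INET, SOCK_STREAM, 0);

    if (cfd == -1)
        return -1;
    if (connect(cfd, (const struct sockaddr *)addr, sizeof(*addr)) == 0 &&
        (afd = accept(lfd, NULL, NULL)) != -1 &&
        write(cfd, &sent, 1) == 1 &&
        read(afd, &got, 1) == 1 && got == sent)
        ok = 0;
    if (afd != -1)
        close(afd);
    close(cfd);
    return ok;
}
```

Both the connected and the accepted socket are fully specified at the
point where data is written; the listening socket itself never carries
data.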

       Linux  supports RFC 1323 TCP high performance extensions.
       These include Protection Against Wrapped Sequence Numbers
       (PAWS),  Window  Scaling  and Timestamps.  Window scaling
       allows the use of large (> 64K) TCP windows in  order  to
       support  links  with  high latency or bandwidth.  To make
       use of them, the send and receive buffer  sizes  must  be
       increased.    They   can   be   set   globally  with  the
       /proc/sys/net/ipv4/tcp_wmem                           and
       /proc/sys/net/ipv4/tcp_rmem files, or on individual sock-
       ets by using the SO_SNDBUF and SO_RCVBUF  socket  options
       with the setsockopt(2) call.

       The  maximum  sizes  for  socket buffers declared via the
       SO_SNDBUF and SO_RCVBUF mechanisms  are  limited  by  the
       values    in    the    /proc/sys/net/core/rmem_max    and
       /proc/sys/net/core/wmem_max files.  Note that  TCP  actu-
       ally  allocates twice the size of the buffer requested in
       the setsockopt(2) call, and so a succeeding getsockopt(2)
       call will not return the same size of buffer as requested
       in the setsockopt(2) call.  TCP uses the extra space  for
       administrative  purposes  and internal kernel structures,
       and the /proc file values reflect the larger  sizes  com-
       pared  to  the actual TCP windows.  On individual connec-
       tions, the socket buffer size must be set  prior  to  the
       listen(2)  or  connect(2)  calls in order to have it take
       effect.  See socket(7) for more information.
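
The difference between the requested and the reported buffer size can be
observed with a sketch like the one below.  The helper name
set_and_get_rcvbuf() is hypothetical; the exact value reported depends on
the kernel and on the rmem_max limit, so no particular output is claimed.

```c
#include <sys/socket.h>

/* Hypothetical helper: request a receive buffer of `requested` bytes on
   fd, then return the size that a following getsockopt() reports.
   Returns -1 on error. */
static int set_and_get_rcvbuf(int fd, int requested)
{
    int actual = 0;
    socklen_t len = sizeof(actual);

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                   &requested, sizeof(requested)) == -1)
        return -1;
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &actual, &len) == -1)
        return -1;
    return actual;   /* typically larger than `requested` on Linux */
}
```

As the text says, the call belongs before listen(2) or connect(2) if the
new size is to take effect for the connection.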

       TCP supports urgent data.  Urgent data is used to  signal
       the  receiver  that some important message is part of the
       data stream and that it should be processed  as  soon  as
       possible.  To send urgent data specify the MSG_OOB option
       to send(2).  When urgent data  is  received,  the  kernel
       sends  a  SIGURG  signal  to the process or process group
       that has been set as the socket "owner" using the SIOCSP-
       GRP  or  FIOSETOWN  ioctls (or the POSIX.1-2001-specified
       fcntl(2)  F_SETOWN  operation).   When  the  SO_OOBINLINE
       socket  option  is  enabled,  urgent data is put into the
       normal data stream (a program can test for  its  location
       using the SIOCATMARK ioctl described below), otherwise it
       can be only received when the MSG_OOB  flag  is  set  for
       recv(2) or recvmsg(2).

       Linux  2.4  introduced  a  number of changes for improved
       throughput and scaling, as well as  enhanced  functional-
       ity.   Some  of  these features include support for zero-
       copy sendfile(2), Explicit Congestion  Notification,  new
       management   of   TIME_WAIT  sockets,  keep-alive  socket
       options and support for Duplicate SACK extensions.




* Re: TCP_CONGESTION documentation
       [not found]           ` <cfd18e0f0811211244v5d391f8du3380332a721ed33-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2008-11-21 21:16             ` Andi Kleen
  2008-11-22  7:39             ` Stephen Hemminger
  1 sibling, 0 replies; 12+ messages in thread
From: Andi Kleen @ 2008-11-21 21:16 UTC (permalink / raw)
  To: mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w
  Cc: Andi Kleen, Stephen Hemminger, David Miller,
	linux-net-u79uwXL29TY76Z2rM5mHXA,
	linux-man-u79uwXL29TY76Z2rM5mHXA

On Fri, Nov 21, 2008 at 03:44:05PM -0500, Michael Kerrisk wrote:
> Hi Andi,
> 
> On Fri, Nov 21, 2008 at 3:42 PM, Andi Kleen <andi-Vw/NltI1exuRpAAqCnN02g@public.gmane.org> wrote:
> > On Fri, Nov 21, 2008 at 11:08:22AM -0500, Michael Kerrisk wrote:
> >> [CC+= Andi, this time with the right address]
> >
> > Just a general comment. The initial DESCRIPTION in tcp should
> > probably be adapted to mention that Linux has pluggable
> > congestion avoidance modules now and also that the defaults
> > have changed (from NewReno to CUBIC etc.)
> 
> If I try to do this, I'm going to create rubbish, because I know next
> to nothing about these details...
> 
> Could I ask a favor?  Below is the DESCRIPTION text.  Could you
> write some sentences in the rough location where you think they belong?
> I will then turn that into a *roff patch.

It would be better if Stephen or David did that. They have kept better
track of it than I have. In the worst case I could try too.

-Andi

* Re: TCP_CONGESTION documentation
       [not found]           ` <cfd18e0f0811211244v5d391f8du3380332a721ed33-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2008-11-21 21:16             ` Andi Kleen
@ 2008-11-22  7:39             ` Stephen Hemminger
  2008-11-22 14:56               ` Andi Kleen
  1 sibling, 1 reply; 12+ messages in thread
From: Stephen Hemminger @ 2008-11-22  7:39 UTC (permalink / raw)
  To: mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w
  Cc: mtk.manpages-gM/Ye1E23mwN+BqQ9rBEUg, Andi Kleen, David Miller,
	linux-net-u79uwXL29TY76Z2rM5mHXA,
	linux-man-u79uwXL29TY76Z2rM5mHXA

Does this help get it started in the right direction?
--------------------------------------------

       This  is an implementation of the TCP protocol defined in
       RFC 793, RFC 1122 and RFC 2001 with the NewReno and  SACK
       extensions.   It  provides  a  reliable, stream-oriented,
       full-duplex connection between  two  sockets  on  top  of
       ip(7),  for both v4 and v6 versions.  TCP guarantees that
       the data arrives in order and retransmits  lost  packets.
       It  generates  and  checks a per-packet checksum to catch
       transmission errors.  TCP does not preserve record bound-
       aries.

       A newly created TCP socket has no remote or local address
       and is not fully specified.  To create  an  outgoing  TCP
       connection  use  connect(2)  to establish a connection to
       another TCP socket.  To receive new incoming connections,
       first  bind(2) the socket to a local address and port and
       then call listen(2) to put the socket into the  listening
       state.  After that a new socket for each incoming connec-
       tion can be accepted using accept(2).  A socket which has
       had  accept(2) or connect(2) successfully called on it is
       fully specified and may transmit data.   Data  cannot  be
       transmitted on listening or not yet connected sockets.
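
For illustration only, here is a minimal sketch of the call sequence
described above, run over the loopback interface. The helper name
loopback_roundtrip() is made up for this example, and error handling is
reduced to the bare minimum.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send msg over a loopback TCP connection and
   return the number of bytes the accepting side received, or -1. */
ssize_t loopback_roundtrip(const char *msg, char *buf, size_t buflen)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    ssize_t n = -1;
    int lfd = -1, cfd = -1, afd = -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;              /* let the kernel pick a free port */

    /* Passive side: bind(2), then listen(2); getsockname(2) tells us
       which port was chosen. */
    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd == -1 ||
        bind(lfd, (struct sockaddr *) &addr, sizeof(addr)) == -1 ||
        listen(lfd, 1) == -1 ||
        getsockname(lfd, (struct sockaddr *) &addr, &alen) == -1)
        goto out;

    /* Active side: a successful connect(2) fully specifies the socket. */
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd == -1 ||
        connect(cfd, (struct sockaddr *) &addr, sizeof(addr)) == -1)
        goto out;

    /* accept(2) returns a new, fully specified socket. */
    afd = accept(lfd, NULL, NULL);
    if (afd == -1)
        goto out;

    /* TCP does not preserve record boundaries, so in general recv(2)
       may return fewer bytes than were sent in one call. */
    if (send(cfd, msg, strlen(msg), 0) == (ssize_t) strlen(msg))
        n = recv(afd, buf, buflen, 0);

out:
    if (afd != -1)
        close(afd);
    if (cfd != -1)
        close(cfd);
    if (lfd != -1)
        close(lfd);
    return n;
}
```

A caller would pass a short message and a receive buffer, for example
loopback_roundtrip("hello", buf, sizeof(buf)).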

[move buffering ahead of 1323 stuff - more important]

Socket buffers on Linux are automatically tuned by the TCP
stack based on available memory and the throughput of the
socket. The starting value and upper bound of buffer tuning
are determined by tcp_rmem (for receiving) and tcp_wmem (for sending),
as described in the Sysctls section. The buffer sizes can be
fixed with the SO_SNDBUF and SO_RCVBUF mechanisms.

       Note that TCP actually allocates twice the size of the
       buffer requested in the setsockopt(2) call, and so a
       succeeding getsockopt(2) call will not return the same
       size of buffer as requested in the setsockopt(2) call.
       TCP uses the extra space for administrative purposes and
       internal kernel structures, and the /proc file values
       reflect the larger sizes compared to the actual TCP
       windows.  The maximum sizes for socket buffers declared
       via the SO_SNDBUF and SO_RCVBUF mechanisms are limited
       by the net.core.rmem_max and net.core.wmem_max sysctls.

       On individual connec-
       tions, the socket buffer size must be set  prior  to  the
       listen(2)  or  connect(2)  calls in order to have it take
       effect.  See socket(7) for more information.
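
A small sketch of the behavior described above: request a receive buffer
size before connecting, then ask the kernel what it actually set. The
helper name rcvbuf_after_request() is made up here, and the exact
relationship between the requested and reported sizes should be treated
as an implementation detail rather than something to rely on.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: request a receive buffer size on a fresh TCP
   socket and return what getsockopt(2) reports afterwards, or -1. */
int rcvbuf_after_request(int requested)
{
    int effective = -1;
    socklen_t len = sizeof(effective);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1)
        return -1;

    /* The size must be set before listen(2) or connect(2) to take
       effect on the connection. */
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                   &requested, sizeof(requested)) == -1 ||
        getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &effective, &len) == -1)
        effective = -1;

    close(fd);
    return effective;
}
```

On Linux the reported value is normally larger than the request, and the
request itself is capped by net.core.rmem_max.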
      
       Linux  supports RFC 1323 TCP high performance extensions.
       These include Protection Against Wrapped Sequence Numbers
       (PAWS),  Window  Scaling  and Timestamps.  Window scaling
       allows the use of large (> 64K) TCP windows in  order  to
       support  links  with  high latency or bandwidth. 

       TCP supports urgent data.  Urgent data is used to  signal
       the  receiver  that some important message is part of the
       data stream and that it should be processed  as  soon  as
       possible.  To send urgent data specify the MSG_OOB option
       to send(2).  When urgent data  is  received,  the  kernel
       sends  a  SIGURG  signal  to the process or process group
       that has been set as the socket "owner" using the SIOCSP-
       GRP  or  FIOSETOWN  ioctls (or the POSIX.1-2001-specified
       fcntl(2)  F_SETOWN  operation).   When  the  SO_OOBINLINE
       socket  option  is  enabled,  urgent data is put into the
       normal data stream (a program can test for  its  location
       using the SIOCATMARK ioctl described below), otherwise it
       can be only received when the MSG_OOB  flag  is  set  for
       recv(2) or recvmsg(2).

Linux supports multiple congestion control algorithms. The default
algorithm is selected by the net.ipv4.tcp_congestion_control sysctl.
This value can be overridden per socket with the TCP_CONGESTION socket
option. The set of available algorithms varies between releases as
more are added, and depends on the configuration choices made when
the kernel was built. The list of congestion control algorithms
currently loaded is in net.ipv4.tcp_available_congestion_control.
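
A minimal sketch of the per-socket override, assuming Linux and a libc
whose <netinet/tcp.h> defines TCP_CONGESTION. The helper names and the
CA_NAME_BUF constant are made up for this example; the size 16 is taken
from the kernel-internal TCP_CA_NAME_MAX mentioned earlier in this
thread, which is not exported to user space.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Assumed buffer size, mirroring the kernel-internal TCP_CA_NAME_MAX. */
#define CA_NAME_BUF 16

/* Hypothetical helper: copy the socket's congestion-control algorithm
   name into name, always null-terminated.  Returns 0, or -1 on error. */
int get_congestion(int fd, char name[CA_NAME_BUF])
{
    /* Offer one byte less than the buffer holds, so that a silently
       truncated result still ends in a null byte. */
    socklen_t len = CA_NAME_BUF - 1;

    memset(name, 0, CA_NAME_BUF);
    return getsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, name, &len);
}

/* Hypothetical helper: override the system-wide default for this
   socket only.  Returns 0, or -1 on error. */
int set_congestion(int fd, const char *name)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION,
                      name, strlen(name));
}
```

A program would typically call set_congestion() with one of the names
listed in net.ipv4.tcp_available_congestion_control and be prepared for
the call to fail if that algorithm is not available.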

* Re: TCP_CONGESTION documentation
  2008-11-22  7:39             ` Stephen Hemminger
@ 2008-11-22 14:56               ` Andi Kleen
       [not found]                 ` <20081122145632.GQ6703-qrUzlfsMFqo/4alezvVtWx2eb7JE58TQ@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Andi Kleen @ 2008-11-22 14:56 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: mtk.manpages, mtk.manpages, Andi Kleen, David Miller, linux-net,
	linux-man

On Fri, Nov 21, 2008 at 11:39:18PM -0800, Stephen Hemminger wrote:
> Does this help get it started in right direction??

Yes.

> --------------------------------------------
> 
>        This  is an implementation of the TCP protocol defined in
>        RFC 793, RFC 1122 and RFC 2001 with the NewReno and  SACK
>        extensions.   It  provides  a  reliable, stream-oriented,

Perhaps drop NewReno, it's really obsolete because Linux is so 
far beyond.

> 
>        Note that  TCP  actually  allocates twice the size of the
>        buffer requested in

The "twice" is obsolete, it's far more complicated now.
So it should be just "more" I think

>        socket  option  is  enabled,  urgent data is put into the
>        normal data stream (a program can test for  its  location
>        using the SIOCATMARK ioctl described below), otherwise it
>        can be only received when the MSG_OOB  flag  is  set  for
>        recv(2) or recvmsg(2).
> 
> Linux supports multiple different congestion control
> algorithms. The default choice of congestion control is controlled
> by net.ipv4.tcp_congestion_control sysctl. This value can
> be overridden by TCP_CONGESTION socket option. The actual choices
> of congestion control available vary between releases
> as more are added, and depend on the configuration choices

Hmm perhaps mention the current standard default?

> made when the kernel was built. The list of congestion control
> protocols currently loaded is in net.ipv4.tcp_available_congestion_control.

Best would be probably to have a manpage for each of them, but I'm
not going to write them :)

-Andi

-- 
ak@linux.intel.com

* Re: TCP_CONGESTION documentation
       [not found]                 ` <20081122145632.GQ6703-qrUzlfsMFqo/4alezvVtWx2eb7JE58TQ@public.gmane.org>
@ 2008-11-23  6:34                   ` Stephen Hemminger
  2008-11-23 20:06                     ` Andi Kleen
  0 siblings, 1 reply; 12+ messages in thread
From: Stephen Hemminger @ 2008-11-23  6:34 UTC (permalink / raw)
  Cc: mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w,
	mtk.manpages-gM/Ye1E23mwN+BqQ9rBEUg, Andi Kleen, David Miller,
	linux-net-u79uwXL29TY76Z2rM5mHXA,
	linux-man-u79uwXL29TY76Z2rM5mHXA

On Sat, 22 Nov 2008 15:56:32 +0100
Andi Kleen <andi-Vw/NltI1exuRpAAqCnN02g@public.gmane.org> wrote:

> On Fri, Nov 21, 2008 at 11:39:18PM -0800, Stephen Hemminger wrote:
> > Does this help get it started in right direction??
> 
> Yes.
> 
> > --------------------------------------------
> > 
> >        This  is an implementation of the TCP protocol defined in
> >        RFC 793, RFC 1122 and RFC 2001 with the NewReno and  SACK
> >        extensions.   It  provides  a  reliable, stream-oriented,
> 
> Perhaps drop NewReno, it's really obsolete because Linux is so 
> far beyond.
> 
> > 
> >        Note that  TCP  actually  allocates twice the size of the
> >        buffer requested in
> 
> The "twice" is obsolete, it's far more complicated now.
> So it should be just "more" I think
> 
> >        socket  option  is  enabled,  urgent data is put into the
> >        normal data stream (a program can test for  its  location
> >        using the SIOCATMARK ioctl described below), otherwise it
> >        can be only received when the MSG_OOB  flag  is  set  for
> >        recv(2) or recvmsg(2).
> > 
> > Linux supports multiple different congestion control
> > algorithms. The default choice of congestion control is controlled
> > by net.ipv4.tcp_congestion_control sysctl. This value can
> > be overridden by TCP_CONGESTION socket option. The actual choices
> > of congestion control available vary between releases
> > as more are added, and depend on the configuration choices
> 
> Hmm perhaps mention the current standard default?

There is no "standard default"; it is a kernel config option.
And it may change in the future, so writing it into the manual page
seems short-sighted.

* Re: TCP_CONGESTION documentation
  2008-11-23  6:34                   ` Stephen Hemminger
@ 2008-11-23 20:06                     ` Andi Kleen
  0 siblings, 0 replies; 12+ messages in thread
From: Andi Kleen @ 2008-11-23 20:06 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Andi Kleen, mtk.manpages, mtk.manpages, David Miller, linux-net,
	linux-man

> There is no "standard default"; it is a kernel config option.

There is one "default y" and a very strong suggestion
in Kconfig.  I bet 99+% of users use that, which means CUBIC now,
since kernel version number XX.YY (I forgot)

Also the "depends on kernel config option" argument seems a poor
one. A lot of things can be disabled with uncommon CONFIG options,
but the man pages still document the standard defaults used
by nearly all people.

> And it may change in the future, so writing it into the manual page
> seems short-sighted.

That just means that users will never know.

-Andi

-- 
ak@linux.intel.com
