* [patch] ip_local_port_range sysctl has annoying default
From: Mark Glines @ 2007-05-12 19:40 UTC
To: netdev; +Cc: davem, kuznet, jmorris, kaber
(resending to netdev and copying maintainers, at Alan Cox's suggestion. Thanks Alan!)
On Sat, 12 May 2007 12:12:38 -0700 "H. Peter Anvin" <hpa@zytor.com> wrote:
> Mark Glines wrote:
> >
> > Well, in that case, is there anything wrong with just using the
> > range IANA recommends, in all cases?
> >
>
> I think the IANA range is considered too small in most cases; I
> suspect there is also a feeling that "there be dragons" near the very
> top.
Ok, thanks for the explanation. Sounds like we're using high port
numbers in the "spirit" of the IANA recommendation, without using
their actual numbers.
I still haven't gotten an answer to this: is there a performance
issue (or memory usage or security or something) with using the same
port range in all cases, even on memory-constrained systems (or whatever
it is that determines the bind hash size)? And if there is, can't we
*still* use big numbers, even if the range isn't as wide?
If there's no reason not to (security, resource consumption,
whatever), I think it would be an improvement to use high, out of the
way port numbering in all cases. (Especially since the kernel already
does this on most of my machines, anyway.)
There was a comment in there about how 32768-61000 should be used on
high-use systems; is there a drawback to just using this range
*everywhere*? (It's already the default in non-memory-constrained
cases, because of what tcp_init() was doing.)
Thanks,
Signed-off-by: Mark Glines <mark@glines.org>
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 43fb160..12d9ddc 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -29,12 +29,7 @@ const char inet_csk_timer_bug_msg[] = "inet_csk BUG: unknown timer value\n";
 EXPORT_SYMBOL(inet_csk_timer_bug_msg);
 #endif
 
-/*
- * This array holds the first and last local port number.
- * For high-usage systems, use sysctl to change this to
- * 32768-61000
- */
-int sysctl_local_port_range[2] = { 1024, 4999 };
+int sysctl_local_port_range[2] = { 32768, 61000 };
 
 int inet_csk_bind_conflict(const struct sock *sk,
 			   const struct inet_bind_bucket *tb)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index bd4c295..33ef0e7 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2465,13 +2465,10 @@ void __init tcp_init(void)
 			order++)
 		;
 	if (order >= 4) {
-		sysctl_local_port_range[0] = 32768;
-		sysctl_local_port_range[1] = 61000;
 		tcp_death_row.sysctl_max_tw_buckets = 180000;
 		sysctl_tcp_max_orphans = 4096 << (order - 4);
 		sysctl_max_syn_backlog = 1024;
 	} else if (order < 3) {
-		sysctl_local_port_range[0] = 1024 * (3 - order);
 		tcp_death_row.sysctl_max_tw_buckets >>= (3 - order);
 		sysctl_tcp_max_orphans >>= (3 - order);
 		sysctl_max_syn_backlog = 128;
Mark
* Re: [patch] ip_local_port_range sysctl has annoying default
From: Rick Jones @ 2007-05-14 17:08 UTC
To: Mark Glines; +Cc: netdev, davem, kuznet, jmorris, kaber
Mark Glines wrote:
> (resending to netdev and copying maintainers, at Alan Cox's suggestion. Thanks Alan!)
> On Sat, 12 May 2007 12:12:38 -0700 "H. Peter Anvin" <hpa@zytor.com> wrote:
>
>
>>Mark Glines wrote:
>>
>>>Well, in that case, is there anything wrong with just using the
>>>range IANA recommends, in all cases?
>>>
>>
>>I think the IANA range is considered too small in most cases; I
>>suspect there is also a feeling that "there be dragons" near the very
>>top.
About the only dragons which come to mind would be the very old, decrepit,
barely able to puff wisps of steam let alone fire, dragons with the high-order
bit set that would be misinterpreted by those treating port numbers as a short
rather than an unsigned short.
> Ok, thanks for the explanation. Sounds like we're using high port
> numbers in the "spirit" of the IANA recommendation, without using
> their actual numbers.
>
> I still haven't gotten an answer to this: is there a performance
> issue (or memory usage or security or something) with using the same
> port range in all cases, even on memory-constrained systems (or whatever
> it is that determines the bind hash size)? And if there is, can't we
> *still* use big numbers, even if the range isn't as wide?
>
> If there's no reason not to (security, resource consumption,
> whatever), I think it would be an improvement to use high, out of the
> way port numbering in all cases. (Especially since the kernel already
> does this on most of my machines, anyway.)
>
> There was a comment in there about how 32768-61000 should be used on
> high-use systems; is there a drawback to just using this range
> *everywhere*? (It's already the default in non-memory-constrained
> cases, because of what tcp_init() was doing.)
I would think that a "high use system" would probably want even more than
32768-61000. Where the size of the anonymous/ephemeral port space seems to
come up most often (in my experience thus far) is in situations where someone is
churning through lots of connections at a time. They probably want something
more like 5000-65535.
Frankly, such applications probably ought (again IMO) to be making explicit
bind() calls to pick local port numbers in that range, just as decades-old web
server benchmarks do.
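As a rough illustration of the explicit bind() approach, here is a minimal sketch in C. The helper name bind_in_range and the choice of INADDR_ANY are assumptions made for this example, not anything from the thread; it simply walks a caller-chosen range until bind() succeeds, and it treats every bind() failure as "try the next port".

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not from the thread): try each local port in the
 * inclusive range [lo, hi] until bind() succeeds.
 * Returns the bound port, or -1 if every port in the range was refused.
 */
int bind_in_range(int fd, uint16_t lo, uint16_t hi)
{
	struct sockaddr_in addr;
	uint32_t port;	/* wider than 16 bits so hi == 65535 cannot wrap */

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_ANY);

	for (port = lo; port <= hi; port++) {
		addr.sin_port = htons((uint16_t)port);
		if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
			return (int)port;
	}
	return -1;
}
```

A client would call this on a fresh socket before connect(), for example bind_in_range(fd, 5000, 65535), and fall back to the kernel's own choice if it returns -1.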
One nice thing about 49152-65535 is that if you have an application with a
busted loop, it will "only" absorb 16K ports before it starts to fail. Still
and all, not necessarily a big deal.
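For reference, the arithmetic behind "16K ports" and the other ranges mentioned in this thread can be checked with a tiny helper. The function name port_range_size is made up for this sketch; ranges are treated as inclusive on both ends.

```c
/*
 * Hypothetical helper: number of ports in the inclusive range
 * [first, last], or 0 if the range is empty.
 */
int port_range_size(int first, int last)
{
	return last >= first ? last - first + 1 : 0;
}
```

With this, 49152-65535 gives 16384 ports, 32768-61000 gives 28233, and the small-memory default of 1024-4999 gives 3976.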
Oddly enough, it seems that on a system with a 2.6.21.1 kernel, the 32768-61000
is already there:
hpcpc102:~# sysctl -a | grep port
error: permission denied on key 'net.ipv4.route.flush'
net.ipv4.ip_local_port_range = 32768 61000
I cannot imagine there is anything "safer" about 61000 than 65535. They both
have that "sign-bit" set.
While it is "security through obscurity", having the same default port range as
other platforms would, I suppose, make fingerprinting just a little bit more
difficult.
random thoughts,
rick jones
Solaris:
# ndd /dev/tcp tcp_smallest_anon_port
32768
# ndd /dev/tcp tcp_largest_anon_port
65535
# uname -a
SunOS competitive10 5.10 Generic_118833-36 sun4v sparc SUNW,Sun-Fire-T200
HP-UX:
# ndd /dev/tcp tcp_smallest_anon_port
49152
# ndd /dev/tcp tcp_largest_anon_port
65535
# uname -a
HP-UX loiter B.11.23 U ia64 4283463096 unlimited-user license
no idea about AIX or BSD or Windows...
* Re: [patch] ip_local_port_range sysctl has annoying default
From: Mark Glines @ 2007-05-14 18:33 UTC
To: Rick Jones; +Cc: netdev, davem, kuznet, jmorris, kaber
On Mon, 14 May 2007 10:08:43 -0700
Rick Jones <rick.jones2@hp.com> wrote:
> >>I think the IANA range is considered too small in most cases; I
> >>suspect there is also a feeling that "there be dragons" near the
> >>very top.
>
> About the only dragons which come to mind would be the very old,
> decrepit, barely able to puff wisps of steam let alone fire, dragons
> with the high-order bit set that would be misinterpreted by those
> treating port numbers as a short rather than an unsigned short.
Note that the high-order bit is set for all ports from 32768 up, so this
dragon would be stepped on pretty badly by Linux's default (and
indeed, the default for most OSes).
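To make the "short versus unsigned short" dragon concrete, here is a small sketch of what happens when a port number is squeezed into a signed 16-bit type. The function name is invented for the example, and the wrap-to-negative result is what the usual two's-complement platforms do; the C standard leaves this conversion implementation-defined.

```c
#include <stdint.h>

/*
 * Reinterpret a 16-bit port number as a signed 16-bit value, the way
 * buggy code storing ports in a "short" would. Converting values above
 * INT16_MAX is implementation-defined in C; on the usual
 * two's-complement platforms the result wraps to a negative number.
 */
int16_t port_as_signed_short(uint16_t port)
{
	return (int16_t)port;
}
```

On such platforms, port 4999 survives unchanged, while 32768 and 61000 come out negative, so any range starting at 32768 would already trip this bug.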
However, by "the very top", I think he was referring to the range
61000-65535, not all ports from 32768 up. Alan Cox clarified (in
http://www.ussg.iu.edu/hypermail/linux/kernel/0705.1/2597.html), "The
top space is reserved when using masquerading and used for the
masquerading ports normally in that situation. Clipping them off avoids
differing behaviour with masquerading on/off." So I think that's the
dragon in question, and NAT is a big ugly scary dragon indeed.
[snip]
> Oddly enough, it seems that on a system with a 2.6.21.1 kernel, the
> 32768-61000 is already there:
>
> hpcpc102:~# sysctl -a | grep port
> error: permission denied on key 'net.ipv4.route.flush'
> net.ipv4.ip_local_port_range = 32768 61000
Yes, Linux does use the range of 32768-61000 in most cases, and it
works great. The problem is, this default is determined at runtime by
tcp_init() (in net/ipv4/tcp.c), based on the bind hash size. If the
bind hash size is above a certain threshold, it will use 32768-61000,
which seems to be the common case these days. Otherwise, it will use a
range of 3072-4999, 2048-4999, or 1024-4999, depending on how small the
bind hash is.
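The selection logic described above can be sketched as a standalone function. This mirrors the pre-patch lines visible in the diff (compiled-in 1024-4999, then adjusted by the "order" value that tcp_init() derives from the bind hash size); the struct and function names are hypothetical and exist only for this illustration.

```c
/* Hypothetical struct for illustration; the kernel uses a two-element int array. */
struct port_range {
	int first;
	int last;
};

/*
 * Pre-patch default, as read from the diff: start from the compiled-in
 * 1024-4999 range, then adjust according to "order".
 */
struct port_range default_port_range(int order)
{
	struct port_range r = { 1024, 4999 };

	if (order >= 4) {
		/* large bind hash: the familiar high range */
		r.first = 32768;
		r.last = 61000;
	} else if (order < 3) {
		/* small bind hash: only the low end moves */
		r.first = 1024 * (3 - order);
	}
	/* order == 3 keeps the compiled-in 1024-4999 */
	return r;
}
```

Orders 0, 1 and 2 give 3072-4999, 2048-4999 and 1024-4999 respectively, order 3 keeps 1024-4999, and anything from 4 up gives 32768-61000.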
I have a box here with 128M of RAM, which, running the same kernel rev,
*doesn't* have this default (because the bind hash size is too small),
which causes problems because its range (2048-4999) stomps on NFS's UDP
port (2049) by default. So I was getting a weird failure where nfsd
wouldn't start when klive was running. But only on that machine. The
same setup works great on all of my other machines.
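A quick way to spot this kind of collision is to compare a service's port against the current ephemeral range. The sketch below assumes the range text comes from /proc/sys/net/ipv4/ip_local_port_range (two integers separated by whitespace); both function names are made up for the example.

```c
#include <stdio.h>

/*
 * Parse the two integers of an ip_local_port_range value
 * (e.g. "2048\t4999"). Returns 0 on success, -1 on malformed input.
 */
int parse_port_range(const char *text, int *first, int *last)
{
	if (sscanf(text, "%d %d", first, last) != 2)
		return -1;
	return 0;
}

/* Nonzero if "port" lies inside the inclusive range [first, last]. */
int port_in_range(int port, int first, int last)
{
	return port >= first && port <= last;
}
```

With the range 2048-4999, port_in_range(2049, ...) is true, which is the nfsd conflict described above; with 32768-61000 it is false.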
I think the range of 32768-61000 is smart, and I am hoping Linux can
use it *everywhere* by default, regardless of the bind hash size. This
is what my patch does.
If the list doesn't like this idea, I will happily submit another patch
which uses a dynamic range of the same size as before, but moves the
beginning of that range up to 32768. (Or maybe moves the end of the
range up to 61000.)
> Solaris:
> # ndd /dev/tcp tcp_smallest_anon_port
> 32768
> # ndd /dev/tcp tcp_largest_anon_port
> 65535
> # uname -a
> SunOS competitive10 5.10 Generic_118833-36 sun4v sparc
> SUNW,Sun-Fire-T200
>
> HP-UX:
>
> # ndd /dev/tcp tcp_smallest_anon_port
> 49152
> # ndd /dev/tcp tcp_largest_anon_port
> 65535
> # uname -a
> HP-UX loiter B.11.23 U ia64 4283463096 unlimited-user license
>
> no idea about AIX or BSD or Windows...
Interesting!
net.inet.ip.portrange.lowfirst: 1023
net.inet.ip.portrange.lowlast: 600
net.inet.ip.portrange.first: 1024
net.inet.ip.portrange.last: 5000
net.inet.ip.portrange.hifirst: 49152
net.inet.ip.portrange.hilast: 65535
DragonFly dfly181.tahoe 1.8.1-RELEASE DragonFly 1.8.1-RELEASE #2: Mon Mar 26 08:03:12 PDT 2007 root@:/usr/obj/usr/src/sys/GENERIC i386
net.inet.ip.portrange.lowfirst: 1023
net.inet.ip.portrange.lowlast: 600
net.inet.ip.portrange.first: 49152
net.inet.ip.portrange.last: 65535
net.inet.ip.portrange.hifirst: 49152
net.inet.ip.portrange.hilast: 65535
net.inet.ip.portrange.reservedhigh: 1023
net.inet.ip.portrange.reservedlow: 0
FreeBSD fbsd62.tahoe 6.2-RELEASE FreeBSD 6.2-RELEASE #0: Fri Jan 12 10:40:27 UTC 2007 root@dessler.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC i386
...whatever that means.
Mark
* Re: [patch] ip_local_port_range sysctl has annoying default
From: Rick Jones @ 2007-05-14 18:47 UTC
To: Mark Glines; +Cc: netdev, davem, kuznet, jmorris, kaber
> Note that the high-order bit is set for all ports from 32768 up, so this
> dragon would be stepped on pretty badly by Linux's default (and
> indeed, the default for most OSes).
>
> However, by "the very top", I think he was referring to the range
> 61000-65535, not all ports from 32768 up. Alan Cox clarified (in
> http://www.ussg.iu.edu/hypermail/linux/kernel/0705.1/2597.html), "The
> top space is reserved when using masquerading and used for the
> masquerading ports normally in that situation. Clipping them off avoids
> differing behaviour with masquerading on/off." So I think that's the
> dragon in question, and NAT is a big ugly scary dragon indeed.
NAT, why does there have to be NAT... :) yeah, it is big and ugly, shame we
cannot put a stake through its heart :(
> [snip]
>
>>Oddly enough, it seems that on a system with a 2.6.21.1 kernel, the
>>32768-61000 is already there:
>>
>>hpcpc102:~# sysctl -a | grep port
>>error: permission denied on key 'net.ipv4.route.flush'
>>net.ipv4.ip_local_port_range = 32768 61000
>
>
> Yes, Linux does use the range of 32768-61000 in most cases, and it
> works great. The problem is, this default is determined at runtime by
> tcp_init() (in net/ipv4/tcp.c), based on the bind hash size. If the
> bind hash size is above a certain threshold, it will use 32768-61000,
> which seems to be the common case these days. Otherwise, it will use a
> range of 3072-4999, 2048-4999, or 1024-4999, depending on how small the
> bind hash is.
Ah (insert suitable Emily Litella reference here). All the systems with which I
play are probably considered "large", even the ones I consider "small."
> I have a box here with 128M of RAM, which, running the same kernel rev,
> *doesn't* have this default (because the bind hash size is too small),
> which causes problems because its range (2048-4999) stomps on NFS's UDP
> port (2049) by default. So I was getting a weird failure where nfsd
> wouldn't start when klive was running. But only on that machine. The
> same setup works great on all of my other machines.
Hmm, those small values feel like variations on the old BSD defaults theme. I
don't recall issues with NFS there, but it is very likely that NFS would have
been started well before most anything else so it would "win" the race to 2049.
> I think the range of 32768-61000 is smart, and I am hoping Linux can
> use this default range *everywhere* by default, regardless of the bind
> hash size. This is what my patch does.
>
> If the list doesn't like this idea, I will happily submit another patch
> which uses a dynamic range of the same size as before, but moves the
> beginning of that range up to 32768. (Or maybe moves the end of the
> range up to 61000.)
Unless the memory size changes the hash algorithm itself (which bits are used,
that sort of thing) I wouldn't think that the values in the port number range
would particularly matter.