* Re: NAT and locally bound sockets
From: Michael Shuey @ 2002-07-01 16:32 UTC (permalink / raw)
To: Harald Welte, netfilter-devel
On Thu, May 30, 2002 at 03:32:47PM +0200, Harald Welte wrote:
> Interestingly I don't remember this bug. Neither I nor anybody else has added
> anything to the TODO list about this. Maybe it somehow got lost :(
I can't fault that; heck, I just took a month to reply to this email.
I looked up the previous comment about this issue:
http://lists.samba.org/pipermail/netfilter-devel/2002-January/003041.html
> Are you aware that netfilter/iptables NAT is [in IETF terms] 'symmetric nat',
> which means that we can use the same port on the NAT gw multiple times, as
> long as the tuple (consisting of srcip,dstip,srcport,dstport,l4proto) is
> unique.
Yes, I am aware that netfilter provides symmetric nat. Unfortunately, its
port selection can provide a tuple that is _not_ unique.
> Could you please try to describe the scenario of this alleged bug?
I ran into this when I had 100 hosts behind a single NAT box, all trying to
reach the same NFS file server at the same time the NAT box itself was
mounting the NFS server. At the time I was using the default SNAT options
(not specifying a port range for the translations) so any traffic originating
at a port below 1024 would be mapped to natbox:<something under 1024>. That
was working as expected.
In this scenario, think about the tuple for a moment. Since all clients and
the natbox are mounting the same NFS server, selecting the same port by
default, using UDP across the board, the connection tuples are (after SNAT)
going to be very similar - they only differ in srcport. Normally that would
be just fine; however, with a high level of traffic the NAT system would
occasionally select a srcport that was already in use by the NFS client local
to natbox. That's not fine - it causes quite a few NFS timeouts, retransmits,
etc. on natbox.
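A toy illustration of the collision (a Python sketch, not netfilter code; the addresses and port 800 are invented for the example, and only the tuple layout comes from the thread):

```python
from collections import namedtuple

# Toy model only; all addresses and ports below are made up.
Tuple5 = namedtuple("Tuple5", "srcip dstip srcport dstport l4proto")

NATBOX = "192.0.2.1"        # hypothetical address of the NAT box
NFS_SERVER = "192.0.2.50"   # hypothetical NFS server
NFS_PORT = 2049

# Flow from the NFS client running on the NAT box itself.
local_flow = Tuple5(NATBOX, NFS_SERVER, 800, NFS_PORT, "udp")

# Flow from an internal host after SNAT, where the NAT code happened to
# pick the source port that the local client had already bound.
snat_flow = Tuple5(NATBOX, NFS_SERVER, 800, NFS_PORT, "udp")


def reply_key(flow):
    """Tuple as seen on a reply packet: source and destination swapped."""
    return Tuple5(flow.dstip, flow.srcip, flow.dstport, flow.srcport,
                  flow.l4proto)


# Both flows expect replies carrying an identical tuple.
collision = reply_key(local_flow) == reply_key(snat_flow)
print("reply tuples collide:", collision)
```

Because both flows expect replies with the same tuple, a reply from the server cannot be attributed to one of them unambiguously, which is consistent with the timeouts and retransmits seen on natbox.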
This is hard to observe (unless you have lots of NAT clients or a very
restricted range of possible srcports for translations), but lsof on the
natbox and a packet sniffer showed that this was, in fact, occurring. I
ended up setting up an IP alias on the natbox just to route all translated
traffic through. When I did that everything worked normally again - no NFS
problems at all on natbox (which is to be expected, since the translated
connections' tuples are now different from the local NFS client).
To fix this the NAT code needs to check which ports, if any, are locally
bound (and not use them, of course!). The original poster for this bug had
a bit of code that tried to address this, but it was limited to UDP. A proper
fix would have to cover TCP as well (a similar problem most likely exists
there; it just crops up much less frequently).
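A minimal sketch of the kind of check meant here, assuming the set of locally bound ports is available to the port selection code. The function name, the candidate range 600-1023 and the plain-set representation are hypothetical; this is not the original poster's patch and not how the kernel stores sockets:

```python
def pick_source_port(wanted, nat_in_use, locally_bound, low=600, high=1023):
    """Return a source port used neither by a NAT mapping nor a local socket.

    Assumes a single destination and protocol, so only the port matters.
    Returns None when every candidate port is taken.
    """
    def is_free(port):
        return port not in nat_in_use and port not in locally_bound

    # Keep the original source port when nothing else is using it.
    if is_free(wanted):
        return wanted

    # Otherwise fall back to the first free port in the candidate range.
    for port in range(low, high + 1):
        if is_free(port):
            return port
    return None


# Port 800 is bound by a local NFS client, 600 and 601 by NAT mappings.
chosen = pick_source_port(800, nat_in_use={600, 601}, locally_bound={800})
print("chosen source port:", chosen)
```

The same check applies to TCP if the two sets are kept per protocol.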
--
Mike Shuey
* Re: NAT and locally bound sockets
From: Henrik Nordstrom @ 2002-07-01 16:59 UTC (permalink / raw)
To: shuey, Michael Shuey, Harald Welte, netfilter-devel
Michael Shuey wrote:
> In this scenario, think about the tuple for a moment. Since all clients
> and the natbox are mounting the same NFS server, selecting the same port by
> default, using UDP across the board, the connection tuples are (after SNAT)
> going to be very similar - they only differ in srcport. Normally that
> would be just fine; however, with a high level of traffic the NAT system
> would occasionally select a srcport that was already in use by the NFS
> client local to natbox. That's not fine - it causes quite a few NFS
> timeouts, retransmits, etc. on natbox.
This is handled fine in all tests I have done provided your SNAT rule applies
to both forwarded and locally originating packets.
If, however, your UDP NAT entries time out from conntrack, which they can
easily do for an idle NFS mount, then all bets are off. The default UDP
timeout is only 180 seconds, which is nowhere near sufficient for multi-client
NAT of NFS. This is a typical case where conntrack cannot, by default, know a
suitable timeout without additional information.
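To make the timeout concern concrete, here is a small Python model of an idle mapping being dropped. Only the 180-second default comes from the message above; the class, its methods and the client name are assumptions made for illustration and do not reflect conntrack's real data structures:

```python
UDP_TIMEOUT = 180  # seconds; the default UDP timeout mentioned above


class ToyConntrack:
    """Hypothetical tracker that forgets mappings idle past a timeout."""

    def __init__(self, timeout=UDP_TIMEOUT):
        self.timeout = timeout
        self.last_seen = {}  # (owner, port) -> time of the last packet

    def packet(self, key, now):
        """Record that a packet for `key` was seen at time `now`."""
        self.last_seen[key] = now

    def expire(self, now):
        """Drop every mapping that has been idle longer than the timeout."""
        self.last_seen = {key: seen for key, seen in self.last_seen.items()
                          if now - seen <= self.timeout}

    def is_tracked(self, key):
        return key in self.last_seen


ct = ToyConntrack()
ct.packet(("client-a", 800), now=0)   # last NFS traffic at t = 0 s
ct.expire(now=300)                    # mount sits idle for five minutes
entry_survived = ct.is_tracked(("client-a", 800))
print("mapping still tracked:", entry_survived)
```

Once the mapping is gone, port 800 looks unused again and may be handed to another client while the first mount is still alive.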
Regards
Henrik
* Re: NAT and locally bound sockets
From: Michael Shuey @ 2002-07-01 18:46 UTC (permalink / raw)
To: Henrik Nordstrom; +Cc: Harald Welte, netfilter-devel
On Mon, Jul 01, 2002 at 06:59:39PM +0200, Henrik Nordstrom wrote:
> > would be just fine; however, with a high level of traffic the NAT system
> > would occasionally select a srcport that was already in use by the NFS
> > client local to natbox. That's not fine - it causes quite a few NFS
>
> This is handled fine in all tests I have done provided your SNAT rule applies
> to both forwarded and locally originating packets.
First, why would I want to SNAT locally originating packets? Second, are
you telling me that netfilter _does_ check to see if a port is locally bound
before using it for a translation?
> If, however, your UDP NAT entries time out from conntrack, which they can
> easily do for an idle NFS mount, then all bets are off. The default UDP
> timeout is only 180 seconds, which is nowhere near sufficient for multi-client
> NAT of NFS. This is a typical case where conntrack cannot, by default, know a
> suitable timeout without additional information.
The problem is not that UDP NAT entries are timing out from conntrack. The
problem is that SNAT'd NFS connections are stealing packets bound for the
nat host. As near as I can tell the NAT code will occasionally select a
srcport that's already in use by a client local to the natbox. For more
information, check the posting at the URL I mailed to the list earlier.
If my problems were caused by UDP nat entries timing out from conntrack, why
did all my problems disappear when I SNAT'd the connections through an IP
alias? I didn't change the timeout, so if your assumption were correct I
would still have NFS issues.
--
Mike Shuey
* Re: NAT and locally bound sockets
From: Henrik Nordstrom @ 2002-07-02 7:52 UTC (permalink / raw)
To: shuey; +Cc: netfilter-devel
On Monday 01 July 2002 20.46, Michael Shuey wrote:
> First, why would I want to SNAT locally originating packets?
> Second, are you telling me that netfilter _does_ check to see if a
> port is locally bound before using it for a translation?
Mainly in case the locally selected port is already in use by a NAT:ed
connection.
NAT checks to see that the port isn't already in use by NAT or local
sockets, but as far as I know local traffic does not check that the
port isn't already in use by NAT.
By applying SNAT to the locally originating traffic as well the NAT
engine will detect any such collisions and reassign the port
automatically.
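A rough model of this behaviour in Python, assuming a single destination so that only the source port matters. ToyNat, its candidate port range and the owner names are hypothetical; the real NAT engine works on full tuples:

```python
class ToyNat:
    """Hypothetical NAT engine that reassigns a port when it is taken."""

    def __init__(self, candidate_ports):
        self.candidates = list(candidate_ports)
        self.mapped = {}  # translated source port -> owner of the flow

    def translate(self, owner, wanted):
        """Keep `wanted` if it is free, else use the first free candidate."""
        for port in [wanted] + self.candidates:
            if port not in self.mapped:
                self.mapped[port] = owner
                return port
        raise RuntimeError("no free source port left")


nat = ToyNat(range(600, 1024))

# A forwarded flow is translated first and keeps source port 800.
forwarded_port = nat.translate("internal-host", 800)

# Local traffic that bypasses the SNAT rule would simply use port 800 too.
# Passing it through the same engine exposes the clash and moves it.
local_port = nat.translate("natbox-nfs-client", 800)

print("forwarded:", forwarded_port, "local:", local_port)
```

In this model the collision is only caught because the local flow is presented to the same engine that holds the existing mappings.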
Regards
Henrik
* Re: NAT and locally bound sockets
From: Harald Welte @ 2002-07-02 14:21 UTC (permalink / raw)
To: shuey; +Cc: netfilter-devel
On Mon, Jul 01, 2002 at 11:32:32AM -0500, Michael Shuey wrote:
> On Thu, May 30, 2002 at 03:32:47PM +0200, Harald Welte wrote:
> > Interestingly I don't remember this bug. Neither I nor anybody else has added
> > anything to the TODO list about this. Maybe it somehow got lost :(
>
> I can't fault that; heck, I just took a month to reply to this email.
>
> I looked up the previous comment about this issue:
>
> http://lists.samba.org/pipermail/netfilter-devel/2002-January/003041.html
>
> > Are you aware that netfilter/iptables NAT is [in IETF terms] 'symmetric nat',
> > which means that we can use the same port on the NAT gw multiple times, as
> > long as the tuple (consisting of srcip,dstip,srcport,dstport,l4proto) is
> > unique.
>
> Yes, I am aware that netfilter provides symmetric nat. Unfortunately, its
> port selection can provide a tuple that is _not_ unique.
mmh... not exactly. I understand that there is a problem, but the tuple is
always unique - as long as there is any tuple!
In the case where you just bind to a UDP port, but haven't sent any
packets yet, somebody else can use a tuple including that port - which
of course clashes if the local port then starts sending packets to the
same destip/destport as the now-used tuple -> boom.
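The bound-but-silent case can be reproduced with an ordinary UDP socket. The sketch below uses Python's standard socket module; the `tracked_ports` set and `on_packet_sent` are hypothetical stand-ins for connection tracking that only learns about a port once a packet is actually seen:

```python
import socket

# Bind a UDP socket but do not send anything through it yet.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("127.0.0.1", 0))          # port 0 asks the OS for any free port
bound_port = sock.getsockname()[1]

# Hypothetical packet-driven tracker: it only learns ports from traffic.
tracked_ports = set()


def on_packet_sent(port):
    """Record a port the first time a packet from it is observed."""
    tracked_ports.add(port)


# No datagram has left the socket, so the tracker knows nothing about the
# port and a NAT mapping could still be given the same port number.
looks_free_to_nat = bound_port not in tracked_ports
print("port", bound_port, "looks free to NAT:", looks_free_to_nat)

# Only the first outgoing packet would create a tuple for this port.
on_packet_sent(bound_port)

sock.close()
```

Between the `bind` call and the first packet there is no tuple to compare against, which is the window described above.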
> In this scenario, think about the tuple for a moment. Since all clients and
> the natbox are mounting the same NFS server, selecting the same port by
> default, using UDP across the board, the connection tuples are (after SNAT)
> going to be very similar - they only differ in srcport. Normally that would
> be just fine; however, with a high level of traffic the NAT system would
> occasionally select a srcport that was already in use by the NFS client local
> to natbox. That's not fine - it causes quite a few NFS timeouts, retransmits,
> etc. on natbox.
so you need to include your nat box itself in the SNAT rule.
> A proper fix would have to involve TCP as well (as a similar problem most
> likely exists there, it just crops up much less frequently).
mh. the issue with TCP is the same: it arises if you bind a socket and do not
use it for quite some time before you actually initiate any connection.
> Mike Shuey
--
Live long and prosper
- Harald Welte / laforge@gnumonks.org http://www.gnumonks.org/
============================================================================
GCS/E/IT d- s-: a-- C+++ UL++++$ P+++ L++++$ E--- W- N++ o? K- w--- O- M-
V-- PS+ PE-- Y+ PGP++ t++ 5-- !X !R tv-- b+++ DI? !D G+ e* h+ r% y+(*)