connection dropouts


* connection dropouts
@ 2004-02-25 19:05 T. Horsnell (tsh)
  2004-02-26  9:00 ` Philip Craig
  0 siblings, 1 reply; 5+ messages in thread
From: T. Horsnell (tsh) @ 2004-02-25 19:05 UTC (permalink / raw)
  To: netfilter

Hi all,

We are currently using iptables 1.2.5 and kernel 2.4.18 to do filtering
and NAT. There are about 800 hosts behind the firewall, and we are in
the process of moving them into private ip space (10.x.x.x) so not all
the 800 are NAT candidates, only about 400 so far. The rest still have
their global ip addresses.
Users are starting to report that when their machine is moved to
10. space, they experience network hangups when accessing offsite
servers (mainly web/ftp but also ssh) and I'd like your advice 
where I should start looking. 

The firewall box is a 1GHz AMD with 128MBytes mem, and
/proc/sys/net/ipv4/ip_conntrack_max is currently set to 8184.

How can I track how close I get to this limit? 
What is the memory use per conntrack entry?
Is there anything particular about NAT entries in the conntrack
tables that would make NAT'd hosts more prone to net hangups
than un-NAT'd ones?
If I raise my ip_conntrack_max value, am I likely to crash
the firewall if I raise it too high?
What is the theoretical maximum number of conntrack entries?
What is the theoretical maximum number of NAT connections?
(this would seem to me to be 65536 - the maximum number
of ports available on a single host, i.e. the NAT box
since it has to map a source host:hostport into a NAT:natport)

Sorry for all the questions, but I'm starting to get worried
that we may have bitten off more than we can chew here...

TIA,
Terry.


* Re: connection dropouts
  2004-02-25 19:05 connection dropouts T. Horsnell (tsh)
@ 2004-02-26  9:00 ` Philip Craig
  2004-02-26 16:15   ` T. Horsnell (tsh)
  0 siblings, 1 reply; 5+ messages in thread
From: Philip Craig @ 2004-02-26  9:00 UTC (permalink / raw)
  To: T. Horsnell (tsh); +Cc: netfilter

T. Horsnell (tsh) wrote:
> The firewall box is a 1GHz AMD with 128MBytes mem, and
> /proc/sys/net/ipv4/ip_conntrack_max is currently set to 8184.
> 
> How can I track how close I get to this limit? 

There will be a syslog message telling you when you reach the limit.
Or 'grep ip_conntrack /proc/slabinfo', and read the first number
(active_objs).
Or 'wc -l /proc/net/ip_conntrack', but that is slow and may be unreliable.
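
As a rough sketch of the 'wc -l' approach: the helper name conntrack_usage
is made up for illustration, and the two /proc paths are the ones named in
this thread, so they are assumed to exist only on kernels of this vintage.

```shell
#!/bin/sh
# conntrack_usage COUNT MAX  (hypothetical helper, not a standard tool)
# Prints the entry count, the limit, and the integer percentage in use.
conntrack_usage() {
    count=$(( $1 ))
    max=$(( $2 ))
    echo "$count/$max ($(( count * 100 / max ))%)"
}

# Paths as quoted in this thread; skipped silently where they do not exist.
table=/proc/net/ip_conntrack
limit=/proc/sys/net/ipv4/ip_conntrack_max
if [ -r "$table" ] && [ -r "$limit" ]; then
    conntrack_usage "$(wc -l < "$table")" "$(cat "$limit")"
fi
```

Run periodically (from cron, say), this gives a crude history of how close
the table gets to the limit.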

> What is the memory use per conntrack entry?

The kernel displays this when the conntrack module is loaded, e.g.
  kernel: ip_conntrack version 2.1 (8191 buckets, 65528 max) - 300 bytes per conntrack
Or 'grep ip_conntrack /proc/slabinfo', and read the third number (objsize).
(Hmm, except I noticed those two numbers are different on my PC.  The
slabinfo may be more accurate.)
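
For a back-of-envelope figure, multiply the limit by the per-entry size.
A small sketch using the 300 bytes from the log line above; the helper name
is hypothetical, and the result ignores the hash buckets and any slab
overhead, so treat it as a lower bound rather than an exact number.

```shell
#!/bin/sh
# conntrack_mem_kib ENTRIES BYTES_PER_ENTRY  (hypothetical helper)
# Prints the size of a full table in KiB, rounded down.
conntrack_mem_kib() {
    echo $(( $1 * $2 / 1024 ))
}

conntrack_mem_kib 8184 300     # the limit currently set on the firewall
conntrack_mem_kib 65528 300    # the limit shown in the example log line
```

Under those assumptions a full table costs roughly 2.3 MiB at the current
limit and just under 19 MiB at 65528 entries, which is worth weighing
against the 128 MB in the box.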

> Is there anything particular about NAT entries in the conntrack
> tables that would make NAT'd hosts more prone to net hangups
> than un-NAT'd ones?

Not that I'm aware of.  Do the hangups have any sort of consistency?

> If I raise my ip_conntrack_max value, am I likely to crash
> the firewall if I raise it too high?
> What is the theoretical maximum number of conntrack entries?
> What is the theoretical maximum number of NAT connections?
> (this would seem to me to be 65536 - the maximum number
> of ports available on a single host, i.e. the NAT box
> since it has to map a source host:hostport into a NAT:natport)

The only limit is available memory.  It is not limited to 65536 since
each entry is based off protocol/source host/source port/dest host/dest
port, the number of combinations of which is more than you'll ever need.
There is a limitation of 65535 NATed connections to a single port on a
given dest host (ie if source host, dest host, and dest port are all
constant), but that would be unusual to encounter.

-- 
Philip Craig - SnapGear, A CyberGuard Company - http://www.SnapGear.com


* Re: connection dropouts
  2004-02-26  9:00 ` Philip Craig
@ 2004-02-26 16:15   ` T. Horsnell (tsh)
  2004-02-27  7:44     ` Philip Craig
  0 siblings, 1 reply; 5+ messages in thread
From: T. Horsnell (tsh) @ 2004-02-26 16:15 UTC (permalink / raw)
  To: Philip Craig; +Cc: netfilter

>T. Horsnell (tsh) wrote:
>> The firewall box is a 1GHz AMD with 128MBytes mem, and
>> /proc/sys/net/ipv4/ip_conntrack_max is currently set to 8184.
>> 
>> How can I track how close I get to this limit? 
>
>There will be a syslog message telling you when you reach the limit.
>Or 'grep ip_conntrack /proc/slabinfo', and read the first number
>(active_objs).
>Or 'wc -l /proc/net/ip_conntrack', but that is slow and may be unreliable.
>
>> What is the memory use per conntrack entry?
>
>The kernel displays this when the conntrack module is loaded, eg
>kernel: ip_conntrack version 2.1 (8191 buckets, 65528 max) - 300 bytes
>per conntrack
>Or 'grep ip_conntrack /proc/slabinfo', and read the third number (objsize).
>(Hmm, except I noticed those two numbers are different on my PC.  The
>slabinfo may be more accurate.)
>
>> Is there anything particular about NAT entries in the conntrack
>> tables that would make NAT'd hosts more prone to net hangups
>> than un-NAT'd ones?
>
>Not that I'm aware of.  Do the hangups have any sort of consistency?

They seem much more prevalent in machines which have been moved
to our private addresses. I haven't gathered much evidence yet,
but one thing I have noticed is that I accumulate a large number
of entries in ip_conntrack like:

tcp      6 431253 ESTABLISHED src=10.2.0.4 dst=131.111.85.78 sport=49278 dport=143 [UNREPLIED] src=131.111.85.78 dst=10.2.0.4 sport=143 dport=49278 use=1

'ESTABLISHED' 'UNREPLIED' seems an odd combination to me.
To help in the move to 10.x.x.x we have a number of servers with
(temporary) dual ip addresses.
Host 131.111.85.78 is one of these and has a primary ip address
of 10.1.0.1 with an alias of 131.111.85.78. The netfilter box
is configured to route between 10. and 131.111.85.x, and
10.2.0.4 will have been configured to use the netfilter box
as its default router.

The conntrack entry above appears to be an attempt by host 10.2.0.4
to make an imap connection to 131.111.85.78. My feeling is that
host 131.111.85.78 switches to using its 10.1.0.1 address at some
point in the connection, and so the UNREPLIED state stays put.
I don't yet know why traffic between our 10. hosts and our
131.111 hosts should generate a conntrack entry at all...

Cheers,
Terry.



* Re: connection dropouts
  2004-02-26 16:15   ` T. Horsnell (tsh)
@ 2004-02-27  7:44     ` Philip Craig
  2004-02-27 17:28       ` T. Horsnell (tsh)
  0 siblings, 1 reply; 5+ messages in thread
From: Philip Craig @ 2004-02-27  7:44 UTC (permalink / raw)
  To: T. Horsnell (tsh); +Cc: netfilter

T. Horsnell (tsh) wrote:
> tcp      6 431253 ESTABLISHED src=10.2.0.4 dst=131.111.85.78 sport=49278 dport=143 [UNREPLIED] src=131.111.85.78 dst=10.2.0.4 sport=143 dport=49278 use=1
> 
> 'ESTABLISHED' 'UNREPLIED' seems an odd combination to me.

This happens when the firewall only sees packets travelling in
one direction.  That is, 10.2.0.4 uses the firewall as its gateway
to talk to 131.111.85.78, but since 131.111.85.78 knows about the
10.x.x.x network, it replies directly to 10.2.0.4, so the firewall
is missing half of the conversation.  It doesn't look to me like this
particular connection has hung.
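
One way to see how many entries are in this state is to filter the table
for TCP entries that are both ESTABLISHED and [UNREPLIED] and count them
per destination. A sketch only: one_way_flows is a made-up name, and the
field positions are assumed from the single entry quoted above.

```shell
#!/bin/sh
# one_way_flows (hypothetical helper): reads conntrack entries on stdin and
# prints "COUNT DST" for TCP entries marked ESTABLISHED and [UNREPLIED].
# The field layout is assumed from the entry quoted in this thread.
one_way_flows() {
    awk '$1 == "tcp" && $4 == "ESTABLISHED" && /\[UNREPLIED\]/ {
             # the first dst= field is the destination of the original direction
             for (i = 5; i <= NF; i++)
                 if ($i ~ /^dst=/) { seen[substr($i, 5)]++; break }
         }
         END { for (d in seen) print seen[d], d }'
}

# Path as quoted in this thread; skipped where it does not exist.
table=/proc/net/ip_conntrack
if [ -r "$table" ]; then
    one_way_flows < "$table" | sort -rn
fi
```

Destinations that show up with a high count are likely candidates for
hosts that reply without going back through the firewall.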

Do you have any DNAT rules on the firewall?  This kind of asymmetric
routing does cause problems with DNAT, since the firewall doesn't get
a chance to reverse the DNAT in the reply packets, and the symptom is
that the connection hangs.

> I don't yet know why traffic between our 10. hosts and our
> 131.111 hosts should generate a conntrack entry at all...

If the packets go via the firewall, then a conntrack entry will
always be created.

-- 
Philip Craig - SnapGear, A CyberGuard Company - http://www.SnapGear.com


* Re: connection dropouts
  2004-02-27  7:44     ` Philip Craig
@ 2004-02-27 17:28       ` T. Horsnell (tsh)
  0 siblings, 0 replies; 5+ messages in thread
From: T. Horsnell (tsh) @ 2004-02-27 17:28 UTC (permalink / raw)
  To: Philip Craig; +Cc: netfilter

>T. Horsnell (tsh) wrote:
>> tcp      6 431253 ESTABLISHED src=10.2.0.4 dst=131.111.85.78 sport=49278 dport=143 [UNREPLIED] src=131.111.85.78 dst=10.2.0.4 sport=143 dport=49278 use=1
>> 
>> 'ESTABLISHED' 'UNREPLIED' seems an odd combination to me.
>
>This happens when the firewall only sees packets travelling in
>one direction.  That is, 10.2.0.4 uses the firewall as its gateway
>to talk to 131.111.85.78, but since 131.111.85.78 knows about the
>10.x.x.x network, it replies directly to 10.2.0.4, so the firewall
>is missing half of the conversation.  It doesn't look to me like this
>particular connection has hung.
>
>Do you have any DNAT rules on the firewall?  This kind of asymmetric
>routing does cause problems with DNAT, since the firewall doesn't get
>a chance to reverse the DNAT in the reply packets, and the symptom is
>that the connection hangs.

Yes, I do have a few DNATs. We have a bunch of servers which have
10. addresses, but which also have to be visible from the world, and
so have 131.111 addresses in the DNS. Incoming packets to these are
DNAT'd from the global addresses to their corresponding 10. addresses.
I also SNAT the outgoing packets from these servers, from their 10.
addresses to their corresponding global ip addresses, but I guess this
is probably unnecessary.

Evidence so far is that the network hangups are more like timeouts.
In many cases, the connection succeeds if the user is prepared to
wait. I don't know whether 30 secs is the hangup time (this might
correspond to some of the timeouts for conntrack entries),
but there is a possibility that the hangs may correspond to a big splat
of UDP connections which I've noticed taking place periodically.
Even though these are between machines on the same LAN, they are currently
between hosts on different logical networks and so generate conntrack
entries (see my red-face blurb below).

>
>> I don't yet know why traffic between our 10. hosts and our
>> 131.111 hosts should generate a conntrack entry at all...
>
>If the packets go via the firewall, then a conntrack entry will
>always be created.

Sorry, I'm being a total idiot here. As well as filtering and NAT, the box
is configured as a router to route between our 10. hosts and the 131.111 ones
which haven't migrated yet. So as well as static routes set in rc.local:

# routes to our own UCam subnets:
route add -net 131.111.26.0/24 dev eth0
route add -net 131.111.84.0/24 dev eth0
route add -net 131.111.85.0/24 dev eth0
route add -net 131.111.89.0/24 dev eth0
route add -net 131.111.184.0/24 dev eth0
route add -net 131.111.185.0/24 dev eth0
# a route to 10. until we are completely 10.
route add -net 10.0.0.0/9 dev eth0

I have FORWARD rules:
# allow the firewall to route between our private net (10.0.0.0/9)
# and our UCam subnets
iptables -A FORWARD -i eth0 -o eth0 -s 10.0.0.0/9 -d 131.111.26.0/24 -j ACCEPT
iptables -A FORWARD -i eth0 -o eth0 -s 131.111.26.0/24 -d 10.0.0.0/9 -j ACCEPT
iptables -A FORWARD -i eth0 -o eth0 -s 10.0.0.0/9 -d 131.111.84.0/24 -j ACCEPT
iptables -A FORWARD -i eth0 -o eth0 -s 131.111.84.0/24 -d 10.0.0.0/9 -j ACCEPT
iptables -A FORWARD -i eth0 -o eth0 -s 10.0.0.0/9 -d 131.111.85.0/24 -j ACCEPT
iptables -A FORWARD -i eth0 -o eth0 -s 131.111.85.0/24 -d 10.0.0.0/9 -j ACCEPT
iptables -A FORWARD -i eth0 -o eth0 -s 10.0.0.0/9 -d 131.111.89.0/24 -j ACCEPT
iptables -A FORWARD -i eth0 -o eth0 -s 131.111.89.0/24 -d 10.0.0.0/9 -j ACCEPT
iptables -A FORWARD -i eth0 -o eth0 -s 10.0.0.0/9 -d 131.111.184.0/24 -j ACCEPT
iptables -A FORWARD -i eth0 -o eth0 -s 131.111.184.0/24 -d 10.0.0.0/9 -j ACCEPT
iptables -A FORWARD -i eth0 -o eth0 -s 10.0.0.0/9 -d 131.111.185.0/24 -j ACCEPT
iptables -A FORWARD -i eth0 -o eth0 -s 131.111.185.0/24 -d 10.0.0.0/9 -j ACCEPT
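
The same twelve rules could be generated from the subnet list instead of
being written out by hand. A sketch, with a made-up function name; it only
prints the commands, so nothing changes on the firewall unless the output
is piped to sh.

```shell
#!/bin/sh
# forward_rules (hypothetical helper): prints one pair of FORWARD rules per
# UCam subnet, in the same order as the hand-written list above.
forward_rules() {
    private=10.0.0.0/9
    for net in 131.111.26.0/24 131.111.84.0/24 131.111.85.0/24 \
               131.111.89.0/24 131.111.184.0/24 131.111.185.0/24; do
        echo "iptables -A FORWARD -i eth0 -o eth0 -s $private -d $net -j ACCEPT"
        echo "iptables -A FORWARD -i eth0 -o eth0 -s $net -d $private -j ACCEPT"
    done
}

forward_rules
```

Keeping the subnets in one list means a subnet can be dropped from the
loop as soon as its hosts have all moved to 10. addresses.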


