* Ramdom NAT drop
@ 2009-10-14 0:12 Gary Smith
2009-10-20 14:39 ` Gary Smith
0 siblings, 1 reply; 9+ messages in thread
From: Gary Smith @ 2009-10-14 0:12 UTC (permalink / raw)
To: 'netfilter@vger.kernel.org'
Hello,
I have a scenario where we are NAT'ing multiple ports and in some cases entire IP addresses to our internal private range. Some time ago we noticed that web pages from one of the web servers would randomly fail. Investigating, we found that the conntrack table was full and that packets were being dropped.
So, since the server has RAM, we upped the max buckets and conntrack max to 1048576 and 4194304, respectively. The problem appeared to go away as we watched the counter go above 40k connections. It has since been hovering around 40k (currently 35k).
About two weeks later, I noticed that I started getting the failures again. Checking the firewall, connections looked good (once again, 40k or so). I checked the web server logs, and the requests never arrived. What I found is that after about 20 minutes or so I will see this failure randomly. I think it's connected to some type of keep-alive in IE/Firefox: when the problem happens in IE and the pages continually fail, the page comes up fine if I open Firefox. This issue comes up when hitting the page from inside the network through NAT.
To me it looks like NAT is dropping the connection that has been established and doesn't want to reconnect. A tcpdump on the external interface shows the request stopping at the iptables firewall and not going beyond it. But then everything will clear up for a few days.
Here are the relevant rules:
-A PREROUTING -d 208.209.210.211 -j DNAT --to-destination 192.168.0.10
-A INPUT -d 208.209.210.211 -i eth1 -p tcp -m tcp --sport 20 --dport 1024:65535 -j ACCEPT
-A INPUT -d 208.209.210.211 -i eth1 -p tcp -m tcp -m multiport --dports 80,443,21,20 -j ACCEPT
-A OUTPUT -d 208.209.210.211 -j DNAT --to-destination 192.168.0.10
The final rule is a log and drop for anything coming in on this particular IP address (which I know works as we see a lot of attempts for 445).
I'm just trying to find any logical reason why the connections are getting dropped. I'm thinking it's NAT, but that's just a WAG at this point.
OS is CentOS 5, kernel 2.6.18-128.el5, iptables v1.3.5, minimal install, firewall only. Machine has 512 MB RAM.
             total       used       free     shared    buffers     cached
Mem:        515444     483240      32204          0     141504     296208
-/+ buffers/cache:      45528     469916
Swap:      1052248          0    1052248
Any advice?
^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: Ramdom NAT drop
2009-10-14 0:12 Ramdom NAT drop Gary Smith
@ 2009-10-20 14:39 ` Gary Smith
2009-10-21 6:15 ` Anatoly Muliarski
2009-10-21 7:55 ` Mart Frauenlob
0 siblings, 2 replies; 9+ messages in thread
From: Gary Smith @ 2009-10-20 14:39 UTC (permalink / raw)
To: 'netfilter@vger.kernel.org'
Anyone?
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Ramdom NAT drop
2009-10-20 14:39 ` Gary Smith
@ 2009-10-21 6:15 ` Anatoly Muliarski
2009-10-21 15:35 ` Gary Smith
2009-10-21 7:55 ` Mart Frauenlob
1 sibling, 1 reply; 9+ messages in thread
From: Anatoly Muliarski @ 2009-10-21 6:15 UTC (permalink / raw)
To: Gary Smith; +Cc: netfilter
Hi Gary,
You must have set the conntrack value too high for your RAM size.
Each conntrack structure uses ~300 bytes of physical memory, so a
value of 1024*1024 would be suitable. The hashsize should be
lowered too. Look at
http://www.wallfire.org/misc/netfilter_conntrack_perf.txt for details.
The problem may also be in your iptables rules or in the settings for
the /proc/sys/net/netfilter/nf_conntrack_XXX timeouts.
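As a rough worked example of the sizing argument above, the sketch below multiplies a table size by the ~300 bytes per entry quoted in this message. That per-entry figure is an assumption carried over from the thread, not a measurement (the real size depends on kernel version and architecture), and the hash bucket array is not counted. The function name is hypothetical.

```python
# Rough conntrack memory estimate.
# BYTES_PER_ENTRY is the ~300 byte figure quoted in this thread, not a
# measured value; treat the results as order-of-magnitude only.
BYTES_PER_ENTRY = 300


def conntrack_table_mib(max_entries, bytes_per_entry=BYTES_PER_ENTRY):
    """Worst-case memory in MiB if the table fills up to max_entries."""
    return max_entries * bytes_per_entry / (1024 * 1024)


if __name__ == "__main__":
    # "total" from the free output in the first message, reported in KiB.
    ram_mib = 515444 / 1024
    cases = [
        ("configured maximum (4194304)", 4194304),
        ("suggested maximum (1024*1024)", 1024 * 1024),
        ("observed connections (~40k)", 40000),
    ]
    for label, entries in cases:
        needed = conntrack_table_mib(entries)
        print(f"{label}: {needed:.1f} MiB against {ram_mib:.0f} MiB of RAM")
```

Under these assumptions a completely full table at the configured maximum would need more memory than the machine has, while the ~40k connections actually observed need only a few MiB.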
--
Best regards
Anatoly Muliarski
^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: Ramdom NAT drop
2009-10-21 6:15 ` Anatoly Muliarski
@ 2009-10-21 15:35 ` Gary Smith
0 siblings, 0 replies; 9+ messages in thread
From: Gary Smith @ 2009-10-21 15:35 UTC (permalink / raw)
To: 'Anatoly Muliarski'; +Cc: 'netfilter@vger.kernel.org'
> You must have set the conntrack value too high for your RAM size.
> Each conntrack structure uses ~300 bytes of physical memory, so a
> value of 1024*1024 would be suitable. The hashsize should be
> lowered too. Look at
> http://www.wallfire.org/misc/netfilter_conntrack_perf.txt for details.
> The problem may also be in your iptables rules or in settings for
> /proc/sys/net/netfilter/nf_conntrack_XXX timeouts.
>
Anatoly,
I had looked at the link. I will either crank the value lower or increase the RAM. I had increased the values to rule that out as a problem.
I will take a look at the conntrack timeouts. It definitely appears to be some type of timeout problem. When the problem does happen, it seems to continue for a while, then when I come back it works just fine. It looks like some type of timeout with the NAT, though: a connection is made, NAT is set up, and later packets don't get forwarded past the firewall. As for the rules in the filter chains, they are all direct rules (no connection tracking on them). I will definitely dig a little deeper; I just didn't know where to begin.
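A minimal sketch for dumping the conntrack timeout settings in one go, so they can be compared against the ~20 minute failure window. The directory is a parameter because it is an assumption: the path mentioned earlier in the thread is /proc/sys/net/netfilter, while older ip_conntrack kernels may expose the equivalent files under /proc/sys/net/ipv4/netfilter. The function name is hypothetical.

```python
import glob
import os


def read_conntrack_timeouts(proc_dir):
    """Return {filename: seconds} for every *timeout* entry under proc_dir."""
    timeouts = {}
    for path in sorted(glob.glob(os.path.join(proc_dir, "*timeout*"))):
        try:
            with open(path) as fh:
                timeouts[os.path.basename(path)] = int(fh.read().strip())
        except (OSError, ValueError):
            # Unreadable or non-numeric entries are skipped, not guessed at.
            continue
    return timeouts


if __name__ == "__main__":
    # Assumed location; adjust for the kernel in use (see the note above).
    for name, secs in read_conntrack_timeouts("/proc/sys/net/netfilter").items():
        print(f"{name}: {secs}s")
```

On a machine without that directory the loop simply prints nothing.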
Gary Smith
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Ramdom NAT drop
2009-10-20 14:39 ` Gary Smith
2009-10-21 6:15 ` Anatoly Muliarski
@ 2009-10-21 7:55 ` Mart Frauenlob
2009-10-21 15:40 ` Gary Smith
1 sibling, 1 reply; 9+ messages in thread
From: Mart Frauenlob @ 2009-10-21 7:55 UTC (permalink / raw)
To: netfilter
Hello,
I'm just guessing, but what I know from my FW logs is that IE tends to
send packets in INVALID state.
That would explain why there's no problem with Firefox.
Regards
Mart
^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: Ramdom NAT drop
2009-10-21 7:55 ` Mart Frauenlob
@ 2009-10-21 15:40 ` Gary Smith
2009-10-21 20:33 ` Gary Smith
0 siblings, 1 reply; 9+ messages in thread
From: Gary Smith @ 2009-10-21 15:40 UTC (permalink / raw)
To: 'netfilter@vger.kernel.org'
> I'm just guessing, but what I know from my FW logs is that IE tends to
> send packets in INVALID state.
> That would explain why there's no problem with Firefox.
I would also expect to see this, but I don't think the packet is even making it to the filter section. I have logging for anything dropped, and yet nothing is coming in from the originating IPs that are affected. I will probably do something painful and put more logging in the chains to see if I can better catch the problem. The only issue I have is that the problem is random.
I will definitely look for that though.
^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: Ramdom NAT drop
2009-10-21 15:40 ` Gary Smith
@ 2009-10-21 20:33 ` Gary Smith
2009-10-21 21:19 ` Mart Frauenlob
0 siblings, 1 reply; 9+ messages in thread
From: Gary Smith @ 2009-10-21 20:33 UTC (permalink / raw)
To: 'netfilter@vger.kernel.org'
> I would also expect to see this, but I don't think the packet is even
> making it to the filter section. I have logging for anything dropped
> and yet nothing is coming in from originating IP's that are affected.
> I will probably do something painful and put more logging in the chains
> to see if I can better catch the problem. The only issue I have is
> that the problem is random.
>
> I will definitely look for that though.
Included is the rule that I think is being randomly ignored.
-A PREROUTING -d 208.46.23.38 -j DNAT --to-destination 10.80.65.38
This is in effect. So I believe that I should never see a hit in the INPUT chain for this rule, since all requests are being forwarded to the 10.80.65.38 IP address. Only 10.80.0.0/16 is local.
I expected to see this rule hit, as the forward is indeed happening (basically, we logged all traffic prior to this rule to generate the hit):
Oct 21 13:33:35 hsoakfiw01c kernel: FW-F-443: IN=eth1 OUT=eth0 SRC=116.250.48.135 DST=10.80.65.38 LEN=1050 TOS=0x00 PREC=0x00 TTL=102 ID=30940 DF PROTO=TCP SPT=2374 DPT=80 WINDOW=32768 RES=0x00 ACK PSH URGP=0
The INPUT catch had a rule to log all traffic coming in as well, which is where we picked up this hit:
Oct 21 13:31:01 hsoakfiw01c kernel: FW-I: IN=eth1 OUT= MAC=00:0c:29:9c:88:9b:00:13:c3:d7:a3:68:08:00 SRC=189.162.111.146 DST=208.46.23.38 LEN=40 TOS=0x00 PREC=0x00 TTL=241 ID=16396 DF PROTO=TCP SPT=3552 DPT=80 WINDOW=0 RES=0x00 ACK RST URGP=0
So, am I wrong in thinking that external traffic forwarded in via NAT should never hit the INPUT chain and should go straight to the FORWARD chain, or is my problem something else completely?
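To make hits like the two above easier to compare, here is a small sketch that splits a kernel LOG line into its KEY=value fields and bare TCP flags. The two sample lines are copied from this message; the function names are hypothetical, and the parser assumes the standard LOG layout in which the packet fields start at IN=.

```python
FORWARD_HIT = (
    "Oct 21 13:33:35 hsoakfiw01c kernel: FW-F-443: IN=eth1 OUT=eth0 "
    "SRC=116.250.48.135 DST=10.80.65.38 LEN=1050 TOS=0x00 PREC=0x00 TTL=102 "
    "ID=30940 DF PROTO=TCP SPT=2374 DPT=80 WINDOW=32768 RES=0x00 ACK PSH URGP=0"
)
INPUT_HIT = (
    "Oct 21 13:31:01 hsoakfiw01c kernel: FW-I: IN=eth1 OUT= "
    "MAC=00:0c:29:9c:88:9b:00:13:c3:d7:a3:68:08:00 SRC=189.162.111.146 "
    "DST=208.46.23.38 LEN=40 TOS=0x00 PREC=0x00 TTL=241 ID=16396 DF PROTO=TCP "
    "SPT=3552 DPT=80 WINDOW=0 RES=0x00 ACK RST URGP=0"
)


def parse_log_line(line):
    """Split a LOG line into a dict of KEY=value fields and a set of bare flags."""
    # Drop the timestamp, hostname and log prefix; packet fields start at IN=.
    body = line[line.index("IN="):]
    fields, flags = {}, set()
    for token in body.split():
        if "=" in token:
            key, _, value = token.partition("=")
            fields[key] = value
        else:
            flags.add(token)  # e.g. DF, ACK, PSH, RST
    return fields, flags


def has_out_interface(fields):
    """True when OUT= is set, i.e. the packet was being routed onward."""
    return bool(fields.get("OUT"))


if __name__ == "__main__":
    for line in (FORWARD_HIT, INPUT_HIT):
        fields, flags = parse_log_line(line)
        print(fields["SRC"], "->", fields["DST"],
              "forwarded" if has_out_interface(fields) else "local",
              sorted(flags))
```

Run on the two samples, this shows the first packet with the translated destination and an outgoing interface, and the second with the original public destination, no outgoing interface, and the ACK and RST flags set.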
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Ramdom NAT drop
2009-10-21 20:33 ` Gary Smith
@ 2009-10-21 21:19 ` Mart Frauenlob
2009-10-21 21:52 ` Gary Smith
0 siblings, 1 reply; 9+ messages in thread
From: Mart Frauenlob @ 2009-10-21 21:19 UTC (permalink / raw)
To: netfilter
From a thread a few days ago, 'Re: FIN packets not getting NAT-ed',
Pascal Hambourg answered:
Dhyanesh Ramaiya a écrit :
> >
> > I have setup a Linux firewall on the edge of the network and doing SNAT for
> > internal IPs. When I sniff on external interface for internal source IPs,I
> > am seeing FIN packets from internal IPs going out without being NAT-ed.
>
> These packets are probably classified in the INVALID state by the
> connection tracking. Such packets are ignored by the NAT. A reason may
> be that they belong to old connections the connection tracking has
> forgotten about or considers already closed.
>
> Does your ruleset DROP outgoing packets in the INVALID state ?
Maybe it's the same thing just with DNAT at your end.
^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: Ramdom NAT drop
2009-10-21 21:19 ` Mart Frauenlob
@ 2009-10-21 21:52 ` Gary Smith
0 siblings, 0 replies; 9+ messages in thread
From: Gary Smith @ 2009-10-21 21:52 UTC (permalink / raw)
To: 'netfilter@vger.kernel.org'
> > > I have setup a Linux firewall on the edge of the network and doing SNAT for
> > > internal IPs. When I sniff on external interface for internal source IPs, I
> > > am seeing FIN packets from internal IPs going out without being NAT-ed.
> >
> > These packets are probably classified in the INVALID state by the
> > connection tracking. Such packets are ignored by the NAT. A reason may
> > be that they belong to old connections the connection tracking has
> > forgotten about or considers already closed.
> >
> > Does your ruleset DROP outgoing packets in the INVALID state ?
>
> Maybe it's the same thing just with DNAT at your end.
We don't drop any outgoing packets. I'll have to look for that thread and see how it might apply here.
My understanding is that any packets that aren't matched in conntrack are sent through the NAT chain, and if there is no match then the packet must be local and therefore goes to the INPUT chain; otherwise it goes to the FORWARD chain. So packets should always go through FORWARD if they are either in conntrack with a NAT destination or are processed by NAT and matched with a destination.
LOGIC:
1) Conntrack with nat, yes, forward, otherwise input
2) No conntrack, check nat, if nat, add conntrack, then forward, otherwise input
Either case should result in a forward.
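The two cases above can be written down as a toy model, extended with the INVALID case from the quoted explanation (such packets are ignored by NAT). This is only a sketch of the reasoning, not kernel code: the function and table names are hypothetical, and it assumes the firewall itself owns the public address, which the INPUT log hit suggests.

```python
# Toy model of the decision described above. NOT how the kernel implements it.
DNAT_RULES = {"208.46.23.38": "10.80.65.38"}  # from the PREROUTING rule
LOCAL_ADDRESSES = {"208.46.23.38"}            # assumption: firewall owns this IP


def route(dst, conntrack_state, known_mapping=None):
    """Return (final_dst, chain) for one inbound packet.

    conntrack_state: "ESTABLISHED", "NEW" or "INVALID".
    known_mapping:   translated address held by an existing conntrack entry.
    """
    if conntrack_state == "ESTABLISHED" and known_mapping:
        dst = known_mapping             # case 1: reuse the stored translation
    elif conntrack_state == "NEW":
        dst = DNAT_RULES.get(dst, dst)  # case 2: consult the nat rules
    # INVALID: per the quoted explanation, NAT is skipped and dst is unchanged.
    chain = "INPUT" if dst in LOCAL_ADDRESSES else "FORWARD"
    return dst, chain


if __name__ == "__main__":
    print(route("208.46.23.38", "ESTABLISHED", "10.80.65.38"))
    print(route("208.46.23.38", "NEW"))
    print(route("208.46.23.38", "INVALID"))
```

In this model the first two calls end in FORWARD, as expected, while the INVALID packet keeps its public destination and lands in INPUT, which would be consistent with the ACK RST hit logged there.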
Can any of the devs confirm this logic?
^ permalink raw reply [flat|nested] 9+ messages in thread