* Bizarre NAT behavior
@ 2011-06-15 2:03 Greg Scott
2011-06-23 15:17 ` Greg Scott
0 siblings, 1 reply; 10+ messages in thread
From: Greg Scott @ 2011-06-15 2:03 UTC (permalink / raw)
To: netfilter; +Cc: Lynn Hanson, Joe Whalen
I ran into a bizarre NAT problem recently. I have a firewall with eth0
and eth1 bridged using device br0. This site hosts a few publicly
visible web and ftp sites. These are all accessible across the Internet
as they should be.
For internal users accessing these sites using public IP Addresses, I
MASQUERADE the request and also DNAT it. This has worked for several
years - but broke recently when I put in a firewall upgrade using kernel
2.6.35.6-48.fc14.i686.PAE. Identical ruleset from the old and new, just
a newer kernel with Fedora 14.
Here's the really weird part - it all works when I watch it with
tcpdump. The website has a public IP Address (obfuscated here) of
1.2.115.151. This NATs to private IP Address 192.168.10.8. When a user
in the 192.168.10.nnn subnet tries to access the website at its public
IP Address, nothing happens. But when I do this:
[root@ehac-fw2011 ~]# /usr/sbin/tcpdump -i br0 host 1.2.115.151 -nn
Now that user can see the website. This works for a few minutes after I
terminate tcpdump until the TCP connection goes away. I can reproduce
the problem at will - am I looking at a kernel bug? How weird is it that
the problem stops when I watch the packets? Some kind of timing glitch?
Thanks
- Greg Scott
* RE: Bizarre NAT behavior
2011-06-15 2:03 Bizarre NAT behavior Greg Scott
@ 2011-06-23 15:17 ` Greg Scott
2011-06-23 15:28 ` Jan Engelhardt
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Greg Scott @ 2011-06-23 15:17 UTC (permalink / raw)
To: netfilter
Wow, more than a week later and silence from everyone - am I on my own
with this problem? Why would NATing in both PREROUTING and POSTROUTING
work **only** when I watch it with tcpdump and not work otherwise?
Surely I can't be the only one seeing this problem?
- Greg Scott
--
To unsubscribe from this list: send the line "unsubscribe netfilter" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* RE: Bizarre NAT behavior
2011-06-23 15:17 ` Greg Scott
@ 2011-06-23 15:28 ` Jan Engelhardt
2011-06-23 16:28 ` Greg Scott
2011-06-23 20:49 ` Steven Kath
2 siblings, 0 replies; 10+ messages in thread
From: Jan Engelhardt @ 2011-06-23 15:28 UTC (permalink / raw)
To: Greg Scott; +Cc: netfilter
On Thursday 2011-06-23 17:17, Greg Scott wrote:
>Wow, more than a week later and silence from everyone - am I on my own
>with this problem? Why would NATing in both PREROUTING and POSTROUTING
>work **only** when I watch it with tcpdump and not work otherwise?
>Surely I can't be the only one seeing this problem?
Potential explanation: Packets being dropped inside the routing core due
to violation of RP filter. [tcpdump hooks in very early and thus can
still see stuff.]
* RE: Bizarre NAT behavior
2011-06-23 15:17 ` Greg Scott
2011-06-23 15:28 ` Jan Engelhardt
@ 2011-06-23 16:28 ` Greg Scott
2011-06-23 17:00 ` Payam Chychi
2011-06-23 20:49 ` Steven Kath
2 siblings, 1 reply; 10+ messages in thread
From: Greg Scott @ 2011-06-23 16:28 UTC (permalink / raw)
To: netfilter
> Why would NATing in both PREROUTING and POSTROUTING
> work **only** when I watch it with tcpdump and not work otherwise?
I should be more clear. The problem is with internal users looking at
internally hosted web and ftp sites using the public IP Addresses. The
way you do this is, DNAT the packet in PREROUTING and then MASQUERADE
the packet in POSTROUTING. The technique is documented in a howto
someplace and I've been doing it for several years at several sites.
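For readers following along, here is a minimal sketch of that two-rule setup. It is not the actual ruleset from this firewall: the /24 mask, the port 80 match, and the names run and hairpin_rules are assumptions for illustration, and the addresses are the obfuscated ones from the first message. By default it only prints the commands; set APPLY=1 (as root) to execute them.

```shell
#!/bin/sh
# Sketch only: hairpin NAT for LAN clients that reach an internal server
# through its public address. LAN, PUB and SRV defaults are assumptions.
LAN=${LAN:-192.168.10.0/24}   # assumed mask for the 192.168.10.nnn subnet
PUB=${PUB:-1.2.115.151}       # public address (obfuscated in this thread)
SRV=${SRV:-192.168.10.8}      # private address of the web server

# Print each command; execute it only when APPLY=1 (needs root and iptables).
run() {
    echo "$*"
    if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

hairpin_rules() {
    # 1. Rewrite the destination before the routing decision.
    run iptables -t nat -A PREROUTING -s "$LAN" -d "$PUB" -p tcp --dport 80 \
        -j DNAT --to-destination "$SRV"
    # 2. Rewrite the source so the server replies via the firewall instead
    #    of answering the client directly on the LAN.
    run iptables -t nat -A POSTROUTING -s "$LAN" -d "$SRV" -p tcp --dport 80 \
        -j MASQUERADE
}

hairpin_rules
```

Without the second rule the server would see the client's real LAN address and answer it directly, so the client would receive a reply from 192.168.10.8 rather than from the public address it contacted.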
At this particular site, all worked fine until I replaced the old
firewall with a new one. Now it only works properly when I watch the
conversation with tcpdump. I'm not making this up.
- Greg Scott
* Re: Bizarre NAT behavior
2011-06-23 16:28 ` Greg Scott
@ 2011-06-23 17:00 ` Payam Chychi
2011-06-23 18:23 ` Greg Scott
0 siblings, 1 reply; 10+ messages in thread
From: Payam Chychi @ 2011-06-23 17:00 UTC (permalink / raw)
To: Greg Scott, netfilter
Why are you not natting to internal ip space? Better question, if you're
using NAT why are you not using internal ip inside your network and
natting to external ip on the egress? Just trying to see why you've
chosen this topology over others.
On 6/23/11, Greg Scott <GregScott@infrasupport.com> wrote:
>> Why would NATing in both PREROUTING and POSTROUTING
>> work **only** when I watch it with tcpdump and not work otherwise?
>
> I should be more clear. The problem is with internal users looking at
> internally hosted web and ftp sites using the public IP Addresses. The
> way you do this is, DNAT the packet in PREROUTING and then MASQUERADE
> the packet in POSTROUTING. The technique is documented in a howto
> someplace and I've been doing it for several years at several sites.
>
> At this particular site, all worked fine until I replaced the old
> firewall with a new one. Now it only works properly when I watch the
> conversation the tcpdump. I'm not making this up.
>
> - Greg Scott
--
Sent from my mobile device
Payam Tarverdyan Chychi
Network Security Specialist / Network Engineer
* RE: Bizarre NAT behavior
2011-06-23 17:00 ` Payam Chychi
@ 2011-06-23 18:23 ` Greg Scott
2011-07-08 20:39 ` Greg Scott
0 siblings, 1 reply; 10+ messages in thread
From: Greg Scott @ 2011-06-23 18:23 UTC (permalink / raw)
To: Payam Chychi, netfilter
Because I want internal users of the website to have the same experience
as external users. There's an external DNS name, www.{myname}.org and
it translates to a public IP Address. I want my internal users to
resolve that name from the public, external DNS server, which means
they'll hit the public IP Address. I know I can reproduce that public
facing DNS zone on a private DNS server using private IP Addresses, but
I would prefer that internal users have an identical experience as the
rest of the world. And that means I need to do both SNAT and DNAT at
the firewall for these.
Why do I want the identical experience? Because this helps with ongoing
website updates and testing. If the internal experience is even a tiny
bit different, my users could do something to the website and the rest
of the world may see it differently than internal users.
The technique to do this is documented and worked nicely for several
years. It still works even now, but only when I watch it with tcpdump.
So I don't see how it could be a problem with my ruleset. I am even
more worried about what will happen when I upgrade other sites to newer
kernels.
- Greg Scott
* RE: Bizarre NAT behavior
2011-06-23 15:17 ` Greg Scott
2011-06-23 15:28 ` Jan Engelhardt
2011-06-23 16:28 ` Greg Scott
@ 2011-06-23 20:49 ` Steven Kath
2 siblings, 0 replies; 10+ messages in thread
From: Steven Kath @ 2011-06-23 20:49 UTC (permalink / raw)
To: Greg Scott; +Cc: netfilter
On Thu, 2011-06-23 at 10:17 -0500, Greg Scott wrote:
> Why would NATing in both PREROUTING and POSTROUTING
> work **only** when I watch it with tcpdump and not work otherwise?
tcpdump by default will put the interface into promiscuous mode, so that
it will not automatically discard frames with a unicast ethernet
destination address which does not match the MAC address of the
interface. If traffic passes with tcpdump running but not without it,
it's likely related to the destination ethernet addresses. That would
be a layer 2/bridging problem more than a NAT/iptables problem.
If promiscuous mode is the factor that allows traffic to pass, a cheap
hack would be to force the interface into promiscuous mode without
tcpdump with "ip link set <dev> promisc on"
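A small sketch for checking whether a device is currently promiscuous, assuming the sysfs flags file exposes the raw device flags (IFF_PROMISC is bit 0x100). The SYSFS_NET variable and the function name is_promisc are illustrative, not part of any tool. I believe this reflects promiscuity requested by a capture socket as well, which the PROMISC flag printed by "ip link" may not, but treat that as an assumption to verify on your kernel.

```shell
#!/bin/sh
# Sketch: report whether a device is in promiscuous mode by testing the
# IFF_PROMISC bit (0x100) in /sys/class/net/<dev>/flags.
# SYSFS_NET is an assumed override hook so the function can be tried on a copy.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

is_promisc() {
    dev=$1
    flags=$(cat "$SYSFS_NET/$dev/flags") || return 2
    # flags is a hex string such as 0x1003; shell arithmetic accepts 0x...
    [ $(( $flags & 0x100 )) -ne 0 ]
}

# Usage: is_promisc br0 && echo "br0 is promiscuous"
```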
I'd gather this information to try to understand the problem better:
tcpdump -e -i <dev> [filters...]
(-e: Print the link-level header on each dump line.)
tcpdump -e -i <dev> -p [filters...]
(-p: Don't put the interface into promiscuous mode.)
If frames are visible when running in promiscuous mode which aren't
visible when running with -p, note the destination ethernet address of
those frames and compare it against the outputs from "ip link" and
"brctl showmacs <brdev>". They're likely coming in a port which
considers that destination address foreign.
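To make that comparison easier, the following sketch prints the MAC address of a bridge and of each port attached to it, read from sysfs. The function name bridge_macs and the SYSFS_NET override are hypothetical; the brif/ directory and address files are the standard sysfs entries for bridges and network devices.

```shell
#!/bin/sh
# Sketch: print "<device> <mac>" for a bridge and each of its ports, so a
# frame's destination address (from tcpdump -e) can be compared against them.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}   # assumed override hook for testing

bridge_macs() {
    br=$1
    printf '%s %s\n' "$br" "$(cat "$SYSFS_NET/$br/address")"
    # brif/ holds one entry per port attached to the bridge.
    for port in "$SYSFS_NET/$br/brif"/*; do
        [ -e "$port" ] || continue
        name=${port##*/}
        printf '%s %s\n' "$name" "$(cat "$SYSFS_NET/$name/address")"
    done
}

# Usage: bridge_macs <brdev> | grep -i <destination-mac-from-tcpdump>
```

If the destination address of the dropped frames matches a port other than the one they arrive on, that points at the bridging layer rather than at the NAT rules.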
* RE: Bizarre NAT behavior
2011-06-23 18:23 ` Greg Scott
@ 2011-07-08 20:39 ` Greg Scott
2011-07-08 22:29 ` Greg Scott
0 siblings, 1 reply; 10+ messages in thread
From: Greg Scott @ 2011-07-08 20:39 UTC (permalink / raw)
To: netfilter
Cc: Payam Chychi, Jan Engelhardt, Steven Kath, Lynn Hanson,
Joe Whalen
This took me a while to get back and troubleshoot and I still don't
understand what's going on. This is the problem with internal users
addressing a website with its external IP Address, doing SNAT and DNAT
over a Linux bridge. The last discussion was on 6/23/2011.
From Jan Engelhardt:
> Potential explanation: Packets being dropped inside the routing core
> due to violation of RP filter. [tcpdump hooks in very early and thus can
> still see stuff.]
I don't think it's an rp_filter problem - see below.
And from Steven Kath:
> I'd gather this information to try to understand the problem better:
>
> tcpdump -e -i <dev> [filters...]
> (-e: Print the link-level header on each dump line.)
>
> tcpdump -e -i <dev> -p [filters...]
> (-p: Don't put the interface into promiscuous mode.)
Steve's suggestion helped peel back another onion layer. Bridge br0
bridges physical interfaces eth1 and eth0. Physical interface eth1 is
on the private LAN side, eth0 on the public Internet side.
For my testing, from a CMD window on the server hosting the website, I
do:
telnet aa.bb.115.151 80 (aa.bb obfuscated first 2 IP Address octets),
and then watch with tcpdump on the firewall.
Also on the firewall, I did this:
echo "0" > /proc/sys/net/ipv4/conf/eth0/rp_filter
echo "0" > /proc/sys/net/ipv4/conf/eth1/rp_filter
echo "0" > /proc/sys/net/ipv4/conf/br0/rp_filter
So that turned off any possible rp_filtering.
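One caveat worth checking: the three commands above leave conf/all/rp_filter untouched. The kernel's ip-sysctl documentation for recent kernels says the larger of conf/all and conf/<interface> is the value used for source validation, so a non-zero "all" could keep filtering active. The sketch below prints that effective value per interface; the function name effective_rp_filter and the CONF override are hypothetical.

```shell
#!/bin/sh
# Sketch: show the rp_filter value that applies to each interface, taking the
# larger of conf/all and conf/<if>. CONF is an assumed override for testing.
CONF=${CONF:-/proc/sys/net/ipv4/conf}

effective_rp_filter() {
    all=$(cat "$CONF/all/rp_filter")
    for dir in "$CONF"/*/; do
        name=${dir%/}
        name=${name##*/}
        # "all" and "default" are not real interfaces.
        case $name in all|default) continue ;; esac
        val=$(cat "${dir}rp_filter")
        if [ "$all" -gt "$val" ]; then val=$all; fi
        printf '%s %s\n' "$name" "$val"
    done
}

# Usage: effective_rp_filter
```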
And then on the firewall:
/usr/sbin/tcpdump -p -e -i br0 host aa.bb.115.151 -nn -vv
while the telnet was running in another window.
Nothing - no output, no matter what value I use for any of the rp_filter
files. Nothing from my telnet session to port 80 hits bridge br0 on the
firewall.
But here's the curious part - looking at physical interface eth1, I see
these packets when I do the same telnet test:
[root@ehac-fw2011 ~]# /usr/sbin/tcpdump -p -e -i eth1 host aa.bb.115.151 -nn -vv
tcpdump: WARNING: eth1: no IPv4 address assigned
tcpdump: listening on eth1, link-type EN10MB (Ethernet), capture size 65535 bytes
14:51:33.412280 00:0f:20:f7:06:18 > 00:03:47:3a:59:79, ethertype IPv4 (0x0800), length 62: (tos 0x0, ttl 128, id 18631, offset 0, flags [DF], proto TCP (6), length 48)
    192.168.10.2.54092 > aa.bb.115.151.80: Flags [S], cksum 0xddb2 (correct), seq 4146878900, win 65535, options [mss 1460,nop,nop,sackOK], length 0
14:51:39.427928 00:0f:20:f7:06:18 > 00:03:47:3a:59:79, ethertype IPv4 (0x0800), length 62: (tos 0x0, ttl 128, id 18733, offset 0, flags [DF], proto TCP (6), length 48)
    192.168.10.2.54092 > aa.bb.115.151.80: Flags [S], cksum 0xddb2 (correct), seq 4146878900, win 65535, options [mss 1460,nop,nop,sackOK], length 0
^C
2 packets captured
2 packets received by filter
0 packets dropped by kernel
[root@ehac-fw2011 ~]#
Here's something else curious - looking at "ip link show", it looks like
bridge br0 takes on the MAC address of physical NIC eth0. But the
internal LAN is connected to physical eth1. I wonder if this behavior
is different than the older version? If the MAC Address for bridge br0
is different than the physical device I'm actually connected to, I
wonder if bridging "thinks" I'm trying to hit a foreign MAC Address?
[root@ehac-fw2011 ~]# ip link show eth1
4: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0d:88:31:d8:24 brd ff:ff:ff:ff:ff:ff
[root@ehac-fw2011 ~]# ip link show eth0
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:03:47:3a:59:79 brd ff:ff:ff:ff:ff:ff
[root@ehac-fw2011 ~]#
[root@ehac-fw2011 ~]# ip link show br0
5: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc prio state UNKNOWN
    link/ether 00:03:47:3a:59:79 brd ff:ff:ff:ff:ff:ff
[root@ehac-fw2011 ~]# brctl show macs br0
bridge name     bridge id               STP enabled     interfaces
br0             8000.0003473a5979       no              eth0
                                                        eth1
Hmmmm - so a packet comes in on eth1, with a destination MAC Address
belonging to port eth0. So eth1 throws it away because it "thinks" this
is a foreign MAC Address? But this all worked before, so what's
different? Or were earlier bridges in promiscuous mode by default and
now they're not? Have I stumbled across a new bridging bug? Is the
best workaround to just put br0 into promiscuous mode?
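The brctl listing above is the bridge and port summary rather than the per-MAC forwarding table that "brctl showmacs <brdev>" prints. As a sketch of the check Steven suggested, the helper below looks up one MAC in showmacs output and prints its port number and whether the bridge considers it local. It assumes the usual four-column layout (port no, mac addr, is local?, ageing timer); the function name fdb_lookup is hypothetical.

```shell
#!/bin/sh
# Sketch: look up one MAC in `brctl showmacs <bridge>` output read from stdin
# and print "<port-no> <is-local>". Assumes the four-column layout:
#   port no / mac addr / is local? / ageing timer
fdb_lookup() {
    # Normalise the requested MAC to lower case before comparing.
    mac=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    # NR > 1 skips the header line.
    awk -v m="$mac" 'NR > 1 && tolower($2) == m { print $1, $3 }'
}

# Usage (needs bridge-utils):
#   brctl showmacs br0 | fdb_lookup 00:03:47:3a:59:79
```

An entry marked local on a port other than the one the frame arrived on would be consistent with the frame being handed to the bridge device only when something forces it to accept foreign-looking addresses.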
Thanks
- Greg Scott
* RE: Bizarre NAT behavior
2011-07-08 20:39 ` Greg Scott
@ 2011-07-08 22:29 ` Greg Scott
2011-07-22 4:53 ` Greg Scott
0 siblings, 1 reply; 10+ messages in thread
From: Greg Scott @ 2011-07-08 22:29 UTC (permalink / raw)
To: netfilter
Cc: Payam Chychi, Jan Engelhardt, Steven Kath, Lynn Hanson,
Joe Whalen
One more piece of data:
On the firewall, I did:
ip link set br0 promisc on
and now my telnet on port 80 test connects, as does a real browser. For
now, I'll just put this in my rc.firewall script as a workaround unless
and until a better answer comes along.
- Greg Scott
* RE: Bizarre NAT behavior
2011-07-08 22:29 ` Greg Scott
@ 2011-07-22 4:53 ` Greg Scott
0 siblings, 0 replies; 10+ messages in thread
From: Greg Scott @ 2011-07-22 4:53 UTC (permalink / raw)
To: netfilter
Cc: Payam Chychi, Jan Engelhardt, Steven Kath, Lynn Hanson,
Joe Whalen
I took this up with the netdev folks per a suggestion here, but haven't
gotten very far. The netdev guys told me to go bug Red Hat. So I just
did a Red Hat bugzilla report, bug number 724862.
The problem took another unexpected twist. I learned last night - when
I do:
ip link set br0 promisc on
This breaks both inbound and outbound PPTP VPNs. I have a NATed PPTP
server and firewall rules to DNAT TCP 1723 and IP Protocol 47 to my PPTP
server and SNAT IP Protocol 47 from my PPTP server. And I am using
ip_nat_pptp and ip_conntrack_pptp. Watching with tcpdump when I try an
inbound PPTP connection, I see an unending storm of packets for
several minutes, until the PPTP server and remote user time out. I
think what happens is that the br0 bridge forwards the packets to the wrong
physical interface when in promisc mode.
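For context, a sketch of what rules of that shape typically look like. This is not the ruleset from this site: PUB and PPTP_SRV are placeholder addresses, and run and pptp_rules are hypothetical helper names. It prints the commands by default and only executes them when APPLY=1 is set.

```shell
#!/bin/sh
# Sketch of PPTP NAT rules as described above. PUB and PPTP_SRV are
# placeholders (documentation/private addresses), not values from this site.
PUB=${PUB:-203.0.113.10}
PPTP_SRV=${PPTP_SRV:-192.168.10.20}

# Print each command; execute it only when APPLY=1 (needs root and iptables).
run() {
    echo "$*"
    if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

pptp_rules() {
    # Control channel: TCP 1723 to the internal PPTP server.
    run iptables -t nat -A PREROUTING -d "$PUB" -p tcp --dport 1723 \
        -j DNAT --to-destination "$PPTP_SRV"
    # Data channel: GRE is IP protocol 47.
    run iptables -t nat -A PREROUTING -d "$PUB" -p 47 \
        -j DNAT --to-destination "$PPTP_SRV"
    # Outbound GRE from the server leaves with the public address.
    run iptables -t nat -A POSTROUTING -s "$PPTP_SRV" -p 47 \
        -j SNAT --to-source "$PUB"
}

pptp_rules
```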
When I do:
ip link set br0 promisc off
now both inbound and outbound PPTP VPNs work as expected. But, of
course, this breaks my "router on a stick" rules, which was the original
bizarre NAT behavior I noticed and documented a while ago. So I'm kind
of back to square one.
- Greg Scott