[PATCH 1/1] bridge: detect NAT66 correctly and change MAC address

netfilter-devel.vger.kernel.org archive mirror

* [PATCH 1/1] bridge: detect NAT66 correctly and change MAC address
@ 2014-12-05 21:12 Bernhard Thaler
  2014-12-23 14:03 ` Pablo Neira Ayuso
  0 siblings, 1 reply; 7+ messages in thread
From: Bernhard Thaler @ 2014-12-05 21:12 UTC
  To: pablo; +Cc: netfilter-devel, sven, Bernhard Thaler

With IPv4, any traffic crossing a bridge can be redirected to the local
machine using iptables.

$ sysctl -w net.bridge.bridge-nf-call-iptables=1
$ iptables -t nat -A PREROUTING -p tcp -m tcp --dport 8080 \
  -j REDIRECT --to-ports 81

This didn't work with ip6tables because the redirect was not detected
correctly. The bridge pre-routing (finish) netfilter hook has to check for a
possible redirect and then fix the destination MAC address. This makes it
possible to use ip6tables rules for local DNAT REDIRECT in the same way as
with IPv4.

$ sysctl -w net.bridge.bridge-nf-call-ip6tables=1
$ ip6tables -t nat -A PREROUTING -p tcp -m tcp --dport 8080 \
  -j REDIRECT --to-ports 81

Signed-off-by: Sven Eckelmann <sven@open-mesh.com>
[bernhard.thaler@wvnet.at: rebased, adjust function order]
Signed-off-by: Bernhard Thaler <bernhard.thaler@wvnet.at>
---
taken from http://marc.info/?l=netfilter-devel&m=140837298123062&w=2
rebased to apply to current net-next
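
As a rough illustration of the detection step described in the commit message
(a userspace sketch, not kernel code): the original destination address is
saved before the ip6tables PREROUTING hook runs and compared afterwards; if it
changed and the packet is no longer bridged, the destination MAC is rewritten
to the bridge's own address. Every toy_* name is hypothetical and only mirrors
store_orig_dstaddr6() and dnat_took_place6() from this patch. The
stays_on_bridge flag is an assumption standing in for the route lookup result.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_ETH_ALEN 6

/* Toy stand-in for per-packet state; this is not the kernel's sk_buff. */
struct toy_pkt {
	uint8_t h_dest[TOY_ETH_ALEN];	/* destination MAC of the frame */
	struct in6_addr daddr;		/* current IPv6 destination */
	struct in6_addr orig_daddr;	/* copy taken before PREROUTING */
};

/* Mirrors store_orig_dstaddr6(): remember the pre-NAT destination. */
static void toy_store_orig_daddr(struct toy_pkt *p)
{
	assert(p != NULL);
	p->orig_daddr = p->daddr;
}

/* Mirrors dnat_took_place6(): true if the destination was rewritten. */
static bool toy_dnat_took_place(const struct toy_pkt *p)
{
	assert(p != NULL);
	return memcmp(&p->orig_daddr, &p->daddr, sizeof(p->daddr)) != 0;
}

/* Rewrite the destination MAC to the bridge's own address when DNAT changed
 * the destination and the packet has to go up the stack instead of being
 * bridged. Returns true if the MAC was rewritten. */
static bool toy_fix_dest_mac(struct toy_pkt *p,
			     const uint8_t bridge_mac[TOY_ETH_ALEN],
			     bool stays_on_bridge)
{
	assert(p != NULL && bridge_mac != NULL);
	if (!toy_dnat_took_place(p) || stays_on_bridge)
		return false;
	memcpy(p->h_dest, bridge_mac, TOY_ETH_ALEN);
	return true;
}
```

The sketch leaves out the bridged case, where the patch hands the packet to
br_nf_pre_routing_finish_bridge to resolve the MAC address instead.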

 include/linux/netfilter_bridge.h |    2 +
 net/bridge/br_netfilter.c        |  126 +++++++++++++++++++++++++++++---------
 net/ipv6/route.c                 |    1 +
 3 files changed, 100 insertions(+), 29 deletions(-)

diff --git a/include/linux/netfilter_bridge.h b/include/linux/netfilter_bridge.h
index c755e49..9150e8a 100644
--- a/include/linux/netfilter_bridge.h
+++ b/include/linux/netfilter_bridge.h
@@ -2,6 +2,7 @@
 #define __LINUX_BRIDGE_NETFILTER_H
 
 #include <uapi/linux/netfilter_bridge.h>
+#include <uapi/linux/in6.h>
 
 
 enum nf_br_hook_priorities {
@@ -107,6 +108,7 @@ static inline unsigned int nf_bridge_pad(const struct sk_buff *skb)
 struct bridge_skb_cb {
 	union {
 		__be32 ipv4;
+		struct in6_addr ipv6;
 	} daddr;
 };
 
diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
index c190d22..73ea96a 100644
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -18,6 +18,7 @@
 #include <linux/kernel.h>
 #include <linux/slab.h>
 #include <linux/ip.h>
+#include <linux/ipv6.h>
 #include <linux/netdevice.h>
 #include <linux/skbuff.h>
 #include <linux/if_arp.h>
@@ -30,11 +31,13 @@
 #include <linux/netfilter_ipv6.h>
 #include <linux/netfilter_arp.h>
 #include <linux/in_route.h>
+#include <linux/ipv6_route.h>
 #include <linux/inetdevice.h>
 
 #include <net/ip.h>
 #include <net/ipv6.h>
 #include <net/route.h>
+#include <net/ip6_route.h>
 #include <net/netfilter/br_netfilter.h>
 
 #include <asm/uaccess.h>
@@ -45,8 +48,14 @@
 
 #define skb_origaddr(skb)	 (((struct bridge_skb_cb *) \
 				 (skb->nf_bridge->data))->daddr.ipv4)
+#define skb_origaddr6(skb)	 (((struct bridge_skb_cb *) \
+				 (skb->nf_bridge->data))->daddr.ipv6)
 #define store_orig_dstaddr(skb)	 (skb_origaddr(skb) = ip_hdr(skb)->daddr)
+#define store_orig_dstaddr6(skb) (skb_origaddr6(skb) = ipv6_hdr(skb)->daddr)
 #define dnat_took_place(skb)	 (skb_origaddr(skb) != ip_hdr(skb)->daddr)
+#define dnat_took_place6(skb)	 (memcmp(&skb_origaddr6(skb), \
+				 &ipv6_hdr(skb)->daddr, \
+				 sizeof(ipv6_hdr(skb)->daddr)) != 0)
 
 #ifdef CONFIG_SYSCTL
 static struct ctl_table_header *brnf_sysctl_header;
@@ -239,35 +248,6 @@ drop:
 	return -1;
 }
 
-/* PF_BRIDGE/PRE_ROUTING *********************************************/
-/* Undo the changes made for ip6tables PREROUTING and continue the
- * bridge PRE_ROUTING hook. */
-static int br_nf_pre_routing_finish_ipv6(struct sk_buff *skb)
-{
-	struct nf_bridge_info *nf_bridge = skb->nf_bridge;
-	struct rtable *rt;
-
-	if (nf_bridge->mask & BRNF_PKT_TYPE) {
-		skb->pkt_type = PACKET_OTHERHOST;
-		nf_bridge->mask ^= BRNF_PKT_TYPE;
-	}
-	nf_bridge->mask ^= BRNF_NF_BRIDGE_PREROUTING;
-
-	rt = bridge_parent_rtable(nf_bridge->physindev);
-	if (!rt) {
-		kfree_skb(skb);
-		return 0;
-	}
-	skb_dst_set_noref(skb, &rt->dst);
-
-	skb->dev = nf_bridge->physindev;
-	nf_bridge_update_protocol(skb);
-	nf_bridge_push_encap_header(skb);
-	NF_HOOK_THRESH(NFPROTO_BRIDGE, NF_BR_PRE_ROUTING, skb, skb->dev, NULL,
-		       br_handle_frame_finish, 1);
-
-	return 0;
-}
 
 /* Obtain the correct destination MAC address, while preserving the original
  * source MAC address. If we already know this address, we just copy it. If we
@@ -314,6 +294,93 @@ free_skb:
 	return 0;
 }
 
+/* PF_BRIDGE/PRE_ROUTING *********************************************/
+/* Undo the changes made for ip6tables PREROUTING and continue the
+ * bridge PRE_ROUTING hook. */
+
+/* This requires some explaining. If DNAT has taken place,
+ * we will need to fix up the destination Ethernet address.
+ *
+ * There are two cases to consider:
+ * 1. The packet was DNAT'ed to a device in the same bridge
+ *    port group as it was received on. We can still bridge
+ *    the packet.
+ * 2. The packet was DNAT'ed to a different device, either
+ *    a non-bridged device or another bridge port group.
+ *    The packet will need to be routed.
+ *
+ * The correct way of distinguishing between these two cases is to
+ * call ip6_route_input() and to look at skb->dst->dev, which is
+ * changed to the destination device if ip6_route_input() succeeds.
+ *
+ * Let's first consider the case that ip6_route_input() succeeds:
+ *
+ * If the output device equals the logical bridge device the packet
+ * came in on, we can consider this bridging. The corresponding MAC
+ * address will be obtained in br_nf_pre_routing_finish_bridge.
+ * Otherwise, the packet is considered to be routed and we just
+ * change the destination MAC address so that the packet will
+ * later be passed up to the IP stack to be routed. For a redirected
+ * packet, ip6_route_input() will give back the localhost as output device,
+ * which differs from the bridge device.
+ *
+ * Let's now consider the case that ip6_route_input() fails:
+ *
+ * This can be because the destination address is martian, in which case
+ * the packet will be dropped.
+ */
+static int br_nf_pre_routing_finish_ipv6(struct sk_buff *skb)
+{
+	struct nf_bridge_info *nf_bridge = skb->nf_bridge;
+	struct rtable *rt;
+	struct net_device *dev = skb->dev;
+
+	if (nf_bridge->mask & BRNF_PKT_TYPE) {
+		skb->pkt_type = PACKET_OTHERHOST;
+		nf_bridge->mask ^= BRNF_PKT_TYPE;
+	}
+	nf_bridge->mask ^= BRNF_NF_BRIDGE_PREROUTING;
+
+	if (dnat_took_place6(skb)) {
+		skb_dst_drop(skb);
+		ip6_route_input(skb);
+
+		if (skb_dst(skb)->error) {
+			kfree_skb(skb);
+			return 0;
+		}
+
+		if (skb_dst(skb)->dev == dev) {
+			skb->dev = nf_bridge->physindev;
+			nf_bridge_update_protocol(skb);
+			nf_bridge_push_encap_header(skb);
+			NF_HOOK_THRESH(NFPROTO_BRIDGE,
+				       NF_BR_PRE_ROUTING,
+				       skb, skb->dev, NULL,
+				       br_nf_pre_routing_finish_bridge,
+				       1);
+			return 0;
+		}
+		memcpy(eth_hdr(skb)->h_dest, dev->dev_addr, ETH_ALEN);
+		skb->pkt_type = PACKET_HOST;
+	} else {
+		rt = bridge_parent_rtable(nf_bridge->physindev);
+		if (!rt) {
+			kfree_skb(skb);
+			return 0;
+		}
+		skb_dst_set_noref(skb, &rt->dst);
+	}
+
+	skb->dev = nf_bridge->physindev;
+	nf_bridge_update_protocol(skb);
+	nf_bridge_push_encap_header(skb);
+	NF_HOOK_THRESH(NFPROTO_BRIDGE, NF_BR_PRE_ROUTING, skb, skb->dev, NULL,
+		       br_handle_frame_finish, 1);
+
+	return 0;
+}
+
 /* This requires some explaining. If DNAT has taken place,
  * we will need to fix up the destination Ethernet address.
  *
@@ -562,6 +629,7 @@ static unsigned int br_nf_pre_routing_ipv6(const struct nf_hook_ops *ops,
 	if (!setup_pre_routing(skb))
 		return NF_DROP;
 
+	store_orig_dstaddr6(skb);
 	skb->protocol = htons(ETH_P_IPV6);
 	NF_HOOK(NFPROTO_IPV6, NF_INET_PRE_ROUTING, skb, skb->dev, NULL,
 		br_nf_pre_routing_finish_ipv6);
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index c910831..91a82de 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1025,6 +1025,7 @@ void ip6_route_input(struct sk_buff *skb)
 
 	skb_dst_set(skb, ip6_route_input_lookup(net, skb->dev, &fl6, flags));
 }
+EXPORT_SYMBOL(ip6_route_input);
 
 static struct rt6_info *ip6_pol_route_output(struct net *net, struct fib6_table *table,
 					     struct flowi6 *fl6, int flags)
-- 
1.7.10.4


* Re: [PATCH 1/1] bridge: detect NAT66 correctly and change MAC address
  2014-12-05 21:12 [PATCH 1/1] bridge: detect NAT66 correctly and change MAC address Bernhard Thaler
@ 2014-12-23 14:03 ` Pablo Neira Ayuso
  2014-12-23 14:13   ` Pablo Neira Ayuso
  0 siblings, 1 reply; 7+ messages in thread
From: Pablo Neira Ayuso @ 2014-12-23 14:03 UTC
  To: Bernhard Thaler; +Cc: netfilter-devel, sven

On Fri, Dec 05, 2014 at 10:12:25PM +0100, Bernhard Thaler wrote:
> diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
> index c190d22..73ea96a 100644
> --- a/net/bridge/br_netfilter.c
> +++ b/net/bridge/br_netfilter.c
[...]
> +static int br_nf_pre_routing_finish_ipv6(struct sk_buff *skb)
> +{
> +	struct nf_bridge_info *nf_bridge = skb->nf_bridge;
> +	struct rtable *rt;
> +	struct net_device *dev = skb->dev;
> +
> +	if (nf_bridge->mask & BRNF_PKT_TYPE) {
> +		skb->pkt_type = PACKET_OTHERHOST;
> +		nf_bridge->mask ^= BRNF_PKT_TYPE;
> +	}
> +	nf_bridge->mask ^= BRNF_NF_BRIDGE_PREROUTING;

There is no fragmentation handling here. Actually, not your fault, the
original br_nf_pre_routing_finish_ipv6() doesn't consider this case.

I can take this patch, it doesn't do any worse than the existing code,
but probably you want to have a look at this.

Please, let me know. Thanks.

* Re: [PATCH 1/1] bridge: detect NAT66 correctly and change MAC address
  2014-12-23 14:03 ` Pablo Neira Ayuso
@ 2014-12-23 14:13   ` Pablo Neira Ayuso
  2015-01-09  0:05     ` Bernhard Thaler
  0 siblings, 1 reply; 7+ messages in thread
From: Pablo Neira Ayuso @ 2014-12-23 14:13 UTC
  To: Bernhard Thaler; +Cc: netfilter-devel, sven

On Tue, Dec 23, 2014 at 03:03:43PM +0100, Pablo Neira Ayuso wrote:
> On Fri, Dec 05, 2014 at 10:12:25PM +0100, Bernhard Thaler wrote:
> > diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
> > index c190d22..73ea96a 100644
> > --- a/net/bridge/br_netfilter.c
> > +++ b/net/bridge/br_netfilter.c
> [...]
> > +static int br_nf_pre_routing_finish_ipv6(struct sk_buff *skb)
> > +{
> > +	struct nf_bridge_info *nf_bridge = skb->nf_bridge;
> > +	struct rtable *rt;
> > +	struct net_device *dev = skb->dev;
> > +
> > +	if (nf_bridge->mask & BRNF_PKT_TYPE) {
> > +		skb->pkt_type = PACKET_OTHERHOST;
> > +		nf_bridge->mask ^= BRNF_PKT_TYPE;
> > +	}
> > +	nf_bridge->mask ^= BRNF_NF_BRIDGE_PREROUTING;
> 
> There is no fragmentation handling here. Actually, not your fault, the
> original br_nf_pre_routing_finish_ipv6() doesn't consider this case.
> 
> I can take this patch, it doesn't do any worse than the existing code,
> but probably you want to have a look at this.

A bit more info in case you have a look at this: br_netfilter fragmentation
handling is poorly designed. Basically, it may modify the original fragment
boundaries, and a bridge shouldn't do that. But this is how it has worked
for a long time.

* Re: [PATCH 1/1] bridge: detect NAT66 correctly and change MAC address
  2014-12-23 14:13   ` Pablo Neira Ayuso
@ 2015-01-09  0:05     ` Bernhard Thaler
  2015-03-18 21:52       ` [PATCHv2 1/4] " Bernhard Thaler
  0 siblings, 1 reply; 7+ messages in thread
From: Bernhard Thaler @ 2015-01-09  0:05 UTC
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, sven

Sorry for not getting back to you for so long.

Here is my current status:
	-> I am still looking into the "fragmentation problem"
	-> but I need some more time for a patch

However, after working on the issue I am more and more convinced that
you should take the first patch, "[PATCH 1/1] bridge: detect NAT66
correctly and change MAC address", which solves the NAT66 MAC address
problem, now. I will definitely split the fix for the "fragmentation
problem" into a separate patch.
The "fragmentation problem" is a different issue and, I fear, affects any
bridge setup even if it does not do any NAT66.

It may be obvious to some (to me it was not in the beginning),
but the "fragmentation problem" does not even need a NAT rule to be
configured in order to occur.

I configured a simple bridge like this:

modprobe br_netfilter
brctl addbr br0
brctl addif br0 eth0
brctl addif br0 eth2
ifconfig eth0 up
ifconfig eth2 up
ifconfig br0 up

with a node having an IPv6 address on each end.

I started a "ping6 -s 8000 fd01:2345:6789:1::2" between the two
nodes interconnected by this bridge. The fragmented packets get through
fine at first, but AS SOON AS I run

ip6tables -t nat -nvL

on the machine hosting the bridge, it stops transmitting fragmented
packets. I do not even need to add a single rule to trigger the problem.
A normal ping that needs no fragmentation, such as "ping6
fd01:2345:6789:1::2", still works through the bridge (tested with
net-next from 20150102).

This is why I think it is worth taking the NAT66 MAC address patch as it
is now, so that this completely separate NAT66 problem is solved.

I will work on the "fragmentation problem" independently, using a simple
setup / error description like the one above, as this will also make it
easier for anyone to review / test the fragmentation patch in the end.
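
For reference, a back-of-the-envelope sketch of why "ping6 -s 8000" exercises
the fragmentation path while a default ping6 does not. It assumes a 1500-byte
link MTU, which was not stated above, and the usual IPv6 header sizes. The
function name toy_ping6_fragments is made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_IPV6_HDR_LEN	40	/* fixed IPv6 header */
#define TOY_FRAG_HDR_LEN	8	/* IPv6 Fragment extension header */
#define TOY_ICMPV6_HDR_LEN	8	/* ICMPv6 echo header */

/* Number of packets needed to carry an ICMPv6 echo request with `payload`
 * data bytes over a link with the given MTU. Returns 1 when the packet
 * fits without fragmentation. */
static size_t toy_ping6_fragments(size_t payload, size_t mtu)
{
	size_t l4_len = TOY_ICMPV6_HDR_LEN + payload;
	size_t per_frag;

	assert(mtu >= 1280);	/* IPv6 minimum link MTU */

	if (TOY_IPV6_HDR_LEN + l4_len <= mtu)
		return 1;

	/* Every fragment except the last carries a multiple of 8 bytes. */
	per_frag = (mtu - TOY_IPV6_HDR_LEN - TOY_FRAG_HDR_LEN) & ~(size_t)7;

	return (l4_len + per_frag - 1) / per_frag;
}
```

Under these assumptions, toy_ping6_fragments(8000, 1500) gives 6 fragments
per echo request, while the default 56-byte payload fits in a single packet.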

Regards,
Bernhard

On 23.12.2014 15:13, Pablo Neira Ayuso wrote:
> On Tue, Dec 23, 2014 at 03:03:43PM +0100, Pablo Neira Ayuso wrote:
>> On Fri, Dec 05, 2014 at 10:12:25PM +0100, Bernhard Thaler wrote:
>>> diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
>>> index c190d22..73ea96a 100644
>>> --- a/net/bridge/br_netfilter.c
>>> +++ b/net/bridge/br_netfilter.c
>> [...]
>>> +static int br_nf_pre_routing_finish_ipv6(struct sk_buff *skb)
>>> +{
>>> +	struct nf_bridge_info *nf_bridge = skb->nf_bridge;
>>> +	struct rtable *rt;
>>> +	struct net_device *dev = skb->dev;
>>> +
>>> +	if (nf_bridge->mask & BRNF_PKT_TYPE) {
>>> +		skb->pkt_type = PACKET_OTHERHOST;
>>> +		nf_bridge->mask ^= BRNF_PKT_TYPE;
>>> +	}
>>> +	nf_bridge->mask ^= BRNF_NF_BRIDGE_PREROUTING;
>>
>> There is no fragmentation handling here. Actually, not your fault, the
>> original br_nf_pre_routing_finish_ipv6() doesn't consider this case.
>>
>> I can take this patch, it doesn't do any worse than the existing code,
>> but probably you want to have a look at this.
> 
> A bit more info in case you have a look at this: br_netfilter fragmentation
> handling is poorly designed. Basically, it may modify the original fragment
> boundaries, and a bridge shouldn't do that. But this is how it has worked
> for a long time.
> 

* [PATCHv2 1/4] bridge: detect NAT66 correctly and change MAC address
  2015-01-09  0:05     ` Bernhard Thaler
@ 2015-03-18 21:52       ` Bernhard Thaler
  2015-03-23 12:07         ` Pablo Neira Ayuso
  0 siblings, 1 reply; 7+ messages in thread
From: Bernhard Thaler @ 2015-03-18 21:52 UTC
  To: pablo, kadlec; +Cc: netfilter-devel, fw, Bernhard Thaler, Sven Eckelmann

With IPv4, any traffic crossing a bridge can be redirected to the local
machine using iptables.

$ sysctl -w net.bridge.bridge-nf-call-iptables=1
$ iptables -t nat -A PREROUTING -p tcp -m tcp --dport 8080 \
  -j REDIRECT --to-ports 81

This didn't work with ip6tables because the redirect was not detected
correctly. The bridge pre-routing (finish) netfilter hook has to check for a
possible redirect and then fix the destination MAC address. This makes it
possible to use ip6tables rules for local DNAT REDIRECT in the same way as
with IPv4.

$ sysctl -w net.bridge.bridge-nf-call-ip6tables=1
$ ip6tables -t nat -A PREROUTING -p tcp -m tcp --dport 8080 \
  -j REDIRECT --to-ports 81

Signed-off-by: Sven Eckelmann <sven@open-mesh.com>
[bernhard.thaler@wvnet.at: rebased, adjust function order]
[bernhard.thaler@wvnet.at: rebased, add indirect call to ip6_route_input]
Signed-off-by: Bernhard Thaler <bernhard.thaler@wvnet.at>
---
Patch revision history:

v2
* rebase again to davem's current net-next
* add indirect call to ip6_route_input via nf_ipv6_ops to avoid a
  direct dependency on ipv6.ko just because of function calls

v1
* rebase "bridge: Allow to redirect IPv6 traffic to local machine"
  to davem's current net-next
* adjust function order to avoid prototype for br_nf_pre_routing_finish_bridge
   
(v0)
* originally there were two patches solving this problem
* Patch from Sven Eckelmann was chosen to base solution on 
  see: bridge: Allow to redirect IPv6 traffic to local machine
  see: bridge: Fix NAT66ed IPv6 packets not being bridged correctly
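
The v2 change above (calling ip6_route_input() through nf_ipv6_ops rather
than directly) follows a common ops-table pattern. Below is a minimal
userspace sketch of that pattern, not the kernel implementation: all toy_*
names are hypothetical, RCU protection of the pointer is left out, and the
NULL check is a choice made for this sketch (the patch itself calls
v6ops->route_input() directly).

```c
#include <assert.h>
#include <stddef.h>

/* Toy packet; `routed` records whether a route lookup was performed. */
struct toy_pkt {
	int routed;
};

/* Analogous in spirit to struct nf_ipv6_ops: a table of callbacks that the
 * provider fills in, so the consumer needs no link-time symbol from it. */
struct toy_v6_ops {
	void (*route_input)(struct toy_pkt *pkt);
};

/* Stays NULL until the provider registers itself. */
static const struct toy_v6_ops *toy_v6_ops_ptr;

static void toy_register_v6_ops(const struct toy_v6_ops *ops)
{
	toy_v6_ops_ptr = ops;
}

/* Consumer side (the "bridge"): call through the table if it is present.
 * Returns 0 on success, -1 if no provider is registered. */
static int toy_bridge_route(struct toy_pkt *pkt)
{
	const struct toy_v6_ops *ops = toy_v6_ops_ptr;

	assert(pkt != NULL);
	if (!ops || !ops->route_input)
		return -1;
	ops->route_input(pkt);
	return 0;
}

/* Provider side (the "ipv6" module): the real lookup would happen here. */
static void toy_ip6_route_input(struct toy_pkt *pkt)
{
	pkt->routed = 1;
}

static const struct toy_v6_ops toy_ipv6_ops = {
	.route_input = toy_ip6_route_input,
};
```

With this layout the consumer only depends on the ops structure definition;
the provider can be absent, in which case toy_bridge_route() returns -1.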

 include/linux/netfilter_bridge.h |    2 +
 include/linux/netfilter_ipv6.h   |    1 +
 net/bridge/br_netfilter.c        |  128 +++++++++++++++++++++++++++++---------
 net/ipv6/netfilter.c             |    1 +
 4 files changed, 102 insertions(+), 30 deletions(-)

diff --git a/include/linux/netfilter_bridge.h b/include/linux/netfilter_bridge.h
index bb39113..419f3db 100644
--- a/include/linux/netfilter_bridge.h
+++ b/include/linux/netfilter_bridge.h
@@ -2,6 +2,7 @@
 #define __LINUX_BRIDGE_NETFILTER_H
 
 #include <uapi/linux/netfilter_bridge.h>
+#include <uapi/linux/in6.h>
 
 
 enum nf_br_hook_priorities {
@@ -57,6 +58,7 @@ static inline unsigned int nf_bridge_pad(const struct sk_buff *skb)
 struct bridge_skb_cb {
 	union {
 		__be32 ipv4;
+		struct in6_addr ipv6;
 	} daddr;
 };
 
diff --git a/include/linux/netfilter_ipv6.h b/include/linux/netfilter_ipv6.h
index 64dad1cc..e2d1969 100644
--- a/include/linux/netfilter_ipv6.h
+++ b/include/linux/netfilter_ipv6.h
@@ -25,6 +25,7 @@ void ipv6_netfilter_fini(void);
 struct nf_ipv6_ops {
 	int (*chk_addr)(struct net *net, const struct in6_addr *addr,
 			const struct net_device *dev, int strict);
+	void (*route_input)(struct sk_buff *skb);
 };
 
 extern const struct nf_ipv6_ops __rcu *nf_ipv6_ops;
diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
index b260a97..775d638 100644
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -45,8 +45,14 @@
 
 #define skb_origaddr(skb)	 (((struct bridge_skb_cb *) \
 				 (skb->nf_bridge->data))->daddr.ipv4)
+#define skb_origaddr6(skb)	 (((struct bridge_skb_cb *) \
+				 (skb->nf_bridge->data))->daddr.ipv6)
 #define store_orig_dstaddr(skb)	 (skb_origaddr(skb) = ip_hdr(skb)->daddr)
+#define store_orig_dstaddr6(skb) (skb_origaddr6(skb) = ipv6_hdr(skb)->daddr)
 #define dnat_took_place(skb)	 (skb_origaddr(skb) != ip_hdr(skb)->daddr)
+#define dnat_took_place6(skb)	 (memcmp(&skb_origaddr6(skb), \
+				 &ipv6_hdr(skb)->daddr, \
+				 sizeof(ipv6_hdr(skb)->daddr)) != 0)
 
 #ifdef CONFIG_SYSCTL
 static struct ctl_table_header *brnf_sysctl_header;
@@ -247,36 +253,6 @@ static void nf_bridge_update_protocol(struct sk_buff *skb)
 		skb->protocol = htons(ETH_P_PPP_SES);
 }
 
-/* PF_BRIDGE/PRE_ROUTING *********************************************/
-/* Undo the changes made for ip6tables PREROUTING and continue the
- * bridge PRE_ROUTING hook. */
-static int br_nf_pre_routing_finish_ipv6(struct sk_buff *skb)
-{
-	struct nf_bridge_info *nf_bridge = skb->nf_bridge;
-	struct rtable *rt;
-
-	if (nf_bridge->mask & BRNF_PKT_TYPE) {
-		skb->pkt_type = PACKET_OTHERHOST;
-		nf_bridge->mask ^= BRNF_PKT_TYPE;
-	}
-	nf_bridge->mask ^= BRNF_NF_BRIDGE_PREROUTING;
-
-	rt = bridge_parent_rtable(nf_bridge->physindev);
-	if (!rt) {
-		kfree_skb(skb);
-		return 0;
-	}
-	skb_dst_set_noref(skb, &rt->dst);
-
-	skb->dev = nf_bridge->physindev;
-	nf_bridge_update_protocol(skb);
-	nf_bridge_push_encap_header(skb);
-	NF_HOOK_THRESH(NFPROTO_BRIDGE, NF_BR_PRE_ROUTING, skb, skb->dev, NULL,
-		       br_handle_frame_finish, 1);
-
-	return 0;
-}
-
 /* Obtain the correct destination MAC address, while preserving the original
  * source MAC address. If we already know this address, we just copy it. If we
  * don't, we use the neighbour framework to find out. In both cases, we make
@@ -322,6 +298,97 @@ free_skb:
 	return 0;
 }
 
+/* PF_BRIDGE/PRE_ROUTING *********************************************/
+/* Undo the changes made for ip6tables PREROUTING and continue the
+ * bridge PRE_ROUTING hook.
+ */
+
+/* This requires some explaining. If DNAT has taken place,
+ * we will need to fix up the destination Ethernet address.
+ *
+ * There are two cases to consider:
+ * 1. The packet was DNAT'ed to a device in the same bridge
+ *    port group as it was received on. We can still bridge
+ *    the packet.
+ * 2. The packet was DNAT'ed to a different device, either
+ *    a non-bridged device or another bridge port group.
+ *    The packet will need to be routed.
+ *
+ * The correct way of distinguishing between these two cases is to
+ * call ip6_route_input() and to look at skb->dst->dev, which is
+ * changed to the destination device if ip6_route_input() succeeds.
+ * ip6_route_input() is called indirectly via v6ops->route_input to
+ * avoid direct dependency to ipv6.ko due to function calls.
+ *
+ * Let's first consider the case that ip6_route_input() succeeds:
+ *
+ * If the output device equals the logical bridge device the packet
+ * came in on, we can consider this bridging. The corresponding MAC
+ * address will be obtained in br_nf_pre_routing_finish_bridge.
+ * Otherwise, the packet is considered to be routed and we just
+ * change the destination MAC address so that the packet will
+ * later be passed up to the IP stack to be routed. For a redirected
+ * packet, ip6_route_input() will give back the localhost as output device,
+ * which differs from the bridge device.
+ *
+ * Let's now consider the case that ip6_route_input() fails:
+ *
+ * This can be because the destination address is martian, in which case
+ * the packet will be dropped.
+ */
+static int br_nf_pre_routing_finish_ipv6(struct sk_buff *skb)
+{
+	struct nf_bridge_info *nf_bridge = skb->nf_bridge;
+	struct rtable *rt;
+	struct net_device *dev = skb->dev;
+	const struct nf_ipv6_ops *v6ops = nf_get_ipv6_ops();
+
+	if (nf_bridge->mask & BRNF_PKT_TYPE) {
+		skb->pkt_type = PACKET_OTHERHOST;
+		nf_bridge->mask ^= BRNF_PKT_TYPE;
+	}
+	nf_bridge->mask ^= BRNF_NF_BRIDGE_PREROUTING;
+
+	if (dnat_took_place6(skb)) {
+		skb_dst_drop(skb);
+		v6ops->route_input(skb);
+
+		if (skb_dst(skb)->error) {
+			kfree_skb(skb);
+			return 0;
+		}
+
+		if (skb_dst(skb)->dev == dev) {
+			skb->dev = nf_bridge->physindev;
+			nf_bridge_update_protocol(skb);
+			nf_bridge_push_encap_header(skb);
+			NF_HOOK_THRESH(NFPROTO_BRIDGE,
+				       NF_BR_PRE_ROUTING,
+				       skb, skb->dev, NULL,
+				       br_nf_pre_routing_finish_bridge,
+				       1);
+			return 0;
+		}
+		memcpy(eth_hdr(skb)->h_dest, dev->dev_addr, ETH_ALEN);
+		skb->pkt_type = PACKET_HOST;
+	} else {
+		rt = bridge_parent_rtable(nf_bridge->physindev);
+		if (!rt) {
+			kfree_skb(skb);
+			return 0;
+		}
+		skb_dst_set_noref(skb, &rt->dst);
+	}
+
+	skb->dev = nf_bridge->physindev;
+	nf_bridge_update_protocol(skb);
+	nf_bridge_push_encap_header(skb);
+	NF_HOOK_THRESH(NFPROTO_BRIDGE, NF_BR_PRE_ROUTING, skb, skb->dev, NULL,
+		       br_handle_frame_finish, 1);
+
+	return 0;
+}
+
 /* This requires some explaining. If DNAT has taken place,
  * we will need to fix up the destination Ethernet address.
  *
@@ -570,6 +637,7 @@ static unsigned int br_nf_pre_routing_ipv6(const struct nf_hook_ops *ops,
 	if (!setup_pre_routing(skb))
 		return NF_DROP;
 
+	store_orig_dstaddr6(skb);
 	skb->protocol = htons(ETH_P_IPV6);
 	NF_HOOK(NFPROTO_IPV6, NF_INET_PRE_ROUTING, skb, skb->dev, NULL,
 		br_nf_pre_routing_finish_ipv6);
diff --git a/net/ipv6/netfilter.c b/net/ipv6/netfilter.c
index 398377a..0cd8ec9 100644
--- a/net/ipv6/netfilter.c
+++ b/net/ipv6/netfilter.c
@@ -191,6 +191,7 @@ static __sum16 nf_ip6_checksum_partial(struct sk_buff *skb, unsigned int hook,
 
 static const struct nf_ipv6_ops ipv6ops = {
 	.chk_addr	= ipv6_chk_addr,
+	.route_input    = ip6_route_input
 };
 
 static const struct nf_afinfo nf_ip6_afinfo = {
-- 
1.7.10.4


* Re: [PATCHv2 1/4] bridge: detect NAT66 correctly and change MAC address
  2015-03-18 21:52       ` [PATCHv2 1/4] " Bernhard Thaler
@ 2015-03-23 12:07         ` Pablo Neira Ayuso
  2015-03-23 12:41           ` Florian Westphal
  0 siblings, 1 reply; 7+ messages in thread
From: Pablo Neira Ayuso @ 2015-03-23 12:07 UTC
  To: Bernhard Thaler; +Cc: kadlec, netfilter-devel, fw, Sven Eckelmann

Hi Bernhard,

Florian Westphal is currently exploring alternative solutions so that
br_netfilter can stop (ab)using the layer 3 infrastructure from the
bridge code (this layering violation has been causing problems for
quite some time, e.g. some users don't expect a bridge to alter
fragmented traffic).

Although IPv6 support in br_netfilter is fairly incomplete, let me put
these patches on hold until Florian comes back to us with some
feedback. We'll integrate them one way or another at some point.

Thanks for your work and patience so far.

On Wed, Mar 18, 2015 at 10:52:11PM +0100, Bernhard Thaler wrote:
> With IPv4, any traffic crossing a bridge can be redirected to the local
> machine using iptables.
> 
> $ sysctl -w net.bridge.bridge-nf-call-iptables=1
> $ iptables -t nat -A PREROUTING -p tcp -m tcp --dport 8080 \
>   -j REDIRECT --to-ports 81
> 
> This didn't work with ip6tables because the redirect was not detected
> correctly. The bridge pre-routing (finish) netfilter hook has to check for a
> possible redirect and then fix the destination MAC address. This makes it
> possible to use ip6tables rules for local DNAT REDIRECT in the same way as
> with IPv4.
> 
> $ sysctl -w net.bridge.bridge-nf-call-ip6tables=1
> $ ip6tables -t nat -A PREROUTING -p tcp -m tcp --dport 8080 \
>   -j REDIRECT --to-ports 81
> 
> Signed-off-by: Sven Eckelmann <sven@open-mesh.com>
> [bernhard.thaler@wvnet.at: rebased, adjust function order]
> [bernhard.thaler@wvnet.at: rebased, add indirect call to ip6_route_input]
> Signed-off-by: Bernhard Thaler <bernhard.thaler@wvnet.at>
> ---
> Patch revision history:
> 
> v2
> * re-base again to davem's current net-next
> * add indirect call to ip6_route_input via nf_ipv6_ops to avoid 
>   direct dependency to ipv6.ko just because of function calls
> 
> v1
> * rebase "bridge: Allow to redirect IPv6 traffic to local machine"
>   to davem's current net-next
> * adjust function order to avoid prototype for br_nf_pre_routing_finish_bridge
>    
> (v0)
> * originally there were two patches solving this problem
> * Patch from Sven Eckelmann was chosen to base solution on 
>   see: bridge: Allow to redirect IPv6 traffic to local machine
>   see: bridge: Fix NAT66ed IPv6 packets not being bridged correctly
> 
>  include/linux/netfilter_bridge.h |    2 +
>  include/linux/netfilter_ipv6.h   |    1 +
>  net/bridge/br_netfilter.c        |  128 +++++++++++++++++++++++++++++---------
>  net/ipv6/netfilter.c             |    1 +
>  4 files changed, 102 insertions(+), 30 deletions(-)
> 
> diff --git a/include/linux/netfilter_bridge.h b/include/linux/netfilter_bridge.h
> index bb39113..419f3db 100644
> --- a/include/linux/netfilter_bridge.h
> +++ b/include/linux/netfilter_bridge.h
> @@ -2,6 +2,7 @@
>  #define __LINUX_BRIDGE_NETFILTER_H
>  
>  #include <uapi/linux/netfilter_bridge.h>
> +#include <uapi/linux/in6.h>
>  
>  
>  enum nf_br_hook_priorities {
> @@ -57,6 +58,7 @@ static inline unsigned int nf_bridge_pad(const struct sk_buff *skb)
>  struct bridge_skb_cb {
>  	union {
>  		__be32 ipv4;
> +		struct in6_addr ipv6;
>  	} daddr;
>  };
>  
> diff --git a/include/linux/netfilter_ipv6.h b/include/linux/netfilter_ipv6.h
> index 64dad1cc..e2d1969 100644
> --- a/include/linux/netfilter_ipv6.h
> +++ b/include/linux/netfilter_ipv6.h
> @@ -25,6 +25,7 @@ void ipv6_netfilter_fini(void);
>  struct nf_ipv6_ops {
>  	int (*chk_addr)(struct net *net, const struct in6_addr *addr,
>  			const struct net_device *dev, int strict);
> +	void (*route_input)(struct sk_buff *skb);
>  };
>  
>  extern const struct nf_ipv6_ops __rcu *nf_ipv6_ops;
> diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
> index b260a97..775d638 100644
> --- a/net/bridge/br_netfilter.c
> +++ b/net/bridge/br_netfilter.c
> @@ -45,8 +45,14 @@
>  
>  #define skb_origaddr(skb)	 (((struct bridge_skb_cb *) \
>  				 (skb->nf_bridge->data))->daddr.ipv4)
> +#define skb_origaddr6(skb)	 (((struct bridge_skb_cb *) \
> +				 (skb->nf_bridge->data))->daddr.ipv6)
>  #define store_orig_dstaddr(skb)	 (skb_origaddr(skb) = ip_hdr(skb)->daddr)
> +#define store_orig_dstaddr6(skb) (skb_origaddr6(skb) = ipv6_hdr(skb)->daddr)
>  #define dnat_took_place(skb)	 (skb_origaddr(skb) != ip_hdr(skb)->daddr)
> +#define dnat_took_place6(skb)	 (memcmp(&skb_origaddr6(skb), \
> +				 &ipv6_hdr(skb)->daddr, \
> +				 sizeof(ipv6_hdr(skb)->daddr)) != 0)
>  
>  #ifdef CONFIG_SYSCTL
>  static struct ctl_table_header *brnf_sysctl_header;
> @@ -247,36 +253,6 @@ static void nf_bridge_update_protocol(struct sk_buff *skb)
>  		skb->protocol = htons(ETH_P_PPP_SES);
>  }
>  
> -/* PF_BRIDGE/PRE_ROUTING *********************************************/
> -/* Undo the changes made for ip6tables PREROUTING and continue the
> - * bridge PRE_ROUTING hook. */
> -static int br_nf_pre_routing_finish_ipv6(struct sk_buff *skb)
> -{
> -	struct nf_bridge_info *nf_bridge = skb->nf_bridge;
> -	struct rtable *rt;
> -
> -	if (nf_bridge->mask & BRNF_PKT_TYPE) {
> -		skb->pkt_type = PACKET_OTHERHOST;
> -		nf_bridge->mask ^= BRNF_PKT_TYPE;
> -	}
> -	nf_bridge->mask ^= BRNF_NF_BRIDGE_PREROUTING;
> -
> -	rt = bridge_parent_rtable(nf_bridge->physindev);
> -	if (!rt) {
> -		kfree_skb(skb);
> -		return 0;
> -	}
> -	skb_dst_set_noref(skb, &rt->dst);
> -
> -	skb->dev = nf_bridge->physindev;
> -	nf_bridge_update_protocol(skb);
> -	nf_bridge_push_encap_header(skb);
> -	NF_HOOK_THRESH(NFPROTO_BRIDGE, NF_BR_PRE_ROUTING, skb, skb->dev, NULL,
> -		       br_handle_frame_finish, 1);
> -
> -	return 0;
> -}
> -
>  /* Obtain the correct destination MAC address, while preserving the original
>   * source MAC address. If we already know this address, we just copy it. If we
>   * don't, we use the neighbour framework to find out. In both cases, we make
> @@ -322,6 +298,97 @@ free_skb:
>  	return 0;
>  }
>  
> +/* PF_BRIDGE/PRE_ROUTING *********************************************/
> +/* Undo the changes made for ip6tables PREROUTING and continue the
> + * bridge PRE_ROUTING hook.
> + */
> +
> +/* This requires some explaining. If DNAT has taken place,
> + * we will need to fix up the destination Ethernet address.
> + *
> + * There are two cases to consider:
> + * 1. The packet was DNAT'ed to a device in the same bridge
> + *    port group as it was received on. We can still bridge
> + *    the packet.
> + * 2. The packet was DNAT'ed to a different device, either
> + *    a non-bridged device or another bridge port group.
> + *    The packet will need to be routed.
> + *
> + * The correct way of distinguishing between these two cases is to
> + * call ip6_route_input() and to look at skb->dst->dev, which is
> + * changed to the destination device if ip6_route_input() succeeds.
> + * ip6_route_input() is called indirectly via v6ops->route_input to
> + * avoid direct dependency to ipv6.ko due to function calls.
> + *
> + * Let's first consider the case that ip6_route_input() succeeds:
> + *
> + * If the output device equals the logical bridge device the packet
> + * came in on, we can consider this bridging. The corresponding MAC
> + * address will be obtained in br_nf_pre_routing_finish_bridge.
> + * Otherwise, the packet is considered to be routed and we just
> + * change the destination MAC address so that the packet will
> + * later be passed up to the IP stack to be routed. For a redirected
> + * packet, ip6_route_input() will give back the localhost as output device,
> + * which differs from the bridge device.
> + *
> + * Let's now consider the case that ip6_route_input() fails:
> + *
> + * This can be because the destination address is martian, in which case
> + * the packet will be dropped.
> + */
> +static int br_nf_pre_routing_finish_ipv6(struct sk_buff *skb)
> +{
> +	struct nf_bridge_info *nf_bridge = skb->nf_bridge;
> +	struct rtable *rt;
> +	struct net_device *dev = skb->dev;
> +	const struct nf_ipv6_ops *v6ops = nf_get_ipv6_ops();
> +
> +	if (nf_bridge->mask & BRNF_PKT_TYPE) {
> +		skb->pkt_type = PACKET_OTHERHOST;
> +		nf_bridge->mask ^= BRNF_PKT_TYPE;
> +	}
> +	nf_bridge->mask ^= BRNF_NF_BRIDGE_PREROUTING;
> +
> +	if (dnat_took_place6(skb)) {
> +		skb_dst_drop(skb);
> +		v6ops->route_input(skb);
> +
> +		if (skb_dst(skb)->error) {
> +			kfree_skb(skb);
> +			return 0;
> +		}
> +
> +		if (skb_dst(skb)->dev == dev) {
> +			skb->dev = nf_bridge->physindev;
> +			nf_bridge_update_protocol(skb);
> +			nf_bridge_push_encap_header(skb);
> +			NF_HOOK_THRESH(NFPROTO_BRIDGE,
> +				       NF_BR_PRE_ROUTING,
> +				       skb, skb->dev, NULL,
> +				       br_nf_pre_routing_finish_bridge,
> +				       1);
> +			return 0;
> +		}
> +		memcpy(eth_hdr(skb)->h_dest, dev->dev_addr, ETH_ALEN);
> +		skb->pkt_type = PACKET_HOST;
> +	} else {
> +		rt = bridge_parent_rtable(nf_bridge->physindev);
> +		if (!rt) {
> +			kfree_skb(skb);
> +			return 0;
> +		}
> +		skb_dst_set_noref(skb, &rt->dst);
> +	}
> +
> +	skb->dev = nf_bridge->physindev;
> +	nf_bridge_update_protocol(skb);
> +	nf_bridge_push_encap_header(skb);
> +	NF_HOOK_THRESH(NFPROTO_BRIDGE, NF_BR_PRE_ROUTING, skb, skb->dev, NULL,
> +		       br_handle_frame_finish, 1);
> +
> +	return 0;
> +}
> +
>  /* This requires some explaining. If DNAT has taken place,
>   * we will need to fix up the destination Ethernet address.
>   *
> @@ -570,6 +637,7 @@ static unsigned int br_nf_pre_routing_ipv6(const struct nf_hook_ops *ops,
>  	if (!setup_pre_routing(skb))
>  		return NF_DROP;
>  
> +	store_orig_dstaddr6(skb);
>  	skb->protocol = htons(ETH_P_IPV6);
>  	NF_HOOK(NFPROTO_IPV6, NF_INET_PRE_ROUTING, skb, skb->dev, NULL,
>  		br_nf_pre_routing_finish_ipv6);
> diff --git a/net/ipv6/netfilter.c b/net/ipv6/netfilter.c
> index 398377a..0cd8ec9 100644
> --- a/net/ipv6/netfilter.c
> +++ b/net/ipv6/netfilter.c
> @@ -191,6 +191,7 @@ static __sum16 nf_ip6_checksum_partial(struct sk_buff *skb, unsigned int hook,
>  
>  static const struct nf_ipv6_ops ipv6ops = {
>  	.chk_addr	= ipv6_chk_addr,
> +	.route_input    = ip6_route_input
>  };
>  
>  static const struct nf_afinfo nf_ip6_afinfo = {
> -- 
> 1.7.10.4
> 


* Re: [PATCHv2 1/4] bridge: detect NAT66 correctly and change MAC address
  2015-03-23 12:07         ` Pablo Neira Ayuso
@ 2015-03-23 12:41           ` Florian Westphal
  0 siblings, 0 replies; 7+ messages in thread
From: Florian Westphal @ 2015-03-23 12:41 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: Bernhard Thaler, kadlec, netfilter-devel, fw, Sven Eckelmann

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> Florian Westphal is currently exploring alternative solutions so
> br_netfilter can stop (ab)using the layer 3 infrastructure from the
> bridge code (this layering violation has been causing problems for
> quite some time, e.g. some users don't expect a bridge to alter
> fragmented traffic).
> 
> Although IPv6 support in br_netfilter is fairly incomplete, let me put
> these patches on hold until Florian comes back to us with some
> feedback; we'll integrate them in some way or another at some point.

TBH I am not too sure about this.

IPv4 DNAT doesn't work 100% either, see

http://marc.info/?l=linux-netdev&m=136627779125382&w=2

[ btw, thanks that the crap patch referenced above didn't end up in the kernel ;) ]

So I think we first need to _clearly_ define how DNAT should work on a
bridge, rather than inherit all the weird corner cases that we have
with ipv4.

For example, I think we wouldn't have all of these issues if we didn't
care about the L2 MAC address (and always routed in the NAT case).

But I'll have to think about this some more.
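
A rough userspace model of the decision being discussed may help readers of
the archive. It mirrors the three outcomes described in the comment block of
the quoted patch (drop on a failed route lookup, bridge when the output device
is the bridge itself, otherwise pass the packet up to be routed) and adds a
flag for the "always route in the NAT case" idea. All names here
(`classify_dnat`, `struct route_result`, `always_route`) are hypothetical and
are not kernel APIs; this is a sketch, not the br_netfilter implementation.

```c
#include <stdbool.h>

/* Hypothetical verdicts for a packet whose destination was rewritten. */
enum verdict { VERDICT_DROP, VERDICT_BRIDGE, VERDICT_ROUTE };

/* Hypothetical summary of a route lookup result. */
struct route_result {
	int error;   /* non-zero if the route lookup failed */
	int out_dev; /* index of the output device chosen by the lookup */
};

/* Classify a DNAT'ed packet received on bridge device bridge_dev.
 * With always_route set, the bridging shortcut is skipped entirely,
 * so the destination MAC of the DNAT target never has to be resolved.
 */
static enum verdict classify_dnat(const struct route_result *rt,
				  int bridge_dev, bool always_route)
{
	if (rt->error)
		return VERDICT_DROP;
	if (!always_route && rt->out_dev == bridge_dev)
		return VERDICT_BRIDGE;
	return VERDICT_ROUTE;
}
```

In this model a REDIRECT always ends in VERDICT_ROUTE, since the lookup
returns a local device rather than the bridge; only the "DNAT to a host on the
same bridge" case changes when always_route is set.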

In any case, I understand that not being able to e.g. REDIRECT is bad,
and perhaps it would be preferable to first fix IPv6 fragment handling
and then make REDIRECT work (and defer handling/supporting arbitrary
DNAT until we think we know how it should work).

One small comment on the patch below.

> > @@ -57,6 +58,7 @@ static inline unsigned int nf_bridge_pad(const struct sk_buff *skb)
> >  struct bridge_skb_cb {
> >  	union {
> >  		__be32 ipv4;
> > +		struct in6_addr ipv6;
> >  	} daddr;

This is gone, dnat_took_place() should work without further changes if
you call it in the ipv6 prerouting finish hook.
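
For context, the detection step itself is a save-and-compare on the
destination address. The sketch below models that idea in userspace, going
only by the helper names store_orig_dstaddr6()/dnat_took_place6() and the
daddr union in the quoted patch. `struct pkt_cb`, `save_orig_daddr()` and
`daddr_changed()` are hypothetical names, and how the current kernel helper
stores the original address is not shown here.

```c
#include <stdbool.h>
#include <string.h>
#include <netinet/in.h>

/* Hypothetical stand-in for the per-packet control block. */
struct pkt_cb {
	struct in6_addr orig_daddr;
};

/* Remember the destination address before the PREROUTING rules run. */
static void save_orig_daddr(struct pkt_cb *cb, const struct in6_addr *daddr)
{
	cb->orig_daddr = *daddr;
}

/* Treat the packet as DNAT'ed when the current destination address
 * differs from the saved one. A rewrite that changes only the port
 * is not visible to this check.
 */
static bool daddr_changed(const struct pkt_cb *cb,
			  const struct in6_addr *daddr)
{
	return memcmp(&cb->orig_daddr, daddr, sizeof(*daddr)) != 0;
}
```

The save has to happen before the IPv6 PRE_ROUTING hook is invoked and the
comparison in its finish callback, which matches where the quoted patch places
the two calls.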

> > +/* This requires some explaining. If DNAT has taken place,
> > + * we will need to fix up the destination Ethernet address.
> > + *

I really think this novel^W comment should not be copied, just add a
reference to the ipv4 one.


Thread overview: 7+ messages
2014-12-05 21:12 [PATCH 1/1] bridge: detect NAT66 correctly and change MAC address Bernhard Thaler
2014-12-23 14:03 ` Pablo Neira Ayuso
2014-12-23 14:13   ` Pablo Neira Ayuso
2015-01-09  0:05     ` Bernhard Thaler
2015-03-18 21:52       ` [PATCHv2 1/4] " Bernhard Thaler
2015-03-23 12:07         ` Pablo Neira Ayuso
2015-03-23 12:41           ` Florian Westphal
