netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH nf-next 0/4] netfilter: bridge: remove broute hook
@ 2019-04-11 14:36 Florian Westphal
  2019-04-11 14:36 ` [PATCH nf-next 1/4] selftests: netfilter: add ebtables broute test case Florian Westphal
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Florian Westphal @ 2019-04-11 14:36 UTC (permalink / raw)
  To: netfilter-devel; +Cc: roopa, nikolay, netdev

This series removes the 'broute' hook by promoting ebtables' broute table
to a normal ebtables table (invoked via normal PREROUTING netfilter hook).

The downside is that nf_hook_slow() needs to be duplicated in br_input.c
(see patch 3).

However, I think its worth the price as this allows to remove the
br_should_route_hook.

There are quite some changes in bridge specific code, if you prefer
I can re-submit this for net-next instead of nf-next.

Main motivation is to provide 'ebtables -t broute' functionality via
nftables later on, this can then be done without touching the bridge
or netfilter core infrastructure again.

Florian Westphal (4):
      selftests: netfilter: add ebtables broute test case
      bridge: reduce size of input cb to 16 bytes
      bridge: netfilter: unroll NF_HOOK helper in bridge input path
      bridge: broute: make broute a real ebtables table

 include/linux/if_bridge.h                           |    3 
 include/net/netfilter/nf_queue.h                    |    3 
 net/bridge/br_arp_nd_proxy.c                        |   18 +-
 net/bridge/br_input.c                               |   72 +++++++--
 net/bridge/br_private.h                             |   15 +-
 net/bridge/netfilter/ebtable_broute.c               |   63 ++++++--
 net/bridge/netfilter/ebtables.c                     |    7 
 net/netfilter/core.c                                |    1 
 net/netfilter/nf_internals.h                        |    3 
 net/netfilter/nf_queue.c                            |    1 
 tools/testing/selftests/netfilter/Makefile          |    2 
 tools/testing/selftests/netfilter/bridge_brouter.sh |  146 ++++++++++++++++++++
 12 files changed, 268 insertions(+), 66 deletions(-)


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH nf-next 1/4] selftests: netfilter: add ebtables broute test case
  2019-04-11 14:36 [PATCH nf-next 0/4] netfilter: bridge: remove broute hook Florian Westphal
@ 2019-04-11 14:36 ` Florian Westphal
  2019-04-11 14:36 ` [PATCH nf-next 2/4] bridge: reduce size of input cb to 16 bytes Florian Westphal
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Florian Westphal @ 2019-04-11 14:36 UTC (permalink / raw)
  To: netfilter-devel; +Cc: roopa, nikolay, netdev, Florian Westphal

ebtables -t broute allows to redirect packets in a way that
they get pushed up the stack, even if the interface is part
of a bridge.

In case of IP packets to non-local address, this means
those IP packets are routed instead of bridged-forwarded, just
as if the bridge would not have existed.

Expected test output is:
PASS: netns connectivity: ns1 and ns2 can reach each other
PASS: ns1/ns2 connectivity with active broute rule
PASS: ns1/ns2 connectivity with active broute rule and bridge forward drop

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 tools/testing/selftests/netfilter/Makefile    |   2 +-
 .../selftests/netfilter/bridge_brouter.sh     | 146 ++++++++++++++++++
 2 files changed, 147 insertions(+), 1 deletion(-)
 create mode 100755 tools/testing/selftests/netfilter/bridge_brouter.sh

diff --git a/tools/testing/selftests/netfilter/Makefile b/tools/testing/selftests/netfilter/Makefile
index c9ff2b47bd1c..80dae72a25c7 100644
--- a/tools/testing/selftests/netfilter/Makefile
+++ b/tools/testing/selftests/netfilter/Makefile
@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 # Makefile for netfilter selftests
 
-TEST_PROGS := nft_trans_stress.sh nft_nat.sh
+TEST_PROGS := nft_trans_stress.sh nft_nat.sh bridge_brouter.sh
 
 include ../lib.mk
diff --git a/tools/testing/selftests/netfilter/bridge_brouter.sh b/tools/testing/selftests/netfilter/bridge_brouter.sh
new file mode 100755
index 000000000000..29f3955b9af7
--- /dev/null
+++ b/tools/testing/selftests/netfilter/bridge_brouter.sh
@@ -0,0 +1,146 @@
+#!/bin/bash
+#
+# This test is for bridge 'brouting', i.e. make some packets being routed
+# rather than getting bridged even though they arrive on interface that is
+# part of a bridge.
+
+#           eth0    br0     eth0
+# setup is: ns1 <-> ns0 <-> ns2
+
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+ret=0
+
+ebtables -V > /dev/null 2>&1
+if [ $? -ne 0 ];then
+	echo "SKIP: Could not run test without ebtables"
+	exit $ksft_skip
+fi
+
+ip -Version > /dev/null 2>&1
+if [ $? -ne 0 ];then
+	echo "SKIP: Could not run test without ip tool"
+	exit $ksft_skip
+fi
+
+ip netns add ns0
+ip netns add ns1
+ip netns add ns2
+
+ip link add veth0 netns ns0 type veth peer name eth0 netns ns1
+if [ $? -ne 0 ]; then
+	echo "SKIP: Can't create veth device"
+	exit $ksft_skip
+fi
+ip link add veth1 netns ns0 type veth peer name eth0 netns ns2
+
+ip -net ns0 link set lo up
+ip -net ns0 link set veth0 up
+ip -net ns0 link set veth1 up
+
+ip -net ns0 link add br0 type bridge
+if [ $? -ne 0 ]; then
+	echo "SKIP: Can't create bridge br0"
+	exit $ksft_skip
+fi
+
+ip -net ns0 link set veth0 master br0
+ip -net ns0 link set veth1 master br0
+ip -net ns0 link set br0 up
+ip -net ns0 addr add 10.0.0.1/24 dev br0
+
+# place both in same subnet, ns1 and ns2 connected via ns0:br0
+for i in 1 2; do
+  ip -net ns$i link set lo up
+  ip -net ns$i link set eth0 up
+  ip -net ns$i addr add 10.0.0.1$i/24 dev eth0
+done
+
+test_ebtables_broute()
+{
+	local cipt
+
+	# redirect is needed so the dstmac is rewritten to the bridge itself,
+	# ip stack won't process OTHERHOST (foreign unicast mac) packets.
+	ip netns exec ns0 ebtables -t broute -A BROUTING -p ipv4 --ip-protocol icmp -j redirect --redirect-target=DROP
+	if [ $? -ne 0 ]; then
+		echo "SKIP: Could not add ebtables broute redirect rule"
+		return $ksft_skip
+	fi
+
+	# ping netns1, expected to not work (ip forwarding is off)
+	ip netns exec ns1 ping -q -c 1 10.0.0.12 > /dev/null 2>&1
+	if [ $? -eq 0 ]; then
+		echo "ERROR: ping works, should have failed" 1>&2
+		return 1
+	fi
+
+	# enable forwarding on both interfaces.
+	# neither needs an ip address, but at least the bridge needs
+	# an ip address in same network segment as ns1 and ns2 (ns0
+	# needs to be able to determine route for to-be-forwarded packet).
+	ip netns exec ns0 sysctl -q net.ipv4.conf.veth0.forwarding=1
+	ip netns exec ns0 sysctl -q net.ipv4.conf.veth1.forwarding=1
+
+	sleep 1
+
+	ip netns exec ns1 ping -q -c 1 10.0.0.12 > /dev/null
+	if [ $? -ne 0 ]; then
+		echo "ERROR: ping did not work, but it should (broute+forward)" 1>&2
+		return 1
+	fi
+
+	echo "PASS: ns1/ns2 connectivity with active broute rule"
+	ip netns exec ns0 ebtables -t broute -F
+
+	# ping netns1, expected to work (frames are bridged)
+	ip netns exec ns1 ping -q -c 1 10.0.0.12 > /dev/null
+	if [ $? -ne 0 ]; then
+		echo "ERROR: ping did not work, but it should (bridged)" 1>&2
+		return 1
+	fi
+
+	ip netns exec ns0 ebtables -t filter -A FORWARD -p ipv4 --ip-protocol icmp -j DROP
+
+	# ping netns1, expected to not work (DROP in bridge forward)
+	ip netns exec ns1 ping -q -c 1 10.0.0.12 > /dev/null 2>&1
+	if [ $? -eq 0 ]; then
+		echo "ERROR: ping works, should have failed (icmp forward drop)" 1>&2
+		return 1
+	fi
+
+	# re-activate brouter
+	ip netns exec ns0 ebtables -t broute -A BROUTING -p ipv4 --ip-protocol icmp -j redirect --redirect-target=DROP
+
+	ip netns exec ns2 ping -q -c 1 10.0.0.11 > /dev/null
+	if [ $? -ne 0 ]; then
+		echo "ERROR: ping did not work, but it should (broute+forward 2)" 1>&2
+		return 1
+	fi
+
+	echo "PASS: ns1/ns2 connectivity with active broute rule and bridge forward drop"
+	return 0
+}
+
+# test basic connectivity
+ip netns exec ns1 ping -c 1 -q 10.0.0.12 > /dev/null
+if [ $? -ne 0 ]; then
+    echo "ERROR: Could not reach ns2 from ns1" 1>&2
+    ret=1
+fi
+
+ip netns exec ns2 ping -c 1 -q 10.0.0.11 > /dev/null
+if [ $? -ne 0 ]; then
+    echo "ERROR: Could not reach ns1 from ns2" 1>&2
+    ret=1
+fi
+
+if [ $ret -eq 0 ];then
+    echo "PASS: netns connectivity: ns1 and ns2 can reach each other"
+fi
+
+test_ebtables_broute
+ret=$?
+for i in 0 1 2; do ip netns del ns$i;done
+
+exit $ret
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH nf-next 2/4] bridge: reduce size of input cb to 16 bytes
  2019-04-11 14:36 [PATCH nf-next 0/4] netfilter: bridge: remove broute hook Florian Westphal
  2019-04-11 14:36 ` [PATCH nf-next 1/4] selftests: netfilter: add ebtables broute test case Florian Westphal
@ 2019-04-11 14:36 ` Florian Westphal
  2019-04-11 14:36 ` [PATCH nf-next 3/4] bridge: netfilter: unroll NF_HOOK helper in bridge input path Florian Westphal
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Florian Westphal @ 2019-04-11 14:36 UTC (permalink / raw)
  To: netfilter-devel; +Cc: roopa, nikolay, netdev, Florian Westphal

Reduce size of br_input_skb_cb from 24 to 16 bytes by
using bitfield for those values that can only be 0 or 1.

igmp is the igmp type value, so it needs to be at least u8.

Furthermore, the bridge currently relies on step-by-step initialization
of br_input_skb_cb fields as the skb passes through the stack.

Explicitly zero out the bridge input cb instead, this avoids having to
review/validate that no BR_INPUT_SKB_CB(skb)->foo test can see a
'random' value from previous protocol cb.

AFAICS all current fields are always set up before they are read again,
so this is not a bug fix.

Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/bridge/br_arp_nd_proxy.c | 18 +++++++++---------
 net/bridge/br_input.c        |  2 ++
 net/bridge/br_private.h      | 12 +++++-------
 3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/net/bridge/br_arp_nd_proxy.c b/net/bridge/br_arp_nd_proxy.c
index 724b474ade54..15116752365a 100644
--- a/net/bridge/br_arp_nd_proxy.c
+++ b/net/bridge/br_arp_nd_proxy.c
@@ -131,7 +131,7 @@ void br_do_proxy_suppress_arp(struct sk_buff *skb, struct net_bridge *br,
 	u8 *arpptr, *sha;
 	__be32 sip, tip;
 
-	BR_INPUT_SKB_CB(skb)->proxyarp_replied = false;
+	BR_INPUT_SKB_CB(skb)->proxyarp_replied = 0;
 
 	if ((dev->flags & IFF_NOARP) ||
 	    !pskb_may_pull(skb, arp_hdr_len(dev)))
@@ -161,7 +161,7 @@ void br_do_proxy_suppress_arp(struct sk_buff *skb, struct net_bridge *br,
 			return;
 		if (ipv4_is_zeronet(sip) || sip == tip) {
 			/* prevent flooding to neigh suppress ports */
-			BR_INPUT_SKB_CB(skb)->proxyarp_replied = true;
+			BR_INPUT_SKB_CB(skb)->proxyarp_replied = 1;
 			return;
 		}
 	}
@@ -181,7 +181,7 @@ void br_do_proxy_suppress_arp(struct sk_buff *skb, struct net_bridge *br,
 		/* its our local ip, so don't proxy reply
 		 * and don't forward to neigh suppress ports
 		 */
-		BR_INPUT_SKB_CB(skb)->proxyarp_replied = true;
+		BR_INPUT_SKB_CB(skb)->proxyarp_replied = 1;
 		return;
 	}
 
@@ -217,7 +217,7 @@ void br_do_proxy_suppress_arp(struct sk_buff *skb, struct net_bridge *br,
 			 */
 			if (replied ||
 			    br_opt_get(br, BROPT_NEIGH_SUPPRESS_ENABLED))
-				BR_INPUT_SKB_CB(skb)->proxyarp_replied = true;
+				BR_INPUT_SKB_CB(skb)->proxyarp_replied = 1;
 		}
 
 		neigh_release(n);
@@ -393,7 +393,7 @@ void br_do_suppress_nd(struct sk_buff *skb, struct net_bridge *br,
 	struct ipv6hdr *iphdr;
 	struct neighbour *n;
 
-	BR_INPUT_SKB_CB(skb)->proxyarp_replied = false;
+	BR_INPUT_SKB_CB(skb)->proxyarp_replied = 0;
 
 	if (p && (p->flags & BR_NEIGH_SUPPRESS))
 		return;
@@ -401,7 +401,7 @@ void br_do_suppress_nd(struct sk_buff *skb, struct net_bridge *br,
 	if (msg->icmph.icmp6_type == NDISC_NEIGHBOUR_ADVERTISEMENT &&
 	    !msg->icmph.icmp6_solicited) {
 		/* prevent flooding to neigh suppress ports */
-		BR_INPUT_SKB_CB(skb)->proxyarp_replied = true;
+		BR_INPUT_SKB_CB(skb)->proxyarp_replied = 1;
 		return;
 	}
 
@@ -414,7 +414,7 @@ void br_do_suppress_nd(struct sk_buff *skb, struct net_bridge *br,
 
 	if (ipv6_addr_any(saddr) || !ipv6_addr_cmp(saddr, daddr)) {
 		/* prevent flooding to neigh suppress ports */
-		BR_INPUT_SKB_CB(skb)->proxyarp_replied = true;
+		BR_INPUT_SKB_CB(skb)->proxyarp_replied = 1;
 		return;
 	}
 
@@ -432,7 +432,7 @@ void br_do_suppress_nd(struct sk_buff *skb, struct net_bridge *br,
 		/* its our own ip, so don't proxy reply
 		 * and don't forward to arp suppress ports
 		 */
-		BR_INPUT_SKB_CB(skb)->proxyarp_replied = true;
+		BR_INPUT_SKB_CB(skb)->proxyarp_replied = 1;
 		return;
 	}
 
@@ -465,7 +465,7 @@ void br_do_suppress_nd(struct sk_buff *skb, struct net_bridge *br,
 			 */
 			if (replied ||
 			    br_opt_get(br, BROPT_NEIGH_SUPPRESS_ENABLED))
-				BR_INPUT_SKB_CB(skb)->proxyarp_replied = true;
+				BR_INPUT_SKB_CB(skb)->proxyarp_replied = 1;
 		}
 		neigh_release(n);
 	}
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 5ea7e56119c1..e2f93e5c72da 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -227,6 +227,8 @@ rx_handler_result_t br_handle_frame(struct sk_buff **pskb)
 	if (!skb)
 		return RX_HANDLER_CONSUMED;
 
+	memset(skb->cb, 0, sizeof(struct br_input_skb_cb));
+
 	p = br_port_get_rcu(skb->dev);
 	if (p->flags & BR_VLAN_TUNNEL) {
 		if (br_handle_ingress_vlan_tunnel(skb, p,
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 7946aa3b6e09..e7110a6e2b7e 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -425,15 +425,13 @@ struct br_input_skb_cb {
 	struct net_device *brdev;
 
 #ifdef CONFIG_BRIDGE_IGMP_SNOOPING
-	int igmp;
-	int mrouters_only;
+	u8 igmp;
+	u8 mrouters_only:1;
 #endif
-
-	bool proxyarp_replied;
-	bool src_port_isolated;
-
+	u8 proxyarp_replied:1;
+	u8 src_port_isolated:1;
 #ifdef CONFIG_BRIDGE_VLAN_FILTERING
-	bool vlan_filtered;
+	u8 vlan_filtered:1;
 #endif
 
 #ifdef CONFIG_NET_SWITCHDEV
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH nf-next 3/4] bridge: netfilter: unroll NF_HOOK helper in bridge input path
  2019-04-11 14:36 [PATCH nf-next 0/4] netfilter: bridge: remove broute hook Florian Westphal
  2019-04-11 14:36 ` [PATCH nf-next 1/4] selftests: netfilter: add ebtables broute test case Florian Westphal
  2019-04-11 14:36 ` [PATCH nf-next 2/4] bridge: reduce size of input cb to 16 bytes Florian Westphal
@ 2019-04-11 14:36 ` Florian Westphal
  2019-04-11 14:36 ` [PATCH nf-next 4/4] bridge: broute: make broute a real ebtables table Florian Westphal
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Florian Westphal @ 2019-04-11 14:36 UTC (permalink / raw)
  To: netfilter-devel; +Cc: roopa, nikolay, netdev, Florian Westphal

Replace NF_HOOK() based invocation of the netfilter hooks with a private
copy of nf_hook_slow().

This copy has one difference: it can return the rx handler value expected
by the stack, i.e. RX_HANDLER_CONSUMED or RX_HANDLER_PASS.

This is needed by the next patch to invoke the ebtables
"broute" table via the standard netfilter hooks rather than the custom
"br_should_route_hook" indirection that is used now.

When the skb is to be "brouted", we must return RX_HANDLER_PASS from the
bridge rx input handler, but there is no way to indicate this via
NF_HOOK(), unless perhaps by some hack such as exposing bridge_cb in the
netfilter core or a percpu flag.

  text    data     bss     dec   filename
  3369      56       0    3425   net/bridge/br_input.o.before
  3458      40       0    3498   net/bridge/br_input.o.after

This allows removal of the "br_should_route_hook" in the next patch.

Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/netfilter/nf_queue.h |  3 ++
 net/bridge/br_input.c            | 55 +++++++++++++++++++++++++++++---
 net/netfilter/core.c             |  1 +
 net/netfilter/nf_internals.h     |  3 --
 net/netfilter/nf_queue.c         |  1 +
 5 files changed, 56 insertions(+), 7 deletions(-)

diff --git a/include/net/netfilter/nf_queue.h b/include/net/netfilter/nf_queue.h
index a50a69f5334c..7239105d9d2e 100644
--- a/include/net/netfilter/nf_queue.h
+++ b/include/net/netfilter/nf_queue.h
@@ -119,4 +119,7 @@ nfqueue_hash(const struct sk_buff *skb, u16 queue, u16 queues_total, u8 family,
 	return queue;
 }
 
+int nf_queue(struct sk_buff *skb, struct nf_hook_state *state,
+	     const struct nf_hook_entries *entries, unsigned int index,
+	     unsigned int verdict);
 #endif /* _NF_QUEUE_H */
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index e2f93e5c72da..4ac34fb5f943 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -16,6 +16,7 @@
 #include <linux/netdevice.h>
 #include <linux/etherdevice.h>
 #include <linux/netfilter_bridge.h>
+#include <net/netfilter/nf_queue.h>
 #include <linux/neighbour.h>
 #include <net/arp.h>
 #include <linux/export.h>
@@ -206,6 +207,55 @@ static int br_handle_local_finish(struct net *net, struct sock *sk, struct sk_bu
 	return 0;
 }
 
+static int nf_hook_bridge_pre(struct sk_buff *skb, struct sk_buff **pskb)
+{
+#ifdef CONFIG_NETFILTER_FAMILY_BRIDGE
+	struct nf_hook_entries *e = NULL;
+	struct nf_hook_state state;
+	unsigned int verdict, i;
+	struct net *net;
+	int ret;
+
+	net = dev_net(skb->dev);
+#ifdef HAVE_JUMP_LABEL
+	if (!static_key_false(&nf_hooks_needed[NFPROTO_BRIDGE][NF_BR_PRE_ROUTING]))
+		goto frame_finish;
+#endif
+
+	e = rcu_dereference(net->nf.hooks_bridge[NF_BR_PRE_ROUTING]);
+	if (!e)
+		goto frame_finish;
+
+	nf_hook_state_init(&state, NF_BR_PRE_ROUTING,
+			   NFPROTO_BRIDGE, skb->dev, NULL, NULL,
+			   net, br_handle_frame_finish);
+
+	for (i = 0; i < e->num_hook_entries; i++) {
+		verdict = nf_hook_entry_hookfn(&e->hooks[i], skb, &state);
+		switch (verdict & NF_VERDICT_MASK) {
+		case NF_ACCEPT:
+			break;
+		case NF_DROP:
+			kfree_skb(skb);
+			return RX_HANDLER_CONSUMED;
+		case NF_QUEUE:
+			ret = nf_queue(skb, &state, e, i, verdict);
+			if (ret == 1)
+				continue;
+			return RX_HANDLER_CONSUMED;
+		default: /* STOLEN */
+			return RX_HANDLER_CONSUMED;
+		}
+	}
+frame_finish:
+	net = dev_net(skb->dev);
+	br_handle_frame_finish(net, NULL, skb);
+#else
+	br_handle_frame_finish(dev_net(skb->dev), NULL, skb);
+#endif
+	return RX_HANDLER_CONSUMED;
+}
+
 /*
  * Return NULL if skb is handled
  * note: already called with rcu_read_lock
@@ -304,10 +354,7 @@ rx_handler_result_t br_handle_frame(struct sk_buff **pskb)
 		if (ether_addr_equal(p->br->dev->dev_addr, dest))
 			skb->pkt_type = PACKET_HOST;
 
-		NF_HOOK(NFPROTO_BRIDGE, NF_BR_PRE_ROUTING,
-			dev_net(skb->dev), NULL, skb, skb->dev, NULL,
-			br_handle_frame_finish);
-		break;
+		return nf_hook_bridge_pre(skb, pskb);
 	default:
 drop:
 		kfree_skb(skb);
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 93aaec3a54ec..71f06900473e 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -23,6 +23,7 @@
 #include <linux/mm.h>
 #include <linux/rcupdate.h>
 #include <net/net_namespace.h>
+#include <net/netfilter/nf_queue.h>
 #include <net/sock.h>
 
 #include "nf_internals.h"
diff --git a/net/netfilter/nf_internals.h b/net/netfilter/nf_internals.h
index e15779fd58e3..d6c43902ebd7 100644
--- a/net/netfilter/nf_internals.h
+++ b/net/netfilter/nf_internals.h
@@ -7,9 +7,6 @@
 #include <linux/netdevice.h>
 
 /* nf_queue.c */
-int nf_queue(struct sk_buff *skb, struct nf_hook_state *state,
-	     const struct nf_hook_entries *entries, unsigned int index,
-	     unsigned int verdict);
 void nf_queue_nf_hook_drop(struct net *net);
 
 /* nf_log.c */
diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
index a36a77bae1d6..9dc1d6e04946 100644
--- a/net/netfilter/nf_queue.c
+++ b/net/netfilter/nf_queue.c
@@ -240,6 +240,7 @@ int nf_queue(struct sk_buff *skb, struct nf_hook_state *state,
 
 	return 0;
 }
+EXPORT_SYMBOL_GPL(nf_queue);
 
 static unsigned int nf_iterate(struct sk_buff *skb,
 			       struct nf_hook_state *state,
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH nf-next 4/4] bridge: broute: make broute a real ebtables table
  2019-04-11 14:36 [PATCH nf-next 0/4] netfilter: bridge: remove broute hook Florian Westphal
                   ` (2 preceding siblings ...)
  2019-04-11 14:36 ` [PATCH nf-next 3/4] bridge: netfilter: unroll NF_HOOK helper in bridge input path Florian Westphal
@ 2019-04-11 14:36 ` Florian Westphal
  2019-04-11 18:25 ` [PATCH nf-next 0/4] netfilter: bridge: remove broute hook David Miller
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Florian Westphal @ 2019-04-11 14:36 UTC (permalink / raw)
  To: netfilter-devel; +Cc: roopa, nikolay, netdev, Florian Westphal

This makes broute a normal ebtables table, hooking at PREROUTING.
The broute hook is removed.

It uses skb->cb to signal to bridge rx handler that the skb should be
routed instead of being bridged.

This change is backwards compatible with ebtables as no userspace visible
parts are changed.

This means we can also remove the !ops test in ebt_register_table,
it was only there for broute table sake.

Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/linux/if_bridge.h             |  3 --
 net/bridge/br_input.c                 | 18 ++------
 net/bridge/br_private.h               |  3 ++
 net/bridge/netfilter/ebtable_broute.c | 63 +++++++++++++++++++--------
 net/bridge/netfilter/ebtables.c       |  7 +--
 5 files changed, 52 insertions(+), 42 deletions(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index 627b788ba0ff..ef0819ced0fc 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -56,9 +56,6 @@ struct br_ip_list {
 
 extern void brioctl_set(int (*ioctl_hook)(struct net *, unsigned int, void __user *));
 
-typedef int br_should_route_hook_t(struct sk_buff *skb);
-extern br_should_route_hook_t __rcu *br_should_route_hook;
-
 #if IS_ENABLED(CONFIG_BRIDGE) && IS_ENABLED(CONFIG_BRIDGE_IGMP_SNOOPING)
 int br_multicast_list_adjacent(struct net_device *dev,
 			       struct list_head *br_ip_list);
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 4ac34fb5f943..e0aacfedcfe1 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -24,10 +24,6 @@
 #include "br_private.h"
 #include "br_private_tunnel.h"
 
-/* Hook for brouter */
-br_should_route_hook_t __rcu *br_should_route_hook __read_mostly;
-EXPORT_SYMBOL(br_should_route_hook);
-
 static int
 br_netif_receive_skb(struct net *net, struct sock *sk, struct sk_buff *skb)
 {
@@ -234,6 +230,10 @@ static int nf_hook_bridge_pre(struct sk_buff *skb, struct sk_buff **pskb)
 		verdict = nf_hook_entry_hookfn(&e->hooks[i], skb, &state);
 		switch (verdict & NF_VERDICT_MASK) {
 		case NF_ACCEPT:
+			if (BR_INPUT_SKB_CB(skb)->br_netfilter_broute) {
+				*pskb = skb;
+				return RX_HANDLER_PASS;
+			}
 			break;
 		case NF_DROP:
 			kfree_skb(skb);
@@ -265,7 +265,6 @@ rx_handler_result_t br_handle_frame(struct sk_buff **pskb)
 	struct net_bridge_port *p;
 	struct sk_buff *skb = *pskb;
 	const unsigned char *dest = eth_hdr(skb)->h_dest;
-	br_should_route_hook_t *rhook;
 
 	if (unlikely(skb->pkt_type == PACKET_LOOPBACK))
 		return RX_HANDLER_PASS;
@@ -341,15 +340,6 @@ rx_handler_result_t br_handle_frame(struct sk_buff **pskb)
 forward:
 	switch (p->state) {
 	case BR_STATE_FORWARDING:
-		rhook = rcu_dereference(br_should_route_hook);
-		if (rhook) {
-			if ((*rhook)(skb)) {
-				*pskb = skb;
-				return RX_HANDLER_PASS;
-			}
-			dest = eth_hdr(skb)->h_dest;
-		}
-		/* fall through */
 	case BR_STATE_LEARNING:
 		if (ether_addr_equal(p->br->dev->dev_addr, dest))
 			skb->pkt_type = PACKET_HOST;
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index e7110a6e2b7e..4bea2f11da9b 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -433,6 +433,9 @@ struct br_input_skb_cb {
 #ifdef CONFIG_BRIDGE_VLAN_FILTERING
 	u8 vlan_filtered:1;
 #endif
+#ifdef CONFIG_NETFILTER_FAMILY_BRIDGE
+	u8 br_netfilter_broute:1;
+#endif
 
 #ifdef CONFIG_NET_SWITCHDEV
 	int offload_fwd_mark;
diff --git a/net/bridge/netfilter/ebtable_broute.c b/net/bridge/netfilter/ebtable_broute.c
index 276b60262981..ec2652a459da 100644
--- a/net/bridge/netfilter/ebtable_broute.c
+++ b/net/bridge/netfilter/ebtable_broute.c
@@ -15,6 +15,8 @@
 #include <linux/module.h>
 #include <linux/if_bridge.h>
 
+#include "../br_private.h"
+
 /* EBT_ACCEPT means the frame will be bridged
  * EBT_DROP means the frame will be routed
  */
@@ -48,30 +50,63 @@ static const struct ebt_table broute_table = {
 	.me		= THIS_MODULE,
 };
 
-static int ebt_broute(struct sk_buff *skb)
+static unsigned int ebt_broute(void *priv, struct sk_buff *skb,
+			       const struct nf_hook_state *s)
 {
+	struct net_bridge_port *p = br_port_get_rcu(skb->dev);
 	struct nf_hook_state state;
+	unsigned char *dest;
 	int ret;
 
+	if (!p || p->state != BR_STATE_FORWARDING)
+		return NF_ACCEPT;
+
 	nf_hook_state_init(&state, NF_BR_BROUTING,
-			   NFPROTO_BRIDGE, skb->dev, NULL, NULL,
-			   dev_net(skb->dev), NULL);
+			   NFPROTO_BRIDGE, s->in, NULL, NULL,
+			   s->net, NULL);
 
 	ret = ebt_do_table(skb, &state, state.net->xt.broute_table);
-	if (ret == NF_DROP)
-		return 1; /* route it */
-	return 0; /* bridge it */
+
+	if (ret != NF_DROP)
+		return ret;
+
+	/* DROP in ebtables -t broute means that the
+	 * skb should be routed, not bridged.
+	 * This is awkward, but can't be changed for compatibility
+	 * reasons.
+	 *
+	 * We map DROP to ACCEPT and set the ->br_netfilter_broute flag.
+	 */
+	BR_INPUT_SKB_CB(skb)->br_netfilter_broute = 1;
+
+	/* undo PACKET_HOST mangling done in br_input in case the dst
+	 * address matches the logical bridge but not the port.
+	 */
+	dest = eth_hdr(skb)->h_dest;
+	if (skb->pkt_type == PACKET_HOST &&
+	    !ether_addr_equal(skb->dev->dev_addr, dest) &&
+	     ether_addr_equal(p->br->dev->dev_addr, dest))
+		skb->pkt_type = PACKET_OTHERHOST;
+
+	return NF_ACCEPT;
 }
 
+static const struct nf_hook_ops ebt_ops_broute = {
+	.hook		= ebt_broute,
+	.pf		= NFPROTO_BRIDGE,
+	.hooknum	= NF_BR_PRE_ROUTING,
+	.priority	= NF_BR_PRI_FIRST,
+};
+
 static int __net_init broute_net_init(struct net *net)
 {
-	return ebt_register_table(net, &broute_table, NULL,
+	return ebt_register_table(net, &broute_table, &ebt_ops_broute,
 				  &net->xt.broute_table);
 }
 
 static void __net_exit broute_net_exit(struct net *net)
 {
-	ebt_unregister_table(net, net->xt.broute_table, NULL);
+	ebt_unregister_table(net, net->xt.broute_table, &ebt_ops_broute);
 }
 
 static struct pernet_operations broute_net_ops = {
@@ -81,21 +116,11 @@ static struct pernet_operations broute_net_ops = {
 
 static int __init ebtable_broute_init(void)
 {
-	int ret;
-
-	ret = register_pernet_subsys(&broute_net_ops);
-	if (ret < 0)
-		return ret;
-	/* see br_input.c */
-	RCU_INIT_POINTER(br_should_route_hook,
-			   (br_should_route_hook_t *)ebt_broute);
-	return 0;
+	return register_pernet_subsys(&broute_net_ops);
 }
 
 static void __exit ebtable_broute_fini(void)
 {
-	RCU_INIT_POINTER(br_should_route_hook, NULL);
-	synchronize_net();
 	unregister_pernet_subsys(&broute_net_ops);
 }
 
diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index eb15891f8b9f..383f0328ff68 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -1221,10 +1221,6 @@ int ebt_register_table(struct net *net, const struct ebt_table *input_table,
 	mutex_unlock(&ebt_mutex);
 
 	WRITE_ONCE(*res, table);
-
-	if (!ops)
-		return 0;
-
 	ret = nf_register_net_hooks(net, ops, hweight32(table->valid_hooks));
 	if (ret) {
 		__ebt_unregister_table(net, table);
@@ -1248,8 +1244,7 @@ int ebt_register_table(struct net *net, const struct ebt_table *input_table,
 void ebt_unregister_table(struct net *net, struct ebt_table *table,
 			  const struct nf_hook_ops *ops)
 {
-	if (ops)
-		nf_unregister_net_hooks(net, ops, hweight32(table->valid_hooks));
+	nf_unregister_net_hooks(net, ops, hweight32(table->valid_hooks));
 	__ebt_unregister_table(net, table);
 }
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH nf-next 0/4] netfilter: bridge: remove broute hook
  2019-04-11 14:36 [PATCH nf-next 0/4] netfilter: bridge: remove broute hook Florian Westphal
                   ` (3 preceding siblings ...)
  2019-04-11 14:36 ` [PATCH nf-next 4/4] bridge: broute: make broute a real ebtables table Florian Westphal
@ 2019-04-11 18:25 ` David Miller
  2019-04-11 19:46 ` Nikolay Aleksandrov
  2019-04-11 23:48 ` Pablo Neira Ayuso
  6 siblings, 0 replies; 8+ messages in thread
From: David Miller @ 2019-04-11 18:25 UTC (permalink / raw)
  To: fw; +Cc: netfilter-devel, roopa, nikolay, netdev

From: Florian Westphal <fw@strlen.de>
Date: Thu, 11 Apr 2019 16:36:38 +0200

> This series removes the 'broute' hook by promoting ebtables' broute table
> to a normal ebtables table (invoked via normal PREROUTING netfilter hook).
> 
> The downside is that nf_hook_slow() needs to be duplicated in br_input.c
> (see patch 3).
> 
> However, I think its worth the price as this allows to remove the
> br_should_route_hook.
> 
> There are quite some changes in bridge specific code, if you prefer
> I can re-submit this for net-next instead of nf-next.
> 
> Main motivation is to provide 'ebtables -t broute' functionality via
> nftables later on, this can then be done without touching the bridge
> or netfilter core infrastructure again.

Acked-by: David S. Miller <davem@davemloft.net>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH nf-next 0/4] netfilter: bridge: remove broute hook
  2019-04-11 14:36 [PATCH nf-next 0/4] netfilter: bridge: remove broute hook Florian Westphal
                   ` (4 preceding siblings ...)
  2019-04-11 18:25 ` [PATCH nf-next 0/4] netfilter: bridge: remove broute hook David Miller
@ 2019-04-11 19:46 ` Nikolay Aleksandrov
  2019-04-11 23:48 ` Pablo Neira Ayuso
  6 siblings, 0 replies; 8+ messages in thread
From: Nikolay Aleksandrov @ 2019-04-11 19:46 UTC (permalink / raw)
  To: Florian Westphal, netfilter-devel; +Cc: roopa, netdev

On 11/04/2019 17:36, Florian Westphal wrote:
> This series removes the 'broute' hook by promoting ebtables' broute table
> to a normal ebtables table (invoked via normal PREROUTING netfilter hook).
> 
> The downside is that nf_hook_slow() needs to be duplicated in br_input.c
> (see patch 3).
> 
> However, I think its worth the price as this allows to remove the
> br_should_route_hook.
> 
> There are quite some changes in bridge specific code, if you prefer
> I can re-submit this for net-next instead of nf-next.
> 
> Main motivation is to provide 'ebtables -t broute' functionality via
> nftables later on, this can then be done without touching the bridge
> or netfilter core infrastructure again.
> 
> Florian Westphal (4):
>       selftests: netfilter: add ebtables broute test case
>       bridge: reduce size of input cb to 16 bytes
>       bridge: netfilter: unroll NF_HOOK helper in bridge input path
>       bridge: broute: make broute a real ebtables table
> 
>  include/linux/if_bridge.h                           |    3 
>  include/net/netfilter/nf_queue.h                    |    3 
>  net/bridge/br_arp_nd_proxy.c                        |   18 +-
>  net/bridge/br_input.c                               |   72 +++++++--
>  net/bridge/br_private.h                             |   15 +-
>  net/bridge/netfilter/ebtable_broute.c               |   63 ++++++--
>  net/bridge/netfilter/ebtables.c                     |    7 
>  net/netfilter/core.c                                |    1 
>  net/netfilter/nf_internals.h                        |    3 
>  net/netfilter/nf_queue.c                            |    1 
>  tools/testing/selftests/netfilter/Makefile          |    2 
>  tools/testing/selftests/netfilter/bridge_brouter.sh |  146 ++++++++++++++++++++
>  12 files changed, 268 insertions(+), 66 deletions(-)
> 

The set looks good to me, the only little thing is the new memset() in br_handle_frame(),
before we would lazily zero the fields when needed but that would save us from future
bugs where one could forget to initialize the field.
Now we can remove most of the explicit cb field zeroing and rely on the memset.

Nice work! For the set:

Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH nf-next 0/4] netfilter: bridge: remove broute hook
  2019-04-11 14:36 [PATCH nf-next 0/4] netfilter: bridge: remove broute hook Florian Westphal
                   ` (5 preceding siblings ...)
  2019-04-11 19:46 ` Nikolay Aleksandrov
@ 2019-04-11 23:48 ` Pablo Neira Ayuso
  6 siblings, 0 replies; 8+ messages in thread
From: Pablo Neira Ayuso @ 2019-04-11 23:48 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel, roopa, nikolay, netdev

On Thu, Apr 11, 2019 at 04:36:38PM +0200, Florian Westphal wrote:
> This series removes the 'broute' hook by promoting ebtables' broute table
> to a normal ebtables table (invoked via normal PREROUTING netfilter hook).
> 
> The downside is that nf_hook_slow() needs to be duplicated in br_input.c
> (see patch 3).
> 
> However, I think its worth the price as this allows to remove the
> br_should_route_hook.
> 
> There are quite some changes in bridge specific code, if you prefer
> I can re-submit this for net-next instead of nf-next.
> 
> Main motivation is to provide 'ebtables -t broute' functionality via
> nftables later on, this can then be done without touching the bridge
> or netfilter core infrastructure again.

Series applied, thanks Florian.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-04-11 23:48 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-04-11 14:36 [PATCH nf-next 0/4] netfilter: bridge: remove broute hook Florian Westphal
2019-04-11 14:36 ` [PATCH nf-next 1/4] selftests: netfilter: add ebtables broute test case Florian Westphal
2019-04-11 14:36 ` [PATCH nf-next 2/4] bridge: reduce size of input cb to 16 bytes Florian Westphal
2019-04-11 14:36 ` [PATCH nf-next 3/4] bridge: netfilter: unroll NF_HOOK helper in bridge input path Florian Westphal
2019-04-11 14:36 ` [PATCH nf-next 4/4] bridge: broute: make broute a real ebtables table Florian Westphal
2019-04-11 18:25 ` [PATCH nf-next 0/4] netfilter: bridge: remove broute hook David Miller
2019-04-11 19:46 ` Nikolay Aleksandrov
2019-04-11 23:48 ` Pablo Neira Ayuso

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).