netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v16 nf-next 0/3] conntrack: bridge: add double vlan, pppoe and pppoe-in-q
@ 2025-11-04 14:57 Eric Woudstra
  2025-11-04 14:57 ` [PATCH v16 nf-next 1/3] netfilter: utils: nf_checksum(_partial) correct data!=networkheader Eric Woudstra
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Eric Woudstra @ 2025-11-04 14:57 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
	Phil Sutter, Nikolay Aleksandrov, Ido Schimmel, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman
  Cc: netfilter-devel, bridge, netdev, Eric Woudstra

Conntrack bridge only tracks untagged and 802.1q.

To make the bridge-fastpath experience more similar to the
forward-fastpath experience, introduce patches for double vlan,
pppoe and pppoe-in-q tagged packets to bridge conntrack and to
bridge filter chain.

Changes in v16:

- Changed nft_chain_filter patch: Only help populating pktinfo offsets,
   call nft_do_chain() with original network_offset.
- Changed commit messages.
- Removed kernel-doc comments.

Changes in v15:

- Do not munge skb->protocol.
- Introduce nft_set_bridge_pktinfo() helper.
- Introduce nf_ct_bridge_pre_inner() helper.
- nf_ct_bridge_pre(): Don't trim on ph->hdr.length, only compare to what
   ip header claims and return NF_ACCEPT if it does not match.
- nf_ct_bridge_pre(): Renamed u32 data_len to pppoe_len.
- nf_ct_bridge_pre(): Reset network_header only when ret == NF_ACCEPT.
- nf_checksum(_partial)(): Use of skb_network_offset().
- nf_checksum(_partial)(): Use 'if (WARN_ON()) return 0' instead.
- nf_checksum(_partial)(): Added comments

Changes in v14:

- nf_checksum(_patial): Use DEBUG_NET_WARN_ON_ONCE(
   !skb_pointer_if_linear()) instead of pskb_may_pull().
- nft_do_chain_bridge: Added default case ph->proto is neither
   ipv4 nor ipv6.
- nft_do_chain_bridge: only reset network header when ret == NF_ACCEPT.

Changes in v13:

- Do not use pull/push before/after calling nf_conntrack_in() or
   nft_do_chain().
- Add patch to correct calculating checksum when skb->data !=
   skb_network_header(skb).

Changes in v12:

- Only allow tracking this traffic when a conntrack zone is set.
- nf_ct_bridge_pre(): skb pull/push without touching the checksum,
   because the pull is always restored with push.
- nft_do_chain_bridge(): handle the extra header similar to
   nf_ct_bridge_pre(), using pull/push.

Changes in v11:

- nft_do_chain_bridge(): Proper readout of encapsulated proto.
- nft_do_chain_bridge(): Use skb_set_network_header() instead of thoff.
- removed test script, it is now in separate patch.

v10 split from patch-set: bridge-fastpath and related improvements v9

Eric Woudstra (3):
  netfilter: utils: nf_checksum(_partial) correct data!=networkheader
  netfilter: bridge: Add conntrack double vlan and pppoe
  netfilter: nft_chain_filter: Add bridge double vlan and pppoe

 net/bridge/netfilter/nf_conntrack_bridge.c | 92 ++++++++++++++++++----
 net/netfilter/nft_chain_filter.c           | 58 +++++++++++++-
 net/netfilter/utils.c                      | 28 +++++--
 3 files changed, 153 insertions(+), 25 deletions(-)

-- 
2.50.0


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v16 nf-next 1/3] netfilter: utils: nf_checksum(_partial) correct data!=networkheader
  2025-11-04 14:57 [PATCH v16 nf-next 0/3] conntrack: bridge: add double vlan, pppoe and pppoe-in-q Eric Woudstra
@ 2025-11-04 14:57 ` Eric Woudstra
  2025-11-04 14:57 ` [PATCH v16 nf-next 2/3] netfilter: bridge: Add conntrack double vlan and pppoe Eric Woudstra
  2025-11-04 14:57 ` [PATCH v16 nf-next 3/3] netfilter: nft_chain_filter: Add bridge " Eric Woudstra
  2 siblings, 0 replies; 8+ messages in thread
From: Eric Woudstra @ 2025-11-04 14:57 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
	Phil Sutter, Nikolay Aleksandrov, Ido Schimmel, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman
  Cc: netfilter-devel, bridge, netdev, Eric Woudstra

In the conntrack hook it may not always be the case that:
skb_network_header(skb) == skb->data, i.e. skb_network_offset(skb)
is zero.

This is problematic when L4 function nf_conntrack_handle_packet()
is accessing L3 data. This function uses thoff and ip_hdr()
to finds it's data. But it also calculates the checksum.
nf_checksum() and nf_checksum_partial() both use lower skb-checksum
functions that are based on using skb->data.

Adjust for skb_network_offset(skb), so that the checksum is calculated
correctly.

Signed-off-by: Eric Woudstra <ericwouds@gmail.com>
---
 net/netfilter/utils.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/net/netfilter/utils.c b/net/netfilter/utils.c
index 008419db815a..7b33fe63c5fa 100644
--- a/net/netfilter/utils.c
+++ b/net/netfilter/utils.c
@@ -124,16 +124,25 @@ __sum16 nf_checksum(struct sk_buff *skb, unsigned int hook,
 		    unsigned int dataoff, u8 protocol,
 		    unsigned short family)
 {
+	unsigned int nhpull = skb_network_offset(skb);
 	__sum16 csum = 0;
 
+	if (WARN_ON(!skb_pointer_if_linear(skb, nhpull, 0)))
+		return 0;
+
+	/* pull/push because the lower csum functions assume that
+	 * skb_network_offset(skb) is zero.
+	 */
+	__skb_pull(skb, nhpull);
 	switch (family) {
 	case AF_INET:
-		csum = nf_ip_checksum(skb, hook, dataoff, protocol);
+		csum = nf_ip_checksum(skb, hook, dataoff - nhpull, protocol);
 		break;
 	case AF_INET6:
-		csum = nf_ip6_checksum(skb, hook, dataoff, protocol);
+		csum = nf_ip6_checksum(skb, hook, dataoff - nhpull, protocol);
 		break;
 	}
+	__skb_push(skb, nhpull);
 
 	return csum;
 }
@@ -143,18 +152,25 @@ __sum16 nf_checksum_partial(struct sk_buff *skb, unsigned int hook,
 			    unsigned int dataoff, unsigned int len,
 			    u8 protocol, unsigned short family)
 {
+	unsigned int nhpull = skb_network_offset(skb);
 	__sum16 csum = 0;
 
+	if (WARN_ON(!skb_pointer_if_linear(skb, nhpull, 0)))
+		return 0;
+
+	/* See nf_checksum() */
+	__skb_pull(skb, nhpull);
 	switch (family) {
 	case AF_INET:
-		csum = nf_ip_checksum_partial(skb, hook, dataoff, len,
-					      protocol);
+		csum = nf_ip_checksum_partial(skb, hook, dataoff - nhpull,
+					      len, protocol);
 		break;
 	case AF_INET6:
-		csum = nf_ip6_checksum_partial(skb, hook, dataoff, len,
-					       protocol);
+		csum = nf_ip6_checksum_partial(skb, hook, dataoff - nhpull,
+					       len, protocol);
 		break;
 	}
+	__skb_push(skb, nhpull);
 
 	return csum;
 }
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v16 nf-next 2/3] netfilter: bridge: Add conntrack double vlan and pppoe
  2025-11-04 14:57 [PATCH v16 nf-next 0/3] conntrack: bridge: add double vlan, pppoe and pppoe-in-q Eric Woudstra
  2025-11-04 14:57 ` [PATCH v16 nf-next 1/3] netfilter: utils: nf_checksum(_partial) correct data!=networkheader Eric Woudstra
@ 2025-11-04 14:57 ` Eric Woudstra
  2025-11-04 14:57 ` [PATCH v16 nf-next 3/3] netfilter: nft_chain_filter: Add bridge " Eric Woudstra
  2 siblings, 0 replies; 8+ messages in thread
From: Eric Woudstra @ 2025-11-04 14:57 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
	Phil Sutter, Nikolay Aleksandrov, Ido Schimmel, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman
  Cc: netfilter-devel, bridge, netdev, Eric Woudstra

In a bridge, until now, it is possible to track connections of plain
ip(v6) and ip(v6) encapsulated in single 802.1q or 802.1ad.

This patch adds the capability to track connections when the connection
is (also) encapsulated in PPPoE. It also adds the capability to track
connections that are encapsulated in an inner 802.1q, combined with an
outer 802.1ad or 802.1q encapsulation.

To prevent mixing connections that are tagged differently in the L2
encapsulations, one should separate them using conntrack zones.
Using a conntrack zone is a hard requirement for the newly added
encapsulations of the tracking capability inside a bridge.

Signed-off-by: Eric Woudstra <ericwouds@gmail.com>
---
 net/bridge/netfilter/nf_conntrack_bridge.c | 92 ++++++++++++++++++----
 1 file changed, 75 insertions(+), 17 deletions(-)

diff --git a/net/bridge/netfilter/nf_conntrack_bridge.c b/net/bridge/netfilter/nf_conntrack_bridge.c
index 6482de4d8750..39e844b3d3c4 100644
--- a/net/bridge/netfilter/nf_conntrack_bridge.c
+++ b/net/bridge/netfilter/nf_conntrack_bridge.c
@@ -237,58 +237,116 @@ static int nf_ct_br_ipv6_check(const struct sk_buff *skb)
 	return 0;
 }
 
+static int nf_ct_bridge_pre_inner(struct sk_buff *skb, __be16 *proto, u32 *len)
+{
+	switch (*proto) {
+	case htons(ETH_P_PPP_SES): {
+		struct ppp_hdr {
+			struct pppoe_hdr hdr;
+			__be16 proto;
+		} *ph;
+
+		if (!pskb_may_pull(skb, PPPOE_SES_HLEN))
+			return -1;
+		ph = (struct ppp_hdr *)(skb->data);
+		switch (ph->proto) {
+		case htons(PPP_IP):
+			*proto = htons(ETH_P_IP);
+			*len = ntohs(ph->hdr.length) - 2;
+			skb_set_network_header(skb, PPPOE_SES_HLEN);
+			return PPPOE_SES_HLEN;
+		case htons(PPP_IPV6):
+			*proto = htons(ETH_P_IPV6);
+			*len = ntohs(ph->hdr.length) - 2;
+			skb_set_network_header(skb, PPPOE_SES_HLEN);
+			return PPPOE_SES_HLEN;
+		}
+		break;
+	}
+	case htons(ETH_P_8021Q): {
+		struct vlan_hdr *vhdr;
+
+		if (!pskb_may_pull(skb, VLAN_HLEN))
+			return -1;
+		vhdr = (struct vlan_hdr *)(skb->data);
+		*proto = vhdr->h_vlan_encapsulated_proto;
+		skb_set_network_header(skb, VLAN_HLEN);
+		return VLAN_HLEN;
+	}
+	}
+	return 0;
+}
+
 static unsigned int nf_ct_bridge_pre(void *priv, struct sk_buff *skb,
 				     const struct nf_hook_state *state)
 {
 	struct nf_hook_state bridge_state = *state;
+	int ret = NF_ACCEPT, offset = 0;
 	enum ip_conntrack_info ctinfo;
+	u32 len, pppoe_len = 0;
 	struct nf_conn *ct;
-	u32 len;
-	int ret;
+	__be16 proto;
 
 	ct = nf_ct_get(skb, &ctinfo);
 	if ((ct && !nf_ct_is_template(ct)) ||
 	    ctinfo == IP_CT_UNTRACKED)
 		return NF_ACCEPT;
 
-	switch (skb->protocol) {
-	case htons(ETH_P_IP):
-		if (!pskb_may_pull(skb, sizeof(struct iphdr)))
+	proto = skb->protocol;
+
+	if (ct && nf_ct_zone_id(nf_ct_zone(ct), CTINFO2DIR(ctinfo)) !=
+			NF_CT_DEFAULT_ZONE_ID) {
+		offset = nf_ct_bridge_pre_inner(skb, &proto, &pppoe_len);
+		if (offset < 0)
 			return NF_ACCEPT;
+	}
+
+	switch (proto) {
+	case htons(ETH_P_IP):
+		if (!pskb_may_pull(skb, offset + sizeof(struct iphdr)))
+			goto do_not_track;
 
 		len = skb_ip_totlen(skb);
-		if (pskb_trim_rcsum(skb, len))
-			return NF_ACCEPT;
+		if (pppoe_len && pppoe_len != len)
+			goto do_not_track;
+		if (pskb_trim_rcsum(skb, offset + len))
+			goto do_not_track;
 
 		if (nf_ct_br_ip_check(skb))
-			return NF_ACCEPT;
+			goto do_not_track;
 
 		bridge_state.pf = NFPROTO_IPV4;
 		ret = nf_ct_br_defrag4(skb, &bridge_state);
 		break;
 	case htons(ETH_P_IPV6):
-		if (!pskb_may_pull(skb, sizeof(struct ipv6hdr)))
-			return NF_ACCEPT;
+		if (!pskb_may_pull(skb, offset + sizeof(struct ipv6hdr)))
+			goto do_not_track;
 
 		len = sizeof(struct ipv6hdr) + ntohs(ipv6_hdr(skb)->payload_len);
-		if (pskb_trim_rcsum(skb, len))
-			return NF_ACCEPT;
+		if (pppoe_len && pppoe_len != len)
+			goto do_not_track;
+		if (pskb_trim_rcsum(skb, offset + len))
+			goto do_not_track;
 
 		if (nf_ct_br_ipv6_check(skb))
-			return NF_ACCEPT;
+			goto do_not_track;
 
 		bridge_state.pf = NFPROTO_IPV6;
 		ret = nf_ct_br_defrag6(skb, &bridge_state);
 		break;
 	default:
 		nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
-		return NF_ACCEPT;
+		goto do_not_track;
 	}
 
-	if (ret != NF_ACCEPT)
-		return ret;
+	if (ret == NF_ACCEPT)
+		ret = nf_conntrack_in(skb, &bridge_state);
+
+do_not_track:
+	if (offset && ret == NF_ACCEPT)
+		skb_reset_network_header(skb);
 
-	return nf_conntrack_in(skb, &bridge_state);
+	return ret;
 }
 
 static unsigned int nf_ct_bridge_in(void *priv, struct sk_buff *skb,
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v16 nf-next 3/3] netfilter: nft_chain_filter: Add bridge double vlan and pppoe
  2025-11-04 14:57 [PATCH v16 nf-next 0/3] conntrack: bridge: add double vlan, pppoe and pppoe-in-q Eric Woudstra
  2025-11-04 14:57 ` [PATCH v16 nf-next 1/3] netfilter: utils: nf_checksum(_partial) correct data!=networkheader Eric Woudstra
  2025-11-04 14:57 ` [PATCH v16 nf-next 2/3] netfilter: bridge: Add conntrack double vlan and pppoe Eric Woudstra
@ 2025-11-04 14:57 ` Eric Woudstra
  2025-11-04 15:53   ` Florian Westphal
  2025-11-07  0:06   ` Florian Westphal
  2 siblings, 2 replies; 8+ messages in thread
From: Eric Woudstra @ 2025-11-04 14:57 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
	Phil Sutter, Nikolay Aleksandrov, Ido Schimmel, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman
  Cc: netfilter-devel, bridge, netdev, Eric Woudstra

In nft_do_chain_bridge() pktinfo is only fully populated for plain packets
and packets encapsulated in single 802.1q or 802.1ad.

When implementing the software bridge-fastpath and testing all possible
encapulations, there can be more encapsulations:

The packet could (also) be encapsulated in PPPoE, or the packet could be
encapsulated in an inner 802.1q, combined with an outer 802.1ad or 802.1q
encapsulation.

nft_flow_offload_eval() also examines the L4 header, with the L4 protocol
known from the conntrack-tuplehash. To access the header it uses
nft_thoff(), but for these packets it returns zero.

Introduce nft_set_bridge_pktinfo() to help populate pktinfo with the
offsets, without setting pkt->tprot and the corresponding pkt->flags.

This will not change rule processing, but does make these offsets
available for code that is not checking pkt->flags to use the offsets,
like nft_flow_offload_eval().

Existing behaviour for a rule like "tcp dport 22 accept" is not changed
when, for instance, a PPPoE packet is being matched.

Signed-off-by: Eric Woudstra <ericwouds@gmail.com>
---
 net/netfilter/nft_chain_filter.c | 58 ++++++++++++++++++++++++++++++--
 1 file changed, 56 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nft_chain_filter.c b/net/netfilter/nft_chain_filter.c
index b16185e9a6dd..1f3ae5687917 100644
--- a/net/netfilter/nft_chain_filter.c
+++ b/net/netfilter/nft_chain_filter.c
@@ -227,16 +227,64 @@ static inline void nft_chain_filter_inet_fini(void) {}
 #endif /* CONFIG_NF_TABLES_IPV6 */
 
 #if IS_ENABLED(CONFIG_NF_TABLES_BRIDGE)
+static int nft_set_bridge_pktinfo(struct nft_pktinfo *pkt, struct sk_buff *skb,
+				  const struct nf_hook_state *state,
+				  __be16 *proto)
+{
+	nft_set_pktinfo(pkt, skb, state);
+
+	switch (*proto) {
+	case htons(ETH_P_PPP_SES): {
+		struct ppp_hdr {
+			struct pppoe_hdr hdr;
+			__be16 proto;
+		} *ph;
+
+		if (!pskb_may_pull(skb, PPPOE_SES_HLEN))
+			return -1;
+		ph = (struct ppp_hdr *)(skb->data);
+		switch (ph->proto) {
+		case htons(PPP_IP):
+			*proto = htons(ETH_P_IP);
+			skb_set_network_header(skb, PPPOE_SES_HLEN);
+			return PPPOE_SES_HLEN;
+		case htons(PPP_IPV6):
+			*proto = htons(ETH_P_IPV6);
+			skb_set_network_header(skb, PPPOE_SES_HLEN);
+			return PPPOE_SES_HLEN;
+		}
+		break;
+	}
+	case htons(ETH_P_8021Q): {
+		struct vlan_hdr *vhdr;
+
+		if (!pskb_may_pull(skb, VLAN_HLEN))
+			return -1;
+		vhdr = (struct vlan_hdr *)(skb->data);
+		*proto = vhdr->h_vlan_encapsulated_proto;
+		skb_set_network_header(skb, VLAN_HLEN);
+		return VLAN_HLEN;
+	}
+	}
+	return 0;
+}
+
 static unsigned int
 nft_do_chain_bridge(void *priv,
 		    struct sk_buff *skb,
 		    const struct nf_hook_state *state)
 {
 	struct nft_pktinfo pkt;
+	__be16 proto;
+	int offset;
 
-	nft_set_pktinfo(&pkt, skb, state);
+	proto = eth_hdr(skb)->h_proto;
 
-	switch (eth_hdr(skb)->h_proto) {
+	offset = nft_set_bridge_pktinfo(&pkt, skb, state, &proto);
+	if (offset < 0)
+		return NF_ACCEPT;
+
+	switch (proto) {
 	case htons(ETH_P_IP):
 		nft_set_pktinfo_ipv4_validate(&pkt);
 		break;
@@ -248,6 +296,12 @@ nft_do_chain_bridge(void *priv,
 		break;
 	}
 
+	if (offset) {
+		skb_reset_network_header(skb);
+		pkt.flags = 0;
+		pkt.tprot = 0;
+	}
+
 	return nft_do_chain(&pkt, priv);
 }
 
-- 
2.50.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v16 nf-next 3/3] netfilter: nft_chain_filter: Add bridge double vlan and pppoe
  2025-11-04 14:57 ` [PATCH v16 nf-next 3/3] netfilter: nft_chain_filter: Add bridge " Eric Woudstra
@ 2025-11-04 15:53   ` Florian Westphal
  2025-11-04 19:15     ` Eric Woudstra
  2025-11-07  0:06   ` Florian Westphal
  1 sibling, 1 reply; 8+ messages in thread
From: Florian Westphal @ 2025-11-04 15:53 UTC (permalink / raw)
  To: Eric Woudstra
  Cc: Pablo Neira Ayuso, Jozsef Kadlecsik, Phil Sutter,
	Nikolay Aleksandrov, Ido Schimmel, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, netfilter-devel,
	bridge, netdev

Eric Woudstra <ericwouds@gmail.com> wrote:
> +	switch (*proto) {
> +	case htons(ETH_P_PPP_SES): {
> +		struct ppp_hdr {
> +			struct pppoe_hdr hdr;
> +			__be16 proto;
> +		} *ph;
> +
> +		if (!pskb_may_pull(skb, PPPOE_SES_HLEN))
> +			return -1;
> +		ph = (struct ppp_hdr *)(skb->data);
> +		switch (ph->proto) {
> +		case htons(PPP_IP):
> +			*proto = htons(ETH_P_IP);
> +			skb_set_network_header(skb, PPPOE_SES_HLEN);
> +			return PPPOE_SES_HLEN;
> +		case htons(PPP_IPV6):
> +			*proto = htons(ETH_P_IPV6);
> +			skb_set_network_header(skb, PPPOE_SES_HLEN);
> +			return PPPOE_SES_HLEN;
> +		}
> +		break;
> +	}
> +	case htons(ETH_P_8021Q): {
> +		struct vlan_hdr *vhdr;
> +
> +		if (!pskb_may_pull(skb, VLAN_HLEN))
> +			return -1;
> +		vhdr = (struct vlan_hdr *)(skb->data);
> +		*proto = vhdr->h_vlan_encapsulated_proto;
> +		skb_set_network_header(skb, VLAN_HLEN);
> +		return VLAN_HLEN;
> +	}
> +	}
> +	return 0;
> +}
> +
>  static unsigned int
>  nft_do_chain_bridge(void *priv,
>  		    struct sk_buff *skb,
>  		    const struct nf_hook_state *state)
>  {
>  	struct nft_pktinfo pkt;
> +	__be16 proto;
> +	int offset;
>  
> -	nft_set_pktinfo(&pkt, skb, state);
> +	proto = eth_hdr(skb)->h_proto;
>  
> -	switch (eth_hdr(skb)->h_proto) {
> +	offset = nft_set_bridge_pktinfo(&pkt, skb, state, &proto);
> +	if (offset < 0)
> +		return NF_ACCEPT;


Hmm.  I'm not sure, I think this should either drop them right away
OR pass them to do_chain without any changes (i.e. retain existing
behavior and have this be same as nft_set_pktinfo_unspec()).

but please wait until resend.

I hope to finish a larger set i've been working on by tomorrow.
Then I can give this a more thorough review (and also make a summary +
suggestion wrt. the bridge match semantics wrt.  vlan + pppoe etc.

My hunch is that your approach is pretty much the way to go
but I need to complete related homework to make sure I did not
miss/forget anything.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v16 nf-next 3/3] netfilter: nft_chain_filter: Add bridge double vlan and pppoe
  2025-11-04 15:53   ` Florian Westphal
@ 2025-11-04 19:15     ` Eric Woudstra
  2025-11-06 23:47       ` Florian Westphal
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Woudstra @ 2025-11-04 19:15 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Pablo Neira Ayuso, Jozsef Kadlecsik, Phil Sutter,
	Nikolay Aleksandrov, Ido Schimmel, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, netfilter-devel,
	bridge, netdev



On 11/4/25 4:53 PM, Florian Westphal wrote:
> Eric Woudstra <ericwouds@gmail.com> wrote:
>> +	switch (*proto) {
>> +	case htons(ETH_P_PPP_SES): {
>> +		struct ppp_hdr {
>> +			struct pppoe_hdr hdr;
>> +			__be16 proto;
>> +		} *ph;
>> +
>> +		if (!pskb_may_pull(skb, PPPOE_SES_HLEN))
>> +			return -1;
>> +		ph = (struct ppp_hdr *)(skb->data);
>> +		switch (ph->proto) {
>> +		case htons(PPP_IP):
>> +			*proto = htons(ETH_P_IP);
>> +			skb_set_network_header(skb, PPPOE_SES_HLEN);
>> +			return PPPOE_SES_HLEN;
>> +		case htons(PPP_IPV6):
>> +			*proto = htons(ETH_P_IPV6);
>> +			skb_set_network_header(skb, PPPOE_SES_HLEN);
>> +			return PPPOE_SES_HLEN;
>> +		}
>> +		break;
>> +	}
>> +	case htons(ETH_P_8021Q): {
>> +		struct vlan_hdr *vhdr;
>> +
>> +		if (!pskb_may_pull(skb, VLAN_HLEN))
>> +			return -1;
>> +		vhdr = (struct vlan_hdr *)(skb->data);
>> +		*proto = vhdr->h_vlan_encapsulated_proto;
>> +		skb_set_network_header(skb, VLAN_HLEN);
>> +		return VLAN_HLEN;
>> +	}
>> +	}
>> +	return 0;
>> +}
>> +
>>  static unsigned int
>>  nft_do_chain_bridge(void *priv,
>>  		    struct sk_buff *skb,
>>  		    const struct nf_hook_state *state)
>>  {
>>  	struct nft_pktinfo pkt;
>> +	__be16 proto;
>> +	int offset;
>>  
>> -	nft_set_pktinfo(&pkt, skb, state);
>> +	proto = eth_hdr(skb)->h_proto;
>>  
>> -	switch (eth_hdr(skb)->h_proto) {
>> +	offset = nft_set_bridge_pktinfo(&pkt, skb, state, &proto);
>> +	if (offset < 0)
>> +		return NF_ACCEPT;
> 
> 
> Hmm.  I'm not sure, I think this should either drop them right away
> OR pass them to do_chain without any changes (i.e. retain existing
> behavior and have this be same as nft_set_pktinfo_unspec()).
> 
> but please wait until resend.
> 
> I hope to finish a larger set i've been working on by tomorrow.
> Then I can give this a more thorough review (and also make a summary +
> suggestion wrt. the bridge match semantics wrt.  vlan + pppoe etc.
> 
> My hunch is that your approach is pretty much the way to go
> but I need to complete related homework to make sure I did not
> miss/forget anything.

I understand. I've send this, because from v5 to v15 it moved towards
matching in the rule, but it all started with the fact that
nft_flow_offload_eval() uses nft_thoff().

At a bare minimum I need to address having pkt->thoff set correctly to
implement the software bridge-fastpath.

The fact that it also makes possible to match L3/L4 data in the rule is
also nice to have.

Hopefully you can take this in consideration during your review.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v16 nf-next 3/3] netfilter: nft_chain_filter: Add bridge double vlan and pppoe
  2025-11-04 19:15     ` Eric Woudstra
@ 2025-11-06 23:47       ` Florian Westphal
  0 siblings, 0 replies; 8+ messages in thread
From: Florian Westphal @ 2025-11-06 23:47 UTC (permalink / raw)
  To: Eric Woudstra
  Cc: Pablo Neira Ayuso, Jozsef Kadlecsik, Phil Sutter,
	Nikolay Aleksandrov, Ido Schimmel, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, netfilter-devel,
	bridge, netdev

Eric Woudstra <ericwouds@gmail.com> wrote:
> >> +		return NF_ACCEPT;
> > 
> > 
> > Hmm.  I'm not sure, I think this should either drop them right away
> > OR pass them to do_chain without any changes (i.e. retain existing
> > behavior and have this be same as nft_set_pktinfo_unspec()).
> > 
> > but please wait until resend.
> > 
> > I hope to finish a larger set i've been working on by tomorrow.
> > Then I can give this a more thorough review (and also make a summary +
> > suggestion wrt. the bridge match semantics wrt.  vlan + pppoe etc.
> > 
> > My hunch is that your approach is pretty much the way to go
> > but I need to complete related homework to make sure I did not
> > miss/forget anything.
> 
> I understand. I've send this, because from v5 to v15 it moved towards
> matching in the rule, but it all started with the fact that
> nft_flow_offload_eval() uses nft_thoff().
> 
> At a bare minimum I need to address having pkt->thoff set correctly to
> implement the software bridge-fastpath.

Sorry, fail to see how thats related.

If the header is incomplete, then with your approach the packet isn't
even seen by nft_flow_offload_eval() (-> NF_ACCEPT'd).

Whats the problem with passing the skb to nft_do_chain in its
'original' form?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v16 nf-next 3/3] netfilter: nft_chain_filter: Add bridge double vlan and pppoe
  2025-11-04 14:57 ` [PATCH v16 nf-next 3/3] netfilter: nft_chain_filter: Add bridge " Eric Woudstra
  2025-11-04 15:53   ` Florian Westphal
@ 2025-11-07  0:06   ` Florian Westphal
  1 sibling, 0 replies; 8+ messages in thread
From: Florian Westphal @ 2025-11-07  0:06 UTC (permalink / raw)
  To: Eric Woudstra
  Cc: Pablo Neira Ayuso, Jozsef Kadlecsik, Phil Sutter,
	Nikolay Aleksandrov, Ido Schimmel, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, netfilter-devel,
	bridge, netdev

Eric Woudstra <ericwouds@gmail.com> wrote:
> In nft_do_chain_bridge() pktinfo is only fully populated for plain packets
> and packets encapsulated in single 802.1q or 802.1ad.
> 
> When implementing the software bridge-fastpath and testing all possible
> encapulations, there can be more encapsulations:
> 
> The packet could (also) be encapsulated in PPPoE, or the packet could be
> encapsulated in an inner 802.1q, combined with an outer 802.1ad or 802.1q
> encapsulation.
> 
> nft_flow_offload_eval() also examines the L4 header, with the L4 protocol
> known from the conntrack-tuplehash. To access the header it uses
> nft_thoff(), but for these packets it returns zero.
> 
> Introduce nft_set_bridge_pktinfo() to help populate pktinfo with the
> offsets, without setting pkt->tprot and the corresponding pkt->flags.
> 
> This will not change rule processing, but does make these offsets
> available for code that is not checking pkt->flags to use the offsets,
> like nft_flow_offload_eval().

Thanks.  I think this is fine and we can extend/change this later on.

> +	case htons(ETH_P_8021Q): {
> +		struct vlan_hdr *vhdr;
> +
> +		if (!pskb_may_pull(skb, VLAN_HLEN))
> +			return -1;
> +		vhdr = (struct vlan_hdr *)(skb->data);
> +		*proto = vhdr->h_vlan_encapsulated_proto;
> +		skb_set_network_header(skb, VLAN_HLEN);

Could you move the skb_set_network_header() calls
to places where we know that we have an ip/ipv6 proto
in the upper header?

You already return the length of the encap header
anyway, might as well restrict the skb nh update
to when it will be useful.

Or was there another reason to do it this way?

If you prefer you can only resend this patch, I believe there
are two different use cases:

This patch as requirement for future ft offload of ip/ip6 packets
in bridged pppoe/qinq and this patch as starting point to eventually
allow 'ip saddr 1.2.3.4' to auto-match for PPPoE, Vlan, etc.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-11-07  0:06 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-04 14:57 [PATCH v16 nf-next 0/3] conntrack: bridge: add double vlan, pppoe and pppoe-in-q Eric Woudstra
2025-11-04 14:57 ` [PATCH v16 nf-next 1/3] netfilter: utils: nf_checksum(_partial) correct data!=networkheader Eric Woudstra
2025-11-04 14:57 ` [PATCH v16 nf-next 2/3] netfilter: bridge: Add conntrack double vlan and pppoe Eric Woudstra
2025-11-04 14:57 ` [PATCH v16 nf-next 3/3] netfilter: nft_chain_filter: Add bridge " Eric Woudstra
2025-11-04 15:53   ` Florian Westphal
2025-11-04 19:15     ` Eric Woudstra
2025-11-06 23:47       ` Florian Westphal
2025-11-07  0:06   ` Florian Westphal

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).