netdev.vger.kernel.org archive mirror
* [PATCH v2 nf-next 0/3] flow offload teardown when layer 2 roaming
@ 2025-04-08 14:28 Eric Woudstra
  2025-04-08 14:28 ` [PATCH v2 nf-next 1/3] netfilter: flow: Add bridge_vid member Eric Woudstra
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Eric Woudstra @ 2025-04-08 14:28 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Jozsef Kadlecsik, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman
  Cc: netfilter-devel, netdev, Eric Woudstra

When a bridge is part of the forward-fastpath or bridge-fastpath, the fdb
is used to create the tuple. When a station roams at layer 2, for example
with 802.11r, the destination device changes in the fdb. The destination
device of a direct transmitting tuple is then no longer valid and traffic
is sent to the wrong destination. The hardware offloaded fastpath is no
longer valid either.

This flow entry needs to be torn down as soon as possible. Also make sure
that the flow entry is not used once it is marked for teardown.

Changes in v2:
- No code changes; only the subject tags changed from RFC net-next to
  PATCH nf-next.

Eric Woudstra (3):
  netfilter: flow: Add bridge_vid member
  netfilter: nf_flow_table_core: teardown direct xmit when destination
    changed
  netfilter: nf_flow_table_ip: don't follow fastpath when marked
    teardown

 include/net/netfilter/nf_flow_table.h |  2 +
 net/netfilter/nf_flow_table_core.c    | 66 +++++++++++++++++++++++++++
 net/netfilter/nf_flow_table_ip.c      |  6 +++
 net/netfilter/nft_flow_offload.c      |  3 ++
 4 files changed, 77 insertions(+)

-- 
2.47.1



* [PATCH v2 nf-next 1/3] netfilter: flow: Add bridge_vid member
  2025-04-08 14:28 [PATCH v2 nf-next 0/3] flow offload teardown when layer 2 roaming Eric Woudstra
@ 2025-04-08 14:28 ` Eric Woudstra
  2025-04-08 14:28 ` [PATCH v2 nf-next 2/3] netfilter: nf_flow_table_core: teardown direct xmit when destination changed Eric Woudstra
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Eric Woudstra @ 2025-04-08 14:28 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Jozsef Kadlecsik, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman
  Cc: netfilter-devel, netdev, Eric Woudstra

Store the vid used on the bridge in the flow_offload_tuple, so it can be
used later to identify fdb entries that relate to the tuple.

The bridge_vid member is added to the structures nft_forward_info,
nf_flow_route and flow_offload_tuple. It can now be passed from
net_device_path->bridge.vlan_id to flow_offload_tuple->out.bridge_vid.

Signed-off-by: Eric Woudstra <ericwouds@gmail.com>
---
 include/net/netfilter/nf_flow_table.h | 2 ++
 net/netfilter/nf_flow_table_core.c    | 1 +
 net/netfilter/nft_flow_offload.c      | 3 +++
 3 files changed, 6 insertions(+)

diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h
index d711642e78b5..9d9363e91587 100644
--- a/include/net/netfilter/nf_flow_table.h
+++ b/include/net/netfilter/nf_flow_table.h
@@ -146,6 +146,7 @@ struct flow_offload_tuple {
 		struct {
 			u32		ifidx;
 			u32		hw_ifidx;
+			u16		bridge_vid;
 			u8		h_source[ETH_ALEN];
 			u8		h_dest[ETH_ALEN];
 		} out;
@@ -212,6 +213,7 @@ struct nf_flow_route {
 		struct {
 			u32			ifindex;
 			u32			hw_ifindex;
+			u16			bridge_vid;
 			u8			h_source[ETH_ALEN];
 			u8			h_dest[ETH_ALEN];
 		} out;
diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
index 9d8361526f82..f6a30fc14fec 100644
--- a/net/netfilter/nf_flow_table_core.c
+++ b/net/netfilter/nf_flow_table_core.c
@@ -128,6 +128,7 @@ static int flow_offload_fill_route(struct flow_offload *flow,
 		       ETH_ALEN);
 		flow_tuple->out.ifidx = route->tuple[dir].out.ifindex;
 		flow_tuple->out.hw_ifidx = route->tuple[dir].out.hw_ifindex;
+		flow_tuple->out.bridge_vid = route->tuple[dir].out.bridge_vid;
 		dst_release(dst);
 		break;
 	case FLOW_OFFLOAD_XMIT_XFRM:
diff --git a/net/netfilter/nft_flow_offload.c b/net/netfilter/nft_flow_offload.c
index fdf927a8252d..31372a8ef37e 100644
--- a/net/netfilter/nft_flow_offload.c
+++ b/net/netfilter/nft_flow_offload.c
@@ -85,6 +85,7 @@ struct nft_forward_info {
 		__u16	id;
 		__be16	proto;
 	} encap[NF_FLOW_TABLE_ENCAP_MAX];
+	u16 bridge_vid;
 	u8 num_encaps;
 	u8 ingress_vlans;
 	u8 h_source[ETH_ALEN];
@@ -159,6 +160,7 @@ static void nft_dev_path_info(const struct net_device_path_stack *stack,
 			case DEV_PATH_BR_VLAN_KEEP:
 				break;
 			}
+			info->bridge_vid = path->bridge.vlan_id;
 			info->xmit_type = FLOW_OFFLOAD_XMIT_DIRECT;
 			break;
 		default:
@@ -223,6 +225,7 @@ static void nft_dev_forward_path(struct nf_flow_route *route,
 		memcpy(route->tuple[dir].out.h_dest, info.h_dest, ETH_ALEN);
 		route->tuple[dir].out.ifindex = info.outdev->ifindex;
 		route->tuple[dir].out.hw_ifindex = info.hw_outdev->ifindex;
+		route->tuple[dir].out.bridge_vid = info.bridge_vid;
 		route->tuple[dir].xmit_type = info.xmit_type;
 	}
 }
-- 
2.47.1



* [PATCH v2 nf-next 2/3] netfilter: nf_flow_table_core: teardown direct xmit when destination changed
  2025-04-08 14:28 [PATCH v2 nf-next 0/3] flow offload teardown when layer 2 roaming Eric Woudstra
  2025-04-08 14:28 ` [PATCH v2 nf-next 1/3] netfilter: flow: Add bridge_vid member Eric Woudstra
@ 2025-04-08 14:28 ` Eric Woudstra
  2025-04-11 10:23   ` Simon Horman
  2025-04-08 14:28 ` [PATCH v2 nf-next 3/3] netfilter: nf_flow_table_ip: don't follow fastpath when marked teardown Eric Woudstra
  2025-04-11 15:32 ` [PATCH v2 nf-next 0/3] flow offload teardown when layer 2 roaming Eric Woudstra
  3 siblings, 1 reply; 6+ messages in thread
From: Eric Woudstra @ 2025-04-08 14:28 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Jozsef Kadlecsik, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman
  Cc: netfilter-devel, netdev, Eric Woudstra

When a bridge is part of the forward-fastpath or bridge-fastpath, the fdb
is used to create the tuple. When a station roams at layer 2, for example
with 802.11r, the destination device changes in the fdb. The destination
device of a direct transmitting tuple is then no longer valid and traffic
is sent to the wrong destination. The hardware offloaded fastpath is no
longer valid either.

When roaming occurs, a switchdev notification is sent to delete the old
fdb entry. Upon receiving this notification, mark for teardown all direct
transmitting flows that have the same ifindex, vid and hardware address as
the fdb entry. The hardware offloaded fastpath is still in effect, so
minimize the delay of the work queue by setting the delay to zero.
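
The matching rule described above can be modelled in userspace as a rough
sketch. This is not the kernel code: struct out_tuple, struct fdb_key,
enum xmit_type and both helper functions are hypothetical stand-ins for
flow_offload_tuple, the switchdev fdb info and ether_addr_equal(), used
only to illustrate the predicate. A flow qualifies when either of its two
directions is a direct-xmit tuple with the same ifindex, vid and
destination address as the deleted fdb entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical userspace stand-ins for the kernel types in the patch. */
enum xmit_type { XMIT_NEIGH, XMIT_DIRECT };

struct out_tuple {
	enum xmit_type xmit_type;
	uint32_t ifidx;			/* egress interface index */
	uint16_t bridge_vid;		/* vid used on the bridge */
	uint8_t h_dest[ETH_ALEN];	/* destination hardware address */
};

struct fdb_key {
	const uint8_t *addr;		/* address of the deleted fdb entry */
	int ifindex;
	uint16_t vid;
};

/* One direction matches only if it transmits directly to the fdb target. */
static inline bool tuple_matches_fdb(const struct out_tuple *t,
				     const struct fdb_key *k)
{
	return t->xmit_type == XMIT_DIRECT &&
	       t->ifidx == (uint32_t)k->ifindex &&
	       t->bridge_vid == k->vid &&
	       memcmp(t->h_dest, k->addr, ETH_ALEN) == 0;
}

/* A flow is torn down if either the original or the reply direction matches. */
static inline bool flow_matches_fdb(const struct out_tuple dirs[2],
				    const struct fdb_key *k)
{
	return tuple_matches_fdb(&dirs[0], k) ||
	       tuple_matches_fdb(&dirs[1], k);
}
```

In this model a caller would invoke flow_matches_fdb() for every flow and
mark the matching ones; the patch does the equivalent from
nf_flow_table_iterate() and then kicks the gc work with zero delay.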

Signed-off-by: Eric Woudstra <ericwouds@gmail.com>
---
 net/netfilter/nf_flow_table_core.c | 65 ++++++++++++++++++++++++++++++
 1 file changed, 65 insertions(+)

diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
index f6a30fc14fec..d687f3029fbd 100644
--- a/net/netfilter/nf_flow_table_core.c
+++ b/net/netfilter/nf_flow_table_core.c
@@ -13,6 +13,7 @@
 #include <net/netfilter/nf_conntrack_core.h>
 #include <net/netfilter/nf_conntrack_l4proto.h>
 #include <net/netfilter/nf_conntrack_tuple.h>
+#include <net/switchdev.h>
 
 static DEFINE_MUTEX(flowtable_lock);
 static LIST_HEAD(flowtables);
@@ -743,6 +744,63 @@ void nf_flow_table_cleanup(struct net_device *dev)
 }
 EXPORT_SYMBOL_GPL(nf_flow_table_cleanup);
 
+struct flow_cleanup_data {
+	const unsigned char *addr;
+	int ifindex;
+	u16 vid;
+	bool found;
+};
+
+static void nf_flow_table_do_cleanup_addr(struct nf_flowtable *flow_table,
+					  struct flow_offload *flow, void *data)
+{
+	struct flow_cleanup_data *cud = data;
+
+	if ((flow->tuplehash[0].tuple.xmit_type == FLOW_OFFLOAD_XMIT_DIRECT &&
+	     flow->tuplehash[0].tuple.out.ifidx == cud->ifindex &&
+	     flow->tuplehash[0].tuple.out.bridge_vid == cud->vid &&
+	     ether_addr_equal(flow->tuplehash[0].tuple.out.h_dest, cud->addr)) ||
+	    (flow->tuplehash[1].tuple.xmit_type == FLOW_OFFLOAD_XMIT_DIRECT &&
+	     flow->tuplehash[1].tuple.out.ifidx == cud->ifindex &&
+	     flow->tuplehash[1].tuple.out.bridge_vid == cud->vid &&
+	     ether_addr_equal(flow->tuplehash[1].tuple.out.h_dest, cud->addr))) {
+		flow_offload_teardown(flow);
+		cud->found = true;
+	}
+}
+
+static int nf_flow_table_switchdev_event(struct notifier_block *unused,
+					 unsigned long event, void *ptr)
+{
+	struct switchdev_notifier_fdb_info *fdb_info;
+	struct nf_flowtable *flowtable;
+	struct flow_cleanup_data cud;
+
+	if (event != SWITCHDEV_FDB_DEL_TO_DEVICE)
+		return NOTIFY_DONE;
+
+	fdb_info = ptr;
+	cud.addr = fdb_info->addr;
+	cud.vid = fdb_info->vid;
+	cud.ifindex = fdb_info->info.dev->ifindex;
+
+	mutex_lock(&flowtable_lock);
+	list_for_each_entry(flowtable, &flowtables, list) {
+		cud.found = false;
+		nf_flow_table_iterate(flowtable, nf_flow_table_do_cleanup_addr, &cud);
+		if (cud.found)
+			mod_delayed_work(system_power_efficient_wq,
+					 &flowtable->gc_work, 0);
+	}
+	mutex_unlock(&flowtable_lock);
+
+	return NOTIFY_DONE;
+}
+
+struct notifier_block nf_flow_table_switchdev_nb __read_mostly = {
+	.notifier_call = nf_flow_table_switchdev_event,
+};
+
 void nf_flow_table_free(struct nf_flowtable *flow_table)
 {
 	mutex_lock(&flowtable_lock);
@@ -816,6 +874,10 @@ static int __init nf_flow_table_module_init(void)
 	if (ret)
 		goto out_offload;
 
+	ret = register_switchdev_notifier(&nf_flow_table_switchdev_nb);
+	if (ret < 0)
+		goto out_sw_noti;
+
 	ret = nf_flow_register_bpf();
 	if (ret)
 		goto out_bpf;
@@ -823,6 +885,8 @@ static int __init nf_flow_table_module_init(void)
 	return 0;
 
 out_bpf:
+	unregister_switchdev_notifier(&nf_flow_table_switchdev_nb);
+out_sw_noti:
 	nf_flow_table_offload_exit();
 out_offload:
 	unregister_pernet_subsys(&nf_flow_table_net_ops);
@@ -831,6 +895,7 @@ static int __init nf_flow_table_module_init(void)
 
 static void __exit nf_flow_table_module_exit(void)
 {
+	unregister_switchdev_notifier(&nf_flow_table_switchdev_nb);
 	nf_flow_table_offload_exit();
 	unregister_pernet_subsys(&nf_flow_table_net_ops);
 }
-- 
2.47.1



* [PATCH v2 nf-next 3/3] netfilter: nf_flow_table_ip: don't follow fastpath when marked teardown
  2025-04-08 14:28 [PATCH v2 nf-next 0/3] flow offload teardown when layer 2 roaming Eric Woudstra
  2025-04-08 14:28 ` [PATCH v2 nf-next 1/3] netfilter: flow: Add bridge_vid member Eric Woudstra
  2025-04-08 14:28 ` [PATCH v2 nf-next 2/3] netfilter: nf_flow_table_core: teardown direct xmit when destination changed Eric Woudstra
@ 2025-04-08 14:28 ` Eric Woudstra
  2025-04-11 15:32 ` [PATCH v2 nf-next 0/3] flow offload teardown when layer 2 roaming Eric Woudstra
  3 siblings, 0 replies; 6+ messages in thread
From: Eric Woudstra @ 2025-04-08 14:28 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Jozsef Kadlecsik, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman
  Cc: netfilter-devel, netdev, Eric Woudstra

When a flow is marked for teardown because its destination is no longer
valid, the software fastpath may still be in effect and traffic is still
sent to the wrong destination. Change the IPv4/IPv6 hooks to skip the
software fastpath for a flow that is marked for teardown and let the
packet continue along the normal path.
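
As a simplified illustration of the check described above, the following
userspace sketch models a hook that bails out early when the teardown bit
is set. All names here (struct flow, FLOW_TEARDOWN_BIT, enum verdict,
flow_test_bit(), fastpath_hook()) are hypothetical; the real hooks use
test_bit(NF_FLOW_TEARDOWN, &flow->flags) and return NF_ACCEPT, and they do
considerably more work on the fast path than this model shows.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical verdicts: ACCEPT hands the packet to the normal path,
 * STOLEN means the fast path consumed and forwarded it.
 */
enum verdict { VERDICT_ACCEPT, VERDICT_STOLEN };

/* Hypothetical bit position standing in for NF_FLOW_TEARDOWN. */
#define FLOW_TEARDOWN_BIT 0

struct flow {
	unsigned long flags;
};

/* Minimal non-atomic stand-in for the kernel's test_bit(). */
static inline bool flow_test_bit(unsigned int nr, const unsigned long *flags)
{
	return (*flags >> nr) & 1UL;
}

static inline enum verdict fastpath_hook(const struct flow *flow)
{
	/* A flow marked for teardown must not be forwarded by the fast path. */
	if (flow_test_bit(FLOW_TEARDOWN_BIT, &flow->flags))
		return VERDICT_ACCEPT;

	/* Otherwise the fast path would transmit the packet itself. */
	return VERDICT_STOLEN;
}
```

Placing the check before the xmit_type switch means it covers every
transmit type, not only direct xmit.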

Signed-off-by: Eric Woudstra <ericwouds@gmail.com>
---
 net/netfilter/nf_flow_table_ip.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/netfilter/nf_flow_table_ip.c b/net/netfilter/nf_flow_table_ip.c
index 64a12b9668e7..f9bf2b466ca8 100644
--- a/net/netfilter/nf_flow_table_ip.c
+++ b/net/netfilter/nf_flow_table_ip.c
@@ -542,6 +542,9 @@ nf_flow_offload_ip_hook(void *priv, struct sk_buff *skb,
 	dir = tuplehash->tuple.dir;
 	flow = container_of(tuplehash, struct flow_offload, tuplehash[dir]);
 
+	if (test_bit(NF_FLOW_TEARDOWN, &flow->flags))
+		return NF_ACCEPT;
+
 	switch (tuplehash->tuple.xmit_type) {
 	case FLOW_OFFLOAD_XMIT_NEIGH:
 		rt = dst_rtable(tuplehash->tuple.dst_cache);
@@ -838,6 +841,9 @@ nf_flow_offload_ipv6_hook(void *priv, struct sk_buff *skb,
 	dir = tuplehash->tuple.dir;
 	flow = container_of(tuplehash, struct flow_offload, tuplehash[dir]);
 
+	if (test_bit(NF_FLOW_TEARDOWN, &flow->flags))
+		return NF_ACCEPT;
+
 	switch (tuplehash->tuple.xmit_type) {
 	case FLOW_OFFLOAD_XMIT_NEIGH:
 		rt = dst_rt6_info(tuplehash->tuple.dst_cache);
-- 
2.47.1



* Re: [PATCH v2 nf-next 2/3] netfilter: nf_flow_table_core: teardown direct xmit when destination changed
  2025-04-08 14:28 ` [PATCH v2 nf-next 2/3] netfilter: nf_flow_table_core: teardown direct xmit when destination changed Eric Woudstra
@ 2025-04-11 10:23   ` Simon Horman
  0 siblings, 0 replies; 6+ messages in thread
From: Simon Horman @ 2025-04-11 10:23 UTC (permalink / raw)
  To: Eric Woudstra
  Cc: Pablo Neira Ayuso, Jozsef Kadlecsik, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netfilter-devel,
	netdev

On Tue, Apr 08, 2025 at 04:28:47PM +0200, Eric Woudstra wrote:
> When a bridge is part of the forward-fastpath or bridge-fastpath, the fdb
> is used to create the tuple. When a station roams at layer 2, for example
> with 802.11r, the destination device changes in the fdb. The destination
> device of a direct transmitting tuple is then no longer valid and traffic
> is sent to the wrong destination. The hardware offloaded fastpath is no
> longer valid either.
> 
> When roaming occurs, a switchdev notification is sent to delete the old
> fdb entry. Upon receiving this notification, mark for teardown all direct
> transmitting flows that have the same ifindex, vid and hardware address as
> the fdb entry. The hardware offloaded fastpath is still in effect, so
> minimize the delay of the work queue by setting the delay to zero.
> 
> Signed-off-by: Eric Woudstra <ericwouds@gmail.com>
> ---
>  net/netfilter/nf_flow_table_core.c | 65 ++++++++++++++++++++++++++++++
>  1 file changed, 65 insertions(+)
> 
> diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c

...

> +struct notifier_block nf_flow_table_switchdev_nb __read_mostly = {
> +	.notifier_call = nf_flow_table_switchdev_event,
> +};

Hi Eric,

A minor nit from my side:

nf_flow_table_switchdev_nb seems to be used only in this file, and if so
it should be static.

Flagged by Sparse.

> +
>  void nf_flow_table_free(struct nf_flowtable *flow_table)
>  {
>  	mutex_lock(&flowtable_lock);
> @@ -816,6 +874,10 @@ static int __init nf_flow_table_module_init(void)
>  	if (ret)
>  		goto out_offload;
>  
> +	ret = register_switchdev_notifier(&nf_flow_table_switchdev_nb);
> +	if (ret < 0)
> +		goto out_sw_noti;
> +
>  	ret = nf_flow_register_bpf();
>  	if (ret)
>  		goto out_bpf;
> @@ -823,6 +885,8 @@ static int __init nf_flow_table_module_init(void)
>  	return 0;
>  
>  out_bpf:
> +	unregister_switchdev_notifier(&nf_flow_table_switchdev_nb);
> +out_sw_noti:
>  	nf_flow_table_offload_exit();
>  out_offload:
>  	unregister_pernet_subsys(&nf_flow_table_net_ops);
> @@ -831,6 +895,7 @@ static int __init nf_flow_table_module_init(void)
>  
>  static void __exit nf_flow_table_module_exit(void)
>  {
> +	unregister_switchdev_notifier(&nf_flow_table_switchdev_nb);
>  	nf_flow_table_offload_exit();
>  	unregister_pernet_subsys(&nf_flow_table_net_ops);
>  }
> -- 
> 2.47.1
> 


* Re: [PATCH v2 nf-next 0/3] flow offload teardown when layer 2 roaming
  2025-04-08 14:28 [PATCH v2 nf-next 0/3] flow offload teardown when layer 2 roaming Eric Woudstra
                   ` (2 preceding siblings ...)
  2025-04-08 14:28 ` [PATCH v2 nf-next 3/3] netfilter: nf_flow_table_ip: don't follow fastpath when marked teardown Eric Woudstra
@ 2025-04-11 15:32 ` Eric Woudstra
  3 siblings, 0 replies; 6+ messages in thread
From: Eric Woudstra @ 2025-04-11 15:32 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: netfilter-devel, netdev, Jozsef Kadlecsik, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman



On 4/8/25 4:28 PM, Eric Woudstra wrote:
> When a bridge is part of the forward-fastpath or bridge-fastpath, the fdb
> is used to create the tuple. When a station roams at layer 2, for example
> with 802.11r, the destination device changes in the fdb. The destination
> device of a direct transmitting tuple is then no longer valid and traffic
> is sent to the wrong destination. The hardware offloaded fastpath is no
> longer valid either.
> 
> This flow entry needs to be torn down as soon as possible. Also make sure
> that the flow entry is not used once it is marked for teardown.
> 
> Changes in v2:
> - Unchanged, only tags RFC net-next to PATCH nf-next.
> 
> Eric Woudstra (3):
>   netfilter: flow: Add bridge_vid member
>   netfilter: nf_flow_table_core: teardown direct xmit when destination
>     changed
>   netfilter: nf_flow_table_ip: don't follow fastpath when marked
>     teardown
> 
>  include/net/netfilter/nf_flow_table.h |  2 +
>  net/netfilter/nf_flow_table_core.c    | 66 +++++++++++++++++++++++++++
>  net/netfilter/nf_flow_table_ip.c      |  6 +++
>  net/netfilter/nft_flow_offload.c      |  3 ++
>  4 files changed, 77 insertions(+)
> 

Hi Pablo,

I understand if you are busy, but this patch set can be reviewed entirely
separately from my other submissions. It addresses the issue of L2 roaming
for any fastpath.

I'll wait for any other comments before sending the fix for 'static'.


