* [PATCH net-next RFC] netfilter: flowtable: add CT metadata action for nft flowtables
@ 2025-09-12 16:30 Elad Yifee
2025-09-13 20:52 ` Florian Westphal
2025-09-16 16:49 ` Elad Yifee
0 siblings, 2 replies; 10+ messages in thread
From: Elad Yifee @ 2025-09-12 16:30 UTC (permalink / raw)
Cc: eladwf, Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
Phil Sutter, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, netfilter-devel, coreteam, netdev,
linux-kernel
When offloading a flow via the default nft flowtable path,
append a FLOW_ACTION_CT_METADATA action if the flow is associated with a conntrack entry.
We do this in both IPv4 and IPv6 route action builders, after NAT mangles and before redirect.
This mirrors net/sched/act_ct.c’s tcf_ct_flow_table_add_action_meta() so drivers that already
parse FLOW_ACTION_CT_METADATA from TC offloads can reuse the same logic for nft flowtables.
Signed-off-by: Elad Yifee <eladwf@gmail.com>
---
net/netfilter/nf_flow_table_offload.c | 38 +++++++++++++++++++++++++++
1 file changed, 38 insertions(+)
diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c
index e06bc36f49fe..bccae4052319 100644
--- a/net/netfilter/nf_flow_table_offload.c
+++ b/net/netfilter/nf_flow_table_offload.c
@@ -12,6 +12,7 @@
#include <net/netfilter/nf_conntrack_acct.h>
#include <net/netfilter/nf_conntrack_core.h>
#include <net/netfilter/nf_conntrack_tuple.h>
+#include <net/netfilter/nf_conntrack_labels.h>
static struct workqueue_struct *nf_flow_offload_add_wq;
static struct workqueue_struct *nf_flow_offload_del_wq;
@@ -679,6 +680,41 @@ nf_flow_rule_route_common(struct net *net, const struct flow_offload *flow,
return 0;
}
+static void flow_offload_add_ct_metadata(const struct flow_offload *flow,
+ enum flow_offload_tuple_dir dir,
+ struct nf_flow_rule *flow_rule)
+{
+ struct nf_conn *ct = flow->ct;
+ struct flow_action_entry *entry;
+#if IS_ENABLED(CONFIG_NF_CONNTRACK_LABELS)
+ u32 *dst_labels;
+ struct nf_conn_labels *labels;
+#endif
+
+ if (!ct)
+ return;
+
+ entry = flow_action_entry_next(flow_rule);
+ entry->id = FLOW_ACTION_CT_METADATA;
+
+#if IS_ENABLED(CONFIG_NF_CONNTRACK_MARK)
+ entry->ct_metadata.mark = READ_ONCE(ct->mark);
+#endif
+
+ entry->ct_metadata.orig_dir = (dir == FLOW_OFFLOAD_DIR_ORIGINAL);
+
+#if IS_ENABLED(CONFIG_NF_CONNTRACK_LABELS)
+ dst_labels = entry->ct_metadata.labels;
+ labels = nf_ct_labels_find(ct);
+ if (labels)
+ memcpy(dst_labels, labels->bits, NF_CT_LABELS_MAX_SIZE);
+ else
+ memset(dst_labels, 0, NF_CT_LABELS_MAX_SIZE);
+#else
+ memset(entry->ct_metadata.labels, 0, NF_CT_LABELS_MAX_SIZE);
+#endif
+}
+
int nf_flow_rule_route_ipv4(struct net *net, struct flow_offload *flow,
enum flow_offload_tuple_dir dir,
struct nf_flow_rule *flow_rule)
@@ -698,6 +734,7 @@ int nf_flow_rule_route_ipv4(struct net *net, struct flow_offload *flow,
test_bit(NF_FLOW_DNAT, &flow->flags))
flow_offload_ipv4_checksum(net, flow, flow_rule);
+ flow_offload_add_ct_metadata(flow, dir, flow_rule);
flow_offload_redirect(net, flow, dir, flow_rule);
return 0;
@@ -720,6 +757,7 @@ int nf_flow_rule_route_ipv6(struct net *net, struct flow_offload *flow,
flow_offload_port_dnat(net, flow, dir, flow_rule);
}
+ flow_offload_add_ct_metadata(flow, dir, flow_rule);
flow_offload_redirect(net, flow, dir, flow_rule);
return 0;
--
2.48.1
^ permalink raw reply related [flat|nested] 10+ messages in thread* Re: [PATCH net-next RFC] netfilter: flowtable: add CT metadata action for nft flowtables
2025-09-12 16:30 [PATCH net-next RFC] netfilter: flowtable: add CT metadata action for nft flowtables Elad Yifee
@ 2025-09-13 20:52 ` Florian Westphal
2025-09-14 6:40 ` Elad Yifee
2025-09-16 16:49 ` Elad Yifee
1 sibling, 1 reply; 10+ messages in thread
From: Florian Westphal @ 2025-09-13 20:52 UTC (permalink / raw)
To: Elad Yifee
Cc: Pablo Neira Ayuso, Jozsef Kadlecsik, Phil Sutter, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
netfilter-devel, coreteam, netdev, linux-kernel
Elad Yifee <eladwf@gmail.com> wrote:
> When offloading a flow via the default nft flowtable path,
> append a FLOW_ACTION_CT_METADATA action if the flow is associated with a conntrack entry.
> We do this in both IPv4 and IPv6 route action builders, after NAT mangles and before redirect.
> This mirrors net/sched/act_ct.c’s tcf_ct_flow_table_add_action_meta() so drivers that already
> parse FLOW_ACTION_CT_METADATA from TC offloads can reuse the same logic for nft flowtables.
>
> Signed-off-by: Elad Yifee <eladwf@gmail.com>
> ---
> net/netfilter/nf_flow_table_offload.c | 38 +++++++++++++++++++++++++++
> 1 file changed, 38 insertions(+)
>
> diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c
> index e06bc36f49fe..bccae4052319 100644
> --- a/net/netfilter/nf_flow_table_offload.c
> +++ b/net/netfilter/nf_flow_table_offload.c
> @@ -12,6 +12,7 @@
> #include <net/netfilter/nf_conntrack_acct.h>
> #include <net/netfilter/nf_conntrack_core.h>
> #include <net/netfilter/nf_conntrack_tuple.h>
> +#include <net/netfilter/nf_conntrack_labels.h>
>
> static struct workqueue_struct *nf_flow_offload_add_wq;
> static struct workqueue_struct *nf_flow_offload_del_wq;
> @@ -679,6 +680,41 @@ nf_flow_rule_route_common(struct net *net, const struct flow_offload *flow,
> return 0;
> }
>
> +static void flow_offload_add_ct_metadata(const struct flow_offload *flow,
> + enum flow_offload_tuple_dir dir,
> + struct nf_flow_rule *flow_rule)
> +{
> + struct nf_conn *ct = flow->ct;
> + struct flow_action_entry *entry;
> +#if IS_ENABLED(CONFIG_NF_CONNTRACK_LABELS)
> + u32 *dst_labels;
> + struct nf_conn_labels *labels;
> +#endif
> +
> + if (!ct)
> + return;
Under what circumstances can flow->ct be NULL?
> + entry = flow_action_entry_next(flow_rule);
> + entry->id = FLOW_ACTION_CT_METADATA;
> +
> +#if IS_ENABLED(CONFIG_NF_CONNTRACK_MARK)
> + entry->ct_metadata.mark = READ_ONCE(ct->mark);
> +#endif
> +
> + entry->ct_metadata.orig_dir = (dir == FLOW_OFFLOAD_DIR_ORIGINAL);
> +
> +#if IS_ENABLED(CONFIG_NF_CONNTRACK_LABELS)
> + dst_labels = entry->ct_metadata.labels;
> + labels = nf_ct_labels_find(ct);
> + if (labels)
> + memcpy(dst_labels, labels->bits, NF_CT_LABELS_MAX_SIZE);
> + else
> + memset(dst_labels, 0, NF_CT_LABELS_MAX_SIZE);
> +#else
> + memset(entry->ct_metadata.labels, 0, NF_CT_LABELS_MAX_SIZE);
> +#endif
> +}
This looks almost identical tcf_ct_flow_table_add_action_meta().
Any chance to make it a common helper function? act_ct already depends
on nf_flow_table anyway.
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH net-next RFC] netfilter: flowtable: add CT metadata action for nft flowtables
2025-09-13 20:52 ` Florian Westphal
@ 2025-09-14 6:40 ` Elad Yifee
0 siblings, 0 replies; 10+ messages in thread
From: Elad Yifee @ 2025-09-14 6:40 UTC (permalink / raw)
To: Florian Westphal
Cc: Pablo Neira Ayuso, Jozsef Kadlecsik, Phil Sutter, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
netfilter-devel, coreteam, netdev, linux-kernel
On Sat, Sep 13, 2025 at 11:52 PM Florian Westphal <fw@strlen.de> wrote:
>
> Under what circumstances can flow->ct be NULL?
I thought it could be NULL in a few cases. I’ll verify this on the
inet/IPv4/IPv6 path and report back in the next spin.
In any case, the null-guard is harmless, so I kept it.
> This looks almost identical tcf_ct_flow_table_add_action_meta().
>
> Any chance to make it a common helper function? act_ct already depends
> on nf_flow_table anyway.
agreed. If there are no objections to the main idea (exporting CT
metadata on the nft flowtable path),
I’ll prepare a new series that factors the fill logic into a shared
helper and converts both act_ct and the nft exporter to use it
Thanks for your feedback
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net-next RFC] netfilter: flowtable: add CT metadata action for nft flowtables
2025-09-12 16:30 [PATCH net-next RFC] netfilter: flowtable: add CT metadata action for nft flowtables Elad Yifee
2025-09-13 20:52 ` Florian Westphal
@ 2025-09-16 16:49 ` Elad Yifee
2025-09-16 22:39 ` Pablo Neira Ayuso
1 sibling, 1 reply; 10+ messages in thread
From: Elad Yifee @ 2025-09-16 16:49 UTC (permalink / raw)
Cc: Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
Phil Sutter, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, netfilter-devel, coreteam, netdev,
linux-kernel
Hi all,
One caveat: this change will cause some existing drivers to start
returning -EOPNOTSUPP, since they walk the action list and treat any
unknown action as fatal in their default switch. In other words, adding
CT metadata unconditionally would break offload in those drivers until
they are updated.
Follow-up patches will therefore be needed to make drivers either parse
or safely ignore FLOW_ACTION_CT_METADATA. Because this action is only
advisory, it should be harmless for drivers that don’t use it to simply
accept and no-op it.
Just flagging this up front: the core patch by itself will break some
drivers, and additional work is required to make them tolerant of the
new metadata.
Thanks,
Elad
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net-next RFC] netfilter: flowtable: add CT metadata action for nft flowtables
2025-09-16 16:49 ` Elad Yifee
@ 2025-09-16 22:39 ` Pablo Neira Ayuso
2025-09-17 3:10 ` Elad Yifee
0 siblings, 1 reply; 10+ messages in thread
From: Pablo Neira Ayuso @ 2025-09-16 22:39 UTC (permalink / raw)
To: Elad Yifee
Cc: Jozsef Kadlecsik, Florian Westphal, Phil Sutter, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
netfilter-devel, coreteam, netdev, linux-kernel
On Tue, Sep 16, 2025 at 07:49:34PM +0300, Elad Yifee wrote:
> Hi all,
>
> One caveat: this change will cause some existing drivers to start
> returning -EOPNOTSUPP, since they walk the action list and treat any
> unknown action as fatal in their default switch. In other words, adding
> CT metadata unconditionally would break offload in those drivers until
> they are updated.
>
> Follow-up patches will therefore be needed to make drivers either parse
> or safely ignore FLOW_ACTION_CT_METADATA. Because this action is only
> advisory, it should be harmless for drivers that don’t use it to simply
> accept and no-op it.
>
> Just flagging this up front: the core patch by itself will break some
> drivers, and additional work is required to make them tolerant of the
> new metadata.
May I ask, where is the software plane extension for this feature?
Will you add it for net/sched/act_ct.c?
Adding the hardware offload feature only is a deal breaker IMO.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net-next RFC] netfilter: flowtable: add CT metadata action for nft flowtables
2025-09-16 22:39 ` Pablo Neira Ayuso
@ 2025-09-17 3:10 ` Elad Yifee
2025-09-17 8:18 ` Pablo Neira Ayuso
0 siblings, 1 reply; 10+ messages in thread
From: Elad Yifee @ 2025-09-17 3:10 UTC (permalink / raw)
To: Pablo Neira Ayuso
Cc: Jozsef Kadlecsik, Florian Westphal, Phil Sutter, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
netfilter-devel, coreteam, netdev, linux-kernel
On Wed, Sep 17, 2025 at 1:39 AM Pablo Neira Ayuso <pablo@netfilter.org> wrote:
>
> May I ask, where is the software plane extension for this feature?
> Will you add it for net/sched/act_ct.c?
>
> Adding the hardware offload feature only is a deal breaker IMO.
Software plane: This doesn’t add a new user feature, it just surfaces
existing CT state to offload so the software plane already exists
today via nft/TC. In software you can already set/match ct mark/labels
(e.g., tag flows), and once offloaded the exporter snapshots that so a
driver can map the tag to a HW queue/class if it wants per-flow QoS in
hardware. Drivers that don’t need it can simply accept and ignore the
metadata.
act_ct.c: Yes - I’ll include a small common helper so TC and nft
flowtable offloads produce identical CT metadata.
If there’s no objection to the direction, I’ll respin with:
- the common helper
- act_ct switched to it
- nft flowtable exporter appending CT metadata
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net-next RFC] netfilter: flowtable: add CT metadata action for nft flowtables
2025-09-17 3:10 ` Elad Yifee
@ 2025-09-17 8:18 ` Pablo Neira Ayuso
2025-09-17 17:33 ` Elad Yifee
0 siblings, 1 reply; 10+ messages in thread
From: Pablo Neira Ayuso @ 2025-09-17 8:18 UTC (permalink / raw)
To: Elad Yifee
Cc: Jozsef Kadlecsik, Florian Westphal, Phil Sutter, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
netfilter-devel, coreteam, netdev, linux-kernel
On Wed, Sep 17, 2025 at 06:10:10AM +0300, Elad Yifee wrote:
> On Wed, Sep 17, 2025 at 1:39 AM Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> >
> > May I ask, where is the software plane extension for this feature?
> > Will you add it for net/sched/act_ct.c?
> >
> >
> > Adding the hardware offload feature only is a deal breaker IMO.
> Software plane: This doesn’t add a new user feature, it just surfaces
> existing CT state to offload so the software plane already exists
> today via nft/TC. In software you can already set/match ct mark/labels
> (e.g., tag flows), and once offloaded the exporter snapshots that so a
> driver can map the tag to a HW queue/class if it wants per-flow QoS in
> hardware. Drivers that don’t need it can simply accept and ignore the
> metadata.
Hm, flowtable software datapath cannot do ct marks/label at this time.
> act_ct.c: Yes - I’ll include a small common helper so TC and nft
> flowtable offloads produce identical CT metadata.
>
> If there’s no objection to the direction, I’ll respin with:
> - the common helper
> - act_ct switched to it
> - nft flowtable exporter appending CT metadata
Just to make sure we are on the same page: Software plane has to match
the capabilities of the hardware offload plan, new features must work
first in the software plane, then extend the hardware offload plane to
support it.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net-next RFC] netfilter: flowtable: add CT metadata action for nft flowtables
2025-09-17 8:18 ` Pablo Neira Ayuso
@ 2025-09-17 17:33 ` Elad Yifee
2025-09-24 22:51 ` Pablo Neira Ayuso
0 siblings, 1 reply; 10+ messages in thread
From: Elad Yifee @ 2025-09-17 17:33 UTC (permalink / raw)
To: Pablo Neira Ayuso
Cc: Jozsef Kadlecsik, Florian Westphal, Phil Sutter, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
netfilter-devel, coreteam, netdev, linux-kernel
On Wed, Sep 17, 2025 at 11:18 AM Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> Just to make sure we are on the same page: Software plane has to match
> the capabilities of the hardware offload plan, new features must work
> first in the software plane, then extend the hardware offload plane to
> support it.
Thanks - I see what you meant now.
This isn’t a new feature that needs to be implemented in software
first. We’re not introducing new user semantics, matches, or actions
in nft/TC. no datapath changes (including the flowtable software
offload fast path). The change only surfaces existing CT state
(mark/labels/dir) as FLOW_ACTION_CT_METADATA at the hardware offload
boundary so drivers can use it for per-flow QoS, or simply ignore it.
When a flow stays in software, behavior remains exactly as today,
software QoS continues to use existing tools (nft/TC setting
skb->priority/mark, qdiscs, etc.). There’s no SW-HW mismatch
introduced here.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net-next RFC] netfilter: flowtable: add CT metadata action for nft flowtables
2025-09-17 17:33 ` Elad Yifee
@ 2025-09-24 22:51 ` Pablo Neira Ayuso
2025-09-27 13:55 ` Elad Yifee
0 siblings, 1 reply; 10+ messages in thread
From: Pablo Neira Ayuso @ 2025-09-24 22:51 UTC (permalink / raw)
To: Elad Yifee
Cc: Jozsef Kadlecsik, Florian Westphal, Phil Sutter, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
netfilter-devel, coreteam, netdev, linux-kernel
On Wed, Sep 17, 2025 at 08:33:49PM +0300, Elad Yifee wrote:
> On Wed, Sep 17, 2025 at 11:18 AM Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > Just to make sure we are on the same page: Software plane has to match
> > the capabilities of the hardware offload plan, new features must work
> > first in the software plane, then extend the hardware offload plane to
> > support it.
>
> Thanks - I see what you meant now.
>
> This isn’t a new feature that needs to be implemented in software
> first. We’re not introducing new user semantics, matches, or actions
> in nft/TC. no datapath changes (including the flowtable software
> offload fast path). The change only surfaces existing CT state
> (mark/labels/dir) as FLOW_ACTION_CT_METADATA at the hardware offload
> boundary so drivers can use it for per-flow QoS, or simply ignore it.
>
> When a flow stays in software, behavior remains exactly as today,
> software QoS continues to use existing tools (nft/TC setting
> skb->priority/mark, qdiscs, etc.). There’s no SW-HW mismatch
> introduced here.
You have to show me there is no mismatch.
This is exposing the current ct mark/label to your hardware, the
flowtable infrastructure (the software representation) makes no use of
this information from the flowtable datapath, can you explain how you
plan to use this?
Thanks.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net-next RFC] netfilter: flowtable: add CT metadata action for nft flowtables
2025-09-24 22:51 ` Pablo Neira Ayuso
@ 2025-09-27 13:55 ` Elad Yifee
0 siblings, 0 replies; 10+ messages in thread
From: Elad Yifee @ 2025-09-27 13:55 UTC (permalink / raw)
To: Pablo Neira Ayuso
Cc: Jozsef Kadlecsik, Florian Westphal, Phil Sutter, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
netfilter-devel, coreteam, netdev, linux-kernel
On Thu, Sep 25, 2025 at 1:51 AM Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> You have to show me there is no mismatch.
>
> This is exposing the current ct mark/label to your hardware, the
> flowtable infrastructure (the software representation) makes no use of
> this information from the flowtable datapath, can you explain how you
> plan to use this?
>
> Thanks.
Thanks for getting back to this.
My goal is per-flow HW QoS on offloaded connections. Once a flow is
promoted to the nft flowtable fast path, nft rules that set packet
marks are bypassed, so a driver no longer has a stable tag to map to
HW queues. The conntrack mark/labels are flow-scoped and persist
across offload, which is why I’d like to expose them to the driver as
metadata at the hardware offload boundary.
To address your “no mismatch” concern: this wouldn’t change the
software datapath at all, it would only surface existing CT state to
hardware. Could you advise on the best way to proceed here? Would an
offload-only exposure (drivers may use it or ignore it) be acceptable,
or would you prefer a specific software-side representation before we
add the hardware export?
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-09-27 13:56 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-12 16:30 [PATCH net-next RFC] netfilter: flowtable: add CT metadata action for nft flowtables Elad Yifee
2025-09-13 20:52 ` Florian Westphal
2025-09-14 6:40 ` Elad Yifee
2025-09-16 16:49 ` Elad Yifee
2025-09-16 22:39 ` Pablo Neira Ayuso
2025-09-17 3:10 ` Elad Yifee
2025-09-17 8:18 ` Pablo Neira Ayuso
2025-09-17 17:33 ` Elad Yifee
2025-09-24 22:51 ` Pablo Neira Ayuso
2025-09-27 13:55 ` Elad Yifee
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).