public inbox for netdev@vger.kernel.org
* [PATCH net-next,RFC 0/8] netfilter: flowtable bulking
From: Pablo Neira Ayuso @ 2026-03-17 11:29 UTC (permalink / raw)
  To: netfilter-devel
  Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms,
	steffen.klassert, antony.antony

Hi,
 
Back in 2018 [1], a new fast forwarding path combining the flowtable and
GRO/GSO was proposed. However, "GRO is specialized to optimize the
non-forwarding case", so it was considered "counter-intuitive to base a
fast forwarding path on top of it".
 
Then, Steffen Klassert proposed the idea of adding a new engine for the
flowtable that operates on the skb list that is provided after the NAPI
cycle. The idea is to process this skb list to create bulks grouped by
the ethertype, output device, next hop and tos/dscp. Then, add a
specialized xmit path that can deal with these skb bulks. Note that GRO
needs to be disabled so this new forwarding engine obtains the list of
skbs that resulted from the NAPI cycle.
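
The grouping step described above can be sketched in user-space C. This
is only an illustration of the idea, not code from the series: struct
bulk_key, struct pkt, struct bulk, MAX_BULKS and build_bulks() are
hypothetical names, the next hop is simplified to an IPv4 address, and a
linear search stands in for whatever lookup the real implementation uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration, not kernel structures. */
struct bulk_key {
	uint16_t ethertype;	/* e.g. 0x0800 for IPv4 */
	uint8_t  tos;		/* tos/dscp byte */
	int      out_ifindex;	/* output device */
	uint32_t nexthop;	/* next hop, simplified to IPv4 */
};

struct pkt {
	struct bulk_key key;
	struct pkt *next;	/* link to the next packet in the same bulk */
};

struct bulk {
	struct bulk_key key;
	struct pkt *head, *tail;
	size_t len;
};

#define MAX_BULKS 8

static int bulk_key_equal(const struct bulk_key *a, const struct bulk_key *b)
{
	return a->ethertype == b->ethertype && a->tos == b->tos &&
	       a->out_ifindex == b->out_ifindex && a->nexthop == b->nexthop;
}

/*
 * Split pkts[0..n) into bulks of packets sharing the same key, keeping
 * arrival order inside each bulk. Returns the number of bulks used, or
 * -1 if more than MAX_BULKS distinct keys show up.
 */
static int build_bulks(struct pkt *pkts, size_t n, struct bulk bulks[MAX_BULKS])
{
	int nbulks = 0;

	assert(pkts != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct pkt *p = &pkts[i];
		int b;

		p->next = NULL;

		/* Look for an existing bulk with a matching key. */
		for (b = 0; b < nbulks; b++)
			if (bulk_key_equal(&bulks[b].key, &p->key))
				break;

		/* No match: open a new bulk for this key. */
		if (b == nbulks) {
			if (nbulks == MAX_BULKS)
				return -1;
			bulks[b].key = p->key;
			bulks[b].head = bulks[b].tail = NULL;
			bulks[b].len = 0;
			nbulks++;
		}

		/* Append the packet at the tail of its bulk. */
		if (bulks[b].tail)
			bulks[b].tail->next = p;
		else
			bulks[b].head = p;
		bulks[b].tail = p;
		bulks[b].len++;
	}

	return nbulks;
}
```

Each resulting bulk can then be handed to a transmit routine in one
call, which is where the batching gain would come from in this sketch.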
 
Before grouping skbs into bulks, a flowtable lookup checks whether the
flow is already in the flowtable; if not, the packet follows the slow
path. If the lookup returns an entry, the packet follows the fast path:
the ttl is decremented, and the corresponding NAT mangling and layer 2/3
tunnel encapsulation (layer 2: vlan/pppoe, layer 3: ipip) are performed.
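
As an illustration of the ttl step only, here is a user-space sketch of
decrementing the IPv4 TTL while updating the header checksum
incrementally (RFC 1624) instead of recomputing it. It works on a raw
header in network byte order. The function names are hypothetical, and
sending packets with TTL <= 1 back to the slow path is an assumption of
this sketch, not something stated in this cover letter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Full one's-complement checksum over a header of even length. */
static uint16_t ipv4_csum(const uint8_t *hdr, size_t len)
{
	uint32_t sum = 0;

	for (size_t i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}

/* RFC 1624, eqn. 3: HC' = ~(~HC + ~m + m') for one changed 16-bit word. */
static uint16_t csum_update16(uint16_t check, uint16_t old_word,
			      uint16_t new_word)
{
	uint32_t sum = (uint16_t)~check;

	sum += (uint16_t)~old_word;
	sum += new_word;
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}

/*
 * Decrement the TTL (byte 8) and patch the checksum (bytes 10-11).
 * Returns 0 on success, or -1 if the TTL would expire; in that case the
 * header is left untouched (assumption: the slow path handles it).
 */
static int fastpath_decrement_ttl(uint8_t *hdr)
{
	uint16_t old_word, new_word, check;

	assert(hdr != NULL);

	if (hdr[8] <= 1)
		return -1;

	/* TTL and protocol share the 16-bit word at offset 8. */
	old_word = (uint16_t)((hdr[8] << 8) | hdr[9]);
	hdr[8]--;
	new_word = (uint16_t)((hdr[8] << 8) | hdr[9]);

	check = (uint16_t)((hdr[10] << 8) | hdr[11]);
	check = csum_update16(check, old_word, new_word);
	hdr[10] = (uint8_t)(check >> 8);
	hdr[11] = (uint8_t)(check & 0xff);

	return 0;
}
```

A header whose checksum is valid still sums to zero under ipv4_csum()
after the update, which is a convenient way to check the sketch.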
 
The fast forwarding path is enabled through explicit user policy, so the
user needs to request this behaviour from the control plane. The
following example shows how to place flows in the new fast forwarding
path from the forward chain:

 table x {
        flowtable f {
                hook early_ingress priority 0; devices = { eth0, eth1 }
        }
 
        chain y {
                type filter hook forward priority 0;
                ip protocol tcp flow offload @f counter
        }
 }
 
 
The example above sets up a fastpath for TCP flows that are placed in
the flowtable 'f'. This flowtable is hooked at the new early_ingress
hook. The initial TCP packets that match this rule from the standard
forwarding path create an entry in the flowtable.
 
Note that tcpdump only shows the packets in the tx path, since this
new early_ingress hook happens before the ingress tap.

The patch series contains 8 patches:

- #1 and #2 add the basic RX flowtable bulking infrastructure for
  IPv4 and IPv6.
- #3 adds the early_ingress netfilter hook.
- #4 adds a helper function to prepare for the netfilter chain at
  the early_ingress hook.
- #5 adds the early_ingress filter chain.
- #6 and #7 add helper functions to reuse TX path codebase.
- #8 adds the custom TX path for listified skbs and updates
  the flowtable bulking to use it.

= Benchmark numbers =

Using a testbed of 4 hosts with this topology:
 
 | sunset |-----| west |====| east |----| sunrise |
 
And this hardware:
 
* Supermicro H13SSW Motherboard
* AMD EPYC 9135 16-Core Processor (a.k.a. Turin, Zen 5)
* NIC: Mellanox MT28800 ConnectX-5 Ex (100Gbps NIC)
* NIC: Broadcom BCM57508 NetXtreme-E (only on sunrise, 100Gbps NIC)
 
With 128 byte packets:
 
* From ~2 Mpps (baseline) to ~4 Mpps with 1 flow.
* From ~10.6 Mpps (baseline) to ~15.7 Mpps with 10 flows.
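
To put these figures in perspective, the small helper below derives the
relative speedup and an approximate bit rate from the packet rates
above. The function names are hypothetical, and the bit rate counts only
the 128 bytes per packet; whether that size includes Ethernet framing
overhead is not stated here, so treat the Gbps values as rough.

```c
#include <assert.h>

/* Ratio of the bulking packet rate to the baseline packet rate. */
static double speedup(double baseline_mpps, double bulk_mpps)
{
	assert(baseline_mpps > 0.0);
	return bulk_mpps / baseline_mpps;
}

/* Convert Mpps at a given packet size in bytes to Gbit/s. */
static double mpps_to_gbps(double mpps, unsigned int pkt_bytes)
{
	return mpps * 1e6 * pkt_bytes * 8.0 / 1e9;
}
```

With the numbers above, speedup(2.0, 4.0) is 2.0 and
speedup(10.6, 15.7) is about 1.48. At 128 bytes per packet, 4 Mpps is
about 4.1 Gbit/s and 15.7 Mpps is about 16.1 Gbit/s.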
 
Antony Antony collected performance numbers and wrote a report describing
the benchmarking [2]. This report includes numbers from the IPsec
support, which is not included in this series.

Comments welcome, thanks.

Pablo Neira Ayuso (8):
  netfilter: flowtable: Add basic bulking infrastructure for early ingress hook
  netfilter: flowtable: Add IPv6 bulking infrastructure for early ingress hook
  netfilter: nf_tables: add flowtable early_ingress support
  netfilter: nf_tables: add nft_set_pktinfo_ingress()
  netfilter: nf_tables: add early ingress chain
  net: add dev_dst_drop() helper function
  net: add dev_noqueue_xmit_list() helper function
  net: add dev_queue_xmit_list() and use it

 include/linux/netdevice.h             |   2 +
 include/net/netfilter/nf_flow_table.h |  13 +-
 net/core/dev.c                        | 297 ++++++++++++++++----
 net/netfilter/nf_flow_table_inet.c    |  81 ++++++
 net/netfilter/nf_flow_table_ip.c      | 384 ++++++++++++++++++++++++++
 net/netfilter/nf_tables_api.c         |  12 +-
 net/netfilter/nft_chain_filter.c      | 164 +++++++++--
 7 files changed, 872 insertions(+), 81 deletions(-)

-- 
2.47.3

