netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH nf-next] netfilter: fib: avoid lookup if socket is available
@ 2025-02-20 13:07 Florian Westphal
  2025-03-12 19:15 ` Pablo Neira Ayuso
  0 siblings, 1 reply; 4+ messages in thread
From: Florian Westphal @ 2025-02-20 13:07 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal

In case the fib match is used from the input hook we can avoid the fib
lookup if early demux assigned a socket for us: check that the input
interface matches sk-cached one.

Rework the existing 'lo bypass' logic to first check sk, then
for loopback interface type to elide the fib lookup.

This speeds up fib matching a little, before:
93.08 GBit/s (no rules at all)
75.1  GBit/s ("fib saddr . iif oif missing drop" in prerouting)
75.62 GBit/s ("fib saddr . iif oif missing drop" in input)

After:
92.48 GBit/s (no rules at all)
75.62 GBit/s (fib rule in prerouting)
90.37 GBit/s (fib rule in input).

Numbers for the 'no rules' and 'prerouting' are expected to
closely match in-between runs, the 3rd/input test case exercises the
the 'avoid lookup if cached ifindex in sk matches' case.

Test used iperf3 via veth interface, lo can't be used due to existing
loopback test.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/netfilter/nft_fib.h   | 21 +++++++++++++++++++++
 net/ipv4/netfilter/nft_fib_ipv4.c | 11 +++++------
 net/ipv6/netfilter/nft_fib_ipv6.c | 19 ++++++++++---------
 3 files changed, 36 insertions(+), 15 deletions(-)

diff --git a/include/net/netfilter/nft_fib.h b/include/net/netfilter/nft_fib.h
index 38cae7113de4..6e202ed5e63f 100644
--- a/include/net/netfilter/nft_fib.h
+++ b/include/net/netfilter/nft_fib.h
@@ -18,6 +18,27 @@ nft_fib_is_loopback(const struct sk_buff *skb, const struct net_device *in)
 	return skb->pkt_type == PACKET_LOOPBACK || in->flags & IFF_LOOPBACK;
 }
 
+static inline bool nft_fib_can_skip(const struct nft_pktinfo *pkt)
+{
+	const struct net_device *indev = nft_in(pkt);
+	const struct sock *sk;
+
+	switch (nft_hook(pkt)) {
+	case NF_INET_PRE_ROUTING:
+	case NF_INET_INGRESS:
+	case NF_INET_LOCAL_IN:
+		break;
+	default:
+		return false;
+	}
+
+	sk = pkt->skb->sk;
+	if (sk && sk_fullsock(sk))
+	       return sk->sk_rx_dst_ifindex == indev->ifindex;
+
+	return nft_fib_is_loopback(pkt->skb, indev);
+}
+
 int nft_fib_dump(struct sk_buff *skb, const struct nft_expr *expr, bool reset);
 int nft_fib_init(const struct nft_ctx *ctx, const struct nft_expr *expr,
 		 const struct nlattr * const tb[]);
diff --git a/net/ipv4/netfilter/nft_fib_ipv4.c b/net/ipv4/netfilter/nft_fib_ipv4.c
index 625adbc42037..9082ca17e845 100644
--- a/net/ipv4/netfilter/nft_fib_ipv4.c
+++ b/net/ipv4/netfilter/nft_fib_ipv4.c
@@ -71,6 +71,11 @@ void nft_fib4_eval(const struct nft_expr *expr, struct nft_regs *regs,
 	const struct net_device *oif;
 	const struct net_device *found;
 
+	if (nft_fib_can_skip(pkt)) {
+		nft_fib_store_result(dest, priv, nft_in(pkt));
+		return;
+	}
+
 	/*
 	 * Do not set flowi4_oif, it restricts results (for example, asking
 	 * for oif 3 will get RTN_UNICAST result even if the daddr exits
@@ -85,12 +90,6 @@ void nft_fib4_eval(const struct nft_expr *expr, struct nft_regs *regs,
 	else
 		oif = NULL;
 
-	if (nft_hook(pkt) == NF_INET_PRE_ROUTING &&
-	    nft_fib_is_loopback(pkt->skb, nft_in(pkt))) {
-		nft_fib_store_result(dest, priv, nft_in(pkt));
-		return;
-	}
-
 	iph = skb_header_pointer(pkt->skb, noff, sizeof(_iph), &_iph);
 	if (!iph) {
 		regs->verdict.code = NFT_BREAK;
diff --git a/net/ipv6/netfilter/nft_fib_ipv6.c b/net/ipv6/netfilter/nft_fib_ipv6.c
index c9f1634b3838..7fd9d7b21cd4 100644
--- a/net/ipv6/netfilter/nft_fib_ipv6.c
+++ b/net/ipv6/netfilter/nft_fib_ipv6.c
@@ -170,6 +170,11 @@ void nft_fib6_eval(const struct nft_expr *expr, struct nft_regs *regs,
 	struct rt6_info *rt;
 	int lookup_flags;
 
+	if (nft_fib_can_skip(pkt)) {
+		nft_fib_store_result(dest, priv, nft_in(pkt));
+		return;
+	}
+
 	if (priv->flags & NFTA_FIB_F_IIF)
 		oif = nft_in(pkt);
 	else if (priv->flags & NFTA_FIB_F_OIF)
@@ -181,17 +186,13 @@ void nft_fib6_eval(const struct nft_expr *expr, struct nft_regs *regs,
 		return;
 	}
 
-	lookup_flags = nft_fib6_flowi_init(&fl6, priv, pkt, oif, iph);
-
-	if (nft_hook(pkt) == NF_INET_PRE_ROUTING ||
-	    nft_hook(pkt) == NF_INET_INGRESS) {
-		if (nft_fib_is_loopback(pkt->skb, nft_in(pkt)) ||
-		    nft_fib_v6_skip_icmpv6(pkt->skb, pkt->tprot, iph)) {
-			nft_fib_store_result(dest, priv, nft_in(pkt));
-			return;
-		}
+	if (nft_fib_v6_skip_icmpv6(pkt->skb, pkt->tprot, iph)) {
+		nft_fib_store_result(dest, priv, nft_in(pkt));
+		return;
 	}
 
+	lookup_flags = nft_fib6_flowi_init(&fl6, priv, pkt, oif, iph);
+
 	*dest = 0;
 	rt = (void *)ip6_route_lookup(nft_net(pkt), &fl6, pkt->skb,
 				      lookup_flags);
-- 
2.45.3


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH nf-next] netfilter: fib: avoid lookup if socket is available
  2025-02-20 13:07 [PATCH nf-next] netfilter: fib: avoid lookup if socket is available Florian Westphal
@ 2025-03-12 19:15 ` Pablo Neira Ayuso
  2025-03-12 21:38   ` Florian Westphal
  0 siblings, 1 reply; 4+ messages in thread
From: Pablo Neira Ayuso @ 2025-03-12 19:15 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

On Thu, Feb 20, 2025 at 02:07:01PM +0100, Florian Westphal wrote:
> In case the fib match is used from the input hook we can avoid the fib
> lookup if early demux assigned a socket for us: check that the input
> interface matches sk-cached one.
> 
> Rework the existing 'lo bypass' logic to first check sk, then
> for loopback interface type to elide the fib lookup.
> 
> This speeds up fib matching a little, before:
> 93.08 GBit/s (no rules at all)
> 75.1  GBit/s ("fib saddr . iif oif missing drop" in prerouting)
> 75.62 GBit/s ("fib saddr . iif oif missing drop" in input)
> 
> After:
> 92.48 GBit/s (no rules at all)
> 75.62 GBit/s (fib rule in prerouting)
> 90.37 GBit/s (fib rule in input).
> 
> Numbers for the 'no rules' and 'prerouting' are expected to
> closely match in-between runs, the 3rd/input test case exercises the
> the 'avoid lookup if cached ifindex in sk matches' case.
> 
> Test used iperf3 via veth interface, lo can't be used due to existing
> loopback test.

A few questions below.

> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
>  include/net/netfilter/nft_fib.h   | 21 +++++++++++++++++++++
>  net/ipv4/netfilter/nft_fib_ipv4.c | 11 +++++------
>  net/ipv6/netfilter/nft_fib_ipv6.c | 19 ++++++++++---------
>  3 files changed, 36 insertions(+), 15 deletions(-)
> 
> diff --git a/include/net/netfilter/nft_fib.h b/include/net/netfilter/nft_fib.h
> index 38cae7113de4..6e202ed5e63f 100644
> --- a/include/net/netfilter/nft_fib.h
> +++ b/include/net/netfilter/nft_fib.h
> @@ -18,6 +18,27 @@ nft_fib_is_loopback(const struct sk_buff *skb, const struct net_device *in)
>  	return skb->pkt_type == PACKET_LOOPBACK || in->flags & IFF_LOOPBACK;
>  }
>  
> +static inline bool nft_fib_can_skip(const struct nft_pktinfo *pkt)
> +{
> +	const struct net_device *indev = nft_in(pkt);
> +	const struct sock *sk;
> +
> +	switch (nft_hook(pkt)) {
> +	case NF_INET_PRE_ROUTING:
> +	case NF_INET_INGRESS:

Not an issue in your patch itself, it seems nft_fib_validate() was
never updated to support NF_INET_INGRESS.

> +	case NF_INET_LOCAL_IN:
> +		break;
> +	default:
> +		return false;
> +	}
> +
> +	sk = pkt->skb->sk;
> +	if (sk && sk_fullsock(sk))
> +	       return sk->sk_rx_dst_ifindex == indev->ifindex;
> +
> +	return nft_fib_is_loopback(pkt->skb, indev);
> +}
> +
>  int nft_fib_dump(struct sk_buff *skb, const struct nft_expr *expr, bool reset);
>  int nft_fib_init(const struct nft_ctx *ctx, const struct nft_expr *expr,
>  		 const struct nlattr * const tb[]);
> diff --git a/net/ipv4/netfilter/nft_fib_ipv4.c b/net/ipv4/netfilter/nft_fib_ipv4.c
> index 625adbc42037..9082ca17e845 100644
> --- a/net/ipv4/netfilter/nft_fib_ipv4.c
> +++ b/net/ipv4/netfilter/nft_fib_ipv4.c
> @@ -71,6 +71,11 @@ void nft_fib4_eval(const struct nft_expr *expr, struct nft_regs *regs,
>  	const struct net_device *oif;
>  	const struct net_device *found;
>  
> +	if (nft_fib_can_skip(pkt)) {
> +		nft_fib_store_result(dest, priv, nft_in(pkt));
> +		return;
> +	}

Silly question: Does this optimization work for all cases?
NFTA_FIB_F_MARK and NFTA_FIB_F_DADDR.

>  	/*
>  	 * Do not set flowi4_oif, it restricts results (for example, asking
>  	 * for oif 3 will get RTN_UNICAST result even if the daddr exits
> @@ -85,12 +90,6 @@ void nft_fib4_eval(const struct nft_expr *expr, struct nft_regs *regs,
>  	else
>  		oif = NULL;
>  
> -	if (nft_hook(pkt) == NF_INET_PRE_ROUTING &&
> -	    nft_fib_is_loopback(pkt->skb, nft_in(pkt))) {
> -		nft_fib_store_result(dest, priv, nft_in(pkt));
> -		return;
> -	}
> -
>  	iph = skb_header_pointer(pkt->skb, noff, sizeof(_iph), &_iph);
>  	if (!iph) {
>  		regs->verdict.code = NFT_BREAK;
> diff --git a/net/ipv6/netfilter/nft_fib_ipv6.c b/net/ipv6/netfilter/nft_fib_ipv6.c
> index c9f1634b3838..7fd9d7b21cd4 100644
> --- a/net/ipv6/netfilter/nft_fib_ipv6.c
> +++ b/net/ipv6/netfilter/nft_fib_ipv6.c
> @@ -170,6 +170,11 @@ void nft_fib6_eval(const struct nft_expr *expr, struct nft_regs *regs,
>  	struct rt6_info *rt;
>  	int lookup_flags;
>  
> +	if (nft_fib_can_skip(pkt)) {
> +		nft_fib_store_result(dest, priv, nft_in(pkt));
> +		return;
> +	}
> +
>  	if (priv->flags & NFTA_FIB_F_IIF)
>  		oif = nft_in(pkt);
>  	else if (priv->flags & NFTA_FIB_F_OIF)
> @@ -181,17 +186,13 @@ void nft_fib6_eval(const struct nft_expr *expr, struct nft_regs *regs,
>  		return;
>  	}
>  
> -	lookup_flags = nft_fib6_flowi_init(&fl6, priv, pkt, oif, iph);
> -
> -	if (nft_hook(pkt) == NF_INET_PRE_ROUTING ||
> -	    nft_hook(pkt) == NF_INET_INGRESS) {
> -		if (nft_fib_is_loopback(pkt->skb, nft_in(pkt)) ||
> -		    nft_fib_v6_skip_icmpv6(pkt->skb, pkt->tprot, iph)) {
> -			nft_fib_store_result(dest, priv, nft_in(pkt));
> -			return;
> -		}
> +	if (nft_fib_v6_skip_icmpv6(pkt->skb, pkt->tprot, iph)) {
> +		nft_fib_store_result(dest, priv, nft_in(pkt));
> +		return;
>  	}
>  
> +	lookup_flags = nft_fib6_flowi_init(&fl6, priv, pkt, oif, iph);
> +
>  	*dest = 0;
>  	rt = (void *)ip6_route_lookup(nft_net(pkt), &fl6, pkt->skb,
>  				      lookup_flags);
> -- 
> 2.45.3
> 
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH nf-next] netfilter: fib: avoid lookup if socket is available
  2025-03-12 19:15 ` Pablo Neira Ayuso
@ 2025-03-12 21:38   ` Florian Westphal
  2025-03-12 23:19     ` Pablo Neira Ayuso
  0 siblings, 1 reply; 4+ messages in thread
From: Florian Westphal @ 2025-03-12 21:38 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > +	switch (nft_hook(pkt)) {
> > +	case NF_INET_PRE_ROUTING:
> > +	case NF_INET_INGRESS:
> 
> Not an issue in your patch itself, it seems nft_fib_validate() was
> never updated to support NF_INET_INGRESS.

Yes, probably better to do that in a different patch.

> > +	if (nft_fib_can_skip(pkt)) {
> > +		nft_fib_store_result(dest, priv, nft_in(pkt));
> > +		return;
> > +	}
> 
> Silly question: Does this optimization work for all cases?
> NFTA_FIB_F_MARK and NFTA_FIB_F_DADDR.

Its the socket that the skb will be delivered to, so I don't see
an issue.  Theoretically you could set a different mark in input,
but what is it good for? Its too late to change routing result.

As this sits in input hook, route lookup done by stack (not by fib
expr) already picked nft_in as the 'right' interface for this daddr.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH nf-next] netfilter: fib: avoid lookup if socket is available
  2025-03-12 21:38   ` Florian Westphal
@ 2025-03-12 23:19     ` Pablo Neira Ayuso
  0 siblings, 0 replies; 4+ messages in thread
From: Pablo Neira Ayuso @ 2025-03-12 23:19 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

On Wed, Mar 12, 2025 at 10:38:31PM +0100, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > > +	switch (nft_hook(pkt)) {
> > > +	case NF_INET_PRE_ROUTING:
> > > +	case NF_INET_INGRESS:
> > 
> > Not an issue in your patch itself, it seems nft_fib_validate() was
> > never updated to support NF_INET_INGRESS.
> 
> Yes, probably better to do that in a different patch.
> 
> > > +	if (nft_fib_can_skip(pkt)) {
> > > +		nft_fib_store_result(dest, priv, nft_in(pkt));
> > > +		return;
> > > +	}
> > 
> > Silly question: Does this optimization work for all cases?
> > NFTA_FIB_F_MARK and NFTA_FIB_F_DADDR.
> 
> Its the socket that the skb will be delivered to, so I don't see
> an issue.  Theoretically you could set a different mark in input,
> but what is it good for? Its too late to change routing result.

I see, makes no sense to trigger another lookup with the different
mark after the stack already provides a route (no use-case for this).

> As this sits in input hook, route lookup done by stack (not by fib
> expr) already picked nft_in as the 'right' interface for this daddr.

thanks for explaining.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2025-03-12 23:19 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-20 13:07 [PATCH nf-next] netfilter: fib: avoid lookup if socket is available Florian Westphal
2025-03-12 19:15 ` Pablo Neira Ayuso
2025-03-12 21:38   ` Florian Westphal
2025-03-12 23:19     ` Pablo Neira Ayuso

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).