Linux Netfilter development
 help / color / mirror / Atom feed
From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Ren Wei <n05ec@lzu.edu.cn>
Cc: netfilter-devel@vger.kernel.org, fw@strlen.de, phil@nwl.cc,
	lorenzo@kernel.org, yuantan098@gmail.com, yifanwucs@gmail.com,
	tomapufckgml@gmail.com, bird@lzu.edu.cn,
	chzhengyang2023@lzu.edu.cn
Subject: Re: [PATCH nf 1/1] netfilter: nf_flow_table: separate tunnel route state from direct xmit
Date: Tue, 9 Jun 2026 01:56:13 +0200	[thread overview]
Message-ID: <aidWndIYXO9hXFq-@chamomile> (raw)
In-Reply-To: <3947a39286d335b6136bbee26f8bf44b23471c69.1780580352.git.chzhengyang2023@lzu.edu.cn>

Hi,

On Fri, Jun 05, 2026 at 01:10:35AM +0800, Ren Wei wrote:
> From: Zhengyang Chen <chzhengyang2023@lzu.edu.cn>
> 
> When a flow tuple carries tunnel metadata and uses
> FLOW_OFFLOAD_XMIT_DIRECT, the transmit path may still need route state
> for tunnel push. However, the current tuple layout stores direct xmit
> L2 state and route state in overlapping runtime storage.
> 
> As a result, a tuple may keep tun_num set while the tunnel push path
> later reads tuple->dst_cache, even though a direct xmit tuple only has
> out.ifidx/out.h_source/out.h_dest stored in that area. This leads to
> invalid dst usage and can trigger a crash in the tunnel transmit path.
> 
> Fix this by separating tunnel route state from direct xmit runtime
> state. Store dedicated tunnel dst information for direct xmit tunnel
> flows, use it from the IPv4/IPv6 tunnel push helpers, and validate and
> release it independently from the normal neighbour/xfrm dst state.
> 
> Hardware offload rule construction still assumes that direct xmit flows
> do not carry tunnel route state, so reject that combination there for
> now to avoid undefined offload behaviour.
> 
> The issue can be reproduced with the provided namespace + flowtable +
> IPIP setup, and after this change the reproducer no longer triggers the
> observed GPF/panic.
> 
> Fixes: ab427db17885 ("netfilter: flowtable: Add IPIP rx sw acceleration")
> Cc: stable@vger.kernel.org
> Reported-by: Yuan Tan <yuantan098@gmail.com>
> Reported-by: Yifan Wu <yifanwucs@gmail.com>
> Reported-by: Juefei Pu <tomapufckgml@gmail.com>
> Reported-by: Xin Liu <bird@lzu.edu.cn>
> Signed-off-by: Zhengyang Chen <chzhengyang2023@lzu.edu.cn>
> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
> ---
>  include/net/netfilter/nf_flow_table.h |  4 ++++
>  net/netfilter/nf_flow_table_core.c    | 19 ++++++++++++++++++-
>  net/netfilter/nf_flow_table_ip.c      | 13 +++++++++++--
>  net/netfilter/nf_flow_table_offload.c |  5 +++++
>  4 files changed, 38 insertions(+), 3 deletions(-)
> 
> diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h
> index 7b23b245a5a8..4fe220f97d75 100644
> --- a/include/net/netfilter/nf_flow_table.h
> +++ b/include/net/netfilter/nf_flow_table.h
> @@ -155,6 +155,10 @@ struct flow_offload_tuple {
>  					tun_num:2,
>  					in_vlan_ingress:2;
>  	u16				mtu;
> +	struct {
> +		struct dst_entry	*tun_dst_cache;
> +		u32			tun_dst_cookie;

Instead of adding new tun_dst_cache and tun_dst_cookie fields, you
could move the existing dst_cache and dst_cookie out of the union.
Then, it can be used both for _NEIGH and _DIRECT modes.

Does this make sense to simplify this patch?

One more comment below.

> +	};
>  	union {
>  		struct {
>  			struct dst_entry *dst_cache;
> diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
> index 785d8c244a77..5048c0a1ba2e 100644
> --- a/net/netfilter/nf_flow_table_core.c
> +++ b/net/netfilter/nf_flow_table_core.c
> @@ -84,6 +84,14 @@ static u32 flow_offload_dst_cookie(struct flow_offload_tuple *flow_tuple)
>  	return 0;
>  }
>  
> +static u32 flow_offload_tun_dst_cookie(struct flow_offload_tuple *flow_tuple)
> +{
> +	if (flow_tuple->tun.l3_proto == IPPROTO_IPV6)
> +		return rt6_get_cookie(dst_rt6_info(flow_tuple->tun_dst_cache));
> +
> +	return 0;
> +}
> +
>  static struct dst_entry *nft_route_dst_fetch(struct nf_flow_route *route,
>  					     enum flow_offload_tuple_dir dir)
>  {
> @@ -127,12 +135,17 @@ static int flow_offload_fill_route(struct flow_offload *flow,
>  
>  	switch (route->tuple[dir].xmit_type) {
>  	case FLOW_OFFLOAD_XMIT_DIRECT:
> +		if (flow_tuple->tun_num) {
> +			flow_tuple->tun_dst_cache = dst;
> +			flow_tuple->tun_dst_cookie = flow_offload_tun_dst_cookie(flow_tuple);
> +		}
>  		memcpy(flow_tuple->out.h_dest, route->tuple[dir].out.h_dest,
>  		       ETH_ALEN);
>  		memcpy(flow_tuple->out.h_source, route->tuple[dir].out.h_source,
>  		       ETH_ALEN);
>  		flow_tuple->out.ifidx = route->tuple[dir].out.ifindex;
> -		dst_release(dst);
> +		if (!flow_tuple->tun_num)
> +			dst_release(dst);
>  		break;
>  	case FLOW_OFFLOAD_XMIT_XFRM:
>  	case FLOW_OFFLOAD_XMIT_NEIGH:
> @@ -152,6 +165,10 @@ static int flow_offload_fill_route(struct flow_offload *flow,
>  static void nft_flow_dst_release(struct flow_offload *flow,
>  				 enum flow_offload_tuple_dir dir)
>  {
> +	if (flow->tuplehash[dir].tuple.tun_num &&
> +	    flow->tuplehash[dir].tuple.xmit_type == FLOW_OFFLOAD_XMIT_DIRECT)
> +		dst_release(flow->tuplehash[dir].tuple.tun_dst_cache);
> +
>  	if (flow->tuplehash[dir].tuple.xmit_type == FLOW_OFFLOAD_XMIT_NEIGH ||
>  	    flow->tuplehash[dir].tuple.xmit_type == FLOW_OFFLOAD_XMIT_XFRM)
>  		dst_release(flow->tuplehash[dir].tuple.dst_cache);

dst_release() becomes no-op if it is NULL, then I think this can be
simplified.

> diff --git a/net/netfilter/nf_flow_table_ip.c b/net/netfilter/nf_flow_table_ip.c
> index 9c05a50d6013..8dbec82a663a 100644
> --- a/net/netfilter/nf_flow_table_ip.c
> +++ b/net/netfilter/nf_flow_table_ip.c
> @@ -299,6 +299,11 @@ static bool nf_flow_exceeds_mtu(const struct sk_buff *skb, unsigned int mtu)
>  
>  static inline bool nf_flow_dst_check(struct flow_offload_tuple *tuple)
>  {
> +	if (tuple->tun_num &&
> +	    tuple->xmit_type == FLOW_OFFLOAD_XMIT_DIRECT &&
> +	    !dst_check(tuple->tun_dst_cache, tuple->tun_dst_cookie))
> +		return false;
> +
>  	if (tuple->xmit_type != FLOW_OFFLOAD_XMIT_NEIGH &&
>  	    tuple->xmit_type != FLOW_OFFLOAD_XMIT_XFRM)
>  		return true;
> @@ -597,7 +602,9 @@ static int nf_flow_tunnel_ipip_push(struct net *net, struct sk_buff *skb,
>  				    __be32 *ip_daddr)
>  {
>  	struct iphdr *iph = (struct iphdr *)skb_network_header(skb);
> -	struct rtable *rt = dst_rtable(tuple->dst_cache);
> +	struct dst_entry *dst = tuple->xmit_type == FLOW_OFFLOAD_XMIT_DIRECT ?
> +				tuple->tun_dst_cache : tuple->dst_cache;
> +	struct rtable *rt = dst_rtable(dst);
>  	u8 tos = iph->tos, ttl = iph->ttl;
>  	__be16 frag_off = iph->frag_off;
>  	u32 headroom = sizeof(*iph);
> @@ -660,7 +667,9 @@ static int nf_flow_tunnel_ip6ip6_push(struct net *net, struct sk_buff *skb,
>  {
>  	struct ipv6hdr *ip6h = (struct ipv6hdr *)skb_network_header(skb);
>  	u8 hop_limit = ip6h->hop_limit, proto = IPPROTO_IPV6;
> -	struct rtable *rt = dst_rtable(tuple->dst_cache);
> +	struct dst_entry *dst = tuple->xmit_type == FLOW_OFFLOAD_XMIT_DIRECT ?
> +				tuple->tun_dst_cache : tuple->dst_cache;
> +	struct rtable *rt = dst_rtable(dst);
>  	__u8 dsfield = ipv6_get_dsfield(ip6h);
>  	struct flowi6 fl6 = {
>  		.daddr = tuple->tun.src_v6,
> diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c
> index 002ec15d988b..c739c9db68bd 100644
> --- a/net/netfilter/nf_flow_table_offload.c
> +++ b/net/netfilter/nf_flow_table_offload.c
> @@ -820,6 +820,11 @@ nf_flow_offload_rule_alloc(struct net *net,
>  
>  	tuple = &flow->tuplehash[dir].tuple;
>  	other_tuple = &flow->tuplehash[!dir].tuple;
> +
> +	if (other_tuple->tun_num &&
> +	    other_tuple->xmit_type == FLOW_OFFLOAD_XMIT_DIRECT)
> +		goto err_flow_match;
> +
>  	if (other_tuple->xmit_type == FLOW_OFFLOAD_XMIT_NEIGH)
>  		other_dst = other_tuple->dst_cache;
>  
> -- 
> 2.43.0
> 

      reply	other threads:[~2026-06-08 23:56 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <cover.1780580352.git.chzhengyang2023@lzu.edu.cn>
2026-06-04 17:10 ` [PATCH nf 1/1] netfilter: nf_flow_table: separate tunnel route state from direct xmit Ren Wei
2026-06-08 23:56   ` Pablo Neira Ayuso [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aidWndIYXO9hXFq-@chamomile \
    --to=pablo@netfilter.org \
    --cc=bird@lzu.edu.cn \
    --cc=chzhengyang2023@lzu.edu.cn \
    --cc=fw@strlen.de \
    --cc=lorenzo@kernel.org \
    --cc=n05ec@lzu.edu.cn \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=phil@nwl.cc \
    --cc=tomapufckgml@gmail.com \
    --cc=yifanwucs@gmail.com \
    --cc=yuantan098@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox