From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Ren Wei <n05ec@lzu.edu.cn>
Cc: netfilter-devel@vger.kernel.org, fw@strlen.de, phil@nwl.cc,
lorenzo@kernel.org, yuantan098@gmail.com, yifanwucs@gmail.com,
tomapufckgml@gmail.com, bird@lzu.edu.cn,
chzhengyang2023@lzu.edu.cn
Subject: Re: [PATCH nf 1/1] netfilter: nf_flow_table: separate tunnel route state from direct xmit
Date: Tue, 9 Jun 2026 01:56:13 +0200 [thread overview]
Message-ID: <aidWndIYXO9hXFq-@chamomile> (raw)
In-Reply-To: <3947a39286d335b6136bbee26f8bf44b23471c69.1780580352.git.chzhengyang2023@lzu.edu.cn>
Hi,
On Fri, Jun 05, 2026 at 01:10:35AM +0800, Ren Wei wrote:
> From: Zhengyang Chen <chzhengyang2023@lzu.edu.cn>
>
> When a flow tuple carries tunnel metadata and uses
> FLOW_OFFLOAD_XMIT_DIRECT, the transmit path may still need route state
> for tunnel push. However, the current tuple layout stores direct xmit
> L2 state and route state in overlapping runtime storage.
>
> As a result, a tuple may keep tun_num set while the tunnel push path
> later reads tuple->dst_cache, even though a direct xmit tuple only has
> out.ifidx/out.h_source/out.h_dest stored in that area. This leads to
> invalid dst usage and can trigger a crash in the tunnel transmit path.
>
> Fix this by separating tunnel route state from direct xmit runtime
> state. Store dedicated tunnel dst information for direct xmit tunnel
> flows, use it from the IPv4/IPv6 tunnel push helpers, and validate and
> release it independently from the normal neighbour/xfrm dst state.
>
> Hardware offload rule construction still assumes that direct xmit flows
> do not carry tunnel route state, so reject that combination there for
> now to avoid undefined offload behaviour.
>
> The issue can be reproduced with the provided namespace + flowtable +
> IPIP setup, and after this change the reproducer no longer triggers the
> observed GPF/panic.
>
> Fixes: ab427db17885 ("netfilter: flowtable: Add IPIP rx sw acceleration")
> Cc: stable@vger.kernel.org
> Reported-by: Yuan Tan <yuantan098@gmail.com>
> Reported-by: Yifan Wu <yifanwucs@gmail.com>
> Reported-by: Juefei Pu <tomapufckgml@gmail.com>
> Reported-by: Xin Liu <bird@lzu.edu.cn>
> Signed-off-by: Zhengyang Chen <chzhengyang2023@lzu.edu.cn>
> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
> ---
> include/net/netfilter/nf_flow_table.h | 4 ++++
> net/netfilter/nf_flow_table_core.c | 19 ++++++++++++++++++-
> net/netfilter/nf_flow_table_ip.c | 13 +++++++++++--
> net/netfilter/nf_flow_table_offload.c | 5 +++++
> 4 files changed, 38 insertions(+), 3 deletions(-)
>
> diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h
> index 7b23b245a5a8..4fe220f97d75 100644
> --- a/include/net/netfilter/nf_flow_table.h
> +++ b/include/net/netfilter/nf_flow_table.h
> @@ -155,6 +155,10 @@ struct flow_offload_tuple {
> tun_num:2,
> in_vlan_ingress:2;
> u16 mtu;
> + struct {
> + struct dst_entry *tun_dst_cache;
> + u32 tun_dst_cookie;
Instead of adding new tun_dst_cache and tun_dst_cookie fields, you
could move the existing dst_cache and dst_cookie out of the union.
Then, it can be used both for _NEIGH and _DIRECT modes.
Does this make sense to simplify this patch?
One more comment below.
> + };
> union {
> struct {
> struct dst_entry *dst_cache;
> diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
> index 785d8c244a77..5048c0a1ba2e 100644
> --- a/net/netfilter/nf_flow_table_core.c
> +++ b/net/netfilter/nf_flow_table_core.c
> @@ -84,6 +84,14 @@ static u32 flow_offload_dst_cookie(struct flow_offload_tuple *flow_tuple)
> return 0;
> }
>
> +static u32 flow_offload_tun_dst_cookie(struct flow_offload_tuple *flow_tuple)
> +{
> + if (flow_tuple->tun.l3_proto == IPPROTO_IPV6)
> + return rt6_get_cookie(dst_rt6_info(flow_tuple->tun_dst_cache));
> +
> + return 0;
> +}
> +
> static struct dst_entry *nft_route_dst_fetch(struct nf_flow_route *route,
> enum flow_offload_tuple_dir dir)
> {
> @@ -127,12 +135,17 @@ static int flow_offload_fill_route(struct flow_offload *flow,
>
> switch (route->tuple[dir].xmit_type) {
> case FLOW_OFFLOAD_XMIT_DIRECT:
> + if (flow_tuple->tun_num) {
> + flow_tuple->tun_dst_cache = dst;
> + flow_tuple->tun_dst_cookie = flow_offload_tun_dst_cookie(flow_tuple);
> + }
> memcpy(flow_tuple->out.h_dest, route->tuple[dir].out.h_dest,
> ETH_ALEN);
> memcpy(flow_tuple->out.h_source, route->tuple[dir].out.h_source,
> ETH_ALEN);
> flow_tuple->out.ifidx = route->tuple[dir].out.ifindex;
> - dst_release(dst);
> + if (!flow_tuple->tun_num)
> + dst_release(dst);
> break;
> case FLOW_OFFLOAD_XMIT_XFRM:
> case FLOW_OFFLOAD_XMIT_NEIGH:
> @@ -152,6 +165,10 @@ static int flow_offload_fill_route(struct flow_offload *flow,
> static void nft_flow_dst_release(struct flow_offload *flow,
> enum flow_offload_tuple_dir dir)
> {
> + if (flow->tuplehash[dir].tuple.tun_num &&
> + flow->tuplehash[dir].tuple.xmit_type == FLOW_OFFLOAD_XMIT_DIRECT)
> + dst_release(flow->tuplehash[dir].tuple.tun_dst_cache);
> +
> if (flow->tuplehash[dir].tuple.xmit_type == FLOW_OFFLOAD_XMIT_NEIGH ||
> flow->tuplehash[dir].tuple.xmit_type == FLOW_OFFLOAD_XMIT_XFRM)
> dst_release(flow->tuplehash[dir].tuple.dst_cache);
dst_release() becomes no-op if it is NULL, then I think this can be
simplified.
> diff --git a/net/netfilter/nf_flow_table_ip.c b/net/netfilter/nf_flow_table_ip.c
> index 9c05a50d6013..8dbec82a663a 100644
> --- a/net/netfilter/nf_flow_table_ip.c
> +++ b/net/netfilter/nf_flow_table_ip.c
> @@ -299,6 +299,11 @@ static bool nf_flow_exceeds_mtu(const struct sk_buff *skb, unsigned int mtu)
>
> static inline bool nf_flow_dst_check(struct flow_offload_tuple *tuple)
> {
> + if (tuple->tun_num &&
> + tuple->xmit_type == FLOW_OFFLOAD_XMIT_DIRECT &&
> + !dst_check(tuple->tun_dst_cache, tuple->tun_dst_cookie))
> + return false;
> +
> if (tuple->xmit_type != FLOW_OFFLOAD_XMIT_NEIGH &&
> tuple->xmit_type != FLOW_OFFLOAD_XMIT_XFRM)
> return true;
> @@ -597,7 +602,9 @@ static int nf_flow_tunnel_ipip_push(struct net *net, struct sk_buff *skb,
> __be32 *ip_daddr)
> {
> struct iphdr *iph = (struct iphdr *)skb_network_header(skb);
> - struct rtable *rt = dst_rtable(tuple->dst_cache);
> + struct dst_entry *dst = tuple->xmit_type == FLOW_OFFLOAD_XMIT_DIRECT ?
> + tuple->tun_dst_cache : tuple->dst_cache;
> + struct rtable *rt = dst_rtable(dst);
> u8 tos = iph->tos, ttl = iph->ttl;
> __be16 frag_off = iph->frag_off;
> u32 headroom = sizeof(*iph);
> @@ -660,7 +667,9 @@ static int nf_flow_tunnel_ip6ip6_push(struct net *net, struct sk_buff *skb,
> {
> struct ipv6hdr *ip6h = (struct ipv6hdr *)skb_network_header(skb);
> u8 hop_limit = ip6h->hop_limit, proto = IPPROTO_IPV6;
> - struct rtable *rt = dst_rtable(tuple->dst_cache);
> + struct dst_entry *dst = tuple->xmit_type == FLOW_OFFLOAD_XMIT_DIRECT ?
> + tuple->tun_dst_cache : tuple->dst_cache;
> + struct rtable *rt = dst_rtable(dst);
> __u8 dsfield = ipv6_get_dsfield(ip6h);
> struct flowi6 fl6 = {
> .daddr = tuple->tun.src_v6,
> diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c
> index 002ec15d988b..c739c9db68bd 100644
> --- a/net/netfilter/nf_flow_table_offload.c
> +++ b/net/netfilter/nf_flow_table_offload.c
> @@ -820,6 +820,11 @@ nf_flow_offload_rule_alloc(struct net *net,
>
> tuple = &flow->tuplehash[dir].tuple;
> other_tuple = &flow->tuplehash[!dir].tuple;
> +
> + if (other_tuple->tun_num &&
> + other_tuple->xmit_type == FLOW_OFFLOAD_XMIT_DIRECT)
> + goto err_flow_match;
> +
> if (other_tuple->xmit_type == FLOW_OFFLOAD_XMIT_NEIGH)
> other_dst = other_tuple->dst_cache;
>
> --
> 2.43.0
>
prev parent reply other threads:[~2026-06-08 23:56 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <cover.1780580352.git.chzhengyang2023@lzu.edu.cn>
2026-06-04 17:10 ` [PATCH nf 1/1] netfilter: nf_flow_table: separate tunnel route state from direct xmit Ren Wei
2026-06-08 23:56 ` Pablo Neira Ayuso [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aidWndIYXO9hXFq-@chamomile \
--to=pablo@netfilter.org \
--cc=bird@lzu.edu.cn \
--cc=chzhengyang2023@lzu.edu.cn \
--cc=fw@strlen.de \
--cc=lorenzo@kernel.org \
--cc=n05ec@lzu.edu.cn \
--cc=netfilter-devel@vger.kernel.org \
--cc=phil@nwl.cc \
--cc=tomapufckgml@gmail.com \
--cc=yifanwucs@gmail.com \
--cc=yuantan098@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox