* [PATCH net-next 1/7] net: Add skb_dst_reset and skb_dst_restore
2025-08-12 15:52 [PATCH net-next 0/7] net: Convert to skb_dst_reset and skb_dst_restore Stanislav Fomichev
@ 2025-08-12 15:52 ` Stanislav Fomichev
2025-08-14 0:57 ` Jakub Kicinski
2025-08-12 15:52 ` [PATCH net-next 2/7] xfrm: Switch to skb_dst_reset to clear dst_entry Stanislav Fomichev
` (6 subsequent siblings)
7 siblings, 1 reply; 15+ messages in thread
From: Stanislav Fomichev @ 2025-08-12 15:52 UTC (permalink / raw)
To: netdev
Cc: davem, edumazet, kuba, pabeni, ayush.sawal, andrew+netdev, gregkh,
horms, dsahern, pablo, kadlec, steffen.klassert, sdf, mhal,
abhishektamboli9, linux-kernel, linux-staging, netfilter-devel,
coreteam, herbert
Going forward skb_dst_set will assert that skb dst_entry
is empty during skb_dst_set to prevent potential leaks. There
are few places that still manually manage dst_entry not using
the helpers. Convert them to the following new helpers:
- skb_dst_reset that resets dst_entry and returns previous dst_entry
value
- skb_dst_restore that restores dst_entry previously reset via skb_dst_restore
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
include/linux/skbuff.h | 31 +++++++++++++++++++++++++++++++
1 file changed, 31 insertions(+)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 14b923ddb6df..8240e0826204 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1159,6 +1159,37 @@ static inline struct dst_entry *skb_dst(const struct sk_buff *skb)
return (struct dst_entry *)(skb->_skb_refdst & SKB_DST_PTRMASK);
}
+/**
+ * skb_dst_reset() - return current dst_entry value and clear it
+ * @skb: buffer
+ *
+ * Resets skb dst_entry without adjusting its reference count. Useful in
+ * cases where dst_entry needs to be temporarily reset and restored.
+ * Note that the returned value cannot be used directly because it
+ * might contain SKB_DST_NOREF bit.
+ *
+ * When in doubt, prefer skb_dst_drop() over skb_dst_reset() to correctly
+ * handle dst_entry reference counting.
+ *
+ * Returns: original skb dst_entry.
+ */
+static inline unsigned long skb_dst_reset(struct sk_buff *skb)
+{
+ unsigned long refdst = skb->_skb_refdst;
+
+ skb->_skb_refdst = 0;
+ return refdst;
+}
+
+/**
+ * skb_dst_restore() - restore skb dst_entry saved via skb_dst_reset
+ * @skb: buffer
+ */
+static inline void skb_dst_restore(struct sk_buff *skb, unsigned long refdst)
+{
+ skb->_skb_refdst = refdst;
+}
+
/**
* skb_dst_set - sets skb dst
* @skb: buffer
--
2.50.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 1/7] net: Add skb_dst_reset and skb_dst_restore
2025-08-12 15:52 ` [PATCH net-next 1/7] net: Add " Stanislav Fomichev
@ 2025-08-14 0:57 ` Jakub Kicinski
2025-08-14 1:03 ` Jakub Kicinski
0 siblings, 1 reply; 15+ messages in thread
From: Jakub Kicinski @ 2025-08-14 0:57 UTC (permalink / raw)
To: Stanislav Fomichev
Cc: netdev, davem, edumazet, pabeni, ayush.sawal, andrew+netdev,
gregkh, horms, dsahern, pablo, kadlec, steffen.klassert, mhal,
abhishektamboli9, linux-kernel, linux-staging, netfilter-devel,
coreteam, herbert
On Tue, 12 Aug 2025 08:52:39 -0700 Stanislav Fomichev wrote:
> +/**
> + * skb_dst_reset() - return current dst_entry value and clear it
> + * @skb: buffer
> + *
> + * Resets skb dst_entry without adjusting its reference count. Useful in
> + * cases where dst_entry needs to be temporarily reset and restored.
> + * Note that the returned value cannot be used directly because it
> + * might contain SKB_DST_NOREF bit.
> + *
> + * When in doubt, prefer skb_dst_drop() over skb_dst_reset() to correctly
> + * handle dst_entry reference counting.
thoughts on prefixing these two new helpers with __ to hint that
they are low level and best avoided?
> + *
> + * Returns: original skb dst_entry.
> + */
> +static inline unsigned long skb_dst_reset(struct sk_buff *skb)
> +{
> + unsigned long refdst = skb->_skb_refdst;
> +
> + skb->_skb_refdst = 0;
> + return refdst;
> +}
> +
> +/**
> + * skb_dst_restore() - restore skb dst_entry saved via skb_dst_reset
saved -> removed ?
Also I think for better kdoc linking it's good to add () after function
names
> + * @skb: buffer
kdoc missing for refdst
> + */
> +static inline void skb_dst_restore(struct sk_buff *skb, unsigned long refdst)
> +{
> + skb->_skb_refdst = refdst;
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 1/7] net: Add skb_dst_reset and skb_dst_restore
2025-08-14 0:57 ` Jakub Kicinski
@ 2025-08-14 1:03 ` Jakub Kicinski
2025-08-14 15:45 ` Stanislav Fomichev
0 siblings, 1 reply; 15+ messages in thread
From: Jakub Kicinski @ 2025-08-14 1:03 UTC (permalink / raw)
To: Stanislav Fomichev
Cc: netdev, davem, edumazet, pabeni, ayush.sawal, andrew+netdev,
gregkh, horms, dsahern, pablo, kadlec, steffen.klassert, mhal,
abhishektamboli9, linux-kernel, linux-staging, netfilter-devel,
coreteam, herbert
On Wed, 13 Aug 2025 17:57:40 -0700 Jakub Kicinski wrote:
> On Tue, 12 Aug 2025 08:52:39 -0700 Stanislav Fomichev wrote:
> > +/**
> > + * skb_dst_reset() - return current dst_entry value and clear it
> > + * @skb: buffer
> > + *
> > + * Resets skb dst_entry without adjusting its reference count. Useful in
> > + * cases where dst_entry needs to be temporarily reset and restored.
> > + * Note that the returned value cannot be used directly because it
> > + * might contain SKB_DST_NOREF bit.
> > + *
> > + * When in doubt, prefer skb_dst_drop() over skb_dst_reset() to correctly
> > + * handle dst_entry reference counting.
>
> thoughts on prefixing these two new helpers with __ to hint that
> they are low level and best avoided?
Looking at the uses -- maybe skb_dstref_steal() or skb_steal_dstref()
would be a better name? We have skb_steal_sock() (et.al.) already,
same semantics.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 1/7] net: Add skb_dst_reset and skb_dst_restore
2025-08-14 1:03 ` Jakub Kicinski
@ 2025-08-14 15:45 ` Stanislav Fomichev
0 siblings, 0 replies; 15+ messages in thread
From: Stanislav Fomichev @ 2025-08-14 15:45 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Stanislav Fomichev, netdev, davem, edumazet, pabeni, ayush.sawal,
andrew+netdev, gregkh, horms, dsahern, pablo, kadlec,
steffen.klassert, mhal, abhishektamboli9, linux-kernel,
linux-staging, netfilter-devel, coreteam, herbert
On 08/13, Jakub Kicinski wrote:
> On Wed, 13 Aug 2025 17:57:40 -0700 Jakub Kicinski wrote:
> > On Tue, 12 Aug 2025 08:52:39 -0700 Stanislav Fomichev wrote:
> > > +/**
> > > + * skb_dst_reset() - return current dst_entry value and clear it
> > > + * @skb: buffer
> > > + *
> > > + * Resets skb dst_entry without adjusting its reference count. Useful in
> > > + * cases where dst_entry needs to be temporarily reset and restored.
> > > + * Note that the returned value cannot be used directly because it
> > > + * might contain SKB_DST_NOREF bit.
> > > + *
> > > + * When in doubt, prefer skb_dst_drop() over skb_dst_reset() to correctly
> > > + * handle dst_entry reference counting.
> >
> > thoughts on prefixing these two new helpers with __ to hint that
> > they are low level and best avoided?
>
> Looking at the uses -- maybe skb_dstref_steal() or skb_steal_dstref()
> would be a better name? We have skb_steal_sock() (et.al.) already,
> same semantics.
Sure, will rename and address the rest of your feedback. Thanks for
the review!
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH net-next 2/7] xfrm: Switch to skb_dst_reset to clear dst_entry
2025-08-12 15:52 [PATCH net-next 0/7] net: Convert to skb_dst_reset and skb_dst_restore Stanislav Fomichev
2025-08-12 15:52 ` [PATCH net-next 1/7] net: Add " Stanislav Fomichev
@ 2025-08-12 15:52 ` Stanislav Fomichev
2025-08-12 15:52 ` [PATCH net-next 3/7] netfilter: " Stanislav Fomichev
` (5 subsequent siblings)
7 siblings, 0 replies; 15+ messages in thread
From: Stanislav Fomichev @ 2025-08-12 15:52 UTC (permalink / raw)
To: netdev
Cc: davem, edumazet, kuba, pabeni, ayush.sawal, andrew+netdev, gregkh,
horms, dsahern, pablo, kadlec, steffen.klassert, sdf, mhal,
abhishektamboli9, linux-kernel, linux-staging, netfilter-devel,
coreteam, herbert
Going forward skb_dst_set will assert that skb dst_entry
is empty during skb_dst_set. skb_dst_reset is added to reset
existing entry without doing refcnt. Switch to skb_dst_reset
in __xfrm_route_forward and add a comment on why it's safe
to skip skb_dst_restore.
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
net/xfrm/xfrm_policy.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index c5035a9bc3bb..a5ffe26b64d5 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -3881,12 +3881,18 @@ int __xfrm_route_forward(struct sk_buff *skb, unsigned short family)
}
skb_dst_force(skb);
- if (!skb_dst(skb)) {
+ dst = skb_dst(skb);
+ if (!dst) {
XFRM_INC_STATS(net, LINUX_MIB_XFRMFWDHDRERROR);
return 0;
}
- dst = xfrm_lookup(net, skb_dst(skb), &fl, NULL, XFRM_LOOKUP_QUEUE);
+ /* ignore return value from skb_dst_reset, xfrm_lookup takes
+ * care of dropping the refcnt if needed.
+ */
+ skb_dst_reset(skb);
+
+ dst = xfrm_lookup(net, dst, &fl, NULL, XFRM_LOOKUP_QUEUE);
if (IS_ERR(dst)) {
res = 0;
dst = NULL;
--
2.50.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH net-next 3/7] netfilter: Switch to skb_dst_reset to clear dst_entry
2025-08-12 15:52 [PATCH net-next 0/7] net: Convert to skb_dst_reset and skb_dst_restore Stanislav Fomichev
2025-08-12 15:52 ` [PATCH net-next 1/7] net: Add " Stanislav Fomichev
2025-08-12 15:52 ` [PATCH net-next 2/7] xfrm: Switch to skb_dst_reset to clear dst_entry Stanislav Fomichev
@ 2025-08-12 15:52 ` Stanislav Fomichev
2025-08-13 6:43 ` Florian Westphal
2025-08-12 15:52 ` [PATCH net-next 4/7] net: Switch to skb_dst_reset/skb_dst_restore for ip_route_input callers Stanislav Fomichev
` (4 subsequent siblings)
7 siblings, 1 reply; 15+ messages in thread
From: Stanislav Fomichev @ 2025-08-12 15:52 UTC (permalink / raw)
To: netdev
Cc: davem, edumazet, kuba, pabeni, ayush.sawal, andrew+netdev, gregkh,
horms, dsahern, pablo, kadlec, steffen.klassert, sdf, mhal,
abhishektamboli9, linux-kernel, linux-staging, netfilter-devel,
coreteam, herbert
Going forward skb_dst_set will assert that skb dst_entry
is empty during skb_dst_set. skb_dst_reset is added to reset
existing entry without doing refcnt. Switch to skb_dst_reset
in ip[6]_route_me_harder and add a comment on why it's safe
to skip skb_dst_restore.
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
net/ipv4/netfilter.c | 5 ++++-
net/ipv6/netfilter.c | 5 ++++-
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/netfilter.c b/net/ipv4/netfilter.c
index 0565f001120d..bda67bb0e63b 100644
--- a/net/ipv4/netfilter.c
+++ b/net/ipv4/netfilter.c
@@ -65,7 +65,10 @@ int ip_route_me_harder(struct net *net, struct sock *sk, struct sk_buff *skb, un
if (!(IPCB(skb)->flags & IPSKB_XFRM_TRANSFORMED) &&
xfrm_decode_session(net, skb, flowi4_to_flowi(&fl4), AF_INET) == 0) {
struct dst_entry *dst = skb_dst(skb);
- skb_dst_set(skb, NULL);
+ /* ignore return value from skb_dst_reset, xfrm_lookup takes
+ * care of dropping the refcnt if needed.
+ */
+ skb_dst_reset(skb);
dst = xfrm_lookup(net, dst, flowi4_to_flowi(&fl4), sk, 0);
if (IS_ERR(dst))
return PTR_ERR(dst);
diff --git a/net/ipv6/netfilter.c b/net/ipv6/netfilter.c
index 45f9105f9ac1..6743c075133d 100644
--- a/net/ipv6/netfilter.c
+++ b/net/ipv6/netfilter.c
@@ -63,7 +63,10 @@ int ip6_route_me_harder(struct net *net, struct sock *sk_partial, struct sk_buff
#ifdef CONFIG_XFRM
if (!(IP6CB(skb)->flags & IP6SKB_XFRM_TRANSFORMED) &&
xfrm_decode_session(net, skb, flowi6_to_flowi(&fl6), AF_INET6) == 0) {
- skb_dst_set(skb, NULL);
+ /* ignore return value from skb_dst_reset, xfrm_lookup takes
+ * care of dropping the refcnt if needed.
+ */
+ skb_dst_reset(skb);
dst = xfrm_lookup(net, dst, flowi6_to_flowi(&fl6), sk, 0);
if (IS_ERR(dst))
return PTR_ERR(dst);
--
2.50.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 3/7] netfilter: Switch to skb_dst_reset to clear dst_entry
2025-08-12 15:52 ` [PATCH net-next 3/7] netfilter: " Stanislav Fomichev
@ 2025-08-13 6:43 ` Florian Westphal
0 siblings, 0 replies; 15+ messages in thread
From: Florian Westphal @ 2025-08-13 6:43 UTC (permalink / raw)
To: Stanislav Fomichev
Cc: netdev, davem, edumazet, kuba, pabeni, ayush.sawal, andrew+netdev,
gregkh, horms, dsahern, pablo, kadlec, steffen.klassert, mhal,
abhishektamboli9, linux-kernel, linux-staging, netfilter-devel,
coreteam, herbert
Stanislav Fomichev <sdf@fomichev.me> wrote:
> Going forward skb_dst_set will assert that skb dst_entry
> is empty during skb_dst_set. skb_dst_reset is added to reset
> existing entry without doing refcnt. Switch to skb_dst_reset
> in ip[6]_route_me_harder and add a comment on why it's safe
> to skip skb_dst_restore.
Acked-by: Florian Westphal <fw@strlen.de>
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH net-next 4/7] net: Switch to skb_dst_reset/skb_dst_restore for ip_route_input callers
2025-08-12 15:52 [PATCH net-next 0/7] net: Convert to skb_dst_reset and skb_dst_restore Stanislav Fomichev
` (2 preceding siblings ...)
2025-08-12 15:52 ` [PATCH net-next 3/7] netfilter: " Stanislav Fomichev
@ 2025-08-12 15:52 ` Stanislav Fomichev
2025-08-12 15:52 ` [PATCH net-next 5/7] staging: octeon: Convert to skb_dst_drop Stanislav Fomichev
` (3 subsequent siblings)
7 siblings, 0 replies; 15+ messages in thread
From: Stanislav Fomichev @ 2025-08-12 15:52 UTC (permalink / raw)
To: netdev
Cc: davem, edumazet, kuba, pabeni, ayush.sawal, andrew+netdev, gregkh,
horms, dsahern, pablo, kadlec, steffen.klassert, sdf, mhal,
abhishektamboli9, linux-kernel, linux-staging, netfilter-devel,
coreteam, herbert
Going forward skb_dst_set will assert that skb dst_entry
is empty during skb_dst_set. skb_dst_reset is added to reset
existing entry without doing refcnt. skb_dst_restore should
be used to restore the previous entry. Convert icmp_route_lookup
and ip_options_rcv_srr to these helpers. Add extra call to
skb_dst_drop to icmp_route_lookup to clear the ip_route_input
entry.
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
net/ipv4/icmp.c | 7 ++++---
net/ipv4/ip_options.c | 5 ++---
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 2ffe73ea644f..93a166a7ec8d 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -544,14 +544,15 @@ static struct rtable *icmp_route_lookup(struct net *net, struct flowi4 *fl4,
goto relookup_failed;
}
/* Ugh! */
- orefdst = skb_in->_skb_refdst; /* save old refdst */
- skb_dst_set(skb_in, NULL);
+ orefdst = skb_dst_reset(skb_in);
err = ip_route_input(skb_in, fl4_dec.daddr, fl4_dec.saddr,
dscp, rt2->dst.dev) ? -EINVAL : 0;
dst_release(&rt2->dst);
rt2 = skb_rtable(skb_in);
- skb_in->_skb_refdst = orefdst; /* restore old refdst */
+ /* steal dst entry from skb_in, don't drop refcnt */
+ skb_dst_reset(skb_in);
+ skb_dst_restore(skb_in, orefdst);
}
if (err)
diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c
index e3321932bec0..95f113dc37d8 100644
--- a/net/ipv4/ip_options.c
+++ b/net/ipv4/ip_options.c
@@ -615,14 +615,13 @@ int ip_options_rcv_srr(struct sk_buff *skb, struct net_device *dev)
}
memcpy(&nexthop, &optptr[srrptr-1], 4);
- orefdst = skb->_skb_refdst;
- skb_dst_set(skb, NULL);
+ orefdst = skb_dst_reset(skb);
err = ip_route_input(skb, nexthop, iph->saddr, ip4h_dscp(iph),
dev) ? -EINVAL : 0;
rt2 = skb_rtable(skb);
if (err || (rt2->rt_type != RTN_UNICAST && rt2->rt_type != RTN_LOCAL)) {
skb_dst_drop(skb);
- skb->_skb_refdst = orefdst;
+ skb_dst_restore(skb, orefdst);
return -EINVAL;
}
refdst_drop(orefdst);
--
2.50.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH net-next 5/7] staging: octeon: Convert to skb_dst_drop
2025-08-12 15:52 [PATCH net-next 0/7] net: Convert to skb_dst_reset and skb_dst_restore Stanislav Fomichev
` (3 preceding siblings ...)
2025-08-12 15:52 ` [PATCH net-next 4/7] net: Switch to skb_dst_reset/skb_dst_restore for ip_route_input callers Stanislav Fomichev
@ 2025-08-12 15:52 ` Stanislav Fomichev
2025-08-13 14:20 ` Greg KH
2025-08-12 15:52 ` [PATCH net-next 6/7] chtls: Convert to skb_dst_reset Stanislav Fomichev
` (2 subsequent siblings)
7 siblings, 1 reply; 15+ messages in thread
From: Stanislav Fomichev @ 2025-08-12 15:52 UTC (permalink / raw)
To: netdev
Cc: davem, edumazet, kuba, pabeni, ayush.sawal, andrew+netdev, gregkh,
horms, dsahern, pablo, kadlec, steffen.klassert, sdf, mhal,
abhishektamboli9, linux-kernel, linux-staging, netfilter-devel,
coreteam, herbert
Instead of doing dst_release and skb_dst_set, do skb_dst_drop which
should do the right thing.
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
drivers/staging/octeon/ethernet-tx.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/staging/octeon/ethernet-tx.c b/drivers/staging/octeon/ethernet-tx.c
index 261f8dbdc382..0ba240e634a1 100644
--- a/drivers/staging/octeon/ethernet-tx.c
+++ b/drivers/staging/octeon/ethernet-tx.c
@@ -346,8 +346,7 @@ netdev_tx_t cvm_oct_xmit(struct sk_buff *skb, struct net_device *dev)
* The skbuff will be reused without ever being freed. We must
* cleanup a bunch of core things.
*/
- dst_release(skb_dst(skb));
- skb_dst_set(skb, NULL);
+ skb_dst_drop(skb);
skb_ext_reset(skb);
nf_reset_ct(skb);
skb_reset_redirect(skb);
--
2.50.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH net-next 5/7] staging: octeon: Convert to skb_dst_drop
2025-08-12 15:52 ` [PATCH net-next 5/7] staging: octeon: Convert to skb_dst_drop Stanislav Fomichev
@ 2025-08-13 14:20 ` Greg KH
0 siblings, 0 replies; 15+ messages in thread
From: Greg KH @ 2025-08-13 14:20 UTC (permalink / raw)
To: Stanislav Fomichev
Cc: netdev, davem, edumazet, kuba, pabeni, ayush.sawal, andrew+netdev,
horms, dsahern, pablo, kadlec, steffen.klassert, mhal,
abhishektamboli9, linux-kernel, linux-staging, netfilter-devel,
coreteam, herbert
On Tue, Aug 12, 2025 at 08:52:43AM -0700, Stanislav Fomichev wrote:
> Instead of doing dst_release and skb_dst_set, do skb_dst_drop which
> should do the right thing.
>
> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
> ---
> drivers/staging/octeon/ethernet-tx.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/staging/octeon/ethernet-tx.c b/drivers/staging/octeon/ethernet-tx.c
> index 261f8dbdc382..0ba240e634a1 100644
> --- a/drivers/staging/octeon/ethernet-tx.c
> +++ b/drivers/staging/octeon/ethernet-tx.c
> @@ -346,8 +346,7 @@ netdev_tx_t cvm_oct_xmit(struct sk_buff *skb, struct net_device *dev)
> * The skbuff will be reused without ever being freed. We must
> * cleanup a bunch of core things.
> */
> - dst_release(skb_dst(skb));
> - skb_dst_set(skb, NULL);
> + skb_dst_drop(skb);
> skb_ext_reset(skb);
> nf_reset_ct(skb);
> skb_reset_redirect(skb);
> --
> 2.50.1
>
>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH net-next 6/7] chtls: Convert to skb_dst_reset
2025-08-12 15:52 [PATCH net-next 0/7] net: Convert to skb_dst_reset and skb_dst_restore Stanislav Fomichev
` (4 preceding siblings ...)
2025-08-12 15:52 ` [PATCH net-next 5/7] staging: octeon: Convert to skb_dst_drop Stanislav Fomichev
@ 2025-08-12 15:52 ` Stanislav Fomichev
2025-08-12 15:52 ` [PATCH net-next 7/7] net: Add skb_dst_check_unset Stanislav Fomichev
2025-08-13 7:57 ` [syzbot ci] Re: net: Convert to skb_dst_reset and skb_dst_restore syzbot ci
7 siblings, 0 replies; 15+ messages in thread
From: Stanislav Fomichev @ 2025-08-12 15:52 UTC (permalink / raw)
To: netdev
Cc: davem, edumazet, kuba, pabeni, ayush.sawal, andrew+netdev, gregkh,
horms, dsahern, pablo, kadlec, steffen.klassert, sdf, mhal,
abhishektamboli9, linux-kernel, linux-staging, netfilter-devel,
coreteam, herbert
Going forward skb_dst_set will assert that skb dst_entry
is empty during skb_dst_set. skb_dst_reset is added to reset
existing entry without doing refcnt. Chelsio driver is
doing extra dst management via skb_dst_set(NULL). Replace
these calls with skb_dst_reset.
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
.../ethernet/chelsio/inline_crypto/chtls/chtls_cm.c | 10 +++++-----
.../ethernet/chelsio/inline_crypto/chtls/chtls_cm.h | 4 ++--
.../ethernet/chelsio/inline_crypto/chtls/chtls_io.c | 2 +-
3 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c
index 6f6525983130..b333da3b21bf 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c
+++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c
@@ -171,7 +171,7 @@ static void chtls_purge_receive_queue(struct sock *sk)
struct sk_buff *skb;
while ((skb = __skb_dequeue(&sk->sk_receive_queue)) != NULL) {
- skb_dst_set(skb, (void *)NULL);
+ skb_dst_reset(skb);
kfree_skb(skb);
}
}
@@ -194,7 +194,7 @@ static void chtls_purge_recv_queue(struct sock *sk)
struct sk_buff *skb;
while ((skb = __skb_dequeue(&tlsk->sk_recv_queue)) != NULL) {
- skb_dst_set(skb, NULL);
+ skb_dst_reset(skb);
kfree_skb(skb);
}
}
@@ -1734,7 +1734,7 @@ static int chtls_rx_data(struct chtls_dev *cdev, struct sk_buff *skb)
pr_err("can't find conn. for hwtid %u.\n", hwtid);
return -EINVAL;
}
- skb_dst_set(skb, NULL);
+ skb_dst_reset(skb);
process_cpl_msg(chtls_recv_data, sk, skb);
return 0;
}
@@ -1786,7 +1786,7 @@ static int chtls_rx_pdu(struct chtls_dev *cdev, struct sk_buff *skb)
pr_err("can't find conn. for hwtid %u.\n", hwtid);
return -EINVAL;
}
- skb_dst_set(skb, NULL);
+ skb_dst_reset(skb);
process_cpl_msg(chtls_recv_pdu, sk, skb);
return 0;
}
@@ -1855,7 +1855,7 @@ static int chtls_rx_cmp(struct chtls_dev *cdev, struct sk_buff *skb)
pr_err("can't find conn. for hwtid %u.\n", hwtid);
return -EINVAL;
}
- skb_dst_set(skb, NULL);
+ skb_dst_reset(skb);
process_cpl_msg(chtls_rx_hdr, sk, skb);
return 0;
diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.h b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.h
index f61ca657601c..4ca919925455 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.h
+++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.h
@@ -171,14 +171,14 @@ static inline void chtls_set_req_addr(struct request_sock *oreq,
static inline void chtls_free_skb(struct sock *sk, struct sk_buff *skb)
{
- skb_dst_set(skb, NULL);
+ skb_dst_reset(skb);
__skb_unlink(skb, &sk->sk_receive_queue);
__kfree_skb(skb);
}
static inline void chtls_kfree_skb(struct sock *sk, struct sk_buff *skb)
{
- skb_dst_set(skb, NULL);
+ skb_dst_reset(skb);
__skb_unlink(skb, &sk->sk_receive_queue);
kfree_skb(skb);
}
diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c
index 465fa8077964..85e4d90efd5b 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c
+++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c
@@ -1434,7 +1434,7 @@ static int chtls_pt_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
continue;
found_ok_skb:
if (!skb->len) {
- skb_dst_set(skb, NULL);
+ skb_dst_reset(skb);
__skb_unlink(skb, &sk->sk_receive_queue);
kfree_skb(skb);
--
2.50.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH net-next 7/7] net: Add skb_dst_check_unset
2025-08-12 15:52 [PATCH net-next 0/7] net: Convert to skb_dst_reset and skb_dst_restore Stanislav Fomichev
` (5 preceding siblings ...)
2025-08-12 15:52 ` [PATCH net-next 6/7] chtls: Convert to skb_dst_reset Stanislav Fomichev
@ 2025-08-12 15:52 ` Stanislav Fomichev
2025-08-13 7:57 ` [syzbot ci] Re: net: Convert to skb_dst_reset and skb_dst_restore syzbot ci
7 siblings, 0 replies; 15+ messages in thread
From: Stanislav Fomichev @ 2025-08-12 15:52 UTC (permalink / raw)
To: netdev
Cc: davem, edumazet, kuba, pabeni, ayush.sawal, andrew+netdev, gregkh,
horms, dsahern, pablo, kadlec, steffen.klassert, sdf, mhal,
abhishektamboli9, linux-kernel, linux-staging, netfilter-devel,
coreteam, herbert
To prevent dst_entry leaks, add warning when the non-NULL dst_entry
is rewritten.
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
include/linux/skbuff.h | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 8240e0826204..2f9dac54d627 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1159,6 +1159,12 @@ static inline struct dst_entry *skb_dst(const struct sk_buff *skb)
return (struct dst_entry *)(skb->_skb_refdst & SKB_DST_PTRMASK);
}
+static inline void skb_dst_check_unset(struct sk_buff *skb)
+{
+ DEBUG_NET_WARN_ON_ONCE((skb->_skb_refdst & SKB_DST_PTRMASK) &&
+ !(skb->_skb_refdst & SKB_DST_NOREF));
+}
+
/**
* skb_dst_reset() - return current dst_entry value and clear it
* @skb: buffer
@@ -1187,6 +1193,7 @@ static inline unsigned long skb_dst_reset(struct sk_buff *skb)
*/
static inline void skb_dst_restore(struct sk_buff *skb, unsigned long refdst)
{
+ skb_dst_check_unset(skb);
skb->_skb_refdst = refdst;
}
@@ -1200,6 +1207,7 @@ static inline void skb_dst_restore(struct sk_buff *skb, unsigned long refdst)
*/
static inline void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)
{
+ skb_dst_check_unset(skb);
skb->slow_gro |= !!dst;
skb->_skb_refdst = (unsigned long)dst;
}
@@ -1216,6 +1224,7 @@ static inline void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)
*/
static inline void skb_dst_set_noref(struct sk_buff *skb, struct dst_entry *dst)
{
+ skb_dst_check_unset(skb);
WARN_ON(!rcu_read_lock_held() && !rcu_read_lock_bh_held());
skb->slow_gro |= !!dst;
skb->_skb_refdst = (unsigned long)dst | SKB_DST_NOREF;
--
2.50.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [syzbot ci] Re: net: Convert to skb_dst_reset and skb_dst_restore
2025-08-12 15:52 [PATCH net-next 0/7] net: Convert to skb_dst_reset and skb_dst_restore Stanislav Fomichev
` (6 preceding siblings ...)
2025-08-12 15:52 ` [PATCH net-next 7/7] net: Add skb_dst_check_unset Stanislav Fomichev
@ 2025-08-13 7:57 ` syzbot ci
2025-08-13 10:01 ` Florian Westphal
7 siblings, 1 reply; 15+ messages in thread
From: syzbot ci @ 2025-08-13 7:57 UTC (permalink / raw)
To: abhishektamboli9, andrew, ayush.sawal, coreteam, davem, dsahern,
edumazet, gregkh, herbert, horms, kadlec, kuba, linux-kernel,
linux-staging, mhal, netdev, netfilter-devel, pabeni, pablo, sdf,
steffen.klassert
Cc: syzbot, syzkaller-bugs
syzbot ci has tested the following series
[v1] net: Convert to skb_dst_reset and skb_dst_restore
https://lore.kernel.org/all/20250812155245.507012-1-sdf@fomichev.me
* [PATCH net-next 1/7] net: Add skb_dst_reset and skb_dst_restore
* [PATCH net-next 2/7] xfrm: Switch to skb_dst_reset to clear dst_entry
* [PATCH net-next 3/7] netfilter: Switch to skb_dst_reset to clear dst_entry
* [PATCH net-next 4/7] net: Switch to skb_dst_reset/skb_dst_restore for ip_route_input callers
* [PATCH net-next 5/7] staging: octeon: Convert to skb_dst_drop
* [PATCH net-next 6/7] chtls: Convert to skb_dst_reset
* [PATCH net-next 7/7] net: Add skb_dst_check_unset
and found the following issue:
WARNING in nf_reject_fill_skb_dst
Full report is available here:
https://ci.syzbot.org/series/dcc132bc-db4e-4a0b-a95c-9960b8a48d10
***
WARNING in nf_reject_fill_skb_dst
tree: net-next
URL: https://kernel.googlesource.com/pub/scm/linux/kernel/git/netdev/net-next.git
base: 37816488247ddddbc3de113c78c83572274b1e2e
arch: amd64
compiler: Debian clang version 20.1.7 (++20250616065708+6146a88f6049-1~exp1~20250616065826.132), Debian LLD 20.1.7
config: https://ci.syzbot.org/builds/2aac43d4-6d95-49c6-8e28-28941fdb3117/config
C repro: https://ci.syzbot.org/findings/0af27fa7-d0b6-43d6-8965-58fbc54ca186/c_repro
syz repro: https://ci.syzbot.org/findings/0af27fa7-d0b6-43d6-8965-58fbc54ca186/syz_repro
------------[ cut here ]------------
WARNING: CPU: 1 PID: 5901 at ./include/linux/skbuff.h:1165 skb_dst_check_unset include/linux/skbuff.h:1164 [inline]
WARNING: CPU: 1 PID: 5901 at ./include/linux/skbuff.h:1165 skb_dst_set include/linux/skbuff.h:1210 [inline]
WARNING: CPU: 1 PID: 5901 at ./include/linux/skbuff.h:1165 nf_reject_fill_skb_dst+0x2a4/0x330 net/ipv4/netfilter/nf_reject_ipv4.c:234
Modules linked in:
CPU: 1 UID: 0 PID: 5901 Comm: kworker/u8:3 Not tainted 6.16.0-syzkaller-12063-g37816488247d-dirty #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Workqueue: ipv6_addrconf addrconf_dad_work
RIP: 0010:skb_dst_check_unset include/linux/skbuff.h:1164 [inline]
RIP: 0010:skb_dst_set include/linux/skbuff.h:1210 [inline]
RIP: 0010:nf_reject_fill_skb_dst+0x2a4/0x330 net/ipv4/netfilter/nf_reject_ipv4.c:234
Code: 8b 0d 60 b1 8b 08 48 3b 8c 24 e0 00 00 00 75 5d 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d e9 03 8d 67 01 cc e8 cd 6c ab f7 90 <0f> 0b 90 e9 38 ff ff ff 44 89 f9 80 e1 07 fe c1 38 c1 0f 8c 2b fe
RSP: 0018:ffffc900001e0360 EFLAGS: 00010246
RAX: ffffffff8a143ee3 RBX: ffff888110b91200 RCX: ffff88810a3d8000
RDX: 0000000000000100 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffc900001e0490 R08: ffffffff8fa34737 R09: 1ffffffff1f468e6
R10: dffffc0000000000 R11: fffffbfff1f468e7 R12: ffff888011c5c101
R13: dffffc0000000001 R14: 1ffff9200003c070 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8881a3c21000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000c003cc5000 CR3: 000000010fc16000 CR4: 00000000000006f0
Call Trace:
<IRQ>
nf_send_unreach+0x17b/0x6e0 net/ipv4/netfilter/nf_reject_ipv4.c:325
nft_reject_inet_eval+0x4bc/0x690 net/netfilter/nft_reject_inet.c:27
expr_call_ops_eval net/netfilter/nf_tables_core.c:237 [inline]
nft_do_chain+0x40c/0x1920 net/netfilter/nf_tables_core.c:285
nft_do_chain_inet+0x25d/0x340 net/netfilter/nft_chain_filter.c:161
nf_hook_entry_hookfn include/linux/netfilter.h:158 [inline]
nf_hook_slow+0xc5/0x220 net/netfilter/core.c:623
nf_hook include/linux/netfilter.h:273 [inline]
NF_HOOK+0x206/0x3a0 include/linux/netfilter.h:316
__netif_receive_skb_one_core net/core/dev.c:5979 [inline]
__netif_receive_skb+0x143/0x380 net/core/dev.c:6092
process_backlog+0x60e/0x14f0 net/core/dev.c:6444
__napi_poll+0xc7/0x360 net/core/dev.c:7489
napi_poll net/core/dev.c:7552 [inline]
net_rx_action+0x707/0xe30 net/core/dev.c:7679
handle_softirqs+0x286/0x870 kernel/softirq.c:579
do_softirq+0xec/0x180 kernel/softirq.c:480
</IRQ>
<TASK>
__local_bh_enable_ip+0x17d/0x1c0 kernel/softirq.c:407
local_bh_enable include/linux/bottom_half.h:33 [inline]
rcu_read_unlock_bh include/linux/rcupdate.h:910 [inline]
__dev_queue_xmit+0x1d79/0x3b50 net/core/dev.c:4740
neigh_output include/net/neighbour.h:547 [inline]
ip6_finish_output2+0x11fe/0x16a0 net/ipv6/ip6_output.c:141
NF_HOOK include/linux/netfilter.h:318 [inline]
ndisc_send_skb+0xb96/0x1470 net/ipv6/ndisc.c:512
ndisc_send_ns+0xcb/0x150 net/ipv6/ndisc.c:670
addrconf_dad_work+0xaae/0x14b0 net/ipv6/addrconf.c:4282
process_one_work kernel/workqueue.c:3236 [inline]
process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3319
worker_thread+0x8a0/0xda0 kernel/workqueue.c:3400
kthread+0x711/0x8a0 kernel/kthread.c:463
ret_from_fork+0x3fc/0x770 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
***
If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
Tested-by: syzbot@syzkaller.appspotmail.com
---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzkaller@googlegroups.com.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [syzbot ci] Re: net: Convert to skb_dst_reset and skb_dst_restore
2025-08-13 7:57 ` [syzbot ci] Re: net: Convert to skb_dst_reset and skb_dst_restore syzbot ci
@ 2025-08-13 10:01 ` Florian Westphal
0 siblings, 0 replies; 15+ messages in thread
From: Florian Westphal @ 2025-08-13 10:01 UTC (permalink / raw)
To: syzbot ci
Cc: abhishektamboli9, andrew, ayush.sawal, coreteam, davem, dsahern,
edumazet, gregkh, herbert, horms, kadlec, kuba, linux-kernel,
linux-staging, mhal, netdev, netfilter-devel, pabeni, pablo, sdf,
steffen.klassert, syzbot, syzkaller-bugs
syzbot ci <syzbot+ci1b726090b21fedf1@syzkaller.appspotmail.com> wrote:
> syzbot ci has tested the following series
>
> [v1] net: Convert to skb_dst_reset and skb_dst_restore
> https://lore.kernel.org/all/20250812155245.507012-1-sdf@fomichev.me
> * [PATCH net-next 1/7] net: Add skb_dst_reset and skb_dst_restore
> * [PATCH net-next 2/7] xfrm: Switch to skb_dst_reset to clear dst_entry
> * [PATCH net-next 3/7] netfilter: Switch to skb_dst_reset to clear dst_entry
> * [PATCH net-next 4/7] net: Switch to skb_dst_reset/skb_dst_restore for ip_route_input callers
> * [PATCH net-next 5/7] staging: octeon: Convert to skb_dst_drop
> * [PATCH net-next 6/7] chtls: Convert to skb_dst_reset
> * [PATCH net-next 7/7] net: Add skb_dst_check_unset
>
> and found the following issue:
> WARNING in nf_reject_fill_skb_dst
Looks like a bug uncovered by this series.
I'll have a look.
^ permalink raw reply [flat|nested] 15+ messages in thread