netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net] ipv4: route: Prevent rt_bind_exception() from rebinding stale fnhe
@ 2025-11-04  3:09 chuang
  2025-11-05 14:21 ` Ido Schimmel
  2025-11-05 14:26 ` Eric Dumazet
  0 siblings, 2 replies; 8+ messages in thread
From: chuang @ 2025-11-04  3:09 UTC (permalink / raw)
  Cc: Chuang Wang, stable, David S. Miller, David Ahern, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Networking, open list

From 35dbc9abd8da820007391b707bd2c1a9c99ee67d Mon Sep 17 00:00:00 2001
From: Chuang Wang <nashuiliang@gmail.com>
Date: Tue, 4 Nov 2025 02:52:11 +0000
Subject: [PATCH net] ipv4: route: Prevent rt_bind_exception() from rebinding
 stale fnhe

A race condition exists between fnhe_remove_oldest() and
rt_bind_exception() where a fnhe that is scheduled for removal can be
rebound to a new dst.

The issue occurs when fnhe_remove_oldest() selects an fnhe (fnheX)
for deletion, but before it can be flushed and freed via RCU,
CPU 0 enters rt_bind_exception() and attempts to reuse the entry.

CPU 0                             CPU 1
__mkroute_output()
  find_exception() [fnheX]
                                  update_or_create_fnhe()
                                    fnhe_remove_oldest() [fnheX]
  rt_bind_exception() [bind dst]
                                  RCU callback [fnheX freed, dst leak]

If rt_bind_exception() successfully binds fnheX to a new dst, the
newly bound dst will never be properly freed because fnheX will
soon be released by the RCU callback, leading to a permanent
reference count leak on the old dst and the device.

This issue manifests as a device reference count leak and a
warning in dmesg when unregistering the net device:

  unregister_netdevice: waiting for ethX to become free. Usage count = N

Fix this race by clearing 'oldest->fnhe_daddr' before calling
fnhe_flush_routes(). Since rt_bind_exception() checks this field,
setting it to zero prevents the stale fnhe from being reused and
bound to a new dst just before it is freed.

Cc: stable@vger.kernel.org
Fixes: 67d6d681e15b ("ipv4: make exception cache less predictible")
Signed-off-by: Chuang Wang <nashuiliang@gmail.com>
---
 net/ipv4/route.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 6d27d3610c1c..b549d6a57307 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -607,6 +607,11 @@ static void fnhe_remove_oldest(struct
fnhe_hash_bucket *hash)
                        oldest_p = fnhe_p;
                }
        }
+
+       /* Clear oldest->fnhe_daddr to prevent this fnhe from being
+        * rebound with new dsts in rt_bind_exception().
+        */
+       oldest->fnhe_daddr = 0;
        fnhe_flush_routes(oldest);
        *oldest_p = oldest->fnhe_next;
        kfree_rcu(oldest, rcu);
--

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-11-09 15:48 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-04  3:09 [PATCH net] ipv4: route: Prevent rt_bind_exception() from rebinding stale fnhe chuang
2025-11-05 14:21 ` Ido Schimmel
2025-11-05 14:26 ` Eric Dumazet
2025-11-05 15:34   ` Ido Schimmel
2025-11-05 15:50     ` Eric Dumazet
2025-11-06  0:31     ` chuang
2025-11-07  9:53       ` chuang
2025-11-09 15:48         ` Ido Schimmel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).