Netdev List
 help / color / mirror / Atom feed
* [PATCH net 0/3] ipv4/ipv6: Fix UAF and memory leak in IGMP/MLD
@ 2026-07-04 19:43 Eric Dumazet
  2026-07-04 19:43 ` [PATCH net 1/3] ipv4: igmp: Fix potential UAF in igmp_gq_start_timer() Eric Dumazet
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Eric Dumazet @ 2026-07-04 19:43 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Ido Schimmel, David Ahern,
	netdev, eric.dumazet, Eric Dumazet

This series addresses two potential UAF vulnerabilities
and one memory leak in the IPv4 IGMP and IPv6 MLD subsystems.

The first two patches fix a UAF where the packet receive path races with
device teardown. If the device refcount has already hit 0 (but the memory
is still held by RCU), incoming IGMP/MLD packets trying to schedule delayed
work or timers would call refcount_inc() on the 0 refcount, triggering a
warning and eventually leading to a UAF when the work runs after the device
has been freed. This is fixed by introducing safe hold helpers using
refcount_inc_not_zero().

The third patch fixes a memory leak in IPv4 IGMP timer modification. When
a timer is deleted and not re-armed, the code dropped the group refcount
using refcount_dec(). However, if the group was concurrently removed from
the list, this decrement could drop the refcount to 0 without triggering
the cleanup/free path, leaking the group structure. This is fixed by using
ip_ma_put() instead, and deferring the put until after the lock is released.

Eric Dumazet (3):
  ipv4: igmp: Fix potential UAF in igmp_gq_start_timer()
  ipv6: mcast: Fix potential UAF in MLD delayed work
  ipv4: igmp: Fix potential memory leak in igmp_mod_timer()

 include/linux/inetdevice.h |  5 +++++
 include/net/addrconf.h     |  5 +++++
 net/ipv4/igmp.c            | 21 +++++++++++++++------
 net/ipv6/mcast.c           | 38 ++++++++++++++++++++++++++++----------
 4 files changed, 53 insertions(+), 16 deletions(-)

-- 
2.55.0.rc0.799.gd6f94ed593-goog


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH net 1/3] ipv4: igmp: Fix potential UAF in igmp_gq_start_timer()
  2026-07-04 19:43 [PATCH net 0/3] ipv4/ipv6: Fix UAF and memory leak in IGMP/MLD Eric Dumazet
@ 2026-07-04 19:43 ` Eric Dumazet
  2026-07-05 11:33   ` Ido Schimmel
  2026-07-04 19:43 ` [PATCH net 2/3] ipv6: mcast: Fix potential UAF in MLD delayed work Eric Dumazet
  2026-07-04 19:43 ` [PATCH net 3/3] ipv4: igmp: Fix potential memory leak in igmp_mod_timer() Eric Dumazet
  2 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2026-07-04 19:43 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Ido Schimmel, David Ahern,
	netdev, eric.dumazet, Eric Dumazet, Zero Day Initiative

A race condition exists between device teardown (inetdev_destroy) and
incoming IGMP query processing (igmp_rcv), leading to a Use-After-Free
in the IGMP timer callback.

During device destruction, inetdev_destroy() drops the primary reference
to in_device, which can drop its refcount to 0. The actual freeing of
in_device memory is deferred via RCU (using call_rcu()).

Concurrently, igmp_rcv() runs under RCU read lock and obtains the
in_device pointer. Because the memory is RCU-protected, CPU-0 can safely
dereference in_device even if its refcount has hit 0.

However, if CPU-0 calls igmp_gq_start_timer() and re-arms the timer, it
attempts to acquire a reference using in_dev_hold(). This increments the
refcount from 0 to 1, triggering a "refcount_t: addition on 0" warning.
Since the in_device memory is still scheduled to be freed after the RCU
grace period (as the free callback does not check the refcount again),
the device is freed while the timer is still armed. When the timer
expires, it accesses the freed memory, causing a kernel panic.

Fix this by using refcount_inc_not_zero() (via a new helper
in_dev_hold_safe()) to prevent acquiring a reference if the device is
already being destroyed. If the refcount is 0, we do not arm the timer.

A similar issue in IPv6 MLD is fixed in a subsequent patch.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Zero Day Initiative <zdi-disclosures@trendmicro.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/linux/inetdevice.h |  5 +++++
 net/ipv4/igmp.c            | 14 +++++++++-----
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/include/linux/inetdevice.h b/include/linux/inetdevice.h
index dccbeb25f70141982160c776f3cc727296def2b7..6032eea2539a60d0476d85667bda3bfcad8f1425 100644
--- a/include/linux/inetdevice.h
+++ b/include/linux/inetdevice.h
@@ -293,6 +293,11 @@ static inline void in_dev_put(struct in_device *idev)
 #define __in_dev_put(idev)  refcount_dec(&(idev)->refcnt)
 #define in_dev_hold(idev)   refcount_inc(&(idev)->refcnt)
 
+static inline bool in_dev_hold_safe(struct in_device *idev)
+{
+	return refcount_inc_not_zero(&idev->refcnt);
+}
+
 #endif /* __KERNEL__ */
 
 static __inline__ __be32 inet_make_mask(int logmask)
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index b6337a47c1418589bf8ef31e00fd98b6187f5271..f5f9763895641bf86bfcf9fd7fd7b06012fa4ece 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -248,16 +248,20 @@ static void igmp_gq_start_timer(struct in_device *in_dev)
 		return;
 
 	in_dev->mr_gq_running = 1;
-	if (!mod_timer(&in_dev->mr_gq_timer, exp))
-		in_dev_hold(in_dev);
+	if (in_dev_hold_safe(in_dev)) {
+		if (mod_timer(&in_dev->mr_gq_timer, exp))
+			in_dev_put(in_dev);
+	}
 }
 
 static void igmp_ifc_start_timer(struct in_device *in_dev, int delay)
 {
-	int tv = get_random_u32_below(delay);
+	if (in_dev_hold_safe(in_dev)) {
+		int tv = get_random_u32_below(delay);
 
-	if (!mod_timer(&in_dev->mr_ifc_timer, jiffies+tv+2))
-		in_dev_hold(in_dev);
+		if (mod_timer(&in_dev->mr_ifc_timer, jiffies + tv + 2))
+			in_dev_put(in_dev);
+	}
 }
 
 static void igmp_mod_timer(struct ip_mc_list *im, int max_delay)
-- 
2.55.0.rc0.799.gd6f94ed593-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net 2/3] ipv6: mcast: Fix potential UAF in MLD delayed work
  2026-07-04 19:43 [PATCH net 0/3] ipv4/ipv6: Fix UAF and memory leak in IGMP/MLD Eric Dumazet
  2026-07-04 19:43 ` [PATCH net 1/3] ipv4: igmp: Fix potential UAF in igmp_gq_start_timer() Eric Dumazet
@ 2026-07-04 19:43 ` Eric Dumazet
  2026-07-05 10:53   ` Ido Schimmel
  2026-07-04 19:43 ` [PATCH net 3/3] ipv4: igmp: Fix potential memory leak in igmp_mod_timer() Eric Dumazet
  2 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2026-07-04 19:43 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Ido Schimmel, David Ahern,
	netdev, eric.dumazet, Eric Dumazet

A race condition exists between device teardown and incoming MLD query
processing, leading to a Use-After-Free in the MLD delayed work.

During device destruction, the primary reference to inet6_dev is dropped,
which can drop its refcount to 0. The actual freeing of inet6_dev memory
is deferred via RCU.

Concurrently, the packet receive path runs under RCU read lock and obtains
the inet6_dev pointer. Because the memory is RCU-protected, CPU-0 can
safely dereference inet6_dev even if its refcount has hit 0.

However, if CPU-0 calls igmp6_event_query() and schedules delayed work, it
attempts to acquire a reference using in6_dev_hold(). This increments the
refcount from 0 to 1, triggering a "refcount_t: addition on 0" warning.
Since the inet6_dev memory is still scheduled to be freed after the RCU
grace period, the device is freed while the work is still scheduled.
When the work runs, it accesses the freed memory, causing a kernel panic.

Fix this by using refcount_inc_not_zero() (via a new helper
in6_dev_hold_safe()) to prevent acquiring a reference if the device is
already being destroyed. If the refcount is 0, we do not schedule the work.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/addrconf.h |  5 +++++
 net/ipv6/mcast.c       | 38 ++++++++++++++++++++++++++++----------
 2 files changed, 33 insertions(+), 10 deletions(-)

diff --git a/include/net/addrconf.h b/include/net/addrconf.h
index 539bbbe54b14e8108ff7304d7a08bc605655cc31..8ced27a8229b6e0580f934be2223676cc123307b 100644
--- a/include/net/addrconf.h
+++ b/include/net/addrconf.h
@@ -446,6 +446,11 @@ static inline void in6_dev_hold(struct inet6_dev *idev)
 	refcount_inc(&idev->refcnt);
 }
 
+static inline bool in6_dev_hold_safe(struct inet6_dev *idev)
+{
+	return refcount_inc_not_zero(&idev->refcnt);
+}
+
 /* called with rcu_read_lock held */
 static inline bool ip6_ignore_linkdown(const struct net_device *dev)
 {
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 04b811b3be978e36b0fd4ecd8312313d73ed2ed9..7c3f739cf7638452f311c64331c89253361b34f9 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -1083,8 +1083,10 @@ static void mld_gq_start_work(struct inet6_dev *idev)
 	mc_assert_locked(idev);
 
 	idev->mc_gq_running = 1;
-	if (!mod_delayed_work(mld_wq, &idev->mc_gq_work, tv + 2))
-		in6_dev_hold(idev);
+	if (in6_dev_hold_safe(idev)) {
+		if (mod_delayed_work(mld_wq, &idev->mc_gq_work, tv + 2))
+			in6_dev_put(idev);
+	}
 }
 
 static void mld_gq_stop_work(struct inet6_dev *idev)
@@ -1102,8 +1104,10 @@ static void mld_ifc_start_work(struct inet6_dev *idev, unsigned long delay)
 
 	mc_assert_locked(idev);
 
-	if (!mod_delayed_work(mld_wq, &idev->mc_ifc_work, tv + 2))
-		in6_dev_hold(idev);
+	if (in6_dev_hold_safe(idev)) {
+		if (mod_delayed_work(mld_wq, &idev->mc_ifc_work, tv + 2))
+			in6_dev_put(idev);
+	}
 }
 
 static void mld_ifc_stop_work(struct inet6_dev *idev)
@@ -1121,8 +1125,10 @@ static void mld_dad_start_work(struct inet6_dev *idev, unsigned long delay)
 
 	mc_assert_locked(idev);
 
-	if (!mod_delayed_work(mld_wq, &idev->mc_dad_work, tv + 2))
-		in6_dev_hold(idev);
+	if (in6_dev_hold_safe(idev)) {
+		if (mod_delayed_work(mld_wq, &idev->mc_dad_work, tv + 2))
+			in6_dev_put(idev);
+	}
 }
 
 static void mld_dad_stop_work(struct inet6_dev *idev)
@@ -1395,6 +1401,7 @@ static void mld_process_v2(struct inet6_dev *idev, struct mld2_query *mld,
 void igmp6_event_query(struct sk_buff *skb)
 {
 	struct inet6_dev *idev = __in6_dev_get(skb->dev);
+	bool put = false;
 
 	if (!idev || idev->dead)
 		goto out;
@@ -1402,11 +1409,16 @@ void igmp6_event_query(struct sk_buff *skb)
 	spin_lock_bh(&idev->mc_query_lock);
 	if (skb_queue_len(&idev->mc_query_queue) < MLD_MAX_SKBS) {
 		__skb_queue_tail(&idev->mc_query_queue, skb);
-		if (!mod_delayed_work(mld_wq, &idev->mc_query_work, 0))
-			in6_dev_hold(idev);
+		if (in6_dev_hold_safe(idev)) {
+			if (mod_delayed_work(mld_wq, &idev->mc_query_work, 0))
+				put = true;
+		}
 		skb = NULL;
 	}
 	spin_unlock_bh(&idev->mc_query_lock);
+
+	if (put)
+		in6_dev_put(idev);
 out:
 	kfree_skb(skb);
 }
@@ -1570,6 +1582,7 @@ static void mld_query_work(struct work_struct *work)
 void igmp6_event_report(struct sk_buff *skb)
 {
 	struct inet6_dev *idev = __in6_dev_get(skb->dev);
+	bool put = false;
 
 	if (!idev || idev->dead)
 		goto out;
@@ -1577,11 +1590,16 @@ void igmp6_event_report(struct sk_buff *skb)
 	spin_lock_bh(&idev->mc_report_lock);
 	if (skb_queue_len(&idev->mc_report_queue) < MLD_MAX_SKBS) {
 		__skb_queue_tail(&idev->mc_report_queue, skb);
-		if (!mod_delayed_work(mld_wq, &idev->mc_report_work, 0))
-			in6_dev_hold(idev);
+		if (in6_dev_hold_safe(idev)) {
+			if (mod_delayed_work(mld_wq, &idev->mc_report_work, 0))
+				put = true;
+		}
 		skb = NULL;
 	}
 	spin_unlock_bh(&idev->mc_report_lock);
+
+	if (put)
+		in6_dev_put(idev);
 out:
 	kfree_skb(skb);
 }
-- 
2.55.0.rc0.799.gd6f94ed593-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH net 3/3] ipv4: igmp: Fix potential memory leak in igmp_mod_timer()
  2026-07-04 19:43 [PATCH net 0/3] ipv4/ipv6: Fix UAF and memory leak in IGMP/MLD Eric Dumazet
  2026-07-04 19:43 ` [PATCH net 1/3] ipv4: igmp: Fix potential UAF in igmp_gq_start_timer() Eric Dumazet
  2026-07-04 19:43 ` [PATCH net 2/3] ipv6: mcast: Fix potential UAF in MLD delayed work Eric Dumazet
@ 2026-07-04 19:43 ` Eric Dumazet
  2026-07-05 11:58   ` Ido Schimmel
  2 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2026-07-04 19:43 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski, Paolo Abeni
  Cc: Simon Horman, Kuniyuki Iwashima, Ido Schimmel, David Ahern,
	netdev, eric.dumazet, Eric Dumazet

When a timer is deleted and not re-armed in igmp_mod_timer(), the code
currently decrements the reference counter of the multicast list entry
@im using refcount_dec(&im->refcnt).

However, igmp_mod_timer() can be called from the RCU reader path (e.g., in
igmp_heard_query() via for_each_pmc_rcu()). If the group im was
concurrently removed from the list by ip_mc_dec_group(), its reference count
might have already been decremented to 1.

In this case, timer_delete() succeeds, and refcount_dec() decrements
the refcount from 1 to 0. Since refcount_dec() does not free the object
when it hits 0 (unlike ip_ma_put()), the im structure is leaked.

Fix this by using ip_ma_put(im) instead of refcount_dec(&im->refcnt).

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/igmp.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index f5f9763895641bf86bfcf9fd7fd7b06012fa4ece..2170b33ba147ce4990e3ee71ba4868e8696b00cb 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -266,6 +266,8 @@ static void igmp_ifc_start_timer(struct in_device *in_dev, int delay)
 
 static void igmp_mod_timer(struct ip_mc_list *im, int max_delay)
 {
+	bool put = false;
+
 	spin_lock_bh(&im->lock);
 	im->unsolicit_count = 0;
 	if (timer_delete(&im->timer)) {
@@ -275,10 +277,13 @@ static void igmp_mod_timer(struct ip_mc_list *im, int max_delay)
 			spin_unlock_bh(&im->lock);
 			return;
 		}
-		refcount_dec(&im->refcnt);
+		put = true;
 	}
 	igmp_start_timer(im, max_delay);
 	spin_unlock_bh(&im->lock);
+
+	if (put)
+		ip_ma_put(im);
 }
 
 
-- 
2.55.0.rc0.799.gd6f94ed593-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH net 2/3] ipv6: mcast: Fix potential UAF in MLD delayed work
  2026-07-04 19:43 ` [PATCH net 2/3] ipv6: mcast: Fix potential UAF in MLD delayed work Eric Dumazet
@ 2026-07-05 10:53   ` Ido Schimmel
  2026-07-05 13:58     ` Eric Dumazet
  0 siblings, 1 reply; 9+ messages in thread
From: Ido Schimmel @ 2026-07-05 10:53 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Kuniyuki Iwashima, David Ahern, netdev, eric.dumazet

On Sat, Jul 04, 2026 at 07:43:45PM +0000, Eric Dumazet wrote:
> A race condition exists between device teardown and incoming MLD query
> processing, leading to a Use-After-Free in the MLD delayed work.
> 
> During device destruction, the primary reference to inet6_dev is dropped,
> which can drop its refcount to 0. The actual freeing of inet6_dev memory
> is deferred via RCU.
> 
> Concurrently, the packet receive path runs under RCU read lock and obtains
> the inet6_dev pointer. Because the memory is RCU-protected, CPU-0 can
> safely dereference inet6_dev even if its refcount has hit 0.
> 
> However, if CPU-0 calls igmp6_event_query() and schedules delayed work, it
> attempts to acquire a reference using in6_dev_hold(). This increments the
> refcount from 0 to 1, triggering a "refcount_t: addition on 0" warning.
> Since the inet6_dev memory is still scheduled to be freed after the RCU
> grace period, the device is freed while the work is still scheduled.
> When the work runs, it accesses the freed memory, causing a kernel panic.
> 
> Fix this by using refcount_inc_not_zero() (via a new helper
> in6_dev_hold_safe()) to prevent acquiring a reference if the device is
> already being destroyed. If the refcount is 0, we do not schedule the work.
> 
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Eric, thanks for taking care of this!

> ---
>  include/net/addrconf.h |  5 +++++
>  net/ipv6/mcast.c       | 38 ++++++++++++++++++++++++++++----------
>  2 files changed, 33 insertions(+), 10 deletions(-)

[...]

> @@ -1395,6 +1401,7 @@ static void mld_process_v2(struct inet6_dev *idev, struct mld2_query *mld,
>  void igmp6_event_query(struct sk_buff *skb)
>  {
>  	struct inet6_dev *idev = __in6_dev_get(skb->dev);
> +	bool put = false;
>  
>  	if (!idev || idev->dead)
>  		goto out;
> @@ -1402,11 +1409,16 @@ void igmp6_event_query(struct sk_buff *skb)
>  	spin_lock_bh(&idev->mc_query_lock);
>  	if (skb_queue_len(&idev->mc_query_queue) < MLD_MAX_SKBS) {
>  		__skb_queue_tail(&idev->mc_query_queue, skb);

Shouldn't we only enqueue the skb if we managed to take a reference?
Something like [1] (on top of this patch).

If we failed to take a reference, then ipv6_mc_destroy_dev() already ran
and the queue will not be purged, thereby leaking this skb.

> -		if (!mod_delayed_work(mld_wq, &idev->mc_query_work, 0))
> -			in6_dev_hold(idev);
> +		if (in6_dev_hold_safe(idev)) {
> +			if (mod_delayed_work(mld_wq, &idev->mc_query_work, 0))
> +				put = true;
> +		}
>  		skb = NULL;
>  	}
>  	spin_unlock_bh(&idev->mc_query_lock);
> +
> +	if (put)
> +		in6_dev_put(idev);
>  out:
>  	kfree_skb(skb);
>  }
> @@ -1570,6 +1582,7 @@ static void mld_query_work(struct work_struct *work)
>  void igmp6_event_report(struct sk_buff *skb)
>  {
>  	struct inet6_dev *idev = __in6_dev_get(skb->dev);
> +	bool put = false;
>  
>  	if (!idev || idev->dead)
>  		goto out;
> @@ -1577,11 +1590,16 @@ void igmp6_event_report(struct sk_buff *skb)
>  	spin_lock_bh(&idev->mc_report_lock);
>  	if (skb_queue_len(&idev->mc_report_queue) < MLD_MAX_SKBS) {
>  		__skb_queue_tail(&idev->mc_report_queue, skb);

Same here

> -		if (!mod_delayed_work(mld_wq, &idev->mc_report_work, 0))
> -			in6_dev_hold(idev);
> +		if (in6_dev_hold_safe(idev)) {
> +			if (mod_delayed_work(mld_wq, &idev->mc_report_work, 0))
> +				put = true;
> +		}
>  		skb = NULL;
>  	}
>  	spin_unlock_bh(&idev->mc_report_lock);
> +
> +	if (put)
> +		in6_dev_put(idev);
>  out:
>  	kfree_skb(skb);
>  }

[1]
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 12d22de6f496..aaba4c2aae23 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -1408,12 +1408,11 @@ void igmp6_event_query(struct sk_buff *skb)
 		goto out;
 
 	spin_lock_bh(&idev->mc_query_lock);
-	if (skb_queue_len(&idev->mc_query_queue) < MLD_MAX_SKBS) {
+	if (skb_queue_len(&idev->mc_query_queue) < MLD_MAX_SKBS &&
+	    in6_dev_hold_safe(idev)) {
 		__skb_queue_tail(&idev->mc_query_queue, skb);
-		if (in6_dev_hold_safe(idev)) {
-			if (mod_delayed_work(mld_wq, &idev->mc_query_work, 0))
-				put = true;
-		}
+		if (mod_delayed_work(mld_wq, &idev->mc_query_work, 0))
+			put = true;
 		skb = NULL;
 	}
 	spin_unlock_bh(&idev->mc_query_lock);
@@ -1589,12 +1588,11 @@ void igmp6_event_report(struct sk_buff *skb)
 		goto out;
 
 	spin_lock_bh(&idev->mc_report_lock);
-	if (skb_queue_len(&idev->mc_report_queue) < MLD_MAX_SKBS) {
+	if (skb_queue_len(&idev->mc_report_queue) < MLD_MAX_SKBS &&
+	    in6_dev_hold_safe(idev)) {
 		__skb_queue_tail(&idev->mc_report_queue, skb);
-		if (in6_dev_hold_safe(idev)) {
-			if (mod_delayed_work(mld_wq, &idev->mc_report_work, 0))
-				put = true;
-		}
+		if (mod_delayed_work(mld_wq, &idev->mc_report_work, 0))
+			put = true;
 		skb = NULL;
 	}
 	spin_unlock_bh(&idev->mc_report_lock);

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH net 1/3] ipv4: igmp: Fix potential UAF in igmp_gq_start_timer()
  2026-07-04 19:43 ` [PATCH net 1/3] ipv4: igmp: Fix potential UAF in igmp_gq_start_timer() Eric Dumazet
@ 2026-07-05 11:33   ` Ido Schimmel
  0 siblings, 0 replies; 9+ messages in thread
From: Ido Schimmel @ 2026-07-05 11:33 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Kuniyuki Iwashima, David Ahern, netdev, eric.dumazet,
	Zero Day Initiative

On Sat, Jul 04, 2026 at 07:43:44PM +0000, Eric Dumazet wrote:
> A race condition exists between device teardown (inetdev_destroy) and
> incoming IGMP query processing (igmp_rcv), leading to a Use-After-Free
> in the IGMP timer callback.
> 
> During device destruction, inetdev_destroy() drops the primary reference
> to in_device, which can drop its refcount to 0. The actual freeing of
> in_device memory is deferred via RCU (using call_rcu()).
> 
> Concurrently, igmp_rcv() runs under RCU read lock and obtains the
> in_device pointer. Because the memory is RCU-protected, CPU-0 can safely
> dereference in_device even if its refcount has hit 0.
> 
> However, if CPU-0 calls igmp_gq_start_timer() and re-arms the timer, it
> attempts to acquire a reference using in_dev_hold(). This increments the
> refcount from 0 to 1, triggering a "refcount_t: addition on 0" warning.
> Since the in_device memory is still scheduled to be freed after the RCU
> grace period (as the free callback does not check the refcount again),
> the device is freed while the timer is still armed. When the timer
> expires, it accesses the freed memory, causing a kernel panic.
> 
> Fix this by using refcount_inc_not_zero() (via a new helper
> in_dev_hold_safe()) to prevent acquiring a reference if the device is
> already being destroyed. If the refcount is 0, we do not arm the timer.
> 
> A similar issue in IPv6 MLD is fixed in a subsequent patch.
> 
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Reported-by: Zero Day Initiative <zdi-disclosures@trendmicro.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Reviewed-by: Ido Schimmel <idosch@nvidia.com>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net 3/3] ipv4: igmp: Fix potential memory leak in igmp_mod_timer()
  2026-07-04 19:43 ` [PATCH net 3/3] ipv4: igmp: Fix potential memory leak in igmp_mod_timer() Eric Dumazet
@ 2026-07-05 11:58   ` Ido Schimmel
  2026-07-05 14:03     ` Eric Dumazet
  0 siblings, 1 reply; 9+ messages in thread
From: Ido Schimmel @ 2026-07-05 11:58 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Kuniyuki Iwashima, David Ahern, netdev, eric.dumazet

On Sat, Jul 04, 2026 at 07:43:46PM +0000, Eric Dumazet wrote:
> When a timer is deleted and not re-armed in igmp_mod_timer(), the code
> currently decrements the reference counter of the multicast list entry
> @im using refcount_dec(&im->refcnt).
> 
> However, igmp_mod_timer() can be called from the RCU reader path (e.g., in
> igmp_heard_query() via for_each_pmc_rcu()). If the group im was
> concurrently removed from the list by ip_mc_dec_group(), its reference count
> might have already been decremented to 1.
> 
> In this case, timer_delete() succeeds, and refcount_dec() decrements
> the refcount from 1 to 0. Since refcount_dec() does not free the object
> when it hits 0 (unlike ip_ma_put()), the im structure is leaked.
> 
> Fix this by using ip_ma_put(im) instead of refcount_dec(&im->refcnt).
> 
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  net/ipv4/igmp.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
> index f5f9763895641bf86bfcf9fd7fd7b06012fa4ece..2170b33ba147ce4990e3ee71ba4868e8696b00cb 100644
> --- a/net/ipv4/igmp.c
> +++ b/net/ipv4/igmp.c
> @@ -266,6 +266,8 @@ static void igmp_ifc_start_timer(struct in_device *in_dev, int delay)
>  
>  static void igmp_mod_timer(struct ip_mc_list *im, int max_delay)
>  {
> +	bool put = false;
> +
>  	spin_lock_bh(&im->lock);
>  	im->unsolicit_count = 0;
>  	if (timer_delete(&im->timer)) {
> @@ -275,10 +277,13 @@ static void igmp_mod_timer(struct ip_mc_list *im, int max_delay)
>  			spin_unlock_bh(&im->lock);
>  			return;
>  		}
> -		refcount_dec(&im->refcnt);
> +		put = true;
>  	}
>  	igmp_start_timer(im, max_delay);
>  	spin_unlock_bh(&im->lock);
> +
> +	if (put)
> +		ip_ma_put(im);
>  }

Don't we also need something similar in igmp_stop_timer() [1]? It can
also be called from an RCU reader (i.e., igmp_rcv() ->
igmp_heard_report() -> igmp_stop_timer()).

[1]
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 8e2b60dcc183..d08331b44519 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -217,13 +217,18 @@ static void ip_sf_list_clear_all(struct ip_sf_list *psf)
 
 static void igmp_stop_timer(struct ip_mc_list *im)
 {
+	bool put = false;
+
 	spin_lock_bh(&im->lock);
 	if (timer_delete(&im->timer))
-		refcount_dec(&im->refcnt);
+		put = true;
 	WRITE_ONCE(im->tm_running, 0);
 	WRITE_ONCE(im->reporter, 0);
 	im->unsolicit_count = 0;
 	spin_unlock_bh(&im->lock);
+
+	if (put)
+		ip_ma_put(im);
 }
 
 /* It must be called with locked im->lock */

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH net 2/3] ipv6: mcast: Fix potential UAF in MLD delayed work
  2026-07-05 10:53   ` Ido Schimmel
@ 2026-07-05 13:58     ` Eric Dumazet
  0 siblings, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2026-07-05 13:58 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Kuniyuki Iwashima, David Ahern, netdev, eric.dumazet

On Sun, Jul 5, 2026 at 3:54 AM Ido Schimmel <idosch@nvidia.com> wrote:
>
> On Sat, Jul 04, 2026 at 07:43:45PM +0000, Eric Dumazet wrote:
> > A race condition exists between device teardown and incoming MLD query
> > processing, leading to a Use-After-Free in the MLD delayed work.
> >
> > During device destruction, the primary reference to inet6_dev is dropped,
> > which can drop its refcount to 0. The actual freeing of inet6_dev memory
> > is deferred via RCU.
> >
> > Concurrently, the packet receive path runs under RCU read lock and obtains
> > the inet6_dev pointer. Because the memory is RCU-protected, CPU-0 can
> > safely dereference inet6_dev even if its refcount has hit 0.
> >
> > However, if CPU-0 calls igmp6_event_query() and schedules delayed work, it
> > attempts to acquire a reference using in6_dev_hold(). This increments the
> > refcount from 0 to 1, triggering a "refcount_t: addition on 0" warning.
> > Since the inet6_dev memory is still scheduled to be freed after the RCU
> > grace period, the device is freed while the work is still scheduled.
> > When the work runs, it accesses the freed memory, causing a kernel panic.
> >
> > Fix this by using refcount_inc_not_zero() (via a new helper
> > in6_dev_hold_safe()) to prevent acquiring a reference if the device is
> > already being destroyed. If the refcount is 0, we do not schedule the work.
> >
> > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
>
> Eric, thanks for taking care of this!
>
> > ---
> >  include/net/addrconf.h |  5 +++++
> >  net/ipv6/mcast.c       | 38 ++++++++++++++++++++++++++++----------
> >  2 files changed, 33 insertions(+), 10 deletions(-)
>
> [...]
>
> > @@ -1395,6 +1401,7 @@ static void mld_process_v2(struct inet6_dev *idev, struct mld2_query *mld,
> >  void igmp6_event_query(struct sk_buff *skb)
> >  {
> >       struct inet6_dev *idev = __in6_dev_get(skb->dev);
> > +     bool put = false;
> >
> >       if (!idev || idev->dead)
> >               goto out;
> > @@ -1402,11 +1409,16 @@ void igmp6_event_query(struct sk_buff *skb)
> >       spin_lock_bh(&idev->mc_query_lock);
> >       if (skb_queue_len(&idev->mc_query_queue) < MLD_MAX_SKBS) {
> >               __skb_queue_tail(&idev->mc_query_queue, skb);
>
> Shouldn't we only enqueue the skb if we managed to take a reference?
> Something like [1] (on top of this patch).
>
> If we failed to take a reference, then ipv6_mc_destroy_dev() already ran
> and the queue will not be purged, thereby leaking this skb.
>
> > -             if (!mod_delayed_work(mld_wq, &idev->mc_query_work, 0))
> > -                     in6_dev_hold(idev);
> > +             if (in6_dev_hold_safe(idev)) {
> > +                     if (mod_delayed_work(mld_wq, &idev->mc_query_work, 0))
> > +                             put = true;
> > +             }
> >               skb = NULL;
> >       }
> >       spin_unlock_bh(&idev->mc_query_lock);
> > +
> > +     if (put)
> > +             in6_dev_put(idev);
> >  out:
> >       kfree_skb(skb);
> >  }
> > @@ -1570,6 +1582,7 @@ static void mld_query_work(struct work_struct *work)
> >  void igmp6_event_report(struct sk_buff *skb)
> >  {
> >       struct inet6_dev *idev = __in6_dev_get(skb->dev);
> > +     bool put = false;
> >
> >       if (!idev || idev->dead)
> >               goto out;
> > @@ -1577,11 +1590,16 @@ void igmp6_event_report(struct sk_buff *skb)
> >       spin_lock_bh(&idev->mc_report_lock);
> >       if (skb_queue_len(&idev->mc_report_queue) < MLD_MAX_SKBS) {
> >               __skb_queue_tail(&idev->mc_report_queue, skb);
>
> Same here
>
> > -             if (!mod_delayed_work(mld_wq, &idev->mc_report_work, 0))
> > -                     in6_dev_hold(idev);
> > +             if (in6_dev_hold_safe(idev)) {
> > +                     if (mod_delayed_work(mld_wq, &idev->mc_report_work, 0))
> > +                             put = true;
> > +             }
> >               skb = NULL;
> >       }
> >       spin_unlock_bh(&idev->mc_report_lock);
> > +
> > +     if (put)
> > +             in6_dev_put(idev);
> >  out:
> >       kfree_skb(skb);
> >  }
>
> [1]
> diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
> index 12d22de6f496..aaba4c2aae23 100644
> --- a/net/ipv6/mcast.c
> +++ b/net/ipv6/mcast.c
> @@ -1408,12 +1408,11 @@ void igmp6_event_query(struct sk_buff *skb)
>                 goto out;
>
>         spin_lock_bh(&idev->mc_query_lock);
> -       if (skb_queue_len(&idev->mc_query_queue) < MLD_MAX_SKBS) {
> +       if (skb_queue_len(&idev->mc_query_queue) < MLD_MAX_SKBS &&
> +           in6_dev_hold_safe(idev)) {
>                 __skb_queue_tail(&idev->mc_query_queue, skb);
> -               if (in6_dev_hold_safe(idev)) {
> -                       if (mod_delayed_work(mld_wq, &idev->mc_query_work, 0))
> -                               put = true;
> -               }
> +               if (mod_delayed_work(mld_wq, &idev->mc_query_work, 0))
> +                       put = true;
>                 skb = NULL;
>         }
>         spin_unlock_bh(&idev->mc_query_lock);
> @@ -1589,12 +1588,11 @@ void igmp6_event_report(struct sk_buff *skb)
>                 goto out;
>
>         spin_lock_bh(&idev->mc_report_lock);
> -       if (skb_queue_len(&idev->mc_report_queue) < MLD_MAX_SKBS) {
> +       if (skb_queue_len(&idev->mc_report_queue) < MLD_MAX_SKBS &&
> +           in6_dev_hold_safe(idev)) {
>                 __skb_queue_tail(&idev->mc_report_queue, skb);
> -               if (in6_dev_hold_safe(idev)) {
> -                       if (mod_delayed_work(mld_wq, &idev->mc_report_work, 0))
> -                               put = true;
> -               }
> +               if (mod_delayed_work(mld_wq, &idev->mc_report_work, 0))
> +                       put = true;
>                 skb = NULL;
>         }
>         spin_unlock_bh(&idev->mc_report_lock);

Great catch, I will include this in V2.

Thanks.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net 3/3] ipv4: igmp: Fix potential memory leak in igmp_mod_timer()
  2026-07-05 11:58   ` Ido Schimmel
@ 2026-07-05 14:03     ` Eric Dumazet
  0 siblings, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2026-07-05 14:03 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Kuniyuki Iwashima, David Ahern, netdev, eric.dumazet

On Sun, Jul 5, 2026 at 4:58 AM Ido Schimmel <idosch@nvidia.com> wrote:
>
> On Sat, Jul 04, 2026 at 07:43:46PM +0000, Eric Dumazet wrote:
> > When a timer is deleted and not re-armed in igmp_mod_timer(), the code
> > currently decrements the reference counter of the multicast list entry
> > @im using refcount_dec(&im->refcnt).
> >
> > However, igmp_mod_timer() can be called from the RCU reader path (e.g., in
> > igmp_heard_query() via for_each_pmc_rcu()). If the group im was
> > concurrently removed from the list by ip_mc_dec_group(), its reference count
> > might have already been decremented to 1.
> >
> > In this case, timer_delete() succeeds, and refcount_dec() decrements
> > the refcount from 1 to 0. Since refcount_dec() does not free the object
> > when it hits 0 (unlike ip_ma_put()), the im structure is leaked.
> >
> > Fix this by using ip_ma_put(im) instead of refcount_dec(&im->refcnt).
> >
> > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > ---
> >  net/ipv4/igmp.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
> > index f5f9763895641bf86bfcf9fd7fd7b06012fa4ece..2170b33ba147ce4990e3ee71ba4868e8696b00cb 100644
> > --- a/net/ipv4/igmp.c
> > +++ b/net/ipv4/igmp.c
> > @@ -266,6 +266,8 @@ static void igmp_ifc_start_timer(struct in_device *in_dev, int delay)
> >
> >  static void igmp_mod_timer(struct ip_mc_list *im, int max_delay)
> >  {
> > +     bool put = false;
> > +
> >       spin_lock_bh(&im->lock);
> >       im->unsolicit_count = 0;
> >       if (timer_delete(&im->timer)) {
> > @@ -275,10 +277,13 @@ static void igmp_mod_timer(struct ip_mc_list *im, int max_delay)
> >                       spin_unlock_bh(&im->lock);
> >                       return;
> >               }
> > -             refcount_dec(&im->refcnt);
> > +             put = true;
> >       }
> >       igmp_start_timer(im, max_delay);
> >       spin_unlock_bh(&im->lock);
> > +
> > +     if (put)
> > +             ip_ma_put(im);
> >  }
>
> Don't we also need something similar in igmp_stop_timer() [1]? It can
> also be called from an RCU reader (i.e., igmp_rcv() ->
> igmp_heard_report() -> igmp_stop_timer()).

Hmm, again you are right, thanks!

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2026-07-05 14:03 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-07-04 19:43 [PATCH net 0/3] ipv4/ipv6: Fix UAF and memory leak in IGMP/MLD Eric Dumazet
2026-07-04 19:43 ` [PATCH net 1/3] ipv4: igmp: Fix potential UAF in igmp_gq_start_timer() Eric Dumazet
2026-07-05 11:33   ` Ido Schimmel
2026-07-04 19:43 ` [PATCH net 2/3] ipv6: mcast: Fix potential UAF in MLD delayed work Eric Dumazet
2026-07-05 10:53   ` Ido Schimmel
2026-07-05 13:58     ` Eric Dumazet
2026-07-04 19:43 ` [PATCH net 3/3] ipv4: igmp: Fix potential memory leak in igmp_mod_timer() Eric Dumazet
2026-07-05 11:58   ` Ido Schimmel
2026-07-05 14:03     ` Eric Dumazet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox