[PATCH net-next v2] ipv4: igmp: remove multicast group from hash table on device destruction

The Linux Kernel Mailing List

* [PATCH net-next v2] ipv4: igmp: remove multicast group from hash table on device destruction
@ 2026-06-30  2:23 Yuyang Huang
  2026-06-30  7:46 ` Jagielski, Jedrzej
  0 siblings, 1 reply; 9+ messages in thread
From: Yuyang Huang @ 2026-06-30  2:23 UTC (permalink / raw)
  To: Yuyang Huang
  Cc: David S. Miller, Cong Wang, David Ahern, Eric Dumazet,
	Ido Schimmel, Jakub Kicinski, Paolo Abeni, Simon Horman,
	linux-kernel, netdev

When a device is destroyed under RTNL, ip_mc_destroy_dev() iterates through
the multicast list and calls ip_ma_put() on each membership, scheduling
them for RCU reclamation. However, they are not unlinked from the device's
multicast hash table (mc_hash).

Since the device remains published in dev->ip_ptr until after
ip_mc_destroy_dev() completes, concurrent RCU readers traversing mc_hash
can still locate and access the multicast group after its refcount is
decremented. If the RCU callback runs and frees the group while a reader is
accessing it, a use-after-free occurs.

Fix this by unlinking the multicast group from mc_hash using
ip_mc_hash_remove() before scheduling it for reclamation.

Fixes: e9897071350b ("igmp: hash a hash table to speedup ip_check_mc_rcu()")
Signed-off-by: Yuyang Huang <yuyanghuang@google.com>
---
v2:
- Add Fixes tag in the commit message.
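
For illustration only (not part of the patch): a minimal userspace sketch of
why the unlink has to come before the put. It is a single-threaded model and
assumes nothing about the real structures. All `model_*` names are
hypothetical, and the `reclaimed` flag merely stands in for "handed to RCU and
possibly freed". It counts how many already-reclaimed groups a hash-table
reader could still reach after teardown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_BUCKETS 4
#define MODEL_POOL    4

/* Hypothetical stand-in for a multicast group entry. */
struct model_group {
	uint32_t addr;
	struct model_group *next_list;	/* linkage on the per-device list */
	struct model_group *next_hash;	/* linkage inside one hash bucket */
	bool reclaimed;			/* models "scheduled for freeing" */
};

/* Hypothetical stand-in for the per-device state. */
struct model_dev {
	struct model_group pool[MODEL_POOL];	/* static storage, no malloc */
	size_t used;
	struct model_group *mc_list;
	struct model_group *mc_hash[MODEL_BUCKETS];
};

static size_t model_bucket(uint32_t addr)
{
	return addr % MODEL_BUCKETS;
}

/* Link a new group into both the list and the hash table. */
static struct model_group *model_add(struct model_dev *dev, uint32_t addr)
{
	struct model_group *g;
	size_t b = model_bucket(addr);

	if (dev->used == MODEL_POOL)
		return NULL;
	g = &dev->pool[dev->used++];
	g->addr = addr;
	g->reclaimed = false;
	g->next_list = dev->mc_list;
	dev->mc_list = g;
	g->next_hash = dev->mc_hash[b];
	dev->mc_hash[b] = g;
	return g;
}

/* Unlink one group from its hash bucket (the step the patch adds). */
static void model_hash_remove(struct model_dev *dev, struct model_group *g)
{
	struct model_group **pp = &dev->mc_hash[model_bucket(g->addr)];

	while (*pp && *pp != g)
		pp = &(*pp)->next_hash;
	if (*pp)
		*pp = g->next_hash;
}

/* Tear down the list; optionally unlink from the hash table first. */
static void model_destroy(struct model_dev *dev, bool unlink_first)
{
	struct model_group *g;

	while ((g = dev->mc_list) != NULL) {
		dev->mc_list = g->next_list;
		if (unlink_first)
			model_hash_remove(dev, g);
		g->reclaimed = true;
	}
}

/* Count reclaimed groups that a hash-table reader could still find. */
static size_t model_stale_reachable(const struct model_dev *dev)
{
	const struct model_group *g;
	size_t b, n = 0;

	for (b = 0; b < MODEL_BUCKETS; b++)
		for (g = dev->mc_hash[b]; g; g = g->next_hash)
			if (g->reclaimed)
				n++;
	return n;
}

/* Build three groups (two share a bucket), tear down, report stale count. */
static size_t model_run(bool unlink_first)
{
	struct model_dev dev = { 0 };

	model_add(&dev, 0xe0000001u);
	model_add(&dev, 0xe0000005u);
	model_add(&dev, 0xe00000fbu);
	model_destroy(&dev, unlink_first);
	return model_stale_reachable(&dev);
}
```

In this model, `model_run(false)` leaves every reclaimed group reachable
through the hash table, while `model_run(true)` leaves none. The sketch only
tracks reachability; it does not model RCU grace periods or concurrency.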

 net/ipv4/igmp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index b6337a47c141..af38073a822d 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -1923,6 +1923,7 @@ void ip_mc_destroy_dev(struct in_device *in_dev)

 	while ((i = rtnl_dereference(in_dev->mc_list)) != NULL) {
 		in_dev->mc_list = i->next_rcu;
+		ip_mc_hash_remove(in_dev, i);
 		WRITE_ONCE(in_dev->mc_count, in_dev->mc_count - 1);
 		ip_mc_clear_src(i);
 		ip_ma_put(i);
-- 
2.55.0.rc0.799.gd6f94ed593-goog

* RE: [PATCH net-next v2] ipv4: igmp: remove multicast group from hash table on device destruction
  2026-06-30  2:23 [PATCH net-next v2] ipv4: igmp: remove multicast group from hash table on device destruction Yuyang Huang
@ 2026-06-30  7:46 ` Jagielski, Jedrzej
  2026-06-30  7:55   ` Yuyang Huang
  0 siblings, 1 reply; 9+ messages in thread
From: Jagielski, Jedrzej @ 2026-06-30  7:46 UTC (permalink / raw)
  To: Yuyang Huang
  Cc: David S. Miller, Cong Wang, David Ahern, Eric Dumazet,
	Ido Schimmel, Jakub Kicinski, Paolo Abeni, Simon Horman,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org

From: Yuyang Huang <yuyanghuang@google.com> 
Sent: Tuesday, June 30, 2026 4:23 AM
>When a device is destroyed under RTNL, ip_mc_destroy_dev() iterates through
>the multicast list and calls ip_ma_put() on each membership, scheduling
>them for RCU reclamation. However, they are not unlinked from the device's
>multicast hash table (mc_hash).
>
>Since the device remains published in dev->ip_ptr until after
>ip_mc_destroy_dev() completes, concurrent RCU readers traversing mc_hash
>can still locate and access the multicast group after its refcount is
>decremented. If the RCU callback runs and frees the group while a reader is
>accessing it, a use-after-free occurs.
>
>Fix this by unlinking the multicast group from mc_hash using
>ip_mc_hash_remove() before scheduling it for reclamation.
>
>Fixes: e9897071350b ("igmp: hash a hash table to speedup ip_check_mc_rcu()")
>Signed-off-by: Yuyang Huang <yuyanghuang@google.com>

Hi,

Why is this being sent to net-next rather than net if it's a bug fix?

In the v1 thread it was said:
>This is a long-standing bug, not a recent regression.

So why not Cc stable, so that the fix also reaches the stable kernels?

* Re: [PATCH net-next v2] ipv4: igmp: remove multicast group from hash table on device destruction
  2026-06-30  7:46 ` Jagielski, Jedrzej
@ 2026-06-30  7:55   ` Yuyang Huang
  2026-06-30 16:59     ` Ido Schimmel
  0 siblings, 1 reply; 9+ messages in thread
From: Yuyang Huang @ 2026-06-30  7:55 UTC (permalink / raw)
  To: Jagielski, Jedrzej
  Cc: David S. Miller, Cong Wang, David Ahern, Eric Dumazet,
	Ido Schimmel, Jakub Kicinski, Paolo Abeni, Simon Horman,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org

> Hi,
>
> Why is this being sent to net-next rather than net if it's a bug fix?
>
> In the v1 thread it was said:
> >This is a long-standing bug, not a recent regression.
>
> So why not Cc stable, so that the fix also reaches the stable kernels?

Thanks for the advice, I will send this patch to the stable kernels.

* Re: [PATCH net-next v2] ipv4: igmp: remove multicast group from hash table on device destruction
  2026-06-30  7:55   ` Yuyang Huang
@ 2026-06-30 16:59     ` Ido Schimmel
  2026-06-30 21:13       ` Kuniyuki Iwashima
  0 siblings, 1 reply; 9+ messages in thread
From: Ido Schimmel @ 2026-06-30 16:59 UTC (permalink / raw)
  To: Yuyang Huang
  Cc: Jagielski, Jedrzej, David S. Miller, Cong Wang, David Ahern,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org

On Tue, Jun 30, 2026 at 04:55:22PM +0900, Yuyang Huang wrote:
> > Hi,
> >
> > Why is this being sent to net-next rather than net if it's a bug fix?
> >
> > In the v1 thread it was said:
> > >This is a long-standing bug, not a recent regression.
> >
> > So why not Cc stable, so that the fix also reaches the stable kernels?
> 
> Thanks for the advice, I will send this patch to the stable kernels.

Please target v3 at net and add a trace, given that you're claiming a
use-after-free. That way we know that the problem is real and not a
false positive from some tool. You can reproduce it by adding enough
delay in inetdev_destroy():

BUG: KASAN: slab-use-after-free in ip_check_mc_rcu+0x2cc/0x500
Read of size 4 at addr ffff88810c571208 by task mausezahn/419

CPU: 2 UID: 0 PID: 419 Comm: mausezahn Not tainted 7.1.0-virtme-g15d4a7c23bf6 #17 PREEMPT(lazy)
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Call Trace:
 <IRQ>
 dump_stack_lvl+0x4d/0x70
 print_report+0x153/0x4c2
 kasan_report+0xda/0x110
 ip_check_mc_rcu+0x2cc/0x500
 ip_route_input_rcu.part.0+0x13d/0xbc0
 ip_route_input_noref+0xb6/0x110
 ip_rcv_finish_core+0x41b/0x1d90
 ip_rcv_finish+0xea/0x1b0
 ip_rcv+0xb7/0x1b0
 __netif_receive_skb_one_core+0xfc/0x180
 process_backlog+0x1ea/0x5e0
 __napi_poll+0x97/0x480
 net_rx_action+0x97c/0xfa0
 handle_softirqs+0x18c/0x4f0
 do_softirq+0x42/0x60
 </IRQ>

* Re: [PATCH net-next v2] ipv4: igmp: remove multicast group from hash table on device destruction
  2026-06-30 16:59     ` Ido Schimmel
@ 2026-06-30 21:13       ` Kuniyuki Iwashima
  2026-07-01  0:41         ` Yuyang Huang
  2026-07-01  8:11         ` Ido Schimmel
  0 siblings, 2 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-06-30 21:13 UTC (permalink / raw)
  To: idosch
  Cc: davem, dsahern, edumazet, horms, jedrzej.jagielski, kuba,
	linux-kernel, netdev, pabeni, xiyou.wangcong, yuyanghuang

From: Ido Schimmel <idosch@nvidia.com>
Date: Tue, 30 Jun 2026 19:59:34 +0300
> On Tue, Jun 30, 2026 at 04:55:22PM +0900, Yuyang Huang wrote:
> > > Hi,
> > >
> > > Why is this being sent to net-next rather than net if it's a bug fix?
> > >
> > > In the v1 thread it was said:
> > > >This is a long-standing bug, not a recent regression.
> > >
> > > So why not Cc stable, so that the fix also reaches the stable kernels?
> > 
> > Thanks for the advice, I will send this patch to the stable kernels.
> 
> Please target v3 at net and add a trace, given that you're claiming a
> use-after-free. That way we know that the problem is real and not a
> false positive from some tool. You can reproduce it by adding enough
> delay in inetdev_destroy():

I guess the delay was added between ip_mc_destroy_dev() and
RCU_INIT_POINTER(dev->ip_ptr, NULL)?

I feel like we should clear it first and then destroy everything,
as done in IPv6 addrconf_ifdown().


> 
> BUG: KASAN: slab-use-after-free in ip_check_mc_rcu+0x2cc/0x500
> Read of size 4 at addr ffff88810c571208 by task mausezahn/419
> 
> CPU: 2 UID: 0 PID: 419 Comm: mausezahn Not tainted 7.1.0-virtme-g15d4a7c23bf6 #17 PREEMPT(lazy)
> Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> Call Trace:
>  <IRQ>
>  dump_stack_lvl+0x4d/0x70
>  print_report+0x153/0x4c2
>  kasan_report+0xda/0x110
>  ip_check_mc_rcu+0x2cc/0x500
>  ip_route_input_rcu.part.0+0x13d/0xbc0
>  ip_route_input_noref+0xb6/0x110
>  ip_rcv_finish_core+0x41b/0x1d90
>  ip_rcv_finish+0xea/0x1b0
>  ip_rcv+0xb7/0x1b0
>  __netif_receive_skb_one_core+0xfc/0x180
>  process_backlog+0x1ea/0x5e0
>  __napi_poll+0x97/0x480
>  net_rx_action+0x97c/0xfa0
>  handle_softirqs+0x18c/0x4f0
>  do_softirq+0x42/0x60
>  </IRQ>
> 

* Re: [PATCH net-next v2] ipv4: igmp: remove multicast group from hash table on device destruction
  2026-06-30 21:13       ` Kuniyuki Iwashima
@ 2026-07-01  0:41         ` Yuyang Huang
  2026-07-01 10:12           ` Ido Schimmel
  2026-07-01  8:11         ` Ido Schimmel
  1 sibling, 1 reply; 9+ messages in thread
From: Yuyang Huang @ 2026-07-01  0:41 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: idosch, davem, dsahern, edumazet, horms, jedrzej.jagielski, kuba,
	linux-kernel, netdev, pabeni, xiyou.wangcong

On Wed, Jul 1, 2026 at 6:15 AM Kuniyuki Iwashima <kuniyu@google.com> wrote:
>
> From: Ido Schimmel <idosch@nvidia.com>
> Date: Tue, 30 Jun 2026 19:59:34 +0300
> > On Tue, Jun 30, 2026 at 04:55:22PM +0900, Yuyang Huang wrote:
> > > > Hi,
> > > >
> > > > Why is this being sent to net-next rather than net if it's a bug fix?
> > > >
> > > > In the v1 thread it was said:
> > > > >This is a long-standing bug, not a recent regression.
> > > >
> > > > So why not Cc stable, so that the fix also reaches the stable kernels?
> > >
> > > Thanks for the advice, I will send this patch to the stable kernels.
> >
> > Please target v3 at net and add a trace, given that you're claiming a
> > use-after-free. That way we know that the problem is real and not a
> > false positive from some tool. You can reproduce it by adding enough
> > delay in inetdev_destroy():
>
> I guess the delay was added between ip_mc_destroy_dev() and
> RCU_INIT_POINTER(dev->ip_ptr, NULL)?
>
> I feel like we should clear it first and then destroy everything,
> as done in IPv6 addrconf_ifdown().
>
>
> >
> > BUG: KASAN: slab-use-after-free in ip_check_mc_rcu+0x2cc/0x500
> > Read of size 4 at addr ffff88810c571208 by task mausezahn/419
> >
> > CPU: 2 UID: 0 PID: 419 Comm: mausezahn Not tainted 7.1.0-virtme-g15d4a7c23bf6 #17 PREEMPT(lazy)
> > Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> > Call Trace:
> >  <IRQ>
> >  dump_stack_lvl+0x4d/0x70
> >  print_report+0x153/0x4c2
> >  kasan_report+0xda/0x110
> >  ip_check_mc_rcu+0x2cc/0x500
> >  ip_route_input_rcu.part.0+0x13d/0xbc0
> >  ip_route_input_noref+0xb6/0x110
> >  ip_rcv_finish_core+0x41b/0x1d90
> >  ip_rcv_finish+0xea/0x1b0
> >  ip_rcv+0xb7/0x1b0
> >  __netif_receive_skb_one_core+0xfc/0x180
> >  process_backlog+0x1ea/0x5e0
> >  __napi_poll+0x97/0x480
> >  net_rx_action+0x97c/0xfa0
> >  handle_softirqs+0x18c/0x4f0
> >  do_softirq+0x42/0x60
> >  </IRQ>
> >

Thanks for the advice, I will try to add a trace in v3. For
reference, the issue was pointed out in the following discussion:

https://lore.kernel.org/netdev/95adff35-ee56-49d3-8567-382ac17810b3@redhat.com/#t

* Re: [PATCH net-next v2] ipv4: igmp: remove multicast group from hash table on device destruction
  2026-07-01  0:41         ` Yuyang Huang
@ 2026-07-01 10:12           ` Ido Schimmel
  0 siblings, 0 replies; 9+ messages in thread
From: Ido Schimmel @ 2026-07-01 10:12 UTC (permalink / raw)
  To: Yuyang Huang
  Cc: Kuniyuki Iwashima, davem, dsahern, edumazet, horms,
	jedrzej.jagielski, kuba, linux-kernel, netdev, pabeni,
	xiyou.wangcong

On Wed, Jul 01, 2026 at 09:41:53AM +0900, Yuyang Huang wrote:
> Thanks for the advice, I will try to add a trace in v3. For
> reference, the issue was pointed out in the following discussion:
> 
> https://lore.kernel.org/netdev/95adff35-ee56-49d3-8567-382ac17810b3@redhat.com/#t

I'm aware, but we get a lot of patches with various claims and no traces,
and some of these patches are completely unnecessary / wrong. A trace
tells the reviewer that the issue is real and that the author validated
the fix. Otherwise, it is up to the reviewer to check that this is not
yet another false positive from some tool.

* Re: [PATCH net-next v2] ipv4: igmp: remove multicast group from hash table on device destruction
  2026-06-30 21:13       ` Kuniyuki Iwashima
  2026-07-01  0:41         ` Yuyang Huang
@ 2026-07-01  8:11         ` Ido Schimmel
  2026-07-01  8:58           ` Yuyang Huang
  1 sibling, 1 reply; 9+ messages in thread
From: Ido Schimmel @ 2026-07-01  8:11 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: davem, dsahern, edumazet, horms, jedrzej.jagielski, kuba,
	linux-kernel, netdev, pabeni, xiyou.wangcong, yuyanghuang

On Tue, Jun 30, 2026 at 09:13:11PM +0000, Kuniyuki Iwashima wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> Date: Tue, 30 Jun 2026 19:59:34 +0300
> > On Tue, Jun 30, 2026 at 04:55:22PM +0900, Yuyang Huang wrote:
> > > > Hi,
> > > >
> > > > Why is this being sent to net-next rather than net if it's a bug fix?
> > > >
> > > > In the v1 thread it was said:
> > > > >This is a long-standing bug, not a recent regression.
> > > >
> > > > So why not Cc stable, so that the fix also reaches the stable kernels?
> > > 
> > > Thanks for the advice, I will send this patch to the stable kernels.
> > 
> > Please target v3 at net and add a trace, given that you're claiming a
> > use-after-free. That way we know that the problem is real and not a
> > false positive from some tool. You can reproduce it by adding enough
> > delay in inetdev_destroy():
> 
> I guess the delay was added between ip_mc_destroy_dev() and
> RCU_INIT_POINTER(dev->ip_ptr, NULL)?

Yes, to increase the race window.

> I feel like we should clear it first and then destroy everything,
> as done in IPv6 addrconf_ifdown().

I agree, but let's do it as a separate change in net-next. The current
one-line fix is correct and fixes the root cause. Clearing the pointer
first only happens to fix the problem, because it relies on mc_hash being
reachable solely via dev->ip_ptr (vs reaching in_dev via a different path).
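
To make the distinction concrete, here is a rough single-threaded sketch. It
is not kernel code: every `sketch_*` name is hypothetical, one pointer stands
in for a whole mc_hash bucket, and the `reclaimed` flag stands in for a group
that has been handed to RCU. It compares what a reader can reach when it goes
through the device pointer with what it can reach when it holds in_dev by
some other route.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins; not the kernel structures. */
struct sketch_group {
	bool reclaimed;			/* models "scheduled for freeing" */
};

struct sketch_in_dev {
	struct sketch_group *hash_slot;	/* models one mc_hash bucket */
};

struct sketch_netdev {
	struct sketch_in_dev *ip_ptr;	/* models the published pointer */
};

struct sketch_result {
	bool via_dev_stale;		/* reader entered through the device */
	bool via_in_dev_stale;		/* reader already held in_dev */
};

/* Returns true if this reader would touch a reclaimed group. */
static bool sketch_reader_via_dev(const struct sketch_netdev *dev)
{
	const struct sketch_in_dev *in_dev = dev->ip_ptr;

	if (!in_dev || !in_dev->hash_slot)
		return false;
	return in_dev->hash_slot->reclaimed;
}

/* Same check for a reader that reached in_dev by a different path. */
static bool sketch_reader_via_in_dev(const struct sketch_in_dev *in_dev)
{
	return in_dev->hash_slot && in_dev->hash_slot->reclaimed;
}

/* Run one teardown variant and sample both reader paths afterwards. */
static struct sketch_result sketch_teardown(bool clear_ptr_first,
					    bool unlink_hash)
{
	struct sketch_group grp = { .reclaimed = false };
	struct sketch_in_dev in_dev = { .hash_slot = &grp };
	struct sketch_netdev dev = { .ip_ptr = &in_dev };
	struct sketch_result res;

	if (clear_ptr_first)
		dev.ip_ptr = NULL;		/* unpublish the device state */
	if (unlink_hash)
		in_dev.hash_slot = NULL;	/* unlink from the hash table */
	grp.reclaimed = true;			/* group is now on its way out */

	res.via_dev_stale = sketch_reader_via_dev(&dev);
	res.via_in_dev_stale = sketch_reader_via_in_dev(&in_dev);
	return res;
}
```

In this model, clearing the pointer alone protects only the reader that
enters through the device, whereas unlinking from the hash table protects
both paths. Timing, grace periods and readers already inside a lookup are
deliberately left out.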

* Re: [PATCH net-next v2] ipv4: igmp: remove multicast group from hash table on device destruction
  2026-07-01  8:11         ` Ido Schimmel
@ 2026-07-01  8:58           ` Yuyang Huang
  0 siblings, 0 replies; 9+ messages in thread
From: Yuyang Huang @ 2026-07-01  8:58 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: Kuniyuki Iwashima, davem, dsahern, edumazet, horms,
	jedrzej.jagielski, kuba, linux-kernel, netdev, pabeni,
	xiyou.wangcong

On Wed, Jul 1, 2026 at 5:11 PM Ido Schimmel <idosch@nvidia.com> wrote:
>
> I agree, but let's do it as a separate change in net-next. The current
> one-line fix is correct and fixes the root cause. Clearing the pointer
> first only happens to fix the problem, because it relies on mc_hash being
> reachable solely via dev->ip_ptr (vs reaching in_dev via a different path).

Acked. I can send out a separate patch for that part and keep
this change as is.
