netdev.vger.kernel.org archive mirror
* [BUG] Panic in ipv6 on old NFS sockets from destroyed network namespace
@ 2024-03-14 21:07 Josef Bacik
  2024-03-14 21:47 ` Eric Dumazet
  0 siblings, 1 reply; 4+ messages in thread
From: Josef Bacik @ 2024-03-14 21:07 UTC (permalink / raw)
  To: edumazet; +Cc: kuba, netdev, linux-nfs

Hello,

We've been hitting the following panic in production.  I've root-caused
what's happening, but I'm at a loss on how to fix it.

The panic we're seeing is this:

    BUG: kernel NULL pointer dereference, address: 0000000000000000
    RIP: 0010:ip6_pol_route+0x59/0x7a0
    Call Trace:
     <IRQ>
     ? __die+0x78/0xc0
     ? page_fault_oops+0x286/0x380
     ? fib6_table_lookup+0x95/0xf40
     ? exc_page_fault+0x5d/0x110
     ? asm_exc_page_fault+0x22/0x30
     ? ip6_pol_route+0x59/0x7a0
     ? unlink_anon_vmas+0x370/0x370
     fib6_rule_lookup+0x56/0x1b0
     ? update_blocked_averages+0x2c6/0x6a0
     ip6_route_output_flags+0xd2/0x130
     ip6_dst_lookup_tail+0x3b/0x220
     ip6_dst_lookup_flow+0x2c/0x80
     inet6_sk_rebuild_header+0x14c/0x1e0
     ? tcp_release_cb+0x150/0x150
     __tcp_retransmit_skb+0x68/0x6b0
     ? tcp_current_mss+0xca/0x150
     ? tcp_release_cb+0x150/0x150
     tcp_send_loss_probe+0x8e/0x220
     tcp_write_timer+0xbe/0x2d0
     run_timer_softirq+0x272/0x840
     ? hrtimer_interrupt+0x2c9/0x5f0
     ? sched_clock_cpu+0xc/0x170
     irq_exit_rcu+0x171/0x330
     sysvec_apic_timer_interrupt+0x6d/0x80
     </IRQ>
     <TASK>
     asm_sysvec_apic_timer_interrupt+0x16/0x20
    RIP: 0010:cpuidle_enter_state+0xe7/0x243

Inspecting the vmcore with drgn, you can see why this is a NULL pointer deref:

      >>> prog.crashed_thread().stack_trace()[0]
      #0 at 0xffffffff810bfa89 (ip6_pol_route+0x59/0x796) in ip6_pol_route at net/ipv6/route.c:2212:40

      2212        if (net->ipv6.devconf_all->forwarding == 0)
      2213              strict |= RT6_LOOKUP_F_REACHABLE;

      >>> prog.crashed_thread().stack_trace()[0]['net'].ipv6.devconf_all
      (struct ipv6_devconf *)0x0

Looking at the socket, you can see that it's been closed:

      >>> decode_enum_type_flags(prog.crashed_thread().stack_trace()[11]['sk'].__sk_common.skc_flags, prog.type('enum sock_flags'))
      'SOCK_DEAD|SOCK_KEEPOPEN|SOCK_ZAPPED|SOCK_USE_WRITE_QUEUE'
      >>> decode_enum_type_flags(1 << prog.crashed_thread().stack_trace()[11]['sk'].__sk_common.skc_state.value_(), prog["TCPF_CLOSE"].type_, bit_numbers=False)
      'TCPF_FIN_WAIT1'

The way this reproduces is with our NFS setup.  We have an NFS mount
inside of a container, which has its own network namespace.  We set up
the mount inside of this network namespace.

On container shutdown we sometimes trigger this panic, and it's pretty
reliably reproduced: with a stress tier of 200 machines I can usually
trigger it on ~10 machines by stopping the jobs.

My initial thought was that NFS wasn't properly shutting down the sockets, but
this doesn't appear to be the case.  The sock is always marked with SOCK_DEAD.
My second thought was that we had some pending timers when we call
kernel_sock_shutdown(), so I added tcp_clear_xmit_timers(sk); to tcp_shutdown()
to make sure the timers were cleared.  This didn't fix the issue.
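
For reference, the change I tried looked roughly like this -- the body of
tcp_shutdown() is sketched from the source as I remember it, with my one
added line marked:

    void tcp_shutdown(struct sock *sk, int how)
    {
            if (!(how & SEND_SHUTDOWN))
                    return;

            /* If we've already sent a FIN, or it's a closed state,
             * skip this. */
            if ((1 << sk->sk_state) &
                (TCPF_ESTABLISHED | TCPF_SYN_SENT |
                 TCPF_SYN_RECV | TCPF_CLOSE_WAIT)) {
                    /* Clear out any half completed packets.  FIN if
                     * needed. */
                    if (tcp_close_state(sk))
                            tcp_send_fin(sk);
            }

            tcp_clear_xmit_timers(sk);      /* added: didn't fix it */
    }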

I added some debugging to the socket: I flagged the socket when NFS
called kernel_sock_shutdown(), and then added a WARN_ON(sock_flag(sk,
JOSEFS_SPECIAL_FLAG)) where we arm the timer, and that trips
constantly.  So we're definitely arming the timer on the sock after NFS
has shut down the socket.

This is where my ability to figure out what's going on and how to fix
it ends.  What seems to be happening is this:

1. NFS calls kernel_sock_shutdown() when we unmount.
2. We get an ACK on the socket and the timer gets armed.
3. We shut down the container and tear down the network namespace.
4. The timer fires, we try to send the loss probe, and we panic because
   network namespace teardown has already freed the devconf (see the
   sketch below).
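
If I understand the lifetime issue correctly, it's that kernel sockets
don't pin the netns at all -- roughly this, as an illustration rather
than our actual code:

    /* Kernel sockets created with sock_create_kern() don't take a
     * reference on the netns (sk_net_refcnt == 0), so sock_net(sk)
     * can outlive the namespace. */
    struct socket *sock;
    int err = sock_create_kern(net, PF_INET6, SOCK_STREAM,
                               IPPROTO_TCP, &sock);
    /* (error handling omitted) */

    /* ... netns torn down while a TCP timer is still armed ... */

    struct net *n = sock_net(sock->sk);   /* points at the dead netns */
    /* n->ipv6.devconf_all is NULL once ipv6's pernet exit has run,
     * so the loss-probe path (tcp_send_loss_probe -> ... ->
     * ip6_pol_route) oopses, as in the trace above. */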

It appears to me that socks will just hang around forever after the
application is done with them, though I'm not sure I'm correct about
this.  If that's the case then I don't know the correct way to handle
it, other than adding an extra check so the timer simply doesn't run
when SOCK_DEAD is set.  But SOCK_DEAD sockets appear to run their
timers on purpose, so that seems like a bad fix.

Let me know if you have debug patches or other information you'd like
from a vmcore; I have plenty.  Like I said, I can reproduce this
reliably.  It takes a few hours to deploy a test kernel, but I can turn
debug patches around in about a day.  Thanks,

Josef


* Re: [BUG] Panic in ipv6 on old NFS sockets from destroyed network namespace
  2024-03-14 21:07 [BUG] Panic in ipv6 on old NFS sockets from destroyed network namespace Josef Bacik
@ 2024-03-14 21:47 ` Eric Dumazet
  2024-03-14 22:10   ` Eric Dumazet
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Dumazet @ 2024-03-14 21:47 UTC (permalink / raw)
  To: Josef Bacik; +Cc: kuba, netdev, linux-nfs

On Thu, Mar 14, 2024 at 10:07 PM Josef Bacik <josef@toxicpanda.com> wrote:
>
> [ full bug report trimmed; see the message above ]
>
> 1. NFS calls kernel_sock_shutdown() when we unmount.
> 2. We get an ACK on the socket and the timer gets armed.
> 3. We shut down the container and tear down the network namespace.
> 4. The timer fires, we try to send the loss probe, and we panic because
>    network namespace teardown has already freed the devconf.
>
> Josef

  If NFS is using kernel sockets, it is NFS's responsibility to remove
all of them when the netns is destroyed.
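
A minimal sketch of what that could look like, assuming NFS (or sunrpc
on its behalf) keeps per-netns bookkeeping of its kernel sockets --
nfs_shutdown_net_sockets() and nfs_sock_net_ops below are hypothetical
names, not existing code:

    #include <net/net_namespace.h>

    /* Hypothetical helper: walk and shut down this namespace's
     * NFS kernel sockets. */
    extern void nfs_shutdown_net_sockets(struct net *net);

    static void nfs_sock_net_exit(struct net *net)
    {
            /* Close any kernel sockets NFS still owns in this
             * namespace before the netns is freed. */
            nfs_shutdown_net_sockets(net);
    }

    static struct pernet_operations nfs_sock_net_ops = {
            .exit = nfs_sock_net_exit,
    };

    /* register_pernet_subsys(&nfs_sock_net_ops) at module init makes
     * the .exit hook run during netns teardown. */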

Also look at these recent relevant patches:

2a750d6a5b365265dbda33330a6188547ddb5c24 ("rds: tcp: Fix use-after-free of net in reqsk_timer_handler().")
1c4e97dd2d3c9a3e84f7e26346aa39bc426d3249 ("tcp: Fix NEW_SYN_RECV handling in inet_twsk_purge()")


* Re: [BUG] Panic in ipv6 on old NFS sockets from destroyed network namespace
  2024-03-14 21:47 ` Eric Dumazet
@ 2024-03-14 22:10   ` Eric Dumazet
  2024-03-15 14:03     ` Josef Bacik
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Dumazet @ 2024-03-14 22:10 UTC (permalink / raw)
  To: Josef Bacik; +Cc: kuba, netdev, linux-nfs

On Thu, Mar 14, 2024 at 10:47 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Thu, Mar 14, 2024 at 10:07 PM Josef Bacik <josef@toxicpanda.com> wrote:
> >
> > [ full bug report trimmed; see the first message in the thread ]
>
>   If NFS is using kernel sockets, it is NFS's responsibility to remove
> all of them when the netns is destroyed.
>
> Also look at these recent relevant patches:
>
> 2a750d6a5b365265dbda33330a6188547ddb5c24 ("rds: tcp: Fix use-after-free of net in reqsk_timer_handler().")
> 1c4e97dd2d3c9a3e84f7e26346aa39bc426d3249 ("tcp: Fix NEW_SYN_RECV handling in inet_twsk_purge()")

Another relevant patch was:

commit 3a58f13a881ed351198ffab4cf9953cf19d2ab3a
Author: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date:   Mon May 2 10:40:18 2022 +0900

    net: rds: acquire refcount on TCP sockets
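
The core of that fix, roughly, is the following pattern at kernel
socket creation time -- a sketch from reading the commit, not a
drop-in fix for NFS:

    /* Make the kernel socket hold a tracked reference on its netns
     * so the namespace can't be torn down while the socket (and its
     * timers) are still alive.  'sock' and 'net' come from the
     * surrounding sock_create_kern() call site. */
    sock->sk->sk_net_refcnt = 1;
    get_net_track(net, &sock->sk->ns_tracker, GFP_KERNEL);
    sock_inuse_add(net, 1);    /* keep the inuse accounting right */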


* Re: [BUG] Panic in ipv6 on old NFS sockets from destroyed network namespace
  2024-03-14 22:10   ` Eric Dumazet
@ 2024-03-15 14:03     ` Josef Bacik
  0 siblings, 0 replies; 4+ messages in thread
From: Josef Bacik @ 2024-03-15 14:03 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: kuba, netdev, linux-nfs

On Thu, Mar 14, 2024 at 11:10:45PM +0100, Eric Dumazet wrote:
> On Thu, Mar 14, 2024 at 10:47 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > [ full bug report trimmed; see the first message in the thread ]
> >
> >   If NFS is using kernel sockets, it is NFS's responsibility to remove
> > all of them when the netns is destroyed.
> >
> > Also look at these recent relevant patches:
> >
> > 2a750d6a5b365265dbda33330a6188547ddb5c24 ("rds: tcp: Fix use-after-free of net in reqsk_timer_handler().")
> > 1c4e97dd2d3c9a3e84f7e26346aa39bc426d3249 ("tcp: Fix NEW_SYN_RECV handling in inet_twsk_purge()")
> 
> Another relevant patch was:
> 
> commit 3a58f13a881ed351198ffab4cf9953cf19d2ab3a
> Author: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Date:   Mon May 2 10:40:18 2022 +0900
> 
>     net: rds: acquire refcount on TCP sockets

Thanks for the quick reply, Eric!  I'll get something like this done and
tested for NFS.
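
Concretely I'm thinking of mirroring that pattern where sunrpc creates
its sockets (xs_create_sock() in net/sunrpc/xprtsock.c), something like
the following -- untested, and the placement is my guess:

    /* After __sock_create() succeeds in xs_create_sock(): pin the
     * transport's netns for the life of the kernel socket, mirroring
     * the RDS fix Eric pointed at. */
    sock->sk->sk_net_refcnt = 1;
    get_net_track(xprt->xprt_net, &sock->sk->ns_tracker, GFP_KERNEL);
    sock_inuse_add(xprt->xprt_net, 1);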

Josef

