From: Stanislav Fomichev <stfomichev@gmail.com>
To: Mina Almasry <almasrymina@google.com>
Cc: Taehee Yoo <ap420073@gmail.com>,
davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com,
edumazet@google.com, andrew+netdev@lunn.ch, horms@kernel.org,
asml.silence@gmail.com, dw@davidwei.uk, sdf@fomichev.me,
skhawaja@google.com, simona.vetter@ffwll.ch, kaiyuanz@google.com,
netdev@vger.kernel.org
Subject: Re: [PATCH net] net: devmem: fix kernel panic when socket close after module unload
Date: Tue, 15 Apr 2025 11:59:40 -0700 [thread overview]
Message-ID: <Z_6snPXxWLmsNHL5@mini-arch> (raw)
In-Reply-To: <CAHS8izMrN4+UuoRy3zUS0-2KJGfUhRVxyeJHEn81VX=9TdjKcg@mail.gmail.com>
On 04/15, Mina Almasry wrote:
> On Tue, Apr 15, 2025 at 2:24 AM Taehee Yoo <ap420073@gmail.com> wrote:
> >
> > Kernel panic occurs when a devmem TCP socket is closed after NIC module
> > is unloaded.
> >
> > This is Devmem TCP unregistration scenarios. number is an order.
> > (a)socket close (b)pp destroy (c)uninstall result
> > 1 2 3 OK
> > 1 3 2 (d)Impossible
> > 2 1 3 OK
> > 3 1 2 (e)Kernel panic
> > 2 3 1 (d)Impossible
> > 3 2 1 (d)Impossible
> >
> > (a) netdev_nl_sock_priv_destroy() is called when devmem TCP socket is
> > closed.
> > (b) page_pool_destroy() is called when the interface is down.
> > (c) mp_ops->uninstall() is called when an interface is unregistered.
> > (d) There is no scenario in mp_ops->uninstall() is called before
> > page_pool_destroy().
> > Because unregister_netdevice_many_notify() closes interfaces first
> > and then calls mp_ops->uninstall().
> > (e) netdev_nl_sock_priv_destroy() accesses struct net_device.
> > But if the interface module has already been removed, net_device
> > pointer is invalid, so it causes kernel panic.
> >
> > In summary, there are only 3 possible scenarios.
> > A. sk close -> pp destroy -> uninstall.
> > B. pp destroy -> sk close -> uninstall.
> > C. pp destroy -> uninstall -> sk close.
> >
> > Case C is a kernel panic scenario.
> >
> > In order to fix this problem, it makes netdev_nl_sock_priv_destroy() do
> > nothing if a module is already removed.
> > The mp_ops->uninstall() handles these instead.
> >
> > The netdev_nl_sock_priv_destroy() iterates binding->list and releases
> > them all with net_devmem_unbind_dmabuf().
> > The net_devmem_unbind_dmabuf() has the below steps.
> > 1. Delete binding from a list.
> > 2. Call _net_mp_close_rxq() for all rxq's bound to a binding.
> > 3. Call net_devmem_dmabuf_binding_put() to release resources.
> >
> > The mp_ops->uninstall() doesn't need to call _net_mp_close_rxq() because
> > resources are already released properly when an interface is being down.
> >
> > From now on netdev_nl_sock_priv_destroy() will do nothing if a module
> > has been removed because all bindings are removed from a list by
> > mp_ops->uninstall().
> >
> > netdev_nl_sock_priv_destroy() internally sets mp_ops to NULL.
> > So mp_ops->uninstall has not been called if devmem TCP socket was
> > already closed.
> >
> > Tests:
> > Scenario A:
> > ./ncdevmem -s 192.168.1.4 -c 192.168.1.2 -f $interface -l -p 8000 \
> > -v 7 -t 1 -q 1 &
> > pid=$!
> > sleep 10
> > ip link set $interface down
> > kill $pid
> > modprobe -rv $module
> >
> > Scenario B:
> > ./ncdevmem -s 192.168.1.4 -c 192.168.1.2 -f $interface -l -p 8000 \
> > -v 7 -t 1 -q 1 &
> > pid=$!
> > sleep 10
> > ip link set $interface down
> > kill $pid
> > modprobe -rv $module
> >
>
> Scenario A & B are exactly the same steps?
>
> > Scenario C:
> > ./ncdevmem -s 192.168.1.4 -c 192.168.1.2 -f $interface -l -p 8000 \
> > -v 7 -t 1 -q 1 &
> > pid=$!
> > sleep 10
> > modprobe -rv $module
> > sleep 5
> > kill $pid
> >
> > Splat looks like:
> > Oops: general protection fault, probably for non-canonical address 0xdffffc001fffa9f7: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
> > KASAN: probably user-memory-access in range [0x00000000fffd4fb8-0x00000000fffd4fbf]
> > CPU: 0 UID: 0 PID: 2041 Comm: ncdevmem Tainted: G B W 6.15.0-rc1+ #2 PREEMPT(undef) 0947ec89efa0fd68838b78e36aa1617e97ff5d7f
> > Tainted: [B]=BAD_PAGE, [W]=WARN
> > RIP: 0010:__mutex_lock (./include/linux/sched.h:2244 kernel/locking/mutex.c:400 kernel/locking/mutex.c:443 kernel/locking/mutex.c:605 kernel/locking/mutex.c:746)
> > Code: ea 03 80 3c 02 00 0f 85 4f 13 00 00 49 8b 1e 48 83 e3 f8 74 6a 48 b8 00 00 00 00 00 fc ff df 48 8d 7b 34 48 89 fa 48 c1 ea 03 <0f> b6 f
> > RSP: 0018:ffff88826f7ef730 EFLAGS: 00010203
> > RAX: dffffc0000000000 RBX: 00000000fffd4f88 RCX: ffffffffaa9bc811
> > RDX: 000000001fffa9f7 RSI: 0000000000000008 RDI: 00000000fffd4fbc
> > RBP: ffff88826f7ef8b0 R08: 0000000000000000 R09: ffffed103e6aa1a4
> > R10: 0000000000000007 R11: ffff88826f7ef442 R12: fffffbfff669f65e
> > R13: ffff88812a830040 R14: ffff8881f3550d20 R15: 00000000fffd4f88
> > FS: 0000000000000000(0000) GS:ffff888866c05000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 0000563bed0cb288 CR3: 00000001a7c98000 CR4: 00000000007506f0
> > PKRU: 55555554
> > Call Trace:
> > <TASK>
> > ...
> > netdev_nl_sock_priv_destroy (net/core/netdev-genl.c:953 (discriminator 3))
>
> Line 953 is:
>
> netdev_lock(dev);
>
> Which was introduced by:
>
> commit 42f342387841 ("net: fix use-after-free in the
> netdev_nl_sock_priv_destroy()") and rolling back a few fixes, it's
> really introduced by commit 1d22d3060b9b ("net: drop rtnl_lock for
> queue_mgmt operations").
>
> My first question, does this issue still reproduce if you remove the
> per netdev locking and go back to relying on rtnl_locking? Or do we
> crash somewhere else in net_devmem_unbind_dmabuf? If so, where?
> Looking through the rest of the unbinding code, it's not clear to me
> any of it actually uses dev, so it may just be the locking...
A proper fix, most likely, will involve resetting binding->dev to NULL
when the device is going away. Replacing rtnl with dev lock exposes the
fact that we can't assume that the binding->dev is still valid by
the time we do unbind.
next prev parent reply other threads:[~2025-04-15 18:59 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-15 9:24 [PATCH net] net: devmem: fix kernel panic when socket close after module unload Taehee Yoo
2025-04-15 17:33 ` Stanislav Fomichev
2025-04-15 18:22 ` Mina Almasry
2025-04-15 18:59 ` Stanislav Fomichev [this message]
2025-04-16 2:59 ` Jakub Kicinski
2025-04-16 14:40 ` Stanislav Fomichev
2025-04-17 0:15 ` Jakub Kicinski
2025-04-16 15:01 ` Taehee Yoo
2025-04-17 0:35 ` Jakub Kicinski
2025-04-17 6:57 ` Taehee Yoo
2025-04-17 14:09 ` Jakub Kicinski
2025-04-18 10:46 ` Taehee Yoo
2025-04-16 15:47 ` Mina Almasry
2025-04-17 0:27 ` Jakub Kicinski
2025-04-17 21:07 ` Mina Almasry
2025-04-18 10:52 ` Taehee Yoo
2025-05-05 17:34 ` Jakub Kicinski
2025-05-06 11:41 ` Taehee Yoo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z_6snPXxWLmsNHL5@mini-arch \
--to=stfomichev@gmail.com \
--cc=almasrymina@google.com \
--cc=andrew+netdev@lunn.ch \
--cc=ap420073@gmail.com \
--cc=asml.silence@gmail.com \
--cc=davem@davemloft.net \
--cc=dw@davidwei.uk \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=kaiyuanz@google.com \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=sdf@fomichev.me \
--cc=simona.vetter@ffwll.ch \
--cc=skhawaja@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.