* Re: [PATCH 3/3] usb: gadget: f_ncm: align net_device lifecycle with bind/unbind
[not found] ` <20251230-ncm-refactor-v1-3-793e347bc7a7@google.com>
@ 2026-03-09 10:20 ` Jon Hunter
2026-03-09 10:37 ` Kuen-Han Tsai
0 siblings, 1 reply; 2+ messages in thread
From: Jon Hunter @ 2026-03-09 10:20 UTC
To: Kuen-Han Tsai, Greg Kroah-Hartman, Felipe Balbi, Prashanth K,
Kyungmin Park, Andrzej Pietrasiewicz
Cc: linux-usb, linux-kernel, stable, linux-tegra@vger.kernel.org
On 30/12/2025 10:13, Kuen-Han Tsai wrote:
> Currently, the net_device is allocated in ncm_alloc_inst() and freed in
> ncm_free_inst(). This ties the network interface's lifetime to the
> configuration instance rather than the USB connection (bind/unbind).
>
> This decoupling causes issues when the USB gadget is disconnected and
> the underlying gadget device is removed. The net_device can outlive its
> parent, leading to dangling sysfs links and NULL pointer dereferences
> when accessing the freed gadget device.
>
> Problem 1: NULL pointer dereference on disconnect
> Unable to handle kernel NULL pointer dereference at virtual address
> 0000000000000000
> Call trace:
> __pi_strlen+0x14/0x150
> rtnl_fill_ifinfo+0x6b4/0x708
> rtmsg_ifinfo_build_skb+0xd8/0x13c
> rtmsg_ifinfo+0x50/0xa0
> __dev_notify_flags+0x4c/0x1f0
> dev_change_flags+0x54/0x70
> do_setlink+0x390/0xebc
> rtnl_newlink+0x7d0/0xac8
> rtnetlink_rcv_msg+0x27c/0x410
> netlink_rcv_skb+0x134/0x150
> rtnetlink_rcv+0x18/0x28
> netlink_unicast+0x254/0x3f0
> netlink_sendmsg+0x2e0/0x3d4
>
> Problem 2: Dangling sysfs symlinks
> console:/ # ls -l /sys/class/net/ncm0
> lrwxrwxrwx ... /sys/class/net/ncm0 ->
> /sys/devices/platform/.../gadget.0/net/ncm0
> console:/ # ls -l /sys/devices/platform/.../gadget.0/net/ncm0
> ls: .../gadget.0/net/ncm0: No such file or directory
>
> Move the net_device allocation to ncm_bind() and deallocation to
> ncm_unbind(). This ensures the network interface exists only when the
> gadget function is actually bound to a configuration.
>
> To support pre-bind configuration (e.g., setting interface name or MAC
> address via configfs), cache user-provided options in f_ncm_opts
> using the gether_opts structure. Apply these cached settings to the
> net_device upon creation in ncm_bind().
>
> Preserve the use-after-free fix from commit 6334b8e4553c ("usb: gadget:
> f_ncm: Fix UAF ncm object at re-bind after usb ep transport error").
> Check opts->net in ncm_set_alt() and ncm_disable() to ensure
> gether_disconnect() runs only if a connection was established.
>
> Fixes: 40d133d7f542 ("usb: gadget: f_ncm: convert to new function interface with backward compatibility")
> Cc: stable@kernel.org
> Signed-off-by: Kuen-Han Tsai <khtsai@google.com>
I see you have sent a revert for this series now, but I wanted to let
you know that this change was also triggering the following warning ...
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:287
tegra-xudc 3550000.usb: EP 11 (type: bulk, dir: in) enabled
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/0
preempt_count: 10003, expected: 0
tegra-xudc 3550000.usb: EP 6 (type: bulk, dir: out) enabled
RCU nest depth: 0, expected: 0
CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: G OE 7.0.0-rc2-debug-tegra #1 PREEMPT
Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: NVIDIA NVIDIA Jetson AGX Orin Developer Kit/Jetson, BIOS buildbrain-gcid-44366467 03/05/2026
Call trace:
show_stack+0x20/0x40 (C)
dump_stack_lvl+0x7c/0xa0
dump_stack+0x18/0x30
__might_resched+0x128/0x198
__might_sleep+0x64/0xd0
mutex_lock+0x2c/0xe8
0xffff80007eaedf84
composite_setup+0xb30/0x2010 [libcomposite]
usb_function_register+0x20e0/0x2c28 [libcomposite]
0xffff80007cf7ba20
0xffff80007cf7cb84
__handle_irq_event_percpu+0x64/0x3d8
handle_irq_event+0x54/0x110
handle_fasteoi_irq+0x114/0x1c0
handle_irq_desc+0x50/0x90
generic_handle_domain_irq+0x20/0x40
gic_handle_irq+0x54/0x180
call_on_irq_stack+0x30/0x48
do_interrupt_handler+0x90/0xb0
el1_interrupt+0x3c/0x68
el1h_64_irq_handler+0x18/0x30
el1h_64_irq+0x70/0x78
cpuidle_enter_state+0xcc/0x950 (P)
cpuidle_enter+0x40/0x68
do_idle+0x1fc/0x298
cpu_startup_entry+0x3c/0x50
rest_init+0x100/0x120
start_kernel+0x760/0x908
__primary_switched+0x88/0x98
Jon
--
nvpublic
* Re: [PATCH 3/3] usb: gadget: f_ncm: align net_device lifecycle with bind/unbind
From: Kuen-Han Tsai @ 2026-03-09 10:37 UTC
To: Jon Hunter
Cc: Greg Kroah-Hartman, Felipe Balbi, Prashanth K, Kyungmin Park,
Andrzej Pietrasiewicz, linux-usb, linux-kernel, stable,
linux-tegra@vger.kernel.org
Hi Jon,
On Mon, Mar 9, 2026 at 6:20 PM Jon Hunter <jonathanh@nvidia.com> wrote:
>
>
>
> On 30/12/2025 10:13, Kuen-Han Tsai wrote:
> > Currently, the net_device is allocated in ncm_alloc_inst() and freed in
> > ncm_free_inst(). This ties the network interface's lifetime to the
> > configuration instance rather than the USB connection (bind/unbind).
> >
> > This decoupling causes issues when the USB gadget is disconnected and
> > the underlying gadget device is removed. The net_device can outlive its
> > parent, leading to dangling sysfs links and NULL pointer dereferences
> > when accessing the freed gadget device.
> >
> > Problem 1: NULL pointer dereference on disconnect
> > Unable to handle kernel NULL pointer dereference at virtual address
> > 0000000000000000
> > Call trace:
> > __pi_strlen+0x14/0x150
> > rtnl_fill_ifinfo+0x6b4/0x708
> > rtmsg_ifinfo_build_skb+0xd8/0x13c
> > rtmsg_ifinfo+0x50/0xa0
> > __dev_notify_flags+0x4c/0x1f0
> > dev_change_flags+0x54/0x70
> > do_setlink+0x390/0xebc
> > rtnl_newlink+0x7d0/0xac8
> > rtnetlink_rcv_msg+0x27c/0x410
> > netlink_rcv_skb+0x134/0x150
> > rtnetlink_rcv+0x18/0x28
> > netlink_unicast+0x254/0x3f0
> > netlink_sendmsg+0x2e0/0x3d4
> >
> > Problem 2: Dangling sysfs symlinks
> > console:/ # ls -l /sys/class/net/ncm0
> > lrwxrwxrwx ... /sys/class/net/ncm0 ->
> > /sys/devices/platform/.../gadget.0/net/ncm0
> > console:/ # ls -l /sys/devices/platform/.../gadget.0/net/ncm0
> > ls: .../gadget.0/net/ncm0: No such file or directory
> >
> > Move the net_device allocation to ncm_bind() and deallocation to
> > ncm_unbind(). This ensures the network interface exists only when the
> > gadget function is actually bound to a configuration.
> >
> > To support pre-bind configuration (e.g., setting interface name or MAC
> > address via configfs), cache user-provided options in f_ncm_opts
> > using the gether_opts structure. Apply these cached settings to the
> > net_device upon creation in ncm_bind().
> >
> > Preserve the use-after-free fix from commit 6334b8e4553c ("usb: gadget:
> > f_ncm: Fix UAF ncm object at re-bind after usb ep transport error").
> > Check opts->net in ncm_set_alt() and ncm_disable() to ensure
> > gether_disconnect() runs only if a connection was established.
> >
> > Fixes: 40d133d7f542 ("usb: gadget: f_ncm: convert to new function interface with backward compatibility")
> > Cc: stable@kernel.org
> > Signed-off-by: Kuen-Han Tsai <khtsai@google.com>
>
>
> I see you have sent a revert for this series now, but I wanted to let
> you know that this change was also triggering the following warning ...
Thanks for catching this. I sent a fix [1] for this specific warning,
but since the overall solution caused a regression on pmOS, I reverted
the entire series including that fix.
Fortunately, the regression David reported gave me a much clearer
picture of how the network device interacts with f_ncm. I've
implemented a new solution and will send it out shortly.
Thanks again for testing and helping catch this!
Regards,
Kuen-Han
[1] https://lore.kernel.org/linux-usb/20260221-legacy-ncm-v2-2-dfb891d76507@google.com/
>
> BUG: sleeping function called from invalid context at kernel/locking/mutex.c:287
> tegra-xudc 3550000.usb: EP 11 (type: bulk, dir: in) enabled
> in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/0
> preempt_count: 10003, expected: 0
> tegra-xudc 3550000.usb: EP 6 (type: bulk, dir: out) enabled
> RCU nest depth: 0, expected: 0
> CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: G OE 7.0.0-rc2-debug-tegra #1 PREEMPT
> Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> Hardware name: NVIDIA NVIDIA Jetson AGX Orin Developer Kit/Jetson, BIOS buildbrain-gcid-44366467 03/05/2026
> Call trace:
> show_stack+0x20/0x40 (C)
> dump_stack_lvl+0x7c/0xa0
> dump_stack+0x18/0x30
> __might_resched+0x128/0x198
> __might_sleep+0x64/0xd0
> mutex_lock+0x2c/0xe8
> 0xffff80007eaedf84
> composite_setup+0xb30/0x2010 [libcomposite]
> usb_function_register+0x20e0/0x2c28 [libcomposite]
> 0xffff80007cf7ba20
> 0xffff80007cf7cb84
> __handle_irq_event_percpu+0x64/0x3d8
> handle_irq_event+0x54/0x110
> handle_fasteoi_irq+0x114/0x1c0
> handle_irq_desc+0x50/0x90
> generic_handle_domain_irq+0x20/0x40
> gic_handle_irq+0x54/0x180
> call_on_irq_stack+0x30/0x48
> do_interrupt_handler+0x90/0xb0
> el1_interrupt+0x3c/0x68
> el1h_64_irq_handler+0x18/0x30
> el1h_64_irq+0x70/0x78
> cpuidle_enter_state+0xcc/0x950 (P)
> cpuidle_enter+0x40/0x68
> do_idle+0x1fc/0x298
> cpu_startup_entry+0x3c/0x50
> rest_init+0x100/0x120
> start_kernel+0x760/0x908
> __primary_switched+0x88/0x98
>
> Jon
>
> --
> nvpublic
>