public inbox for linux-tegra@vger.kernel.org
* Re: [PATCH 3/3] usb: gadget: f_ncm: align net_device lifecycle with bind/unbind
       [not found] ` <20251230-ncm-refactor-v1-3-793e347bc7a7@google.com>
@ 2026-03-09 10:20   ` Jon Hunter
  2026-03-09 10:37     ` Kuen-Han Tsai
From: Jon Hunter @ 2026-03-09 10:20 UTC
  To: Kuen-Han Tsai, Greg Kroah-Hartman, Felipe Balbi, Prashanth K,
	Kyungmin Park, Andrzej Pietrasiewicz
  Cc: linux-usb, linux-kernel, stable, linux-tegra@vger.kernel.org



On 30/12/2025 10:13, Kuen-Han Tsai wrote:
> Currently, the net_device is allocated in ncm_alloc_inst() and freed in
> ncm_free_inst(). This ties the network interface's lifetime to the
> configuration instance rather than the USB connection (bind/unbind).
> 
> This decoupling causes issues when the USB gadget is disconnected where
> the underlying gadget device is removed. The net_device can outlive its
> parent, leading to dangling sysfs links and NULL pointer dereferences
> when accessing the freed gadget device.
> 
> Problem 1: NULL pointer dereference on disconnect
>   Unable to handle kernel NULL pointer dereference at virtual address
>   0000000000000000
>   Call trace:
>     __pi_strlen+0x14/0x150
>     rtnl_fill_ifinfo+0x6b4/0x708
>     rtmsg_ifinfo_build_skb+0xd8/0x13c
>     rtmsg_ifinfo+0x50/0xa0
>     __dev_notify_flags+0x4c/0x1f0
>     dev_change_flags+0x54/0x70
>     do_setlink+0x390/0xebc
>     rtnl_newlink+0x7d0/0xac8
>     rtnetlink_rcv_msg+0x27c/0x410
>     netlink_rcv_skb+0x134/0x150
>     rtnetlink_rcv+0x18/0x28
>     netlink_unicast+0x254/0x3f0
>     netlink_sendmsg+0x2e0/0x3d4
> 
> Problem 2: Dangling sysfs symlinks
>   console:/ # ls -l /sys/class/net/ncm0
>   lrwxrwxrwx ... /sys/class/net/ncm0 ->
>   /sys/devices/platform/.../gadget.0/net/ncm0
>   console:/ # ls -l /sys/devices/platform/.../gadget.0/net/ncm0
>   ls: .../gadget.0/net/ncm0: No such file or directory
> 
> Move the net_device allocation to ncm_bind() and deallocation to
> ncm_unbind(). This ensures the network interface exists only when the
> gadget function is actually bound to a configuration.
> 
> To support pre-bind configuration (e.g., setting interface name or MAC
> address via configfs), cache user-provided options in f_ncm_opts
> using the gether_opts structure. Apply these cached settings to the
> net_device upon creation in ncm_bind().
> 
> Preserve the use-after-free fix from commit 6334b8e4553c ("usb: gadget:
> f_ncm: Fix UAF ncm object at re-bind after usb ep transport error").
> Check opts->net in ncm_set_alt() and ncm_disable() to ensure
> gether_disconnect() runs only if a connection was established.
> 
> Fixes: 40d133d7f542 ("usb: gadget: f_ncm: convert to new function interface with backward compatibility")
> Cc: stable@kernel.org
> Signed-off-by: Kuen-Han Tsai <khtsai@google.com>


I see you have sent a revert for this series now, but I wanted to let
you know that this change was also triggering the following warning ...

  BUG: sleeping function called from invalid context at kernel/locking/mutex.c:287
  tegra-xudc 3550000.usb: EP 11 (type: bulk, dir: in) enabled
  in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/0
  preempt_count: 10003, expected: 0
  tegra-xudc 3550000.usb: EP 6 (type: bulk, dir: out) enabled
  RCU nest depth: 0, expected: 0
  CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: G           OE       7.0.0-rc2-debug-tegra #1 PREEMPT
  Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
  Hardware name: NVIDIA NVIDIA Jetson AGX Orin Developer Kit/Jetson, BIOS buildbrain-gcid-44366467 03/05/2026
  Call trace:
   show_stack+0x20/0x40 (C)
   dump_stack_lvl+0x7c/0xa0
   dump_stack+0x18/0x30
   __might_resched+0x128/0x198
   __might_sleep+0x64/0xd0
   mutex_lock+0x2c/0xe8
   0xffff80007eaedf84
   composite_setup+0xb30/0x2010 [libcomposite]
   usb_function_register+0x20e0/0x2c28 [libcomposite]
   0xffff80007cf7ba20
   0xffff80007cf7cb84
   __handle_irq_event_percpu+0x64/0x3d8
   handle_irq_event+0x54/0x110
   handle_fasteoi_irq+0x114/0x1c0
   handle_irq_desc+0x50/0x90
   generic_handle_domain_irq+0x20/0x40
   gic_handle_irq+0x54/0x180
   call_on_irq_stack+0x30/0x48
   do_interrupt_handler+0x90/0xb0
   el1_interrupt+0x3c/0x68
   el1h_64_irq_handler+0x18/0x30
   el1h_64_irq+0x70/0x78
   cpuidle_enter_state+0xcc/0x950 (P)
   cpuidle_enter+0x40/0x68
   do_idle+0x1fc/0x298
   cpu_startup_entry+0x3c/0x50
   rest_init+0x100/0x120
   start_kernel+0x760/0x908
   __primary_switched+0x88/0x98
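
For reference, the "preempt_count: 10003" line in the warning is printed in
hexadecimal and packs several nesting counters into one word. The sketch
below decodes it in plain userspace C. The mask values are an assumption
taken from the layout comment in include/linux/preempt.h and may differ
between kernel versions and configurations; the struct and function names
are hypothetical and exist only for this illustration.

```c
#include <stdio.h>

/*
 * Field layout assumed from the comment block in include/linux/preempt.h.
 * Treat these masks as an assumption, not as a stable kernel ABI.
 */
#define PREEMPT_MASK 0x000000ffu /* preempt_disable() nesting depth */
#define SOFTIRQ_MASK 0x0000ff00u /* softirq / bh-disable field */
#define HARDIRQ_MASK 0x000f0000u /* hard interrupt nesting */
#define NMI_MASK     0x00f00000u /* NMI nesting */

/* Hypothetical helper type: one member per preempt_count field. */
struct preempt_fields {
	unsigned int preempt;
	unsigned int softirq;
	unsigned int hardirq;
	unsigned int nmi;
};

/* Split a raw preempt_count value into its individual fields. */
struct preempt_fields decode_preempt_count(unsigned int pc)
{
	struct preempt_fields f;

	f.preempt = pc & PREEMPT_MASK;
	f.softirq = (pc & SOFTIRQ_MASK) >> 8;
	f.hardirq = (pc & HARDIRQ_MASK) >> 16;
	f.nmi = (pc & NMI_MASK) >> 20;
	return f;
}

/* Print the decoded fields of a raw preempt_count value on one line. */
void print_preempt_count(unsigned int pc)
{
	struct preempt_fields f = decode_preempt_count(pc);

	printf("preempt=%u softirq=%u hardirq=%u nmi=%u\n",
	       f.preempt, f.softirq, f.hardirq, f.nmi);
}
```

Calling print_preempt_count(0x10003) reports a hardirq field of 1 and a
preempt-disable depth of 3. That matches the rest of the report:
in_atomic() and irqs_disabled() are both 1, and the call trace reaches
mutex_lock() from the interrupt path (el1h_64_irq ... composite_setup),
where sleeping is not allowed.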

Jon

-- 
nvpublic



* Re: [PATCH 3/3] usb: gadget: f_ncm: align net_device lifecycle with bind/unbind
  2026-03-09 10:20   ` [PATCH 3/3] usb: gadget: f_ncm: align net_device lifecycle with bind/unbind Jon Hunter
@ 2026-03-09 10:37     ` Kuen-Han Tsai
From: Kuen-Han Tsai @ 2026-03-09 10:37 UTC
  To: Jon Hunter
  Cc: Greg Kroah-Hartman, Felipe Balbi, Prashanth K, Kyungmin Park,
	Andrzej Pietrasiewicz, linux-usb, linux-kernel, stable,
	linux-tegra@vger.kernel.org

Hi Jon,

On Mon, Mar 9, 2026 at 6:20 PM Jon Hunter <jonathanh@nvidia.com> wrote:
>
>
>
> On 30/12/2025 10:13, Kuen-Han Tsai wrote:
> > Currently, the net_device is allocated in ncm_alloc_inst() and freed in
> > ncm_free_inst(). This ties the network interface's lifetime to the
> > configuration instance rather than the USB connection (bind/unbind).
> >
> > This decoupling causes issues when the USB gadget is disconnected where
> > the underlying gadget device is removed. The net_device can outlive its
> > parent, leading to dangling sysfs links and NULL pointer dereferences
> > when accessing the freed gadget device.
> >
> > Problem 1: NULL pointer dereference on disconnect
> >   Unable to handle kernel NULL pointer dereference at virtual address
> >   0000000000000000
> >   Call trace:
> >     __pi_strlen+0x14/0x150
> >     rtnl_fill_ifinfo+0x6b4/0x708
> >     rtmsg_ifinfo_build_skb+0xd8/0x13c
> >     rtmsg_ifinfo+0x50/0xa0
> >     __dev_notify_flags+0x4c/0x1f0
> >     dev_change_flags+0x54/0x70
> >     do_setlink+0x390/0xebc
> >     rtnl_newlink+0x7d0/0xac8
> >     rtnetlink_rcv_msg+0x27c/0x410
> >     netlink_rcv_skb+0x134/0x150
> >     rtnetlink_rcv+0x18/0x28
> >     netlink_unicast+0x254/0x3f0
> >     netlink_sendmsg+0x2e0/0x3d4
> >
> > Problem 2: Dangling sysfs symlinks
> >   console:/ # ls -l /sys/class/net/ncm0
> >   lrwxrwxrwx ... /sys/class/net/ncm0 ->
> >   /sys/devices/platform/.../gadget.0/net/ncm0
> >   console:/ # ls -l /sys/devices/platform/.../gadget.0/net/ncm0
> >   ls: .../gadget.0/net/ncm0: No such file or directory
> >
> > Move the net_device allocation to ncm_bind() and deallocation to
> > ncm_unbind(). This ensures the network interface exists only when the
> > gadget function is actually bound to a configuration.
> >
> > To support pre-bind configuration (e.g., setting interface name or MAC
> > address via configfs), cache user-provided options in f_ncm_opts
> > using the gether_opts structure. Apply these cached settings to the
> > net_device upon creation in ncm_bind().
> >
> > Preserve the use-after-free fix from commit 6334b8e4553c ("usb: gadget:
> > f_ncm: Fix UAF ncm object at re-bind after usb ep transport error").
> > Check opts->net in ncm_set_alt() and ncm_disable() to ensure
> > gether_disconnect() runs only if a connection was established.
> >
> > Fixes: 40d133d7f542 ("usb: gadget: f_ncm: convert to new function interface with backward compatibility")
> > Cc: stable@kernel.org
> > Signed-off-by: Kuen-Han Tsai <khtsai@google.com>
>
>
> I see you have sent a revert for this series now, but I wanted to let
> you know that this change was also triggering the following warning ...

Thanks for catching this. I sent a fix [1] for this specific warning,
but since the overall solution caused a regression on pmOS (postmarketOS), I reverted
the entire series including that fix.

Fortunately, the regression David reported gave me a much clearer
picture of how the network device interacts with f_ncm. I've
implemented a new solution and will send it out shortly.

Thanks again for testing and helping catch this!

Regards,
Kuen-Han

[1] https://lore.kernel.org/linux-usb/20260221-legacy-ncm-v2-2-dfb891d76507@google.com/

>
>   BUG: sleeping function called from invalid context at kernel/locking/mutex.c:287
>   tegra-xudc 3550000.usb: EP 11 (type: bulk, dir: in) enabled
>   in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/0
>   preempt_count: 10003, expected: 0
>   tegra-xudc 3550000.usb: EP 6 (type: bulk, dir: out) enabled
>   RCU nest depth: 0, expected: 0
>   CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: G           OE       7.0.0-rc2-debug-tegra #1 PREEMPT
>   Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
>   Hardware name: NVIDIA NVIDIA Jetson AGX Orin Developer Kit/Jetson, BIOS buildbrain-gcid-44366467 03/05/2026
>   Call trace:
>    show_stack+0x20/0x40 (C)
>    dump_stack_lvl+0x7c/0xa0
>    dump_stack+0x18/0x30
>    __might_resched+0x128/0x198
>    __might_sleep+0x64/0xd0
>    mutex_lock+0x2c/0xe8
>    0xffff80007eaedf84
>    composite_setup+0xb30/0x2010 [libcomposite]
>    usb_function_register+0x20e0/0x2c28 [libcomposite]
>    0xffff80007cf7ba20
>    0xffff80007cf7cb84
>    __handle_irq_event_percpu+0x64/0x3d8
>    handle_irq_event+0x54/0x110
>    handle_fasteoi_irq+0x114/0x1c0
>    handle_irq_desc+0x50/0x90
>    generic_handle_domain_irq+0x20/0x40
>    gic_handle_irq+0x54/0x180
>    call_on_irq_stack+0x30/0x48
>    do_interrupt_handler+0x90/0xb0
>    el1_interrupt+0x3c/0x68
>    el1h_64_irq_handler+0x18/0x30
>    el1h_64_irq+0x70/0x78
>    cpuidle_enter_state+0xcc/0x950 (P)
>    cpuidle_enter+0x40/0x68
>    do_idle+0x1fc/0x298
>    cpu_startup_entry+0x3c/0x50
>    rest_init+0x100/0x120
>    start_kernel+0x760/0x908
>    __primary_switched+0x88/0x98
>
> Jon
>
> --
> nvpublic
>
