* Re: Deadlock in pciehp on dock disconnect
       [not found] <CAEhC_B=ksywxCG_+aQqXUrGEgKq+4mqnSV8EBHOKbC3-Obj9+Q@mail.gmail.com>
@ 2024-04-05 10:02 ` Lukas Wunner
  2024-04-05 12:59   ` vient
  2024-04-05 13:31   ` Heiner Kallweit
  0 siblings, 2 replies; 8+ messages in thread
From: Lukas Wunner @ 2024-04-05 10:02 UTC (permalink / raw)
  To: Roman Lozko
  Cc: linux-pci, Bjorn Helgaas, Dave Hansen, Sean Christopherson,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, Heiner Kallweit, Christian Marangi, Kurt Kanzenbach,
	Jesse Brandeburg, Tony Nguyen, intel-wired-lan

[cc += netdev maintainers]

On Fri, Apr 05, 2024 at 11:14:01AM +0200, Roman Lozko wrote:
> Hi, I'm using HP G4 Thunderbolt docking station, and recently (?)
> kernel started to "partially" deadlock after disconnecting the dock
> station. This results in inability to turn network interfaces on or
> off, system can't reboot, `sudo` does not work (guess because it uses
> DNS).
>
> It started to occur ~two weeks ago, don't know why, I did not change
> anything at that time. First seen on 6.8.2, nothing changed with
> 6.9.0-rc2.

This is not a pciehp issue, it's a networking issue:

In the stacktrace you've provided below, the rtnl_lock() is acquired
recursively, which leads to the deadlock:

unregister_netdev() acquires rtnl_lock(), indirectly invokes
netdev_trig_deactivate() upon unregistering some LED, thereby
calling unregister_netdevice_notifier(), which tries to
acquire rtnl_lock() again.

From a quick look at the source files involved, this doesn't look
like something new, though I note LED support for igc was added
only recently with ea578703b03d ("igc: Add support for LEDs on
i225/i226"), which went into v6.9-rc1.

The other hanging tasks are simply waiting for rtnl_lock() as well.

> pciehp stack trace:
> INFO: task irq/122-pciehp:209 blocked for more than 120 seconds.
> Not tainted 6.9.0-rc2 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:irq/122-pciehp state:D stack:0 pid:209 tgid:209 ppid:2
> flags:0x00004000
> Call Trace:
> <TASK>
> __schedule+0x5dd/0x1380
> schedule+0x6e/0xf0
> schedule_preempt_disabled+0x15/0x20
> __mutex_lock+0x2a0/0x750
> unregister_netdevice_notifier+0x40/0x150
> netdev_trig_deactivate+0x1f/0x60 [ledtrig_netdev c68f5c964fe428d1a2169816a653c62dba2f2e01]
> led_trigger_set+0x102/0x330
> led_classdev_unregister+0x4b/0x110
> release_nodes+0x3d/0xb0
> devres_release_all+0x8b/0xc0
> device_del+0x34f/0x3c0
> unregister_netdevice_many_notify+0x80b/0xaf0
> unregister_netdev+0x7c/0xd0
> igc_remove+0xd8/0x1e0 [igc d1bcf7b726f7370e167c72960cdb27ae7f970357]
> pci_device_remove+0x3f/0xb0
> device_release_driver_internal+0x1be/0x2d0
> pci_stop_bus_device+0x68/0xa0
> pci_stop_bus_device+0x39/0xa0
> pci_stop_bus_device+0x39/0xa0
> pciehp_unconfigure_device+0x12b/0x1d0
> pciehp_disable_slot+0x65/0x120
> pciehp_handle_presence_or_link_change+0x7a/0x450
> pciehp_ist+0xf5/0x320
> irq_thread_fn+0x1d/0x40
> irq_thread+0x19b/0x260
> kthread+0x147/0x160
> ret_from_fork+0x34/0x40
> ret_from_fork_asm+0x11/0x20
> </TASK>
>
> Other affected kernel threads
> INFO: task NetworkManager:1294 blocked for more than 120 seconds.
> Not tainted 6.9.0-rc2 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:NetworkManager state:D stack:0 pid:1294 tgid:1294 ppid:1
> flags:0x00000002
> Call Trace:
> <TASK>
> __schedule+0x5dd/0x1380
> schedule+0x6e/0xf0
> schedule_preempt_disabled+0x15/0x20
> __mutex_lock+0x2a0/0x750
> netlink_dump+0x1c4/0x3f0
> __netlink_dump_start+0x2b3/0x340
> rtnetlink_rcv_msg+0x469/0x4a0
> netlink_rcv_skb+0xed/0x120
> netlink_unicast+0x2ce/0x3f0
> netlink_sendmsg+0x39c/0x450
> ____sys_sendmsg+0x1a5/0x2a0
> ___sys_sendmsg+0x293/0x2d0
> __x64_sys_sendmsg+0x10d/0x140
> do_syscall_64+0x92/0x170
> entry_SYSCALL_64_after_hwframe+0x46/0x4e
> RIP: 0033:0x7971ac52c02b
> RSP: 002b:00007ffc684c09a0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 00005661e9bc5be0 RCX: 00007971ac52c02b
> RDX: 0000000000000000 RSI: 00007ffc684c09e0 RDI: 000000000000000d
> RBP: 00007ffc684c09c0 R08: 0000000000000000 R09: 0000000000000001
> R10: 0000000000000001 R11: 0000000000000293 R12: 0000000000000001
> R13: 0000000000000000 R14: 00005661e9c45030 R15: 00005661e9bc5cac
> </TASK>
> INFO: task geoclue:2325 blocked for more than 120 seconds.
> Not tainted 6.9.0-rc2 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:geoclue state:D stack:0 pid:2325 tgid:2325 ppid:1
> flags:0x00000002
> Call Trace:
> <TASK>
> __schedule+0x5dd/0x1380
> schedule+0x6e/0xf0
> schedule_preempt_disabled+0x15/0x20
> __mutex_lock+0x2a0/0x750
> netlink_dump+0x1c4/0x3f0
> __netlink_dump_start+0x2b3/0x340
> rtnetlink_rcv_msg+0x469/0x4a0
> netlink_rcv_skb+0xed/0x120
> netlink_unicast+0x2ce/0x3f0
> netlink_sendmsg+0x39c/0x450
> __sys_sendto+0x2c8/0x350
> __x64_sys_sendto+0x26/0x30
> do_syscall_64+0x92/0x170
> entry_SYSCALL_64_after_hwframe+0x46/0x4e
> RIP: 0033:0x7ad712b2beea
> RSP: 002b:00007fff94c1fd80 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ad712b2beea
> RDX: 0000000000000014 RSI: 00007fff94c1fe10 RDI: 0000000000000007
> RBP: 00007fff94c1fdb0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000004000 R11: 0000000000000246 R12: 00007fff94c1fe10
> R13: 0000000000000014 R14: 0000000000000000 R15: 0000000000000000
> </TASK>
> INFO: task pool-geoclue:84396 blocked for more than 120 seconds.
> Not tainted 6.9.0-rc2 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:pool-geoclue state:D stack:0 pid:84396 tgid:2325 ppid:1
> flags:0x00000002
> Call Trace:
> <TASK>
> __schedule+0x5dd/0x1380
> schedule+0x6e/0xf0
> schedule_preempt_disabled+0x15/0x20
> __mutex_lock+0x2a0/0x750
> netlink_dump+0x1c4/0x3f0
> __netlink_dump_start+0x2b3/0x340
> rtnetlink_rcv_msg+0x469/0x4a0
> netlink_rcv_skb+0xed/0x120
> netlink_unicast+0x2ce/0x3f0
> netlink_sendmsg+0x39c/0x450
> __sys_sendto+0x2c8/0x350
> __x64_sys_sendto+0x26/0x30
> do_syscall_64+0x92/0x170
> entry_SYSCALL_64_after_hwframe+0x46/0x4e
> RIP: 0033:0x7ad712b2c0e4
> RSP: 002b:00007ad6e7dfdf40 EFLAGS: 00000293 ORIG_RAX: 000000000000002c
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ad712b2c0e4
> RDX: 0000000000000014 RSI: 00007ad6e7dff070 RDI: 000000000000000b
> RBP: 00007ad6e7dfdf80 R08: 00007ad6e7dff014 R09: 000000000000000c
> R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000000b
> R13: 0000000000000010 R14: 00007ad6e7dff030 R15: 00000000d3fb1bea
> </TASK>
> INFO: task Qt bearer threa:4002 blocked for more than 120 seconds.
> Not tainted 6.9.0-rc2 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:Qt bearer threa state:D stack:0 pid:4002 tgid:3506
> ppid:3034 flags:0x00000002
> Call Trace:
> <TASK>
> __schedule+0x5dd/0x1380
> schedule+0x6e/0xf0
> schedule_preempt_disabled+0x15/0x20
> __mutex_lock+0x2a0/0x750
> netlink_dump+0x1c4/0x3f0
> __netlink_dump_start+0x2b3/0x340
> rtnetlink_rcv_msg+0x469/0x4a0
> netlink_rcv_skb+0xed/0x120
> netlink_unicast+0x2ce/0x3f0
> netlink_sendmsg+0x39c/0x450
> __sys_sendto+0x2c8/0x350
> __x64_sys_sendto+0x26/0x30
> do_syscall_64+0x92/0x170
> entry_SYSCALL_64_after_hwframe+0x46/0x4e
> RIP: 0033:0x76f3c692beea
> RSP: 002b:000076f3a51fecb0 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000076f3c692beea
> RDX: 0000000000000020 RSI: 000076f3a51fed60 RDI: 0000000000000023
> RBP: 000076f3a51fece0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 000076f3a51fee38
> R13: 000076f378026b30 R14: 000076f3a51fed30 R15: 000076f378026b48
> </TASK>
> INFO: task gnome-software:3529 blocked for more than 120 seconds.
> Not tainted 6.9.0-rc2 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:gnome-software state:D stack:0 pid:3529 tgid:3529
> ppid:3034 flags:0x00000002
> Call Trace:
> <TASK>
> __schedule+0x5dd/0x1380
> schedule+0x6e/0xf0
> schedule_preempt_disabled+0x15/0x20
> __mutex_lock+0x2a0/0x750
> netlink_dump+0x1c4/0x3f0
> __netlink_dump_start+0x2b3/0x340
> rtnetlink_rcv_msg+0x469/0x4a0
> netlink_rcv_skb+0xed/0x120
> netlink_unicast+0x2ce/0x3f0
> netlink_sendmsg+0x39c/0x450
> __sys_sendto+0x2c8/0x350
> __x64_sys_sendto+0x26/0x30
> do_syscall_64+0x92/0x170
> entry_SYSCALL_64_after_hwframe+0x46/0x4e
> RIP: 0033:0x7d6be892beea
> RSP: 002b:00007ffd94e01560 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007d6be892beea
> RDX: 0000000000000014 RSI: 00007ffd94e015f0 RDI: 000000000000000d
> RBP: 00007ffd94e01590 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000004000 R11: 0000000000000246 R12: 00007ffd94e015f0
> R13: 0000000000000014 R14: 0000000000000000 R15: 0000000000000000
> </TASK>
> INFO: task Qt bearer threa:3960 blocked for more than 120 seconds.
> Not tainted 6.9.0-rc2 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:Qt bearer threa state:D stack:0 pid:3960 tgid:3550
> ppid:3034 flags:0x00000002
> Call Trace:
> <TASK>
> __schedule+0x5dd/0x1380
> schedule+0x6e/0xf0
> schedule_preempt_disabled+0x15/0x20
> __mutex_lock+0x2a0/0x750
> netlink_dump+0x1c4/0x3f0
> __netlink_dump_start+0x2b3/0x340
> rtnetlink_rcv_msg+0x469/0x4a0
> netlink_rcv_skb+0xed/0x120
> netlink_unicast+0x2ce/0x3f0
> netlink_sendmsg+0x39c/0x450
> __sys_sendto+0x2c8/0x350
> __x64_sys_sendto+0x26/0x30
> do_syscall_64+0x92/0x170
> entry_SYSCALL_64_after_hwframe+0x46/0x4e
> RIP: 0033:0x777a42b2beea
> RSP: 002b:0000777a2abfecf0 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000777a42b2beea
> RDX: 0000000000000020 RSI: 0000777a2abfeda0 RDI: 000000000000001d
> RBP: 0000777a2abfed20 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000777a2abfee78
> R13: 0000777a080285b0 R14: 0000777a2abfed70 R15: 0000777a080285c8
> </TASK>
> INFO: task xdg-desktop-por:3821 blocked for more than 120 seconds.
> Not tainted 6.9.0-rc2 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:xdg-desktop-por state:D stack:0 pid:3821 tgid:3821
> ppid:2776 flags:0x00000002
> Call Trace:
> <TASK>
> __schedule+0x5dd/0x1380
> schedule+0x6e/0xf0
> schedule_preempt_disabled+0x15/0x20
> __mutex_lock+0x2a0/0x750
> netlink_dump+0x1c4/0x3f0
> __netlink_dump_start+0x2b3/0x340
> rtnetlink_rcv_msg+0x469/0x4a0
> netlink_rcv_skb+0xed/0x120
> netlink_unicast+0x2ce/0x3f0
> netlink_sendmsg+0x39c/0x450
> __sys_sendto+0x2c8/0x350
> __x64_sys_sendto+0x26/0x30
> do_syscall_64+0x92/0x170
> entry_SYSCALL_64_after_hwframe+0x46/0x4e
> RIP: 0033:0x79d76612beea
> RSP: 002b:00007ffd480942a0 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000079d76612beea
> RDX: 0000000000000014 RSI: 00007ffd48094330 RDI: 0000000000000008
> RBP: 00007ffd480942d0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000004000 R11: 0000000000000246 R12: 00007ffd48094330
> R13: 0000000000000014 R14: 0000000000000000 R15: 0000000000000000
> </TASK>
> INFO: task DNS Res~ver #11:25588 blocked for more than 120 seconds.
> Not tainted 6.9.0-rc2 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:DNS Res~ver #11 state:D stack:0 pid:25588 tgid:4934
> ppid:3070 flags:0x00000002
> Call Trace:
> <TASK>
> __schedule+0x5dd/0x1380
> schedule+0x6e/0xf0
> schedule_preempt_disabled+0x15/0x20
> __mutex_lock+0x2a0/0x750
> netlink_dump+0x1c4/0x3f0
> __netlink_dump_start+0x2b3/0x340
> rtnetlink_rcv_msg+0x469/0x4a0
> netlink_rcv_skb+0xed/0x120
> netlink_unicast+0x2ce/0x3f0
> netlink_sendmsg+0x39c/0x450
> __sys_sendto+0x2c8/0x350
> __x64_sys_sendto+0x26/0x30
> do_syscall_64+0x92/0x170
> entry_SYSCALL_64_after_hwframe+0x46/0x4e
> RIP: 0033:0x72d65892c0e4
> RSP: 002b:000072d649cbb880 EFLAGS: 00000293 ORIG_RAX: 000000000000002c
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000072d65892c0e4
> RDX: 0000000000000014 RSI: 000072d649cbc9b0 RDI: 0000000000000053
> RBP: 000072d649cbb8c0 R08: 000072d649cbc954 R09: 000000000000000c
> R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000053
> R13: 0000000000000010 R14: 000072d649cbc970 R15: 00000000b48fd654
> </TASK>
> INFO: task kworker/u88:2:31385 blocked for more than 120 seconds.
> Not tainted 6.9.0-rc2 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:kworker/u88:2 state:D stack:0 pid:31385 tgid:31385 ppid:2
> flags:0x00004000
> Workqueue: ipv6_addrconf addrconf_verify_work
> Call Trace:
> <TASK>
> __schedule+0x5dd/0x1380
> schedule+0x6e/0xf0
> schedule_preempt_disabled+0x15/0x20
> __mutex_lock+0x2a0/0x750
> addrconf_verify_work+0x20/0x30
> process_scheduled_works+0x1f4/0x450
> worker_thread+0x349/0x5e0
> kthread+0x147/0x160
> ret_from_fork+0x34/0x40
> ret_from_fork_asm+0x11/0x20
> </TASK>

^ permalink raw reply	[flat|nested] 8+ messages in thread
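
The recursive acquisition Lukas describes for the irq/122-pciehp task can be
pictured in miniature in userspace. What follows is only a sketch under
stated assumptions: every *_sketch name is a hypothetical stand-in, not a
kernel symbol, and the toy lock reports the second acquisition as -EDEADLK
where the real rtnl mutex would simply sleep forever.

```c
/* Userspace sketch; none of these names are kernel symbols. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy non-recursive lock standing in for the RTNL mutex. */
struct lock_sketch {
	bool held;
};

static struct lock_sketch rtnl_sketch;

/* Returns -EDEADLK on re-acquisition; a kernel mutex would block instead. */
static int lock_sketch_acquire(struct lock_sketch *lock)
{
	if (lock->held)
		return -EDEADLK;
	lock->held = true;
	return 0;
}

static void lock_sketch_release(struct lock_sketch *lock)
{
	assert(lock->held);	/* releasing an unheld lock is a bug */
	lock->held = false;
}

/* Stand-in for unregister_netdevice_notifier(): takes the lock itself. */
static int notifier_unregister_sketch(void)
{
	int ret = lock_sketch_acquire(&rtnl_sketch);

	if (ret)
		return ret;
	lock_sketch_release(&rtnl_sketch);
	return 0;
}

/* Stand-in for the LED unregister path ending in netdev_trig_deactivate(). */
static int led_unregister_sketch(void)
{
	return notifier_unregister_sketch();
}

/* Stand-in for unregister_netdev(): holds the lock across device teardown. */
static int netdev_unregister_sketch(void)
{
	int ret = lock_sketch_acquire(&rtnl_sketch);

	if (ret)
		return ret;
	ret = led_unregister_sketch();	/* re-enters the lock held above */
	lock_sketch_release(&rtnl_sketch);
	return ret;
}
```

In this model netdev_unregister_sketch() fails with -EDEADLK, whereas
led_unregister_sketch() called on its own, with the lock not held, succeeds.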
* Re: Re: Deadlock in pciehp on dock disconnect
  2024-04-05 10:02 ` Deadlock in pciehp on dock disconnect Lukas Wunner
@ 2024-04-05 12:59   ` vient
  2024-04-05 13:31   ` Heiner Kallweit
  1 sibling, 0 replies; 8+ messages in thread
From: vient @ 2024-04-05 12:59 UTC (permalink / raw)
  To: lukas; +Cc: netdev

I guess you are right about the 6.9-rc1 changes: I booted 6.8.2 once
more and the dock seems to disconnect fine there. So the problem
appeared after switching to 6.9.
* Re: Deadlock in pciehp on dock disconnect
  2024-04-05 10:02 ` Deadlock in pciehp on dock disconnect Lukas Wunner
  2024-04-05 12:59   ` vient
@ 2024-04-05 13:31   ` Heiner Kallweit
  2024-04-05 17:48     ` Lukas Wunner
  1 sibling, 1 reply; 8+ messages in thread
From: Heiner Kallweit @ 2024-04-05 13:31 UTC (permalink / raw)
  To: Lukas Wunner, Roman Lozko
  Cc: linux-pci, Bjorn Helgaas, Dave Hansen, Sean Christopherson,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, Christian Marangi, Kurt Kanzenbach, Jesse Brandeburg,
	Tony Nguyen, intel-wired-lan

On 05.04.2024 12:02, Lukas Wunner wrote:
> [cc += netdev maintainers]
>
> On Fri, Apr 05, 2024 at 11:14:01AM +0200, Roman Lozko wrote:
>> Hi, I'm using HP G4 Thunderbolt docking station, and recently (?)
>> kernel started to "partially" deadlock after disconnecting the dock
>> station. This results in inability to turn network interfaces on or
>> off, system can't reboot, `sudo` does not work (guess because it uses
>> DNS).
>>
>> It started to occur ~two weeks ago, don't know why, I did not change
>> anything at that time. First seen on 6.8.2, nothing changed with
>> 6.9.0-rc2.
>
> This is not a pciehp issue, it's a networking issue:
>
> In the stacktrace you've provided below, the rtnl_lock() is acquired
> recursively, which leads to the deadlock:
>
> unregister_netdev() acquires rtnl_lock(), indirectly invokes
> netdev_trig_deactivate() upon unregistering some LED, thereby
> calling unregister_netdevice_notifier(), which tries to
> acquire rtnl_lock() again.
>
> From a quick look at the source files involved, this doesn't look
> like something new, though I note LED support for igc was added
> only recently with ea578703b03d ("igc: Add support for LEDs on
> i225/i226"), which went into v6.9-rc1.
>
> The other hanging tasks are simply waiting for rtnl_lock() as well.
>
>> pciehp stack trace:
>> [...]
>
It's unfortunate that the device-managed LED is bound to the netdev device.
Wouldn't binding it to the parent (&pdev->dev) solve the issue?
* Re: Deadlock in pciehp on dock disconnect
  2024-04-05 13:31   ` Heiner Kallweit
@ 2024-04-05 17:48     ` Lukas Wunner
  2024-04-05 19:01       ` Heiner Kallweit
  2024-04-05 19:16       ` Lukas Wunner
  0 siblings, 2 replies; 8+ messages in thread
From: Lukas Wunner @ 2024-04-05 17:48 UTC (permalink / raw)
  To: Heiner Kallweit
  Cc: Roman Lozko, linux-pci, Bjorn Helgaas, Dave Hansen,
	Sean Christopherson, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, netdev, Christian Marangi,
	Kurt Kanzenbach, Jesse Brandeburg, Tony Nguyen, intel-wired-lan

On Fri, Apr 05, 2024 at 03:31:34PM +0200, Heiner Kallweit wrote:
> On 05.04.2024 12:02, Lukas Wunner wrote:
> > On Fri, Apr 05, 2024 at 11:14:01AM +0200, Roman Lozko wrote:
> > > Hi, I'm using HP G4 Thunderbolt docking station, and recently (?)
> > > kernel started to "partially" deadlock after disconnecting the dock
> > > station. This results in inability to turn network interfaces on or
> > > off, system can't reboot, `sudo` does not work (guess because it uses
> > > DNS).
> >
> > unregister_netdev() acquires rtnl_lock(), indirectly invokes
> > netdev_trig_deactivate() upon unregistering some LED, thereby
> > calling unregister_netdevice_notifier(), which tries to
> > acquire rtnl_lock() again.
> >
> > From a quick look at the source files involved, this doesn't look
> > like something new, though I note LED support for igc was added
> > only recently with ea578703b03d ("igc: Add support for LEDs on
> > i225/i226"), which went into v6.9-rc1.
>
> It's unfortunate that the device-managed LED is bound to the netdev device.
> Wouldn't binding it to the parent (&pdev->dev) solve the issue?

I'm guessing igc commit ea578703b03d copy-pasted from r8169 commit
be51ed104ba9 ("r8169: add LED support for RTL8125/RTL8126") because
that driver has exactly the same problem. :)

Roman, does the below patch fix the issue?

Note that just changing the devm_led_classdev_register() call isn't
sufficient: I'm changing the devm_kcalloc() in igc_led_setup() as well
to avoid a use-after-free (memory would already get freed on netdev
unregister but led a little later on pdev unbind).

-- >8 --

diff --git a/drivers/net/ethernet/intel/igc/igc_leds.c b/drivers/net/ethernet/intel/igc/igc_leds.c
index bf240c5..0b78c30 100644
--- a/drivers/net/ethernet/intel/igc/igc_leds.c
+++ b/drivers/net/ethernet/intel/igc/igc_leds.c
@@ -257,13 +257,13 @@ static void igc_setup_ldev(struct igc_led_classdev *ldev,
 	led_cdev->hw_control_get = igc_led_hw_control_get;
 	led_cdev->hw_control_get_device = igc_led_hw_control_get_device;

-	devm_led_classdev_register(&netdev->dev, led_cdev);
+	devm_led_classdev_register(&adapter->pdev->dev, led_cdev);
 }

 int igc_led_setup(struct igc_adapter *adapter)
 {
 	struct net_device *netdev = adapter->netdev;
-	struct device *dev = &netdev->dev;
+	struct device *dev = &adapter->pdev->dev;
 	struct igc_led_classdev *leds;
 	int i;

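
The use-after-free concern in Lukas's note (array freed on netdev unregister,
LED unregistered later on pdev unbind) can be pictured with a toy model of
device-managed release. This is only a sketch under assumptions:
dev_sketch, devm_add_sketch() and the other names are hypothetical, a single
LED is modelled, and the only devres behaviour reproduced is that a device's
release actions run in reverse order of registration when that device goes
away.

```c
/* Toy model of devres ordering; all names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ACTIONS 4

struct dev_sketch {
	void (*release[MAX_ACTIONS])(void);
	size_t count;
};

static bool leds_mem_valid;	/* models the devm_kcalloc() array */
static bool led_used_freed_mem;	/* set if the LED outlives the array */

static void free_leds_mem(void)
{
	leds_mem_valid = false;
}

static void unregister_led(void)
{
	if (!leds_mem_valid)
		led_used_freed_mem = true;
}

/* Queue a release action on a device. */
static void devm_add_sketch(struct dev_sketch *dev, void (*fn)(void))
{
	assert(dev->count < MAX_ACTIONS);
	dev->release[dev->count++] = fn;
}

/* Run a device's actions in reverse order of registration. */
static void release_all_sketch(struct dev_sketch *dev)
{
	while (dev->count)
		dev->release[--dev->count]();
}

/*
 * The LED is always bound to the PCI device; the array is bound to the
 * PCI device only if mem_on_pdev is true, otherwise to the netdev.
 * Returns true if the LED was unregistered after its memory was freed.
 */
static bool teardown_sketch(bool mem_on_pdev)
{
	struct dev_sketch netdev = { .count = 0 };
	struct dev_sketch pdev = { .count = 0 };

	leds_mem_valid = true;
	led_used_freed_mem = false;

	devm_add_sketch(mem_on_pdev ? &pdev : &netdev, free_leds_mem);
	devm_add_sketch(&pdev, unregister_led);

	release_all_sketch(&netdev);	/* netdev unregister comes first */
	release_all_sketch(&pdev);	/* pdev unbind follows */
	return led_used_freed_mem;
}
```

In this model teardown_sketch(false) reports the stale access, while
teardown_sketch(true), with both resources on the PCI device, does not.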
* Re: Deadlock in pciehp on dock disconnect
  2024-04-05 17:48     ` Lukas Wunner
@ 2024-04-05 19:01       ` Heiner Kallweit
  2024-04-05 19:16       ` Lukas Wunner
  1 sibling, 0 replies; 8+ messages in thread
From: Heiner Kallweit @ 2024-04-05 19:01 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Roman Lozko, linux-pci, Bjorn Helgaas, Dave Hansen,
	Sean Christopherson, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, netdev, Christian Marangi,
	Kurt Kanzenbach, Jesse Brandeburg, Tony Nguyen, intel-wired-lan

On 05.04.2024 19:48, Lukas Wunner wrote:
> On Fri, Apr 05, 2024 at 03:31:34PM +0200, Heiner Kallweit wrote:
>> On 05.04.2024 12:02, Lukas Wunner wrote:
>>> On Fri, Apr 05, 2024 at 11:14:01AM +0200, Roman Lozko wrote:
>>>> Hi, I'm using HP G4 Thunderbolt docking station, and recently (?)
>>>> kernel started to "partially" deadlock after disconnecting the dock
>>>> station. This results in inability to turn network interfaces on or
>>>> off, system can't reboot, `sudo` does not work (guess because it uses
>>>> DNS).
>>>
>>> unregister_netdev() acquires rtnl_lock(), indirectly invokes
>>> netdev_trig_deactivate() upon unregistering some LED, thereby
>>> calling unregister_netdevice_notifier(), which tries to
>>> acquire rtnl_lock() again.
>>>
>>> From a quick look at the source files involved, this doesn't look
>>> like something new, though I note LED support for igc was added
>>> only recently with ea578703b03d ("igc: Add support for LEDs on
>>> i225/i226"), which went into v6.9-rc1.
>>
>> It's unfortunate that the device-managed LED is bound to the netdev device.
>> Wouldn't binding it to the parent (&pdev->dev) solve the issue?
>
> I'm guessing igc commit ea578703b03d copy-pasted from r8169 commit
> be51ed104ba9 ("r8169: add LED support for RTL8125/RTL8126") because
> that driver has exactly the same problem. :)
>
Right, just tested it for r8169 and got a similar lockdep error.

> Roman, does the below patch fix the issue?
>
> Note that just changing the devm_led_classdev_register() call isn't
> sufficient: I'm changing the devm_kcalloc() in igc_led_setup() as well
> to avoid a use-after-free (memory would already get freed on netdev
> unregister but led a little later on pdev unbind).
>
> -- >8 --
>
> diff --git a/drivers/net/ethernet/intel/igc/igc_leds.c b/drivers/net/ethernet/intel/igc/igc_leds.c
> index bf240c5..0b78c30 100644
> --- a/drivers/net/ethernet/intel/igc/igc_leds.c
> +++ b/drivers/net/ethernet/intel/igc/igc_leds.c
> @@ -257,13 +257,13 @@ static void igc_setup_ldev(struct igc_led_classdev *ldev,
>  	led_cdev->hw_control_get = igc_led_hw_control_get;
>  	led_cdev->hw_control_get_device = igc_led_hw_control_get_device;
>
> -	devm_led_classdev_register(&netdev->dev, led_cdev);
> +	devm_led_classdev_register(&adapter->pdev->dev, led_cdev);
>  }
>
>  int igc_led_setup(struct igc_adapter *adapter)
>  {
>  	struct net_device *netdev = adapter->netdev;
> -	struct device *dev = &netdev->dev;
> +	struct device *dev = &adapter->pdev->dev;
>  	struct igc_led_classdev *leds;
>  	int i;
>
>
* Re: Deadlock in pciehp on dock disconnect
  2024-04-05 17:48     ` Lukas Wunner
  2024-04-05 19:01       ` Heiner Kallweit
@ 2024-04-05 19:16       ` Lukas Wunner
  2024-04-05 20:16         ` Heiner Kallweit
  2024-04-07 16:39         ` vient
  1 sibling, 2 replies; 8+ messages in thread
From: Lukas Wunner @ 2024-04-05 19:16 UTC
To: Heiner Kallweit
Cc: Roman Lozko, linux-pci, Bjorn Helgaas, Dave Hansen,
	Sean Christopherson, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, netdev, Christian Marangi,
	Kurt Kanzenbach, Jesse Brandeburg, Tony Nguyen, intel-wired-lan

On Fri, Apr 05, 2024 at 07:48:08PM +0200, Lukas Wunner wrote:
> Roman, does the below patch fix the issue?

Actually the patch in my previous e-mail was crap as the unregistering
of the LEDs happened after unbind of the pdev, i.e. after
igc_release_hw_control() and pci_disable_device().

The driver otherwise doesn't seem to be using devm_*() and with
devm_*() it's always all or nothing.  A mix of devm_*() and manual
teardown is prone to ordering issues.

Here's another attempt:

-- >8 --

diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index 90316dc58630..f9ffe9df9a96 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -298,6 +298,7 @@ struct igc_adapter {
 
 	/* LEDs */
 	struct mutex led_mutex;
+	struct igc_led_classdev *leds;
 };
 
 void igc_up(struct igc_adapter *adapter);
@@ -723,6 +724,7 @@ void igc_ptp_read(struct igc_adapter *adapter, struct timespec64 *ts);
 void igc_ptp_tx_tstamp_event(struct igc_adapter *adapter);
 
 int igc_led_setup(struct igc_adapter *adapter);
+void igc_led_teardown(struct igc_adapter *adapter);
 
 #define igc_rx_pg_size(_ring) (PAGE_SIZE << igc_rx_pg_order(_ring))
 
diff --git a/drivers/net/ethernet/intel/igc/igc_leds.c b/drivers/net/ethernet/intel/igc/igc_leds.c
index bf240c5daf86..4c2806c0878a 100644
--- a/drivers/net/ethernet/intel/igc/igc_leds.c
+++ b/drivers/net/ethernet/intel/igc/igc_leds.c
@@ -236,8 +236,8 @@ static void igc_led_get_name(struct igc_adapter *adapter, int index, char *buf,
 		 pci_dev_id(adapter->pdev), index);
 }
 
-static void igc_setup_ldev(struct igc_led_classdev *ldev,
-			   struct net_device *netdev, int index)
+static int igc_setup_ldev(struct igc_led_classdev *ldev,
+			  struct net_device *netdev, int index)
 {
 	struct igc_adapter *adapter = netdev_priv(netdev);
 	struct led_classdev *led_cdev = &ldev->led;
@@ -257,15 +257,15 @@ static void igc_setup_ldev(struct igc_led_classdev *ldev,
 	led_cdev->hw_control_get = igc_led_hw_control_get;
 	led_cdev->hw_control_get_device = igc_led_hw_control_get_device;
 
-	devm_led_classdev_register(&netdev->dev, led_cdev);
+	return led_classdev_register(&netdev->dev, led_cdev);
 }
 
 int igc_led_setup(struct igc_adapter *adapter)
 {
 	struct net_device *netdev = adapter->netdev;
-	struct device *dev = &netdev->dev;
+	struct device *dev = &adapter->pdev->dev;
 	struct igc_led_classdev *leds;
-	int i;
+	int i, ret;
 
 	mutex_init(&adapter->led_mutex);
 
@@ -273,8 +273,27 @@ int igc_led_setup(struct igc_adapter *adapter)
 	if (!leds)
 		return -ENOMEM;
 
-	for (i = 0; i < IGC_NUM_LEDS; i++)
-		igc_setup_ldev(leds + i, netdev, i);
+	for (i = 0; i < IGC_NUM_LEDS; i++) {
+		ret = igc_setup_ldev(leds + i, netdev, i);
+		if (ret)
+			goto err;
+	}
+
+	adapter->leds = leds;
 
 	return 0;
+
+err:
+	for (i--; i >= 0; i--)
+		led_classdev_unregister(&((leds + i)->led));
+	return ret;
+}
+
+void igc_led_teardown(struct igc_adapter *adapter)
+{
+	struct igc_led_classdev *leds = adapter->leds;
+	int i;
+
+	for (i = 0; i < IGC_NUM_LEDS; i++)
+		led_classdev_unregister(&((leds + i)->led));
 }
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 2e1cfbd82f4f..cd164442ab35 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -7025,6 +7025,9 @@ static void igc_remove(struct pci_dev *pdev)
 	cancel_work_sync(&adapter->watchdog_task);
 	hrtimer_cancel(&adapter->hrtimer);
 
+	if (IS_ENABLED(CONFIG_IGC_LEDS))
+		igc_led_teardown(adapter);
+
 	/* Release control of h/w to f/w.  If f/w is AMT enabled, this
 	 * would have already happened in close and is redundant.
 	 */
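The error path in the patch above follows the usual register-then-unwind shape: when registration of item i fails, only items 0..i-1 are unregistered, newest first, and a separate teardown routine undoes a fully successful setup. A minimal userspace sketch of that shape, assuming nothing about the real LED core: `sketch_led`, `SKETCH_NUM_LEDS`, the `fail_at` injection parameter and all `sketch_*` functions are hypothetical names made up for illustration.

```c
#include <errno.h>
#include <stdbool.h>

#define SKETCH_NUM_LEDS 3	/* hypothetical count, stands in for IGC_NUM_LEDS */

struct sketch_led {
	bool registered;
};

/* Hypothetical registration hook; fails on request to exercise the unwind. */
static int sketch_led_register(struct sketch_led *led, bool fail)
{
	if (fail)
		return -ENOMEM;
	led->registered = true;
	return 0;
}

static void sketch_led_unregister(struct sketch_led *led)
{
	led->registered = false;
}

/* Register all LEDs; fail_at < 0 means no failure is injected. */
static int sketch_led_setup(struct sketch_led *leds, int fail_at)
{
	int i, ret;

	for (i = 0; i < SKETCH_NUM_LEDS; i++) {
		ret = sketch_led_register(&leds[i], i == fail_at);
		if (ret)
			goto err;
	}
	return 0;

err:
	/* Undo only what succeeded, newest first. */
	for (i--; i >= 0; i--)
		sketch_led_unregister(&leds[i]);
	return ret;
}

/* Manual counterpart to a successful setup, called from the remove path. */
static void sketch_led_teardown(struct sketch_led *leds)
{
	int i;

	for (i = 0; i < SKETCH_NUM_LEDS; i++)
		sketch_led_unregister(&leds[i]);
}

/* Helper for checking the state after each step. */
static int sketch_count_registered(const struct sketch_led *leds)
{
	int i, n = 0;

	for (i = 0; i < SKETCH_NUM_LEDS; i++)
		n += leds[i].registered;
	return n;
}
```

Calling `sketch_led_setup(leds, 2)` leaves nothing registered, whereas `sketch_led_setup(leds, -1)` followed by `sketch_led_teardown(leds)` registers and then releases all three. Because teardown is an explicit call, the caller decides where in the remove sequence it runs, which is the ordering control that a devres-managed registration does not give.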
* Re: Deadlock in pciehp on dock disconnect
  2024-04-05 19:16       ` Lukas Wunner
@ 2024-04-05 20:16         ` Heiner Kallweit
  2024-04-07 16:39         ` vient
  1 sibling, 0 replies; 8+ messages in thread
From: Heiner Kallweit @ 2024-04-05 20:16 UTC
To: Lukas Wunner
Cc: Roman Lozko, linux-pci, Bjorn Helgaas, Dave Hansen,
	Sean Christopherson, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, netdev, Christian Marangi,
	Kurt Kanzenbach, Jesse Brandeburg, Tony Nguyen, intel-wired-lan

On 05.04.2024 21:16, Lukas Wunner wrote:
> On Fri, Apr 05, 2024 at 07:48:08PM +0200, Lukas Wunner wrote:
>> Roman, does the below patch fix the issue?
>
> Actually the patch in my previous e-mail was crap as the unregistering
> of the LEDs happened after unbind of the pdev, i.e. after
> igc_release_hw_control() and pci_disable_device().
>
For r8169 the first version is sufficient because everything is device-managed.

> The driver otherwise doesn't seem to be using devm_*() and with
> devm_*() it's always all or nothing.  A mix of devm_*() and manual
> teardown is prone to ordering issues.
>
> Here's another attempt:
>
> -- >8 --
>
> diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
> index 90316dc58630..f9ffe9df9a96 100644
> --- a/drivers/net/ethernet/intel/igc/igc.h
> +++ b/drivers/net/ethernet/intel/igc/igc.h
> @@ -298,6 +298,7 @@ struct igc_adapter {
>
>  	/* LEDs */
>  	struct mutex led_mutex;
> +	struct igc_led_classdev *leds;
>  };
>
>  void igc_up(struct igc_adapter *adapter);
> @@ -723,6 +724,7 @@ void igc_ptp_read(struct igc_adapter *adapter, struct timespec64 *ts);
>  void igc_ptp_tx_tstamp_event(struct igc_adapter *adapter);
>
>  int igc_led_setup(struct igc_adapter *adapter);
> +void igc_led_teardown(struct igc_adapter *adapter);
>
>  #define igc_rx_pg_size(_ring) (PAGE_SIZE << igc_rx_pg_order(_ring))
>
> diff --git a/drivers/net/ethernet/intel/igc/igc_leds.c b/drivers/net/ethernet/intel/igc/igc_leds.c
> index bf240c5daf86..4c2806c0878a 100644
> --- a/drivers/net/ethernet/intel/igc/igc_leds.c
> +++ b/drivers/net/ethernet/intel/igc/igc_leds.c
> @@ -236,8 +236,8 @@ static void igc_led_get_name(struct igc_adapter *adapter, int index, char *buf,
>  		 pci_dev_id(adapter->pdev), index);
>  }
>
> -static void igc_setup_ldev(struct igc_led_classdev *ldev,
> -			   struct net_device *netdev, int index)
> +static int igc_setup_ldev(struct igc_led_classdev *ldev,
> +			  struct net_device *netdev, int index)
>  {
>  	struct igc_adapter *adapter = netdev_priv(netdev);
>  	struct led_classdev *led_cdev = &ldev->led;
> @@ -257,15 +257,15 @@ static void igc_setup_ldev(struct igc_led_classdev *ldev,
>  	led_cdev->hw_control_get = igc_led_hw_control_get;
>  	led_cdev->hw_control_get_device = igc_led_hw_control_get_device;
>
> -	devm_led_classdev_register(&netdev->dev, led_cdev);
> +	return led_classdev_register(&netdev->dev, led_cdev);
>  }
>
>  int igc_led_setup(struct igc_adapter *adapter)
>  {
>  	struct net_device *netdev = adapter->netdev;
> -	struct device *dev = &netdev->dev;
> +	struct device *dev = &adapter->pdev->dev;
>  	struct igc_led_classdev *leds;
> -	int i;
> +	int i, ret;
>
>  	mutex_init(&adapter->led_mutex);
>
> @@ -273,8 +273,27 @@ int igc_led_setup(struct igc_adapter *adapter)
>  	if (!leds)
>  		return -ENOMEM;
>
> -	for (i = 0; i < IGC_NUM_LEDS; i++)
> -		igc_setup_ldev(leds + i, netdev, i);
> +	for (i = 0; i < IGC_NUM_LEDS; i++) {
> +		ret = igc_setup_ldev(leds + i, netdev, i);
> +		if (ret)
> +			goto err;
> +	}
> +
> +	adapter->leds = leds;
>
>  	return 0;
> +
> +err:
> +	for (i--; i >= 0; i--)
> +		led_classdev_unregister(&((leds + i)->led));
> +	return ret;
> +}
> +
> +void igc_led_teardown(struct igc_adapter *adapter)
> +{
> +	struct igc_led_classdev *leds = adapter->leds;
> +	int i;
> +
> +	for (i = 0; i < IGC_NUM_LEDS; i++)
> +		led_classdev_unregister(&((leds + i)->led));
>  }
> diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
> index 2e1cfbd82f4f..cd164442ab35 100644
> --- a/drivers/net/ethernet/intel/igc/igc_main.c
> +++ b/drivers/net/ethernet/intel/igc/igc_main.c
> @@ -7025,6 +7025,9 @@ static void igc_remove(struct pci_dev *pdev)
>  	cancel_work_sync(&adapter->watchdog_task);
>  	hrtimer_cancel(&adapter->hrtimer);
>
> +	if (IS_ENABLED(CONFIG_IGC_LEDS))
> +		igc_led_teardown(adapter);
> +
>  	/* Release control of h/w to f/w.  If f/w is AMT enabled, this
>  	 * would have already happened in close and is redundant.
>  	 */
>
* Re: Deadlock in pciehp on dock disconnect
  2024-04-05 19:16       ` Lukas Wunner
  2024-04-05 20:16         ` Heiner Kallweit
@ 2024-04-07 16:39         ` vient
  1 sibling, 0 replies; 8+ messages in thread
From: vient @ 2024-04-07 16:39 UTC
To: lukas
Cc: netdev, hkallweit1

Did not notice second version until testing the first one. First one
worked, no more hangups.
Thread overview: 8+ messages
     [not found] <CAEhC_B=ksywxCG_+aQqXUrGEgKq+4mqnSV8EBHOKbC3-Obj9+Q@mail.gmail.com>
2024-04-05 10:02 ` Deadlock in pciehp on dock disconnect Lukas Wunner
2024-04-05 12:59   ` vient
2024-04-05 13:31   ` Heiner Kallweit
2024-04-05 17:48     ` Lukas Wunner
2024-04-05 19:01       ` Heiner Kallweit
2024-04-05 19:16       ` Lukas Wunner
2024-04-05 20:16         ` Heiner Kallweit
2024-04-07 16:39         ` vient