linux-wireless.vger.kernel.org archive mirror
* [BUG] deadlock in nl80211_vendor_cmd
From: willmcvicker @ 2022-03-17 17:09 UTC
  To: Johannes Berg, linux-wireless; +Cc: Marek Szyprowski

Hi,

I wanted to report a deadlock that I'm hitting as a result of the upstream
commit a05829a7222e ("cfg80211: avoid holding the RTNL when calling the
driver"). I'm using the Pixel 6 with a downstream version of the 5.15 kernel,
but I'm pretty sure this will happen on the upstream tip-of-tree kernel as
well.

Basically, my wlan driver uses the wiphy_vendor_command ops to handle
a number of vendor-specific operations. One of them deletes a cfg80211
interface. The deadlock happens when thread 1, which still holds the
wiphy lock taken in nl80211_pre_doit(), tries to take the RTNL lock before
calling cfg80211_unregister_device(), while thread 2 is inside
nl80211_pre_doit(), holding the RTNL lock and waiting on wiphy_lock().

Here is the call flow:

Thread 1:                         Thread 2:

nl80211_pre_doit():
  -> rtnl_lock()
                                      nl80211_pre_doit():
                                       -> rtnl_lock()
                                       -> <blocked by Thread 1>
  -> wiphy_lock()
  -> rtnl_unlock()
  -> <unblock Thread 2>
exit nl80211_pre_doit()
                                       <Thread 2 got the RTNL lock>
                                       -> wiphy_lock()
                                       -> <blocked by Thread 1>
nl80211_doit()
  -> nl80211_vendor_cmd()
      -> rtnl_lock() <DEADLOCK>
      -> cfg80211_unregister_device()
      -> rtnl_unlock()
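
For illustration only, the ordering above can be replayed in userspace with
two pthread mutexes standing in for the RTNL and the wiphy lock. This is
a rough sketch, not kernel code: the names rtnl, wiphy_mtx and
replay_vendor_cmd_interleaving() are made up for this example, and both
"threads" are replayed sequentially from one caller. It uses
pthread_mutex_trylock() so that the would-be deadlock is reported instead
of actually hanging.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's RTNL and wiphy locks. */
static pthread_mutex_t rtnl = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t wiphy_mtx = PTHREAD_MUTEX_INITIALIZER;

/*
 * Replays the call flow step by step. Returns true if, at the end,
 * each side is waiting for a lock that the other side holds.
 */
bool replay_vendor_cmd_interleaving(void)
{
	bool t1_blocked, t2_blocked;

	/* Thread 1, pre_doit: rtnl_lock(), wiphy_lock(), rtnl_unlock(). */
	pthread_mutex_lock(&rtnl);
	pthread_mutex_lock(&wiphy_mtx);
	pthread_mutex_unlock(&rtnl);
	/* Thread 1 now holds only the wiphy lock. */

	/* Thread 2, pre_doit: rtnl_lock() succeeds now that it is free. */
	pthread_mutex_lock(&rtnl);

	/* Thread 2, pre_doit: wiphy_lock() would block on Thread 1. */
	t2_blocked = (pthread_mutex_trylock(&wiphy_mtx) == EBUSY);

	/* Thread 1, vendor command: rtnl_lock() would block on Thread 2. */
	t1_blocked = (pthread_mutex_trylock(&rtnl) == EBUSY);

	/* Release everything so the model can be replayed again. */
	pthread_mutex_unlock(&rtnl);
	pthread_mutex_unlock(&wiphy_mtx);

	return t1_blocked && t2_blocked;
}
```

Calling replay_vendor_cmd_interleaving() from a small main() and printing
the result is enough to try it. With real threads and blocking
pthread_mutex_lock() calls in the same order, neither side could make
progress, which is the classic ABBA lock inversion.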


For completeness, here are the kernel call traces when the deadlock occurs:

Thread 1 Call trace:
    <Take rtnl before calling cfg80211_unregister_device()>
    nl80211_vendor_cmd+0x210/0x218
    genl_rcv_msg+0x3ac/0x45c
    netlink_rcv_skb+0x130/0x168
    genl_rcv+0x38/0x54
    netlink_unicast_kernel+0xe4/0x1f4
    netlink_unicast+0x128/0x21c
    netlink_sendmsg+0x2d8/0x3d8

Thread 2 Call trace:
    <Take wiphy_lock>
    nl80211_pre_doit+0x1b0/0x250
    genl_rcv_msg+0x37c/0x45c
    netlink_rcv_skb+0x130/0x168
    genl_rcv+0x38/0x54
    netlink_unicast_kernel+0xe4/0x1f4
    netlink_unicast+0x128/0x21c
    netlink_sendmsg+0x2d8/0x3d8

I'm not a networking expert, so my main question is whether I'm allowed to
take the RTNL lock inside the nl80211_vendor_cmd callbacks. If so, then
regardless of why I take it, we shouldn't be allowing this deadlock
situation, right?

I hope that helps explain the issue. Let me know if you need any more
details.

Thanks,
Will


