netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Issue with rtnl_lock in wg_pm_notification
@ 2025-06-16 20:14 Sharath Chandra Vurukala
  2025-06-18 18:47 ` Jakub Kicinski
  0 siblings, 1 reply; 2+ messages in thread
From: Sharath Chandra Vurukala @ 2025-06-16 20:14 UTC (permalink / raw)
  To: Jason A. Donenfeld", wireguard, netdev

Hi All,
I work on rmnet driver(more details at https://docs.kernel.org/networking/device_drivers/cellular/qualcomm/rmnet.html), This driver registers onto any physical network device in IP mode.
This driver registers/deregisters for PM notifications based on the NETDEV_REGISTER/NETDEV_UNREGISTER notifications received for the physical netdevice.

In one of the tests, involving a frequent system suspend/resume and physical netdevice registration/de-registration, a deadlock has been observed involving the rtnl_lock and the pm_chain_head->rwsem.
The sequence involves
Thread1: unregister_netdev(holds rtnl_lock)-> notification sent to rmnet driver( netdev_chain) -> rmnet driver attempts to unregister for pm_notification( involves acquiring pm_chain_head->rwsem).
Thread2: pm_suspend -> pm_notofier_call_chain(holds pm_chain_head->rwsem)-> wg_pm_notifcation(trying to acquire rtnl_lock)

I do not understand fully what wireguard functionality is, but considering that rtnl_lock is a global one, it does not seem to be a good design to have notification callback acquire this lock.
Can we check if rthl_lock be avoided here OR 
Use rtnl_trylock instead of using rtnl_lock and consider deferring the work to a context where it is safe to acquire the lock( may be a workqueue), something like below to avoid the deadlock?

static int wg_pm_notification(struct notifier_block *nb, unsigned long action, void *data)
{
          struct wg_device *wg;
          struct wg_peer *peer;

          /* If the machine is constantly suspending and resuming, as part of
           * its normal operation rather than as a somewhat rare event, then we
           * don't actually want to clear keys.
           */
          if (IS_ENABLED(CONFIG_PM_AUTOSLEEP) ||
              IS_ENABLED(CONFIG_PM_USERSPACE_AUTOSLEEP))
                          return 0;

          if (action != PM_HIBERNATION_PREPARE && action != PM_SUSPEND_PREPARE)
                          return 0;

          if (rtnl_trylock()) {
              list_for_each_entry(wg, &device_list, device_list) {
                              mutex_lock(&wg->device_update_lock);
                              list_for_each_entry(peer, &wg->peer_list, peer_list) {
                                              del_timer(&peer->timer_zero_key_material);
                                              wg_noise_handshake_clear(&peer->handshake);
                                              wg_noise_keypairs_clear(&peer->keypairs);
                              }
                              mutex_unlock(&wg->device_update_lock);
              }

          } else {

                /* schedule a workqueue to delete timers clear up the keypairs of the peers?  */

          }

          rtnl_unlock();
          rcu_barrier();
}

Thanks,
Sharath



^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: Issue with rtnl_lock in wg_pm_notification
  2025-06-16 20:14 Issue with rtnl_lock in wg_pm_notification Sharath Chandra Vurukala
@ 2025-06-18 18:47 ` Jakub Kicinski
  0 siblings, 0 replies; 2+ messages in thread
From: Jakub Kicinski @ 2025-06-18 18:47 UTC (permalink / raw)
  To: Sharath Chandra Vurukala; +Cc: Jason, wireguard, netdev

On Tue, 17 Jun 2025 01:44:56 +0530 Sharath Chandra Vurukala wrote:
> I do not understand fully what wireguard functionality is, but
> considering that rtnl_lock is a global one, it does not seem to be a
> good design to have notification callback acquire this lock.

I'm not very familiar with the PM locks, but isn't the PM notification
lock also a global one? Again, not an expert but having PM lock outside
the rtnl_lock would be more intuitive to me.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2025-06-18 18:47 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-16 20:14 Issue with rtnl_lock in wg_pm_notification Sharath Chandra Vurukala
2025-06-18 18:47 ` Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).