* Issue with rtnl_lock in wg_pm_notification
@ 2025-06-16 20:14 Sharath Chandra Vurukala
2025-06-18 18:47 ` Jakub Kicinski
0 siblings, 1 reply; 2+ messages in thread
From: Sharath Chandra Vurukala @ 2025-06-16 20:14 UTC (permalink / raw)
To: Jason A. Donenfeld", wireguard, netdev
Hi All,
I work on rmnet driver(more details at https://docs.kernel.org/networking/device_drivers/cellular/qualcomm/rmnet.html), This driver registers onto any physical network device in IP mode.
This driver registers/deregisters for PM notifications based on the NETDEV_REGISTER/NETDEV_UNREGISTER notifications received for the physical netdevice.
In one of the tests, involving a frequent system suspend/resume and physical netdevice registration/de-registration, a deadlock has been observed involving the rtnl_lock and the pm_chain_head->rwsem.
The sequence involves
Thread1: unregister_netdev(holds rtnl_lock)-> notification sent to rmnet driver( netdev_chain) -> rmnet driver attempts to unregister for pm_notification( involves acquiring pm_chain_head->rwsem).
Thread2: pm_suspend -> pm_notofier_call_chain(holds pm_chain_head->rwsem)-> wg_pm_notifcation(trying to acquire rtnl_lock)
I do not understand fully what wireguard functionality is, but considering that rtnl_lock is a global one, it does not seem to be a good design to have notification callback acquire this lock.
Can we check if rthl_lock be avoided here OR
Use rtnl_trylock instead of using rtnl_lock and consider deferring the work to a context where it is safe to acquire the lock( may be a workqueue), something like below to avoid the deadlock?
static int wg_pm_notification(struct notifier_block *nb, unsigned long action, void *data)
{
struct wg_device *wg;
struct wg_peer *peer;
/* If the machine is constantly suspending and resuming, as part of
* its normal operation rather than as a somewhat rare event, then we
* don't actually want to clear keys.
*/
if (IS_ENABLED(CONFIG_PM_AUTOSLEEP) ||
IS_ENABLED(CONFIG_PM_USERSPACE_AUTOSLEEP))
return 0;
if (action != PM_HIBERNATION_PREPARE && action != PM_SUSPEND_PREPARE)
return 0;
if (rtnl_trylock()) {
list_for_each_entry(wg, &device_list, device_list) {
mutex_lock(&wg->device_update_lock);
list_for_each_entry(peer, &wg->peer_list, peer_list) {
del_timer(&peer->timer_zero_key_material);
wg_noise_handshake_clear(&peer->handshake);
wg_noise_keypairs_clear(&peer->keypairs);
}
mutex_unlock(&wg->device_update_lock);
}
} else {
/* schedule a workqueue to delete timers clear up the keypairs of the peers? */
}
rtnl_unlock();
rcu_barrier();
}
Thanks,
Sharath
^ permalink raw reply [flat|nested] 2+ messages in thread
* Re: Issue with rtnl_lock in wg_pm_notification
2025-06-16 20:14 Issue with rtnl_lock in wg_pm_notification Sharath Chandra Vurukala
@ 2025-06-18 18:47 ` Jakub Kicinski
0 siblings, 0 replies; 2+ messages in thread
From: Jakub Kicinski @ 2025-06-18 18:47 UTC (permalink / raw)
To: Sharath Chandra Vurukala; +Cc: Jason, wireguard, netdev
On Tue, 17 Jun 2025 01:44:56 +0530 Sharath Chandra Vurukala wrote:
> I do not understand fully what wireguard functionality is, but
> considering that rtnl_lock is a global one, it does not seem to be a
> good design to have notification callback acquire this lock.
I'm not very familiar with the PM locks, but isn't the PM notification
lock also a global one? Again, not an expert but having PM lock outside
the rtnl_lock would be more intuitive to me.
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2025-06-18 18:47 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-16 20:14 Issue with rtnl_lock in wg_pm_notification Sharath Chandra Vurukala
2025-06-18 18:47 ` Jakub Kicinski
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).