* [PATCH net] bonding: fix oops during rmmod
@ 2024-05-14 19:57 Tony Battersby
2024-05-15 11:44 ` Simon Horman
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Tony Battersby @ 2024-05-14 19:57 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Jay Vosburgh, Andy Gospodarek, Zhengchao Shao, netdev
"rmmod bonding" causes an oops ever since commit cc317ea3d927 ("bonding:
remove redundant NULL check in debugfs function"). Here are the relevant
functions being called:
bonding_exit()
bond_destroy_debugfs()
debugfs_remove_recursive(bonding_debug_root);
bonding_debug_root = NULL; <--------- SET TO NULL HERE
bond_netlink_fini()
rtnl_link_unregister()
__rtnl_link_unregister()
unregister_netdevice_many_notify()
bond_uninit()
bond_debug_unregister()
(commit removed check for bonding_debug_root == NULL)
debugfs_remove()
simple_recursive_removal()
down_write() -> OOPS
However, reverting the bad commit does not solve the problem completely
because the original code contains a race that could cause the same
oops, although it was much less likely to be triggered unintentionally:
CPU1
rmmod bonding
bonding_exit()
bond_destroy_debugfs()
debugfs_remove_recursive(bonding_debug_root);
CPU2
echo -bond0 > /sys/class/net/bonding_masters
bond_uninit()
bond_debug_unregister()
if (!bonding_debug_root)
CPU1
bonding_debug_root = NULL;
So do NOT revert the bad commit (since the removed checks were racy
anyway), and instead change the order of actions taken during module
removal. The same oops can also happen if there is an error during
module init, so apply the same fix there.
Fixes: cc317ea3d927 ("bonding: remove redundant NULL check in debugfs function")
Cc: stable@vger.kernel.org
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
---
drivers/net/bonding/bond_main.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 2c5ed0a7cb18..bceda85f0dcf 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -6477,16 +6477,16 @@ static int __init bonding_init(void)
if (res)
goto out;
+ bond_create_debugfs();
+
res = register_pernet_subsys(&bond_net_ops);
if (res)
- goto out;
+ goto err_net_ops;
res = bond_netlink_init();
if (res)
goto err_link;
- bond_create_debugfs();
-
for (i = 0; i < max_bonds; i++) {
res = bond_create(&init_net, NULL);
if (res)
@@ -6501,10 +6501,11 @@ static int __init bonding_init(void)
out:
return res;
err:
- bond_destroy_debugfs();
bond_netlink_fini();
err_link:
unregister_pernet_subsys(&bond_net_ops);
+err_net_ops:
+ bond_destroy_debugfs();
goto out;
}
@@ -6513,11 +6514,11 @@ static void __exit bonding_exit(void)
{
unregister_netdevice_notifier(&bond_netdev_notifier);
- bond_destroy_debugfs();
-
bond_netlink_fini();
unregister_pernet_subsys(&bond_net_ops);
+ bond_destroy_debugfs();
+
#ifdef CONFIG_NET_POLL_CONTROLLER
/* Make sure we don't have an imbalance on our netpoll blocking */
WARN_ON(atomic_read(&netpoll_block_tx));
base-commit: a38297e3fb012ddfa7ce0321a7e5a8daeb1872b6
--
2.25.1
^ permalink raw reply related [flat|nested] 4+ messages in thread* Re: [PATCH net] bonding: fix oops during rmmod
2024-05-14 19:57 [PATCH net] bonding: fix oops during rmmod Tony Battersby
@ 2024-05-15 11:44 ` Simon Horman
2024-05-15 12:44 ` Jay Vosburgh
2024-05-17 2:40 ` patchwork-bot+netdevbpf
2 siblings, 0 replies; 4+ messages in thread
From: Simon Horman @ 2024-05-15 11:44 UTC (permalink / raw)
To: Tony Battersby
Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Jay Vosburgh, Andy Gospodarek, Zhengchao Shao, netdev
On Tue, May 14, 2024 at 03:57:29PM -0400, Tony Battersby wrote:
> "rmmod bonding" causes an oops ever since commit cc317ea3d927 ("bonding:
> remove redundant NULL check in debugfs function"). Here are the relevant
> functions being called:
>
> bonding_exit()
> bond_destroy_debugfs()
> debugfs_remove_recursive(bonding_debug_root);
> bonding_debug_root = NULL; <--------- SET TO NULL HERE
> bond_netlink_fini()
> rtnl_link_unregister()
> __rtnl_link_unregister()
> unregister_netdevice_many_notify()
> bond_uninit()
> bond_debug_unregister()
> (commit removed check for bonding_debug_root == NULL)
> debugfs_remove()
> simple_recursive_removal()
> down_write() -> OOPS
>
> However, reverting the bad commit does not solve the problem completely
> because the original code contains a race that could cause the same
> oops, although it was much less likely to be triggered unintentionally:
>
> CPU1
> rmmod bonding
> bonding_exit()
> bond_destroy_debugfs()
> debugfs_remove_recursive(bonding_debug_root);
>
> CPU2
> echo -bond0 > /sys/class/net/bonding_masters
> bond_uninit()
> bond_debug_unregister()
> if (!bonding_debug_root)
>
> CPU1
> bonding_debug_root = NULL;
>
> So do NOT revert the bad commit (since the removed checks were racy
> anyway), and instead change the order of actions taken during module
> removal. The same oops can also happen if there is an error during
> module init, so apply the same fix there.
>
> Fixes: cc317ea3d927 ("bonding: remove redundant NULL check in debugfs function")
> Cc: stable@vger.kernel.org
> Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Reviewed-by: Simon Horman <horms@kernel.org>
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [PATCH net] bonding: fix oops during rmmod
2024-05-14 19:57 [PATCH net] bonding: fix oops during rmmod Tony Battersby
2024-05-15 11:44 ` Simon Horman
@ 2024-05-15 12:44 ` Jay Vosburgh
2024-05-17 2:40 ` patchwork-bot+netdevbpf
2 siblings, 0 replies; 4+ messages in thread
From: Jay Vosburgh @ 2024-05-15 12:44 UTC (permalink / raw)
To: Tony Battersby
Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Andy Gospodarek, Zhengchao Shao, netdev
Tony Battersby <tonyb@cybernetics.com> wrote:
>"rmmod bonding" causes an oops ever since commit cc317ea3d927 ("bonding:
>remove redundant NULL check in debugfs function"). Here are the relevant
>functions being called:
>
>bonding_exit()
> bond_destroy_debugfs()
> debugfs_remove_recursive(bonding_debug_root);
> bonding_debug_root = NULL; <--------- SET TO NULL HERE
> bond_netlink_fini()
> rtnl_link_unregister()
> __rtnl_link_unregister()
> unregister_netdevice_many_notify()
> bond_uninit()
> bond_debug_unregister()
> (commit removed check for bonding_debug_root == NULL)
> debugfs_remove()
> simple_recursive_removal()
> down_write() -> OOPS
>
>However, reverting the bad commit does not solve the problem completely
>because the original code contains a race that could cause the same
>oops, although it was much less likely to be triggered unintentionally:
>
>CPU1
> rmmod bonding
> bonding_exit()
> bond_destroy_debugfs()
> debugfs_remove_recursive(bonding_debug_root);
>
>CPU2
> echo -bond0 > /sys/class/net/bonding_masters
> bond_uninit()
> bond_debug_unregister()
> if (!bonding_debug_root)
>
>CPU1
> bonding_debug_root = NULL;
>
>So do NOT revert the bad commit (since the removed checks were racy
>anyway), and instead change the order of actions taken during module
>removal. The same oops can also happen if there is an error during
>module init, so apply the same fix there.
>
>Fixes: cc317ea3d927 ("bonding: remove redundant NULL check in debugfs function")
>Cc: stable@vger.kernel.org
>Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
>---
> drivers/net/bonding/bond_main.c | 13 +++++++------
> 1 file changed, 7 insertions(+), 6 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 2c5ed0a7cb18..bceda85f0dcf 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -6477,16 +6477,16 @@ static int __init bonding_init(void)
> if (res)
> goto out;
>
>+ bond_create_debugfs();
>+
> res = register_pernet_subsys(&bond_net_ops);
> if (res)
>- goto out;
>+ goto err_net_ops;
>
> res = bond_netlink_init();
> if (res)
> goto err_link;
>
>- bond_create_debugfs();
>-
> for (i = 0; i < max_bonds; i++) {
> res = bond_create(&init_net, NULL);
> if (res)
>@@ -6501,10 +6501,11 @@ static int __init bonding_init(void)
> out:
> return res;
> err:
>- bond_destroy_debugfs();
> bond_netlink_fini();
> err_link:
> unregister_pernet_subsys(&bond_net_ops);
>+err_net_ops:
>+ bond_destroy_debugfs();
> goto out;
>
> }
>@@ -6513,11 +6514,11 @@ static void __exit bonding_exit(void)
> {
> unregister_netdevice_notifier(&bond_netdev_notifier);
>
>- bond_destroy_debugfs();
>-
> bond_netlink_fini();
> unregister_pernet_subsys(&bond_net_ops);
>
>+ bond_destroy_debugfs();
>+
> #ifdef CONFIG_NET_POLL_CONTROLLER
> /* Make sure we don't have an imbalance on our netpoll blocking */
> WARN_ON(atomic_read(&netpoll_block_tx));
>
>base-commit: a38297e3fb012ddfa7ce0321a7e5a8daeb1872b6
>--
>2.25.1
>
>
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [PATCH net] bonding: fix oops during rmmod
2024-05-14 19:57 [PATCH net] bonding: fix oops during rmmod Tony Battersby
2024-05-15 11:44 ` Simon Horman
2024-05-15 12:44 ` Jay Vosburgh
@ 2024-05-17 2:40 ` patchwork-bot+netdevbpf
2 siblings, 0 replies; 4+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-05-17 2:40 UTC (permalink / raw)
To: Tony Battersby
Cc: davem, edumazet, kuba, pabeni, j.vosburgh, andy, shaozhengchao,
netdev
Hello:
This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Tue, 14 May 2024 15:57:29 -0400 you wrote:
> "rmmod bonding" causes an oops ever since commit cc317ea3d927 ("bonding:
> remove redundant NULL check in debugfs function"). Here are the relevant
> functions being called:
>
> bonding_exit()
> bond_destroy_debugfs()
> debugfs_remove_recursive(bonding_debug_root);
> bonding_debug_root = NULL; <--------- SET TO NULL HERE
> bond_netlink_fini()
> rtnl_link_unregister()
> __rtnl_link_unregister()
> unregister_netdevice_many_notify()
> bond_uninit()
> bond_debug_unregister()
> (commit removed check for bonding_debug_root == NULL)
> debugfs_remove()
> simple_recursive_removal()
> down_write() -> OOPS
>
> [...]
Here is the summary with links:
- [net] bonding: fix oops during rmmod
https://git.kernel.org/netdev/net/c/a45835a0bb6e
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2024-05-17 2:40 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-14 19:57 [PATCH net] bonding: fix oops during rmmod Tony Battersby
2024-05-15 11:44 ` Simon Horman
2024-05-15 12:44 ` Jay Vosburgh
2024-05-17 2:40 ` patchwork-bot+netdevbpf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).