From: Saeed Mahameed <saeed@kernel.org>
To: "David S. Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Eric Dumazet <edumazet@google.com>
Cc: Saeed Mahameed <saeedm@nvidia.com>,
netdev@vger.kernel.org, Tariq Toukan <tariqt@nvidia.com>,
Lama Kayal <lkayal@nvidia.com>
Subject: [net 08/10] net/mlx5e: Take RTNL lock before triggering netdev notifiers
Date: Thu, 12 Oct 2023 12:51:25 -0700 [thread overview]
Message-ID: <20231012195127.129585-9-saeed@kernel.org> (raw)
In-Reply-To: <20231012195127.129585-1-saeed@kernel.org>
From: Lama Kayal <lkayal@nvidia.com>
Hold RTNL lock when calling xdp_set_features() with a registered netdev,
as the call triggers the netdev notifiers. This could happen when
switching from nic profile to uplink representor for example.
Similar logic which fixed a similar scenario was previously introduced in
the following commit:
commit 72cc65497065 net/mlx5e: Take RTNL lock when needed before calling
xdp_set_features().
This fixes the following assertion and warning call trace:
RTNL: assertion failed at net/core/dev.c (1961)
WARNING: CPU: 13 PID: 2529 at net/core/dev.c:1961
call_netdevice_notifiers_info+0x7c/0x80
Modules linked in: rpcrdma rdma_ucm ib_iser libiscsi
scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib
ib_uverbs ib_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink
nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5
auth_rpcgss oid_registry overlay mlx5_core zram zsmalloc fuse
CPU: 13 PID: 2529 Comm: devlink Not tainted
6.5.0_for_upstream_min_debug_2023_09_07_20_04 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:call_netdevice_notifiers_info+0x7c/0x80
Code: 8f ff 80 3d 77 0d 16 01 00 75 c5 ba a9 07 00 00 48
c7 c6 c4 bb 0d 82 48 c7 c7 18 c8 06 82 c6 05 5b 0d 16 01 01 e8 44 f6 8c
ff <0f> 0b eb a2 0f 1f 44 00 00 55 48 89 e5 41 54 48 83 e4 f0 48 83 ec
RSP: 0018:ffff88819930f7f0 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffffffff8309f740 RCX: 0000000000000027
RDX: ffff88885fb5b5c8 RSI: 0000000000000001 RDI: ffff88885fb5b5c0
RBP: 0000000000000028 R08: ffff88887ffabaa8 R09: 0000000000000003
R10: ffff88887fecbac0 R11: ffff88887ff7bac0 R12: ffff88819930f810
R13: ffff88810b7fea40 R14: ffff8881154e8fd8 R15: ffff888107e881a0
FS: 00007f3ad248f800(0000) GS:ffff88885fb40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000563b85f164e0 CR3: 0000000113b5c006 CR4: 0000000000370ea0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
? __warn+0x79/0x120
? call_netdevice_notifiers_info+0x7c/0x80
? report_bug+0x17c/0x190
? handle_bug+0x3c/0x60
? exc_invalid_op+0x14/0x70
? asm_exc_invalid_op+0x16/0x20
? call_netdevice_notifiers_info+0x7c/0x80
call_netdevice_notifiers+0x2e/0x50
mlx5e_set_xdp_feature+0x21/0x50 [mlx5_core]
mlx5e_build_rep_params+0x97/0x130 [mlx5_core]
mlx5e_init_ul_rep+0x9f/0x100 [mlx5_core]
mlx5e_netdev_init_profile+0x76/0x110 [mlx5_core]
mlx5e_netdev_attach_profile+0x1f/0x90 [mlx5_core]
mlx5e_netdev_change_profile+0x92/0x160 [mlx5_core]
mlx5e_vport_rep_load+0x329/0x4a0 [mlx5_core]
mlx5_esw_offloads_rep_load+0x9e/0xf0 [mlx5_core]
esw_offloads_enable+0x4bc/0xe90 [mlx5_core]
mlx5_eswitch_enable_locked+0x3c8/0x570 [mlx5_core]
? kmalloc_trace+0x25/0x80
mlx5_devlink_eswitch_mode_set+0x224/0x680 [mlx5_core]
? devlink_get_from_attrs_lock+0x9e/0x110
devlink_nl_cmd_eswitch_set_doit+0x60/0xe0
genl_family_rcv_msg_doit+0xd0/0x120
genl_rcv_msg+0x180/0x2b0
? devlink_get_from_attrs_lock+0x110/0x110
? devlink_nl_cmd_eswitch_get_doit+0x290/0x290
? devlink_pernet_pre_exit+0xf0/0xf0
? genl_family_rcv_msg_dumpit+0xf0/0xf0
netlink_rcv_skb+0x54/0x100
genl_rcv+0x24/0x40
netlink_unicast+0x1fc/0x2c0
netlink_sendmsg+0x232/0x4a0
sock_sendmsg+0x38/0x60
? _copy_from_user+0x2a/0x60
__sys_sendto+0x110/0x160
? handle_mm_fault+0x161/0x260
? do_user_addr_fault+0x276/0x620
__x64_sys_sendto+0x20/0x30
do_syscall_64+0x3d/0x90
entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7f3ad231340a
Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3
0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f
05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
RSP: 002b:00007ffd70aad4b8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000000c36b00 RCX:00007f3ad231340a
RDX: 0000000000000038 RSI: 0000000000c36b00 RDI: 0000000000000003
RBP: 0000000000c36910 R08: 00007f3ad2625200 R09: 000000000000000c
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000001
</TASK>
---[ end trace 0000000000000000 ]---
------------[ cut here ]------------
Fixes: 4d5ab0ad964d ("net/mlx5e: take into account device reconfiguration for xdp_features flag")
Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 2fdb8895aecd..5ca9bc337dc6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -769,6 +769,7 @@ static int mlx5e_rep_max_nch_limit(struct mlx5_core_dev *mdev)
static void mlx5e_build_rep_params(struct net_device *netdev)
{
+ const bool take_rtnl = netdev->reg_state == NETREG_REGISTERED;
struct mlx5e_priv *priv = netdev_priv(netdev);
struct mlx5e_rep_priv *rpriv = priv->ppriv;
struct mlx5_eswitch_rep *rep = rpriv->rep;
@@ -794,8 +795,15 @@ static void mlx5e_build_rep_params(struct net_device *netdev)
/* RQ */
mlx5e_build_rq_params(mdev, params);
+ /* If netdev is already registered (e.g. move from nic profile to uplink,
+ * RTNL lock must be held before triggering netdev notifiers.
+ */
+ if (take_rtnl)
+ rtnl_lock();
/* update XDP supported features */
mlx5e_set_xdp_feature(netdev);
+ if (take_rtnl)
+ rtnl_unlock();
/* CQ moderation params */
params->rx_dim_enabled = MLX5_CAP_GEN(mdev, cq_moderation);
--
2.41.0
next prev parent reply other threads:[~2023-10-12 19:53 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-12 19:51 [pull request][net 00/10] mlx5 fixes 2023-10-12 Saeed Mahameed
2023-10-12 19:51 ` [net 01/10] net/mlx5: Perform DMA operations in the right locations Saeed Mahameed
2023-10-14 1:10 ` patchwork-bot+netdevbpf
2023-10-12 19:51 ` [net 02/10] net/mlx5: E-switch, register event handler before arming the event Saeed Mahameed
2023-10-12 19:51 ` [net 03/10] net/mlx5: Bridge, fix peer entry ageing in LAG mode Saeed Mahameed
2023-10-12 19:51 ` [net 04/10] net/mlx5: Handle fw tracer change ownership event based on MTRC Saeed Mahameed
2023-10-12 19:51 ` [net 05/10] net/mlx5e: RX, Fix page_pool allocation failure recovery for striding rq Saeed Mahameed
2023-10-12 19:51 ` [net 06/10] net/mlx5e: RX, Fix page_pool allocation failure recovery for legacy rq Saeed Mahameed
2023-10-12 19:51 ` [net 07/10] net/mlx5e: XDP, Fix XDP_REDIRECT mpwqe page fragment leaks on shutdown Saeed Mahameed
2023-10-12 19:51 ` Saeed Mahameed [this message]
2023-10-12 19:51 ` [net 09/10] net/mlx5e: Don't offload internal port if filter device is out device Saeed Mahameed
2024-10-23 9:32 ` Frode Nordahl
2024-10-24 8:17 ` Jianbo Liu
2024-11-06 7:17 ` Gerald Yang
2024-11-06 7:28 ` Jianbo Liu
2024-11-06 8:59 ` Gerald Yang
2023-10-12 19:51 ` [net 10/10] net/mlx5e: Fix VF representors reporting zero counters to "ip -s" command Saeed Mahameed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231012195127.129585-9-saeed@kernel.org \
--to=saeed@kernel.org \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=kuba@kernel.org \
--cc=lkayal@nvidia.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=saeedm@nvidia.com \
--cc=tariqt@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).