From: Tariq Toukan <tariqt@nvidia.com>
To: "David S. Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Eric Dumazet <edumazet@google.com>
Cc: <netdev@vger.kernel.org>, Saeed Mahameed <saeedm@nvidia.com>,
Gal Pressman <gal@nvidia.com>,
Leon Romanovsky <leonro@nvidia.com>,
Joe Damato <jdamato@fastly.com>, Tariq Toukan <tariqt@nvidia.com>
Subject: [PATCH net 5/5] net/mlx5e: Fix queue stats access to non-existing channels splat
Date: Thu, 8 Aug 2024 17:41:06 +0300
Message-ID: <20240808144107.2095424-6-tariqt@nvidia.com>
In-Reply-To: <20240808144107.2095424-1-tariqt@nvidia.com>
From: Gal Pressman <gal@nvidia.com>
The queue stats API queries the queues according to
real_num_[tr]x_queues. If the device is down and the channels have not
been created yet, don't try to query their statistics.
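The guard can be modelled in user space with a minimal sketch. The struct and function names below (model_priv, model_channel_stats, model_get_queue_stats_rx) are hypothetical stand-ins, not the driver's definitions; the only assumption taken from the patch is that the per-channel stats entries are not valid to dereference while stats_nch is zero.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct model_channel_stats {
	unsigned long rx_packets;
	unsigned long rx_bytes;
};

struct model_priv {
	unsigned int stats_nch;                     /* channels that have stats */
	struct model_channel_stats **channel_stats; /* entries may still be NULL */
};

struct model_queue_stats_rx {
	unsigned long packets;
	unsigned long bytes;
};

/*
 * Fill *stats from channel i. Returns 1 if stats were filled and 0 if the
 * query was skipped because no channel stats exist yet.
 */
static int model_get_queue_stats_rx(const struct model_priv *priv, int i,
				    struct model_queue_stats_rx *stats)
{
	const struct model_channel_stats *cs;

	/* Same shape as the fix: bail out before indexing the array. */
	if (!priv->stats_nch)
		return 0;

	cs = priv->channel_stats[i];
	stats->packets = cs->rx_packets;
	stats->bytes = cs->rx_bytes;
	return 1;
}
```

Without the early return, a call made while stats_nch is zero would read through a NULL channel_stats[i] in this model, which mirrors the dereference reported below.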
To trigger the panic, run this command before the interface is brought
up:
./cli.py --spec ../../../Documentation/netlink/specs/netdev.yaml --dump qstats-get --json '{"ifindex": 4}'
BUG: kernel NULL pointer dereference, address: 0000000000000c00
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP PTI
CPU: 3 UID: 0 PID: 977 Comm: python3 Not tainted 6.10.0+ #40
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:mlx5e_get_queue_stats_rx+0x3c/0xb0 [mlx5_core]
Code: fc 55 48 63 ee 53 48 89 d3 e8 40 3d 70 e1 85 c0 74 58 4c 89 ef e8 d4 07 04 00 84 c0 75 41 49 8b 84 24 f8 39 00 00 48 8b 04 e8 <48> 8b 90 00 0c 00 00 48 03 90 40 0a 00 00 48 89 53 08 48 8b 90 08
RSP: 0018:ffff888116be37d0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff888116be3868 RCX: 0000000000000004
RDX: ffff88810ada4000 RSI: 0000000000000000 RDI: ffff888109df09c0
RBP: 0000000000000000 R08: 0000000000000004 R09: 0000000000000004
R10: ffff88813461901c R11: ffffffffffffffff R12: ffff888109df0000
R13: ffff888109df09c0 R14: ffff888116be38d0 R15: 0000000000000000
FS: 00007f4375d5c740(0000) GS:ffff88852c980000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000c00 CR3: 0000000106ada006 CR4: 0000000000370eb0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
? __die+0x1f/0x60
? page_fault_oops+0x14e/0x3d0
? exc_page_fault+0x73/0x130
? asm_exc_page_fault+0x22/0x30
? mlx5e_get_queue_stats_rx+0x3c/0xb0 [mlx5_core]
netdev_nl_stats_by_netdev+0x2a6/0x4c0
? __rmqueue_pcplist+0x351/0x6f0
netdev_nl_qstats_get_dumpit+0xc4/0x1b0
genl_dumpit+0x2d/0x80
netlink_dump+0x199/0x410
__netlink_dump_start+0x1aa/0x2c0
genl_family_rcv_msg_dumpit+0x94/0xf0
? __pfx_genl_start+0x10/0x10
? __pfx_genl_dumpit+0x10/0x10
? __pfx_genl_done+0x10/0x10
genl_rcv_msg+0x116/0x2b0
? __pfx_netdev_nl_qstats_get_dumpit+0x10/0x10
? __pfx_genl_rcv_msg+0x10/0x10
netlink_rcv_skb+0x54/0x100
genl_rcv+0x24/0x40
netlink_unicast+0x21a/0x340
netlink_sendmsg+0x1f4/0x440
__sys_sendto+0x1b6/0x1c0
? do_sock_setsockopt+0xc3/0x180
? __sys_setsockopt+0x60/0xb0
__x64_sys_sendto+0x20/0x30
do_syscall_64+0x50/0x110
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f43757132b0
Code: c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 1d 45 31 c9 45 31 c0 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 68 c3 0f 1f 80 00 00 00 00 41 54 48 83 ec 20
RSP: 002b:00007ffd258da048 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00007ffd258da0f8 RCX: 00007f43757132b0
RDX: 000000000000001c RSI: 00007f437464b850 RDI: 0000000000000003
RBP: 00007f4375085de0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: ffffffffc4653600 R14: 0000000000000001 R15: 00007f43751a6147
</TASK>
Modules linked in: netconsole xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core zram zsmalloc mlx5_core fuse [last unloaded: netconsole]
CR2: 0000000000000c00
---[ end trace 0000000000000000 ]---
RIP: 0010:mlx5e_get_queue_stats_rx+0x3c/0xb0 [mlx5_core]
Code: fc 55 48 63 ee 53 48 89 d3 e8 40 3d 70 e1 85 c0 74 58 4c 89 ef e8 d4 07 04 00 84 c0 75 41 49 8b 84 24 f8 39 00 00 48 8b 04 e8 <48> 8b 90 00 0c 00 00 48 03 90 40 0a 00 00 48 89 53 08 48 8b 90 08
RSP: 0018:ffff888116be37d0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff888116be3868 RCX: 0000000000000004
RDX: ffff88810ada4000 RSI: 0000000000000000 RDI: ffff888109df09c0
RBP: 0000000000000000 R08: 0000000000000004 R09: 0000000000000004
R10: ffff88813461901c R11: ffffffffffffffff R12: ffff888109df0000
R13: ffff888109df09c0 R14: ffff888116be38d0 R15: 0000000000000000
FS: 00007f4375d5c740(0000) GS:ffff88852c980000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000c00 CR3: 0000000106ada006 CR4: 0000000000370eb0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
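The dump is consistent with a NULL per-channel stats pointer: RAX is 0, CR2 is 0xc00, and the faulting instruction bytes (48 8b 90 00 0c 00 00) decode as a 64-bit load from 0xc00(%rax). A minimal sketch of that address arithmetic follows; the struct layout and field names are hypothetical, and only the 0xc00 offset is taken from the oops.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: only the 0xc00 offset comes from the oops. */
struct model_stats {
	unsigned char other_fields[0xc00];
	uint64_t first_counter_read;
};

/* Address the CPU touches when loading first_counter_read through base. */
static uintptr_t model_fault_address(uintptr_t base)
{
	return base + offsetof(struct model_stats, first_counter_read);
}
```

With a base of 0 the computed address equals the field offset, which is why CR2 reports the offset itself rather than a kernel pointer.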
Fixes: 7b66ae536a78 ("net/mlx5e: Add per queue netdev-genl stats")
Cc: Joe Damato <jdamato@fastly.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index f04decca39f2..5df904639b0c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -5296,7 +5296,7 @@ static void mlx5e_get_queue_stats_rx(struct net_device *dev, int i,
 	struct mlx5e_rq_stats *rq_stats;
 
 	ASSERT_RTNL();
-	if (mlx5e_is_uplink_rep(priv))
+	if (mlx5e_is_uplink_rep(priv) || !priv->stats_nch)
 		return;
 
 	channel_stats = priv->channel_stats[i];
@@ -5316,6 +5316,9 @@ static void mlx5e_get_queue_stats_tx(struct net_device *dev, int i,
 	struct mlx5e_sq_stats *sq_stats;
 
 	ASSERT_RTNL();
+	if (!priv->stats_nch)
+		return;
+
 	/* no special case needed for ptp htb etc since txq2sq_stats is kept up
 	 * to date for active sq_stats, otherwise get_base_stats takes care of
 	 * inactive sqs.
--
2.44.0
Thread overview: 8+ messages
2024-08-08 14:41 [PATCH net 0/5] mlx5 misc fixes 2024-08-08 Tariq Toukan
2024-08-08 14:41 ` [PATCH net 1/5] net/mlx5: SD, Do not query MPIR register if no sd_group Tariq Toukan
2024-08-08 14:41 ` [PATCH net 2/5] net/mlx5e: SHAMPO, Increase timeout to improve latency Tariq Toukan
2024-08-08 14:41 ` [PATCH net 3/5] net/mlx5e: Take state lock during tx timeout reporter Tariq Toukan
2024-08-08 14:41 ` [PATCH net 4/5] net/mlx5e: Correctly report errors for ethtool rx flows Tariq Toukan
2024-08-08 14:41 ` Tariq Toukan [this message]
2024-08-08 14:52 ` [PATCH net 5/5] net/mlx5e: Fix queue stats access to non-existing channels splat Joe Damato
2024-08-10 5:30 ` [PATCH net 0/5] mlx5 misc fixes 2024-08-08 patchwork-bot+netdevbpf