From: Jakub Kicinski <kuba@kernel.org>
To: saeedm@nvidia.com, tariqt@nvidia.com
Cc: Til Kaiser <mail@tk154.de>,
leonro@nvidia.com, netdev@vger.kernel.org,
linux-rdma@vger.kernel.org
Subject: Re: [BUG] net/mlx5: missing sysfs hwmon entry for ConnectX-4 cards
Date: Tue, 15 Oct 2024 08:06:17 -0700
Message-ID: <20241015080617.79e90a06@kernel.org>
In-Reply-To: <bc8ba1b7-e4ad-40b5-b69d-9ea1e7a18a40@tk154.de>
On Thu, 10 Oct 2024 17:11:21 +0200 Til Kaiser wrote:
> I noticed on our dual-port 100G ConnectX-4 cards (MT27700 Family)
> running Linux Kernel version 6.6.56 and the latest ConnectX-4 firmware
> version 12.28.2302 that we do not have a sysfs hwmon entry for reading
> temperature values.
> When running Kernel version 6.6.32, the hwmon entry is there again, and
> I can read the temperature values of those cards.
> Strangely, this problem doesn't occur on our ConnectX-4 Lx cards
> (MT27710 Family), regardless of which Kernel version I use.
>
> I looked into the mlx5 core driver and noticed that it is checking the
> MCAM register here:
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/net/ethernet/mellanox/mlx5/core/hwmon.c?h=v6.6.56#n380.
> When I removed that check, the hwmon entry reappeared.
>
> Looking into recent mlx5 commits regarding this MCAM register, I found
> this commit:
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v6.6.56&id=fb035aa9a3f8fd327ab83b15a94929d2b9045995.
> When I reverted this commit, the hwmon entry also reappeared.
>
> I also found a firmware bug fix regarding that register inside the
> ConnectX-4 Lx bug fix history here (Ref. 2339971):
> https://docs.nvidia.com/networking/display/connectx4lxfirmwarev14321900/bug+fixes+history.
> I couldn't find such a firmware fix for the non-Lx ConnectX-4 cards. So,
> I'm unsure whether this might be an mlx5 driver or firmware issue.
Hi, any word on this? Sounds like a fairly straightforward problem.
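
For anyone reproducing the report, the symptom (a missing hwmon entry under sysfs) can be checked from userspace with a short script. The following is only a minimal sketch: it assumes the mlx5 hwmon device reports the name "mlx5" in its `name` attribute and that temperatures follow the standard hwmon sysfs convention of millidegrees Celsius in `temp*_input` files. The function name `find_hwmon_temps` and its parameters are hypothetical, chosen for illustration.

```python
from pathlib import Path


def find_hwmon_temps(driver_name="mlx5", root="/sys/class/hwmon"):
    """Return {hwmon_dir: {temp_file: degrees_C}} for hwmon devices whose
    'name' attribute equals driver_name.

    Assumption: the device name is "mlx5"; adjust if your kernel differs.
    """
    results = {}
    root_path = Path(root)
    if not root_path.is_dir():
        return results

    for hwmon in sorted(root_path.glob("hwmon*")):
        try:
            name = (hwmon / "name").read_text().strip()
        except OSError:
            continue  # entry vanished or is unreadable; skip it
        if name != driver_name:
            continue

        temps = {}
        for temp_file in sorted(hwmon.glob("temp*_input")):
            try:
                # hwmon sysfs reports temperatures in millidegrees Celsius
                temps[temp_file.name] = int(temp_file.read_text().strip()) / 1000.0
            except (OSError, ValueError):
                continue  # sensor read failed; leave it out
        results[hwmon.name] = temps
    return results


if __name__ == "__main__":
    found = find_hwmon_temps()
    if not found:
        print("no mlx5 hwmon entry found")
    for dev, temps in found.items():
        print(dev, temps)
```

An empty result on an affected kernel, and a non-empty one on a kernel where the entry is present, would match the behaviour described in the report.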
Thread overview: 3+ messages
2024-10-10 15:11 [BUG] net/mlx5: missing sysfs hwmon entry for ConnectX-4 cards Til Kaiser
2024-10-15 15:06 ` Jakub Kicinski [this message]
2024-10-16 7:05 ` Tariq Toukan