From: Tariq Toukan <tariqt@nvidia.com>
To: "David S. Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Eric Dumazet <edumazet@google.com>,
"Andrew Lunn" <andrew+netdev@lunn.ch>
Cc: Shahar Shitrit <shshitrit@nvidia.com>,
Gal Pressman <gal@nvidia.com>, Saeed Mahameed <saeedm@nvidia.com>,
Leon Romanovsky <leon@kernel.org>,
"Tariq Toukan" <tariqt@nvidia.com>, <netdev@vger.kernel.org>,
<linux-rdma@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
Carolina Jubran <cjubran@nvidia.com>
Subject: [PATCH net-next 4/4] net/mlx5: Add sensor name to temperature event message
Date: Thu, 13 Feb 2025 11:46:41 +0200 [thread overview]
Message-ID: <20250213094641.226501-5-tariqt@nvidia.com> (raw)
In-Reply-To: <20250213094641.226501-1-tariqt@nvidia.com>
From: Shahar Shitrit <shshitrit@nvidia.com>
Previously, a temperature event message included a bitmap indicating
which sensors detect high temperatures.
To enhance clarity, we modify the message format to explicitly list
the names of the overheating sensors, alongside the sensors bitmap.
If HWMON is not configured, the event message remains unchanged.
Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
.../net/ethernet/mellanox/mlx5/core/events.c | 31 +++++++++++++++++--
.../net/ethernet/mellanox/mlx5/core/hwmon.c | 5 +++
.../net/ethernet/mellanox/mlx5/core/hwmon.h | 1 +
3 files changed, 34 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/events.c b/drivers/net/ethernet/mellanox/mlx5/core/events.c
index e85a9042e3c2..01c5f5990f9a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/events.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/events.c
@@ -6,6 +6,7 @@
#include "mlx5_core.h"
#include "lib/eq.h"
#include "lib/events.h"
+#include "hwmon.h"
struct mlx5_event_nb {
struct mlx5_nb nb;
@@ -153,11 +154,28 @@ static int any_notifier(struct notifier_block *nb,
return NOTIFY_OK;
}
+#if IS_ENABLED(CONFIG_HWMON)
+static void print_sensor_names_in_bit_set(struct mlx5_core_dev *dev, struct mlx5_hwmon *hwmon,
+ u64 bit_set, int bit_set_offset)
+{
+ unsigned long *bit_set_ptr = (unsigned long *)&bit_set;
+ int num_bits = sizeof(bit_set) * BITS_PER_BYTE;
+ int i;
+
+ for_each_set_bit(i, bit_set_ptr, num_bits) {
+ const char *sensor_name = hwmon_get_sensor_name(hwmon, i + bit_set_offset);
+
+ mlx5_core_warn(dev, "Sensor name[%d]: %s\n", i + bit_set_offset, sensor_name);
+ }
+}
+#endif /* CONFIG_HWMON */
+
/* type == MLX5_EVENT_TYPE_TEMP_WARN_EVENT */
static int temp_warn(struct notifier_block *nb, unsigned long type, void *data)
{
struct mlx5_event_nb *event_nb = mlx5_nb_cof(nb, struct mlx5_event_nb, nb);
struct mlx5_events *events = event_nb->ctx;
+ struct mlx5_core_dev *dev = events->dev;
struct mlx5_eqe *eqe = data;
u64 value_lsb;
u64 value_msb;
@@ -169,10 +187,17 @@ static int temp_warn(struct notifier_block *nb, unsigned long type, void *data)
value_lsb &= 0x1;
value_msb = be64_to_cpu(eqe->data.temp_warning.sensor_warning_msb);
- if (net_ratelimit())
- mlx5_core_warn(events->dev,
- "High temperature on sensors with bit set %#llx %#llx",
+ if (net_ratelimit()) {
+ mlx5_core_warn(dev, "High temperature on sensors with bit set %#llx %#llx.\n",
value_msb, value_lsb);
+#if IS_ENABLED(CONFIG_HWMON)
+ if (dev->hwmon) {
+ print_sensor_names_in_bit_set(dev, dev->hwmon, value_lsb, 0);
+ print_sensor_names_in_bit_set(dev, dev->hwmon, value_msb,
+ sizeof(value_lsb) * BITS_PER_BYTE);
+ }
+#endif
+ }
return NOTIFY_OK;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/hwmon.c b/drivers/net/ethernet/mellanox/mlx5/core/hwmon.c
index 353f81dccd1c..4ba2636d7fb6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/hwmon.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/hwmon.c
@@ -416,3 +416,8 @@ void mlx5_hwmon_dev_unregister(struct mlx5_core_dev *mdev)
mlx5_hwmon_free(hwmon);
mdev->hwmon = NULL;
}
+
+const char *hwmon_get_sensor_name(struct mlx5_hwmon *hwmon, int channel)
+{
+ return hwmon->temp_channel_desc[channel].sensor_name;
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/hwmon.h b/drivers/net/ethernet/mellanox/mlx5/core/hwmon.h
index 999654a9b9da..f38271c22c10 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/hwmon.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/hwmon.h
@@ -10,6 +10,7 @@
int mlx5_hwmon_dev_register(struct mlx5_core_dev *mdev);
void mlx5_hwmon_dev_unregister(struct mlx5_core_dev *mdev);
+const char *hwmon_get_sensor_name(struct mlx5_hwmon *hwmon, int channel);
#else
static inline int mlx5_hwmon_dev_register(struct mlx5_core_dev *mdev)
--
2.45.0
next prev parent reply other threads:[~2025-02-13 9:47 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-13 9:46 [PATCH net-next 0/4] mlx5: Add sensor name in temperature message Tariq Toukan
2025-02-13 9:46 ` [PATCH net-next 1/4] net/mlx5: Apply rate-limiting to high temperature warning Tariq Toukan
2025-02-13 10:15 ` Mateusz Polchlopek
2025-02-13 9:46 ` [PATCH net-next 2/4] net/mlx5: Prefix temperature event bitmap with '0x' for clarity Tariq Toukan
2025-02-13 10:33 ` Mateusz Polchlopek
2025-02-13 9:46 ` [PATCH net-next 3/4] net/mlx5: Modify LSB bitmask in temperature event to include only the first bit Tariq Toukan
2025-02-13 12:11 ` Mateusz Polchlopek
2025-02-13 9:46 ` Tariq Toukan [this message]
2025-02-15 19:29 ` [PATCH net-next 4/4] net/mlx5: Add sensor name to temperature event message Simon Horman
2025-02-18 0:27 ` Jakub Kicinski
2025-02-19 13:00 ` Tariq Toukan
2025-02-19 15:28 ` Jakub Kicinski
2025-02-20 22:22 ` Saeed Mahameed
2025-02-18 0:40 ` [PATCH net-next 0/4] mlx5: Add sensor name in temperature message patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250213094641.226501-5-tariqt@nvidia.com \
--to=tariqt@nvidia.com \
--cc=andrew+netdev@lunn.ch \
--cc=cjubran@nvidia.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=gal@nvidia.com \
--cc=kuba@kernel.org \
--cc=leon@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=saeedm@nvidia.com \
--cc=shshitrit@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox