From: Leon Romanovsky <leon@kernel.org>
To: Will Mortensen <will@extrahop.com>
Cc: Saeed Mahameed <saeedm@nvidia.com>,
	Tariq Toukan <tariqt@nvidia.com>, Mark Bloch <mbloch@nvidia.com>,
	netdev <netdev@vger.kernel.org>,
	Shahar Shitrit <shshitrit@nvidia.com>,
	Carolina Jubran <cjubran@nvidia.com>,
	Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Jeremy Royal <jeremyr@extrahop.com>
Subject: Re: [PATCH v2] net/mlx5: don't printk garbage when transceiver overheats
Date: Thu, 14 May 2026 14:32:57 +0300
Message-ID: <20260514113257.GO15586@unreal>
In-Reply-To: <20260512-b4-mlx5-sensor-fix-v2-1-531fee4fd7fd@extrahop.com>

On Tue, May 12, 2026 at 12:32:38AM -0700, Will Mortensen wrote:
> When the mlx5 driver processes a temperature warning event, in events.c
> and hwmon.c, temp_warn() calls print_sensor_names_in_bit_set(), which
> calls hwmon_get_sensor_name() to get the NUL-terminated name of the
> relevant sensor, and then prints it to dmesg. In particular,
> print_sensor_names_in_bit_set() passes the bit index ("sensor index")
> within the 128-bit vector in the warning event to
> hwmon_get_sensor_name(). But hwmon_get_sensor_name() was expecting the
> index of the hwmon channel, and the driver registers hwmon channels for
> at most two sensors: the ASIC sensor (sensor index 0) and the
> module sensor (sensor index 64 or 65 if we're on a 2-port NIC). So when
> the warning event concerned a module, hwmon_get_sensor_name() took the
> 64th or 65th element of the likely 2-element temp_channel_desc array and
> thus returned a pointer to some other kernel memory past the end of it,
> which was printed to dmesg up to the first NUL byte.
> 
> A further difficulty is that, at least in testing on our CX-8 C8240 with
> firmware 40.47.1088, the warning event can have bits set for other
> modules, e.g. if this PCI physical function is associated with
> port/module 0, we might expect bit 64 to be set, but bit 65 (for port/
> module 1) can also be set.
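
To make the bit positions above concrete, here is a minimal userspace
sketch of scanning a 128-bit sensor vector. It assumes the vector is held
as two 64-bit words with words[0] covering bits 0..63; that layout and the
helper name collect_set_bits() are illustrative assumptions, not the
driver's actual event format or code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: write the positions of all set bits of a 128-bit
 * vector into out[] and return how many were found.
 * Assumed layout: words[0] holds bits 0..63, words[1] holds bits 64..127.
 */
static size_t collect_set_bits(const uint64_t words[2], unsigned int out[128])
{
	size_t n = 0;

	assert(words != NULL && out != NULL);

	for (unsigned int bit = 0; bit < 128; bit++) {
		if (words[bit / 64] & (UINT64_C(1) << (bit % 64)))
			out[n++] = bit;
	}
	return n;
}
```

With only the two lowest bits of the upper word set, the helper reports
sensor indexes 64 and 65, which is the two-module case described above.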
> 
> Fix this by clarifying that the argument to hwmon_get_sensor_name() is
> the raw sensor index, and correctly converting it to the hwmon channel
> index. Return NULL if the sensor index doesn't correspond to a hwmon
> channel (e.g. because it's for the other port's module).
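
The conversion described above can be sketched roughly as follows. This
is a standalone illustration, not the patch itself: struct hwmon_sketch,
its fields, and sensor_name_from_index() are hypothetical stand-ins, and
the only facts taken from the commit message are that at most two
channels are registered and that a sensor index without a channel yields
NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical channel descriptor: raw sensor index plus its name. */
struct temp_channel_desc_sketch {
	unsigned int sensor_index;
	char sensor_name[32];
};

/* Hypothetical hwmon state: at most two registered channels. */
struct hwmon_sketch {
	struct temp_channel_desc_sketch desc[2];
	size_t num_channels;
};

/*
 * Look up the name for a raw sensor index (the bit position in the
 * warning event). Returns NULL when no hwmon channel was registered
 * for that sensor, e.g. the other port's module.
 */
static const char *sensor_name_from_index(const struct hwmon_sketch *hw,
					  unsigned int sensor_index)
{
	assert(hw != NULL);

	for (size_t ch = 0; ch < hw->num_channels; ch++) {
		if (hw->desc[ch].sensor_index == sensor_index)
			return hw->desc[ch].sensor_name;
	}
	return NULL;
}
```

The caller is then expected to skip the print when NULL comes back,
instead of indexing the descriptor array with the raw bit position.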
> 
> Fixes: 46fd50cfcc12 ("net/mlx5: Add sensor name to temperature event message")
> Signed-off-by: Will Mortensen <will@extrahop.com>

Your Signed-off-by needs to be last.

> Reviewed-by: Jeremy Royal <jeremyr@extrahop.com>
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/events.c |  2 ++
>  drivers/net/ethernet/mellanox/mlx5/core/hwmon.c  | 15 ++++++++++++++-
>  drivers/net/ethernet/mellanox/mlx5/core/hwmon.h  |  2 +-
>  3 files changed, 17 insertions(+), 2 deletions(-)

I would take a simpler approach by removing this extra complexity and
using a static temp_channel_desc[64] array, avoiding any dynamic
allocation. But this change is acceptable as well.

Thanks
