From: manjunath.b.patil@oracle.com
To: Saeed Mahameed <saeedm@nvidia.com>,
Tariq Toukan <tariqt@nvidia.com>, Mark Bloch <mbloch@nvidia.com>,
Leon Romanovsky <leon@kernel.org>,
netdev@vger.kernel.org
Cc: Andrew Lunn <andrew+netdev@lunn.ch>,
"David S . Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Patrisious Haddad <phaddad@nvidia.com>,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
stable@vger.kernel.org
Subject: Re: [PATCH net] net/mlx5e: Use sender devcom for MPV master-up
Date: Wed, 17 Jun 2026 09:28:52 -0700
Message-ID: <381053a1-170e-49f9-bd33-c5ecf6015504@oracle.com>
In-Reply-To: <20260610173915.4053423-1-manjunath.b.patil@oracle.com>
On 6/10/26 10:39 AM, Manjunath Patil wrote:
> After PCIe DPC recovery, mlx5 reloads the affected functions and
> replays multiport affiliation events. In the reported failure, the
> first relevant device error was:
>
> pcieport 0000:10:01.1: DPC: containment event
> pcieport 0000:10:01.1: PCIe Bus Error: severity=Uncorrected (Fatal)
> pcieport 0000:10:01.1: [ 5] SDES (First)
>
> mlx5 recovered the PCI functions and resumed 0000:11:00.1. During
> that resume, RDMA multiport binding replayed
> MLX5_DRIVER_EVENT_AFFILIATION_DONE and mlx5e sent
> MPV_DEVCOM_MASTER_UP. The host then panicked with:
>
> BUG: kernel NULL pointer dereference, address: 0000000000000010
> RIP: mlx5_devcom_comp_set_ready+0x5/0x40 [mlx5_core]
> RDI: 0000000000000000
>
> Call trace included:
>
> mlx5_devcom_comp_set_ready
> mlx5e_devcom_event_mpv
> mlx5_devcom_send_event
> mlx5_ib_bind_slave_port
> mlx5r_mp_probe
> mlx5_pci_resume
>
> MPV devcom registration publishes mlx5e private data to the component
> peer list before mlx5e_devcom_init_mpv() stores the returned component
> device in priv->devcom. A concurrent master-up event can therefore
> reach a peer whose private data is visible but whose priv->devcom
> backpointer is still NULL.
>
> MPV_DEVCOM_MASTER_UP already carries the sender/master mlx5e private
> data as event_data. The ready bit is stored on the shared devcom
> component, not on an individual peer. Use the sender devcom when
> marking the MPV component ready.
>
> This preserves the readiness transition while avoiding a NULL
> dereference of the peer devcom pointer during affiliation replay after
> PCI error recovery.
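
For readers following the thread, the race window described above can be
modeled in user space. This is only a sketch under stated assumptions: every
model_* name is hypothetical and is not an mlx5 API, and the structs merely
stand in for mlx5e_priv, the per-device devcom handle, and the shared
component. It shows that a handler keyed on the sender's devcom still sets
the shared ready bit while the receiving peer's devcom pointer is NULL.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_comp { bool ready; };                  /* shared component state */
struct model_devcom { struct model_comp *comp; };   /* per-device handle */
struct model_priv { struct model_devcom *devcom; }; /* stands in for mlx5e_priv */

/* Marks the shared component ready; dereferences devcom, so NULL would fault. */
static void model_comp_set_ready(struct model_devcom *devcom, bool ready)
{
	devcom->comp->ready = ready;
}

/*
 * Master-up handler shaped like the patched one: my_data is the receiving
 * peer, event_data is the sender (master). Only the sender is dereferenced.
 */
static int model_event_master_up(void *my_data, void *event_data)
{
	struct model_priv *master_priv = event_data;

	(void)my_data; /* the peer's devcom may not be stored yet */

	if (!master_priv || !master_priv->devcom)
		return -EINVAL;

	model_comp_set_ready(master_priv->devcom, true);
	return 0;
}

/* Returns 1 if the ready bit is set while the peer's devcom is still NULL. */
static int model_race_window_demo(void)
{
	struct model_comp comp = { .ready = false };
	struct model_devcom master_dev = { .comp = &comp };
	struct model_priv master = { .devcom = &master_dev };
	/* Peer is visible on the list, but its backpointer is not stored yet. */
	struct model_priv peer = { .devcom = NULL };

	if (model_event_master_up(&peer, &master) != 0)
		return 0;

	return comp.ready ? 1 : 0;
}
```

A handler that called model_comp_set_ready(peer.devcom, true) in the same
setup would dereference NULL, which is the shape of the reported crash.
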
>
> Fixes: bf11485f8419 ("net/mlx5: Register mlx5e priv to devcom in MPV mode")
> Assisted-by: Codex:gpt-5
> Signed-off-by: Manjunath Patil <manjunath.b.patil@oracle.com>
> Cc: stable@vger.kernel.org # 6.7+
> ---
Ping!
-Manjunath
> drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> index 8f2b3abe0092..f7ff20b97e8c 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> @@ -211,11 +211,14 @@ static void mlx5e_disable_async_events(struct mlx5e_priv *priv)
>
>  static int mlx5e_devcom_event_mpv(int event, void *my_data, void *event_data)
>  {
> -	struct mlx5e_priv *slave_priv = my_data;
> +	struct mlx5e_priv *master_priv = event_data;
>  
>  	switch (event) {
>  	case MPV_DEVCOM_MASTER_UP:
> -		mlx5_devcom_comp_set_ready(slave_priv->devcom, true);
> +		if (!master_priv || !master_priv->devcom)
> +			return -EINVAL;
> +
> +		mlx5_devcom_comp_set_ready(master_priv->devcom, true);
>  		break;
>  	case MPV_DEVCOM_MASTER_DOWN:
>  		/* no need for comp set ready false since we unregister after