From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 05D8217727; Mon, 27 May 2024 19:30:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716838237; cv=none; b=kcvxH5Ktxep1NsgYpsZ5vY2EQMgsPIODcT+yQdxRcaWP9L2+W2t+mY1bj2/87SNO+7AD4rlvf3b5s3qSIWg37ZFXhT+cRf6xGxW5EI1DhRGeVq+6NW3Tinos80HfELYWtPeMtxMKUWkqRPu0I4BqfLuJ1pPOiQFZwKTz+s4LU5c= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716838237; c=relaxed/simple; bh=t8a21hZm/9H//gHHWnh5OpWkS/AAO+P2qm2s17hEpsI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=VUdQVbJBADpaDg56Pwf2EY/EIr+KMvt+xF1akrWcbQ7CajHkqDBxS6eZT98tasLJEQwWj64RI2gdvqU651RcXPx7DcrrPFrQqhDWtqgleQkKU9FG4cBegreDq+pKBS2+HQKYZZXmvPIDI+LdKLsvB48Q/sYUMLubDakbcr1jDLI= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b=FqmnaVX5; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b="FqmnaVX5" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 84EE9C2BBFC; Mon, 27 May 2024 19:30:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1716838236; bh=t8a21hZm/9H//gHHWnh5OpWkS/AAO+P2qm2s17hEpsI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=FqmnaVX5Yv8vccGwipy96n35soq1QpyVZn3MtwIlA86ILi3AgNiWdUSpVcga+TER5 Pj7jr3xhrlPEH4SINCYEgaVVMEWv4T17oo2I+vxrOUbgeZckbixCPRmgFsHMWCxTk+ 3l3+Vykx2age6Z3JRGEMEXi/w0RQCY45g4sybOV8= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, Akiva Goldberger , Moshe Shemesh , Tariq Toukan , Jakub Kicinski , Sasha Levin Subject: [PATCH 6.8 318/493] net/mlx5: Discard command completions in internal error Date: Mon, 27 May 2024 20:55:20 +0200 Message-ID: <20240527185640.683906056@linuxfoundation.org> X-Mailer: git-send-email 2.45.1 In-Reply-To: <20240527185626.546110716@linuxfoundation.org> References: <20240527185626.546110716@linuxfoundation.org> User-Agent: quilt/0.67 X-stable: review X-Patchwork-Hint: ignore Precedence: bulk X-Mailing-List: stable@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit 6.8-stable review patch. If anyone has any objections, please let me know. ------------------ From: Akiva Goldberger [ Upstream commit db9b31aa9bc56ff0d15b78f7e827d61c4a096e40 ] Fix use after free when FW completion arrives while device is in internal error state. Avoid calling completion handler in this case, since the device will flush the command interface and trigger all completions manually. Kernel log: ------------[ cut here ]------------ refcount_t: underflow; use-after-free. ... RIP: 0010:refcount_warn_saturate+0xd8/0xe0 ... Call Trace: ? __warn+0x79/0x120 ? refcount_warn_saturate+0xd8/0xe0 ? report_bug+0x17c/0x190 ? handle_bug+0x3c/0x60 ? exc_invalid_op+0x14/0x70 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xd8/0xe0 cmd_ent_put+0x13b/0x160 [mlx5_core] mlx5_cmd_comp_handler+0x5f9/0x670 [mlx5_core] cmd_comp_notifier+0x1f/0x30 [mlx5_core] notifier_call_chain+0x35/0xb0 atomic_notifier_call_chain+0x16/0x20 mlx5_eq_async_int+0xf6/0x290 [mlx5_core] notifier_call_chain+0x35/0xb0 atomic_notifier_call_chain+0x16/0x20 irq_int_handler+0x19/0x30 [mlx5_core] __handle_irq_event_percpu+0x4b/0x160 handle_irq_event+0x2e/0x80 handle_edge_irq+0x98/0x230 __common_interrupt+0x3b/0xa0 common_interrupt+0x7b/0xa0 asm_common_interrupt+0x22/0x40 Fixes: 51d138c2610a ("net/mlx5: Fix health error state handling") Signed-off-by: Akiva Goldberger Reviewed-by: Moshe Shemesh Signed-off-by: Tariq Toukan Link: https://lore.kernel.org/r/20240509112951.590184-6-tariqt@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin --- drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c index 511e7fee39ac5..20768ef2e9d2b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c @@ -1634,6 +1634,9 @@ static int cmd_comp_notifier(struct notifier_block *nb, dev = container_of(cmd, struct mlx5_core_dev, cmd); eqe = data; + if (dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR) + return NOTIFY_DONE; + mlx5_cmd_comp_handler(dev, be32_to_cpu(eqe->data.cmd.vector), false); return NOTIFY_OK; -- 2.43.0