Linux SCSI subsystem development
* [PATCH] scsi: megaraid_sas: Fix NULL pointer dereference on firmware duplicate completion
From: Milan P. Gandhi @ 2026-05-14  7:57 UTC (permalink / raw)
  To: linux-scsi, Martin K. Petersen, James.Bottomley
  Cc: Kashyap Desai, Sumit Saxena, Shivasharan S, Tomas Henzl

Add NULL check for scmd_local in the MPI2_FUNCTION_SCSI_IO_REQUEST case
to handle firmware duplicate/stale completions.

When firmware sends a duplicate completion for a command that was already
processed and returned to the pool, the driver dereferences the command's
NULL scmd pointer and crashes.

Timeline of the bug:
1. The command completes normally and megasas_return_cmd_fusion() is called
2. This sets cmd->scmd = NULL and clears io_request with
   memset(..., 0, ...)
3. Firmware sends a duplicate/stale completion for the same SMID
   (firmware bug)
4. The driver processes the reply descriptor again
5. The cleared io_request has Function = 0 (MPI2_FUNCTION_SCSI_IO_REQUEST)
6. The switch statement matches the SCSI_IO_REQUEST case by accident
7. The driver accesses megasas_priv(NULL scmd)->status → crash at offset 0x228

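The faulting offset 0x228 is sizeof(struct scsi_cmnd) (0x220) plus the
offset of status (0x8) within the per-command private data that
megasas_priv() returns, which sits directly after struct scsi_cmnd.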
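
The offset arithmetic can be checked with a small user-space sketch. The
mock_* types below are hypothetical stand-ins, not the kernel definitions:
the 0x220 size is taken from the text above, and the private-data layout
(a pointer followed by a one-byte status) is an assumption that only gives
0x8 on a 64-bit build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct scsi_cmnd: only its size (0x220) matters. */
struct mock_scsi_cmnd {
	unsigned char opaque[0x220];
};

/* Assumed private-data layout: one pointer, then a one-byte status field. */
struct mock_cmd_priv {
	void *cmd_priv;
	uint8_t status;
};

/*
 * The private data is assumed to start directly after the command
 * structure, so reading 'status' through a NULL command pointer faults at
 * sizeof(command) + offsetof(private data, status).
 */
static size_t mock_status_fault_offset(void)
{
	return sizeof(struct mock_scsi_cmnd) +
	       offsetof(struct mock_cmd_priv, status);
}
```

On an LP64 target mock_status_fault_offset() evaluates to 0x228, matching
the faulting address in the crash signature.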

This issue was observed on PERC H330 Mini running firmware 25.5.9.0001
after 3+ days of heavy I/O load.

Crash signature:
  BUG: unable to handle kernel NULL pointer dereference at 0x228
  RIP: complete_cmd_fusion+0x428
  Function: megasas_priv(cmd_fusion->scmd)->status

Add a defensive check that skips processing when scmd_local is NULL. This
handles duplicate completions from firmware and prevents the driver from
touching commands that were already returned to the pool. The check
protects all scmd_local uses in both the SCSI_IO path and the fallthrough
LDIO path.
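
The timeline and the guard can be sketched in user space as follows. This
is a simplified model under stated assumptions, not driver code: every
mock_* name is hypothetical, and only the Function value of 0 and the
"clear scmd, zero io_request" behaviour on return are taken from the
description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Function code 0, as stated for MPI2_FUNCTION_SCSI_IO_REQUEST above. */
#define MOCK_FUNCTION_SCSI_IO_REQUEST 0x00

struct mock_io_request {
	unsigned char function;
	unsigned char pad[15];
};

struct mock_scmd {
	int status;
};

struct mock_cmd {
	struct mock_scmd *scmd;
	struct mock_io_request io_request;
};

enum mock_result {
	MOCK_COMPLETED,
	MOCK_SKIPPED_STALE,
	MOCK_OTHER_FUNCTION,
};

/* Models returning a command to the pool: drop scmd, zero the request. */
static void mock_return_cmd(struct mock_cmd *cmd)
{
	cmd->scmd = NULL;
	memset(&cmd->io_request, 0, sizeof(cmd->io_request));
}

/* Models one pass over a reply descriptor for this command. */
static enum mock_result mock_complete(struct mock_cmd *cmd)
{
	struct mock_scmd *scmd_local = cmd->scmd;

	switch (cmd->io_request.function) {
	case MOCK_FUNCTION_SCSI_IO_REQUEST:
		/* A zeroed request also lands here; skip it if scmd is gone. */
		if (!scmd_local)
			return MOCK_SKIPPED_STALE;
		scmd_local->status = 0;
		mock_return_cmd(cmd);
		return MOCK_COMPLETED;
	default:
		return MOCK_OTHER_FUNCTION;
	}
}
```

Calling mock_complete() twice on the same command models the duplicate
completion: the first call completes and returns the command, and the
second call is skipped instead of dereferencing a NULL pointer.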

Signed-off-by: Milan P. Gandhi <mgandhi@redhat.com>
---
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index 2699e4e09b5b..056cbe50e19e 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -3612,6 +3612,15 @@ complete_cmd_fusion(struct megasas_instance *instance, u32 MSIxIndex,
 			complete(&cmd_fusion->done);
 			break;
 		case MPI2_FUNCTION_SCSI_IO_REQUEST:  /*Fast Path IO.*/
+			/*
+			 * Firmware can send stale/duplicate completions for
+			 * commands already returned to the pool. scmd_local
+			 * would be NULL for such cases. Skip processing to
+			 * avoid NULL pointer access.
+			 */
+			if (!scmd_local)
+				break;
+
 			/* Update load balancing info */
 			if (fusion->load_balance_info &&
 			    (megasas_priv(cmd_fusion->scmd)->status &
-- 
2.46.2

