public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/2] Fix SAS wildcard scan on smartpqi and other controllers
@ 2026-04-21 20:20 Martin Wilck
  2026-04-21 20:20 ` [PATCH v2 1/2] scsi: smartpqi: use shost_to_hba() in pqi_scan_finished() Martin Wilck
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Martin Wilck @ 2026-04-21 20:20 UTC (permalink / raw)
  To: Martin K. Petersen, Christoph Hellwig, Don Brace
  Cc: linux-scsi, Hannes Reinecke, Lee Duncan, Martin Wilck

commit 37c4e72b0651 ("scsi: Fix sas_user_scan() to handle wildcard and
multi-channel scans") modified the way SAS drivers handle the common way of
rescanning SCSI devices using "echo - - - >/sys/class/scsi_host/host$N/scan".
Before this patch, SAS drivers would only scan channel 0 for this "wildcard
scan" scenario; after this patch, it would scan all channels up to
shost->max_channel.

This can cause massive resource usage for some drivers, as the driver needs to
send an INQUIRY to LUN 0 to every supported ID and e.g. smartpqi sets
shost->max_id to 0xffffffff. These INQUIRYs mostly fail, but the kernel needs
to set up queues, tag sets, etc. before sending the INQUIRY. Also, some SAS
drivers assign special meaning to SCSI channels such as channel 0 for physical
and channel 1 for logical devices, and thus don't support "normal" SCSI
scanning of these channels anyway.

With smartpqi and hisi_sas, actual kernel crashes due to resource exhaustion
have been observed.

A lot of SAS drivers, including the affected ones, provide scan_start() and
scan_finished() functions that offer custom, firmware-assisted scanning
functionality specific for the driver in question. The idea of this patch set
is to map the "wildcard scan" to this driver-specific scanning procedure if
the driver provides one.

The first patch is a minor fix for smartpqi to make pqi_scan_finished() work.

Changes v1 -> v2:

- export and call do_scsi_scan_host() instead of duplicating its code
  (Hannes Reinecke)

Martin Wilck (2):
  scsi: smartpqi: use shost_to_hba() in pqi_scan_finished()
  scsi: sas_user_scan: use scan_start if available

 drivers/scsi/scsi_scan.c              |  3 ++-
 drivers/scsi/scsi_transport_sas.c     | 19 +++++++++++++++++++
 drivers/scsi/smartpqi/smartpqi_init.c |  2 +-
 include/scsi/scsi_host.h              |  1 +
 4 files changed, 23 insertions(+), 2 deletions(-)

-- 
2.53.0


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v2 1/2] scsi: smartpqi: use shost_to_hba() in pqi_scan_finished()
  2026-04-21 20:20 [PATCH v2 0/2] Fix SAS wildcard scan on smartpqi and other controllers Martin Wilck
@ 2026-04-21 20:20 ` Martin Wilck
  2026-04-21 20:20 ` [PATCH v2 2/2] scsi: sas_user_scan: use scan_start if available Martin Wilck
  2026-04-30 16:30 ` [PATCH v2 0/2] Fix SAS wildcard scan on smartpqi and other controllers Martin K. Petersen
  2 siblings, 0 replies; 5+ messages in thread
From: Martin Wilck @ 2026-04-21 20:20 UTC (permalink / raw)
  To: Martin K. Petersen, Christoph Hellwig, Don Brace
  Cc: linux-scsi, Hannes Reinecke, Lee Duncan, Martin Wilck, storagedev,
	stable

shost_to_hba() is used everywhere except to obtain pqi_ctrl_info
from shosti, except in pqi_scan_finished(), where shost_priv() is used.
This causes one pointer dereference to be missed, as shost->hostdata
is a pointer in smartpqi. Fix it.

Fixes: 6c223761eb54 ("smartpqi: initial commit of Microsemi smartpqi driver")
Signed-off-by: Martin Wilck <mwilck@suse.com>
Cc: Don Brace <don.brace@microchip.com>
Cc: storagedev@microchip.com
Cc: stable@vger.kernel.org
Reviewed-by: Don Brace <don.brace@microchip.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>

---
 drivers/scsi/smartpqi/smartpqi_init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c
index b4ed991976d0..65ff50982978 100644
--- a/drivers/scsi/smartpqi/smartpqi_init.c
+++ b/drivers/scsi/smartpqi/smartpqi_init.c
@@ -2642,7 +2642,7 @@ static int pqi_scan_finished(struct Scsi_Host *shost,
 {
 	struct pqi_ctrl_info *ctrl_info;
 
-	ctrl_info = shost_priv(shost);
+	ctrl_info = shost_to_hba(shost);
 
 	return !mutex_is_locked(&ctrl_info->scan_mutex);
 }
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 2/2] scsi: sas_user_scan: use scan_start if available
  2026-04-21 20:20 [PATCH v2 0/2] Fix SAS wildcard scan on smartpqi and other controllers Martin Wilck
  2026-04-21 20:20 ` [PATCH v2 1/2] scsi: smartpqi: use shost_to_hba() in pqi_scan_finished() Martin Wilck
@ 2026-04-21 20:20 ` Martin Wilck
  2026-04-30 16:30 ` [PATCH v2 0/2] Fix SAS wildcard scan on smartpqi and other controllers Martin K. Petersen
  2 siblings, 0 replies; 5+ messages in thread
From: Martin Wilck @ 2026-04-21 20:20 UTC (permalink / raw)
  To: Martin K. Petersen, Christoph Hellwig, Don Brace
  Cc: linux-scsi, Hannes Reinecke, Lee Duncan, Martin Wilck, storagedev,
	Ranjan Kumar, Sathya Prakash Veerichetty, Kashyap Desai,
	Sumit Saxena, mpi3mr-linuxdrv.pdl, MPT-FusionLinux.pdl, Yihang Li,
	Jack Wang, John Garry

Since 37c4e72b0651 ("scsi: Fix sas_user_scan() to handle wildcard and
multi-channel scans"), a wildcard scan on a SAS host scans all channels.
This can cause excessive resource usage and even system freeze with
some controllers, e.g. smartpqi. smartpqi and other drivers provide
the scan_start() and scan_finished() methods to scan devices
efficiently. Instead of blindly scanning every device, use these
methods to do the wildcard scan when available. Use the existing
function do_scsi_scan_host() for this purpose, which therefore needs
to be exported.

Fixes: 37c4e72b0651 ("scsi: Fix sas_user_scan() to handle wildcard and multi-channel scans")
Signed-off-by: Martin Wilck <mwilck@suse.com>
Cc: Don Brace <don.brace@microchip.com>
Cc: storagedev@microchip.com
Cc: Ranjan Kumar <ranjan.kumar@broadcom.com>
Cc: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumit Saxena <sumit.saxena@broadcom.com>
Cc: mpi3mr-linuxdrv.pdl@broadcom.com
Cc: MPT-FusionLinux.pdl@broadcom.com
Cc: Yihang Li <liyihang9@h-partners.com>
Cc: Jack Wang <jinpu.wang@cloud.ionos.com>
Cc: John Garry <john.g.garry@oracle.com>

----
This patch has been tested successfully with smartpqi, but it would
affect other drivers that provide scan_start(), and we don't have
hardware to test them all. Affected drivers are aic94xx, hisi_sas,
hpsa, isci, mpi3mr, mpt3sas, mvsas, pm8001, and smartpqi.
I cc'd the maintainers of these drivers above.
---
 drivers/scsi/scsi_scan.c          |  3 ++-
 drivers/scsi/scsi_transport_sas.c | 19 +++++++++++++++++++
 include/scsi/scsi_host.h          |  1 +
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 7b11bc7de0e3..05e0e50b6e42 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -2029,7 +2029,7 @@ static void scsi_finish_async_scan(struct async_scan_data *data)
 	kfree(data);
 }
 
-static void do_scsi_scan_host(struct Scsi_Host *shost)
+void do_scsi_scan_host(struct Scsi_Host *shost)
 {
 	if (shost->hostt->scan_finished) {
 		unsigned long start = jiffies;
@@ -2043,6 +2043,7 @@ static void do_scsi_scan_host(struct Scsi_Host *shost)
 				SCAN_WILD_CARD, SCSI_SCAN_INITIAL);
 	}
 }
+EXPORT_SYMBOL(do_scsi_scan_host);
 
 static void do_scan_async(void *_data, async_cookie_t c)
 {
diff --git a/drivers/scsi/scsi_transport_sas.c b/drivers/scsi/scsi_transport_sas.c
index 13412702188e..15600394dc31 100644
--- a/drivers/scsi/scsi_transport_sas.c
+++ b/drivers/scsi/scsi_transport_sas.c
@@ -1702,6 +1702,20 @@ static void scan_channel_zero(struct Scsi_Host *shost, uint id, u64 lun)
 	}
 }
 
+/*
+ * For wildcard scans on hosts that provide a scan_start method,
+ * use that instead of blindly scanning everything.
+ */
+static int sas_user_scan_with_scan_start(struct Scsi_Host *shost)
+{
+	if (!shost->hostt->scan_finished || !shost->hostt->scan_start)
+		return 1;
+
+	/* scan_finished exists, thus do_scsi_scan_host() will use it  */
+	do_scsi_scan_host(shost);
+	return 0;
+}
+
 /*
  * SCSI scan helper
  */
@@ -1721,6 +1735,11 @@ static int sas_user_scan(struct Scsi_Host *shost, uint channel,
 		break;
 
 	case SCAN_WILD_CARD:
+
+		if (id == SCAN_WILD_CARD && lun == SCAN_WILD_CARD
+			&& !sas_user_scan_with_scan_start(shost))
+			return 0;
+
 		mutex_lock(&sas_host->lock);
 		scan_channel_zero(shost, id, lun);
 		mutex_unlock(&sas_host->lock);
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index f6e12565a81d..9717559c5171 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -805,6 +805,7 @@ scsi_template_proc_dir(const struct scsi_host_template *sht);
 #else
 #define scsi_template_proc_dir(sht) NULL
 #endif
+extern void do_scsi_scan_host(struct Scsi_Host *shost);
 extern void scsi_scan_host(struct Scsi_Host *);
 extern int scsi_resume_device(struct scsi_device *sdev);
 extern int scsi_rescan_device(struct scsi_device *sdev);
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 0/2] Fix SAS wildcard scan on smartpqi and other controllers
  2026-04-21 20:20 [PATCH v2 0/2] Fix SAS wildcard scan on smartpqi and other controllers Martin Wilck
  2026-04-21 20:20 ` [PATCH v2 1/2] scsi: smartpqi: use shost_to_hba() in pqi_scan_finished() Martin Wilck
  2026-04-21 20:20 ` [PATCH v2 2/2] scsi: sas_user_scan: use scan_start if available Martin Wilck
@ 2026-04-30 16:30 ` Martin K. Petersen
  2026-05-04 16:58   ` Martin Wilck
  2 siblings, 1 reply; 5+ messages in thread
From: Martin K. Petersen @ 2026-04-30 16:30 UTC (permalink / raw)
  To: Martin Wilck
  Cc: Martin K. Petersen, Christoph Hellwig, Don Brace, linux-scsi,
	Hannes Reinecke, Lee Duncan, Martin Wilck


Martin,

> A lot of SAS drivers, including the affected ones, provide
> scan_start() and scan_finished() functions that offer custom,
> firmware-assisted scanning functionality specific for the driver in
> question. The idea of this patch set is to map the "wildcard scan" to
> this driver-specific scanning procedure if the driver provides one.

https://sashiko.dev/#/patchset/20260421202018.511388-1-mwilck%40suse.com

-- 
Martin K. Petersen

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 0/2] Fix SAS wildcard scan on smartpqi and other controllers
  2026-04-30 16:30 ` [PATCH v2 0/2] Fix SAS wildcard scan on smartpqi and other controllers Martin K. Petersen
@ 2026-05-04 16:58   ` Martin Wilck
  0 siblings, 0 replies; 5+ messages in thread
From: Martin Wilck @ 2026-05-04 16:58 UTC (permalink / raw)
  To: Martin K. Petersen, Ranjan Kumar, Don Brace, Hannes Reinecke
  Cc: Christoph Hellwig, linux-scsi, Lee Duncan, Yihang Li, Jack Wang,
	John Garry

Hi Martin, Ranjan, Don, Hannes, all,

I need some advice / guidance.

TL;DR:

- Does it make sense to pursue the idea of this patch by adding proper
locking and making some additional amendments?
- Otherwise, can we revert 37c4e72b0651 and find a different solution
for rescanning NVMe drives on mpt3sas and mpi3mr?

Long version:

Thanks for the link with the Gemini review. My current approach is
arguably too simplistic.

The question is, then, what to do about 37c4e72b0651 ("scsi: Fix
sas_user_scan() to handle wildcard and multi-channel scans"). As a
matter of fact, this commit introduces severe regressions (system
stall) at least for smartpqi and hisi_sas. 

Looking back at the discussion about this commit [2], it was submitted
to fix scanning of NVMe drives, which mpt3sas and mpi3mr may present as
SCSI targets in a certain mode of operation, in which these drivers use
the channel number to distinguish NVMe and SCSI devices from each
other. This use of the channel number is inconsistent with how other
SAS drivers use it. Therefore this commit broke SAS scanning for other
drivers. The obvious fix would  be to revert 37c4e72b0651. This is
actually what we (SUSE) have done in our downstream kernel so far.

This would mean that Broadcom would need to come up with a different
way to handle the scanning for mpt3sas and mpi3mr in "Tri-mode".

Other than that: I still believe that the general idea of using
scan_start() and scan_finished() not only for the initial host adapter
scan but also for rescan operations is reasonable. After all, a "dumb"
wildcard scan doesn't make a lot of sense for drivers "which find their
own targets", for which scan_start() and scan_finished were introduced
in 1aa8fab2acf1 ("[SCSI] Make scsi_scan_host work for drivers which
find their own targets").

But the issues pointed out by Gemini are real, and while I could fix
the locking, the problem that some drivers may reinitialize the
hardware in scan_start() is tricky. Some detailed replies on Gemini's
review are appended below.

Thanks,
Martin

[1] https://sashiko.dev/#/patchset/20260421202018.511388-1-mwilck%40suse.com
[2] https://lore.kernel.org/linux-scsi/CAFdVvOwjy+2ORJ6uJkspiLTPF05481U7gcS4QohFOFGPqAs8ig@mail.gmail.com/

On Thu, 2026-04-30 at 12:30 -0400, Martin K. Petersen wrote:
> 
> https://sashiko.dev/#/patchset/20260421202018.511388-1-mwilck%40suse.com


P.S: Detailed replies to Gemini's review:

> Does bypassing scsi_scan_host_selected() here skip necessary power
> management and serialization steps?
> 
> Typically, wildcard scans go through scsi_scan_host_selected(), which
> secures the host adapter via scsi_autopm_get_host(), checks
> scsi_host_scan_allowed(), and takes shost->scan_mutex.
>
> By calling do_scsi_scan_host() directly, it appears scan_start()
could be
> invoked while the host adapter is runtime-suspended, which might lead
to
> hardware faults.

The call to scsi_scan_host_selected() was introduced in commit
37c4e72b0651, which my patch is intended to fix; before that commit,
scan_mutex was not taken in the sas_user_scan() code path. It's still
explicitly not taken for the important case of scanning channel 0;
sas_user_scan() takes only sas_host->lock, which protects just the rphy
list.

So yes, sas_user_scan_with_scan_start() code path should acquire the
scan_mutex. I suppose that the mutex should be acquired in
sas_user_scan(), actually.

> Also, without holding scan_mutex and checking
scsi_host_scan_allowed(),
> is there a race condition if a sysfs scan is triggered concurrently
with
> driver removal? If scsi_remove_host() tears down the host
asynchronously,
> the while loop in do_scsi_scan_host() might poll shost->hostt-
>scan_finished()
> after the driver module has unloaded, potentially causing a use-
after-free.

True. Taking the scan_mutex should fix this.

> Similarly, multiple concurrent sysfs writes could execute
scan_start()
> simultaneously without scan_mutex protection. Can this corrupt
internal
> driver state?

Right. Again, this can be fixed by taking scan_mutex.

> Additionally, is it safe to repurpose scan_start() for runtime sysfs
> scans?
> 
> Historically, shost->hostt->scan_start() seems to be used exclusively
> for one-time asynchronous boot initialization in do_scan_async().
> 
> Many SAS drivers might implement this as a hardware initialization
routine
> that resets physical links.
> 
> If exposed to a runtime userspace sysfs trigger, could this
unintentionally
> bounce active SAS links, drop active I/O, or cause controller hangs
for
> drivers that assume scan_start() is only used during boot?

This is not the case for smartpqi, but other drivers (e.g. hisi_sas)
actually seem to enable PHYs in the scan_start() code path. Not sure
how this can be fixed in general. AFAICS it would require thorough
understanding of the hardware / firmware of all affected drivers, which
I don't have.

Would it make sense pass a flag to scan_start() indicating whether we
initiate a first-time scan or a rescan, and skip the hardware
initialization in the latter case?

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2026-05-04 16:58 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-21 20:20 [PATCH v2 0/2] Fix SAS wildcard scan on smartpqi and other controllers Martin Wilck
2026-04-21 20:20 ` [PATCH v2 1/2] scsi: smartpqi: use shost_to_hba() in pqi_scan_finished() Martin Wilck
2026-04-21 20:20 ` [PATCH v2 2/2] scsi: sas_user_scan: use scan_start if available Martin Wilck
2026-04-30 16:30 ` [PATCH v2 0/2] Fix SAS wildcard scan on smartpqi and other controllers Martin K. Petersen
2026-05-04 16:58   ` Martin Wilck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox