[REGRESSION?] scsi: sas: wildcard user scan may iterate over huge max_id

public inbox for linux-scsi@vger.kernel.org

* [REGRESSION?] scsi: sas: wildcard user scan may iterate over huge max_id
@ 2026-03-28  2:28 Li Lingfeng
  2026-03-30  8:08 ` Li Lingfeng
  2026-03-30 12:18 ` James Bottomley
  0 siblings, 2 replies; 4+ messages in thread
From: Li Lingfeng @ 2026-03-28  2:28 UTC (permalink / raw)
  To: ranjan.kumar
  Cc: linux-scsi, jejb, martin.petersen, linux-kernel@vger.kernel.org,
	rajsekhar.chundru, sathya.prakash, sumit.saxena,
	chandrakanth.patil, prayas.patel, yangerkun, zhangyi (F), Hou Tao,
	chengzhihao1@huawei.com, jiangjianjun3, yuancan

Hi,

I think commit 37c4e72b0651 ("scsi: Fix sas_user_scan() to handle wildcard
and multi-channel scans") may introduce a regression for wildcard scans on
some SAS hosts.

Userspace trigger:

   echo "- - -" > /sys/class/scsi_host/host0/scan

results in:

   channel = SCAN_WILD_CARD
   id      = SCAN_WILD_CARD
   lun     = SCAN_WILD_CARD
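
How the three "-" fields turn into wildcards can be sketched in userspace as below. This is a minimal sketch, not the kernel's code: the helper names parse_field() and parse_scan_request() are made up for illustration, and SCAN_WILD_CARD is assumed to be an all-ones unsigned value.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: the wildcard is an all-ones unsigned value. */
#define SCAN_WILD_CARD (~0U)

/* Parse one field: "-" means wildcard, anything else must be a full number. */
static int parse_field(const char *src, unsigned int *val)
{
	char *end;
	unsigned long v;

	if (strcmp(src, "-") == 0) {
		*val = SCAN_WILD_CARD;
		return 0;
	}
	v = strtoul(src, &end, 0);
	/* Reject empty/partial numbers and values that collide with the wildcard. */
	if (end == src || *end != '\0' || v >= SCAN_WILD_CARD)
		return -1;
	*val = (unsigned int)v;
	return 0;
}

/* Hypothetical helper: split "<channel> <id> <lun>" and parse each field. */
static int parse_scan_request(const char *str, unsigned int *channel,
			      unsigned int *id, unsigned int *lun)
{
	char s1[16], s2[16], s3[16], junk;

	assert(str && channel && id && lun);
	/* Exactly three whitespace-separated tokens; a fourth one is an error. */
	if (sscanf(str, "%10s %10s %10s %c", s1, s2, s3, &junk) != 3)
		return -1;
	if (parse_field(s1, channel) || parse_field(s2, id) ||
	    parse_field(s3, lun))
		return -1;
	return 0;
}
```

Passing "- - -" to parse_scan_request() leaves all three outputs equal to SCAN_WILD_CARD, while "0 5 -" pins the channel and id and keeps only the lun wild.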

Before this commit, sas_user_scan() iterated sas_host->rphy_list and called
scsi_scan_target() for matching rphys. In effect, scanning was limited to
channel 0 and to target ids present in sas_host->rphy_list.
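
The pre-commit matching described above can be modeled roughly as follows. This is a sketch under assumptions: struct rphy_model and old_model_scan_count() are hypothetical names, and the filter on end devices with an assigned target id is my reading of the old loop rather than something stated in this mail.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the wildcard is an all-ones unsigned value. */
#define SCAN_WILD_CARD (~0U)

/* Hypothetical stand-in for the rphy fields that matter to the scan. */
struct rphy_model {
	int is_end_device;   /* assumed: only end devices were scanned */
	int scsi_target_id;  /* assumed: -1 means no target id assigned */
};

/*
 * Count the targets the old model would scan: one scan per listed rphy
 * that matches the request, always on channel 0.
 */
static size_t old_model_scan_count(const struct rphy_model *list, size_t n,
				   unsigned int channel, unsigned int id)
{
	size_t i, scans = 0;

	assert(list != NULL || n == 0);
	/* Any explicit non-zero channel matched nothing in this model. */
	if (channel != SCAN_WILD_CARD && channel != 0)
		return 0;
	for (i = 0; i < n; i++) {
		if (!list[i].is_end_device || list[i].scsi_target_id < 0)
			continue;
		if (id == SCAN_WILD_CARD ||
		    id == (unsigned int)list[i].scsi_target_id)
			scans++;
	}
	return scans;
}
```

The point of the model is that the work is proportional to the length of the rphy list, not to shost->max_id.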

After this commit, sas_user_scan() does:

   - scan channel 0 via scan_channel_zero()
   - scan channels 1..shost->max_channel via scsi_scan_host_selected()

When id == SCAN_WILD_CARD, the latter path goes through
scsi_scan_channel(), which iterates ids from 0 to shost->max_id.

This looks problematic for drivers that use a very large max_id. For
example, smartpqi sets:

   shost->max_id = ~0;

In that case, a wildcard scan may end up iterating from id 0 to ~0 in
scsi_scan_channel(). In my testing/analysis, this makes the scan take a
very long time, and the id-space walk itself does not seem meaningful for
this SAS transport scan path.
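
To put a number on that walk, here is a rough sketch. It assumes the wildcard pass visits ids 0 through max_id - 1 on each channel and that max_id is a 32-bit unsigned field, so ~0 is 4294967295. Both helper names are hypothetical.

```c
#include <stdint.h>

/*
 * Target ids visited by one wildcard pass over a single channel,
 * assuming the loop covers 0 .. max_id - 1.
 */
static uint64_t ids_per_wildcard_channel(unsigned int max_id)
{
	return (uint64_t)max_id;
}

/* Target ids visited across the extra channels 1 .. max_channel. */
static uint64_t ids_for_extra_channels(unsigned int max_channel,
				       unsigned int max_id)
{
	/* Widen before multiplying so the product cannot wrap at 32 bits. */
	return (uint64_t)max_channel * ids_per_wildcard_channel(max_id);
}
```

With max_id = ~0, a single extra channel already means 4294967295 ids to visit, regardless of how many targets the transport actually discovered.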

So while the commit fixes incomplete wildcard channel handling, it also
appears to expand the id scan range from:

   sas_host->rphy_list target ids

to:

   0..shost->max_id

for the additional channels.

It seems to me that wildcard SAS scans should probably remain bounded by
transport-discovered SAS targets, instead of falling back to a host-wide
id enumeration for the extra channels. One possible direction may be to
avoid calling scsi_scan_host_selected() with id == SCAN_WILD_CARD from
sas_user_scan(), or otherwise constrain the id range in a transport-aware
way.

Am I understanding this correctly? If so, what would be the preferred way
to address this? I would appreciate feedback on whether this is considered
a real regression, and on the best fix direction.

Thanks,
Lingfeng.


* Re: [REGRESSION?] scsi: sas: wildcard user scan may iterate over huge max_id
  2026-03-28  2:28 [REGRESSION?] scsi: sas: wildcard user scan may iterate over huge max_id Li Lingfeng
@ 2026-03-30  8:08 ` Li Lingfeng
  2026-03-30 12:18 ` James Bottomley
  1 sibling, 0 replies; 4+ messages in thread
From: Li Lingfeng @ 2026-03-30  8:08 UTC (permalink / raw)
  To: ranjan.kumar
  Cc: linux-scsi, jejb, martin.petersen, linux-kernel@vger.kernel.org,
	rajsekhar.chundru, sathya.prakash, sumit.saxena,
	chandrakanth.patil, prayas.patel, yangerkun, zhangyi (F), Hou Tao,
	chengzhihao1@huawei.com, jiangjianjun3, yuancan

Hi,

I have one more question after looking at the SAS scan paths a bit more.

What caught my attention is that sas_rphy_add() and the old
sas_user_scan() seemed to follow the same scanning model:

   - scan via channel 0
   - use rphy->scsi_target_id as the target id

For example, sas_rphy_add() does:

   scsi_scan_target(&rphy->dev, 0, rphy->scsi_target_id, lun,
                    SCSI_SCAN_INITIAL);

So before this change, these two paths looked consistent to me.

Now sas_user_scan() has moved to a different model for the extra channels,
while sas_rphy_add() still uses the original one. This makes me wonder
whether these two paths are expected to stay consistent, and if so, which
direction is actually intended.

Should sas_rphy_add() also be changed to follow the new sas_user_scan()
behavior? Or is sas_rphy_add() a hint that sas_user_scan() should remain
aligned with the original rphy-based scan model instead?

I am not very familiar with the intended SCSI/SAS scanning design here,
so this may be a naive question. I just wanted to check whether the
current inconsistency between sas_rphy_add() and sas_user_scan() is
expected, or whether one of them should be adjusted so that both follow
the same model again.

Any clarification would be greatly appreciated.

Thanks,
Lingfeng.



* Re: [REGRESSION?] scsi: sas: wildcard user scan may iterate over huge max_id
  2026-03-28  2:28 [REGRESSION?] scsi: sas: wildcard user scan may iterate over huge max_id Li Lingfeng
  2026-03-30  8:08 ` Li Lingfeng
@ 2026-03-30 12:18 ` James Bottomley
  2026-03-31  2:33   ` Li Lingfeng
  1 sibling, 1 reply; 4+ messages in thread
From: James Bottomley @ 2026-03-30 12:18 UTC (permalink / raw)
  To: Li Lingfeng, ranjan.kumar
  Cc: linux-scsi, jejb, martin.petersen, linux-kernel@vger.kernel.org,
	rajsekhar.chundru, sathya.prakash, sumit.saxena,
	chandrakanth.patil, prayas.patel, yangerkun, zhangyi (F), Hou Tao,
	chengzhihao1@huawei.com, jiangjianjun3, yuancan

On Sat, 2026-03-28 at 10:28 +0800, Li Lingfeng wrote:
> Hi,
> 
> I think commit 37c4e72b0651 ("scsi: Fix sas_user_scan() to handle
> wildcard and multi-channel scans") may introduce a regression for
> wildcard scans on some SAS hosts.
> 
> Userspace trigger:
> 
>    echo "- - -" > /sys/class/scsi_host/host0/scan
> 
> results in:
> 
>    channel = SCAN_WILD_CARD
>    id      = SCAN_WILD_CARD
>    lun     = SCAN_WILD_CARD
> 
> Before this commit, sas_user_scan() iterated sas_host->rphy_list and
> called scsi_scan_target() for matching rphys. In effect, scanning was
> limited to channel 0 and to target ids present in
> sas_host->rphy_list.
> 
> After this commit, sas_user_scan() does:
> 
>    - scan channel 0 via scan_channel_zero()
>    - scan channels 1..shost->max_channel via scsi_scan_host_selected()
> 
> When id == SCAN_WILD_CARD, the latter path goes through
> scsi_scan_channel(), which iterates ids from 0 to shost->max_id.
> 
> This looks problematic for drivers that use a very large max_id. For
> example, smartpqi sets:
> 
>    shost->max_id = ~0;
> 
> In that case, a wildcard scan may end up iterating from id 0 to ~0 in
> scsi_scan_channel(). In my testing/analysis, this makes the scan take
> a very long time, and the id-space walk itself does not seem
> meaningful for this SAS transport scan path.
> 
> So while the commit fixes incomplete wildcard channel handling, it
> also appears to expand the id scan range from:
> 
>    sas_host->rphy_list target ids
> 
> to:
> 
>    0..shost->max_id
> 
> for the additional channels.
> 
> It seems to me that wildcard SAS scans should probably remain bounded
> by transport-discovered SAS targets, instead of falling back to a
> host-wide id enumeration for the extra channels. One possible
> direction may be to avoid calling scsi_scan_host_selected() with id
> == SCAN_WILD_CARD from sas_user_scan(), or otherwise constrain the id
> range in a transport-aware way.
> 
> Am I understanding this correctly? If so, what would be the preferred
> way to address this? I would appreciate feedback on whether this is
> considered a real regression, and on the best fix direction.

In the case of smartpqi, it isn't designed to be user scanned, I think.
So, as you say, it would take a long time to scan one channel.  Since
it sets max_channel to 3, it would only take 4 times longer, which
hardly constitutes a regression.

Doing serial scans is very SCSI-2, so most discoverable device fabrics
don't bother and keep the default setting for max_channel (which is
zero).  The only devices that seem to care about this at all are fat
firmware devices that bundle RAID or other capabilities by repurposing
channels, and they seem to be the ones that want this behaviour:

https://lore.kernel.org/linux-scsi/CAFdVvOwjy+2ORJ6uJkspiLTPF05481U7gcS4QohFOFGPqAs8ig@mail.gmail.com/

Regards,

James



* Re: [REGRESSION?] scsi: sas: wildcard user scan may iterate over huge max_id
  2026-03-30 12:18 ` James Bottomley
@ 2026-03-31  2:33   ` Li Lingfeng
  0 siblings, 0 replies; 4+ messages in thread
From: Li Lingfeng @ 2026-03-31  2:33 UTC (permalink / raw)
  To: James Bottomley, ranjan.kumar
  Cc: linux-scsi, jejb, martin.petersen, linux-kernel@vger.kernel.org,
	rajsekhar.chundru, sathya.prakash, sumit.saxena,
	chandrakanth.patil, prayas.patel, yangerkun, zhangyi (F), Hou Tao,
	chengzhihao1@huawei.com, jiangjianjun3, yuancan


Hi James,

Thank you very much for the reply and for the additional background.

I would like to clarify one point about the performance regression I was
trying to describe.

I was not referring to the change from scanning one channel to scanning
multiple channels. My concern was about the change in the target ID scan
range within a single channel.

Before commit 37c4e72b0651 ("scsi: Fix sas_user_scan() to handle wildcard
and multi-channel scans"), the SAS path was effectively bounded by
rphy->scsi_target_id values discovered by the transport. After that change,
for the additional channels, the scan may go through scsi_scan_channel()
and iterate IDs in the range 0..shost->max_id when id == SCAN_WILD_CARD.
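
The two ways of measuring the change can be put side by side in a small sketch. The helper names are hypothetical, the discovered-target count is an input the caller supplies, and the id walk is assumed to cover 0 through max_id - 1.

```c
#include <assert.h>
#include <stdint.h>

/* Channels covered by a wildcard scan over 0 .. max_channel. */
static uint64_t channel_factor(unsigned int max_channel)
{
	return (uint64_t)max_channel + 1;
}

/*
 * Growth of the per-channel id range: ids walked under the new bound
 * divided by the number of transport-discovered target ids (rounded down).
 */
static uint64_t id_range_factor(unsigned int discovered_ids,
				unsigned int max_id)
{
	assert(discovered_ids > 0);
	return (uint64_t)max_id / discovered_ids;
}
```

For max_channel = 3 the first factor is 4. The second factor depends on max_id and grows with it: with max_id = ~0 and, say, 8 discovered targets, it is 536870911.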

So the performance concern I had in mind was not really:

   "one channel" -> "multiple channels"

but rather:

   "scan transport-discovered IDs" -> "scan 0..max_id within a channel"

That said, after reading your reply, my current understanding is that the
motivation for 37c4e72b0651 is mainly to support controllers such as
mpt3sas and mpi3mr, where non-zero channels are meaningful and expected.

From that perspective, it seems to me that setups that do not involve
mpt3sas/mpi3mr-like usage could simply not take 37c4e72b0651, while
anyone who does take it should accept that it may bring this kind of
scan-time performance regression on some hosts.

Does that sound like a reasonable way to look at it?

Thanks again for the clarification.

Regards,
Lingfeng.

