linux-scsi.vger.kernel.org archive mirror
* [PATCH 0/2] Disable CDL probing on ATA with mpt2sas and mpt3sas
@ 2025-07-23  5:23 Damien Le Moal
  2025-07-23  5:23 ` [PATCH 1/2] scsi: Allow SCSI hosts to force-disable CDL support probing Damien Le Moal
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Damien Le Moal @ 2025-07-23  5:23 UTC
  To: linux-scsi, Martin K . Petersen, Sathya Prakash, Sreekanth Reddy,
	Suganath Prabu Subramani, MPT-FusionLinux.pdl
  Cc: Friedrich Weber

Martin,

Friedrich reported issues with HBAs using the mpt3sas driver and CDL
probing, particularly on device hot-plug. These 2 patches address this
issue by force-disabling CDL probing with mpt2sas and mpt3sas. This does
not remove any usable feature, since the firmware of all HBAs driven by
mpt2sas and mpt3sas lacks a SAT implementation capable of handling
CDL on ATA devices.

Damien Le Moal (2):
  scsi: Allow SCSI hosts to force-disable CDL support probing
  scsi: mpt3sas: Disable Command Duration Limit Probing

 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 2 ++
 drivers/scsi/scsi.c                  | 6 +++++-
 include/scsi/scsi_host.h             | 6 ++++++
 3 files changed, 13 insertions(+), 1 deletion(-)

-- 
2.50.1



* [PATCH 1/2] scsi: Allow SCSI hosts to force-disable CDL support probing
  2025-07-23  5:23 [PATCH 0/2] Disable CDL probing on ATA with mpt2sas and mpt3sas Damien Le Moal
@ 2025-07-23  5:23 ` Damien Le Moal
  2025-07-23  5:23 ` [PATCH 2/2] scsi: mpt3sas: Disable Command Duration Limit Probing Damien Le Moal
  2025-07-24 12:01 ` [PATCH 0/2] Disable CDL probing on ATA with mpt2sas and mpt3sas Friedrich Weber
  2 siblings, 0 replies; 6+ messages in thread
From: Damien Le Moal @ 2025-07-23  5:23 UTC
  To: linux-scsi, Martin K . Petersen, Sathya Prakash, Sreekanth Reddy,
	Suganath Prabu Subramani, MPT-FusionLinux.pdl
  Cc: Friedrich Weber

Users in the field have reported issues with Command Duration Limits
(CDL) support probing when hot-plugging ATA devices in enclosures served
by some SAS HBA models. These issues seem to be limited to older HBA
models that are now EOL and not getting any firmware updates.

Given that recovering from these issues sometimes even requires a full host
power cycle, allow a low level driver to declare its lack of support for
the CDL feature on ATA devices using the new SCSI host template flag
no_ata_cdl. If a low-level HBA driver sets this flag, scsi_cdl_check()
returns without issuing any command to probe an ATA device and reports
CDL as not supported.

Reported-by: Friedrich Weber <f.weber@proxmox.com>
Fixes: 624885209f31 ("scsi: core: Detect support for command duration limits")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
---
 drivers/scsi/scsi.c      | 6 +++++-
 include/scsi/scsi_host.h | 6 ++++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 534310224e8f..60ec9e8e4d8a 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -649,6 +649,7 @@ static bool scsi_cdl_check_cmd(struct scsi_device *sdev, u8 opcode, u16 sa,
  */
 void scsi_cdl_check(struct scsi_device *sdev)
 {
+	const struct scsi_host_template *hostt = sdev->host->hostt;
 	bool cdl_supported;
 	unsigned char *buf;
 
@@ -657,8 +658,11 @@ void scsi_cdl_check(struct scsi_device *sdev)
 	 * lower SPC version. This also avoids problems with old drives choking
 	 * on MAINTENANCE_IN / MI_REPORT_SUPPORTED_OPERATION_CODES with a
 	 * service action specified, as done in scsi_cdl_check_cmd().
+	 * Also ignore CDL support with ATA devices for any host declaring
+	 * lacking support for this feature.
 	 */
-	if (sdev->scsi_level < SCSI_SPC_5) {
+	if (sdev->scsi_level < SCSI_SPC_5 ||
+	   (sdev->is_ata && hostt->no_ata_cdl)) {
 		sdev->cdl_supported = 0;
 		return;
 	}
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index c53812b9026f..b815cc012f2c 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -462,6 +462,12 @@ struct scsi_host_template {
 	/* True if the controller does not support WRITE SAME */
 	unsigned no_write_same:1;
 
+	/*
+	 * True if the controller does not support Command Duration Limits on
+	 * ATA devices.
+	 */
+	unsigned no_ata_cdl:1;
+
 	/* True if the host uses host-wide tagspace */
 	unsigned host_tagset:1;
 
-- 
2.50.1
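
For readers following along, the early-return condition this patch adds to
scsi_cdl_check() can be modeled in user space. The sketch below is not kernel
code: struct host_template_model, struct sdev_model and cdl_probe_skipped()
are hypothetical stand-ins, and the value 8 used for SCSI_SPC_5 is an
assumption (the kernel headers define the real value).

```c
#include <stdbool.h>

/* Assumed value, for illustration only; the kernel headers are authoritative. */
#define SCSI_SPC_5 8

/* Hypothetical, stripped-down stand-in for struct scsi_host_template. */
struct host_template_model {
	unsigned no_ata_cdl:1;	/* set by a driver that cannot do CDL on ATA */
};

/* Hypothetical, stripped-down stand-in for struct scsi_device. */
struct sdev_model {
	int scsi_level;
	unsigned is_ata:1;
	const struct host_template_model *hostt;
};

/*
 * Mirrors the condition in the patched scsi_cdl_check(): return true when
 * the function would report CDL as unsupported without issuing any command.
 */
static bool cdl_probe_skipped(const struct sdev_model *sdev)
{
	return sdev->scsi_level < SCSI_SPC_5 ||
	       (sdev->is_ata && sdev->hostt->no_ata_cdl);
}
```

Under these assumptions, an ATA device behind a host that sets no_ata_cdl is
never probed, while a SAS device (is_ata == 0) behind the same host is still
probed as before.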



* [PATCH 2/2] scsi: mpt3sas: Disable Command Duration Limit Probing
  2025-07-23  5:23 [PATCH 0/2] Disable CDL probing on ATA with mpt2sas and mpt3sas Damien Le Moal
  2025-07-23  5:23 ` [PATCH 1/2] scsi: Allow SCSI hosts to force-disable CDL support probing Damien Le Moal
@ 2025-07-23  5:23 ` Damien Le Moal
  2025-07-24 12:01 ` [PATCH 0/2] Disable CDL probing on ATA with mpt2sas and mpt3sas Friedrich Weber
  2 siblings, 0 replies; 6+ messages in thread
From: Damien Le Moal @ 2025-07-23  5:23 UTC
  To: linux-scsi, Martin K . Petersen, Sathya Prakash, Sreekanth Reddy,
	Suganath Prabu Subramani, MPT-FusionLinux.pdl
  Cc: Friedrich Weber

None of the SAS HBA models controlled by the mpt2sas and mpt3sas drivers
support the Command Duration Limits (CDL) feature of ATA devices in
their SCSI-to-ATA translation layer (SAT) firmware. Probing ATA devices
for CDL support with scsi_cdl_check() will thus always result in CDL
being reported as not supported.

However, users in the field have reported that some of these HBA models
react badly to this probe and cause scan command errors when
scsi_cdl_check() is called, especially when the device probe results from
a device hotplug. An example of such a problem is shown below:

kernel: mpt3sas_cm0: handle(0xa) sas_address(0xREDACTED_SAS_ADDR) port_type(0x1)
kernel: scsi 5:0:1:0: Direct-Access     WDC      REDACTED_SN  C5C0 PQ: 0 ANSI: 7
kernel: scsi 5:0:1:0: SSP: handle(0x000a), sas_addr(0xREDACTED_SAS_ADDR), phy(2), device_name(REDACTED_DEVICE_NAME)
kernel: scsi 5:0:1:0: enclosure logical id (REDACTED_LOGICAL_ID), slot(0)
kernel: scsi 5:0:1:0: enclosure level(0x0000), connector name(     )
kernel: scsi 5:0:1:0: qdepth(254), tagged(1), scsi_level(8), cmd_que(1)
kernel: scsi 5:0:1:0: Power-on or device reset occurred
kernel: mpt3sas_cm0: log_info(0x31110e05): originator(PL), code(0x11), sub_code(0x0e05)
kernel: mpt3sas_cm0: log_info(0x31130000): originator(PL), code(0x13), sub_code(0x0000)
kernel: sd 5:0:1:0: Attached scsi generic sg1 type 0
kernel: sd 5:0:1:0: [sdb] Test Unit Ready failed: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
kernel: sd 5:0:1:0: [sdb] Read Capacity(16) failed: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
kernel: sd 5:0:1:0: [sdb] Sense not available.
kernel: sd 5:0:1:0: [sdb] Read Capacity(10) failed: Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
kernel: sd 5:0:1:0: [sdb] Sense not available.
kernel: sd 5:0:1:0: [sdb] 0 512-byte logical blocks: (0 B/0 B)
kernel: sd 5:0:1:0: [sdb] 0-byte physical blocks
kernel: sd 5:0:1:0: [sdb] Test WP failed, assume Write Enabled
kernel: sd 5:0:1:0: [sdb] Asking for cache data failed
kernel: sd 5:0:1:0: [sdb] Assuming drive cache: write through
kernel:  end_device-5:1: add: handle(0x000a), sas_addr(0xREDACTED_SAS_ADDR)
kernel: mpt3sas_cm0: handle(0x000a), ioc_status(0x0022) failure at drivers/scsi/mpt3sas/mpt3sas_transport.c:225/_transport_set_identify()!
kernel: sd 5:0:1:0: [sdb] Attached SCSI disk
kernel: mpt3sas_cm0: mpt3sas_transport_port_remove: removed: sas_addr(0xREDACTED_SAS_ADDR)
kernel: mpt3sas_cm0: removing handle(0x000a), sas_addr(0xREDACTED_SAS_ADDR)
kernel: mpt3sas_cm0: enclosure logical id(REDACTED_LOGICAL_ID), slot(0)
kernel: mpt3sas_cm0: enclosure level(0x0000), connector name(     )

This issue sometimes even requires a full host power cycle to recover
and get a successful device scan.

This issue is likely limited to older models that are now EOL. Since
no HBA firmware update will fix it, work around the problem by
force-disabling CDL probing on ATA devices by setting the no_ata_cdl
SCSI host template flag. This does not affect well-behaved HBA models
since, as mentioned above, these HBAs do not support ATA CDL anyway. This
change also does not affect probing of CDL support on SAS devices.

Reported-by: Friedrich Weber <f.weber@proxmox.com>
Fixes: 624885209f31 ("scsi: core: Detect support for command duration limits")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
---
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index d7d8244dfedc..32c3ab18cfbc 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -11943,6 +11943,7 @@ static const struct scsi_host_template mpt2sas_driver_template = {
 	.shost_groups			= mpt3sas_host_groups,
 	.sdev_groups			= mpt3sas_dev_groups,
 	.track_queue_depth		= 1,
+	.no_ata_cdl			= 1,
 	.cmd_size			= sizeof(struct scsiio_tracker),
 };
 
@@ -11982,6 +11983,7 @@ static const struct scsi_host_template mpt3sas_driver_template = {
 	.shost_groups			= mpt3sas_host_groups,
 	.sdev_groups			= mpt3sas_dev_groups,
 	.track_queue_depth		= 1,
+	.no_ata_cdl			= 1,
 	.cmd_size			= sizeof(struct scsiio_tracker),
 	.map_queues			= scsih_map_queues,
 	.mq_poll			= mpt3sas_blk_mq_poll,
-- 
2.50.1



* Re: [PATCH 0/2] Disable CDL probing on ATA with mpt2sas and mpt3sas
  2025-07-23  5:23 [PATCH 0/2] Disable CDL probing on ATA with mpt2sas and mpt3sas Damien Le Moal
  2025-07-23  5:23 ` [PATCH 1/2] scsi: Allow SCSI hosts to force-disable CDL support probing Damien Le Moal
  2025-07-23  5:23 ` [PATCH 2/2] scsi: mpt3sas: Disable Command Duration Limit Probing Damien Le Moal
@ 2025-07-24 12:01 ` Friedrich Weber
  2025-07-24 13:10   ` Damien Le Moal
  2 siblings, 1 reply; 6+ messages in thread
From: Friedrich Weber @ 2025-07-24 12:01 UTC
  To: Damien Le Moal, linux-scsi, Martin K . Petersen, Sathya Prakash,
	Sreekanth Reddy, Suganath Prabu Subramani, MPT-FusionLinux.pdl

Hi Damien,

On 23/07/2025 07:26, Damien Le Moal wrote:
> Martin,
> 
> Friedrich reported issues with HBAs using the mpt3sas driver and CDL
> probe, particularly on device hot-plug. These 2 patches address this
> issue by force-disabling CDL probing with mpt2sas and mpt3sas. This has
> no effect on feature limitation since the firmware of all HBAs driven by
> mpt2sas and mpt3sas do not have a SAT implementation capable of handling
> CDL on ATA devices.

Thanks for the patches, but they do not seem to fix hotplug in the setup
we've been testing [0]. We applied the patches to our downstream kernel
based on 6.14.8 (plus the dependency [1]). Looks like `is_ata` is 0,
so the CDL check still occurs. We checked with a bpftrace script [2]
which prints the following on hotplug:

[kfunc:vmlinux:scsi_cdl_check] comm=kworker/u224:1 sdev=0xffff89b483eef000 sdev.scsi_level=8 sdev.is_ata=0 hostt.no_ata_cdl=1 host=5 id=1 channel=0 lun=0 stack=
        bpf_prog_996f05907e728033_scsi_cdl_check+554
        bpf_prog_996f05907e728033_scsi_cdl_check+554
        bpf_trampoline_6442497360+67
        scsi_cdl_check+5
        scsi_probe_and_add_lun+350
        __scsi_scan_target+255
        scsi_scan_target+224
        sas_rphy_add+311
        mpt3sas_transport_port_add+1046
        _scsih_add_device.constprop.0+1247
        _firmware_event_work+7872
        process_one_work+379
        worker_thread+696
        kthread+254
        ret_from_fork+71
        ret_from_fork_asm+26
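
The conclusion above can be checked mechanically against the traced values.
The sketch below is a user-space illustration, not part of the patches:
trace_field() and probe_skipped_for_trace() are hypothetical helpers, and
SCSI_SPC_5 is assumed to be 8 (the kernel headers define the real value).

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Assumed value, for illustration only. */
#define SCSI_SPC_5 8

/*
 * Hypothetical helper: locate "key" (including its trailing '=') in a trace
 * line and parse the decimal integer that follows it.
 */
static bool trace_field(const char *line, const char *key, int *out)
{
	const char *p = strstr(line, key);

	if (!p)
		return false;
	return sscanf(p + strlen(key), "%d", out) == 1;
}

/*
 * Evaluate the patched early-return condition on one trace line.
 * Returns 1 if the probe would be skipped, 0 if it would still run,
 * and -1 if a field is missing from the line.
 */
static int probe_skipped_for_trace(const char *line)
{
	int level, is_ata, no_ata_cdl;

	if (!trace_field(line, "sdev.scsi_level=", &level) ||
	    !trace_field(line, "sdev.is_ata=", &is_ata) ||
	    !trace_field(line, "hostt.no_ata_cdl=", &no_ata_cdl))
		return -1;

	return level < SCSI_SPC_5 || (is_ata && no_ata_cdl);
}
```

Fed the fields from the trace line above (scsi_level=8, is_ata=0,
no_ata_cdl=1), the condition evaluates to false under these assumptions, which
is consistent with the probe still being issued.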

> 
> Damien Le Moal (2):
>   scsi: Allow SCSI hosts to force-disable CDL support probing
>   scsi: mpt3sas: Disable Command Duration Limit Probing
> 
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 2 ++
>  drivers/scsi/scsi.c                  | 6 +++++-
>  include/scsi/scsi_host.h             | 6 ++++++
>  3 files changed, 13 insertions(+), 1 deletion(-)
> 

[0] https://lore.kernel.org/all/3dee186c-285e-4c1c-b879-6445eb2f3edf@proxmox.com/
[1] https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/commit/?h=6.17/scsi-staging&id=b1ba03c49a711c30e24735733dfd68f2422fa150
[2]

kfunc:vmlinux:scsi_cdl_check {
	printf("[%s] comm=%s sdev=%p sdev.scsi_level=%d sdev.is_ata=%d hostt.no_ata_cdl=%d host=%d id=%d channel=%d lun=%d stack=%s",
		probe, comm,
		args->sdev, args->sdev->scsi_level, args->sdev->is_ata, args->sdev->host->hostt->no_ata_cdl, args->sdev->host->host_no, args->sdev->id, args->sdev->channel, args->sdev->lun,
		kstack());
}



* Re: [PATCH 0/2] Disable CDL probing on ATA with mpt2sas and mpt3sas
  2025-07-24 12:01 ` [PATCH 0/2] Disable CDL probing on ATA with mpt2sas and mpt3sas Friedrich Weber
@ 2025-07-24 13:10   ` Damien Le Moal
  2025-07-30 10:44     ` Friedrich Weber
  0 siblings, 1 reply; 6+ messages in thread
From: Damien Le Moal @ 2025-07-24 13:10 UTC
  To: Friedrich Weber, linux-scsi, Martin K . Petersen, Sathya Prakash,
	Sreekanth Reddy, Suganath Prabu Subramani, MPT-FusionLinux.pdl

On 7/24/25 21:01, Friedrich Weber wrote:
> Hi Damien,
> 
> On 23/07/2025 07:26, Damien Le Moal wrote:
>> Martin,
>>
>> Friedrich reported issues with HBAs using the mpt3sas driver and CDL
>> probe, particularly on device hot-plug. These 2 patches address this
>> issue by force-disabling CDL probing with mpt2sas and mpt3sas. This has
>> no effect on feature limitation since the firmware of all HBAs driven by
>> mpt2sas and mpt3sas do not have a SAT implementation capable of handling
>> CDL on ATA devices.
> 
> Thanks for the patches, but they do not seem to fix hotplug in the setup
> we've been testing [0]. We applied the patches to our downstream kernel
> based on 6.14.8 (plus the dependency [1]). Looks like `is_ata` is 0,
> so the CDL check still occurs. We checked with a bpftrace script [2]
> which prints the following on hotplug:
> 
> [kfunc:vmlinux:scsi_cdl_check] comm=kworker/u224:1 sdev=0xffff89b483eef000 sdev.scsi_level=8 sdev.is_ata=0 hostt.no_ata_cdl=1 host=5 id=1 channel=0 lun=0 stack=

sdev.is_ata=0?
So the drives that are triggering this issue are SAS drives?
Then this is even stranger, as the HBA does not do much for SAS drives. It
normally just forwards the SCSI command to the drive. That is why I
changed the initial trial patch to restrict the "no cdl" case to SATA drives,
which I thought we were dealing with here.

Going back to your earlier mails, the drives used are:
 - WUH722020BL5204: that is indeed a 20 TB SAS model... This is the drive
causing the issue.
 - WUH721818AL5204: and again a SAS model (18TB), and this drive seems fine.

So my bad for completely missing this point.

Feel free to contact me off-list so that we can escalate this internally in WD
to see if there is a FW update for the drives that could help.

But I also would like to hear comments from Broadcom folks...

Broadcom folks,

Your silence regarding issues with your HBAs is not nice, to say the least.
Properly understanding issues to fix them appropriately requires your support.
So could you *PLEASE* comment/help ?

-- 
Damien Le Moal
Western Digital Research


* Re: [PATCH 0/2] Disable CDL probing on ATA with mpt2sas and mpt3sas
  2025-07-24 13:10   ` Damien Le Moal
@ 2025-07-30 10:44     ` Friedrich Weber
  0 siblings, 0 replies; 6+ messages in thread
From: Friedrich Weber @ 2025-07-30 10:44 UTC
  To: Damien Le Moal, linux-scsi, Martin K . Petersen, Sathya Prakash,
	Sreekanth Reddy, Suganath Prabu Subramani, MPT-FusionLinux.pdl

On 24/07/2025 15:10, Damien Le Moal wrote:
> On 7/24/25 21:01, Friedrich Weber wrote:
>> Hi Damien,
>>
>> On 23/07/2025 07:26, Damien Le Moal wrote:
>>> Martin,
>>>
>>> Friedrich reported issues with HBAs using the mpt3sas driver and CDL
>>> probe, particularly on device hot-plug. These 2 patches address this
>>> issue by force-disabling CDL probing with mpt2sas and mpt3sas. This has
>>> no effect on feature limitation since the firmware of all HBAs driven by
>>> mpt2sas and mpt3sas do not have a SAT implementation capable of handling
>>> CDL on ATA devices.
>>
>> Thanks for the patches, but they do not seem to fix hotplug in the setup
>> we've been testing [0]. We applied the patches to our downstream kernel
>> based on 6.14.8 (plus the dependency [1]). Looks like `is_ata` is 0,
>> so the CDL check still occurs. We checked with a bpftrace script [2]
>> which prints the following on hotplug:
>>
>> [kfunc:vmlinux:scsi_cdl_check] comm=kworker/u224:1 sdev=0xffff89b483eef000 sdev.scsi_level=8 sdev.is_ata=0 hostt.no_ata_cdl=1 host=5 id=1 channel=0 lun=0 stack=
> 
> sdev.is_ata=0 ?
> So the drives that are triggering this issue are SAS drives ?
> Then this is even stranger as the HBA does not do much for SAS drives. It
> basically normally only forward the scsi command to the drive. Hence why I
> changed the initial trial patch to restrict the "no cdl" case to SATA drives,
> which I thought we were dealing with here.
> 
> Going back to your earlier mails, the drives used are:
>  - WUH722020BL5204: that is indeed a 20 TB SAS model... This is the drive
> causing the issue.
>  - WUH721818AL5204: and again a SAS model (18TB), and this drive seems fine.
> 
> So my bad for completely missing this point.

No problem, I could have realized this sooner too. Yes, all drives
tested so far were SAS drives.

> Feel free to contact me off-list so that we can escalate this internally in WD
> to see if there is a FW update for the drives that could help.

Thanks, I did so!

> But I also would like to hear comments from Broadcom folks...
> 
> Broadcom folks,
> 
> Your silence regarding issues with your HBAs is not nice, to say the least.
> Properly understanding issues to fix them appropriately requires your support.
> So could you *PLEASE* comment/help ?

In case it helps, I posted debug logs with a different SAS3816 HBA that
shows the same issue in the other (older) thread [1].

[1]
https://lore.kernel.org/all/eb3778e5-dfdb-4382-8cc6-da6459f14a46@proxmox.com/

