public inbox for linux-scsi@vger.kernel.org
* [PATCH v4 0/2] scsi: hisi_sas: Fix IO errors caused by hardware port ID changes
@ 2025-03-12  9:51 Xingui Yang
  2025-03-12  9:51 ` [PATCH v4 1/2] scsi: hisi_sas: Enable force phy when SATA disk directly connected Xingui Yang
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Xingui Yang @ 2025-03-12  9:51 UTC (permalink / raw)
  To: john.g.garry, yanaijie
  Cc: jejb, martin.petersen, linux-scsi, linuxarm, prime.zeng,
	yangxingui, liuyonglong, kangfenglong, liyangyang20, f.fangjian,
	xiabing14, zhonghaoquan

This series fixes a problem where I/O may be sent to the wrong disk after
the HW port ID of a directly connected device changes.

Changes from v3:
- Lose and re-find the disk when the hw port ID changes, based on John's
  suggestion

Changes from v2:
- Use asynchronous scheduling

Changes from v1:
- Fix "BUG: Atomic scheduling in clear_itct_v3_hw()"

Xingui Yang (2):
  scsi: hisi_sas: Enable force phy when SATA disk directly connected
  scsi: hisi_sas: Fix IO errors caused by hardware port ID changes

 drivers/scsi/hisi_sas/hisi_sas_main.c  | 20 ++++++++++++++++++++
 drivers/scsi/hisi_sas/hisi_sas_v2_hw.c |  9 +++++++--
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 14 ++++++++++++--
 3 files changed, 39 insertions(+), 4 deletions(-)

-- 
2.33.0



* [PATCH v4 1/2] scsi: hisi_sas: Enable force phy when SATA disk directly connected
  2025-03-12  9:51 [PATCH v4 0/2] scsi: hisi_sas: Fix IO errors caused by hardware port ID changes Xingui Yang
@ 2025-03-12  9:51 ` Xingui Yang
  2025-03-12  9:51 ` [PATCH v4 2/2] scsi: hisi_sas: Fix IO errors caused by hardware port ID changes Xingui Yang
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Xingui Yang @ 2025-03-12  9:51 UTC (permalink / raw)
  To: john.g.garry, yanaijie
  Cc: jejb, martin.petersen, linux-scsi, linuxarm, prime.zeng,
	yangxingui, liuyonglong, kangfenglong, liyangyang20, f.fangjian,
	xiabing14, zhonghaoquan

The SAS controller determines the disk to which I/Os are delivered based
on the port ID in the DQ entry when a SATA disk is directly connected.

When many phys are disconnected and then immediately reconnected while
I/O is being sent, their port IDs may change and be reused by other
links. I/O retried with the old port ID may then be sent to the wrong
disk, causing data inconsistency on the SATA disk. So enable force phy,
which forces the command to be executed on a specific phy: if the actual
phy ID of the port does not match the phy configured in the command, the
chip will stop delivering the I/O to the disk.

Fixes: ce60689e12dd ("scsi: hisi_sas: add v3 code to send ATA frame")
Signed-off-by: Xingui Yang <yangxingui@huawei.com>
Reviewed-by: Yihang Li <liyihang9@huawei.com>
---
 drivers/scsi/hisi_sas/hisi_sas_v2_hw.c |  9 +++++++--
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 14 ++++++++++++--
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
index 71cd5b4450c2..7b0dcd80f5a8 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
@@ -2501,6 +2501,7 @@ static void prep_ata_v2_hw(struct hisi_hba *hisi_hba,
 	struct hisi_sas_port *port = to_hisi_sas_port(sas_port);
 	struct sas_ata_task *ata_task = &task->ata_task;
 	struct sas_tmf_task *tmf = slot->tmf;
+	int phy_id;
 	u8 *buf_cmd;
 	int has_data = 0, hdr_tag = 0;
 	u32 dw0, dw1 = 0, dw2 = 0;
@@ -2508,10 +2509,14 @@ static void prep_ata_v2_hw(struct hisi_hba *hisi_hba,
 	/* create header */
 	/* dw0 */
 	dw0 = port->id << CMD_HDR_PORT_OFF;
-	if (parent_dev && dev_is_expander(parent_dev->dev_type))
+	if (parent_dev && dev_is_expander(parent_dev->dev_type)) {
 		dw0 |= 3 << CMD_HDR_CMD_OFF;
-	else
+	} else {
+		phy_id = device->phy->identify.phy_identifier;
+		dw0 |= (1U << phy_id) << CMD_HDR_PHY_ID_OFF;
+		dw0 |= CMD_HDR_FORCE_PHY_MSK;
 		dw0 |= 4 << CMD_HDR_CMD_OFF;
+	}
 
 	if (tmf && ata_task->force_phy) {
 		dw0 |= CMD_HDR_FORCE_PHY_MSK;
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index 48b95d9a7927..bb2142fd2c66 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -359,6 +359,10 @@
 #define CMD_HDR_RESP_REPORT_MSK		(0x1 << CMD_HDR_RESP_REPORT_OFF)
 #define CMD_HDR_TLR_CTRL_OFF		6
 #define CMD_HDR_TLR_CTRL_MSK		(0x3 << CMD_HDR_TLR_CTRL_OFF)
+#define CMD_HDR_PHY_ID_OFF		8
+#define CMD_HDR_PHY_ID_MSK		(0x1ff << CMD_HDR_PHY_ID_OFF)
+#define CMD_HDR_FORCE_PHY_OFF		17
+#define CMD_HDR_FORCE_PHY_MSK		(0x1U << CMD_HDR_FORCE_PHY_OFF)
 #define CMD_HDR_PORT_OFF		18
 #define CMD_HDR_PORT_MSK		(0xf << CMD_HDR_PORT_OFF)
 #define CMD_HDR_PRIORITY_OFF		27
@@ -1429,15 +1433,21 @@ static void prep_ata_v3_hw(struct hisi_hba *hisi_hba,
 	struct hisi_sas_cmd_hdr *hdr = slot->cmd_hdr;
 	struct asd_sas_port *sas_port = device->port;
 	struct hisi_sas_port *port = to_hisi_sas_port(sas_port);
+	int phy_id;
 	u8 *buf_cmd;
 	int has_data = 0, hdr_tag = 0;
 	u32 dw1 = 0, dw2 = 0;
 
 	hdr->dw0 = cpu_to_le32(port->id << CMD_HDR_PORT_OFF);
-	if (parent_dev && dev_is_expander(parent_dev->dev_type))
+	if (parent_dev && dev_is_expander(parent_dev->dev_type)) {
 		hdr->dw0 |= cpu_to_le32(3 << CMD_HDR_CMD_OFF);
-	else
+	} else {
+		phy_id = device->phy->identify.phy_identifier;
+		hdr->dw0 |= cpu_to_le32((1U << phy_id)
+				<< CMD_HDR_PHY_ID_OFF);
+		hdr->dw0 |= cpu_to_le32(CMD_HDR_FORCE_PHY_MSK);
 		hdr->dw0 |= cpu_to_le32(4U << CMD_HDR_CMD_OFF);
+	}
 
 	switch (task->data_dir) {
 	case DMA_TO_DEVICE:
-- 
2.33.0
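
For readers following the bit layout, here is a minimal user-space sketch
of how dw0 is composed for a directly attached SATA disk once force phy
is enabled. It is an illustration, not driver code. The PHY_ID, FORCE_PHY
and PORT offsets are taken from the v3 hw defines in the patch above.
CMD_HDR_CMD_OFF = 29 is assumed from the driver source and is not shown
in this patch. The helper names are hypothetical. The value is built in
host byte order, so the cpu_to_le32() conversion used by the driver is
left out.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets and masks as in the v3 hw defines of the patch. */
#define CMD_HDR_PHY_ID_OFF	8
#define CMD_HDR_PHY_ID_MSK	(0x1ffU << CMD_HDR_PHY_ID_OFF)
#define CMD_HDR_FORCE_PHY_OFF	17
#define CMD_HDR_FORCE_PHY_MSK	(0x1U << CMD_HDR_FORCE_PHY_OFF)
#define CMD_HDR_PORT_OFF	18
#define CMD_HDR_PORT_MSK	(0xfU << CMD_HDR_PORT_OFF)
/* Assumption: command field offset, not part of this patch. */
#define CMD_HDR_CMD_OFF		29

/*
 * Hypothetical helper: compose dw0 for a directly attached SATA disk.
 * The phy field is a 9-bit bitmap, so phy_id must be in 0..8.
 */
static inline uint32_t build_direct_sata_dw0(uint32_t port_id,
					     unsigned int phy_id)
{
	uint32_t dw0;

	assert(phy_id < 9);

	dw0 = (port_id << CMD_HDR_PORT_OFF) & CMD_HDR_PORT_MSK;
	/* One bit per phy: select exactly the phy the disk sits on. */
	dw0 |= ((1U << phy_id) << CMD_HDR_PHY_ID_OFF) & CMD_HDR_PHY_ID_MSK;
	/* Ask the chip to reject the command on any other phy. */
	dw0 |= CMD_HDR_FORCE_PHY_MSK;
	/* Command type 4 is what the patch uses for direct-attached SATA. */
	dw0 |= 4U << CMD_HDR_CMD_OFF;

	return dw0;
}

/* Hypothetical helper: read back the phy bitmap from dw0. */
static inline uint32_t dw0_phy_bitmap(uint32_t dw0)
{
	return (dw0 & CMD_HDR_PHY_ID_MSK) >> CMD_HDR_PHY_ID_OFF;
}
```

With port ID 2 and phy 3, the port field contributes 0x80000, the phy
bitmap 0x800, the force phy bit 0x20000 and the command type 0x80000000,
giving dw0 = 0x800a0800.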



* [PATCH v4 2/2] scsi: hisi_sas: Fix IO errors caused by hardware port ID changes
  2025-03-12  9:51 [PATCH v4 0/2] scsi: hisi_sas: Fix IO errors caused by hardware port ID changes Xingui Yang
  2025-03-12  9:51 ` [PATCH v4 1/2] scsi: hisi_sas: Enable force phy when SATA disk directly connected Xingui Yang
@ 2025-03-12  9:51 ` Xingui Yang
  2025-03-20 15:37 ` [PATCH v4 0/2] " John Garry
  2025-03-21  0:47 ` Martin K. Petersen
  3 siblings, 0 replies; 6+ messages in thread
From: Xingui Yang @ 2025-03-12  9:51 UTC (permalink / raw)
  To: john.g.garry, yanaijie
  Cc: jejb, martin.petersen, linux-scsi, linuxarm, prime.zeng,
	yangxingui, liuyonglong, kangfenglong, liyangyang20, f.fangjian,
	xiabing14, zhonghaoquan

The hw port ID of a phy may change when disks are inserted in batches,
leaving the port ID in hisi_sas_port and the ITCT inconsistent with the
hardware and resulting in I/O errors. The solution is to set the device
state to gone to block I/O sent to the device, and then execute a link
reset so that the disk is lost and found again with updated information.

Signed-off-by: Xingui Yang <yangxingui@huawei.com>
---
 drivers/scsi/hisi_sas/hisi_sas_main.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index da4a2ed8ee86..edb1efc241db 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -911,8 +911,28 @@ static void hisi_sas_phyup_work_common(struct work_struct *work,
 		container_of(work, typeof(*phy), works[event]);
 	struct hisi_hba *hisi_hba = phy->hisi_hba;
 	struct asd_sas_phy *sas_phy = &phy->sas_phy;
+	struct asd_sas_port *sas_port = sas_phy->port;
+	struct hisi_sas_port *port = phy->port;
+	struct device *dev = hisi_hba->dev;
+	struct domain_device *port_dev;
 	int phy_no = sas_phy->id;
 
+	if (!test_bit(HISI_SAS_RESETTING_BIT, &hisi_hba->flags) &&
+	    sas_port && port && (port->id != phy->port_id)) {
+		dev_info(dev, "phy%d's hw port id changed from %d to %llu\n",
+				phy_no, port->id, phy->port_id);
+		port_dev = sas_port->port_dev;
+		if (port_dev && !dev_is_expander(port_dev->dev_type)) {
+			/*
+			 * Set the device state to gone to block
+			 * sending IO to the device.
+			 */
+			set_bit(SAS_DEV_GONE, &port_dev->state);
+			hisi_sas_notify_phy_event(phy, HISI_PHYE_LINK_RESET);
+			return;
+		}
+	}
+
 	phy->wait_phyup_cnt = 0;
 	if (phy->identify.target_port_protocols == SAS_PROTOCOL_SSP)
 		hisi_hba->hw->sl_notify_ssp(hisi_hba, phy_no);
-- 
2.33.0
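
The decision added to hisi_sas_phyup_work_common() can be modelled in
isolation as below. This is a simplified sketch under stated
assumptions, not the driver code: struct phy_model and phyup_check() are
hypothetical stand-ins, and plain booleans replace the kernel flag bits
and pointer checks (HISI_SAS_RESETTING_BIT, sas_port/port being set,
SAS_DEV_GONE).

```c
#include <stdbool.h>
#include <stdint.h>

enum phyup_action {
	PHYUP_CONTINUE,		/* proceed with the normal phy-up handling */
	PHYUP_LINK_RESET,	/* device marked gone, link reset requested */
};

/* Hypothetical model of the state the new check looks at. */
struct phy_model {
	bool resetting;		/* models HISI_SAS_RESETTING_BIT */
	bool has_port;		/* models sas_port && port being non-NULL */
	uint8_t cached_port_id;	/* models port->id kept by the driver */
	uint64_t hw_port_id;	/* models phy->port_id read from hw */
	bool has_port_dev;	/* models sas_port->port_dev != NULL */
	bool is_expander;	/* models dev_is_expander() */
	bool dev_gone;		/* models SAS_DEV_GONE in port_dev->state */
};

static inline enum phyup_action phyup_check(struct phy_model *m)
{
	if (!m->resetting && m->has_port &&
	    m->cached_port_id != m->hw_port_id) {
		if (m->has_port_dev && !m->is_expander) {
			/* Block further I/O to the stale device. */
			m->dev_gone = true;
			return PHYUP_LINK_RESET;
		}
	}

	return PHYUP_CONTINUE;
}
```

Only a directly attached device whose cached port ID differs from the
hardware value takes the link reset path. A controller reset in
progress, an unformed port, or an expander-attached device falls through
to the existing phy-up handling.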



* Re: [PATCH v4 0/2] scsi: hisi_sas: Fix IO errors caused by hardware port ID changes
  2025-03-12  9:51 [PATCH v4 0/2] scsi: hisi_sas: Fix IO errors caused by hardware port ID changes Xingui Yang
  2025-03-12  9:51 ` [PATCH v4 1/2] scsi: hisi_sas: Enable force phy when SATA disk directly connected Xingui Yang
  2025-03-12  9:51 ` [PATCH v4 2/2] scsi: hisi_sas: Fix IO errors caused by hardware port ID changes Xingui Yang
@ 2025-03-20 15:37 ` John Garry
  2025-03-24  7:20   ` yangxingui
  2025-03-21  0:47 ` Martin K. Petersen
  3 siblings, 1 reply; 6+ messages in thread
From: John Garry @ 2025-03-20 15:37 UTC (permalink / raw)
  To: Xingui Yang, yanaijie
  Cc: jejb, martin.petersen, linux-scsi, linuxarm, prime.zeng,
	liuyonglong, kangfenglong, liyangyang20, f.fangjian, xiabing14,
	zhonghaoquan

On 12/03/2025 09:51, Xingui Yang wrote:
> This series of patches is used to solve the problem that IO may be sent to
> the incorrect disk after the HW port ID of the directly connected device
> is changed.
> 
> Changes from v3:
> - Lose and find the disk when hw port id changes based on John's suggestion
> 
> Changes from v2:
> - Use asynchronous scheduling
> 
> Changes from v1:
> - Fix "BUG: Atomic scheduling in clear_itct_v3_hw()"
> 
> Xingui Yang (2):
>    scsi: hisi_sas: Enable force phy when SATA disk directly connected
>    scsi: hisi_sas: Fix IO errors caused by hardware port ID changes

So this is all solved in the LLDD; that is good.

thanks

> 
>   drivers/scsi/hisi_sas/hisi_sas_main.c  | 20 ++++++++++++++++++++
>   drivers/scsi/hisi_sas/hisi_sas_v2_hw.c |  9 +++++++--
>   drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 14 ++++++++++++--
>   3 files changed, 39 insertions(+), 4 deletions(-)
> 



* Re: [PATCH v4 0/2] scsi: hisi_sas: Fix IO errors caused by hardware port ID changes
  2025-03-12  9:51 [PATCH v4 0/2] scsi: hisi_sas: Fix IO errors caused by hardware port ID changes Xingui Yang
                   ` (2 preceding siblings ...)
  2025-03-20 15:37 ` [PATCH v4 0/2] " John Garry
@ 2025-03-21  0:47 ` Martin K. Petersen
  3 siblings, 0 replies; 6+ messages in thread
From: Martin K. Petersen @ 2025-03-21  0:47 UTC (permalink / raw)
  To: Xingui Yang
  Cc: john.g.garry, yanaijie, jejb, martin.petersen, linux-scsi,
	linuxarm, prime.zeng, liuyonglong, kangfenglong, liyangyang20,
	f.fangjian, xiabing14, zhonghaoquan


Xingui,

> This series of patches is used to solve the problem that IO may be
> sent to the incorrect disk after the HW port ID of the directly
> connected device is changed.

Applied to 6.15/scsi-staging, thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering


* Re: [PATCH v4 0/2] scsi: hisi_sas: Fix IO errors caused by hardware port ID changes
  2025-03-20 15:37 ` [PATCH v4 0/2] " John Garry
@ 2025-03-24  7:20   ` yangxingui
  0 siblings, 0 replies; 6+ messages in thread
From: yangxingui @ 2025-03-24  7:20 UTC (permalink / raw)
  To: John Garry, yanaijie
  Cc: jejb, martin.petersen, linux-scsi, linuxarm, prime.zeng,
	liuyonglong, kangfenglong, liyangyang20, f.fangjian, xiabing14,
	zhonghaoquan


On 2025/3/20 23:37, John Garry wrote:
> On 12/03/2025 09:51, Xingui Yang wrote:
>> This series of patches is used to solve the problem that IO may be 
>> sent to
>> the incorrect disk after the HW port ID of the directly connected device
>> is changed.
>>
>> Changes from v3:
>> - Lose and find the disk when hw port id changes based on John's 
>> suggestion
>>
>> Changes from v2:
>> - Use asynchronous scheduling
>>
>> Changes from v1:
>> - Fix "BUG: Atomic scheduling in clear_itct_v3_hw()"
>>
>> Xingui Yang (2):
>>    scsi: hisi_sas: Enable force phy when SATA disk directly connected
>>    scsi: hisi_sas: Fix IO errors caused by hardware port ID changes
> 
> So this is all solved in the LLDD, then this is good

Yes, thank you for your advice.

Thanks,
Xingui




