megaraid_sas: multiple FALLOC_FL_ZERO_RANGE causes timeouts and resets on MegaRAID 9560-8i 4GB since 5.19

public inbox for linux-scsi@vger.kernel.org

* megaraid_sas: multiple FALLOC_FL_ZERO_RANGE causes timeouts and resets on MegaRAID 9560-8i 4GB since 5.19
@ 2024-02-10  1:18 Vitaly Chikunov
  2024-02-15 15:18 ` Vitaly Chikunov
  0 siblings, 1 reply; 8+ messages in thread
From: Vitaly Chikunov @ 2024-02-10  1:18 UTC (permalink / raw)
  To: megaraidlinux.pdl, linux-scsi, Kashyap Desai, Sumit Saxena,
	Shivasharan S, Chandrakanth patil

Hi,

We have been getting timeouts and controller resets since 5.19.5 (vanilla
v5.19 was not tested; the tests below are on 6.6.15) when several
FALLOC_FL_ZERO_RANGE requests (fallocate(2)) are issued to the device
consecutively, without any delay between them (3-5 are enough to trigger
the condition). As a result, mkfs.ext4, for example, slows down
dramatically when initializing a filesystem. This happens on an aarch64
(Kunpeng-920) server.

Reproducer:

  # for ((i=0;i<5;i++)); do echo $i; fallocate -z -l 2097152 /dev/sdc; done
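
For reference, each iteration of the loop above boils down to a single fallocate(2) call with FALLOC_FL_ZERO_RANGE. Here is a minimal C sketch of one iteration. The helper name zero_range() is made up for illustration, and error handling is reduced to returning a negative errno:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <unistd.h>

/* Hypothetical helper: zero `len` bytes starting at `offset` of the file
 * or block device at `path`. Returns 0 on success, -errno on failure. */
static int zero_range(const char *path, off_t offset, off_t len)
{
	int fd, ret = 0;

	assert(path != NULL);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	/* The request that `fallocate -z -l <len> <path>` issues. */
	if (fallocate(fd, FALLOC_FL_ZERO_RANGE, offset, len) < 0)
		ret = -errno;

	close(fd);
	return ret;
}
```

Calling zero_range("/dev/sdc", 0, 2097152) five times in a row corresponds to the shell loop. Like the loop, it overwrites data on the device.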

Example dmesg messages after the problematic calls:

  Feb 06 19:44:07 host-226 kernel: sd 0:2:4:0: [sdc] tag#4752 Abort request is for SMID: 4753
  Feb 06 19:44:07 host-226 kernel: sd 0:2:4:0: attempting task abort! scmd(0x00000000d51beacc) tm_dev_handle 0x4
  Feb 06 19:44:07 host-226 kernel: megaraid_sas 0000:01:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
  Feb 06 19:44:07 host-226 kernel: megaraid_sas 0000:01:00.0: megasas_enable_intr_fusion is called outbound_intr_mask:0x40000000
  Feb 06 19:44:07 host-226 kernel: sd 0:2:4:0: [sdc] tag#4752 task abort FAILED!! scmd(0x00000000d51beacc)
  Feb 06 19:44:07 host-226 kernel: sd 0:2:4:0: [sdc] tag#4752 CDB: Write(10) 2a 00 00 00 00 00 00 00 08 00
  Feb 06 19:45:04 host-226 kernel: sd 0:2:4:0: [sdc] tag#8292 Abort request is for SMID: 8293
  Feb 06 19:45:06 host-226 kernel: sd 0:2:4:0: attempting task abort! scmd(0x00000000d9406c9c) tm_dev_handle 0x4
  Feb 06 19:45:06 host-226 kernel: sd 0:2:4:0: [sdc] tag#4752 BRCM Debug mfi stat 0x2d, data len requested/completed 0x1000/0x0
  Feb 06 19:45:06 host-226 kernel: sd 0:2:4:0: [sdc] tag#8292 task abort SUCCESS!! scmd(0x00000000d9406c9c)
  Feb 06 19:45:06 host-226 kernel: sd 0:2:4:0: [sdc] tag#8292 CDB: Write Same(10) 41 00 03 4c 00 10 00 10 00 00
  Feb 06 19:45:06 host-226 kernel: sd 0:2:4:0: attempting target reset! scmd(0x00000000d51beacc) tm_dev_handle: 0x4
  Feb 06 19:45:06 host-226 kernel: megaraid_sas 0000:01:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
  Feb 06 19:45:06 host-226 kernel: megaraid_sas 0000:01:00.0: megasas_enable_intr_fusion is called outbound_intr_mask:0x40000000
  Feb 06 19:45:06 host-226 kernel: sd 0:2:4:0: [sdc] tag#4752 target reset SUCCESS!!
  Feb 06 19:45:06 host-226 kernel: sd 0:2:4:0: Power-on or device reset occurred

Excerpt from the controller event log (from storcli):

  Event Description: PD 05(e0xfb/s4) Path 5e8b4700e35e2004  reset (Type 03)
  Event Description: Drive PD 05(e0xfb/s4) link speed changed
  Event Description: Unexpected sense: Encl PD fb Path 5e8b4700e35e201e, CDB: 3c 01 05 00 00 00 00 00 10 00, Sense: b/4b/05
  Event Description: Unexpected sense: Encl PD fb Path 5e8b4700e35e201e, CDB: 3c 01 05 00 00 00 00 00 10 00, Sense: b/4b/05
  Event Description: PD 05(e0xfb/s4) Path 5e8b4700e35e2004  reset (Type 03)
  Event Description: Drive PD 05(e0xfb/s4) link speed changed
  Event Description: Unexpected sense: PD 05(e0xfb/s4) Path 5e8b4700e35e2004, CDB: 41 00 00 00 00 00 00 10 00 00, Sense: 6/29/00
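
The last CDB above can be decoded by hand to check that it matches the reproducer. Below is a small C sketch, assuming the standard WRITE SAME(10) layout from SBC (opcode 0x41, big-endian LBA in bytes 2-5, big-endian block count in bytes 7-8). The struct and function names are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct write_same10 {
	uint32_t lba;      /* first logical block */
	uint16_t nblocks;  /* number of logical blocks */
};

/* Decode a 10-byte WRITE SAME(10) CDB.
 * Returns 0 on success, -1 if the opcode is not 0x41. */
static int decode_write_same10(const uint8_t cdb[10], struct write_same10 *out)
{
	assert(cdb != NULL && out != NULL);

	if (cdb[0] != 0x41)
		return -1;

	out->lba = ((uint32_t)cdb[2] << 24) | ((uint32_t)cdb[3] << 16) |
		   ((uint32_t)cdb[4] << 8) | (uint32_t)cdb[5];
	out->nblocks = (uint16_t)(((unsigned int)cdb[7] << 8) | cdb[8]);
	return 0;
}
```

For `41 00 00 00 00 00 00 10 00 00` this gives LBA 0 and 0x1000 = 4096 blocks. With 512-byte logical blocks that is 4096 * 512 = 2097152 bytes, the length passed to fallocate in the reproducer.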

The tests were run on the latest firmware (at the time of writing):

  Product Name = MegaRAID 9560-8i 4GB
  Serial Number = SKC4006982
  Firmware Package Build = 52.28.0-5305
  Firmware Version = 5.280.02-3972
  PSOC FW Version = 0x001A
  PSOC Hardware Version = 0x000A
  PSOC Part Number = 29211-260-4GB
  NVDATA Version = 5.2800.00-0752
  CBB Version = 28.250.04.00
  Bios Version = 7.28.00.0_0x071C0000
  HII Version = 07.28.04.00
  HIIA Version = 07.28.04.00
  Driver Name = megaraid_sas
  Driver Version = 07.725.01.00-rc1

I also tried the latest available megaraid_sas driver (07.728.04.00), which
is not yet merged into mainline, but it does not resolve the problem.

Thanks,



* Re: megaraid_sas: multiple FALLOC_FL_ZERO_RANGE causes timeouts and resets on MegaRAID 9560-8i 4GB since 5.19
  2024-02-10  1:18 megaraid_sas: multiple FALLOC_FL_ZERO_RANGE causes timeouts and resets on MegaRAID 9560-8i 4GB since 5.19 Vitaly Chikunov
@ 2024-02-15 15:18 ` Vitaly Chikunov
  2024-02-15 18:42   ` Martin K. Petersen
  0 siblings, 1 reply; 8+ messages in thread
From: Vitaly Chikunov @ 2024-02-15 15:18 UTC (permalink / raw)
  To: megaraidlinux.pdl, linux-scsi, Kashyap Desai, Sumit Saxena,
	Shivasharan S, Chandrakanth patil

Hi,

On Sat, Feb 10, 2024 at 04:18:31AM +0300, Vitaly Chikunov wrote:
> 
> We have been getting timeouts and controller resets since 5.19.5 (vanilla
> v5.19 was not tested; the tests below are on 6.6.15) when several
> FALLOC_FL_ZERO_RANGE requests (fallocate(2)) are issued to the device
> consecutively, without any delay between them (3-5 are enough to trigger
> the condition). As a result, mkfs.ext4, for example, slows down
> dramatically when initializing a filesystem. This happens on an aarch64
> (Kunpeng-920) server.

I was told that a bisect identified the following commit as the cause of
the problem described above:

  commit c92a6b5d63359dd6d2ce6ea88ecd8e31dd769f6b
  Author:     Martin K. Petersen <martin.petersen@oracle.com>
  AuthorDate: Wed Mar 2 00:35:47 2022 -0500

      scsi: core: Query VPD size before getting full page

When this commit is reverted on top of v5.19, the problem disappears.
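
To make the behavior change concrete: judging by its title, the commit makes the SCSI core first read only the 4-byte VPD page header to learn the page length, and only then fetch the full page. Below is a simplified user-space model of that two-step sequence in C. It is not the kernel code. The callback type, the function names, and the fake device are all made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VPD_HEADER_SIZE 4

/* Hypothetical transport hook: fill buf with up to len bytes of VPD
 * page `page`. Returns 0 on success, -1 on failure. */
typedef int (*vpd_inquiry_fn)(uint8_t page, uint8_t *buf, size_t len);

/* Step 1: read only the header. Bytes 2-3 hold the big-endian page
 * length, which does not include the 4-byte header itself. */
static int vpd_query_size(vpd_inquiry_fn inquiry, uint8_t page)
{
	uint8_t hdr[VPD_HEADER_SIZE];

	assert(inquiry != NULL);
	if (inquiry(page, hdr, sizeof(hdr)) != 0 || hdr[1] != page)
		return -1;
	return ((hdr[2] << 8) | hdr[3]) + VPD_HEADER_SIZE;
}

/* Step 2: fetch the page, capped at the caller's buffer size.
 * Returns the number of bytes fetched, or -1 on failure. */
static int vpd_fetch(vpd_inquiry_fn inquiry, uint8_t page,
		     uint8_t *buf, size_t buflen)
{
	int size = vpd_query_size(inquiry, page);

	if (size < 0)
		return -1;
	if ((size_t)size > buflen)
		size = (int)buflen;
	return inquiry(page, buf, (size_t)size) == 0 ? size : -1;
}

/* Fake device for demonstration: page 0xb0 with a 0x3c-byte payload. */
static int fake_inquiry(uint8_t page, uint8_t *buf, size_t len)
{
	uint8_t full[64] = { 0x00, 0xb0, 0x00, 0x3c };

	if (page != 0xb0)
		return -1;
	memcpy(buf, full, len < sizeof(full) ? len : sizeof(full));
	return 0;
}
```

In this model, every page fetch issues two INQUIRY requests instead of one, and the first has a very short allocation length. Whether that is what upsets the controller here is not established in this message.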

Thanks,



* Re: megaraid_sas: multiple FALLOC_FL_ZERO_RANGE causes timeouts and resets on MegaRAID 9560-8i 4GB since 5.19
  2024-02-15 15:18 ` Vitaly Chikunov
@ 2024-02-15 18:42   ` Martin K. Petersen
  2024-02-16 10:08     ` Vitaly Chikunov
  0 siblings, 1 reply; 8+ messages in thread
From: Martin K. Petersen @ 2024-02-15 18:42 UTC (permalink / raw)
  To: Vitaly Chikunov
  Cc: megaraidlinux.pdl, linux-scsi, Kashyap Desai, Sumit Saxena,
	Shivasharan S, Chandrakanth patil


Vitaly,

I'd appreciate it if you could test the patch Bart referred you to.

> I was told that a bisect identified the following commit as the cause of
> the problem described above:

Also, I would like to understand why things fail as a result of the
original change.

Could you please send me the output of:

# sg_readcap -l /dev/sdc
# sg_vpd -l /dev/sdc
# sg_vpd -p 0xb0 /dev/sdc
# sg_vpd -p 0xb1 /dev/sdc
# sg_vpd -p 0xb2 /dev/sdc

Thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering


* Re: megaraid_sas: multiple FALLOC_FL_ZERO_RANGE causes timeouts and resets on MegaRAID 9560-8i 4GB since 5.19
  2024-02-15 18:42   ` Martin K. Petersen
@ 2024-02-16 10:08     ` Vitaly Chikunov
  2025-03-09 13:55       ` Samy Lahfa
  0 siblings, 1 reply; 8+ messages in thread
From: Vitaly Chikunov @ 2024-02-16 10:08 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: megaraidlinux.pdl, linux-scsi, Kashyap Desai, Sumit Saxena,
	Shivasharan S, Chandrakanth patil

Martin,

On Thu, Feb 15, 2024 at 01:42:35PM -0500, Martin K. Petersen wrote:
> 
> I'd appreciate it if you could test the patch Bart referred you to.
> 
> > I was told that a bisect identified the following commit as the cause of
> > the problem described above:
> 
> Also, I would like to understand why things fail as a result of the
> original change.
> 
> Could you please send me the output of:
> 
> # sg_readcap -l /dev/sdc
> # sg_vpd -l /dev/sdc
> # sg_vpd -p 0xb0 /dev/sdc
> # sg_vpd -p 0xb1 /dev/sdc
> # sg_vpd -p 0xb2 /dev/sdc

Here it is:

  # sg_readcap -l /dev/sdc
  Read Capacity results:
     Protection: prot_en=0, p_type=0, p_i_exponent=0
     Logical block provisioning: lbpme=0, lbprz=0
     Last LBA=3907029167 (0xe8e088af), Number of logical blocks=3907029168
     Logical block length=512 bytes
     Logical blocks per physical block exponent=0
     Lowest aligned LBA=0
  Hence:
     Device size: 2000398934016 bytes, 1907729.1 MiB, 2000.40 GB, 2.00 TB

  # sg_vpd -l /dev/sdc
  Supported VPD pages VPD page:
     [PQual=0  Peripheral device type: disk]
    0x00  Supported VPD pages [sv]
    0x80  Unit serial number [sn]
    0x83  Device identification [di]
    0x87  Mode page policy [mpp]
    0x89  ATA information (SAT) [ai]
    0x8a  Power condition [pc]
    0xb0  Block limits (SBC) [bl]
    0xb1  Block device characteristics (SBC) [bdc]
    0xb2  Logical block provisioning (SBC) [lbpv]
    0xb6  Zoned block device characteristics [zbdch]

  # sg_vpd -p 0xb0 /dev/sdc
  Block limits VPD page (SBC):
    Write same non-zero (WSNZ): 1
    Maximum compare and write length: 0 blocks [Command not implemented]
    Optimal transfer length granularity: 0 blocks [not reported]
    Maximum transfer length: 0 blocks [not reported]
    Optimal transfer length: 0 blocks [not reported]
    Maximum prefetch transfer length: 0 blocks [ignored]
    Maximum unmap LBA count: 0 [Unmap command not implemented]
    Maximum unmap block descriptor count: 0 [Unmap command not implemented]
    Optimal unmap granularity: 0 blocks [not reported]
    Unmap granularity alignment valid: false
    Unmap granularity alignment: 0 [invalid]
    Maximum write same length: 0xffff blocks
    Maximum atomic transfer length: 0 blocks [not reported]
    Atomic alignment: 0 [unaligned atomic writes permitted]
    Atomic transfer length granularity: 0 [no granularity requirement
    Maximum atomic transfer length with atomic boundary: 0 blocks [not reported]
    Maximum atomic boundary size: 0 blocks [can only write atomic 1 block]

  # sg_vpd -p 0xb1 /dev/sdc
  Block device characteristics VPD page (SBC):
    Nominal rotation rate: 7200 rpm
    Product type: Not specified
    WABEREQ=0
    WACEREQ=0
    Nominal form factor: 3.5 inch
    MACT=0
    ZONED=0
    RBWZ=0
    BOCS=0
    FUAB=0
    VBULS=0
    DEPOPULATION_TIME=0 (seconds)

  # sg_vpd -p 0xb2 /dev/sdc
  Logical block provisioning VPD page (SBC):
    Unmap command supported (LBPU): 0
    Write same (16) with unmap bit supported (LBPWS): 0
    Write same (10) with unmap bit supported (LBPWS10): 0
    Logical block provisioning read zeros (LBPRZ): 0
    Anchored LBAs supported (ANC_SUP): 0
    Threshold exponent: 0 [threshold sets not supported]
    Descriptor present (DP): 0
    Minimum percentage: 0 [not reported]
    Provisioning type: 0 (not known or fully provisioned)
    Threshold percentage: 0 [percentages not supported]

As for the patch, it will be tested later today.

Thanks,

> 
> Thanks!
> 
> -- 
> Martin K. Petersen	Oracle Linux Engineering


* Re: megaraid_sas: multiple FALLOC_FL_ZERO_RANGE causes timeouts and resets on MegaRAID 9560-8i 4GB since 5.19
  2024-02-16 10:08     ` Vitaly Chikunov
@ 2025-03-09 13:55       ` Samy Lahfa
  2025-03-11  2:24         ` Martin K. Petersen
  0 siblings, 1 reply; 8+ messages in thread
From: Samy Lahfa @ 2025-03-09 13:55 UTC (permalink / raw)
  To: vt
  Cc: chandrakanth.patil, kashyap.desai, linux-scsi, martin.petersen,
	megaraidlinux.pdl, shivasharan.srikanteshwara, sumit.saxena,
	Samy Lahfa

Hello all,

I have just run into this issue (controller resets and timeouts; running mkfs.ext4 or mkfs.xfs reproduces it) and bisected it to the same commit:

commit c92a6b5d63359dd6d2ce6ea88ecd8e31dd769f6b
Author:     Martin K. Petersen <martin.petersen@oracle.com>
AuthorDate: Wed Mar 2 00:35:47 2022 -0500

  scsi: core: Query VPD size before getting full page

Reverting this commit and building a 6.6.80 kernel with the revert applied solved the issue.

Could you please share the patch that Bart referenced (I wasn't able to find it) so I can give it a try?

I am also sharing the output that was requested, in case it helps.

sg_readcap -l /dev/sdb:
Read Capacity results:
   Protection: prot_en=0, p_type=0, p_i_exponent=0
   Logical block provisioning: lbpme=1, lbprz=1
   Last LBA=390721967 (0x1749f1af), Number of logical blocks=390721968
   Logical block length=512 bytes
   Logical blocks per physical block exponent=3 [so physical block length=4096 bytes]
   Lowest aligned LBA=0
Hence:
   Device size: 200049647616 bytes, 190782.2 MiB, 200.05 GB

sg_vpd -l /dev/sdb:
Supported VPD pages VPD page:
   [PQual=0  Peripheral device type: disk]
  0x00  Supported VPD pages [sv]
  0x80  Unit serial number [sn]
  0x83  Device identification [di]
  0x87  Mode page policy [mpp]
  0x89  ATA information (SAT) [ai]
  0xb0  Block limits (SBC) [bl]
  0xb1  Block device characteristics (SBC) [bdc]
  0xb2  Logical block provisioning (SBC) [lbpv]

sg_vpd -p 0xb0 /dev/sdb:
Block limits VPD page (SBC):
  Write same non-zero (WSNZ): 0
  Maximum compare and write length: 0 blocks [Command not implemented]
  Optimal transfer length granularity: 0 blocks [not reported]
  Maximum transfer length: 0 blocks [not reported]
  Optimal transfer length: 0 blocks [not reported]
  Maximum prefetch transfer length: 0 blocks [ignored]
  Maximum unmap LBA count: 262143
  Maximum unmap block descriptor count: 32
  Optimal unmap granularity: 1 blocks
  Unmap granularity alignment valid: false
  Unmap granularity alignment: 0 [invalid]
  Maximum write same length: 0 blocks [not reported]
  Maximum atomic transfer length: 0 blocks [not reported]
  Atomic alignment: 0 [unaligned atomic writes permitted]
  Atomic transfer length granularity: 0 [no granularity requirement
  Maximum atomic transfer length with atomic boundary: 0 blocks [not reported]
  Maximum atomic boundary size: 0 blocks [can only write atomic 1 block]

sg_vpd -p 0xb1 /dev/sdb:
Block device characteristics VPD page (SBC):
  Non-rotating medium (e.g. solid state)
  Product type: Not specified
  WABEREQ=0
  WACEREQ=0
  Nominal form factor: 2.5 inch
  MACT=0
  ZONED=0
  RBWZ=0
  BOCS=0
  FUAB=0
  VBULS=0
  DEPOPULATION_TIME=0 (seconds)

sg_vpd -p 0xb2 /dev/sdb:
Logical block provisioning VPD page (SBC):
  Unmap command supported (LBPU): 1
  Write same (16) with unmap bit supported (LBPWS): 1
  Write same (10) with unmap bit supported (LBPWS10): 0
  Logical block provisioning read zeros (LBPRZ): 0
  Anchored LBAs supported (ANC_SUP): 1
  Threshold exponent: 0 [threshold sets not supported]
  Descriptor present (DP): 0
  Minimum percentage: 0 [not reported]
  Provisioning type: 0 (not known or fully provisioned)
  Threshold percentage: 0 [percentages not supported]

Thanks for any help!

Kind regards,
Lahfa Samy


* Re: megaraid_sas: multiple FALLOC_FL_ZERO_RANGE causes timeouts and resets on MegaRAID 9560-8i 4GB since 5.19
  2025-03-09 13:55       ` Samy Lahfa
@ 2025-03-11  2:24         ` Martin K. Petersen
  2025-03-15 17:12           ` Ryan Lahfa
  0 siblings, 1 reply; 8+ messages in thread
From: Martin K. Petersen @ 2025-03-11  2:24 UTC (permalink / raw)
  To: Samy Lahfa
  Cc: vt, chandrakanth.patil, kashyap.desai, linux-scsi,
	martin.petersen, megaraidlinux.pdl, shivasharan.srikanteshwara,
	sumit.saxena


Samy,

> I have just run into this issue (controller resets and timeouts;
> running mkfs.ext4 or mkfs.xfs reproduces it) and bisected it to the
> same commit:

Can you try the patch below?

Thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering

diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 88acefbf9aea..8ced3f1fd427 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -2106,6 +2106,9 @@ static int megasas_device_configure(struct scsi_device *sdev,
 	/* This sdev property may change post OCR */
 	megasas_set_dynamic_target_properties(sdev, lim, is_target_prop);
 
+	if (!MEGASAS_IS_LOGICAL(sdev))
+		sdev->no_vpd_size = 1;
+
 	mutex_unlock(&instance->reset_mutex);
 
 	return 0;
@@ -3665,8 +3668,10 @@ megasas_complete_cmd(struct megasas_instance *instance, struct megasas_cmd *cmd,
 
 		case MFI_STAT_SCSI_IO_FAILED:
 		case MFI_STAT_LD_INIT_IN_PROGRESS:
-			cmd->scmd->result =
-			    (DID_ERROR << 16) | hdr->scsi_status;
+			if (hdr->scsi_status == 0xf0)
+				cmd->scmd->result = (DID_ERROR << 16) | SAM_STAT_CHECK_CONDITION;
+			else
+				cmd->scmd->result = (DID_ERROR << 16) | hdr->scsi_status;
 			break;
 
 		case MFI_STAT_SCSI_DONE_WITH_ERROR:
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index 6c1fb8149553..7d28b5b23751 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -2043,7 +2043,10 @@ map_cmd_status(struct fusion_context *fusion,
 
 	case MFI_STAT_SCSI_IO_FAILED:
 	case MFI_STAT_LD_INIT_IN_PROGRESS:
-		scmd->result = (DID_ERROR << 16) | ext_status;
+		if (ext_status == 0xf0)
+			scmd->result = (DID_ERROR << 16) | SAM_STAT_CHECK_CONDITION;
+		else
+			scmd->result = (DID_ERROR << 16) | ext_status;
 		break;
 
 	case MFI_STAT_SCSI_DONE_WITH_ERROR:
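
The two status-mapping hunks above apply the same rule, which can be shown in isolation. A stand-alone C sketch follows. The constants mirror the kernel's DID_ERROR (0x07) and SAM_STAT_CHECK_CONDITION (0x02) values. The function name and the FW_STATUS_F0 macro are made up for illustration, and the meaning of the firmware's 0xf0 status is not stated in this thread:

```c
#include <assert.h>
#include <stdint.h>

#define DID_ERROR                 0x07  /* host byte, bits 16-23 of result */
#define SAM_STAT_CHECK_CONDITION  0x02  /* status byte, bits 0-7 of result */
#define FW_STATUS_F0              0xf0  /* hypothetical name for the raw value */

static_assert(SAM_STAT_CHECK_CONDITION <= 0xff, "status must fit in one byte");

/* Build scmd->result for the MFI_STAT_SCSI_IO_FAILED and
 * MFI_STAT_LD_INIT_IN_PROGRESS cases: the firmware status is passed
 * through unchanged, except that 0xf0 is reported as CHECK CONDITION. */
static uint32_t map_io_failed(uint8_t fw_scsi_status)
{
	uint8_t status = fw_scsi_status;

	if (status == FW_STATUS_F0)
		status = SAM_STAT_CHECK_CONDITION;

	return ((uint32_t)DID_ERROR << 16) | status;
}
```

With this mapping, map_io_failed(0xf0) and map_io_failed(0x02) both yield 0x00070002, while any other status byte is passed through in the low 8 bits as before.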


* Re: megaraid_sas: multiple FALLOC_FL_ZERO_RANGE causes timeouts and resets on MegaRAID 9560-8i 4GB since 5.19
  2025-03-11  2:24         ` Martin K. Petersen
@ 2025-03-15 17:12           ` Ryan Lahfa
  2025-03-18  1:38             ` Martin K. Petersen
  0 siblings, 1 reply; 8+ messages in thread
From: Ryan Lahfa @ 2025-03-15 17:12 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Samy Lahfa, vt, chandrakanth.patil, kashyap.desai, linux-scsi,
	megaraidlinux.pdl, shivasharan.srikanteshwara, sumit.saxena

Martin,

On Mon, Mar 10, 2025 at 10:24:53PM -0400, Martin K. Petersen wrote:
> 
> Samy,

(I work with Samy.)

> 
> > I have just run into this issue (controller resets and timeouts;
> > running mkfs.ext4 or mkfs.xfs reproduces it) and bisected it to the
> > same commit:
> 
> Can you try the patch below?

Tested; this works. I no longer see the error we reported earlier.

Tested-by: Ryan Lahfa <ryan@lahfa.xyz>

> 
> Thanks!

Thank you for the fast answer.

Kind regards,
-- 
Ryan Lahfa


* Re: megaraid_sas: multiple FALLOC_FL_ZERO_RANGE causes timeouts and resets on MegaRAID 9560-8i 4GB since 5.19
  2025-03-15 17:12           ` Ryan Lahfa
@ 2025-03-18  1:38             ` Martin K. Petersen
  0 siblings, 0 replies; 8+ messages in thread
From: Martin K. Petersen @ 2025-03-18  1:38 UTC (permalink / raw)
  To: Ryan Lahfa
  Cc: Martin K. Petersen, Samy Lahfa, vt, chandrakanth.patil,
	kashyap.desai, linux-scsi, megaraidlinux.pdl,
	shivasharan.srikanteshwara, sumit.saxena


Ryan,

> Tested; this works. I no longer see the error we reported earlier.
>
> Tested-by: Ryan Lahfa <ryan@lahfa.xyz>

Thanks for testing!

Broadcom: This fix never made it into a driver update. Please submit a
formal patch. Thank you!

-- 
Martin K. Petersen	Oracle Linux Engineering
