From: Damien Le Moal <dlemoal@kernel.org>
To: Friedrich Weber <f.weber@proxmox.com>,
Mira Limbeck <m.limbeck@proxmox.com>,
Niklas Cassel <nks@flawful.org>, Jens Axboe <axboe@kernel.dk>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
"James E.J. Bottomley" <jejb@linux.ibm.com>,
Kashyap Desai <kashyap.desai@broadcom.com>,
Sumit Saxena <sumit.saxena@broadcom.com>,
Shivasharan S <shivasharan.srikanteshwara@broadcom.com>,
Chandrakanth patil <chandrakanth.patil@broadcom.com>,
Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>,
Sreekanth Reddy <sreekanth.reddy@broadcom.com>,
megaraidlinux.pdl@broadcom.com, mpi3mr-linuxdrv.pdl@broadcom.com
Cc: Bart Van Assche <bvanassche@acm.org>,
Christoph Hellwig <hch@lst.de>, Hannes Reinecke <hare@suse.de>,
linux-scsi@vger.kernel.org, linux-ide@vger.kernel.org,
linux-block@vger.kernel.org,
Niklas Cassel <niklas.cassel@wdc.com>
Subject: Re: [PATCH v7 08/19] scsi: detect support for command duration limits
Date: Mon, 9 Jun 2025 21:24:36 +0900
Message-ID: <54e0a717-e9fc-4534-bc27-8bc1ee745048@kernel.org>
In-Reply-To: <a927b51b-1b34-4d4f-9447-d8c559127707@proxmox.com>
On 6/3/25 20:28, Friedrich Weber wrote:
>>> They provided controller information via `sas3ircu` and `storcli`:
>>>
>>> sas3ircu:
>>>
>>> Controller type : SAS3008
>>> BIOS version : 8.37.00.00
>>> Firmware version : 16.00.16.00
>>
>> Is this the latest available FW for this HBA ? (see below)
>
> It seems 16.00.16.00 is even newer than the latest version available on
> the Broadcom website, which is a bit strange -- I only found [1] there
> which has an older 16.00.14.00 (3008_FW_PH16.00.14.00.rar).
So this is an old, now EOL, 9300 series HBA, right ? Or is this a 3008 controller
chip integrated on the server motherboard (e.g. a Supermicro HBA) ?
Looking at the Broadcom support page for legacy products, the latest FW version
seems to be 16.00.10.00.
>>> storcli:
>>>
>>> Firmware Package Build = 24.18.0-0021
>>> Firmware Version = 4.670.00-6500
>>> CPLD Version = 26515-00A
>>> Bios Version = 6.34.01.0_4.19.08.00_0x06160200
>>> HII Version = 03.23.06.00
>>> Ctrl-R Version = 5.18-0400
>>> Preboot CLI Version = 01.07-05:#%0000
>>> NVDATA Version = 3.1611.00-0005
>>> Boot Block Version = 3.07.00.00-0003
>>> Driver Name = megaraid_sas
>>> Driver Version = 07.727.03.00-rc1
>>
>> Unfortunately, I do not have any megaraid model so I cannot test/recreate. I
>> only have mpt3sas (9300, 9400 and 9500 series HBAs) and mpi3mr models (9600 HBA
>> series).
>
> We just realized this is actually the firmware information for a
> different unrelated controller on the same host (a LSI MegaRAID SAS-3
> 3108 using the megaraid_sas driver). But the megaraid_sas one is not
> used in our tests, so please ignore the storcli output we provided.
> Sorry for the confusion.
>
> The controller we're testing with is the SAS3008 I mentioned initially,
> with firmware version 16.00.16.00 as reported by sas3ircu above.
I do not have this FW... I am also not sure what the HBA itself is. I only have
some Broadcom 9300-XX HBAs that have the 3008 controller.
> FWIW, the user reports they have also seen the same issue with a
> SAS3-9500-8e Tri-mode HBA.
This one had a FW update last month or so, so please check that the latest FW is installed.
>>> And the disk information from `smartctl --xall`
>>>
>>> 20T:
>>>
>>> === START OF INFORMATION SECTION ===
>>> Vendor: WDC
>>> Product: WUH722020BL5204
...
>>> Product: WUH721818AL5204
I have these. I will try to check. But again, I seriously doubt this has
anything to do with the drives since these do not support CDL, nor do the HBAs
you listed. Since none of them support CDL, when scsi_report_opcode() is called
to check for CDL, we should always see the HBA SAT return "CDL not supported".
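To make the expected "CDL not supported" outcome concrete, here is a minimal userspace sketch of the kind of check that CDL detection performs on the REPORT SUPPORTED OPERATION CODES reply. It is loosely modeled on the idea behind the kernel's detection and is not the kernel code itself. The helper name cdl_supported_one_command() is hypothetical, and the bit layout (SUPPORT in bits 2:0 and CDLP in bits 4:3 of byte 1 of the one_command format) is my reading of SPC-6 and should be checked against the standard.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether a one_command format reply to
 * REPORT SUPPORTED OPERATION CODES advertises command duration limits.
 *
 * Assumed layout of byte 1 (to be verified against SPC-6):
 *   bits 2:0  SUPPORT  (011b = command supported per the standard)
 *   bits 4:3  CDLP     (01b or 10b = a CDL mode page applies)
 */
static bool cdl_supported_one_command(const uint8_t *buf, size_t len)
{
	uint8_t support, cdlp;

	/* Bytes 0 and 1 are needed to inspect the fields. */
	if (buf == NULL || len < 2)
		return false;

	/* The command itself must be reported as supported. */
	support = buf[1] & 0x07;
	if (support != 0x03)
		return false;

	/* CDLP == 0 means no duration limits page for this command. */
	cdlp = (buf[1] >> 3) & 0x03;
	return cdlp == 0x01 || cdlp == 0x02;
}
```

With drives and HBAs that do not implement CDL, such a check should see either a failed command or a reply with CDLP == 0, and so return false every time.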
>> I do not think that the drives are relevant for this issue. How the HBA react
>> to a command error from the drive resulting from the HBA command translation
>> likely is the issue.
>
> I see, but it is certainly strange that 18T vs 20T drives do seem to
> make a difference (hotplug works with 18T and doesn't work with 20T).
Probably a timing difference since these drives are not the same generation.
They have different timing on scan.
>>> If you need any additional information, please let us know!
>>
>> Adding the Broadcom folks to this thread, since as suspected, this seems to be
>> an HBA issue. I strongly suspect that it relates to a recent very similar issue
>> I have seen with the mpi3mr driver and a 9600 Broadcom HBA: any hotplug of a
>> drive would completely crash the HBA and a full power cycle was needed to
>> recover. A simple reboot would not be sufficient. I think the latest HBA FW
>> version fixes that problem.
>>
>> Broadcom team,
>>
>> Any comment ?
Broadcom ? Would you care to comment ?
At this point, I have no idea what is going on. My hunch is that it is the HBA
SAT misbehaving. But that is only a hunch. To prove it, we would likely need a
bus trace and have Broadcom look at HBA logs (which can be extracted using
storcli). All of this likely means involving the technical support of the vendors.
--
Damien Le Moal
Western Digital Research