From: Damien Le Moal <dlemoal@kernel.org>
To: Friedrich Weber <f.weber@proxmox.com>,
	Mira Limbeck <m.limbeck@proxmox.com>,
	Niklas Cassel <nks@flawful.org>, Jens Axboe <axboe@kernel.dk>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	"James E.J. Bottomley" <jejb@linux.ibm.com>,
	Kashyap Desai <kashyap.desai@broadcom.com>,
	Sumit Saxena <sumit.saxena@broadcom.com>,
	Shivasharan S <shivasharan.srikanteshwara@broadcom.com>,
	Chandrakanth patil <chandrakanth.patil@broadcom.com>,
	Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>,
	Sreekanth Reddy <sreekanth.reddy@broadcom.com>,
	megaraidlinux.pdl@broadcom.com, mpi3mr-linuxdrv.pdl@broadcom.com
Cc: Bart Van Assche <bvanassche@acm.org>,
	Christoph Hellwig <hch@lst.de>, Hannes Reinecke <hare@suse.de>,
	linux-scsi@vger.kernel.org, linux-ide@vger.kernel.org,
	linux-block@vger.kernel.org,
	Niklas Cassel <niklas.cassel@wdc.com>
Subject: Re: [PATCH v7 08/19] scsi: detect support for command duration limits
Date: Mon, 9 Jun 2025 21:24:36 +0900
Message-ID: <54e0a717-e9fc-4534-bc27-8bc1ee745048@kernel.org>
In-Reply-To: <a927b51b-1b34-4d4f-9447-d8c559127707@proxmox.com>

On 6/3/25 20:28, Friedrich Weber wrote:
>>> They provided controller information via `sas3ircu` and `storcli`:
>>>
>>> sas3ircu:
>>>
>>>   Controller type                         : SAS3008
>>>   BIOS version                            : 8.37.00.00
>>>   Firmware version                        : 16.00.16.00
>>
>> Is this the latest available FW for this HBA ? (see below)
> 
> It seems 16.00.16.00 is even newer than the latest version available on
> the Broadcom website, which is a bit strange -- I only found [1] there
> which has an older 16.00.14.00 (3008_FW_PH16.00.14.00.rar).

So this is an old, now EOL, 9300 series HBA, right ? Or is this a 3008
controller chip integrated on the server motherboard (e.g. a Supermicro HBA) ?
Looking at the Broadcom support page for legacy products, the latest FW version
seems to be 16.00.10.00.

>>> storcli:
>>>
>>> Firmware Package Build = 24.18.0-0021
>>> Firmware Version = 4.670.00-6500
>>> CPLD Version = 26515-00A
>>> Bios Version = 6.34.01.0_4.19.08.00_0x06160200
>>> HII Version = 03.23.06.00
>>> Ctrl-R Version = 5.18-0400
>>> Preboot CLI Version = 01.07-05:#%0000
>>> NVDATA Version = 3.1611.00-0005
>>> Boot Block Version = 3.07.00.00-0003
>>> Driver Name = megaraid_sas
>>> Driver Version = 07.727.03.00-rc1
>>
>> Unfortunately, I do not have any megaraid model so I cannot test/recreate. I
>> only have mpt3sas (9300, 9400 and 9500 series HBAs) and mpi3mr models (9600 HBA
>> series).
> 
> We just realized this is actually the firmware information for a
> different unrelated controller on the same host (a LSI MegaRAID SAS-3
> 3108 using the megaraid_sas driver). But the megaraid_sas one is not
> used in our tests, so please ignore the storcli output we provided.
> Sorry for the confusion.
> 
> The controller we're testing with is the SAS3008 I mentioned initially,
> with firmware version 16.00.16.00 as reported by sas3ircu above.

I do not have this FW... I am also not sure what the HBA itself is. I only have
some Broadcom 9300-XX HBAs that have the 3008 controller.

> FWIW, the user reports they have also seen the same issue with a
> SAS3-9500-8e Tri-mode HBA.

This one had a FW update last month or so, so please check whether the issue
still happens with the latest FW.

>>> And the disk information from `smartctl --xall`
>>>
>>> 20T:
>>>
>>> === START OF INFORMATION SECTION ===
>>> Vendor:               WDC
>>> Product:              WUH722020BL5204

...

>>> Product:              WUH721818AL5204

I have these. I will try to check. But again, I seriously doubt this has
anything to do with the drives since these do not support CDL, nor do the HBAs
you listed. None of them support CDL, so when scsi_report_opcode() is called to
check for CDL, we should always see the HBA SAT return "CDL not supported".
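
To make the expected "CDL not supported" outcome concrete, here is a rough
userspace sketch of how a REPORT SUPPORTED OPERATION CODES reply can be
decoded for CDL. It is only an illustration: the helper name rsoc_reports_cdl()
is made up, and the assumed layout (one_command format, SUPPORT in bits 2:0 of
byte 1, CDLP in bits 4:3 of byte 1) is my reading of SPC-6, not a copy of the
kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): decide whether a REPORT
 * SUPPORTED OPERATION CODES one_command reply advertises CDL support.
 * Assumed layout: byte 1 bits 2:0 = SUPPORT, byte 1 bits 4:3 = CDLP.
 * The caller must pass a buffer of at least 2 bytes.
 */
static bool rsoc_reports_cdl(const uint8_t *buf)
{
	uint8_t support = buf[1] & 0x07;
	uint8_t cdlp = (buf[1] & 0x18) >> 3;

	/* 011b: the command is supported in conformance with the standard. */
	if (support != 0x03)
		return false;

	/* CDLP 01b or 10b selects a CDL mode page; 00b means no CDL. */
	return cdlp == 0x01 || cdlp == 0x02;
}
```

With drives and HBAs that do not support CDL, a well-behaved SAT should
produce a reply for which this check is false (CDLP of zero), or fail the
command cleanly.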


>> I do not think that the drives are relevant for this issue. How the HBA react
>> to a command error from the drive resulting from the HBA command translation
>> likely is the issue.
> 
> I see, but it is certainly strange that 18T vs 20T drives do seem to
> make a difference (hotplug works with 18T and doesn't work with 20T).

Probably a timing difference since these drives are not the same generation.
They have different timing on scan.

>>> If you need any additional information, please let us know!
>>
>> Adding the Broadcom folks to this thread, since as suspected, this seems to be
>> an HBA issue. I strongly suspect that it relates to a recent very similar issue
>> I have seen with the mpi3mr driver and a 9600 Broadcom HBA: any hotplug of a
>> drive would completely crash the HBA and a full power cycle was needed to
>> recover. A simple reboot would not be sufficient. I think the latest HBA FW
>> version fixes that problem.
>>
>> Broadcom team,
>>
>> Any comment ?

Broadcom ? Would you care to comment ?

At this point, I have no idea what is going on. My hunch is that it is the HBA
SAT misbehaving. But that is only a hunch. To prove it, we would likely need a
bus trace and have Broadcom look at HBA logs (which can be extracted using
storcli). All of this likely means involving the technical support of the
vendors.


-- 
Damien Le Moal
Western Digital Research
