From: Damien Le Moal <damien.lemoal@opensource.wdc.com>
To: "Peter Fröhlich" <peter.hans.froehlich@gmail.com>,
"Hannes Reinecke" <hare@suse.de>
Cc: "linux-ide@vger.kernel.org" <linux-ide@vger.kernel.org>
Subject: Re: libata-scsi: ata_to_sense_error handling status 0x40
Date: Wed, 31 Aug 2022 10:40:17 +0900
Message-ID: <fb5b1dda-fa31-077c-f075-c0cffdc689f7@opensource.wdc.com>
In-Reply-To: <CAHXXO6Gj1Tn6C=_CZ2eB5+V0-51Lt=g6PMnazwym_nnXsFNMpg@mail.gmail.com>
On 2022/08/30 16:02, Peter Fröhlich wrote:
> On Tue, Aug 30, 2022 at 1:26 AM Damien Le Moal <Damien.LeMoal@wdc.com> wrote:
>> On Mon, 2022-08-29 at 08:04 +0200, Peter Fröhlich wrote:
>>> That's the sense_table, I was referring to the stat_table. That table
>>> is consulted when we fail to convert via the sense_table.
>> ...
>> So looking at the right code again, this is all very strange. E.g. the
>> ACS specs define bit 5 of the status field as the "device fault" bit,
>> but the code looks at 0x08, so bit 3. For write command, the definition
>> is:
>>
>> STATUS
>> Bit   Description
>> 7:6   Transport Dependent – See 6.2.11
>> 5     DEVICE FAULT bit – See 6.2.6
>> 4     N/A
>> 3     Transport Dependent – See 6.2.11
>> 2     N/A
>> 1     SENSE DATA AVAILABLE bit – See 6.2.9
>> 0     ERROR bit – See 6.2.8
>>
>> And the code is:
>>
>> static const unsigned char stat_table[][4] = {
>>         /* Must be first because BUSY means no other bits valid */
>>         {0x80, ABORTED_COMMAND, 0x47, 0x00},
>>         // Busy, fake parity for now
>>         {0x40, ILLEGAL_REQUEST, 0x21, 0x04},
>>         // Device ready, unaligned write command
>>         {0x20, HARDWARE_ERROR, 0x44, 0x00},
>>         // Device fault, internal target failure
>>         {0x08, ABORTED_COMMAND, 0x47, 0x00},
>>         // Timed out in xfer, fake parity for now
>>         {0x04, RECOVERED_ERROR, 0x11, 0x00},
>>         // Recovered ECC error Medium error, recovered
>>         {0xFF, 0xFF, 0xFF, 0xFF}, // END mark
>> };
>>
>> So this does not match at all. Something wrong here, or, the "status"
>> field being observed here is not the one I am thinking of. Checking
>> AHCI & SATA-IO specs, I do not see anything matching there either.
>
> Thank you for confirming that this section *is* confusing. I was down
> the same rabbit-hole checking "status" in whatever ATA spec I could
> get my hands on, and it didn't help. Specifically for "WRITE DMA"
> where I usually see the error, it seems that bit 6 has no other
> meaning than "transport dependent" which for SATA means (I believe)
> "drive ready" as it's always been. But that 0x40 is *not* an
> "unaligned write" whatever else it may be. My suspicion is that maybe
> it went in by accident since it's also in a "whitespace" commit. On
> the other hand, it has an explicit comment. I wasn't going to bother
> the original author before, but maybe now I should?
+Hannes
Except for bit 0x20 (device fault), the other bits do not match anything
sensible either. So I wonder which spec this was written against. Hannes? It is
a 7-year-old patch... I am sure your memory is very fresh about this one :)
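To make the mismatch concrete, here is a minimal user-space sketch of how I
read the stat_table lookup: the first entry whose mask bit is set in the status
value wins. The stat_lookup() helper and the standalone defines are assumptions
for illustration, not the actual libata code; the sense key and status bit
values are the usual ones from scsi_proto.h and ata.h.

```c
#include <assert.h>
#include <stddef.h>

/* SCSI sense keys (values as in the kernel's scsi_proto.h) */
#define RECOVERED_ERROR 0x01
#define HARDWARE_ERROR  0x04
#define ILLEGAL_REQUEST 0x05
#define ABORTED_COMMAND 0x0b

/* ATA status bits (values as in the kernel's ata.h) */
#define ATA_BUSY 0x80
#define ATA_DRDY 0x40
#define ATA_DF   0x20
#define ATA_ERR  0x01

/* Same contents as the table quoted above: mask, sense key, ASC, ASCQ */
static const unsigned char stat_table[][4] = {
	{0x80, ABORTED_COMMAND, 0x47, 0x00},
	{0x40, ILLEGAL_REQUEST, 0x21, 0x04},
	{0x20, HARDWARE_ERROR,  0x44, 0x00},
	{0x08, ABORTED_COMMAND, 0x47, 0x00},
	{0x04, RECOVERED_ERROR, 0x11, 0x00},
	{0xFF, 0xFF, 0xFF, 0xFF}, /* END mark */
};

/*
 * Hypothetical helper: return the first row whose mask overlaps the
 * status value, or NULL if no row matches.
 */
static const unsigned char *stat_lookup(unsigned char status)
{
	int i;

	/* The table comment requires the BUSY entry to come first. */
	assert(stat_table[0][0] == ATA_BUSY);

	for (i = 0; stat_table[i][0] != 0xFF; i++) {
		if (stat_table[i][0] & status)
			return stat_table[i];
	}
	return NULL;
}
```

With this sketch, a status of 0x41 (DRDY | ERR) resolves to the 0x40 entry,
i.e. ILLEGAL_REQUEST with ASC/ASCQ 0x21/0x04, and so does 0x61 (DRDY | DF |
ERR), because the 0x40 entry is listed before the 0x20 one.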
>>> Which is why I am pretty sure that the "unaligned write" message is
>>> spurious since I am writing to a plain old SSD. It's going to be hard
>>> for a userspace program to generate a write that is not properly
>>> aligned for the SSD.
>>
>> Unless your SSD is really buggy and throws strange errors, which is
>> always a possibility. Do you have a good reproducer of the problem ?
>
> Not really, sadly. For me it happens with SSDs behind a Marvell SATA
> controller but it doesn't happen when the same SSD goes behind a
> fancier SAS controller. This is what led me into the ATA/SCSI layer as
> the possible culprit because on the SAS boxes that layer is not used.
Yes, with a SAS HBA that has SAT implemented in FW, the HBA FW will do the
conversion to sense data for failed commands. No way of knowing how that is done
there.
> BTW there's another "strange" effect that sometimes seems to lose the
> LBA flag on the ATA taskfile struct resulting in an obscure error
> message about failed CHS addressing. In that case I suspect an
> initialization gone wrong or maybe a race condition somewhere, but
> it's been a real pain to track down further. If I ever get a better
> handle on how to repro this stuff, I certainly will share.
Yes, that type of error generally means something went wrong during scanning or
revalidation, e.g. access to a log page failing. That is a fairly common problem
on many drives (e.g. drives that advertise support for READ LOG DMA EXT when
that command does not in fact work). Your drive may need some quirks to get a
reliable scan.
Have you checked whether your drive already has an entry in
ata_device_blacklist (in libata-core.c)?
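For reference, a rough user-space sketch of how such a quirk table is matched:
each entry is a model glob, an optional firmware revision glob and a set of
flags, and the first matching entry supplies the flags. Everything named
sketch_* or SKETCH_* here is hypothetical, the "EXAMPLE SSD*" model string is
made up, and POSIX fnmatch() stands in for the kernel's own glob matching. The
real table and flag definitions are in libata-core.c and include/linux/libata.h.

```c
#include <fnmatch.h>
#include <stddef.h>

/* Placeholder flag value for this sketch only, not the kernel's definition. */
#define SKETCH_HORKAGE_NO_DMA_LOG (1UL << 0)

struct sketch_blacklist_entry {
	const char *model_num;  /* glob pattern for the model string */
	const char *model_rev;  /* glob pattern for the firmware rev, NULL = any */
	unsigned long horkage;  /* quirk flags to apply on a match */
};

static const struct sketch_blacklist_entry sketch_blacklist[] = {
	/* Hypothetical drive whose READ LOG DMA EXT does not work */
	{ "EXAMPLE SSD*", NULL, SKETCH_HORKAGE_NO_DMA_LOG },
	{ NULL, NULL, 0 }       /* end of table */
};

/* Return the quirk flags of the first matching entry, or 0 if none match. */
static unsigned long sketch_dev_horkage(const char *model, const char *rev)
{
	const struct sketch_blacklist_entry *e;

	for (e = sketch_blacklist; e->model_num; e++) {
		if (fnmatch(e->model_num, model, 0) != 0)
			continue;
		if (e->model_rev == NULL || fnmatch(e->model_rev, rev, 0) == 0)
			return e->horkage;
	}
	return 0;
}
```

If your drive's model string (as reported by IDENTIFY DEVICE) matches no entry
in the real table, no quirk is applied and the scan relies on whatever the
drive advertises.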
>
> Cheers,
> Peter
--
Damien Le Moal
Western Digital Research