From: Niklas Cassel <Niklas.Cassel@wdc.com>
To: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: John Garry <john.garry@huawei.com>,
"jejb@linux.ibm.com" <jejb@linux.ibm.com>,
"martin.petersen@oracle.com" <martin.petersen@oracle.com>,
"jinpu.wang@cloud.ionos.com" <jinpu.wang@cloud.ionos.com>,
"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Linuxarm <linuxarm@huawei.com>,
yangxingui <yangxingui@huawei.com>,
yanaijie <yanaijie@huawei.com>
Subject: Re: [PATCH v5 0/7] libsas and drivers: NCQ error handling
Date: Wed, 5 Oct 2022 22:11:59 +0000
Message-ID: <Yz4BLTPkXqyjW4a4@x1-carbon>
In-Reply-To: <5db6a7bc-dfeb-76e1-6899-7041daa934cf@opensource.wdc.com>
On Thu, Oct 06, 2022 at 06:36:05AM +0900, Damien Le Moal wrote:
> On 10/6/22 06:28, Niklas Cassel wrote:
> > On Wed, Oct 05, 2022 at 09:53:52AM +0100, John Garry wrote:
> >> On 04/10/2022 15:04, John Garry wrote:
> >>
> >> Hi Niklas,
> >>
> >> Could you try a change like this on top:
> >>
> >> void sas_ata_device_link_abort(struct domain_device *device, bool force_reset)
> >> {
> >> 	struct ata_port *ap = device->sata_dev.ap;
> >> 	struct ata_link *link = &ap->link;
> >>
> >> +	device->sata_dev.fis[2] = ATA_ERR | ATA_DRDY;
> >> +	device->sata_dev.fis[3] = 0x04;
> >>
> >> 	link->eh_info.err_mask |= AC_ERR_DEV;
> >> 	if (force_reset)
> >> 		link->eh_info.action |= ATA_EH_RESET;
> >> 	ata_link_abort(link);
> >> }
> >> EXPORT_SYMBOL_GPL(sas_ata_device_link_abort);
> >>
> >> I tried it myself and it looked to work ok, except I have a problem with my
> >> arm64 system in that the read log ext times-out and all TF show "device
> >> error", like:
> >
> > Do you know why it fails to read the log?
> > Can you read the NCQ Command Error log using ATA16 passthrough commands?
> >
> > sudo sg_sat_read_gplog -d --log=0x10 /dev/sdc
> >
> > The first byte is the last NCQ tag (in hex) that failed.
>
> libata issues read log as a non-ncq command under EH. So the NCQ error log
> will not help.
Hello Damien,
John explained that he got a timeout from EH when reading the log:
[ 350.281581] ata1: failed to read log page 10h (errno=-5)
[ 350.577181] ata1.00: exception Emask 0x1 SAct 0xffffffff SErr 0x0 action 0x6 frozen
ata_eh_read_log_10h() uses ata_read_log_page(), which will first try to read
the log using READ LOG DMA EXT. If that fails, it will retry using READ LOG EXT.
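
The retry described above can be modelled in a few lines of standalone C. This is only a sketch of the fallback idea, not the libata code: struct log_reader, issue_fn and mock_issue() are hypothetical names made up for illustration, and only the two opcode values come from the ATA command set.

```c
#include <stdbool.h>
#include <stdint.h>

/* ATA opcodes for the two log-read variants. */
#define CMD_READ_LOG_EXT      0x2F
#define CMD_READ_LOG_DMA_EXT  0x47

/* Hypothetical transport hook: returns 0 on success, nonzero on failure. */
typedef unsigned int (*issue_fn)(uint8_t opcode, uint8_t log, void *buf);

struct log_reader {
	issue_fn issue;
	bool dma_log_supported;	/* device advertises READ LOG DMA EXT */
	bool dma_log_broken;	/* set once the DMA variant has failed */
};

/* Try the DMA variant first; on failure, remember that and retry with PIO. */
static unsigned int read_log_page(struct log_reader *r, uint8_t log, void *buf)
{
	if (r->dma_log_supported && !r->dma_log_broken) {
		if (!r->issue(CMD_READ_LOG_DMA_EXT, log, buf))
			return 0;
		r->dma_log_broken = true;	/* never try DMA again */
	}
	return r->issue(CMD_READ_LOG_EXT, log, buf);
}

/* Mock transport for illustration: the DMA variant always fails. */
static int issued_dma, issued_pio;

static unsigned int mock_issue(uint8_t opcode, uint8_t log, void *buf)
{
	(void)log;
	(void)buf;
	if (opcode == CMD_READ_LOG_DMA_EXT) {
		issued_dma++;
		return 1;
	}
	issued_pio++;
	return 0;
}
```

With the mock above, the first read issues one DMA attempt followed by one PIO attempt; later reads go straight to PIO.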
Therefore, to see whether this is a driver-specific bug, I suggested reading
the NCQ Command Error log using ATA16 passthrough commands:
$ sudo sg_sat_read_gplog -d --log=0x10 /dev/sdc
will read the log using READ LOG DMA EXT.
$ sudo sg_sat_read_gplog --log=0x10 /dev/sdc
will read the log using READ LOG EXT.
Neither of these two suggested commands is an NCQ command.
(Neither command is encapsulated in a RECEIVE FPDMA QUEUED,
so I'm not sure what you mean.)
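
For reference, the 512-byte page returned by either command can be decoded roughly as follows. This is a sketch under assumptions: struct ncq_error_info and parse_ncq_error_log() are hypothetical names, and the byte layout (byte 0: NQ flag in bit 7 and NCQ tag in bits 4:0; bytes 2 and 3: status and error; all bytes summing to zero) is my reading of the NCQ Command Error log definition, so it is worth checking against the spec. The sketch rejects a bad checksum outright, which may be stricter than what a given driver does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LOG_PAGE_SIZE 512

struct ncq_error_info {
	bool non_queued;	/* NQ bit: the failed command was not an NCQ command */
	uint8_t tag;		/* NCQ tag of the failed command, valid if !non_queued */
	uint8_t status;		/* status register value at the time of the error */
	uint8_t error;		/* error register value at the time of the error */
};

/* Returns 0 on success, -1 if the bytes of the page do not sum to zero. */
static int parse_ncq_error_log(const uint8_t page[LOG_PAGE_SIZE],
			       struct ncq_error_info *out)
{
	uint8_t sum = 0;

	for (size_t i = 0; i < LOG_PAGE_SIZE; i++)
		sum += page[i];
	if (sum != 0)
		return -1;

	out->non_queued = (page[0] & 0x80) != 0;
	out->tag = page[0] & 0x1f;
	out->status = page[2];
	out->error = page[3];
	return 0;
}
```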
John, I now see that:
[ 350.577181] ata1.00: exception Emask 0x1 SAct 0xffffffff SErr 0x0 action 0x6 frozen
Your port is frozen.
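
The other fields of that line can be decoded the same way. The sketch below is a standalone illustration; the bit values are meant to mirror include/linux/libata.h around v6.0 but are written from memory, so treat them as assumptions, and count_active_tags() and wants_reset() are hypothetical helper names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Error mask bits (assumed to match libata's AC_ERR_* values). */
#define AC_ERR_DEV		(1u << 0)	/* device reported error */
#define AC_ERR_HSM		(1u << 1)	/* host state machine violation */
#define AC_ERR_TIMEOUT		(1u << 2)	/* timeout */

/* EH action bits (assumed to match libata's ATA_EH_* values). */
#define ATA_EH_REVALIDATE	(1u << 0)
#define ATA_EH_SOFTRESET	(1u << 1)
#define ATA_EH_HARDRESET	(1u << 2)
#define ATA_EH_RESET		(ATA_EH_SOFTRESET | ATA_EH_HARDRESET)

/* Number of NCQ tags marked active in an SAct value. */
static unsigned int count_active_tags(uint32_t sact)
{
	unsigned int n = 0;

	while (sact) {
		n += sact & 1u;
		sact >>= 1;
	}
	return n;
}

/* True if the action mask requests both soft and hard reset. */
static bool wants_reset(unsigned int action)
{
	return (action & ATA_EH_RESET) == ATA_EH_RESET;
}
```

Read this way, Emask 0x1 is a device error, SAct 0xffffffff means all 32 tags were outstanding, and action 0x6 is a reset request.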
ata_read_log_page() calls ata_exec_internal(), which calls ata_exec_internal_sg(),
which, if the port is frozen, simply returns an error without sending the
command down to the drive.
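
A minimal model of that early return, to show how a frozen port turns into errno=-5 without any command reaching the drive. The struct and function names are hypothetical stand-ins, not the libata definitions, and the flag values are assumptions.

```c
#include <errno.h>

#define PFLAG_FROZEN	(1u << 2)	/* assumed value of ATA_PFLAG_FROZEN */
#define ERR_SYSTEM	(1u << 6)	/* assumed value of AC_ERR_SYSTEM */

struct model_port {
	unsigned int pflags;
	unsigned int cmds_sent;	/* commands that actually reached the drive */
};

/* Stand-in for the internal command path: refuse to issue while frozen. */
static unsigned int exec_internal(struct model_port *ap)
{
	if (ap->pflags & PFLAG_FROZEN)
		return ERR_SYSTEM;	/* nothing is sent to the drive */
	ap->cmds_sent++;
	return 0;
}

/* Stand-in for the EH log read: any error mask is reported as -EIO. */
static int eh_read_log_10h(struct model_port *ap)
{
	return exec_internal(ap) ? -EIO : 0;
}
```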
I'm not sure why your port is frozen; mine is not.
ata_do_link_abort() calls ata_eh_set_pending() without activating fast drain:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/libata-eh.c?h=v6.0#n989
(The fast drain timer does freeze the port, but it shouldn't be enabled.)
It might be worthwhile to find out who freezes the port in your case.
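
To make the fast drain point concrete, here is a simplified standalone model of the pending/fast-drain logic as I understand it. It is a sketch, not the code at the link above: struct drain_port and its fields are hypothetical, and the flag value is an assumption.

```c
#include <stdbool.h>

#define PFLAG_EH_PENDING	(1u << 0)	/* assumed value of ATA_PFLAG_EH_PENDING */

struct drain_port {
	unsigned int pflags;
	unsigned int in_flight;		/* commands still outstanding */
	bool fastdrain_timer_armed;	/* the timer that may later freeze the port */
};

/* Mark EH as pending; arm the fast drain timer only when asked to. */
static void eh_set_pending(struct drain_port *ap, bool fastdrain)
{
	if (ap->pflags & PFLAG_EH_PENDING)
		return;			/* already scheduled */
	ap->pflags |= PFLAG_EH_PENDING;

	if (!fastdrain || ap->in_flight == 0)
		return;
	ap->fastdrain_timer_armed = true;
}

/* The link abort path requests EH without fast drain. */
static void link_abort(struct drain_port *ap)
{
	eh_set_pending(ap, false);
}
```

In this model, link_abort() never arms the timer, even with 32 commands in flight, so a freeze on that path would have to come from somewhere else.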
Kind regards,
Niklas