From: Damien Le Moal <damien.lemoal@opensource.wdc.com>
To: Niklas Cassel <Niklas.Cassel@wdc.com>,
John Garry <john.g.garry@oracle.com>
Cc: "tj@kernel.org" <tj@kernel.org>, "hare@suse.de" <hare@suse.de>,
"linux-ide@vger.kernel.org" <linux-ide@vger.kernel.org>,
"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
James Bottomley <James.Bottomley@HansenPartnership.com>
Subject: Re: [PATCH v2] ata: libata-core: do not issue non-internal commands once EH is pending
Date: Tue, 8 Nov 2022 12:50:44 +0900 [thread overview]
Message-ID: <976bdcb7-cb97-9332-2bcc-5d98bc41871b@opensource.wdc.com> (raw)
In-Reply-To: <Y2mbX8MvYrF0FHaI@x1-carbon>
+ linux-scsi, Martin and James
On 11/8/22 08:57, Niklas Cassel wrote:
> On Mon, Nov 07, 2022 at 05:01:53PM +0000, John Garry wrote:
>> On 07/11/2022 16:10, Niklas Cassel wrote:
>>> scsi_queue_rq() will check if scsi_host_in_recovery() (state is
>>> SHOST_RECOVERY), and if so, it will _not_ issue a command via:
>>> scsi_dispatch_cmd() -> host->hostt->queuecommand() (ata_scsi_queuecmd())
>>> -> __ata_scsi_queuecmd() -> ata_scsi_translate() -> ata_qc_issue()
>>>
>>> Before commit e494f6a72839 ("[SCSI] improved eh timeout handler"),
>>> when receiving a TFES error IRQ, this was the call chain:
>>> ahci_error_intr() -> ata_port_abort() -> ata_do_link_abort() ->
>>> ata_qc_complete() -> ata_qc_schedule_eh() -> blk_abort_request() ->
>>> blk_rq_timed_out() -> q->rq_timed_out_fn() (scsi_times_out()) ->
>>> scsi_eh_scmd_add() -> scsi_host_set_state(shost, SHOST_RECOVERY)
>>>
>>> Which meant that as soon as the error IRQ was serviced, no additional
>>> commands sent from the block layer (scsi_queue_rq()) would be sent down
>>> to the device. (ATA internal commands originating for ATA EH could of
>>> course still be sent.)
>>>
>>> However, after commit e494f6a72839 ("[SCSI] improved eh timeout handler"),
>>> scsi_times_out() would instead result in a call to scsi_abort_command() ->
>>> queue_delayed_work(). work function: scmd_eh_abort_handler() would call
>>> scsi_eh_scmd_add(), which calls scsi_host_set_state(shost, SHOST_RECOVERY).
>>>
>>> (It was possible to get the old behavior if host->hostt->no_async_abort
>>> was set, but libata never used it, and this option was completely removed
>>> in commit a06586325f37 ("scsi: make asynchronous aborts mandatory"))
>>>
>>> Additionally, later in commit 358f70da49d7 ("blk-mq: make
>>> blk_abort_request() trigger timeout path"), blk_abort_request() was changed
>>> to also call the abort callback asynchronously.
>>>
>>> So now, after the TFES error irq has been serviced, we need to wait for
>>> two different workqueues to run their work, before the SHOST_RECOVERY
>>> state gets set.
>>>
>>> While the ATA specification states that a device should return command
>>> aborted for all commands queued after the device has entered error state,
>>> since ATA only keeps the sense data for the latest command (in non-NCQ
>>> case), we really don't want to send block layer commands to the device
>>> after it has entered error state. (Only ATA EH commands should be sent,
>>> to read the sense data etc.)
>>>
>>> While we could just call scsi_host_set_state(shost, SHOST_RECOVERY) from
>>> ata_qc_schedule_eh() directly, that might be a layering violation.
>>> So instead of doing that, add an additional check against the libata's own
>>> EH flag(s) before calling the qc_defer callback.
>>>
>>> Fixes: e494f6a72839 ("[SCSI] improved eh timeout handler")
>>> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
>>> ---
>>> Changes since v1:
>>> -Implemented review comments from Damien.
>>>
>>> drivers/ata/libata-scsi.c | 20 ++++++++++++++++++++
>>> 1 file changed, 20 insertions(+)
>>>
>>> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
>>> index 4cb914103382..383a208f5f99 100644
>>> --- a/drivers/ata/libata-scsi.c
>>> +++ b/drivers/ata/libata-scsi.c
>>> @@ -1736,6 +1736,26 @@ static int ata_scsi_translate(struct ata_device *dev, struct scsi_cmnd *cmd,
>>> if (xlat_func(qc))
>>> goto early_finish;
>>> + /*
>>> + * scsi_queue_rq() will defer commands when in state SHOST_RECOVERY.
>>> + *
>>> + * When getting an error interrupt, ata_port_abort() will be called,
>>> + * which ends up calling ata_qc_schedule_eh() on all QCs.
>>> + *
>>> + * ata_qc_schedule_eh() will call ata_eh_set_pending() and then call
>>> + * blk_abort_request() on the given QC. blk_abort_request() will
>>> + * asynchronously end up calling scsi_eh_scmd_add(), which will set
>>> + * the state to SHOST_RECOVERY and wake up SCSI EH.
>>> + *
>>> + * In order to avoid requests from being issued to the device from the
>>> + * time ata_eh_set_pending() is called, to the time scsi_eh_scmd_add()
>>> + * sets the state to SHOST_RECOVERY, we defer requests here as well.
>>> + */
>>> + if (ap->pflags & (ATA_PFLAG_EH_PENDING | ATA_PFLAG_EH_IN_PROGRESS)) {
>>> + rc = ATA_DEFER_LINK;
>>> + goto defer;
>>
>> Could we move this check earlier? I mean, we didn't need to have the qc
>> alloc'ed and xlat'ed for this check to be done, right?
>
> Sure, we could put it in e.g. ata_scsi_queuecmd() or __ata_scsi_queuecmd().
>
>
> Or, perhaps it is just time to accept that ATA EH is so interconnected with
> SCSI EH, so the simplest thing is just to do:
>
> --- a/drivers/ata/libata-eh.c
> +++ b/drivers/ata/libata-eh.c
> @@ -913,6 +913,7 @@ void ata_qc_schedule_eh(struct ata_queued_cmd *qc)
>
> qc->flags |= ATA_QCFLAG_FAILED;
> ata_eh_set_pending(ap, 1);
> + scsi_host_set_state(ap->scsi_host, SHOST_RECOVERY);
Why put this in this function ? Nothing in ata_qc_schedule_eh() calls
scsi_schedule_eh() or scsi_eh_scmd_add(), which set that state.
It looks like ata_port->ops->sched_eh(), that is, ata_std_sched_eh(),
would be a better place for this.
>
> /* The following will fail if timeout has already expired.
> * ata_scsi_error() takes care of such scmds on EH entry.
>
>
> Which appears to work just as well as the patch in $subject.
>
> In commit ee7863bc68fa ("[PATCH] SCSI: implement shost->host_eh_scheduled")
> Tejun mentioned that "... libata is planning to depart from SCSI, so, for the
> time being, libata will be using SCSI EH to handle such exceptions."
>
> Now, 16 years later, ATA is still using SCSI EH (see ata_port_schedule_eh() and
> ata_std_sched_eh()) to schedule EH in the case when there are no QCs to abort:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/ata/libata-eh.c?h=v6.1-rc4#n1004
>
> Damien, thoughts?
I like this simpler fix better, but it does introduce a risk of (again)
having problems with ata EH if scsi EH trigger/timing is changed.
Unlikely now, but as this fix proves, not unheard of.
The v2 change is fine too, modulo John's suggestion, which I agree with.
At least it is consistent with the ata internal eh state accounting, so
self contained within libata and somewhat less dependent on scsi state
machine.
Martin ? James ? Thoughts ?
--
Damien Le Moal
Western Digital Research
next parent reply other threads:[~2022-11-08 4:03 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20221107161036.670237-1-niklas.cassel@wdc.com>
[not found] ` <5a4fa5db-c706-5cf2-1145-bf091445d20e@oracle.com>
[not found] ` <Y2mbX8MvYrF0FHaI@x1-carbon>
2022-11-08 3:50 ` Damien Le Moal [this message]
2022-11-08 23:14 ` [PATCH v2] ata: libata-core: do not issue non-internal commands once EH is pending Niklas Cassel
2022-11-09 7:11 ` Hannes Reinecke
2022-11-09 9:28 ` Niklas Cassel
2022-11-09 9:37 ` Hannes Reinecke
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=976bdcb7-cb97-9332-2bcc-5d98bc41871b@opensource.wdc.com \
--to=damien.lemoal@opensource.wdc.com \
--cc=James.Bottomley@HansenPartnership.com \
--cc=Niklas.Cassel@wdc.com \
--cc=hare@suse.de \
--cc=john.g.garry@oracle.com \
--cc=linux-ide@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=martin.petersen@oracle.com \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox