From: Niklas Cassel <cassel@kernel.org>
To: yangxingui <yangxingui@huawei.com>
Cc: Damien Le Moal <dlemoal@kernel.org>,
Hannes Reinecke <hare@suse.de>, Yu Kuai <yukuai1@huaweicloud.com>,
linux-ide@vger.kernel.org, Wenchao Hao <haowenchao22@gmail.com>,
linux-scsi@vger.kernel.org
Subject: Re: [PATCH 2/2] ata: libata: Issue non-NCQ command via EH when NCQ commands in-flight
Date: Tue, 5 Nov 2024 10:33:51 +0100
Message-ID: <ZynmfyDA9R-lrW71@ryzen>
In-Reply-To: <baceec65-ad60-f8e5-f417-0316c19a0234@huawei.com>
On Mon, Nov 04, 2024 at 12:01:19PM +0800, yangxingui wrote:
(snip)
> After testing, the issues we encountered were resolved.

That is good news :)

>
> But the kernel prints the following log:
>
> [246993.392832] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
> [246993.392839] sas: ata5: end_device-4:0: cmd error handler
> [246993.392855] sas: ata5: end_device-4:0: dev error handler
> [246993.392860] sas: ata6: end_device-4:3: dev error handler
> [246993.392863] sas: ata7: end_device-4:4: dev error handler
> [246993.606491] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
>
> And because the current EH will set the host to the recovery state,
> when we test and execute the smartctl command, it will affect the
> performance of all other disks under the same host.
>
> Perhaps we can continue to improve the EH mechanism that Wenchao
> tried to do before, and implement EH for a single disk. After a
> single disk enters EH, it may not affect other disks under the same
> host.
>
> https://lore.kernel.org/linux-scsi/20230901094127.2010873-1-haowenchao2@huawei.com/

That is bad news :(

Since this series currently stalls all other disks under the same host,
it is not a viable solution to the problem that you have reported
(NCQ commands can starve out non-NCQ commands).

Looking at:
https://lore.kernel.org/linux-scsi/20230901094127.2010873-1-haowenchao2@huawei.com/

it appears that Wenchao's series cannot land until Hannes's EH rework
series:
https://lore.kernel.org/linux-scsi/20231023092837.33786-1-hare@suse.de/
has landed.

Until these two SCSI series are merged, it does not make sense to carry
this increased complexity in libata.

If these two SCSI series ever get merged, then the series in $subject would
be a viable solution to the problem, and the extra complexity would be
justified.

Kind regards,
Niklas