From: Niklas Cassel <cassel@kernel.org>
To: Damien Le Moal <dlemoal@kernel.org>
Cc: linux-ide@vger.kernel.org, Igor Pylypiv <ipylypiv@google.com>,
Xingui Yang <yangxingui@huawei.com>,
John Garry <john.g.garry@oracle.com>
Subject: Re: [PATCH v3 2/2] ata: libata-scsi: avoid passthrough command starvation
Date: Wed, 31 Dec 2025 12:17:42 +0100 [thread overview]
Message-ID: <aVUGVo3Oym2IcpPS@ryzen> (raw)
In-Reply-To: <20251220002140.148854-3-dlemoal@kernel.org>
On Sat, Dec 20, 2025 at 09:21:40AM +0900, Damien Le Moal wrote:
> When a non-NCQ passthrough command is issued while NCQ commands are
Same as cover letter:
s/non-NCQ passthrough command/non-NCQ command/
> being executed, ata_scsi_defer() indicates to ata_scsi_translate() that
> ata_qc_issue() should not be called for the passthrough command, and
s/passthrough//
> instead returns SCSI_MLQUEUE_XXX_BUSY to defer the command execution.
> This command deferring is correct and as mandated by the ACS
> specifications since NCQ and non-NCQ commands cannot be mixed.
>
> However, in the case of a host adapter using multiple submission queues,
> when the target device is under a constant load of NCQ read or write
s/NCQ read or write commands/NCQ commands/
There are other NCQ commands, such as NCQ SEND and NCQ RECEIVE.
E.g. a user could saturate the queue by doing a NCQ encapsulated READ LOG
DMA EXT.
> commands, there are no guarantees that requeueing the non-NCQ command
> will lead to it not being deferred again repeatedly, since other
> submission queues can constantly issue NCQ commands from different CPUs
> ahead of the non-NCQ command. This can lead to very long delays for the
> execution of non-NCQ passthrough commands, and even complete starvation
> in the worst case scenario.
>
> Since the block layer and the SCSI layer do not distinguish between
> queuable (NCQ) and non queueable (non-NCQ) commands, libata-scsi SAT
s/queuable/queueable/
> implementation must ensure forward progress for non-NCQ commands in the
> presence of NCQ command traffic. This is similar to what SAS HBAs with a
> hardware/firmware based implementation do.
>
> Implement such forward progress guarantee by limiting requeueing of
> non-NCQ commands: when a non-NCQ command is received and NCQ commands
> are in-flight, do not force a requeue of the non-NCQ command by
> returning SCSI_MLQUEUE_XXX_BUSY in ata_scsi_translate() and instead
> hold on to the qc using the new deferred_qc field of struct ata_port.
>
> This deferred qc will be issued using the work item deferred_qc_work
> running the function ata_scsi_deferred_qc_work() once all in-flight
> commands complete, which is checked with the port qc_defer() callback
> indicating that no further delay is necessary. This check is done using
> the helper function ata_scsi_schedule_deferred_qc() which is called from
> ata_scsi_qc_complete(). This thus excludes this mechanism from all
> internal non-NCQ commands issued by ATA EH.
>
> When a port deferred_qc is non NULL, that is, the port has a command
> waiting for the device queue to drain, the issuing of all incoming
> commands (both NCQ and non-NCQ) is deferred using the regular busy
> mechanism. This simplifies the code and also avoids potential denial of
> service problems if a user issues too many non-NCQ passthrough commands.
Instead of introducing a workqueue, did you try/consider to do it the same
way that we handle CDL commands that completed without error:
https://github.com/torvalds/linux/blob/v6.19-rc3/drivers/ata/libata-core.c#L4971-L4988
I.e., in qc_defer(), if we try to queue a non-NCQ command while there
are NCQ commands active, set ATA_PFLAG_EH_PENDING so that we do not
trigger fast drain (we wait for the active commands to finish),
and then call ata_qc_schedule_eh() on the non-NCQ qc.
Sure, for passthrough commands specifically, SCSI will not want to
retry the command, because scsi_noretry_cmd() will return true:
https://github.com/torvalds/linux/blob/v6.19-rc3/drivers/scsi/scsi_error.c#L2227
But this could easily be solved by e.g. modifying scsi_noretry_cmd()
to add an additional "case DID_REQUEUE: return false;"
And also set set_host_byte(scmd, DID_REQUEUE); (probably based on a
new ATA_QCFLAG_DEFER flag or simiar), after we have cleared DID_TIME_OUT
using set_host_byte(scmd, DID_OK):
https://github.com/torvalds/linux/blob/v6.19-rc3/drivers/ata/libata-eh.c#L640
Since this works for CDL commands, I don't see why we shouldn't be able to
also send a non-NCQ command (if there are NCQ commands active), via SCSI
EH, so that SCSI itself retries the command, rather than us holding/hiding
a command from the SCSI queue, by using an additional workqueue in libata.
(If libata is just a SCSI LLDD, it somehow feels more logical to make use
of the SCSI queue, rather than to add our own queueing in libata using a
workqueue.)
Kind regards,
Niklas
next prev parent reply other threads:[~2025-12-31 11:17 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-12-20 0:21 [PATCH v3 0/2] Prevent non-NCQ command starvation Damien Le Moal
2025-12-20 0:21 ` [PATCH v3 1/2] ata: libata-scsi: refactor ata_scsi_translate() Damien Le Moal
2025-12-22 21:19 ` Igor Pylypiv
2025-12-30 13:40 ` Niklas Cassel
2025-12-20 0:21 ` [PATCH v3 2/2] ata: libata-scsi: avoid passthrough command starvation Damien Le Moal
2025-12-22 21:28 ` Igor Pylypiv
2025-12-28 2:25 ` Damien Le Moal
2025-12-23 11:17 ` yangxingui
2025-12-31 11:17 ` Niklas Cassel [this message]
2026-01-02 0:47 ` Damien Le Moal
2026-01-02 10:24 ` Niklas Cassel
2026-01-02 11:00 ` Damien Le Moal
2025-12-31 11:39 ` Niklas Cassel
2026-01-02 1:10 ` Damien Le Moal
2026-01-02 9:46 ` Niklas Cassel
2026-01-02 11:01 ` Damien Le Moal
2025-12-31 10:04 ` [PATCH v3 0/2] Prevent non-NCQ " Niklas Cassel
2026-01-02 1:14 ` Damien Le Moal
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aVUGVo3Oym2IcpPS@ryzen \
--to=cassel@kernel.org \
--cc=dlemoal@kernel.org \
--cc=ipylypiv@google.com \
--cc=john.g.garry@oracle.com \
--cc=linux-ide@vger.kernel.org \
--cc=yangxingui@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.