Re: [PATCH v3 2/2] ata: libata-scsi: avoid passthrough command starvation

public inbox for linux-ide@vger.kernel.org
 help / color / mirror / Atom feed

From: Niklas Cassel <cassel@kernel.org>
To: Damien Le Moal <dlemoal@kernel.org>
Cc: linux-ide@vger.kernel.org, Igor Pylypiv <ipylypiv@google.com>,
	Xingui Yang <yangxingui@huawei.com>,
	John Garry <john.g.garry@oracle.com>
Subject: Re: [PATCH v3 2/2] ata: libata-scsi: avoid passthrough command starvation
Date: Wed, 31 Dec 2025 12:17:42 +0100	[thread overview]
Message-ID: <aVUGVo3Oym2IcpPS@ryzen> (raw)
In-Reply-To: <20251220002140.148854-3-dlemoal@kernel.org>

On Sat, Dec 20, 2025 at 09:21:40AM +0900, Damien Le Moal wrote:
> When a non-NCQ passthrough command is issued while NCQ commands are

Same as cover letter:
s/non-NCQ passthrough command/non-NCQ command/


> being executed, ata_scsi_defer() indicates to ata_scsi_translate() that
> ata_qc_issue() should not be called for the passthrough command, and

s/passthrough//


> instead returns SCSI_MLQUEUE_XXX_BUSY to defer the command execution.
> This command deferring is correct and as mandated by the ACS
> specifications since NCQ and non-NCQ commands cannot be mixed.
> 
> However, in the case of a host adapter using multiple submission queues,
> when the target device is under a constant load of NCQ read or write

s/NCQ read or write commands/NCQ commands/

There are other NCQ commands, such as NCQ SEND and NCQ RECEIVE.
E.g. a user could saturate the queue by doing a NCQ encapsulated READ LOG
DMA EXT.


> commands, there are no guarantees that requeueing the non-NCQ command
> will lead to it not being deferred again repeatedly, since other
> submission queues can constantly issue NCQ commands from different CPUs
> ahead of the non-NCQ command. This can lead to very long delays for the
> execution of non-NCQ passthrough commands, and even complete starvation
> in the worst case scenario.
> 
> Since the block layer and the SCSI layer do not distinguish between
> queuable (NCQ) and non queueable (non-NCQ) commands, libata-scsi SAT

s/queuable/queueable/


> implementation must ensure forward progress for non-NCQ commands in the
> presence of NCQ command traffic. This is similar to what SAS HBAs with a
> hardware/firmware based implementation do.
> 
> Implement such forward progress guarantee by limiting requeueing of
> non-NCQ commands: when a non-NCQ command is received and NCQ commands
> are in-flight, do not force a requeue of the non-NCQ command by
> returning SCSI_MLQUEUE_XXX_BUSY in ata_scsi_translate() and instead
> hold on to the qc using the new deferred_qc field of struct ata_port.
> 
> This deferred qc will be issued using the work item deferred_qc_work
> running the function ata_scsi_deferred_qc_work() once all in-flight
> commands complete, which is checked with the port qc_defer() callback
> indicating that no further delay is necessary. This check is done using
> the helper function ata_scsi_schedule_deferred_qc() which is called from
> ata_scsi_qc_complete(). This thus excludes this mechanism from all
> internal non-NCQ commands issued by ATA EH.
> 
> When a port deferred_qc is non NULL, that is, the port has a command
> waiting for the device queue to drain, the issuing of all incoming
> commands (both NCQ and non-NCQ) is deferred using the regular busy
> mechanism. This simplifies the code and also avoids potential denial of
> service problems if a user issues too many non-NCQ passthrough commands.

Instead of introducing a workqueue, did you try/consider to do it the same
way that we handle CDL commands that completed without error:
https://github.com/torvalds/linux/blob/v6.19-rc3/drivers/ata/libata-core.c#L4971-L4988

I.e., in qc_defer(), if we try to queue a non-NCQ command while there
are NCQ commands active, set ATA_PFLAG_EH_PENDING so that we do not
trigger fast drain (we wait for the active commands to finish),
and then call ata_qc_schedule_eh() on the non-NCQ qc.

Sure, for passthrough commands specifically, SCSI will not want to
retry the command, because scsi_noretry_cmd() will return true:
https://github.com/torvalds/linux/blob/v6.19-rc3/drivers/scsi/scsi_error.c#L2227

But this could easily be solved by e.g. modifying scsi_noretry_cmd()
to add an additional "case DID_REQUEUE: return false;"

And also set set_host_byte(scmd, DID_REQUEUE); (probably based on a
new ATA_QCFLAG_DEFER flag or simiar), after we have cleared DID_TIME_OUT
using set_host_byte(scmd, DID_OK):
https://github.com/torvalds/linux/blob/v6.19-rc3/drivers/ata/libata-eh.c#L640

Since this works for CDL commands, I don't see why we shouldn't be able to
also send a non-NCQ command (if there are NCQ commands active), via SCSI
EH, so that SCSI itself retries the command, rather than us holding/hiding
a command from the SCSI queue, by using an additional workqueue in libata.

(If libata is just a SCSI LLDD, it somehow feels more logical to make use
of the SCSI queue, rather than to add our own queueing in libata using a
workqueue.)


Kind regards,
Niklas

next prev parent reply	other threads:[~2025-12-31 11:17 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-20  0:21 [PATCH v3 0/2] Prevent non-NCQ command starvation Damien Le Moal
2025-12-20  0:21 ` [PATCH v3 1/2] ata: libata-scsi: refactor ata_scsi_translate() Damien Le Moal
2025-12-22 21:19   ` Igor Pylypiv
2025-12-30 13:40   ` Niklas Cassel
2025-12-20  0:21 ` [PATCH v3 2/2] ata: libata-scsi: avoid passthrough command starvation Damien Le Moal
2025-12-22 21:28   ` Igor Pylypiv
2025-12-28  2:25     ` Damien Le Moal
2025-12-23 11:17   ` yangxingui
2025-12-31 11:17   ` Niklas Cassel [this message]
2026-01-02  0:47     ` Damien Le Moal
2026-01-02 10:24       ` Niklas Cassel
2026-01-02 11:00         ` Damien Le Moal
2025-12-31 11:39   ` Niklas Cassel
2026-01-02  1:10     ` Damien Le Moal
2026-01-02  9:46       ` Niklas Cassel
2026-01-02 11:01         ` Damien Le Moal
2025-12-31 10:04 ` [PATCH v3 0/2] Prevent non-NCQ " Niklas Cassel
2026-01-02  1:14   ` Damien Le Moal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aVUGVo3Oym2IcpPS@ryzen \
    --to=cassel@kernel.org \
    --cc=dlemoal@kernel.org \
    --cc=ipylypiv@google.com \
    --cc=john.g.garry@oracle.com \
    --cc=linux-ide@vger.kernel.org \
    --cc=yangxingui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox