* [PATCH 0/2] ATA port deferred qc fixes
@ 2026-02-20 5:00 Damien Le Moal
2026-02-20 5:00 ` [PATCH 1/2] ata: libata-eh: correctly handle deferred qc timeouts Damien Le Moal
2026-02-20 5:00 ` [PATCH 2/2] ata: libata-core: fix cancellation of a port deferred qc work Damien Le Moal
0 siblings, 2 replies; 6+ messages in thread
From: Damien Le Moal @ 2026-02-20 5:00 UTC (permalink / raw)
To: linux-ide, Niklas Cassel
The first patch addresses a use-after-free issue when a deferred qc
times out. The second patch avoids a call to a potentially sleeping
function while a port spinlock is held.
Damien Le Moal (2):
ata: libata-eh: correctly handle deferred qc timeouts
ata: libata-core: fix cancellation of a port deferred qc work
drivers/ata/libata-core.c | 13 +++++++++----
drivers/ata/libata-eh.c | 20 +++++++++++++++++---
2 files changed, 26 insertions(+), 7 deletions(-)
--
2.53.0
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/2] ata: libata-eh: correctly handle deferred qc timeouts
2026-02-20 5:00 [PATCH 0/2] ATA port deferred qc fixes Damien Le Moal
@ 2026-02-20 5:00 ` Damien Le Moal
2026-02-20 16:37 ` Niklas Cassel
2026-02-20 5:00 ` [PATCH 2/2] ata: libata-core: fix cancellation of a port deferred qc work Damien Le Moal
1 sibling, 1 reply; 6+ messages in thread
From: Damien Le Moal @ 2026-02-20 5:00 UTC (permalink / raw)
To: linux-ide, Niklas Cassel
A differed qc may timeout while waiting for the device queue to drain
to be submitted. In such case, since the qc is not active,
ata_scsi_cmd_error_handler() ends up calling scsi_eh_finish_cmd(),
which frees the qc. But as the port deferred_qc field still references
this finished/freed qc, the deferred qc work may eventually attempt to
call ata_qc_issue() against this invalid qc, leading to errors such as
reported by UBSAN (syzbot run):
UBSAN: shift-out-of-bounds in drivers/ata/libata-core.c:5166:24
shift exponent 4210818301 is too large for 64-bit type 'long long unsigned int'
...
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x100/0x190 lib/dump_stack.c:120
ubsan_epilogue+0xa/0x30 lib/ubsan.c:233
__ubsan_handle_shift_out_of_bounds+0x279/0x2a0 lib/ubsan.c:494
ata_qc_issue.cold+0x38/0x9f drivers/ata/libata-core.c:5166
ata_scsi_deferred_qc_work+0x154/0x1f0 drivers/ata/libata-scsi.c:1679
process_one_work+0x9d7/0x1920 kernel/workqueue.c:3275
process_scheduled_works kernel/workqueue.c:3358 [inline]
worker_thread+0x5da/0xe40 kernel/workqueue.c:3439
kthread+0x370/0x450 kernel/kthread.c:467
ret_from_fork+0x754/0xd80 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
Fix this by checking if the qc of a timed out command is a deferred one,
and in such case, clear the port deferred_qc field and finish the scsi
command with DID_TIME_OUT.
Reported-by: syzbot+1f77b8ca15336fff21ff@syzkaller.appspotmail.com
Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation")
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
---
drivers/ata/libata-eh.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 72a22b6c9682..f86085f9b476 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -640,12 +640,26 @@ void ata_scsi_cmd_error_handler(struct Scsi_Host *host, struct ata_port *ap,
set_host_byte(scmd, DID_OK);
ata_qc_for_each_raw(ap, qc, i) {
- if (qc->flags & ATA_QCFLAG_ACTIVE &&
- qc->scsicmd == scmd)
+ if (qc->scsicmd != scmd)
+ continue;
+ if ((qc->flags & ATA_QCFLAG_ACTIVE) ||
+ qc == ap->deferred_qc)
break;
}
- if (i < ATA_MAX_QUEUE) {
+ if (qc == ap->deferred_qc) {
+ /*
+ * This is a deferred command that timedout while
+ * waiting for the command queue to drain. Since the qc
+ * is not active yet, simply signal the timeout by
+ * finishing the SCSI command and clear the deferred qc
+ * to prevent the deferred qc work from issuing this
+ * qc.
+ */
+ ap->deferred_qc = NULL;
+ set_host_byte(scmd, DID_TIME_OUT);
+ scsi_eh_finish_cmd(scmd, &ap->eh_done_q);
+ } else if (i < ATA_MAX_QUEUE) {
/* the scmd has an associated qc */
if (!(qc->flags & ATA_QCFLAG_EH)) {
/* which hasn't failed yet, timeout */
--
2.53.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/2] ata: libata-core: fix cancellation of a port deferred qc work
2026-02-20 5:00 [PATCH 0/2] ATA port deferred qc fixes Damien Le Moal
2026-02-20 5:00 ` [PATCH 1/2] ata: libata-eh: correctly handle deferred qc timeouts Damien Le Moal
@ 2026-02-20 5:00 ` Damien Le Moal
2026-02-20 17:33 ` Igor Pylypiv
1 sibling, 1 reply; 6+ messages in thread
From: Damien Le Moal @ 2026-02-20 5:00 UTC (permalink / raw)
To: linux-ide, Niklas Cassel
cancel_work_sync() is a sleeping function so it cannot be called with
the spin lock of a port being held. Move the call to this function in
ata_port_detach() before locking the port and add a call to
ata_scsi_requeue_deferred_qc() to make sure there are no remaining
deferred qc.
Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation")
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
---
drivers/ata/libata-core.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index beb6984b379a..54afa4df987e 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -6254,9 +6254,18 @@ static void ata_port_detach(struct ata_port *ap)
/* Wait for any ongoing EH */
ata_port_wait_eh(ap);
+ /* Requeue any remaining deferred qc. */
+ ata_scsi_requeue_deferred_qc(ap);
+
mutex_lock(&ap->scsi_scan_mutex);
+
+ /* Make sure the deferred qc work finished. */
+ cancel_work_sync(&ap->deferred_qc_work);
+
spin_lock_irqsave(ap->lock, flags);
+ WARN_ON(ap->deferred_qc);
+
/* Remove scsi devices */
ata_for_each_link(link, ap, HOST_FIRST) {
ata_for_each_dev(dev, link, ALL) {
@@ -6269,10 +6278,6 @@ static void ata_port_detach(struct ata_port *ap)
}
}
- /* Make sure the deferred qc work finished. */
- cancel_work_sync(&ap->deferred_qc_work);
- WARN_ON(ap->deferred_qc);
-
/* Tell EH to disable all devices */
ap->pflags |= ATA_PFLAG_UNLOADING;
ata_port_schedule_eh(ap);
--
2.53.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 1/2] ata: libata-eh: correctly handle deferred qc timeouts
2026-02-20 5:00 ` [PATCH 1/2] ata: libata-eh: correctly handle deferred qc timeouts Damien Le Moal
@ 2026-02-20 16:37 ` Niklas Cassel
0 siblings, 0 replies; 6+ messages in thread
From: Niklas Cassel @ 2026-02-20 16:37 UTC (permalink / raw)
To: Damien Le Moal; +Cc: linux-ide
On Fri, Feb 20, 2026 at 02:00:52PM +0900, Damien Le Moal wrote:
> A differed qc may timeout while waiting for the device queue to drain
Nit: s/differed/deferred/
> to be submitted. In such case, since the qc is not active,
> ata_scsi_cmd_error_handler() ends up calling scsi_eh_finish_cmd(),
> which frees the qc. But as the port deferred_qc field still references
> this finished/freed qc, the deferred qc work may eventually attempt to
> call ata_qc_issue() against this invalid qc, leading to errors such as
> reported by UBSAN (syzbot run):
>
> UBSAN: shift-out-of-bounds in drivers/ata/libata-core.c:5166:24
> shift exponent 4210818301 is too large for 64-bit type 'long long unsigned int'
> ...
> Call Trace:
> <TASK>
> __dump_stack lib/dump_stack.c:94 [inline]
> dump_stack_lvl+0x100/0x190 lib/dump_stack.c:120
> ubsan_epilogue+0xa/0x30 lib/ubsan.c:233
> __ubsan_handle_shift_out_of_bounds+0x279/0x2a0 lib/ubsan.c:494
> ata_qc_issue.cold+0x38/0x9f drivers/ata/libata-core.c:5166
> ata_scsi_deferred_qc_work+0x154/0x1f0 drivers/ata/libata-scsi.c:1679
> process_one_work+0x9d7/0x1920 kernel/workqueue.c:3275
> process_scheduled_works kernel/workqueue.c:3358 [inline]
> worker_thread+0x5da/0xe40 kernel/workqueue.c:3439
> kthread+0x370/0x450 kernel/kthread.c:467
> ret_from_fork+0x754/0xd80 arch/x86/kernel/process.c:158
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> </TASK>
>
> Fix this by checking if the qc of a timed out command is a deferred one,
> and in such case, clear the port deferred_qc field and finish the scsi
> command with DID_TIME_OUT.
>
> Reported-by: syzbot+1f77b8ca15336fff21ff@syzkaller.appspotmail.com
> Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation")
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
> ---
> drivers/ata/libata-eh.c | 20 +++++++++++++++++---
> 1 file changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
> index 72a22b6c9682..f86085f9b476 100644
> --- a/drivers/ata/libata-eh.c
> +++ b/drivers/ata/libata-eh.c
> @@ -640,12 +640,26 @@ void ata_scsi_cmd_error_handler(struct Scsi_Host *host, struct ata_port *ap,
> set_host_byte(scmd, DID_OK);
>
> ata_qc_for_each_raw(ap, qc, i) {
> - if (qc->flags & ATA_QCFLAG_ACTIVE &&
> - qc->scsicmd == scmd)
> + if (qc->scsicmd != scmd)
> + continue;
> + if ((qc->flags & ATA_QCFLAG_ACTIVE) ||
> + qc == ap->deferred_qc)
> break;
> }
>
> - if (i < ATA_MAX_QUEUE) {
> + if (qc == ap->deferred_qc) {
> + /*
> + * This is a deferred command that timedout while
s/timedout/timed out/
> + * waiting for the command queue to drain. Since the qc
> + * is not active yet, simply signal the timeout by
How do we know for sure that the QC is not active yet?
The answer appears to be that we always clear ap->deferred_qc before
issuing the deferred QC, thus ap->deferred_qc will never have flag
ATA_QCFLAG_ACTIVE set.
Perhaps we could somehow make this clearer in the comment.
Perhaps even a WARN_ON(qc->flags & ATA_QCFLAG_ACTIVE) ?
Otherwise, this patch looks good to me.
> + * finishing the SCSI command and clear the deferred qc
> + * to prevent the deferred qc work from issuing this
> + * qc.
> + */
> + ap->deferred_qc = NULL;
> + set_host_byte(scmd, DID_TIME_OUT);
> + scsi_eh_finish_cmd(scmd, &ap->eh_done_q);
> + } else if (i < ATA_MAX_QUEUE) {
> /* the scmd has an associated qc */
> if (!(qc->flags & ATA_QCFLAG_EH)) {
> /* which hasn't failed yet, timeout */
> --
> 2.53.0
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] ata: libata-core: fix cancellation of a port deferred qc work
2026-02-20 5:00 ` [PATCH 2/2] ata: libata-core: fix cancellation of a port deferred qc work Damien Le Moal
@ 2026-02-20 17:33 ` Igor Pylypiv
2026-02-20 21:48 ` Damien Le Moal
0 siblings, 1 reply; 6+ messages in thread
From: Igor Pylypiv @ 2026-02-20 17:33 UTC (permalink / raw)
To: Damien Le Moal; +Cc: linux-ide, Niklas Cassel
On Fri, Feb 20, 2026 at 02:00:53PM +0900, Damien Le Moal wrote:
> cancel_work_sync() is a sleeping function so it cannot be called with
> the spin lock of a port being held. Move the call to this function in
> ata_port_detach() before locking the port and add a call to
> ata_scsi_requeue_deferred_qc() to make sure there are no remaining
> deferred qc.
>
> Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation")
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
> ---
> drivers/ata/libata-core.c | 13 +++++++++----
> 1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> index beb6984b379a..54afa4df987e 100644
> --- a/drivers/ata/libata-core.c
> +++ b/drivers/ata/libata-core.c
> @@ -6254,9 +6254,18 @@ static void ata_port_detach(struct ata_port *ap)
> /* Wait for any ongoing EH */
> ata_port_wait_eh(ap);
>
> + /* Requeue any remaining deferred qc. */
> + ata_scsi_requeue_deferred_qc(ap);
> +
ata_scsi_requeue_deferred_qc() requires the port lock to be held (function
has a lockdep assert).
> mutex_lock(&ap->scsi_scan_mutex);
> +
> + /* Make sure the deferred qc work finished. */
> + cancel_work_sync(&ap->deferred_qc_work);
> +
> spin_lock_irqsave(ap->lock, flags);
>
> + WARN_ON(ap->deferred_qc);
> +
> /* Remove scsi devices */
> ata_for_each_link(link, ap, HOST_FIRST) {
> ata_for_each_dev(dev, link, ALL) {
> @@ -6269,10 +6278,6 @@ static void ata_port_detach(struct ata_port *ap)
> }
> }
>
> - /* Make sure the deferred qc work finished. */
> - cancel_work_sync(&ap->deferred_qc_work);
> - WARN_ON(ap->deferred_qc);
> -
> /* Tell EH to disable all devices */
> ap->pflags |= ATA_PFLAG_UNLOADING;
> ata_port_schedule_eh(ap);
> --
> 2.53.0
>
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] ata: libata-core: fix cancellation of a port deferred qc work
2026-02-20 17:33 ` Igor Pylypiv
@ 2026-02-20 21:48 ` Damien Le Moal
0 siblings, 0 replies; 6+ messages in thread
From: Damien Le Moal @ 2026-02-20 21:48 UTC (permalink / raw)
To: Igor Pylypiv; +Cc: linux-ide, Niklas Cassel
On 2/21/26 02:33, Igor Pylypiv wrote:
> On Fri, Feb 20, 2026 at 02:00:53PM +0900, Damien Le Moal wrote:
>> cancel_work_sync() is a sleeping function so it cannot be called with
>> the spin lock of a port being held. Move the call to this function in
>> ata_port_detach() before locking the port and add a call to
>> ata_scsi_requeue_deferred_qc() to make sure there are no remaining
>> deferred qc.
>>
>> Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation")
>> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
>> ---
>> drivers/ata/libata-core.c | 13 +++++++++----
>> 1 file changed, 9 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
>> index beb6984b379a..54afa4df987e 100644
>> --- a/drivers/ata/libata-core.c
>> +++ b/drivers/ata/libata-core.c
>> @@ -6254,9 +6254,18 @@ static void ata_port_detach(struct ata_port *ap)
>> /* Wait for any ongoing EH */
>> ata_port_wait_eh(ap);
>>
>> + /* Requeue any remaining deferred qc. */
>> + ata_scsi_requeue_deferred_qc(ap);
>> +
>
> ata_scsi_requeue_deferred_qc() requires the port lock to be held (function
> has a lockdep assert).
Doh ! Yes, of course ! I had lockdep disabled on my test rig... Bad idea :)
Let me fix this.
>
>> mutex_lock(&ap->scsi_scan_mutex);
>> +
>> + /* Make sure the deferred qc work finished. */
>> + cancel_work_sync(&ap->deferred_qc_work);
>> +
>> spin_lock_irqsave(ap->lock, flags);
>>
>> + WARN_ON(ap->deferred_qc);
>> +
>> /* Remove scsi devices */
>> ata_for_each_link(link, ap, HOST_FIRST) {
>> ata_for_each_dev(dev, link, ALL) {
>> @@ -6269,10 +6278,6 @@ static void ata_port_detach(struct ata_port *ap)
>> }
>> }
>>
>> - /* Make sure the deferred qc work finished. */
>> - cancel_work_sync(&ap->deferred_qc_work);
>> - WARN_ON(ap->deferred_qc);
>> -
>> /* Tell EH to disable all devices */
>> ap->pflags |= ATA_PFLAG_UNLOADING;
>> ata_port_schedule_eh(ap);
>> --
>> 2.53.0
>>
>>
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2026-02-20 21:48 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-20 5:00 [PATCH 0/2] ATA port deferred qc fixes Damien Le Moal
2026-02-20 5:00 ` [PATCH 1/2] ata: libata-eh: correctly handle deferred qc timeouts Damien Le Moal
2026-02-20 16:37 ` Niklas Cassel
2026-02-20 5:00 ` [PATCH 2/2] ata: libata-core: fix cancellation of a port deferred qc work Damien Le Moal
2026-02-20 17:33 ` Igor Pylypiv
2026-02-20 21:48 ` Damien Le Moal
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox