* [PATCH 0/2] ATA port deferred qc fixes @ 2026-02-20 5:00 Damien Le Moal 2026-02-20 5:00 ` [PATCH 1/2] ata: libata-eh: correctly handle deferred qc timeouts Damien Le Moal 2026-02-20 5:00 ` [PATCH 2/2] ata: libata-core: fix cancellation of a port deferred qc work Damien Le Moal 0 siblings, 2 replies; 6+ messages in thread From: Damien Le Moal @ 2026-02-20 5:00 UTC (permalink / raw) To: linux-ide, Niklas Cassel The first patch addresses a use-after-free issue when a deferred qc times out. The second patch avoids a call to a potentially sleeping function while a port spinlock is held. Damien Le Moal (2): ata: libata-eh: correctly handle deferred qc timeouts ata: libata-core: fix cancellation of a port deferred qc work drivers/ata/libata-core.c | 13 +++++++++---- drivers/ata/libata-eh.c | 20 +++++++++++++++++--- 2 files changed, 26 insertions(+), 7 deletions(-) -- 2.53.0 ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/2] ata: libata-eh: correctly handle deferred qc timeouts 2026-02-20 5:00 [PATCH 0/2] ATA port deferred qc fixes Damien Le Moal @ 2026-02-20 5:00 ` Damien Le Moal 2026-02-20 16:37 ` Niklas Cassel 2026-02-20 5:00 ` [PATCH 2/2] ata: libata-core: fix cancellation of a port deferred qc work Damien Le Moal 1 sibling, 1 reply; 6+ messages in thread From: Damien Le Moal @ 2026-02-20 5:00 UTC (permalink / raw) To: linux-ide, Niklas Cassel A differed qc may timeout while waiting for the device queue to drain to be submitted. In such case, since the qc is not active, ata_scsi_cmd_error_handler() ends up calling scsi_eh_finish_cmd(), which frees the qc. But as the port deferred_qc field still references this finished/freed qc, the deferred qc work may eventually attempt to call ata_qc_issue() against this invalid qc, leading to errors such as reported by UBSAN (syzbot run): UBSAN: shift-out-of-bounds in drivers/ata/libata-core.c:5166:24 shift exponent 4210818301 is too large for 64-bit type 'long long unsigned int' ... Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x100/0x190 lib/dump_stack.c:120 ubsan_epilogue+0xa/0x30 lib/ubsan.c:233 __ubsan_handle_shift_out_of_bounds+0x279/0x2a0 lib/ubsan.c:494 ata_qc_issue.cold+0x38/0x9f drivers/ata/libata-core.c:5166 ata_scsi_deferred_qc_work+0x154/0x1f0 drivers/ata/libata-scsi.c:1679 process_one_work+0x9d7/0x1920 kernel/workqueue.c:3275 process_scheduled_works kernel/workqueue.c:3358 [inline] worker_thread+0x5da/0xe40 kernel/workqueue.c:3439 kthread+0x370/0x450 kernel/kthread.c:467 ret_from_fork+0x754/0xd80 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 </TASK> Fix this by checking if the qc of a timed out command is a deferred one, and in such case, clear the port deferred_qc field and finish the scsi command with DID_TIME_OUT. Reported-by: syzbot+1f77b8ca15336fff21ff@syzkaller.appspotmail.com Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> --- drivers/ata/libata-eh.c | 20 +++++++++++++++++--- 1 file changed, 17 insertions(+), 3 deletions(-) diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c index 72a22b6c9682..f86085f9b476 100644 --- a/drivers/ata/libata-eh.c +++ b/drivers/ata/libata-eh.c @@ -640,12 +640,26 @@ void ata_scsi_cmd_error_handler(struct Scsi_Host *host, struct ata_port *ap, set_host_byte(scmd, DID_OK); ata_qc_for_each_raw(ap, qc, i) { - if (qc->flags & ATA_QCFLAG_ACTIVE && - qc->scsicmd == scmd) + if (qc->scsicmd != scmd) + continue; + if ((qc->flags & ATA_QCFLAG_ACTIVE) || + qc == ap->deferred_qc) break; } - if (i < ATA_MAX_QUEUE) { + if (qc == ap->deferred_qc) { + /* + * This is a deferred command that timedout while + * waiting for the command queue to drain. Since the qc + * is not active yet, simply signal the timeout by + * finishing the SCSI command and clear the deferred qc + * to prevent the deferred qc work from issuing this + * qc. + */ + ap->deferred_qc = NULL; + set_host_byte(scmd, DID_TIME_OUT); + scsi_eh_finish_cmd(scmd, &ap->eh_done_q); + } else if (i < ATA_MAX_QUEUE) { /* the scmd has an associated qc */ if (!(qc->flags & ATA_QCFLAG_EH)) { /* which hasn't failed yet, timeout */ -- 2.53.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 1/2] ata: libata-eh: correctly handle deferred qc timeouts 2026-02-20 5:00 ` [PATCH 1/2] ata: libata-eh: correctly handle deferred qc timeouts Damien Le Moal @ 2026-02-20 16:37 ` Niklas Cassel 0 siblings, 0 replies; 6+ messages in thread From: Niklas Cassel @ 2026-02-20 16:37 UTC (permalink / raw) To: Damien Le Moal; +Cc: linux-ide On Fri, Feb 20, 2026 at 02:00:52PM +0900, Damien Le Moal wrote: > A differed qc may timeout while waiting for the device queue to drain Nit: s/differed/deferred/ > to be submitted. In such case, since the qc is not active, > ata_scsi_cmd_error_handler() ends up calling scsi_eh_finish_cmd(), > which frees the qc. But as the port deferred_qc field still references > this finished/freed qc, the deferred qc work may eventually attempt to > call ata_qc_issue() against this invalid qc, leading to errors such as > reported by UBSAN (syzbot run): > > UBSAN: shift-out-of-bounds in drivers/ata/libata-core.c:5166:24 > shift exponent 4210818301 is too large for 64-bit type 'long long unsigned int' > ... > Call Trace: > <TASK> > __dump_stack lib/dump_stack.c:94 [inline] > dump_stack_lvl+0x100/0x190 lib/dump_stack.c:120 > ubsan_epilogue+0xa/0x30 lib/ubsan.c:233 > __ubsan_handle_shift_out_of_bounds+0x279/0x2a0 lib/ubsan.c:494 > ata_qc_issue.cold+0x38/0x9f drivers/ata/libata-core.c:5166 > ata_scsi_deferred_qc_work+0x154/0x1f0 drivers/ata/libata-scsi.c:1679 > process_one_work+0x9d7/0x1920 kernel/workqueue.c:3275 > process_scheduled_works kernel/workqueue.c:3358 [inline] > worker_thread+0x5da/0xe40 kernel/workqueue.c:3439 > kthread+0x370/0x450 kernel/kthread.c:467 > ret_from_fork+0x754/0xd80 arch/x86/kernel/process.c:158 > ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 > </TASK> > > Fix this by checking if the qc of a timed out command is a deferred one, > and in such case, clear the port deferred_qc field and finish the scsi > command with DID_TIME_OUT. > > Reported-by: syzbot+1f77b8ca15336fff21ff@syzkaller.appspotmail.com > Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation") > Signed-off-by: Damien Le Moal <dlemoal@kernel.org> > --- > drivers/ata/libata-eh.c | 20 +++++++++++++++++--- > 1 file changed, 17 insertions(+), 3 deletions(-) > > diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c > index 72a22b6c9682..f86085f9b476 100644 > --- a/drivers/ata/libata-eh.c > +++ b/drivers/ata/libata-eh.c > @@ -640,12 +640,26 @@ void ata_scsi_cmd_error_handler(struct Scsi_Host *host, struct ata_port *ap, > set_host_byte(scmd, DID_OK); > > ata_qc_for_each_raw(ap, qc, i) { > - if (qc->flags & ATA_QCFLAG_ACTIVE && > - qc->scsicmd == scmd) > + if (qc->scsicmd != scmd) > + continue; > + if ((qc->flags & ATA_QCFLAG_ACTIVE) || > + qc == ap->deferred_qc) > break; > } > > - if (i < ATA_MAX_QUEUE) { > + if (qc == ap->deferred_qc) { > + /* > + * This is a deferred command that timedout while s/timedout/timed out/ > + * waiting for the command queue to drain. Since the qc > + * is not active yet, simply signal the timeout by How do we know for sure that the QC is not active yet? The answer appears to be that we always clear ap->deferred_qc before issuing the deferred QC, thus ap->deferred_qc will never have flag ATA_QCFLAG_ACTIVE set. Perhaps we could somehow make this clearer in the comment. Perhaps even a WARN_ON(qc->flags & ATA_QCFLAG_ACTIVE) ? Otherwise, this patch looks good to me. > + * finishing the SCSI command and clear the deferred qc > + * to prevent the deferred qc work from issuing this > + * qc. > + */ > + ap->deferred_qc = NULL; > + set_host_byte(scmd, DID_TIME_OUT); > + scsi_eh_finish_cmd(scmd, &ap->eh_done_q); > + } else if (i < ATA_MAX_QUEUE) { > /* the scmd has an associated qc */ > if (!(qc->flags & ATA_QCFLAG_EH)) { > /* which hasn't failed yet, timeout */ > -- > 2.53.0 > ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 2/2] ata: libata-core: fix cancellation of a port deferred qc work 2026-02-20 5:00 [PATCH 0/2] ATA port deferred qc fixes Damien Le Moal 2026-02-20 5:00 ` [PATCH 1/2] ata: libata-eh: correctly handle deferred qc timeouts Damien Le Moal @ 2026-02-20 5:00 ` Damien Le Moal 2026-02-20 17:33 ` Igor Pylypiv 1 sibling, 1 reply; 6+ messages in thread From: Damien Le Moal @ 2026-02-20 5:00 UTC (permalink / raw) To: linux-ide, Niklas Cassel cancel_work_sync() is a sleeping function so it cannot be called with the spin lock of a port being held. Move the call to this function in ata_port_detach() before locking the port and add a call to ata_scsi_requeue_deferred_qc() to make sure there are no remaining deferred qc. Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> --- drivers/ata/libata-core.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c index beb6984b379a..54afa4df987e 100644 --- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -6254,9 +6254,18 @@ static void ata_port_detach(struct ata_port *ap) /* Wait for any ongoing EH */ ata_port_wait_eh(ap); + /* Requeue any remaining deferred qc. */ + ata_scsi_requeue_deferred_qc(ap); + mutex_lock(&ap->scsi_scan_mutex); + + /* Make sure the deferred qc work finished. */ + cancel_work_sync(&ap->deferred_qc_work); + spin_lock_irqsave(ap->lock, flags); + WARN_ON(ap->deferred_qc); + /* Remove scsi devices */ ata_for_each_link(link, ap, HOST_FIRST) { ata_for_each_dev(dev, link, ALL) { @@ -6269,10 +6278,6 @@ static void ata_port_detach(struct ata_port *ap) } } - /* Make sure the deferred qc work finished. */ - cancel_work_sync(&ap->deferred_qc_work); - WARN_ON(ap->deferred_qc); - /* Tell EH to disable all devices */ ap->pflags |= ATA_PFLAG_UNLOADING; ata_port_schedule_eh(ap); -- 2.53.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] ata: libata-core: fix cancellation of a port deferred qc work 2026-02-20 5:00 ` [PATCH 2/2] ata: libata-core: fix cancellation of a port deferred qc work Damien Le Moal @ 2026-02-20 17:33 ` Igor Pylypiv 2026-02-20 21:48 ` Damien Le Moal 0 siblings, 1 reply; 6+ messages in thread From: Igor Pylypiv @ 2026-02-20 17:33 UTC (permalink / raw) To: Damien Le Moal; +Cc: linux-ide, Niklas Cassel On Fri, Feb 20, 2026 at 02:00:53PM +0900, Damien Le Moal wrote: > cancel_work_sync() is a sleeping function so it cannot be called with > the spin lock of a port being held. Move the call to this function in > ata_port_detach() before locking the port and add a call to > ata_scsi_requeue_deferred_qc() to make sure there are no remaining > deferred qc. > > Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation") > Signed-off-by: Damien Le Moal <dlemoal@kernel.org> > --- > drivers/ata/libata-core.c | 13 +++++++++---- > 1 file changed, 9 insertions(+), 4 deletions(-) > > diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c > index beb6984b379a..54afa4df987e 100644 > --- a/drivers/ata/libata-core.c > +++ b/drivers/ata/libata-core.c > @@ -6254,9 +6254,18 @@ static void ata_port_detach(struct ata_port *ap) > /* Wait for any ongoing EH */ > ata_port_wait_eh(ap); > > + /* Requeue any remaining deferred qc. */ > + ata_scsi_requeue_deferred_qc(ap); > + ata_scsi_requeue_deferred_qc() requires the port lock to be held (function has a lockdep assert). > mutex_lock(&ap->scsi_scan_mutex); > + > + /* Make sure the deferred qc work finished. */ > + cancel_work_sync(&ap->deferred_qc_work); > + > spin_lock_irqsave(ap->lock, flags); > > + WARN_ON(ap->deferred_qc); > + > /* Remove scsi devices */ > ata_for_each_link(link, ap, HOST_FIRST) { > ata_for_each_dev(dev, link, ALL) { > @@ -6269,10 +6278,6 @@ static void ata_port_detach(struct ata_port *ap) > } > } > > - /* Make sure the deferred qc work finished. */ > - cancel_work_sync(&ap->deferred_qc_work); > - WARN_ON(ap->deferred_qc); > - > /* Tell EH to disable all devices */ > ap->pflags |= ATA_PFLAG_UNLOADING; > ata_port_schedule_eh(ap); > -- > 2.53.0 > > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] ata: libata-core: fix cancellation of a port deferred qc work 2026-02-20 17:33 ` Igor Pylypiv @ 2026-02-20 21:48 ` Damien Le Moal 0 siblings, 0 replies; 6+ messages in thread From: Damien Le Moal @ 2026-02-20 21:48 UTC (permalink / raw) To: Igor Pylypiv; +Cc: linux-ide, Niklas Cassel On 2/21/26 02:33, Igor Pylypiv wrote: > On Fri, Feb 20, 2026 at 02:00:53PM +0900, Damien Le Moal wrote: >> cancel_work_sync() is a sleeping function so it cannot be called with >> the spin lock of a port being held. Move the call to this function in >> ata_port_detach() before locking the port and add a call to >> ata_scsi_requeue_deferred_qc() to make sure there are no remaining >> deferred qc. >> >> Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation") >> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> >> --- >> drivers/ata/libata-core.c | 13 +++++++++---- >> 1 file changed, 9 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c >> index beb6984b379a..54afa4df987e 100644 >> --- a/drivers/ata/libata-core.c >> +++ b/drivers/ata/libata-core.c >> @@ -6254,9 +6254,18 @@ static void ata_port_detach(struct ata_port *ap) >> /* Wait for any ongoing EH */ >> ata_port_wait_eh(ap); >> >> + /* Requeue any remaining deferred qc. */ >> + ata_scsi_requeue_deferred_qc(ap); >> + > > ata_scsi_requeue_deferred_qc() requires the port lock to be held (function > has a lockdep assert). Doh ! Yes, of course ! I had lockdep disabled on my test rig... Bad idea :) Let me fix this. > >> mutex_lock(&ap->scsi_scan_mutex); >> + >> + /* Make sure the deferred qc work finished. */ >> + cancel_work_sync(&ap->deferred_qc_work); >> + >> spin_lock_irqsave(ap->lock, flags); >> >> + WARN_ON(ap->deferred_qc); >> + >> /* Remove scsi devices */ >> ata_for_each_link(link, ap, HOST_FIRST) { >> ata_for_each_dev(dev, link, ALL) { >> @@ -6269,10 +6278,6 @@ static void ata_port_detach(struct ata_port *ap) >> } >> } >> >> - /* Make sure the deferred qc work finished. */ >> - cancel_work_sync(&ap->deferred_qc_work); >> - WARN_ON(ap->deferred_qc); >> - >> /* Tell EH to disable all devices */ >> ap->pflags |= ATA_PFLAG_UNLOADING; >> ata_port_schedule_eh(ap); >> -- >> 2.53.0 >> >> -- Damien Le Moal Western Digital Research ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2026-02-20 21:48 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-02-20 5:00 [PATCH 0/2] ATA port deferred qc fixes Damien Le Moal 2026-02-20 5:00 ` [PATCH 1/2] ata: libata-eh: correctly handle deferred qc timeouts Damien Le Moal 2026-02-20 16:37 ` Niklas Cassel 2026-02-20 5:00 ` [PATCH 2/2] ata: libata-core: fix cancellation of a port deferred qc work Damien Le Moal 2026-02-20 17:33 ` Igor Pylypiv 2026-02-20 21:48 ` Damien Le Moal
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox