All of lore.kernel.org
 help / color / mirror / Atom feed
From: Damien Le Moal <dlemoal@kernel.org>
To: Guenter Roeck <linux@roeck-us.net>
Cc: linux-ide@vger.kernel.org, Niklas Cassel <cassel@kernel.org>
Subject: Re: [PATCH v2 1/2] ata: libata-eh: correctly handle deferred qc timeouts
Date: Fri, 6 Mar 2026 08:59:34 +0900	[thread overview]
Message-ID: <8011c682-7fc6-4cab-b142-d05fd3b09de0@kernel.org> (raw)
In-Reply-To: <ea6cdec5-cbd6-4387-9995-cbbfda978155@roeck-us.net>

On 3/6/26 02:59, Guenter Roeck wrote:
> Hi,
> 
> On Sat, Feb 21, 2026 at 07:14:38AM +0900, Damien Le Moal wrote:
>> A deferred qc may timeout while waiting for the device queue to drain
>> to be submitted. In such case, since the qc is not active,
>> ata_scsi_cmd_error_handler() ends up calling scsi_eh_finish_cmd(),
>> which frees the qc. But as the port deferred_qc field still references
>> this finished/freed qc, the deferred qc work may eventually attempt to
>> call ata_qc_issue() against this invalid qc, leading to errors such as
>> reported by UBSAN (syzbot run):
>>
>> UBSAN: shift-out-of-bounds in drivers/ata/libata-core.c:5166:24
>> shift exponent 4210818301 is too large for 64-bit type 'long long unsigned int'
>> ...
>> Call Trace:
>>  <TASK>
>>  __dump_stack lib/dump_stack.c:94 [inline]
>>  dump_stack_lvl+0x100/0x190 lib/dump_stack.c:120
>>  ubsan_epilogue+0xa/0x30 lib/ubsan.c:233
>>  __ubsan_handle_shift_out_of_bounds+0x279/0x2a0 lib/ubsan.c:494
>>  ata_qc_issue.cold+0x38/0x9f drivers/ata/libata-core.c:5166
>>  ata_scsi_deferred_qc_work+0x154/0x1f0 drivers/ata/libata-scsi.c:1679
>>  process_one_work+0x9d7/0x1920 kernel/workqueue.c:3275
>>  process_scheduled_works kernel/workqueue.c:3358 [inline]
>>  worker_thread+0x5da/0xe40 kernel/workqueue.c:3439
>>  kthread+0x370/0x450 kernel/kthread.c:467
>>  ret_from_fork+0x754/0xd80 arch/x86/kernel/process.c:158
>>  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
>>  </TASK>
>>
>> Fix this by checking if the qc of a timed out SCSI command is a deferred
>> one, and in such case, clear the port deferred_qc field and finish the
>> SCSI command with DID_TIME_OUT.
>>
>> Reported-by: syzbot+1f77b8ca15336fff21ff@syzkaller.appspotmail.com
>> Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation")
>> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
>> Reviewed-by: Hannes Reinecke <hare@suse.de>
>> Reviewed-by: Igor Pylypiv <ipylypiv@google.com>
>> ---
>>  drivers/ata/libata-eh.c | 22 +++++++++++++++++++---
>>  1 file changed, 19 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
>> index 72a22b6c9682..b373cceb95d2 100644
>> --- a/drivers/ata/libata-eh.c
>> +++ b/drivers/ata/libata-eh.c
>> @@ -640,12 +640,28 @@ void ata_scsi_cmd_error_handler(struct Scsi_Host *host, struct ata_port *ap,
>>  		set_host_byte(scmd, DID_OK);
>>  
>>  		ata_qc_for_each_raw(ap, qc, i) {
>> -			if (qc->flags & ATA_QCFLAG_ACTIVE &&
>> -			    qc->scsicmd == scmd)
>> +			if (qc->scsicmd != scmd)
>> +				continue;
>> +			if ((qc->flags & ATA_QCFLAG_ACTIVE) ||
>> +			    qc == ap->deferred_qc)
>>  				break;
>>  		}
>>  
>> -		if (i < ATA_MAX_QUEUE) {
>> +		if (qc == ap->deferred_qc) {
> 
> An experimental AI code review agent tagged this patch with the following
> comment.

Thanks for that.

>   If the `ata_qc_for_each_raw()` loop finishes without finding a matching `scmd`,
>   `qc` will hold a pointer to the last element examined (`i == ATA_MAX_QUEUE`).
>   If this last element happens to be `ap->deferred_qc`, the condition
>   `qc == ap->deferred_qc` evaluates to true despite the loop not breaking on a
>   match.
> 
>   Could this mistakenly intercept a command that completed normally after a SCSI
>   timeout, returning a timeout error instead of success? Would this also
>   incorrectly clear `ap->deferred_qc`, dropping the deferred command?
>   Should we verify that the loop actually found a match, for instance by checking
>   `if (i < ATA_MAX_QUEUE && qc == ap->deferred_qc)`?

Yeah. Something like this. But that condition cannot actually happen. i ==
ATA_MAX_QUEUE correspond to internal QCs, and these never go through the
deferred path.

> It does seem to be a real problem to me, but I don't know the code well
> enough to be sure. Please take a look and let me know if the problem is
> real. If so, I'll be happy to submit a patch to fix it. If not, please let
> me know what the agent is missing.

See above. That is not a real problem, but it would still be good to check. I
think your change is fine. Car to send a proper patch ?

-- 
Damien Le Moal
Western Digital Research

  parent reply	other threads:[~2026-03-05 23:59 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-20 22:14 [PATCH v2 0/2] ATA port deferred qc fixes Damien Le Moal
2026-02-20 22:14 ` [PATCH v2 1/2] ata: libata-eh: correctly handle deferred qc timeouts Damien Le Moal
2026-02-23 12:09   ` Hannes Reinecke
2026-02-23 17:48   ` Igor Pylypiv
2026-03-05 17:59   ` Guenter Roeck
2026-03-05 23:27     ` Niklas Cassel
2026-03-06  0:11       ` Damien Le Moal
2026-03-06  0:59         ` Damien Le Moal
2026-03-06  8:23           ` Niklas Cassel
2026-03-06  0:14       ` Guenter Roeck
2026-03-06  0:21         ` Damien Le Moal
2026-03-06  0:41           ` Guenter Roeck
2026-03-05 23:59     ` Damien Le Moal [this message]
2026-03-06  0:32       ` Guenter Roeck
2026-03-06  0:50         ` Damien Le Moal
2026-03-06  1:31           ` Guenter Roeck
2026-03-06  8:24           ` Niklas Cassel
2026-02-20 22:14 ` [PATCH v2 2/2] ata: libata-core: fix cancellation of a port deferred qc work Damien Le Moal
2026-02-23 12:09   ` Hannes Reinecke
2026-02-23 17:49   ` Igor Pylypiv
2026-02-24  0:39 ` [PATCH v2 0/2] ATA port deferred qc fixes Damien Le Moal
2026-04-02  9:22 ` Tommy Kelly
2026-04-05  6:45   ` Damien Le Moal
2026-04-12 18:22   ` Niklas Cassel
2026-04-17  4:38     ` Tommy Kelly
2026-04-17 15:19       ` Niklas Cassel
2026-04-22 15:18   ` Niklas Cassel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8011c682-7fc6-4cab-b142-d05fd3b09de0@kernel.org \
    --to=dlemoal@kernel.org \
    --cc=cassel@kernel.org \
    --cc=linux-ide@vger.kernel.org \
    --cc=linux@roeck-us.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.