Linux SCSI subsystem development
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Karan Tilak Kumar" <kartilak@cisco.com>
Cc: linux-scsi@vger.kernel.org
Subject: Re: [PATCH v4 10/13] scsi: fnic: Abort timed-out NVMe LS requests
Date: Fri, 12 Jun 2026 22:57:53 +0000	[thread overview]
Message-ID: <20260612225754.3524B1F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260612180918.8554-11-kartilak@cisco.com>

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:

New issues:
- [Critical] Dropping fnic_lock before network transmission and accessing the NVMe LS request on failure leads to Use-After-Free and double completions.

Pre-existing issues:
- [High] Calling timer_delete_sync() while holding fnic->fnic_lock causes an ABBA deadlock with the timer callback.
- [High] Dropping a spinlock inside list_for_each_entry_safe invalidates the 'next' pointer.
- [High] Saving spin_lock_irqsave interrupt state into a shared struct member corrupts interrupt state.
--

commit af4855e82320c32dd587460223ab9977658df024
Author: Karan Tilak Kumar <kartilak@cisco.com>

scsi: fnic: Abort timed-out NVMe LS requests

Add an FDLS helper that sends ABTS frames for outstanding NVMe LS
requests. Use the active LS request OXID when building the ABTS frame,
send it through the FCoE transmit path, and call it from LS timeout
and abort handling.

> diff --git a/drivers/scsi/fnic/fnic_nvme.c b/drivers/scsi/fnic/fnic_nvme.c
> index 014660725373b..9cc8f0baf09cb 100644
> --- a/drivers/scsi/fnic/fnic_nvme.c
> +++ b/drivers/scsi/fnic/fnic_nvme.c

[ ... ]

> @@ -1232,6 +1236,23 @@ void nvfnic_ls_rsp_recv(struct fnic_iport_s *iport,
>  	spin_lock_irqsave(&fnic->fnic_lock, fnic->lock_flags);
>  }

[Severity: High]
This is a pre-existing issue, but does calling timer_delete_sync while
holding fnic->fnic_lock in nvfnic_ls_rsp_recv cause an ABBA deadlock?

If the timer callback unconditionally requests fnic_lock, waiting for the
timer to complete while holding the lock will deadlock:

nvfnic_ls_rsp_recv()
    spin_lock_irqsave(&fnic->fnic_lock, fnic->lock_flags);
    ...
    timer_delete_sync(&nvfnic_ls_req->ls_req_timer);

[ ... ]

> @@ -1274,6 +1294,19 @@ void nvfnic_ls_req_timeout(struct timer_list *t)
>  		FNIC_NVME_DBG(KERN_ERR, fnic,
>  			      "tport: 0x%x lsreq: 0x%x sending abort\n",
>  			      tport->fcid, nvfnic_ls_req->oxid);
> +		nvfnic_ls_req->state = FNIC_LS_REQ_CMD_ABTS_PENDING;
> +		spin_unlock_irqrestore(&fnic->fnic_lock, fnic->lock_flags);
> +
> +		if (fdls_send_ls_req_abts(iport, tport, nvfnic_ls_req->oxid) == 0) {
> +			timeout = FNIC_LS_REQ_TMO_MSECS(ls_req->timeout);
> +			mod_timer(&nvfnic_ls_req->ls_req_timer,
> +				  round_jiffies(jiffies + msecs_to_jiffies(timeout)));
> +			return;
> +		}
> +		FNIC_NVME_DBG(KERN_ERR, fnic,
> +			      "tport: 0x%x lsreq: 0x%x cannot send abort\n",
> +			      tport->fcid, oxid);
> +		spin_lock_irqsave(&fnic->fnic_lock, fnic->lock_flags);

[Severity: High]
This is a pre-existing issue, but does saving the interrupt state into the
shared fnic->lock_flags struct member risk corrupting CPU IRQ states
during spinlock contention?

Using a global variable for IRQ state instead of a local stack variable
can lead to corruption when the lock is contended.

[ ... ]

> @@ -1579,10 +1607,25 @@ void nvfnic_ls_req_abort(struct nvme_fc_local_port *lport,
>  
>  	/* Mark the state and flags */
>  	nvfnic_ls_req->state = FNIC_LS_REQ_CMD_ABTS_PENDING;
> +	oxid = nvfnic_ls_req->oxid;
>  	timeout = FNIC_LS_REQ_TMO_MSECS(lsreq->timeout);
>  	mod_timer(&nvfnic_ls_req->ls_req_timer,
>  		  round_jiffies(jiffies + msecs_to_jiffies(timeout)));
>  	spin_unlock_irqrestore(&fnic->fnic_lock, fnic->lock_flags);
> +
> +	ret = fdls_send_ls_req_abts(iport, tport, oxid);
> +	if (!ret)
> +		return;
> +
> +	timer_delete_sync(&nvfnic_ls_req->ls_req_timer);

[Severity: Critical]
Can dropping fnic_lock before network transmission and then accessing
nvfnic_ls_req after transmission failure lead to a use-after-free and
double completions?

Since memory ownership is transferred to the timer when mod_timer is
called, the timer could fire while the lock is dropped. This could free
nvfnic_ls_req before we reach timer_delete_sync.

[ ... ]

[Severity: High]
This is a pre-existing issue, but does dropping the spinlock inside
list_for_each_entry_safe invalidate the next pointer?

In drivers/scsi/fnic/fnic_nvme.c:nvfnic_cleanup_tport_io():

list_for_each_entry_safe(nvfnic_ls_req, next, &(tport->ls_req_list), list)

If fnic_lock is dropped inside this loop, concurrent modifications to the
list can invalidate the next pointer, leading to a use-after-free.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260612180918.8554-1-kartilak@cisco.com?part=10

  reply	other threads:[~2026-06-12 22:57 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-12 18:09 [PATCH v4 00/13] Introduce functionality for NVMe initiator Karan Tilak Kumar
2026-06-12 18:09 ` [PATCH v4 01/13] scsi: fnic: Make debug logging protocol independent Karan Tilak Kumar
2026-06-12 18:42   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 02/13] scsi: fnic: Use fnic_num for non-SCSI identifiers Karan Tilak Kumar
2026-06-12 18:57   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 03/13] scsi: fnic: Decode firmware role configuration Karan Tilak Kumar
2026-06-12 18:09 ` [PATCH v4 04/13] scsi: fnic: Advertise NVMe initiator service parameters Karan Tilak Kumar
2026-06-12 18:09 ` [PATCH v4 05/13] scsi: fnic: Add FDLS role handling for NVMe initiators Karan Tilak Kumar
2026-06-12 21:42   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 06/13] scsi: fnic: Add the NVMe/FC transport path Karan Tilak Kumar
2026-06-12 21:59   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 07/13] scsi: fnic: Route completions and resets by initiator role Karan Tilak Kumar
2026-06-12 22:15   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 08/13] scsi: fnic: Handle NVMe LS frames in FDLS Karan Tilak Kumar
2026-06-12 22:29   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 09/13] scsi: fnic: Send NVMe LS requests through FDLS Karan Tilak Kumar
2026-06-12 22:45   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 10/13] scsi: fnic: Abort timed-out NVMe LS requests Karan Tilak Kumar
2026-06-12 22:57   ` sashiko-bot [this message]
2026-06-12 18:09 ` [PATCH v4 11/13] scsi: fnic: Track NVMe transport statistics Karan Tilak Kumar
2026-06-12 23:16   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 12/13] scsi: fnic: Expose NVMe transport state in debugfs Karan Tilak Kumar
2026-06-12 23:22   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 13/13] scsi: fnic: Bump up version number Karan Tilak Kumar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260612225754.3524B1F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=kartilak@cisco.com \
    --cc=linux-scsi@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox