From: sashiko-bot@kernel.org
To: "Karan Tilak Kumar" <kartilak@cisco.com>
Cc: linux-scsi@vger.kernel.org
Subject: Re: [PATCH v5 06/13] scsi: fnic: Add the NVMe/FC transport path
Date: Wed, 24 Jun 2026 05:12:54 +0000 [thread overview]
Message-ID: <20260624051254.B19751F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260624044334.3079-7-kartilak@cisco.com>
Thank you for your contribution! Sashiko AI review found 11 potential issue(s) to consider:
New issues:
- [Critical] Premature freeing of I/O tags during ABTS completion leads to tag reuse and data corruption.
- [High] Non-sync timer cancel in nvfnic_ls_rsp_recv leads to Use-After-Free of lsreq and fnic structures.
- [High] Asynchronous deletion timeouts cause Use-After-Free of tport and lport objects.
- [High] Missing cancellation of nvme_io_cmpl_work allows a work item to execute after the fnic structure is freed.
- [High] Leaked LS requests and active timers when remote port unregistration fails.
- [High] A failed abort request combined with a concurrent completion causes an I/O to hang forever.
- [High] Concurrent I/O completion causes Use-After-Free of fcp_req when _terminate_tport_ios drops the lock.
- [High] Accessing lsreq->private after dropping the lock in nvfnic_ls_req_abort causes Use-After-Free.
- [Medium] Uninitialized sgl_mapped state leads to unbalanced dma_unmap_single calls for zero-length NVMe requests.
- [Medium] Permanent reference leak of the PCI device (pdev->dev) during localport registration.
Pre-existing issues:
- [High] Dropping the spinlock inside list_for_each_entry_safe() permits concurrent list corruption.
--
commit 78a3171c3dafe1bc2912707cce47ec94e43860f3
Author: Karan Tilak Kumar <kartilak@cisco.com>
scsi: fnic: Add the NVMe/FC transport path
This patch introduces the NVMe/FC transport path for the fnic driver.
It includes local and remote port registration, I/O submission, tag
management, abort handling, and cleanup logic.
Closes: https://lore.kernel.org/oe-kbuild-all/202605280430.wTYAqI3A-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202605280619.pmobiDWp-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202605280519.Jd4fmgAZ-lkp@intel.com/
> diff --git a/drivers/scsi/fnic/fnic_main.c b/drivers/scsi/fnic/fnic_main.c
> index 0d7828be244da..df8c8ebbc32b9 100644
> --- a/drivers/scsi/fnic/fnic_main.c
> +++ b/drivers/scsi/fnic/fnic_main.c
[ ... ]
> @@ -1215,6 +1286,11 @@ static void fnic_remove(struct pci_dev *pdev)
> */
> fnic_cleanup(fnic);
>
> + if (IS_FNIC_NVME_INITIATOR(fnic)) {
> + sbitmap_free(&fnic->nvfnic_tag_map);
> + fnic_free_ioreq_tables_mq(fnic);
> + }
> +
> spin_lock_irqsave(&fnic_list_lock, flags);
> list_del(&fnic->list);
> spin_unlock_irqrestore(&fnic_list_lock, flags);
[Severity: High]
Does this code leave work items running after the fnic structure is freed?
Looking at fnic_remove(), we call fnic_cleanup(fnic) which drains the hardware
completion queues, potentially finding completions and queueing work items via
queue_work(fnic_cmpl_queue, &fnic->nvme_io_cmpl_work).
Since there is no cancel_work_sync() before kfree(fnic), could a re-queued
work item execute after the memory is freed and cause a use-after-free crash?
> diff --git a/drivers/scsi/fnic/fnic_nvme.c b/drivers/scsi/fnic/fnic_nvme.c
> new file mode 100644
> index 0000000000000..8374464e4fcc8
> --- /dev/null
> +++ b/drivers/scsi/fnic/fnic_nvme.c
[ ... ]
> +void nvfnic_dma_unmap_sgl(struct fnic *fnic, struct fnic_io_req *io_req)
> +{
> + if (io_req->sgl_mapped) {
> + dma_unmap_single(&fnic->pdev->dev, io_req->sgl_list_pa,
> + sizeof(io_req->sgl_list[0]) * io_req->sgl_cnt,
> + DMA_TO_DEVICE);
> + io_req->sgl_mapped = 0;
> + io_req->sgl_list_pa = 0;
> + }
> +}
[Severity: Medium]
Could this lead to an unbalanced DMA unmap?
In nvfnic_fcpio_send(), io_req->sgl_mapped is not explicitly zeroed during
initialization:
nvfnic_fcpio_send() {
...
io_req->sgl_list_pa = 0;
io_req->wq = hw_queue_handle;
...
}
If a previous I/O set sgl_mapped to 1, and the current I/O has zero length,
the mapping step is skipped but sgl_mapped might still be 1. Will this cause
nvfnic_dma_unmap_sgl() to unmap a 0 address with 0 length?
[ ... ]
> +void nvfnic_fcpio_nvme_fast_cmpl_handler(struct fnic *fnic,
> + struct fcpio_fw_req *desc)
> +{
> + u8 type;
> + u8 hdr_status;
> + struct fcpio_tag ftag;
> + u32 id;
> + struct fnic_io_req *io_req;
[ ... ]
> + fcpio_header_dec(&desc->hdr, &type, &hdr_status, &ftag);
> + fcpio_tag_id_dec(&ftag, &id);
> + tag = id & FNIC_TAG_MASK;
> +
> + if (tag >= fnic->fnic_max_tag_id) {
> + FNIC_NVME_DBG(KERN_ERR, fnic,
> + "Tag out of range tag: 0x%x hdr status: %s\n", tag,
> + fnic_fcpio_status_to_str(hdr_status));
> + return;
> + }
> + spin_lock_irqsave(&fnic->fnic_lock, fnic->lock_flags);
> +
> + io_req = nvfnic_find_io_req_by_tag(fnic, tag);
> +
> + WARN_ON_ONCE(!io_req);
> + if (!io_req) {
[ ... ]
> + /* firmware completed the io */
> + io_req->io_completed = 1;
> + if (io_req->cmd_state == FNIC_IOREQ_ABTS_PENDING) {
> + /*
> + * set the FNIC_IO_DONE so that this doesn't get
> + * flagged as 'out of order' if it was not aborted
> + */
> + io_req->cmd_flags |= FNIC_IO_DONE;
> + io_req->cmd_flags |= FNIC_IO_ABTS_PENDING;
> + if (hdr_status == FCPIO_ABORTED)
> + io_req->cmd_flags |= FNIC_IO_ABORTED;
> + spin_unlock_irqrestore(&fnic->fnic_lock, fnic->lock_flags);
> +
> + FNIC_NVME_DBG(KERN_INFO, fnic,
> + "icmnd abts hdr:%d %s tag:0x%x io:%p",
> + hdr_status, fnic_fcpio_status_to_str(hdr_status),
> + id, io_req);
> + return;
> + }
[Severity: High]
Could an I/O hang indefinitely here if the abort request fails to queue?
When aborting an I/O, nvfnic_fcpio_abort() sets the state to
FNIC_IOREQ_ABTS_PENDING and drops the lock. If a hardware completion arrives
during this window, nvfnic_fcpio_nvme_fast_cmpl_handler() observes
ABTS_PENDING and intentionally drops the completion.
If the subsequent nvfnic_queue_abort_io_req() call fails, the state is
reverted back to FNIC_IOREQ_CMD_PENDING. Because the completion was
permanently discarded and the hardware was not told to abort, does this
leave the I/O hanging forever?
[ ... ]
> +void nvfnic_fcpio_nvme_itmf_cmpl_handler(struct fnic *fnic,
> + struct fcpio_fw_req *desc)
> +{
> + u8 type;
> + u8 hdr_status;
[ ... ]
> + /* If the status is IO not found consider it as success.
> + * NVME sends abort even if rport is down in which case
> + * we will get FCPIO_TIMEOUT. Consider this as success.
> + */
> + if ((hdr_status == FCPIO_IO_NOT_FOUND) ||
> + (hdr_status == FCPIO_TIMEOUT) ||
> + (hdr_status == FCPIO_ITMF_REJECTED))
> + io_req->abts_state = FCPIO_SUCCESS;
> +
> + io_req->cmd_flags |= FNIC_IO_ABT_TERM_DONE;
> +
> +
> + io_req->fcp_req->transferred_length = 0;
> + io_req->fcp_req->rcv_rsplen = 0;
> + if (io_req->abts_state == FCPIO_SUCCESS)
> + io_req->fcp_req->status = NVME_SC_ABORT_REQ;
> + else
> + io_req->fcp_req->status = NVME_SC_INTERNAL;
> +
> + nvfnic_release_nvme_ioreq_buf(iport, io_req);
> + if (io_req->done)
> + io_req->done(io_req);
> + spin_unlock_irqrestore(&fnic->fnic_lock, fnic->lock_flags);
> +}
[Severity: Critical]
Does this allow a tag to be reused before the hardware has actually completed
the original request, potentially corrupting a new I/O?
When an ITMF (abort) completion times out (FCPIO_TIMEOUT), the tag is
forcefully freed via io_req->done() and can be reallocated to a new request.
Since the driver matches completions only by the tag without a generation
count, if the late completion for the original I/O arrives, will
nvfnic_fcpio_nvme_fast_cmpl_handler() match it to the new request and
prematurely complete it with stale data?
[ ... ]
> +bool _terminate_tport_ios(struct sbitmap *map, unsigned int tag,
> + void *data)
> +{
> + struct fnic_tport_s *tport = data;
> + struct fnic_iport_s *iport = tport->iport;
> + struct fnic *fnic = iport->fnic;
> + struct fnic_io_req *io_req;
> + struct nvmefc_fcp_req *fcp_req = NULL;
> + struct nvme_fc_local_port *lport = iport->nv_lport;
> + struct nvme_fc_remote_port *rport = tport->nv_rport;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&fnic->fnic_lock, flags);
> + io_req = fnic->sw_copy_wq[0].io_req_table[tag];
> + if (!io_req) {
> + spin_unlock_irqrestore(&fnic->fnic_lock, flags);
> + return true;
> + }
> +
> + if (io_req->tport != tport) {
> + spin_unlock_irqrestore(&fnic->fnic_lock, flags);
> + return true;
> + }
> +
> + FNIC_NVME_DBG(KERN_INFO, fnic,
> + "Terminate tag: 0x%x (tport fcid 0x%x)\n",
> + io_req->tag, io_req->tport->fcid);
> + fcp_req = io_req->fcp_req;
> + spin_unlock_irqrestore(&fnic->fnic_lock, flags);
> +
> + nvfnic_fcpio_abort(lport, rport, NULL, fcp_req);
> + return true;
> +}
[Severity: High]
Can fcp_req be accessed after it has been freed?
Here, fcp_req is extracted from io_req under the lock. The lock is then dropped
without modifying io_req->cmd_state. During this unlocked window, a concurrent
hardware completion could run, process the I/O, invoke io_req->done(), and
return the request to the midlayer, which might free or reuse fcp_req.
When nvfnic_fcpio_abort() is subsequently called, could it dereference the
now-freed fcp_req structure?
[ ... ]
> +void nvfnic_nvme_zero_devloss_tports(struct fnic *fnic)
> +{
> + struct fnic_tport_s *tport, *next;
> +
> + spin_lock_irqsave(&fnic->fnic_lock, fnic->lock_flags);
> + list_for_each_entry_safe(tport, next, &fnic->iport.tport_list, links) {
> + if (tport->flags & FNIC_FDLS_NVME_REGISTERED) {
> + spin_unlock_irqrestore(&fnic->fnic_lock, fnic->lock_flags);
> + nvme_fc_set_remoteport_devloss(tport->nv_rport, 0);
> + spin_lock_irqsave(&fnic->fnic_lock, fnic->lock_flags);
> + }
> + }
> + spin_unlock_irqrestore(&fnic->fnic_lock, fnic->lock_flags);
> +}
[Severity: High]
This is a pre-existing issue, but dropping the lock inside
list_for_each_entry_safe() allows concurrent list corruption.
list_for_each_entry_safe() protects against deletion of the *current* entry
by caching the 'next' pointer. However, while the fnic_lock is dropped,
concurrent threads can remove and free the cached 'next' entry from the list.
When the loop resumes and accesses 'next', does this result in a
use-after-free? This pattern also exists in fnic_delete_fcp_tports().
[ ... ]
> +void nvfnic_ls_rsp_recv(struct fnic_iport_s *iport,
> + struct fc_frame_header *fchdr, int len)
> +{
> + uint8_t *fcid;
[ ... ]
> + nvfnic_ls_req->state = FNIC_LS_REQ_CMD_COMPLETE;
> +
> + list_del_init(&nvfnic_ls_req->list);
> + lsreq->private = NULL;
> + fdls_free_oxid(iport, oxid, &nvfnic_ls_req->oxid);
> + timer_delete(&nvfnic_ls_req->ls_req_timer);
> +
> + if (status == 0) {
> + FNIC_NVME_DBG(KERN_DEBUG, fnic,
> + "tport:0x%x lsreq:0x%x completed\n",
> + tport_fcid, oxid);
> +
> + /* Copy the Response */
> + memcpy(lsreq->rspaddr, (uint8_t *)fchdr + sizeof(*fchdr),
> + rsp_len);
> + }
> +
> + spin_unlock_irqrestore(&fnic->fnic_lock, fnic->lock_flags);
> + lsreq->done(lsreq, status);
> + spin_lock_irqsave(&fnic->fnic_lock, fnic->lock_flags);
> +}
[Severity: High]
Could the timer callback race with lsreq->done() and cause a use-after-free?
Because timer_delete() is used instead of timer_delete_sync(), the timer
callback might be concurrently executing on another CPU spinning on fnic_lock.
When the lock is dropped here, lsreq->done() is called, returning ownership
to the NVMe FC midlayer which frees lsreq.
Once the lock is released, won't the spinning timer callback wake up and
dereference the freed lsreq memory?
[ ... ]
> +void nvfnic_ls_req_abort(struct nvme_fc_local_port *lport,
> + struct nvme_fc_remote_port *rport,
> + struct nvmefc_ls_req *lsreq)
> +{
> + struct fnic_iport_s *iport = lport->private;
> + struct fnic *fnic = iport->fnic;
> + struct fnic_tport_s *tport;
> + struct nvfnic_ls_req *nvfnic_ls_req;
> + uint16_t oxid;
> + int timeout;
> +
> + spin_lock_irqsave(&fnic->fnic_lock, fnic->lock_flags);
[ ... ]
> + nvfnic_ls_req->state = FNIC_LS_REQ_CMD_ABTS_STARTED;
> + spin_unlock_irqrestore(&fnic->fnic_lock, fnic->lock_flags);
> + timer_delete_sync(&nvfnic_ls_req->ls_req_timer);
> +
> + spin_lock_irqsave(&fnic->fnic_lock, fnic->lock_flags);
> + nvfnic_ls_req = lsreq->private;
> +
> + if ((nvfnic_ls_req == NULL) ||
> + (nvfnic_ls_req->state == FNIC_LS_REQ_CMD_ABTS_PENDING)) {
[Severity: High]
Does this dereference lsreq->private after the lock is reacquired, when lsreq
might already be freed?
The fnic_lock is dropped to call timer_delete_sync(). During this window,
nvfnic_terminate_tport_ls_reqs() could process the request and invoke
lsreq->done(), returning it to the midlayer which then frees it.
When the lock is reacquired, nvfnic_ls_req = lsreq->private accesses the freed
lsreq memory. Can this result in a use-after-free crash?
[ ... ]
> +void nvfnic_delete_tport(struct fnic_iport_s *iport,
> + struct fnic_tport_s *tport,
> + unsigned long flags)
> +{
> + struct fnic *fnic = iport->fnic;
> + int ret;
> + unsigned int time_wait = FNIC_NVME_TPORT_REMOVE_WAIT;
> + unsigned int time_remain;
> + DECLARE_COMPLETION_ONSTACK(tm_done);
> + unsigned int fcid;
> + int count = 0;
> +
> + if (!tport)
> + return;
> +
> + fcid = tport->fcid;
> + fdls_set_tport_state(tport, FDLS_TGT_STATE_OFFLINE);
> +
> + FNIC_NVME_DBG(KERN_DEBUG, fnic,
> + "0x%x: scheduled deletion for tport: 0x%x\n",
> + iport->fcid, tport->fcid);
> +
> + if (!(tport->flags & FNIC_FDLS_NVME_REGISTERED)) {
> + FNIC_NVME_DBG(KERN_ERR, fnic,
> + "0x%x: tport: 0x%x not registered. Freeing\n",
> + iport->fcid, tport->fcid);
> + list_del(&tport->links);
> + kfree(tport);
> + return;
> + }
> +
> + tport->tport_del_done = &tm_done;
> +
> + tport->flags |= FNIC_FDLS_TPORT_DELETED;
> + spin_unlock_irqrestore(&fnic->fnic_lock, flags);
> + ret = nvme_fc_unregister_remoteport(tport->nv_rport);
> + if (ret) {
> + FNIC_NVME_DBG(KERN_ERR, fnic,
> + "tport: 0x%x unregister failed %d\n",
> + tport->fcid, ret);
> + spin_lock_irqsave(&fnic->fnic_lock, flags);
> + tport->tport_del_done = NULL;
> + list_del(&tport->links);
> + kfree(tport);
> + return;
> + }
[Severity: High]
Does this leak pending LS requests and leave their timers active?
If nvme_fc_unregister_remoteport() fails, kfree(tport) is called directly.
This bypasses nvfnic_terminate_tport_ls_reqs(), which leaks all nvfnic_ls_req
structures left on tport->ls_req_list.
Since their timers are never stopped, won't nvfnic_ls_req_timeout()
eventually fire and access the freed tport structure?
> + time_remain = wait_for_completion_timeout(tport->tport_del_done,
> + msecs_to_jiffies(time_wait));
> +
> + FNIC_NVME_DBG(KERN_DEBUG, fnic,
> + "tport: 0x%x wait for deletion done\n",
> + tport->fcid);
> +
> + spin_lock_irqsave(&fnic->fnic_lock, flags);
> + tport->tport_del_done = NULL;
> +
> + if (!time_remain) {
> + FNIC_NVME_DBG(KERN_ERR, fnic,
> + "tport: 0x%x nvme midlayer completion timed out\n",
> + tport->fcid);
> +
> + spin_unlock_irqrestore(&fnic->fnic_lock, flags);
> + nvfnic_cleanup_tport_io(fnic, tport);
> + spin_lock_irqsave(&fnic->fnic_lock, flags);
> + tport->flags &= ~FNIC_FDLS_NVME_TPORT_CLEANUP_PENDING;
> + list_del(&tport->links);
> + kfree(tport);
[Severity: High]
If this timeout expires, will the NVMe FC midlayer still call the
remoteport_delete callback and cause a use-after-free?
When the timeout expires, the driver forcefully calls kfree(tport). However,
the NVMe FC midlayer retains pointers to these structures and will eventually
execute the asynchronous remoteport_delete callback.
When nvfnic_remote_port_delete() is called later, it dereferences
rport->private. Since tport was already freed here, does this lead to a
use-after-free crash?
[ ... ]
> +int nvfnic_add_lport(struct fnic *fnic)
> +{
> + struct nvme_fc_port_info pinfo;
> + struct fnic_iport_s *iport = &fnic->iport;
> + int ret = 0;
[ ... ]
> + ret = nvme_fc_register_localport(&pinfo, iport->nv_tmpl,
> + get_device(&fnic->pdev->dev),
> + &iport->nv_lport);
> + if (ret) {
> + FNIC_NVME_DBG(KERN_ERR, fnic,
> + "Failed to add wwpn: 0x%llx ret: %d\n",
> + iport->wwpn, ret);
> + kfree(iport->nv_tmpl);
> + return ret;
> + }
[Severity: Medium]
Does this create a permanent reference leak of the PCI device?
The get_device(&fnic->pdev->dev) call takes a reference that is passed to
the NVMe FC midlayer, which internally takes its own reference when registering
the localport and puts it upon unregistration.
Because this driver-level get_device() is never balanced with a put_device()
call by fnic, will this leak prevent PCI device teardown or hot-unplug?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260624044334.3079-1-kartilak@cisco.com?part=6
next prev parent reply other threads:[~2026-06-24 5:12 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-24 4:43 [PATCH v5 00/13] Introduce functionality for NVMe initiator Karan Tilak Kumar
2026-06-24 4:43 ` [PATCH v5 01/13] scsi: fnic: Make debug logging protocol independent Karan Tilak Kumar
2026-06-24 5:08 ` sashiko-bot
2026-06-24 4:43 ` [PATCH v5 02/13] scsi: fnic: Use fnic_num for non-SCSI identifiers Karan Tilak Kumar
2026-06-24 5:11 ` sashiko-bot
2026-06-24 4:43 ` [PATCH v5 03/13] scsi: fnic: Decode firmware role configuration Karan Tilak Kumar
2026-06-24 4:43 ` [PATCH v5 04/13] scsi: fnic: Advertise NVMe initiator service parameters Karan Tilak Kumar
2026-06-24 4:43 ` [PATCH v5 05/13] scsi: fnic: Add FDLS role handling for NVMe initiators Karan Tilak Kumar
2026-06-24 5:18 ` sashiko-bot
2026-06-24 4:43 ` [PATCH v5 06/13] scsi: fnic: Add the NVMe/FC transport path Karan Tilak Kumar
2026-06-24 5:12 ` sashiko-bot [this message]
2026-06-24 4:43 ` [PATCH v5 07/13] scsi: fnic: Route completions and resets by initiator role Karan Tilak Kumar
2026-06-24 5:11 ` sashiko-bot
2026-06-24 4:43 ` [PATCH v5 08/13] scsi: fnic: Handle NVMe LS frames in FDLS Karan Tilak Kumar
2026-06-24 5:13 ` sashiko-bot
2026-06-24 4:43 ` [PATCH v5 09/13] scsi: fnic: Send NVMe LS requests through FDLS Karan Tilak Kumar
2026-06-24 5:11 ` sashiko-bot
2026-06-24 4:43 ` [PATCH v5 10/13] scsi: fnic: Abort timed-out NVMe LS requests Karan Tilak Kumar
2026-06-24 5:13 ` sashiko-bot
2026-06-24 4:43 ` [PATCH v5 11/13] scsi: fnic: Track NVMe transport statistics Karan Tilak Kumar
2026-06-24 5:15 ` sashiko-bot
2026-06-24 4:43 ` [PATCH v5 12/13] scsi: fnic: Expose NVMe transport state in debugfs Karan Tilak Kumar
2026-06-24 5:17 ` sashiko-bot
2026-06-24 4:43 ` [PATCH v5 13/13] scsi: fnic: Bump up version number Karan Tilak Kumar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260624051254.B19751F000E9@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=kartilak@cisco.com \
--cc=linux-scsi@vger.kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox