From: sashiko-bot@kernel.org
To: "Karan Tilak Kumar" <kartilak@cisco.com>
Cc: linux-scsi@vger.kernel.org
Subject: Re: [PATCH v5 07/13] scsi: fnic: Route completions and resets by initiator role
Date: Wed, 24 Jun 2026 05:11:53 +0000 [thread overview]
Message-ID: <20260624051153.C379C1F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260624044334.3079-8-kartilak@cisco.com>
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
New issues:
- [High] Asynchronous NVMe IO completion work queued during driver teardown executes after the fnic structure is freed, leading to a Use-After-Free.
Pre-existing issues:
- [Critical] The firmware reset completion handler calls `schedule_timeout()` from a hardirq context, leading to a "scheduling while atomic" kernel panic.
- [High] Using a shared struct member (`fnic->lock_flags`) to store IRQ state for `spin_lock_irqsave` violates locking semantics and can corrupt IRQ state across CPUs.
--
commit 0746859ac4c31fbfb285f20a291564287cfa0119
Author: Karan Tilak Kumar <kartilak@cisco.com>
scsi: fnic: Route completions and resets by initiator role
Dispatch FCPIO command, response, and ITMF completions to the FCP or
NVMe handlers based on the configured role. Read the NVMe queue-depth
and timeout retry fields from firmware config, clean up NVMe I/O on
firmware reset, and skip SCSI-only cleanup for initiator roles.
[ ... ]
> diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c
> index 04ab384033b19..9607684bc6107 100644
> --- a/drivers/scsi/fnic/fnic_scsi.c
> +++ b/drivers/scsi/fnic/fnic_scsi.c
[ ... ]
> @@ -734,6 +735,8 @@ static int fnic_fcpio_fw_reset_cmpl_handler(struct fnic *fnic,
> /* Clean up all outstanding io requests */
> if (IS_FNIC_FCP_INITIATOR(fnic))
> fnic_cleanup_io(fnic, SCSI_NO_TAG);
[Severity: Critical]
This is a pre-existing issue, but does calling fnic_cleanup_io() from a
hardware interrupt context trigger a "scheduling while atomic" panic?
If a hardware interrupt handler like fnic_isr_msix_wq_copy() processes
a FCPIO_RESET_CMPL event:
fnic_wq_copy_cmpl_handler() ->
fnic_fcpio_fw_reset_cmpl_handler() ->
fnic_cleanup_io() ->
schedule_timeout(msecs_to_jiffies(100))
Calling schedule_timeout() yields the CPU to the scheduler, which is
invalid from hardirq context. Will this crash the system upon firmware
reset completion?
> + else if (IS_FNIC_NVME_INITIATOR(fnic))
> + nvfnic_cleanup_all_nvme_ios(fnic);
>
> atomic64_set(&fnic->fnic_stats.fw_stats.active_fw_reqs, 0);
> atomic64_set(&fnic->fnic_stats.io_stats.active_ios, 0);
[ ... ]
> @@ -1457,11 +1461,22 @@ static int fnic_fcpio_cmpl_handler(struct vnic_dev *vdev,
> break;
>
> case FCPIO_ICMND_CMPL: /* fw completed a command */
> - fnic_fcpio_icmnd_cmpl_handler(fnic, cq_index, desc);
> + if (IS_FNIC_FCP_INITIATOR(fnic))
> + fnic_fcpio_icmnd_cmpl_handler(fnic, cq_index, desc);
> + else if (IS_FNIC_NVME_INITIATOR(fnic))
> + nvfnic_fcpio_nvme_fast_cmpl_handler(fnic, desc);
> + break;
> +
> + case FCPIO_NVME_ERSP_HW_CMPL: /* fw completed NVMe ERSP */
> + if (IS_FNIC_NVME_INITIATOR(fnic))
> + nvfnic_fcpio_ersp_cmpl_handler(fnic, desc, 1);
[Severity: High]
This is a pre-existing issue, but does calling these NVMe completion handlers
corrupt IRQ state across CPUs?
Looking at nvfnic_fcpio_nvme_fast_cmpl_handler() and
nvfnic_fcpio_ersp_cmpl_handler() in fnic_nvme.c, they both use a shared
structure member to save IRQ state:
spin_lock_irqsave(&fnic->fnic_lock, fnic->lock_flags);
Because fnic->lock_flags is shared rather than a local stack variable,
concurrent threads acquiring the lock on different CPUs will overwrite
each other's saved IRQ states. When spin_unlock_irqrestore() is called,
the wrong IRQ state is restored.
> break;
>
> case FCPIO_ITMF_CMPL: /* fw completed itmf (abort cmd, lun reset)*/
> - fnic_fcpio_itmf_cmpl_handler(fnic, cq_index, desc);
> + if (IS_FNIC_FCP_INITIATOR(fnic))
> + fnic_fcpio_itmf_cmpl_handler(fnic, cq_index, desc);
> + else if (IS_FNIC_NVME_INITIATOR(fnic))
> + nvfnic_fcpio_nvme_itmf_cmpl_handler(fnic, desc);
> break;
[ ... ]
> @@ -1650,6 +1665,15 @@ void fnic_wq_copy_cleanup_handler(struct vnic_wq_copy *wq,
> unsigned long start_time = 0;
> uint16_t hwq;
>
> + /*
> + * Clean up outstanding NVMe requests if firmware reset did not
> + * complete them before WQ copy cleanup.
> + */
> + if (IS_FNIC_NVME_INITIATOR(fnic)) {
> + nvfnic_cleanup_all_nvme_ios(fnic);
> + return;
> + }
> +
[Severity: High]
Can this unconditional cleanup lead to a use-after-free during driver
teardown?
The fnic_remove() sequence calls nvfnic_nvme_unload() which flushes
fnic->nvme_io_cmpl_work. It then calls fnic_cleanup() which invokes this
handler:
fnic_remove()
nvfnic_nvme_unload()
flush_work(&fnic->nvme_io_cmpl_work);
fnic_cleanup()
fnic_wq_copy_cleanup_handler()
nvfnic_cleanup_all_nvme_ios()
io_req->done()
If there are uncompleted requests, nvfnic_cleanup_all_nvme_ios() will
forcibly complete them, and io_req->done() queues new async work referencing
&fnic->nvme_io_cmpl_work.
Since fnic_remove() then frees the fnic structure without another flush,
won't the asynchronously executing work item dereference the freed fnic
pointer?
> /* get the tag reference */
> fcpio_tag_id_dec(&desc->hdr.tag, &id);
> id &= FNIC_TAG_MASK;
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260624044334.3079-1-kartilak@cisco.com?part=7
next prev parent reply other threads:[~2026-06-24 5:11 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-24 4:43 [PATCH v5 00/13] Introduce functionality for NVMe initiator Karan Tilak Kumar
2026-06-24 4:43 ` [PATCH v5 01/13] scsi: fnic: Make debug logging protocol independent Karan Tilak Kumar
2026-06-24 5:08 ` sashiko-bot
2026-06-24 4:43 ` [PATCH v5 02/13] scsi: fnic: Use fnic_num for non-SCSI identifiers Karan Tilak Kumar
2026-06-24 5:11 ` sashiko-bot
2026-06-24 4:43 ` [PATCH v5 03/13] scsi: fnic: Decode firmware role configuration Karan Tilak Kumar
2026-06-24 4:43 ` [PATCH v5 04/13] scsi: fnic: Advertise NVMe initiator service parameters Karan Tilak Kumar
2026-06-24 4:43 ` [PATCH v5 05/13] scsi: fnic: Add FDLS role handling for NVMe initiators Karan Tilak Kumar
2026-06-24 5:18 ` sashiko-bot
2026-06-24 4:43 ` [PATCH v5 06/13] scsi: fnic: Add the NVMe/FC transport path Karan Tilak Kumar
2026-06-24 5:12 ` sashiko-bot
2026-06-24 4:43 ` [PATCH v5 07/13] scsi: fnic: Route completions and resets by initiator role Karan Tilak Kumar
2026-06-24 5:11 ` sashiko-bot [this message]
2026-06-24 4:43 ` [PATCH v5 08/13] scsi: fnic: Handle NVMe LS frames in FDLS Karan Tilak Kumar
2026-06-24 5:13 ` sashiko-bot
2026-06-24 4:43 ` [PATCH v5 09/13] scsi: fnic: Send NVMe LS requests through FDLS Karan Tilak Kumar
2026-06-24 5:11 ` sashiko-bot
2026-06-24 4:43 ` [PATCH v5 10/13] scsi: fnic: Abort timed-out NVMe LS requests Karan Tilak Kumar
2026-06-24 5:13 ` sashiko-bot
2026-06-24 4:43 ` [PATCH v5 11/13] scsi: fnic: Track NVMe transport statistics Karan Tilak Kumar
2026-06-24 5:15 ` sashiko-bot
2026-06-24 4:43 ` [PATCH v5 12/13] scsi: fnic: Expose NVMe transport state in debugfs Karan Tilak Kumar
2026-06-24 5:17 ` sashiko-bot
2026-06-24 4:43 ` [PATCH v5 13/13] scsi: fnic: Bump up version number Karan Tilak Kumar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260624051153.C379C1F000E9@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=kartilak@cisco.com \
--cc=linux-scsi@vger.kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox