Re: [PATCH v4 05/13] scsi: fnic: Add FDLS role handling for NVMe initiators

Linux SCSI subsystem development
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Karan Tilak Kumar" <kartilak@cisco.com>
Cc: linux-scsi@vger.kernel.org
Subject: Re: [PATCH v4 05/13] scsi: fnic: Add FDLS role handling for NVMe initiators
Date: Fri, 12 Jun 2026 21:42:57 +0000	[thread overview]
Message-ID: <20260612214258.311271F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260612180918.8554-6-kartilak@cisco.com>

Thank you for your contribution! Sashiko AI review found 8 potential issue(s) to consider:

New issues:
- [High] The 'Establish Image Pair' flag (FC_SPP_EST_IMG_PAIR) is omitted in the NVMe PRLI Service Parameters.
- [High] NVMe PRLI frame payload is truncated by 4 bytes, violating the specified prli_len and omitting the end of the NVMe SPP.

Pre-existing issues:
- [High] Dereference of freed `tport` pointer in `fnic_tport_event_handler` if a link down occurs before the event is processed.
- [High] Missing cancellation of iport->fabric.retry_timer upon module unload causes UAF.
- [High] Dropping the list-protecting lock inside list_for_each_entry_safe leads to a Use-After-Free if concurrent threads delete the 'next' element.
- [High] Functions drop and reacquire `fnic_lock` using the shared `fnic->lock_flags` instead of the local `flags` saved by the caller, potentially corrupting IRQ state.
- [High] FCoE frames are permanently leaked when send or queue operations fail, leading to kernel memory exhaustion (OOM).
- [High] Releasing `fnic_lock` during target port addition allows concurrent deletion, causing a Use-After-Free.
--

commit 97165276705c88c1da90e05dc10eb567481fddcd
Author: Karan Tilak Kumar <kartilak@cisco.com>

scsi: fnic: Add FDLS role handling for NVMe initiators

Modify FDLS registration and discovery flows to use NVMe FC-4 type,
features, PRLI service parameters, and FDMI attributes when the adapter
runs as an NVMe initiator.

Limit SCSI host setup, teardown, rport reset, and FC host notifications
to FCP initiators while keeping target-port events available to both FCP
and NVMe roles.

> diff --git a/drivers/scsi/fnic/fdls_disc.c b/drivers/scsi/fnic/fdls_disc.c
> --- a/drivers/scsi/fnic/fdls_disc.c
> +++ b/drivers/scsi/fnic/fdls_disc.c

[ ... ]

> @@ -1264,11 +1284,14 @@ bool fdls_delete_tport(struct fnic_iport_s *iport, struct fnic_tport_s *tport)
>  		tport->timer_pending = 0;
>  	}
>  
> -	spin_unlock_irqrestore(&fnic->fnic_lock, fnic->lock_flags);
> -	fnic_rport_exch_reset(iport->fnic, tport->fcid);
> -	spin_lock_irqsave(&fnic->fnic_lock, fnic->lock_flags);
> +	if (IS_FNIC_FCP_INITIATOR(fnic)) {
> +		spin_unlock_irqrestore(&fnic->fnic_lock, fnic->lock_flags);
> +		fnic_rport_exch_reset(iport->fnic, tport->fcid);
> +		spin_lock_irqsave(&fnic->fnic_lock, fnic->lock_flags);
> +	}

[Severity: High]
This is a pre-existing issue, but does dropping and reacquiring the lock
using the shared fnic->lock_flags instead of a local variable corrupt the
IRQ state?

Because spin_lock_irqsave() updates fnic->lock_flags with the local CPU's IRQ
state, and fnic_lock is acquired from numerous contexts, this acts as a shared
variable. Restoring it via spin_unlock_irqrestore() can restore an invalid
IRQ state, leading to system instability.

[ ... ]

> @@ -1508,13 +1535,20 @@ fdls_send_tgt_prli(struct fnic_iport_s *iport, struct fnic_tport_s *tport)
>  		.fchdr = {.fh_r_ctl = FC_RCTL_ELS_REQ, .fh_type = FC_TYPE_ELS,
>  			  .fh_f_ctl = {FNIC_ELS_REQ_FCTL, 0, 0},
>  			  .fh_rx_id = cpu_to_be16(FNIC_UNASSIGNED_RXID)},
> -		.els_prli = {.prli_cmd = ELS_PRLI,
> -			     .prli_spp_len = 16,
> -			     .prli_len = cpu_to_be16(0x14)},
> -		.sp = {.spp_type = 0x08, .spp_flags = 0x0020,
> -		       .spp_params = cpu_to_be32(0xA2)}
> +		.els_prli = {.prli_cmd = ELS_PRLI},
> +		.sp = {.spp_params = cpu_to_be32(iport->service_params)}
>  	};
>  
> +	fdls_set_frame_type(fnic, &pprli->sp.spp_type);
> +	if (IS_FNIC_FCP_INITIATOR(fnic)) {
> +		pprli->els_prli.prli_spp_len = 16;
> +		pprli->els_prli.prli_len = cpu_to_be16(0x14);
> +		pprli->sp.spp_flags = FC_SPP_EST_IMG_PAIR;
> +	} else if (IS_FNIC_NVME_INITIATOR(fnic)) {
> +		pprli->els_prli.prli_spp_len = 20;
> +		pprli->els_prli.prli_len = cpu_to_be16(0x18);
> +	}

[Severity: High]
For the NVMe initiator branch, is the omission of the FC_SPP_EST_IMG_PAIR
flag an oversight?

The FC-NVMe standard mandates that the Establish Image Pair bit shall be set
to 1 in PRLI requests to establish the connection. By leaving it 0, the NVMe
initiator sends an invalid PRLI, potentially causing compliant targets to
reject the login.

[Severity: High]
Also in fdls_send_tgt_prli(), does this truncate the NVMe PRLI frame payload?

Earlier in the function, the transmission size is statically allocated as:

uint16_t frame_size = FNIC_ETH_FCOE_HDRS_OFFSET +
        sizeof(struct fc_std_els_prli);

The fc_std_els_prli structure is only 44 bytes long. For NVMe, prli_len
is set to 0x18 (24 bytes payload), meaning the frame over the wire needs
to be 48 bytes. Because it is fixed to sizeof(struct fc_std_els_prli), it
truncates the last 4 bytes of the NVMe SPP (containing the First Burst
Size), resulting in a malformed frame.

[Severity: High]
This is a pre-existing issue, but does failing to send the FCoE frame leak
memory here?

If fnic_send_fcoe_frame() fails at the end of fdls_send_tgt_prli() (and
similar discovery routines), it returns an error code, but the allocated
frame is never freed:

	fnic_send_fcoe_frame(iport, frame, frame_size);

err_out:
	timeout = max(2 * iport->e_d_tov, iport->plogi_timeout);

The caller just starts a retry timer, permanently leaking the frame and
potentially causing OOM under congestion.

[ ... ]

> @@ -2868,20 +2916,23 @@ fdls_process_tgt_prli_rsp(struct fnic_iport_s *iport,
>  	fdls_set_tport_state(tport, FDLS_TGT_STATE_READY);
>  
>  	/* Inform the driver about new target added */
> -	tport_add_evt = kzalloc_obj(struct fnic_tport_event_s, GFP_ATOMIC);
> -	if (!tport_add_evt) {
> -		FNIC_FCS_DBG(KERN_INFO, fnic,
> +	if (IS_FNIC_FCP_INITIATOR(fnic) ||
> +	    IS_FNIC_NVME_INITIATOR(fnic)) {
> +		tport_add_evt = kzalloc_obj(struct fnic_tport_event_s, GFP_ATOMIC);
> +		if (!tport_add_evt) {
> +			FNIC_FCS_DBG(KERN_INFO, fnic,
>  				     "iport fcid: 0x%x tport event memory allocation failure: 0x%0x\n",
>  				     iport->fcid, tport->fcid);
> -		return;
> -	}
> -	tport_add_evt->event = TGT_EV_RPORT_ADD;
> -	tport_add_evt->arg1 = (void *) tport;
> +			return;
> +		}
> +		tport_add_evt->event = TGT_EV_RPORT_ADD;
> +		tport_add_evt->arg1 = (void *)tport;
>  		FNIC_FCS_DBG(KERN_INFO, fnic,
>  			     "iport fcid: 0x%x add tport event fcid: 0x%x\n",
>  			     tport->fcid, iport->fcid);
> -	list_add_tail(&tport_add_evt->links, &fnic->tport_event_list);
> -	queue_work(fnic_event_queue, &fnic->tport_work);
> +		list_add_tail(&tport_add_evt->links, &fnic->tport_event_list);
> +		queue_work(fnic_event_queue, &fnic->tport_work);
> +	}

[Severity: High]
This is a pre-existing issue, but does queueing a raw tport pointer here
expose us to a use-after-free in the workqueue handler?

If a link down occurs before the workqueue executes, fdls_delete_tport()
can free the tport because the FNIC_FDLS_SCSI_REGISTERED flag isn't set yet.
When fnic_tport_event_handler() later processes the event, it dereferences
the freed pointer:

	if (tport->state == FDLS_TGT_STATE_READY) {
		if (IS_FNIC_FCP_INITIATOR(fnic))
			fnic_fdls_add_tport(&fnic->iport, tport, flags);

> diff --git a/drivers/scsi/fnic/fnic_fcs.c b/drivers/scsi/fnic/fnic_fcs.c
> --- a/drivers/scsi/fnic/fnic_fcs.c
> +++ b/drivers/scsi/fnic/fnic_fcs.c

[ ... ]

> @@ -1018,7 +1018,8 @@ void fnic_delete_fcp_tports(struct fnic *fnic)
>  					 "removing fcp rport fcid: 0x%x", tport->fcid);
>  		fdls_set_tport_state(tport, FDLS_TGT_STATE_OFFLINING);
>  		fnic_del_tport_timer_sync(fnic, tport);
> -		fnic_fdls_remove_tport(&fnic->iport, tport, flags);
> +		if (IS_FNIC_FCP_INITIATOR(fnic))
> +			fnic_fdls_remove_tport(&fnic->iport, tport, flags);
>  	}
>  	spin_unlock_irqrestore(&fnic->fnic_lock, flags);
>  }

[Severity: High]
This is a pre-existing issue, but does fnic_del_tport_timer_sync() temporarily
drop the fnic_lock while iterating with list_for_each_entry_safe()?

Dropping the list-protecting lock inside list_for_each_entry_safe() allows
concurrent threads to delete the "next" element on the list. When the loop
resumes, advancing to the freed "next" element results in a use-after-free.

[ ... ]

> @@ -1044,8 +1045,8 @@ void fnic_tport_event_handler(struct work_struct *work)
>  			FNIC_FCS_DBG(KERN_INFO, fnic,
>  						 "Add rport event");
>  			if (tport->state == FDLS_TGT_STATE_READY) {
> -				fnic_fdls_add_tport(&fnic->iport,
> -					(struct fnic_tport_s *) cur_evt->arg1, flags);
> +				if (IS_FNIC_FCP_INITIATOR(fnic))
> +					fnic_fdls_add_tport(&fnic->iport, tport, flags);

[Severity: High]
This is a pre-existing issue, but does fnic_fdls_add_tport() drop the
fnic_lock internally, leading to another potential use-after-free?

fnic_fdls_add_tport() executes under fnic_lock but drops it to call
fc_remote_port_add(). A concurrent event could acquire the lock and free
the tport. When fnic_fdls_add_tport() reacquires the lock, it writes to
tport->rport and tport->flags, corrupting the freed memory.

> diff --git a/drivers/scsi/fnic/fnic_main.c b/drivers/scsi/fnic/fnic_main.c
> --- a/drivers/scsi/fnic/fnic_main.c
> +++ b/drivers/scsi/fnic/fnic_main.c

[ ... ]

> @@ -1184,7 +1188,8 @@ static void fnic_remove(struct pci_dev *pdev)
>  	 */
>  	flush_workqueue(fnic_event_queue);
>  
> -	fnic_scsi_unload(fnic);
> +	if (IS_FNIC_FCP_INITIATOR(fnic))
> +		fnic_scsi_unload(fnic);
>  
>  	if (vnic_dev_get_intr_mode(fnic->vdev) == VNIC_DEV_INTR_MODE_MSI)
>  		timer_delete_sync(&fnic->notify_timer);

[Severity: High]
This is a pre-existing issue, but does fnic_remove() omit the teardown of
iport->fabric.retry_timer?

The timer is set up during initialization, but timer_delete_sync() for
the fabric retry timer is missing from the module unload paths. If the timer
fires after the fnic structure is freed, it will cause a use-after-free.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260612180918.8554-1-kartilak@cisco.com?part=5

next prev parent reply	other threads:[~2026-06-12 21:42 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-12 18:09 [PATCH v4 00/13] Introduce functionality for NVMe initiator Karan Tilak Kumar
2026-06-12 18:09 ` [PATCH v4 01/13] scsi: fnic: Make debug logging protocol independent Karan Tilak Kumar
2026-06-12 18:42   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 02/13] scsi: fnic: Use fnic_num for non-SCSI identifiers Karan Tilak Kumar
2026-06-12 18:57   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 03/13] scsi: fnic: Decode firmware role configuration Karan Tilak Kumar
2026-06-12 18:09 ` [PATCH v4 04/13] scsi: fnic: Advertise NVMe initiator service parameters Karan Tilak Kumar
2026-06-12 18:09 ` [PATCH v4 05/13] scsi: fnic: Add FDLS role handling for NVMe initiators Karan Tilak Kumar
2026-06-12 21:42   ` sashiko-bot [this message]
2026-06-12 18:09 ` [PATCH v4 06/13] scsi: fnic: Add the NVMe/FC transport path Karan Tilak Kumar
2026-06-12 21:59   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 07/13] scsi: fnic: Route completions and resets by initiator role Karan Tilak Kumar
2026-06-12 22:15   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 08/13] scsi: fnic: Handle NVMe LS frames in FDLS Karan Tilak Kumar
2026-06-12 22:29   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 09/13] scsi: fnic: Send NVMe LS requests through FDLS Karan Tilak Kumar
2026-06-12 22:45   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 10/13] scsi: fnic: Abort timed-out NVMe LS requests Karan Tilak Kumar
2026-06-12 22:57   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 11/13] scsi: fnic: Track NVMe transport statistics Karan Tilak Kumar
2026-06-12 23:16   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 12/13] scsi: fnic: Expose NVMe transport state in debugfs Karan Tilak Kumar
2026-06-12 23:22   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 13/13] scsi: fnic: Bump up version number Karan Tilak Kumar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260612214258.311271F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=kartilak@cisco.com \
    --cc=linux-scsi@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox