Re: [PATCH v4 08/13] scsi: fnic: Handle NVMe LS frames in FDLS

All of lore.kernel.org
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Karan Tilak Kumar" <kartilak@cisco.com>
Cc: linux-scsi@vger.kernel.org
Subject: Re: [PATCH v4 08/13] scsi: fnic: Handle NVMe LS frames in FDLS
Date: Fri, 12 Jun 2026 22:29:14 +0000	[thread overview]
Message-ID: <20260612222914.C040C1F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260612180918.8554-9-kartilak@cisco.com>

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
- [High] IRQ state corruption due to locking inconsistency with local flags vs struct member
- [High] State corruption and Use-After-Free of OXIDs due to incomplete pool reset
- [High] ABBA deadlock involving `timer_delete_sync()` and `fnic->fnic_lock`
- [Medium] Dead code in defensive check causing NULL pointer dereference
--

commit fab490a6f935460a8594dd821b190a2f2d7c7b5c
Author: Karan Tilak Kumar <kartilak@cisco.com>

scsi: fnic: Handle NVMe LS frames in FDLS

Classify NVMe LS request OXIDs, route NVMe LS responses and ABTS frames
through the FCS receive path, and reset NVMe exchanges when FDLS tears
down target ports. Extend FDLS link-down and frame-processing paths so
NVMe LS traffic follows the same discovery and cleanup state machine as
FCP traffic.

> diff --git a/drivers/scsi/fnic/fdls_disc.c b/drivers/scsi/fnic/fdls_disc.c
> --- a/drivers/scsi/fnic/fdls_disc.c
> +++ b/drivers/scsi/fnic/fdls_disc.c
> @@ -387,10 +387,16 @@ static bool fdls_is_oxid_tgt_req(uint16_t oxid)
>  	return true;
>  }
>  
> +static inline bool fdls_is_oxid_nvme_req(uint16_t oxid)
> +{
> +	return FNIC_FRAME_TYPE(oxid) == FNIC_FRAME_TYPE_NVME_LS;
> +}
> +
>  static void fdls_reset_oxid_pool(struct fnic_iport_s *iport)
>  {
>  	struct fnic_oxid_pool_s *oxid_pool = &iport->oxid_pool;
>  
> +	bitmap_clear(oxid_pool->bitmap, 0, FNIC_OXID_POOL_SZ);
>  	oxid_pool->next_idx = 0;
>  }

[Severity: High]
Does this clear the bitmap without tearing down the background reclaim timers?

If oxid_reclaim_work is still pending or the oxid_reclaim_list is not cleared
when fdls_reset_oxid_pool() is called, the background reclaim task might run
after the link is reset.

If a new connection starts allocating OXIDs, could the pending reclaim task
blindly clear bits for newly allocated OXIDs? This might cause those active
OXIDs to be marked free again, leading to double-allocation and session
corruption.

[ ... ]

> @@ -1288,6 +1294,10 @@ bool fdls_delete_tport(struct fnic_iport_s *iport, struct fnic_tport_s *tport)
>  		spin_unlock_irqrestore(&fnic->fnic_lock, fnic->lock_flags);
>  		fnic_rport_exch_reset(iport->fnic, tport->fcid);
>  		spin_lock_irqsave(&fnic->fnic_lock, fnic->lock_flags);
> +	} else if (IS_FNIC_NVME_INITIATOR(fnic)) {
> +		spin_unlock_irqrestore(&fnic->fnic_lock, fnic->lock_flags);
> +		nvfnic_exch_reset(iport, tport);
> +		spin_lock_irqsave(&fnic->fnic_lock, fnic->lock_flags);
>  	}

[Severity: High]
Can using the structure member fnic->lock_flags here instead of a local flags
variable corrupt the IRQ state?

If a caller (like nvfnic_remote_port_delete) acquires the lock using a local
flags variable, releasing and reacquiring it here with fnic->lock_flags will
restore a stale or uninitialized IRQ state instead of the actual state.
Could this prematurely re-enable interrupts and lead to an IRQ reentrancy
deadlock?

A similar lock mismatch appears to happen in nvfnic_process_ls_abts_rsp(),
which is called from fdls_tport_timer_callback() where the lock is acquired
with a local flags variable.

[ ... ]

> diff --git a/drivers/scsi/fnic/fnic_fcs.c b/drivers/scsi/fnic/fnic_fcs.c
> --- a/drivers/scsi/fnic/fnic_fcs.c
> +++ b/drivers/scsi/fnic/fnic_fcs.c
> @@ -313,8 +319,14 @@ void fnic_handle_frame(struct work_struct *work)
>  		fchdr_offset = (cur_frame->rx_ethhdr_stripped) ?
>  			0 : FNIC_ETH_FCOE_HDRS_OFFSET;
>  
> -		fnic_fdls_recv_frame(&fnic->iport, cur_frame->fp,
> -							 cur_frame->frame_len, fchdr_offset);
> +		fchdr = (struct fc_frame_header *)((u8 *)cur_frame->fp + fchdr_offset);
> +		if (IS_FNIC_NVME_INITIATOR(fnic) && fnic_is_nvme_frame(fchdr)) {
> +			nvfnic_ls_rsp_recv(&fnic->iport, fchdr,
> +					  cur_frame->frame_len - fchdr_offset);

[Severity: High]
Is there a risk of an ABBA deadlock occurring inside nvfnic_ls_rsp_recv()
when it calls timer_delete_sync()?

nvfnic_ls_rsp_recv() executes while holding fnic->fnic_lock and calls
timer_delete_sync(&nvfnic_ls_req->ls_req_timer). 

Meanwhile, the timer callback nvfnic_ls_req_timeout() unconditionally acquires
fnic->fnic_lock at its start via spin_lock_irqsave().

If the timer callback begins executing on another CPU right before
timer_delete_sync() is called, will it spin indefinitely waiting for
fnic_lock while timer_delete_sync() spins indefinitely waiting for the
timer callback to finish?

[ ... ]

> @@ -617,6 +629,9 @@ void fnic_free_rq_buf(struct vnic_rq *rq, struct vnic_rq_buf *buf)
>  	void *rq_buf = buf->os_buf;
>  	struct fnic *fnic = vnic_dev_priv(rq->vdev);
>  
> +	if (WARN_ON(!buf))
> +		return;
> +

[Severity: Medium]
Will this defensive check ever evaluate successfully if buf is NULL?

Because buf->os_buf is dereferenced on the first line of the function to
initialize rq_buf, a NULL buf will cause a crash before this WARN_ON check
is reached. Should the check be moved above the dereference?

[ ... ]

> @@ -917,6 +933,9 @@ void fnic_free_wq_buf(struct vnic_wq *wq, struct vnic_wq_buf *buf)
>  	dma_unmap_single(&fnic->pdev->dev, buf->dma_addr, buf->len,
>  			 DMA_TO_DEVICE);
>  
> +	if (WARN_ON(!buf))
> +		return;
> +
>  	mempool_free(buf->os_buf, fnic->frame_pool);

[Severity: Medium]
Similarly, will this check prevent a crash if buf is NULL?

By the time execution reaches this WARN_ON, buf->dma_addr and buf->len have
already been dereferenced in the dma_unmap_single() call above it.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260612180918.8554-1-kartilak@cisco.com?part=8

next prev parent reply	other threads:[~2026-06-12 22:29 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-12 18:09 [PATCH v4 00/13] Introduce functionality for NVMe initiator Karan Tilak Kumar
2026-06-12 18:09 ` [PATCH v4 01/13] scsi: fnic: Make debug logging protocol independent Karan Tilak Kumar
2026-06-12 18:42   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 02/13] scsi: fnic: Use fnic_num for non-SCSI identifiers Karan Tilak Kumar
2026-06-12 18:57   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 03/13] scsi: fnic: Decode firmware role configuration Karan Tilak Kumar
2026-06-12 18:09 ` [PATCH v4 04/13] scsi: fnic: Advertise NVMe initiator service parameters Karan Tilak Kumar
2026-06-12 18:09 ` [PATCH v4 05/13] scsi: fnic: Add FDLS role handling for NVMe initiators Karan Tilak Kumar
2026-06-12 21:42   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 06/13] scsi: fnic: Add the NVMe/FC transport path Karan Tilak Kumar
2026-06-12 21:59   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 07/13] scsi: fnic: Route completions and resets by initiator role Karan Tilak Kumar
2026-06-12 22:15   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 08/13] scsi: fnic: Handle NVMe LS frames in FDLS Karan Tilak Kumar
2026-06-12 22:29   ` sashiko-bot [this message]
2026-06-12 18:09 ` [PATCH v4 09/13] scsi: fnic: Send NVMe LS requests through FDLS Karan Tilak Kumar
2026-06-12 22:45   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 10/13] scsi: fnic: Abort timed-out NVMe LS requests Karan Tilak Kumar
2026-06-12 22:57   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 11/13] scsi: fnic: Track NVMe transport statistics Karan Tilak Kumar
2026-06-12 23:16   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 12/13] scsi: fnic: Expose NVMe transport state in debugfs Karan Tilak Kumar
2026-06-12 23:22   ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 13/13] scsi: fnic: Bump up version number Karan Tilak Kumar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260612222914.C040C1F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=kartilak@cisco.com \
    --cc=linux-scsi@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.