From: sashiko-bot@kernel.org
To: "Karan Tilak Kumar" <kartilak@cisco.com>
Cc: linux-scsi@vger.kernel.org
Subject: Re: [PATCH v4 01/13] scsi: fnic: Make debug logging protocol independent
Date: Fri, 12 Jun 2026 18:42:32 +0000 [thread overview]
Message-ID: <20260612184233.1FDAB1F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260612180918.8554-2-kartilak@cisco.com>
Thank you for your contribution! Sashiko AI review found 10 potential issue(s) to consider:
Pre-existing issues:
- [High] Workqueue corruption due to dynamic INIT_WORK on a shared work_struct in timer callbacks.
- [High] Infinite loop in FIP VLAN parsing caused by zero-length descriptors.
- [High] Out-of-bounds read in fdls_process_gpn_ft_tgt_list due to unadjusted payload length.
- [High] Memory leak of control frames when fnic_send_fcoe_frame() fails.
- [High] OXID allocation permanently halts due to missing wrap-around search in find_next_zero_bit.
- [High] Use-after-free (UAF) due to missing cancellation of delayed works and timers in fnic_remove.
- [High] Use-after-free caused by dropping a spinlock inside list_for_each_entry_safe.
- [High] Out-of-bounds memory read in FIP parsing because actual packet length is discarded.
- [High] Out-of-bounds memory read when processing spoofed RSCN payload lengths.
- [Medium] Memory leak of target port (tport) when event allocation fails in fdls_delete_tport.
--
commit ef42b3354b5f1530ee27d88277ffc1381b470972
Author: Karan Tilak Kumar <kartilak@cisco.com>
Subject: scsi: fnic: Make debug logging protocol independent
This patch updates the fnic debug macros to take a fnic structure rather
than a Scsi_Host. This allows FCP and NVMe initiator roles to share the same
logging interface. It also adds role predicates and logging masks to properly
route output to the appropriate reporting mechanism.
> diff --git a/drivers/scsi/fnic/fdls_disc.c b/drivers/scsi/fnic/fdls_disc.c
> index 554dea7678854..5880ca28a0ad9 100644
> --- a/drivers/scsi/fnic/fdls_disc.c
> +++ b/drivers/scsi/fnic/fdls_disc.c
[ ... ]
> @@ -136,7 +136,7 @@ uint16_t fdls_alloc_oxid(struct fnic_iport_s *iport, int oxid_frame_type,
> */
> idx = find_next_zero_bit(oxid_pool->bitmap, FNIC_OXID_POOL_SZ, oxid_pool->next_idx);
> if (idx == FNIC_OXID_POOL_SZ) {
> - FNIC_FCS_DBG(KERN_INFO, fnic->host, fnic->fnic_num,
> + FNIC_FCS_DBG(KERN_INFO, fnic,
> "Alloc oxid: all oxid slots are busy iport state:%d\n",
> iport->state);
> return FNIC_UNASSIGNED_OXID;
[Severity: High]
This is a pre-existing issue, but does this logic permanently halt allocation
if it reaches the end of the pool? If find_next_zero_bit() starts at
oxid_pool->next_idx and reaches FNIC_OXID_POOL_SZ, it appears to return an
error rather than wrapping around to index 0 to search the rest of the
bitmap. Because next_idx is not advanced on failure, all subsequent
allocations might search the same empty tail and fail indefinitely.
[ ... ]
> @@ -1272,9 +1272,9 @@ bool fdls_delete_tport(struct fnic_iport_s *iport, struct fnic_tport_s *tport)
> tport_del_evt =
> kzalloc_obj(struct fnic_tport_event_s, GFP_ATOMIC);
> if (!tport_del_evt) {
> - FNIC_FCS_DBG(KERN_INFO, fnic->host, fnic->fnic_num,
> - "Failed to allocate memory for tport fcid: 0x%0x\n",
> - tport->fcid);
> + FNIC_FCS_DBG(KERN_INFO, fnic,
> + "iport: 0x%x tport 0x%x: Failed to allocate memory\n",
> + iport->fcid, tport->fcid);
> return false;
> }
> tport_del_evt->event = TGT_EV_RPORT_DEL;
[Severity: Medium]
This isn't a bug introduced by this patch, but does returning false here
leak the tport structure? If kzalloc_obj() fails, the target port remains
in the iport->tport_list despite being marked OFFLINING and TERMINATING.
[ ... ]
> @@ -1330,7 +1330,7 @@ fdls_send_tgt_plogi(struct fnic_iport_s *iport, struct fnic_tport_s *tport)
> hton24(d_id, tport->fcid);
> FNIC_STD_SET_D_ID(pplogi->fchdr, d_id);
>
> - FNIC_FCS_DBG(KERN_INFO, fnic->host, fnic->fnic_num,
> + FNIC_FCS_DBG(KERN_INFO, fnic,
> "0x%x: FDLS send tgt PLOGI to tgt: 0x%x with oxid: 0x%x",
> iport->fcid, tport->fcid, oxid);
>
[Severity: High]
This is a pre-existing issue, but are we leaking control frames if the
transmission fails? In fdls_send_tgt_plogi(), fnic_send_fcoe_frame() is
called right after this debug statement, but its return value is ignored.
If the underlying DMA mapping or workqueue is full, does the allocated
mempool frame stay permanently leaked?
[ ... ]
> @@ -3232,7 +3244,7 @@ fdls_process_gpn_ft_tgt_list(struct fnic_iport_s *iport,
> rem_len -= sizeof(struct fc_gpn_ft_rsp_iu);
> }
> if (rem_len <= 0) {
> - FNIC_FCS_DBG(KERN_INFO, fnic->host, fnic->fnic_num,
> + FNIC_FCS_DBG(KERN_INFO, fnic,
> "GPN_FT response: malformed/corrupt frame rxlen: %d remlen: %d",
> len, rem_len);
> }
[Severity: High]
This isn't a bug introduced by this patch, but does the payload parsing read
out of bounds here? Earlier in fdls_process_gpn_ft_tgt_list(), len is
adjusted by subtracting the FC and CT headers, but rem_len is initialized to
the original unadjusted len. Does the while (rem_len > 0) loop process extra
bytes past the end of the packet if the FC_NS_FID_LAST bit is missing?
> @@ -3242,7 +3254,7 @@ fdls_process_gpn_ft_tgt_list(struct fnic_iport_s *iport,
> list_for_each_entry_safe(tport, next, &iport->tport_list, links) {
>
> if (!(tport->flags & FNIC_FDLS_TPORT_IN_GPN_FT_LIST)) {
> - FNIC_FCS_DBG(KERN_INFO, fnic->host, fnic->fnic_num,
> + FNIC_FCS_DBG(KERN_INFO, fnic,
> "Remove port: 0x%x not found in GPN_FT list",
> tport->fcid);
> fdls_delete_tport(iport, tport);
[Severity: High]
This is a pre-existing issue, but could this loop trigger a use-after-free?
The list_for_each_entry_safe() loop caches the next pointer. However,
fdls_delete_tport() intentionally drops the fnic->fnic_lock internally.
While the lock is dropped, can a concurrent event delete the cached next
element from the list?
[ ... ]
> @@ -4527,7 +4539,7 @@ fdls_process_rscn(struct fnic_iport_s *iport, struct fc_frame_header *fchdr)
> num_ports = (rscn_payload_len - 4) / rscn->els.rscn_page_len;
> rscn_port = (struct fc_els_rscn_page *)(rscn + 1);
> }
> - FNIC_FCS_DBG(KERN_INFO, fnic->host, fnic->fnic_num,
> + FNIC_FCS_DBG(KERN_INFO, fnic,
> "RSCN received for num_ports: %d payload_len: %d page_len: %d ",
> num_ports, rscn_payload_len, rscn->els.rscn_page_len);
>
[Severity: High]
This is a pre-existing issue, but does this code trust the RSCN payload
length blindly? The rscn_payload_len variable is read directly from the
network packet without validating it against the actual received buffer
length. If a spoofed frame specifies a large length, does this loop read
out-of-bounds?
> diff --git a/drivers/scsi/fnic/fip.c b/drivers/scsi/fnic/fip.c
> index 132f00512ee16..21d08a710f0df 100644
> --- a/drivers/scsi/fnic/fip.c
> +++ b/drivers/scsi/fnic/fip.c
[ ... ]
> @@ -155,14 +152,14 @@ void fnic_fcoe_process_vlan_resp(struct fnic *fnic, struct fip_header *fiph)
> list_add_tail(&vlan->list, &fnic->vlan_list);
> break;
> }
> - FNIC_FIP_DBG(KERN_INFO, fnic->host,
> - fnic->fnic_num,
> - "Invalid descriptor type(%x) in VLan response\n",
> - vlan_desc->fd_desc.fip_dtype);
> /*
> - * Note : received a type=2 descriptor here i.e. FIP
> - * MAC Address Descriptor
> + * Note : skip any type=2 descriptor here
> + * (i.e. FIP MAC Address Descriptor)
> */
> + if (vlan_desc->fd_desc.fip_dtype != FIP_DT_MAC)
> + FNIC_FIP_DBG(KERN_INFO, fnic,
> + "Invalid descriptor type(0x%x) in vlan response\n",
> + vlan_desc->fd_desc.fip_dtype);
> cur_desc += vlan_desc->fd_desc.fip_dlen;
> desc_len -= vlan_desc->fd_desc.fip_dlen;
> }
[Severity: High]
This is a pre-existing issue, but can this loop run infinitely if
vlan_desc->fd_desc.fip_dlen is 0? The while (desc_len > 0) loop would
fail to advance cur_desc or decrement desc_len. Because this function runs
with interrupts disabled (spin_lock_irqsave), could a maliciously crafted
zero-length descriptor cause a hard lockup?
[ ... ]
> @@ -706,13 +706,13 @@ int fdls_fip_recv_frame(struct fnic *fnic, void *frame)
> return true;
> }
>
> - FNIC_FCS_DBG(KERN_INFO, fnic->host, fnic->fnic_num,
> + FNIC_FCS_DBG(KERN_INFO, fnic,
> "Not a FIP Frame");
> return false;
> }
[Severity: High]
This isn't a bug introduced by this patch, but does the FIP parsing read
out of bounds here? At the start of fdls_fip_recv_frame(), the packet length
appears to be hardcoded to 2048 instead of using the actual received frame
size. If a packet declares large descriptors but the actual payload is small,
would parsing read past the end of the buffer?
>
> void fnic_work_on_fip_timer(struct work_struct *work)
> {
> struct fnic *fnic = container_of(work, struct fnic, fip_timer_work);
> struct fnic_iport_s *iport = &fnic->iport;
>
> - FNIC_FIP_DBG(KERN_INFO, fnic->host, fnic->fnic_num,
> + FNIC_FIP_DBG(KERN_INFO, fnic,
> "FIP timeout\n");
[Severity: High]
This is a pre-existing issue, but is the shared fip_timer_work susceptible
to workqueue corruption? It appears that fnic_handle_fip_timer() and
fnic_handle_fcs_ka_timer() unconditionally call INIT_WORK on
fnic->fip_timer_work before queuing it. If these fire concurrently or if one
queues the work while it is already pending/executing, does INIT_WORK
corrupt the active work_struct linked list pointers?
General Observations
====================
[Severity: High]
This is a pre-existing issue, but does fnic_remove() tear down resources
safely? During driver teardown, fnic_remove() frees the fnic structure but
fails to cancel oxid_pool.oxid_reclaim_work,
oxid_pool.schedule_oxid_free_retry, and iport.fabric.retry_timer. If any of
these delayed works or timers fire after teardown, could they execute their
callbacks and access the freed fnic structure, causing a use-after-free
crash?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260612180918.8554-1-kartilak@cisco.com?part=1
next prev parent reply other threads:[~2026-06-12 18:42 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-12 18:09 [PATCH v4 00/13] Introduce functionality for NVMe initiator Karan Tilak Kumar
2026-06-12 18:09 ` [PATCH v4 01/13] scsi: fnic: Make debug logging protocol independent Karan Tilak Kumar
2026-06-12 18:42 ` sashiko-bot [this message]
2026-06-12 18:09 ` [PATCH v4 02/13] scsi: fnic: Use fnic_num for non-SCSI identifiers Karan Tilak Kumar
2026-06-12 18:57 ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 03/13] scsi: fnic: Decode firmware role configuration Karan Tilak Kumar
2026-06-12 18:09 ` [PATCH v4 04/13] scsi: fnic: Advertise NVMe initiator service parameters Karan Tilak Kumar
2026-06-12 18:09 ` [PATCH v4 05/13] scsi: fnic: Add FDLS role handling for NVMe initiators Karan Tilak Kumar
2026-06-12 21:42 ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 06/13] scsi: fnic: Add the NVMe/FC transport path Karan Tilak Kumar
2026-06-12 21:59 ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 07/13] scsi: fnic: Route completions and resets by initiator role Karan Tilak Kumar
2026-06-12 22:15 ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 08/13] scsi: fnic: Handle NVMe LS frames in FDLS Karan Tilak Kumar
2026-06-12 22:29 ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 09/13] scsi: fnic: Send NVMe LS requests through FDLS Karan Tilak Kumar
2026-06-12 22:45 ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 10/13] scsi: fnic: Abort timed-out NVMe LS requests Karan Tilak Kumar
2026-06-12 22:57 ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 11/13] scsi: fnic: Track NVMe transport statistics Karan Tilak Kumar
2026-06-12 23:16 ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 12/13] scsi: fnic: Expose NVMe transport state in debugfs Karan Tilak Kumar
2026-06-12 23:22 ` sashiko-bot
2026-06-12 18:09 ` [PATCH v4 13/13] scsi: fnic: Bump up version number Karan Tilak Kumar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260612184233.1FDAB1F000E9@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=kartilak@cisco.com \
--cc=linux-scsi@vger.kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox