From: sashiko-bot@kernel.org
To: "Tyrel Datwyler" <tyreld@linux.ibm.com>
Cc: linux-scsi@vger.kernel.org
Subject: Re: [PATCH 29/29] ibmvfc: fail nvme-fc fcp-io and ls requests during transport reset
Date: Tue, 23 Jun 2026 02:04:39 +0000 [thread overview]
Message-ID: <20260623020439.5AF251F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260623013035.3436640-30-tyreld@linux.ibm.com>
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
New issues:
- [Critical] Pooled event structures retain stale NVMe request pointers, leading to exploitable Use-After-Free and double completions when non-NVMe commands fail.
- [High] Target kref is permanently leaked when an NVMe LS request fails or is purged.
Pre-existing issues:
- [High] Returning SCSI_MLQUEUE_HOST_BUSY for NVMe requests on transport closure causes permanent IO failures instead of requeuing.
- [Medium] NVMe FCP requests are not purged during a transport reset, rendering the patch's FCP functionality dead code.
--
commit 88ce7ce7f11c11cfca700c6919820cf4d9865c8a
Author: Tyrel Datwyler <tyreld@linux.ibm.com>
ibmvfc: fail nvme-fc fcp-io and ls requests during transport reset
The current purge code for flushing outstanding commands during a
transport reset only deals with SCSI commands. Rename the
ibmvfc_scsi_eh_done completion handler to ibmvfc_vfc_eh_done and wire it
to correctly complete nvme fcp and ls commands when flushing the
inflight command list during a reset.
> The current purge code for flushing outstanding commands during a
> transport reset only deals with SCSI commands. Rename the
> ibmvfc_scsi_eh_done completion handler to ibmvfc_vfc_eh_done and wire it
> to correctly complete nvme fcp and ls commands when flushing the
> inflight command list during a reset.
[Severity: Medium]
This isn't a bug introduced by this patch, but does the purge code actually
evaluate inflight NVMe FCP commands?
Looking at ibmvfc_purge_requests(), the loop iterates over vhost->crq and
vhost->scsi_scrqs.scrqs:
drivers/scsi/ibmvscsi/ibmvfc-core.c:ibmvfc_purge_requests() {
...
for (i = 0; i < hwqs; i++) {
spin_lock_irqsave(queues[i].q_lock, flags);
...
}
It seems to entirely miss vhost->nvme_scrqs.scrqs where FCP requests are
queued. If so, inflight NVMe FCP commands might be orphaned during a reset,
making the new FCP completion handling unreachable.
> diff --git a/drivers/scsi/ibmvscsi/ibmvfc-core.c b/drivers/scsi/ibmvscsi/ibmvfc-core.c
> index a7183493cf965..aeb5b8902aa55 100644
> --- a/drivers/scsi/ibmvscsi/ibmvfc-core.c
> +++ b/drivers/scsi/ibmvscsi/ibmvfc-core.c
[ ... ]
> @@ -1092,20 +1092,28 @@ void ibmvfc_free_event(struct ibmvfc_event *evt)
> }
>
> /**
> - * ibmvfc_scsi_eh_done - EH done function for queuecommand commands
> + * ibmvfc_vfc_eh_done - EH done function for queued IO
> * @evt: ibmvfc event struct
> *
> - * This function does not setup any error status, that must be done
> - * before this function gets called.
> + * This function does not setup any error status for scsi commands, that must be
> + * done before this function gets called.
> **/
> -static void ibmvfc_scsi_eh_done(struct ibmvfc_event *evt)
> +static void ibmvfc_vfc_eh_done(struct ibmvfc_event *evt)
> {
> struct scsi_cmnd *cmnd = evt->cmnd;
> + struct nvmefc_ls_req *ls_req = evt->ls_req;
> + struct nvmefc_fcp_req *fcp_req = evt->fcp_req;
>
> if (cmnd) {
> scsi_dma_unmap(cmnd);
> scsi_done(cmnd);
> - }
> + } else if (fcp_req) {
> + fcp_req->rcv_rsplen = 0;
> + fcp_req->transferred_length = 0;
> + fcp_req->status = NVME_SC_INTERNAL;
> + fcp_req->done(fcp_req);
> + } else if (ls_req)
> + ls_req->done(ls_req, -EIO);
[Severity: High]
Does this error handler leak the target kref for LS requests?
In ibmvfc_nvme_ls_req(), a reference is taken via kref_get(&tgt->kref) which
is typically dropped in the success path by ibmvfc_ls_req_done().
Since this patch replaces the completion handler with ibmvfc_vfc_eh_done(),
the transport request is completed here but kref_put(&evt->tgt->kref, ...)
is never called, potentially leaving the target structure leaked.
>
> ibmvfc_free_event(evt);
> }
> @@ -1146,8 +1154,10 @@ static void ibmvfc_fail_request(struct ibmvfc_event *evt, int error_code)
> BUG_ON(!atomic_dec_and_test(&evt->active));
> if (evt->cmnd) {
> evt->cmnd->result = (error_code << 16);
> - evt->done = ibmvfc_scsi_eh_done;
> - } else
> + evt->done = ibmvfc_vfc_eh_done;
> + } else if (evt->fcp_req || evt->ls_req)
> + evt->done = ibmvfc_vfc_eh_done;
> + else
[Severity: Critical]
Can these checks be triggered by stale NVMe pointers on non-NVMe commands?
When ibmvfc_get_event() reallocates an event structure from the pool,
ibmvfc_init_event() doesn't clear evt->fcp_req or evt->ls_req to NULL:
drivers/scsi/ibmvscsi/ibmvfc-core.c:ibmvfc_init_event() {
...
evt->cmnd = NULL;
evt->sync_iu = NULL;
evt->eh_comp = NULL;
...
}
If a non-NVMe command (e.g., a MAD command) is allocated an event with stale
NVMe pointers and fails here, will it incorrectly replace the completion
handler with ibmvfc_vfc_eh_done() and later use-after-free the stale NVMe
pointers?
> evt->xfer_iu->mad_common.status = cpu_to_be16(IBMVFC_MAD_DRIVER_FAILED);
>
> timer_delete(&evt->timer);
[ ... ]
> @@ -1816,8 +1826,10 @@ int ibmvfc_send_event(struct ibmvfc_event *evt,
[Severity: High]
This isn't a bug introduced by this patch, but earlier in ibmvfc_send_event(),
it hardcodes a return of SCSI_MLQUEUE_HOST_BUSY on H_CLOSED without
distinguishing between SCSI and NVMe commands:
drivers/scsi/ibmvscsi/ibmvfc-core.c:ibmvfc_send_event() {
...
if (rc == H_CLOSED) {
...
return SCSI_MLQUEUE_HOST_BUSY;
}
...
}
This value (0x1055) is propagated back up through ibmvfc_nvme_fcp_io()
directly to the NVMe FC transport layer. Since 0x1055 is neither 0 nor
-EBUSY, will the NVMe FC core treat it as a hard failure and abort the
request with BLK_STS_IOERR instead of handling the queue pressure
appropriately?
> dev_err(vhost->dev, "Send error (rc=%d)\n", rc);
> if (evt->cmnd) {
> evt->cmnd->result = DID_ERROR << 16;
> - evt->done = ibmvfc_scsi_eh_done;
> - } else
> + evt->done = ibmvfc_vfc_eh_done;
> + } else if (evt->fcp_req || evt->ls_req)
> + evt->done = ibmvfc_vfc_eh_done;
> + else
> evt->xfer_iu->mad_common.status = cpu_to_be16(IBMVFC_MAD_CRQ_ERROR);
>
> evt->done(evt);
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260623013035.3436640-1-tyreld@linux.ibm.com?part=29
prev parent reply other threads:[~2026-06-23 2:04 UTC|newest]
Thread overview: 55+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-23 1:30 [PATCH 00/29] ibmvfc: Add NVMe-FC support Tyrel Datwyler
2026-06-23 1:30 ` [PATCH 01/29] ibmvfc: move target list from host to protocol specific channel groups Tyrel Datwyler
2026-06-23 1:49 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 02/29] ibmvfc: add NVMe/FC protocol interface definitions Tyrel Datwyler
2026-06-23 1:54 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 03/29] ibmvfc: split NVMe support into separate source file and add transport stubs Tyrel Datwyler
2026-06-23 1:50 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 04/29] ibmvfc: initialize NVMe channel configuration during driver probe Tyrel Datwyler
2026-06-23 1:51 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 05/29] ibmvfc: alloc/dealloc sub-queues for nvme channels Tyrel Datwyler
2026-06-23 1:55 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 06/29] ibmvfc: add logic for protocol specific fabric logins Tyrel Datwyler
2026-06-23 1:50 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 07/29] ibmvfc: add wrapper to get vhost associated with a channel struct Tyrel Datwyler
2026-06-23 1:30 ` [PATCH 08/29] ibmvfc: add helper for creating protocol specific discovery event Tyrel Datwyler
2026-06-23 1:30 ` [PATCH 09/29] ibmvfc: add helper to check NVMe/FC support with active channels Tyrel Datwyler
2026-06-23 1:30 ` [PATCH 10/29] ibmvfc: allocate and free NVMe channel group discover buffer Tyrel Datwyler
2026-06-23 1:30 ` [PATCH 11/29] ibmvfc: send NVMe target discovery MAD Tyrel Datwyler
2026-06-23 1:52 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 12/29] ibmvfc: add NVMe/FC Implicit Logout and Move Login support Tyrel Datwyler
2026-06-23 1:49 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 13/29] ibmvfc: add NVMe/FC Port " Tyrel Datwyler
2026-06-23 1:53 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 14/29] ibmvfc: add NVMe/FC Process " Tyrel Datwyler
2026-06-23 1:52 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 15/29] ibmvfc: add NVMe/FC Query Target support Tyrel Datwyler
2026-06-23 1:52 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 16/29] ibmvfc: allocate targets based on protocol Tyrel Datwyler
2026-06-23 1:56 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 17/29] ibmvfc: delete NVMe/FC targets as well as SCSI Tyrel Datwyler
2026-06-23 1:51 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 18/29] ibmvfc: update state machine to process NVMe/FC targets Tyrel Datwyler
2026-06-23 1:55 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 19/29] ibmvfc: implement NVMe/FC stubs for local/remote port registration Tyrel Datwyler
2026-06-23 1:51 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 20/29] ibmvfc: register local nvme fc port after fabric login Tyrel Datwyler
2026-06-23 1:57 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 21/29] ibmvfc: process NVMe/FC rports in work thread Tyrel Datwyler
2026-06-23 2:00 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 22/29] ibmvfc: extend ibmvfc_debug visibility to ibmvfc-nvme.h Tyrel Datwyler
2026-06-23 1:51 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 23/29] ibmvfc: declare global function definitions Tyrel Datwyler
2026-06-23 2:04 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 24/29] ibmvfc: implement LLDD callbacks for mapping nvme-fc queues Tyrel Datwyler
2026-06-23 2:05 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 25/29] ibmvfc: implement nvme-fc LS submission transport callback Tyrel Datwyler
2026-06-23 2:08 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 26/29] ibmvfc: implement nvme-fc IO command submission callback Tyrel Datwyler
2026-06-23 2:09 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 27/29] ibmvfc: implement nvme-fc LS abort handling callback Tyrel Datwyler
2026-06-23 2:09 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 28/29] ibmvfc: implement nvme-fc FCP abort callback Tyrel Datwyler
2026-06-23 2:05 ` sashiko-bot
2026-06-23 1:30 ` [PATCH 29/29] ibmvfc: fail nvme-fc fcp-io and ls requests during transport reset Tyrel Datwyler
2026-06-23 2:04 ` sashiko-bot [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260623020439.5AF251F000E9@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
--cc=tyreld@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.